gagneurlab · nickhsmith · Apr 22, 2022 · Feb 10, 2021 · Feb 10, 2021 · Feb 10, 2021
diff --git a/README.md b/README.md
@@ -11,7 +11,7 @@ The manuscript is available in [Nature Protocols](https://www.nature.com/article
 DROP is available on [bioconda](https://anaconda.org/bioconda/drop).
 We recommend using a dedicated conda environment. (installation time: ~ 10min)
 ```
-mamba install -c conda-forge -c bioconda drop
+mamba create -n drop_env -c conda-forge -c bioconda drop
 ```
 
 Test installation with demo project
@@ -49,6 +49,14 @@ This shows you the rules of all subworkflows. Omit `-n` and specify the number o
 snakemake aberrantExpression --cores 10
 ```
 
+## Citation
+
+If you use DROP in research, please cite our [manuscript](https://www.nature.com/articles/s41596-020-00462-5).
+
+Furthermore, if you use the aberrant expression module, also cite [OUTRIDER](https://doi.org/10.1016/j.ajhg.2018.10.025); if you use the aberrant splicing module, also cite [FRASER](https://www.nature.com/articles/s41467-020-20573-7); and if you use the MAE module, also cite the [Kremer, Bader et al. study](https://www.nature.com/articles/ncomms15824) and [DESeq2](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0550-8).
+
+For the complete set of tools used by DROP (e.g. for counting), see the [manuscript](https://www.nature.com/articles/s41596-020-00462-5).
+
 ## Datasets
 The following publicly-available datasets of gene counts can be used as controls.
 Please cite as instructed for each dataset.

diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -13,6 +13,7 @@ Then, DROP can be executed in multiple ways (:doc:`pipeline`).
    installation
    prepare
    pipeline
+   output
    license
    help
 
@@ -24,7 +25,7 @@ We recommend using a dedicated conda environment. (installation time: ~ 10min)
 
 .. code-block:: bash
 
-    mamba install -c conda-forge -c bioconda drop
+    mamba create -n drop -c conda-forge -c bioconda drop
 
 Test installation with demo project
 

diff --git a/docs/source/output.rst b/docs/source/output.rst
@@ -0,0 +1,129 @@
+Results and Output of DROP
+===========================
+
+DROP is intended to help researchers use RNA-Seq data in order to detect genes with aberrant expression,
+aberrant splicing and mono-allelic expression. By simplifying the workflow process we hope to provide
+easy to read and interpret HTML files and output files. This section is dedicated to explaining the relevant
+results files. We will use the results of the ``demo`` to explain the files generated by the following commands::
+
+    # install DROP
+    mamba create -n drop_env -c conda-forge -c bioconda drop
+    conda activate drop_env
+
+    mkdir drop_demo
+    cd drop_demo
+    drop demo
+
+    snakemake -c1
+
+Aberrant Expression
++++++++++++++++++++
+
+HTML file
+#########
+Opening the resulting ``Output/html/drop_demo_index.html``, we can see the ``AberrantExpression``
+tab at the top of the screen. After that, the Overview tab contains links to the following:
+
+* Counting Summaries 
+    * For each aberrant expression group
+        * number of local vs external samples
+        * QC relating to reads and size factors for each sample
+        * histograms showing the mean count distributions based on different conditions
+        * expressed genes within each sample and as a whole in the dataset
+* Outrider Summaries
+    * For each aberrant expression group
+        * the number of aberrantly expressed genes per sample
+        * correlation between samples before and after the autoencoder correction
+        * which samples contain outliers
+        * results table
+* Files (for each aberrant expression group)
+    * OUTRIDER data files (RDS)
+        * You can follow the `OUTRIDER vignette for further individual analysis <https://www.bioconductor.org/packages/devel/bioc/vignettes/OUTRIDER/inst/doc/OUTRIDER.pdf>`_.
+    * results files (TSV)
+        * the results file contains only the significant genes and samples that meet the cutoffs defined in the ``config.yaml`` for ``padjCutoff`` and ``zScoreCutoff``
+
+Local result files
+##################
+Additionally, the ``aberrantExpression`` module creates the file ``Output/processed_results/aberrant_expression/{annotation}/outrider/{drop_group}/OUTRIDER_results_all.Rds``. This file is an Rds object containing the entire OUTRIDER results table, regardless of significance.
+
+Aberrant Splicing
++++++++++++++++++
+
+HTML file
+##########
+Opening the resulting ``Output/html/drop_demo_index.html``, we can see the ``AberrantSplicing``
+tab at the top of the screen. After that, the Overview tab contains links to the following:
+
+* Counting Summaries 
+    * For each aberrant splicing group
+        * split of local (from internal BAM files) vs external sample counts
+        * split of local vs merged with external sample splicing/intron counts
+        * comparison of local and external mean counts
+        * histograms relating to junction expression before and after filtering and variability
+* FRASER Summaries
+    * For each aberrant splicing group
+        * the number of samples, introns, and splice sites 
+        * how batch correction is done and the resulting lack of batch effects
+        * result table
+* Files
+    * FRASER files for each aberrant splicing group
+        * For each of these files you can follow the `FRASER vignette for individual analysis <https://www.bioconductor.org/packages/devel/bioc/vignettes/FRASER/inst/doc/FRASER.pdf>`_. 
+    * tsv files
+        * For each aberrant splicing group
+            * results_per_junction.tsv 
+                * this tsv file contains only significant junctions that meet the cutoffs defined in the ``config.yaml``; results are aggregated at the junction level.
+
+Local result files
+##################
+Additionally, the ``aberrantSplicing`` module creates the file ``Output/processed_results/aberrant_splicing/results/{annotation}/fraser/{drop_group}/results.tsv``.
+This tsv file contains only significant junctions that meet the cutoffs defined in the ``config.yaml``, aggregated at the gene level. Each sample/gene pair is represented by only its most significant junction.
+
+Mono-allelic Expression
++++++++++++++++++++++++
+
+HTML file
+##########
+Opening the resulting ``Output/html/drop_demo_index.html``, we can see the ``MonoallelicExpression``
+tab at the top of the screen. After that, the Overview tab contains links to the following:
+
+* Results
+    * For each mae group
+        * the number of samples, unique genes, and aberrant events
+        * a cascade plot that shows additional filters
+            * MAE for REF: the monoallelic expression favors the reference allele 
+            * MAE for ALT: the monoallelic expression favors the alternative allele 
+            * rare:
+                * if ``add_AF`` is set to true in ``config.yaml``, the variant's allele frequency (AF) must not exceed ``max_AF``
+                * additionally, it must meet the inner-cohort frequency ``maxVarFreqCohort`` cutoff
+        * histogram of inner cohort frequency
+        * summary of cascade plots and results table
+* Files
+    * Allelic counts
+        * a directory containing the allelic counts of heterozygous variants
+    * Results data tables of each sample (.Rds)
+        * Rds objects containing the full results table regardless of MAE status
+    * Significant MAE results tables
+        * For each mae group
+            * a link to the results tsv file.
+            * Only contains significant MAE results based on ``config.yaml`` cutoffs for the alternative allele
+* Quality Control
+    * QC Overview
+        * For each mae group, QC checks for DNA/RNA matching
+* Analyze Individual Results
+    * An example analysis that can be run using the Rds objects linked in the files subsection
+    * performed on the first mae sample 
+
+Local result files
+##################
+Additionally the ``mae`` module creates the following files:
+
+* ``Output/processed_results/mae/{drop_group}/MAE_results_all_v29.tsv.gz``
+    * this file is the tsv results of all heterozygous variants regardless of significance
+* ``Output/processed_results/mae/{drop_group}/MAE_results_v29.tsv``
+    * this is the file linked in the html document and described above
+* ``Output/processed_results/mae/{drop_group}/MAE_results_v29_rare.tsv``
+    * this file is a subset of ``MAE_results_v29.tsv`` containing only the variants that pass the rare cutoffs
+        * if ``add_AF`` is set to true in ``config.yaml``, the variant's allele frequency (AF) must not exceed ``max_AF``
+        * inner-cohort frequency must meet the ``maxVarFreqCohort`` cutoff
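
The rare filter described for `MAE_results_v29_rare.tsv` can be sketched roughly as follows. This is a minimal illustration on toy data, not DROP's implementation. It assumes that both `max_AF` and `maxVarFreqCohort` act as inclusive upper bounds. The column names (`gene`, `allele_freq`, `cohort_freq`), the cutoff values, and the helper `filter_rare` are hypothetical and chosen only for this example.

```python
import csv
import io

# Illustrative cutoffs mirroring the max_AF and maxVarFreqCohort config keys.
# The values are assumptions, not DROP defaults.
MAX_AF = 0.001
MAX_VAR_FREQ_COHORT = 0.05

# Toy stand-in for a results table. The column names are hypothetical and
# do not necessarily match the real MAE_results_v29.tsv header.
TOY_TSV = (
    "gene\tallele_freq\tcohort_freq\n"
    "GENE_A\t0.0001\t0.01\n"
    "GENE_B\t0.2\t0.01\n"
    "GENE_C\t0.0001\t0.5\n"
)


def filter_rare(rows, max_af, max_cohort_freq, use_af=True):
    """Keep rows whose frequencies are at or below both cutoffs.

    use_af plays the role of add_AF: when False, the allele frequency
    check is skipped and only the cohort frequency is tested.
    """
    kept = []
    for row in rows:
        if use_af and float(row["allele_freq"]) > max_af:
            continue  # too common in the population
        if float(row["cohort_freq"]) > max_cohort_freq:
            continue  # too common within the cohort
        kept.append(row)
    return kept


rows = list(csv.DictReader(io.StringIO(TOY_TSV), delimiter="\t"))
rare = filter_rare(rows, MAX_AF, MAX_VAR_FREQ_COHORT)
print([r["gene"] for r in rare])
```

With the allele frequency check disabled (`use_af=False`), only the cohort frequency cutoff applies, so `GENE_B` would also be kept in this toy table.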