Updated M2 WDL README with Funcotator info. #5892

Merged on Apr 18, 2019 (2 commits)
16 changes: 10 additions & 6 deletions scripts/mutect2_wdl/README.md
@@ -18,14 +18,18 @@ This file has reasonable default parameters.
- "broadinstitute/gatk-protected:1.0.0.0-alpha1.2.4" (This is a private image! We recommend setting ``gatk_jar`` to ``/root/gatk.jar``)
- "broadinstitute/genomes-in-the-cloud:2.2.4-1469632282" (You must specify a ``gatk4_jar_override``)

### Functional annotation (Oncotator)
### Functional annotation (Funcotator)

The M2 WDL can optionally run oncotator for functional annotation and produce a TCGA MAF from the M2 VCF. *Oncotator is not a GATK4 tool and is provided in the M2 WDL as a convenience.* There are several notes and caveats
- Several parameters should be passed in to populate the TCGA MAF metadata fields. Default values are provided, though we recommend that you specify the values. These parameters are ignored if you do not run oncotator.
- Several fields in a TCGA MAF cannot be generated by M2 and oncotator, such as all fields relating to validation alleles. These will need to be populated by a downstream process created by the user.
- Oncotator does not enforce the TCGA MAF controlled vocabulary, since it is often too restrictive for general use. This is up to the user to specify correctly.
Funcotator (**FUNC**tional ann**OTATOR**) is a functional annotation tool in the core GATK toolset and was designed to handle both somatic and germline use cases. It analyzes given variants for their function (as retrieved from a set of data sources) and produces the analysis in a specified output file. Funcotator reads in a VCF file, labels each variant with one of twenty-three distinct variant classifications, produces gene information (e.g. affected gene, predicted variant amino acid sequence, etc.), and associations to information in datasources. Default supported datasources include GENCODE (gene information and protein change prediction), dbSNP, gnomAD, and COSMIC (among others). The corpus of datasources is extensible and user-configurable and includes cloud-based datasources supported with Google Cloud Storage. Funcotator produces either a Variant Call Format (VCF) file (with annotations in the INFO field) or a Mutation Annotation Format (MAF) file.

Funcotator allows the user to add their own annotations to variants based on a set of data sources. Each data source can be customized to annotate a variant based on several matching criteria. This allows a user to create their own custom annotations easily, without modifying any Java code.

By default, the M2 WDL runs Funcotator for functional annotation and produces a TCGA MAF from the M2 VCF. There are several notes and caveats:
- Several parameters should be passed in to populate the TCGA MAF metadata fields. Default values are provided, though we recommend that you specify the values. These parameters are ignored if you do not run Funcotator.
- Several fields in a TCGA MAF cannot be generated by M2 and Funcotator, such as all fields relating to validation alleles. These will need to be populated by a downstream process created by the user.
- Funcotator does not enforce the TCGA MAF controlled vocabulary, since it is often too restrictive for general use. This is up to the user to specify correctly.
*Therefore, we cannot guarantee that a TCGA MAF generated here will pass the TCGA Validator*. If you are unsure about the ramifications of this statement, then it probably does not concern you.
- More information about Oncotator can be found at: http://archive.broadinstitute.org/cancer/cga/oncotator
- More information about Funcotator can be found at: https://gatkforums.broadinstitute.org/dsde/discussion/11193/funcotator-information-and-tutorial/
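
Since the annotated output can be a MAF, a quick sanity check is to tally variant classifications after a run. The following is a minimal sketch, not part of the M2 WDL: it assumes the MAF is tab-delimited, may start with `#` comment lines, and has a `Variant_Classification` column as in the TCGA MAF layout. The helper name and the sample rows are made up for illustration; check the header of your own Funcotator output before relying on the column name.

```python
import csv
import io
from collections import Counter


def count_variant_classifications(maf_text):
    """Tally the Variant_Classification column of a MAF supplied as text.

    Hypothetical helper for illustration; not part of GATK or the M2 WDL.
    """
    # Drop blank lines and '#' comment lines that may precede the header row.
    data_lines = [
        line for line in maf_text.splitlines()
        if line and not line.startswith("#")
    ]
    # The first remaining line is treated as the tab-delimited column header.
    reader = csv.DictReader(
        io.StringIO("\n".join(data_lines)),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,
    )
    return Counter(row["Variant_Classification"] for row in reader)


# Made-up, heavily truncated MAF content (illustrative only, not real output).
SAMPLE_MAF = "\n".join([
    "#version 2.4",
    "\t".join(["Hugo_Symbol", "Chromosome", "Start_Position", "Variant_Classification"]),
    "\t".join(["GENE_A", "1", "100", "Missense_Mutation"]),
    "\t".join(["GENE_B", "2", "200", "Silent"]),
    "\t".join(["GENE_C", "3", "300", "Missense_Mutation"]),
])

counts = count_variant_classifications(SAMPLE_MAF)
for classification, n in sorted(counts.items()):
    print(classification, n)
```

To use this on a real file, read its contents with `open(path).read()` and pass the text to the helper. Because Funcotator does not enforce the TCGA controlled vocabulary, the set of classification values you see may differ from what the TCGA Validator expects.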

### Parameter descriptions
