Output files and formats

CTAT-mutations Output Files and Formats

The primary output files generated by the pipeline include the following:

variants.HC_init.wAnnot.vcf : the initially predicted variants
variants.HC_hard_cutoffs_applied.vcf : variants after applying hard cutoffs to remove likely false positives. The hard cutoffs applied via 'GATK VariantFiltration' are: " -window 35 -cluster 3 -filter FS > 30 -filter QD < 2.0 -filter SPLICEADJ < 3 "
cancer.vcf : the subset of variants considered most relevant to cancer biology. These are selected based on the variant annotations, requiring: gnomAD AF < 0.01 and (CHASM or VEST pVal < 0.05, or FATHMM in ["CANCER", "PATHOGENIC"], or ClinVar =~ /pathogenic/i )
igvjs_viewer.html : a self-contained web application for interactively navigating the cancer variants.
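
The selection rule for cancer.vcf can be read as a simple predicate over a variant's annotations. The following is a minimal Python sketch of that rule, not the pipeline's own code. It assumes the annotations have already been collected into a dictionary; the keys `gnomad_AF` and `clinvar` are hypothetical names chosen for illustration, and treating a variant with no gnomAD frequency as rare is also an assumption.

```python
import re


def is_cancer_relevant(annot):
    """Return True if a variant passes the cancer.vcf selection rule.

    annot: dict mapping annotation name -> value.
    'gnomad_AF' and 'clinvar' are hypothetical key names.
    """
    # Population frequency must be below 1%.
    # Assumption: a variant absent from gnomAD counts as rare.
    af = annot.get("gnomad_AF")
    if af is not None and float(af) >= 0.01:
        return False

    def p_below(key, cutoff=0.05):
        value = annot.get(key)
        return value is not None and float(value) < cutoff

    # At least one line of evidence for cancer relevance is required.
    predicted_driver = p_below("CHASM_PVALUE") or p_below("VEST_PVALUE")
    fathmm_hit = str(annot.get("FATHMM", "")).upper() in {"CANCER", "PATHOGENIC"}
    clinvar_hit = re.search(
        r"pathogenic", str(annot.get("clinvar", "")), re.IGNORECASE
    ) is not None

    return predicted_driver or fathmm_hit or clinvar_hit
```

For example, `is_cancer_relevant({"gnomad_AF": "0.001", "FATHMM": "PATHOGENIC"})` passes, while the same variant with `gnomad_AF` of `0.2` is rejected regardless of its other annotations.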

If the RVBLR boosting method is applied, the final variants file should appear as:

The variant annotations and descriptions include:

Column	Description
CHROM	Chromosome
POS	The 1-based position of the variation on the given sequence.
REF	Base(s) at position in the reference genome (hg38)
ALT	Alternate base(s)
GENE	Gene name
DP	Combined depth across samples
QUAL	A quality score associated with the inference of the given alleles.
MQ	RMS mapping quality
RNAEDIT	A known or predicted RNA-editing site
RPT	Repeat family from UCSC Genome Browser RepeatMasker annotations
SPLICEADJ	Variant is within the specified distance of a reference exon splice boundary
FATHMM	FATHMM (Functional Analysis through Hidden Markov Models) prediction. 'Pathogenic': cancer-associated or damaging; 'Neutral': passenger or tolerated.
CHASM_PVALUE	Empirical p-value (probability that passenger variant is misclassified as a driver).
CHASM_FDR	False discovery rate expected (Benjamini-Hochberg multiple testing correction).
VEST_PVALUE	Empirical p-value (probability that benign variant is misclassified as pathogenic).
VEST_FDR	Composite false discovery rate (Benjamini-Hochberg multiple testing correction) for non-silent variants in the gene, combined with Stouffer's Z-score method.
MuPIT	MuPIT 3D structure variant link
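
To make the column descriptions concrete, here is a minimal Python sketch that splits one VCF data line into its fixed columns and its annotations. It assumes the standard VCF layout (tab-separated, with `key=value` annotations separated by `;` in the eighth column); whether every annotation listed above is stored there is an assumption, and the function name `parse_vcf_record` is hypothetical.

```python
def parse_vcf_record(line):
    """Split a single VCF data line into fixed columns and INFO annotations."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt, qual, _filter, info = fields[:8]

    annotations = {}
    for entry in info.split(";"):
        if not entry or entry == ".":
            continue
        key, sep, value = entry.partition("=")
        # Entries without '=' are flags and are stored as True.
        annotations[key] = value if sep else True

    return {
        "CHROM": chrom,
        "POS": int(pos),  # 1-based position
        "REF": ref,
        "ALT": alt,
        "QUAL": None if qual == "." else float(qual),
        "INFO": annotations,
    }
```

Values are kept as strings, so numeric annotations such as CHASM_PVALUE need an explicit `float()` conversion before comparison.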