CTAT mutations installation

Brian Haas edited this page Mar 14, 2019 · 20 revisions

CTAT-mutations Installation

CTAT-mutations Software

The CTAT-mutations software can be installed via Conda (simplest), manually from GitHub, or through the Galaxy Toolshed. All three options are described below.

Option 1: Installation via Conda

For Conda users, installation can be as simple as running the following:

   conda install ctat-mutations

Option 2: Installation from GitHub

CTAT-mutations makes use of the following companion utilities, which should be separately installed if not already available.

  1. GATK4
  2. PICARD tools
  3. STAR Aligner
  4. Samtools
  5. Bcftools
  6. Tabix
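
Before going further, it can help to confirm that the companion utilities are reachable from your shell. The following is a minimal sketch using `command -v`; the function name `check_tools` is hypothetical, and the executable names in the usage line (`gatk`, `STAR`, `samtools`, `bcftools`, `tabix`) are assumptions about how these tools are commonly installed. Picard is usually distributed as a jar rather than an executable, so it is not covered by this check.

```shell
# Hypothetical helper: report which of the named executables are missing from PATH.
# Returns 0 if every tool is found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "MISSING: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage (executable names are assumptions; adjust to your installation):
# check_tools gatk STAR samtools bcftools tabix
```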

Download and install the latest release of CTAT-mutations from the GitHub CTAT-mutations release page.

Option 3: Installation via Galaxy toolshed

Please follow the Galaxy Toolshed installation instructions.

Required Environment Variables

The CTAT-mutations pipeline depends on the following environment variables being set:

   export GATK_HOME=/path/to/GATK
   export PICARD_HOME=/path/to/picard
   export CTAT_GENOME_LIB=/path/to/installed/ctat_genome_lib

Note, for routine use, you should set these values permanently in your ~/.bashrc file, or otherwise ensure that they are set before each invocation of the CTAT-mutations pipeline.
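
A quick way to catch a missing or mistyped variable is to check all three before launching the pipeline. The sketch below is illustrative only: the function names `check_dir_var` and `check_ctat_env` are hypothetical, and it assumes each variable should point to an existing directory.

```shell
# Hypothetical helper: check that a variable holds the path of an existing directory.
# $1 = variable name (used in messages), $2 = its current value.
check_dir_var() {
    if [ -z "$2" ]; then
        echo "$1 is not set" >&2
        return 1
    fi
    if [ ! -d "$2" ]; then
        echo "$1 points to a missing directory: $2" >&2
        return 1
    fi
    return 0
}

# Check the three variables named above; returns nonzero if any check fails.
check_ctat_env() {
    status=0
    check_dir_var GATK_HOME "${GATK_HOME:-}" || status=1
    check_dir_var PICARD_HOME "${PICARD_HOME:-}" || status=1
    check_dir_var CTAT_GENOME_LIB "${CTAT_GENOME_LIB:-}" || status=1
    return "$status"
}

# Example usage:
# check_ctat_env && echo "environment looks ready"
```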

CTAT-mutations Genome Lib Installation

Step 1: Genome lib setup

The CTAT-mutation lib integrates with the standard CTAT genome lib that's leveraged as part of the Trinity CTAT project. Instructions below describe how to set up your CTAT genome lib and integrate the mutation lib. It's pretty easy.

Trinity CTAT supports both human genome builds hg19 and hg38, so you can choose whichever version you prefer to operate from, or separately install both so you have flexibility to use either.

Hg19 setup

  1. Download GRCh37_v19_CTAT_lib_Feb092018

Note, if you already have a CTAT genome lib installed for use with other Trinity CTAT project utilities, feel free to use the genome lib you already have.

  2. Download the CTAT mutation resource for hg19

  3. Uncompress GRCh37_v19_CTAT_lib_Feb092018.plug-n-play.tar.gz

    tar -xvf GRCh37_v19_CTAT_lib_Feb092018.plug-n-play.tar.gz

  4. Move mutation_lib.hg19.tar.gz into GRCh37_v19_CTAT_lib_Feb092018/

OR

Hg38 setup

  1. Download GRCh38_v27_CTAT_lib_Feb092018

Note, if you already have a CTAT genome lib installed for use with other Trinity CTAT project utilities, feel free to use the genome lib you already have.

  2. Download the CTAT mutation resource for hg38

  3. Uncompress GRCh38_v27_CTAT_lib_Feb092018.plug-n-play.tar.gz

    tar -xvf GRCh38_v27_CTAT_lib_Feb092018.plug-n-play.tar.gz

  4. Move mutation_lib.hg38.tar.gz into GRCh38_v27_CTAT_lib_Feb092018/
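
The uncompress and move steps for either build can be sketched as a single shell function. This is a rough illustration, not part of CTAT-mutations: the function name `setup_genome_lib` is hypothetical, it expects both downloaded tarballs to sit in the current directory, and it assumes the genome lib archive unpacks to a directory with the name used in the steps above.

```shell
# Hypothetical helper: unpack the CTAT genome lib for one build and move the
# matching mutation resource into it. Run it from the download directory.
# $1 = build, either hg19 or hg38.
setup_genome_lib() {
    case "$1" in
        hg19) lib=GRCh37_v19_CTAT_lib_Feb092018 ;;
        hg38) lib=GRCh38_v27_CTAT_lib_Feb092018 ;;
        *) echo "usage: setup_genome_lib hg19|hg38" >&2; return 2 ;;
    esac
    tar -xf "$lib.plug-n-play.tar.gz" || return 1
    # Assumption: the archive unpacks to a directory named $lib.
    if [ ! -d "$lib" ]; then
        echo "expected directory $lib after extraction" >&2
        return 1
    fi
    mv "mutation_lib.$1.tar.gz" "$lib/"
}

# Example usage:
# setup_genome_lib hg38
```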

Step 2: Download COSMIC Resources

Due to licensing requirements, we cannot simply provide COSMIC data resources as part of our mutation lib. You'll need to obtain them separately, but it's fairly straightforward to do, and is free for academics.

Next, download the required COSMIC resources into this directory. Depending on the genome version you need, download either COSMIC's hg38 or COSMIC's hg19 release. You will need two files: COSMIC Mutation Data (CosmicMutantExport.tsv.gz) and the COSMIC Coding Mutation VCF File (CosmicCodingMuts.vcf.gz). Please note that you must register and log in to their service for the download to succeed.
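
Because the download requires a login, it is worth confirming that both files arrived intact before moving on. The sketch below checks that each file exists and passes `gzip -t`; the function name `check_cosmic_files` is hypothetical, and the check says nothing about whether the contents match the genome build you intended.

```shell
# Hypothetical helper: confirm both COSMIC downloads are present in a directory
# and are intact gzip files. $1 = directory to check (defaults to the current one).
check_cosmic_files() {
    dir=${1:-.}
    status=0
    for f in CosmicMutantExport.tsv.gz CosmicCodingMuts.vcf.gz; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            status=1
        elif ! gzip -t "$dir/$f" 2>/dev/null; then
            echo "not a valid gzip file: $dir/$f" >&2
            status=1
        fi
    done
    return "$status"
}

# Example usage:
# check_cosmic_files . && echo "COSMIC files look intact"
```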

Step 3: Mutation lib integration

Once you have downloaded CosmicMutantExport.tsv.gz and CosmicCodingMuts.vcf.gz (hg38 or hg19), proceed with the mutation lib integration step, which integrates the mutation resource with CTAT_GENOME_LIB (the "GRCh37_v19_CTAT_lib_Feb092018" or "GRCh38_v27_CTAT_lib_Feb092018" directory set up in Step 1). The integration script is in the 'mutation_lib_prep' directory of the ctat-mutations repo, matching the path used in the command below.

    # Make sure the PICARD_HOME environment variable points to your Picard installation
    export PICARD_HOME=/path/to/picard

    # Integrate the CTAT mutation lib with the CTAT genome lib
    # (for hg38, pass GRCh38_v27_CTAT_lib_Feb092018/ instead)
    python ctat-mutations/mutation_lib_prep/ctat-mutation-lib-integration.py \
         --CosmicMutantExport CosmicMutantExport.tsv.gz \
         --CosmicCodingMuts CosmicCodingMuts.vcf.gz \
         --genome_lib_dir GRCh37_v19_CTAT_lib_Feb092018/

Congratulations! Your CTAT genome lib is now fully configured for use with the ctat-mutations pipeline.