parasitehunter edited this page Sep 4, 2014 · 9 revisions

Link to Bowtie2 manual.

Bowtie2 is the aligner of choice for our application (small eukaryotic genomes), according to our collaborators (JB and DD). Here is a collection of thoughts on how Bowtie2 should be implemented in our context:

  • SAM/BAM headers turn out to be very important for GATK. Specifically, the @RG (read group) line is crucial, and Bowtie2 does not include it in the SAM header by default. Bowtie2 should be told explicitly (with its --rg-id and --rg options) to include the most important components of the @RG line: RGID, RGPL, RGLB, RGSM (these are Picard's names; in the header itself they appear as the ID, PL, LB and SM tags). Adding the header at this step means we don't have to add this information later, for example with Picard's AddOrReplaceReadGroups.
  • If you view a SAM file using cat foo.sam | more, the header is obvious (painfully so if there are thousands of partially assembled contigs in your genome reference). If you view a BAM file using samtools view foo.bam | more, the header is not shown, but that doesn't mean it is missing: samtools view omits the header by default. You can include it in the output by specifying samtools view -h foo.bam | more.
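
As a rough sketch of the two points above, the snippet below assembles a Bowtie2 command line that writes an @RG line into the SAM header. The read group values (run1, sampleA, libA, ILLUMINA) and the file names (ref_index, reads_1.fq, reads_2.fq) are placeholders made up for illustration, not our actual samples. It only prints the command, so it can be checked before anything is run.

```shell
# Hypothetical sample metadata; replace with real values.
# In this sketch the values must not contain whitespace.
rg_id="run1"
rg_sm="sampleA"
rg_lb="libA"
rg_pl="ILLUMINA"

# --rg-id sets the ID field and is what makes Bowtie2 emit the @RG line;
# each --rg adds one more TAG:VALUE field to that line.
rg_args="--rg-id $rg_id --rg SM:$rg_sm --rg LB:$rg_lb --rg PL:$rg_pl"

# ref_index, reads_1.fq, reads_2.fq and foo.sam are placeholder names.
bowtie2_cmd="bowtie2 -x ref_index -1 reads_1.fq -2 reads_2.fq $rg_args -S foo.sam"

# Print the command for inspection instead of running it.
echo "$bowtie2_cmd"
```

Once the printed command has been run with real inputs, samtools view -H foo.sam | grep '^@RG' should show the read group line (capital -H prints only the header, lowercase -h prints the header followed by the alignments).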