Christian Parobek edited this page Oct 15, 2015 · 37 revisions

Welcome to the cambodiaWGS wiki!

I will document my analysis in this wiki. The associated scripts (and possibly results) will be kept in this repo.

###Getting Additional Populations:

###Steps to good SNP-call data:

  • HaplotypeCaller - notes from looking into GATK's HaplotypeCaller for variant calling.
  • genomeCoverage - calculated with bedtools and/or GATK's DepthOfCoverage.
  • variantCalling - the pipeline is documented in the gatk_pipeline repo; the HaplotypeCaller updates are documented here.
  • weakestLinks - remove the lowest-coverage samples prior to downstream analysis. This gives us a shot at getting full haplotypes for all SNPs.
  • variantFiltering - modeled loosely after Manske et al.
  • characterizeCoverage
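
As a rough illustration of the weakestLinks idea, here is a minimal Python sketch that splits samples by mean depth. The function name `weakest_links`, the sample names, the depth values, and the cutoff of 10 are all hypothetical; in practice the per-site depths would come from bedtools or DepthOfCoverage output, and the cutoff would need to be chosen from the real coverage distribution.

```python
from statistics import mean


def weakest_links(depths, min_mean_depth):
    """Split samples into (kept, dropped) lists by mean depth.

    depths: dict mapping sample name -> list of per-site read depths.
    min_mean_depth: hypothetical cutoff; samples below it are dropped.
    """
    mean_depth = {sample: mean(d) for sample, d in depths.items()}
    kept = sorted(s for s, m in mean_depth.items() if m >= min_mean_depth)
    dropped = sorted(s for s, m in mean_depth.items() if m < min_mean_depth)
    return kept, dropped


# Toy depths, invented for illustration only.
toy_depths = {
    "sampleA": [30, 28, 35, 31],
    "sampleB": [2, 0, 3, 1],
    "sampleC": [12, 15, 9, 14],
}

kept, dropped = weakest_links(toy_depths, min_mean_depth=10)
print("kept:", kept)
print("dropped:", dropped)
```

Mean depth is only one possible criterion; the fraction of sites covered above some depth might be a better fit if the goal is complete haplotypes.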

###Population Differentiation:

###Selective Sweeps:

  • hapFLK - Andrew found this and recommended it. It looks to be sensitive and specific for sweeps even against complicated demographic backgrounds, because it builds extended haplotypes and then computes an Fst-like statistic on them.
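
For orientation, the sketch below computes plain per-SNP Fst between two populations from allele frequencies. This is not hapFLK's statistic (hapFLK works on haplotypes rather than single SNPs); it only shows the simpler quantity that the note above refers to. The function name `wright_fst` and the equal weighting of the two populations are assumptions made for illustration.

```python
def wright_fst(p1, p2):
    """Fst for one biallelic SNP from allele frequencies in two populations.

    Uses Fst = (Ht - Hs) / Ht with the two populations weighted equally
    (an assumption; real data usually need sample-size corrections).
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)                    # expected heterozygosity, pooled
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    if h_total == 0:
        return float("nan")  # SNP is monomorphic across both populations
    return (h_total - h_within) / h_total


# Toy frequencies, invented for illustration only.
print(wright_fst(0.5, 0.5))  # identical frequencies
print(wright_fst(0.8, 0.2))  # moderately differentiated
print(wright_fst(1.0, 0.0))  # fixed difference
```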

###Cool Images:

  • circos - Perl-based tool for making circular genome images.

###Bottlenecks:

  • Loss of Rare Alleles - If P. falciparum shows a greater loss of rare alleles than P. vivax, that may be evidence that P. falciparum has undergone a more severe genetic bottleneck than P. vivax.
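
One way to put a number on "loss of rare alleles" is the fraction of segregating sites whose minor allele is rare. The Python sketch below assumes alternate-allele counts per site are already available; the function name `rare_allele_fraction`, the 5% cutoff, and the toy counts are hypothetical and do not come from the actual data.

```python
def rare_allele_fraction(alt_counts, n_chromosomes, maf_cutoff=0.05):
    """Fraction of segregating sites with minor allele frequency <= maf_cutoff.

    alt_counts: alternate-allele count at each site.
    n_chromosomes: number of haploid genomes sampled at each site.
    """
    segregating = 0
    rare = 0
    for alt in alt_counts:
        minor = min(alt, n_chromosomes - alt)
        if minor == 0:
            continue  # monomorphic in this sample set, not informative
        segregating += 1
        if minor / n_chromosomes <= maf_cutoff:
            rare += 1
    return rare / segregating if segregating else float("nan")


# Toy counts for two made-up sample sets of 40 haploid genomes each.
toy_a = [1, 1, 10, 20, 0, 39]
toy_b = [1, 1, 1, 1, 15, 40]

frac_a = rare_allele_fraction(toy_a, 40)
frac_b = rare_allele_fraction(toy_b, 40)
print(frac_a, frac_b)
```

A lower fraction in one species would be consistent with a stronger bottleneck, but the comparison only makes sense if the two sample sets are the same size (or are downsampled to match) and are filtered the same way.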

###Demography

  • dadi - Infers population demographic parameters. This is what Hartl used in their MBE paper.

###Miscellaneous

  • snpEff - Predicts the functional effects of SNPs. Will be useful for subsetting variants by class for analysis.
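
A minimal sketch of subsetting variants by predicted class, assuming the VCF was annotated with snpEff's `ANN` INFO field (pipe-separated, with the effect term in the second position). Older snpEff output that uses the `EFF` field would need a different parser. The helper names and the toy VCF records, including the chromosome, gene, and transcript names, are hypothetical.

```python
def variant_classes(info):
    """Return the set of snpEff effect terms found in a VCF INFO string."""
    for field in info.split(";"):
        if field.startswith("ANN="):
            terms = set()
            for entry in field[4:].split(","):
                parts = entry.split("|")
                if len(parts) > 1:
                    # Combined effects are joined with '&' in one entry.
                    terms.update(parts[1].split("&"))
            return terms
    return set()


def subset_by_class(vcf_lines, wanted):
    """Yield VCF records annotated with at least one effect in `wanted`."""
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        cols = line.rstrip("\n").split("\t")
        if variant_classes(cols[7]) & wanted:
            yield line


# Toy records, invented for illustration only.
toy_vcf = [
    "##fileformat=VCFv4.1",
    "chr1\t100\t.\tA\tG\t50\tPASS\tDP=30;ANN=G|missense_variant|MODERATE|geneX|geneX|transcript|tX|protein_coding|1/1|c.10A>G|p.Thr4Ala|10/300|10/300|4/99||",
    "chr1\t200\t.\tC\tT\t50\tPASS\tDP=28;ANN=T|synonymous_variant|LOW|geneX|geneX|transcript|tX|protein_coding|1/1|c.21C>T|p.Gly7Gly|21/300|21/300|7/99||",
]

missense = list(subset_by_class(toy_vcf, {"missense_variant"}))
print(len(missense))
```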