-
Notifications
You must be signed in to change notification settings - Fork 3
myLD
After attempting for quite a while to code my own bootstrapping and LD calculator in R, I gave up and found out that VCFtools
already has this functionality. See the ld_calc.sh
script for the VCFtools
commands, and the ld_plotter.r
for the plotting commands I ended up using.
As a reminder, I'm bootstrapping from the entire Pf Cambodia WGS population and calculating r2 for each of these bootstrap values, just to prove that the CP1, 3, 4 groups are more clonal than would be expected by chance.
Once I got this up and running, I thought that it is odd that r2 for CP2 never decays all the way to zero (at least out to 200000 bps apart). It decays to like 0.1 or 0.2 or something, but not below that. Could either be because this CP2 population (the "founder" population) is still pretty clonal in some areas of the genome or because what I've done is somehow messed up (but I feel like it's roughly correct because it's giving the expected results and we see an immediate decay from 0 on out).
Anyway, so I'm testing this pipeline out on my P. vivax VCF as well. It has many more SNPs, so the filesizes get real big real fast, but I want to see whether it decays to zero between SNPs that are far away. Plotting decay, it does seem that Pv LD decays to zero or nearly zero.