We can use Fws to assess the multiplicity of infection from P. falciparum and P. vivax whole-genome data. For more info on Fws, see here.
Because the Fws metric was introduced without an associated tool for calculating it, I had to program it myself in R.
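For readers who want to see the shape of the calculation, here is a minimal sketch in Python. It is not the R code mentioned above. It assumes biallelic SNPs, takes per-sample reference and alternate read counts, and uses the simple ratio-of-means form Fws = 1 - Hw/Hs, where Hw is within-sample heterozygosity and Hs is population heterozygosity. Published implementations may bin SNPs by minor allele frequency first; that step is omitted here. The names `heterozygosity` and `fws_per_sample` and the input layout are assumptions made for illustration.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fws_per_sample(ref_counts, alt_counts):
    """Estimate Fws = 1 - mean(Hw) / mean(Hs) for each sample.

    ref_counts / alt_counts: dict mapping sample name -> list of read counts,
    one entry per SNP, in the same SNP order for every sample (assumed layout).
    """
    samples = list(ref_counts)
    n_snps = len(ref_counts[samples[0]])

    # Within-sample reference-allele frequency at each SNP (None if no reads).
    freqs = {}
    for s in samples:
        freqs[s] = [
            r / (r + a) if (r + a) > 0 else None
            for r, a in zip(ref_counts[s], alt_counts[s])
        ]

    # Population heterozygosity per SNP, from the mean within-sample frequency.
    hs = []
    for i in range(n_snps):
        called = [freqs[s][i] for s in samples if freqs[s][i] is not None]
        hs.append(heterozygosity(sum(called) / len(called)) if called else None)

    result = {}
    for s in samples:
        pairs = [
            (heterozygosity(freqs[s][i]), hs[i])
            for i in range(n_snps)
            if freqs[s][i] is not None and hs[i] is not None
        ]
        total_hs = sum(h for _, h in pairs)
        # A ratio of sums equals a ratio of means over the same set of SNPs.
        result[s] = 1.0 - sum(h for h, _ in pairs) / total_hs if total_hs > 0 else float("nan")
    return result


# Toy data: A and B are clonal with opposite alleles, C is an even mixture.
ref = {"A": [20, 0], "B": [0, 30], "C": [10, 12]}
alt = {"A": [0, 25], "B": [18, 0], "C": [10, 12]}
print(fws_per_sample(ref, alt))
```

In this toy input the clonal samples carry no within-sample heterozygosity, so their Fws sits at the top of the range, while the evenly mixed sample is as diverse as the population and sits at the bottom.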
An initial look at our P. falciparum and P. vivax genomes (after depth and quality filtering, though we still haven't received our last batch of P. falciparum genome data) suggests that P. vivax infections are much more complex than P. falciparum infections: only 7/41 P. falciparum infections are multiclonal, compared with ~40/69 P. vivax infections. And remember, we purposefully chose MOI=1 P. vivax infections as much as possible for WGS at the outset of this project.
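One quick way to check that a gap like 7/41 versus ~40/69 is larger than chance alone would produce is a two-sided Fisher's exact test on the 2x2 table of multiclonal and monoclonal counts. The sketch below uses only the Python standard library. The function name `fisher_exact_two_sided` is made up for illustration, and the P. vivax count is approximate in the notes above, so any resulting p-value should be read as indicative only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so ties with the observed table are included.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Rows: P. falciparum, P. vivax. Columns: multiclonal, monoclonal.
p_value = fisher_exact_two_sided(7, 41 - 7, 40, 69 - 40)
print(f"two-sided p = {p_value:.3g}")
```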
This big difference could perhaps be due to two things:
- Decreased read depth in the P. falciparum samples compared to the P. vivax samples? Need to test this... is there a correlation between sample coverage and Fws? It looks like there is no correlation at all between read depth and Fws. UPDATE: I bootstrapped over SNPs to check this. The conclusions held up.
- Most P. falciparum variants that pass filter fall in genes, which is not as much the case for P. vivax? P. falciparum coverage is enriched over exons and depleted over non-coding regions, but the same difference between species is still roughly seen when you examine only the non-coding regions.
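The two checks mentioned in the first bullet can be sketched roughly as follows. This is an illustrative Python version, not the analysis actually run. It assumes you already have per-SNP within-sample heterozygosity (`hw`) and population heterozygosity (`hs`) for a sample, plus per-sample mean coverage and Fws values; all names are hypothetical. The bootstrap resamples SNPs with replacement and reports a percentile interval for Fws.

```python
import random


def bootstrap_fws(hw, hs, n_boot=1000, seed=0):
    """Percentile 95% interval for Fws = 1 - sum(Hw)/sum(Hs), resampling SNPs."""
    rng = random.Random(seed)
    n = len(hw)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # SNPs drawn with replacement
        hs_sum = sum(hs[i] for i in idx)
        if hs_sum > 0:
            estimates.append(1.0 - sum(hw[i] for i in idx) / hs_sum)
    if not estimates:
        raise ValueError("no bootstrap replicate had non-zero population heterozygosity")
    estimates.sort()
    lo = estimates[int(0.025 * len(estimates))]
    hi = estimates[min(len(estimates) - 1, int(0.975 * len(estimates)))]
    return lo, hi


def pearson_r(x, y):
    """Pearson correlation, e.g. between per-sample mean coverage and Fws."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5
```

If a sample's bootstrap interval stays on the same side of whatever clonality cutoff is in use, its call does not hinge on a handful of SNPs. A `pearson_r` near zero between coverage and Fws would be consistent with read depth not driving the difference between species.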
How could I test these hypotheses?
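One possible test of the read-depth hypothesis, offered as a suggestion rather than something already done here, is to downsample the reads at every SNP to a common depth in both species and then recompute Fws on the thinned counts. The Python sketch below draws reads without replacement at a single site. `downsample_counts` and `target_depth` are hypothetical names, and sites shallower than the target are dropped.

```python
import random


def downsample_counts(ref, alt, target_depth, rng):
    """Thin one SNP's read counts to exactly target_depth reads.

    Reads are drawn without replacement, so the thinned allele balance follows
    a hypergeometric distribution around the original balance.
    Returns (ref, alt) after thinning, or None if the site is too shallow.
    """
    depth = ref + alt
    if depth < target_depth:
        return None
    # Label reads 0..depth-1; the first `ref` labels are reference reads.
    kept = rng.sample(range(depth), target_depth)
    ref_kept = sum(1 for i in kept if i < ref)
    return ref_kept, target_depth - ref_kept


rng = random.Random(42)
site = downsample_counts(ref=60, alt=40, target_depth=20, rng=rng)
print(site)  # a (ref, alt) pair summing to 20
```

If the gap in Fws between P. falciparum and P. vivax persists once every site has the same depth, differences in coverage are unlikely to explain it.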