fwdpy11 manual: https://molpopgen.github.io/fwdpy11/intro.html#
moments manual: https://momentsld.github.io/moments/
Run the following on Compute Canada to install fwdpy11 for the simulations:
module load StdEnv/2020
module load python/3.9.6
####Python >= 3.9 is the minimum required version
python3 -m venv virtual_env
source virtual_env/bin/activate
module load gsl/2.6
module load rust/1.70.0
export PATH=~/.cargo/bin:$PATH
####The export above puts cbindgen on the PATH
pip install --upgrade pip
pip install -r requirements.txt
If you only want to install moments, remove fwdpy11 from requirements.txt; running the last step (pip install -r requirements.txt) is then enough.
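After installation, a quick check from inside the virtual environment can confirm that the interpreter and packages are visible. The following is a minimal sketch, not part of this repository: the function name check_environment is hypothetical, and it only tests that the packages can be found, not that they work.

```python
import importlib.util
import sys


def check_environment(packages=("fwdpy11", "moments")):
    """Report whether the Python version and the given packages look usable."""
    # The install notes above state Python >= 3.9 as the minimum.
    report = {"python>=3.9": sys.version_info >= (3, 9)}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'ok' if ok else 'MISSING'}")
```

If only moments was installed, calling check_environment(("moments",)) skips the fwdpy11 check.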
src/simulation_moments.py
src/simulation_sweep.py
parameters.yaml
requirements.txt
pastCode/zarrMoment1000G.py
These are the functions I used to compute Dz from the 1000 Genomes VCFs. The code is probably not in an understandable form at its current stage.
zarrMoment1000G.py requires an input bed file containing the SNP positions of the 1000 Genomes data over a pre-defined range.
The pre-defined ranges serve two purposes: 1) separating different functional regions and 2) parallelizing jobs.
For each region, the genotypes are read from the VCF with scikit-allel's dask functions. For more on scikit-allel, see this tutorial: http://alimanfoo.github.io/2017/06/14/read-vcf.html
scikit-allel is no longer maintained. An alternative is sgkit: https://pystatgen.github.io/sgkit/latest/
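To illustrate how a bed file of pre-defined ranges can drive region-wise reading, here is a small sketch that turns bed intervals into tabix-style region strings. It is not the code in zarrMoment1000G.py: the function name bed_to_regions is hypothetical, and it assumes a standard bed layout (chrom, start, end in the first three columns, 0-based half-open coordinates), which may differ from the bed file actually used here.

```python
from typing import Iterator, TextIO


def bed_to_regions(bed: TextIO) -> Iterator[str]:
    """Yield one 'chrom:start-end' region string per bed interval."""
    for line in bed:
        line = line.strip()
        # Skip blank lines and common bed header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        # Assumption: bed is 0-based half-open, while region strings
        # are 1-based inclusive, so only the start is shifted.
        yield f"{chrom}:{int(start) + 1}-{int(end)}"
```

Each string could then be passed as the region argument of allel.read_vcf, which needs a tabix-indexed VCF for region queries. Because every region is independent, one job per region (or per group of regions) is a natural unit for parallelization.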
Before the moments calculation, various functions in zarrMoment1000G.py apply three levels of window parsing:
- the pre-defined region
- the pre-defined region is split into 0.04 cM bins
- if a 0.04 cM bin contains too many SNPs, memory consumption becomes very large, so the bin is further split into smaller bins (fewer than 5000 SNPs each) and the results are combined.

To do:
- The functions in zarrMoment1000G.py are composite and mashed together. Separate them into smaller, more reusable functional units.
- Combine all window parsing into a single, simpler chunked-array function, and add sanity checks.
- Use snake_case ("_") naming instead of camelCase, give variables and functions clear names, and add more documentation.
- Add type hints to the functions.
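
As a rough illustration of what a single chunked window-parsing function could look like, the sketch below splits the SNPs of one pre-defined region into 0.04 cM bins and then caps each bin at a maximum number of SNPs. It is not the existing implementation: the names split_into_windows and _chunk are hypothetical, the input is assumed to be a sorted list of genetic-map positions in cM (one per SNP), and the exact cap and bin-boundary handling are guesses.

```python
from typing import Iterator, List, Sequence, Tuple


def _chunk(
    bin_index: int, members: List[int], max_snps: int
) -> Iterator[Tuple[int, List[int]]]:
    """Split one bin's SNP indices into pieces of at most max_snps."""
    for start in range(0, len(members), max_snps):
        yield bin_index, members[start:start + max_snps]


def split_into_windows(
    cm_positions: Sequence[float],
    bin_width_cm: float = 0.04,
    max_snps: int = 5000,
) -> Iterator[Tuple[int, List[int]]]:
    """Yield (bin_index, snp_indices) chunks for one pre-defined region.

    cm_positions is assumed to be sorted in ascending order; bins are
    measured from the first SNP in the region.
    """
    if len(cm_positions) == 0:
        return
    origin = cm_positions[0]
    current_bin = 0
    members: List[int] = []
    for i, pos in enumerate(cm_positions):
        bin_index = int((pos - origin) // bin_width_cm)
        if bin_index != current_bin and members:
            # The previous bin is complete: emit it, capped at max_snps.
            yield from _chunk(current_bin, members, max_snps)
            members = []
        current_bin = bin_index
        members.append(i)
    # Emit the final bin.
    yield from _chunk(current_bin, members, max_snps)
```

Chunks that share a bin_index belong to the same 0.04 cM bin, so their per-chunk results would still need to be combined afterwards; that step is not shown here. Positions that fall exactly on a bin boundary are subject to floating-point rounding, which is one thing a sanity check could cover.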