Code, scripts and additional data pertaining to "A large and diverse autosomal haplotype is associated with sex-linked colour polymorphism in the guppy"
https://www.nature.com/articles/s41467-022-28895-4
Please contact me at parisjosephine<@>gmail.com with any queries
DNA sequencing data are available at the European Nucleotide Archive (ENA) under the Study Accession PRJEB36506: Pool-seq Iso-Y data (SAMEA6512722-SAMEA6512725); whole-genome sequencing data for Paria (SAMEA8750557-SAMEA8750565); long-read PacBio data for Iso-Y6 (SAMEA8795870-SAMEA8795872). Whole-genome sequencing data for Upper Marianne individuals are available from the ENA under the Study Accession PRJEB10680 (SAMEA3649957-SAMEA3649973).
Please refer to the file sample_metadata.txt for more information on read names and sample accessions.
SNP calling was performed using the pipeline described here: https://github.com/josieparis/gatk-snp-calling
Colormesh can be found here: https://github.com/J0vid/Colormesh
The Colormesh publication, which includes example data, is available at https://onlinelibrary.wiley.com/doi/full/10.1002/ece3.7992
01_DAPC_heatmaps.R
02_phenotype_data_PCA_plots.R
03_phenotype_perm.R
The LG12 coordinates were lifted over from the genome available in Fraser et al 2020 GBE (https://doi.org/10.1093/gbe/evaa187) to add additional contigs placed by Deborah Charlesworth's genetic maps:
The function for liftover of chr12 coordinates is: update_chr12_liftover.R
The file updates are run in an R script (bash is used afterwards to sort the VCF files):
chr12_liftover.R
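As a rough illustration of what a coordinate liftover followed by a re-sort involves, here is a minimal base-R sketch. It is not the code in update_chr12_liftover.R: the `segments` table, its column names, the `liftover_pos` function and all coordinates are hypothetical, and the example assumes that old coordinates map to the new assembly in contiguous blocks that may be shifted or inverted.

```r
# Hypothetical segment map (assumption): each row describes where a block of
# old chr12 coordinates lands in the updated assembly, and in which orientation.
segments <- data.frame(
  old_start = c(1, 1001, 2001),
  old_end   = c(1000, 2000, 3000),
  new_start = c(1, 1501, 2501),
  strand    = c("+", "-", "+")
)

# Translate old positions to new positions; positions outside every
# segment are returned as NA. Assumes segments are sorted by old_start.
liftover_pos <- function(pos, segments) {
  idx <- findInterval(pos, segments$old_start)
  idx[idx == 0] <- NA
  seg <- segments[idx, ]
  inside <- !is.na(idx) & pos <= seg$old_end
  offset <- pos - seg$old_start
  seg_len <- seg$old_end - seg$old_start
  new_pos <- ifelse(seg$strand == "+",
                    seg$new_start + offset,            # same orientation
                    seg$new_start + seg_len - offset)  # inverted block
  new_pos[!inside] <- NA
  new_pos
}

# Toy records standing in for VCF rows (assumption).
records <- data.frame(pos = c(500, 1001, 2000, 2500, 4000))
records$new_pos <- liftover_pos(records$pos, segments)

# Inverted blocks change the record order, so re-sort on the new coordinate
# and drop records that could not be lifted over.
lifted <- records[!is.na(records$new_pos), ]
lifted <- lifted[order(lifted$new_pos), ]
print(lifted)
```

The re-sort step matters because an inverted block reverses the order of the records it contains, which is why the VCF files need sorting after the coordinates are updated.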
Poolfstat is available here: https://cran.r-project.org/web/packages/poolfstat/index.html
Script to calculate pairwise FST and allele freqs using poolfstat: poolfstat_FST_AFs.R
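For readers who want to see the shape of a pairwise FST calculation without installing anything, the sketch below uses base R only. It deliberately swaps in a simpler technique: Hudson's FST estimator computed as a ratio of averages across SNPs, applied to allele frequencies. It is not the estimator that poolfstat implements, it ignores read-sampling variance in pooled data, and the `freqs` matrix, `hap_sizes` vector and pool names are made up for illustration.

```r
# Hypothetical allele frequencies (assumption): rows are SNPs, columns are pools.
freqs <- cbind(
  poolA = c(0.9, 0.1, 0.5),
  poolB = c(0.8, 0.2, 0.5),
  poolC = c(0.1, 0.9, 0.5)
)

# Hypothetical haploid sample size of each pool (assumption).
hap_sizes <- c(poolA = 40, poolB = 40, poolC = 40)

# Hudson's FST, ratio of averages: sum the per-SNP numerators and
# denominators separately, then divide.
hudson_fst <- function(p1, p2, n1, n2) {
  num <- (p1 - p2)^2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
  den <- p1 * (1 - p2) + p2 * (1 - p1)
  sum(num) / sum(den)
}

# Evaluate every pair of pools.
pool_pairs <- combn(colnames(freqs), 2)
fst <- apply(pool_pairs, 2, function(pr) {
  hudson_fst(freqs[, pr[1]], freqs[, pr[2]],
             hap_sizes[[pr[1]]], hap_sizes[[pr[2]]])
})
names(fst) <- paste(pool_pairs[1, ], pool_pairs[2, ], sep = "_vs_")
print(round(fst, 4))
```

With small sample sizes the estimate can be slightly negative for pools with near-identical frequencies, which is expected behaviour for this estimator. For the published analysis, use poolfstat_FST_AFs.R.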
Z_FST_PCA_analysis.R
polarise_plot_AFs.R
polarised_AF_densities.R
CPD_analysis.R
raw_pairwise_fst.R
To calculate pi in 10 kb windows we use a custom function written by Jim Whiting: pool_pi.R
To run the function and create the figures: Z_pi_PCA_analysis.R
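To give a sense of what a windowed pi calculation looks like, here is a minimal base-R sketch. It is not the pool_pi.R function and may differ from it in important ways: it assumes biallelic SNPs with per-pool read counts, uses read depth as the sample size in the small-sample correction, and divides by the full window length, which treats every site in the window as callable. The `snps` data frame, its column names and the `window_pi` function are hypothetical.

```r
# Hypothetical input (assumption): one row per biallelic SNP in a single pool,
# with reference and alternate read counts.
snps <- data.frame(
  pos       = c(1200, 5300, 9800, 12500, 27000),
  ref_count = c(30, 10, 25, 40, 18),
  alt_count = c(10, 30, 25, 0, 22)
)

window_pi <- function(snps, window_size = 10000) {
  depth <- snps$ref_count + snps$alt_count
  p <- snps$ref_count / depth
  # Per-site heterozygosity with a small-sample correction based on depth.
  site_pi <- 2 * p * (1 - p) * depth / (depth - 1)
  # Zero-based window index for each SNP (window 0 covers positions 1..window_size).
  win <- floor((snps$pos - 1) / window_size)
  sums <- tapply(site_pi, win, sum)
  win_id <- as.numeric(names(sums))
  data.frame(
    start  = win_id * window_size + 1,
    end    = (win_id + 1) * window_size,
    n_snps = as.vector(table(win)),
    # Sum of per-site values divided by window length (assumes all sites callable).
    pi     = as.vector(sums) / window_size
  )
}

pi_windows <- window_pi(snps)
print(pi_windows)
```

Windows without any SNP are simply absent from the output here; a fuller implementation would report them and would also account for the number of sites with adequate coverage in each window.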
LDheatmap package can be found here: https://cran.r-project.org/web/packages/LDheatmap/index.html
LD_heatmap.R
Lostruct can be found here: https://github.com/petrelharp/local_pca
lostruct_localPCA.R
PopGenome can be found here: https://cran.r-project.org/web/packages/PopGenome/index.html
popgenome_nat_data.R
interchromLD can be found here: https://github.com/josieparis/interchromLD
genotype_plot can be found here: https://github.com/josieparis/genotype_plot
GO_KEGG_analysis_LG1.R
interchromLD