GAMMA: Gene expression plasticity, genetic variation and fatty acid remodelling in divergent populations of a tropical bivalve species
An integrated workflow based on genome mapping and differential expression (DE) gene assessment to conduct RNA-seq data analyses on cluster machines
For this section, you will need to modify the cluster parameters, the $WORKDIR variable and the paths to all software according to your own setup.
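As a rough illustration of what typically needs adapting, the sketch below shows the kind of header a PBS script might carry. The job name, queue, resource values, default paths and the `TRIMMOMATIC_JAR` variable are all hypothetical placeholders and are not taken from the repository scripts; the exact resource syntax also depends on your scheduler.

```shell
#!/usr/bin/env bash
# Hypothetical PBS header: adapt every directive to your own cluster.
#PBS -N example_step
#PBS -q your_queue
#PBS -l walltime=24:00:00
#PBS -l mem=32gb

# Project root used by the scripts (placeholder default, override as needed).
WORKDIR="${WORKDIR:-$HOME/gamma_project}"

# Example of a software path to adapt (placeholder, not a real location).
TRIMMOMATIC_JAR="${TRIMMOMATIC_JAR:-/path/to/trimmomatic-0.36.jar}"

echo "Working directory: $WORKDIR"
echo "Trimmomatic jar:   $TRIMMOMATIC_JAR"
```

Using `${VAR:-default}` lets you export `WORKDIR` once in your shell and leave the scripts untouched, which may be more convenient than editing each file.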
We used Trimmomatic v0.36. For this, run:
We used GSNAP (GMAP v2021.08.25). For this, run:
qsub 00_scripts/01_diff_expression/02_gmap_index_genome.pbs
We used GMAP v2021.08.25.
For this, run:
We used HTSeq v0.9.1.
For this, run:
For this section, we followed the GATK best practices for SNP identification from RNA-seq data. Please see:
We used GATK v4.0.3.0.
For this, run:
qsub 00_scripts/02_snps/01_gatk_prepare_ref.pbs
qsub 00_scripts/02_snps/05_combine_gvcf.pbs
We used VCFtools v0.1.16 and Beagle v4.0_06Jun17, respectively.
We applied the following filtering thresholds:
- Keep only SNP variants (no complex events).
- A minimum depth (DP) of 10 reads per locus per genotype within an individual (genotypes below this depth are set to "NA").
- A minor allele frequency (MAF) of at least 10% in the sample set.
- Less than 10% missing data (miss) at a locus (loci above this threshold are removed).
- Only biallelic loci.
qsub 00_scripts/02_snps/06_variant_filtering.pbs
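For readers who want to see how the thresholds listed above might translate into VCFtools options, here is a minimal sketch. It is not the content of `06_variant_filtering.pbs`; the input and output names are hypothetical, and `--remove-indels` is only an approximation of the "SNPs only" criterion. The command is printed first and only executed if `vcftools` and the input file are available.

```shell
#!/usr/bin/env bash
# Hypothetical file names: replace with your own combined VCF and output prefix.
IN_VCF="${IN_VCF:-combined.vcf.gz}"
OUT_PREFIX="${OUT_PREFIX:-filtered_snps}"

# Thresholds from the list above:
#   --remove-indels                 drop sites containing indels
#   --min-alleles 2 --max-alleles 2 keep biallelic loci only
#   --minDP 10                      genotypes with depth < 10 are set to missing
#   --maf 0.1                       minor allele frequency of at least 10%
#   --max-missing 0.9               require >= 90% genotyped (<= 10% missing)
FILTER_OPTS="--remove-indels --min-alleles 2 --max-alleles 2 --minDP 10 --maf 0.1 --max-missing 0.9"

echo "vcftools --gzvcf $IN_VCF $FILTER_OPTS --recode --recode-INFO-all --out $OUT_PREFIX"

# Run only when the tool and the input are actually present.
if command -v vcftools >/dev/null 2>&1 && [ -f "$IN_VCF" ]; then
  # FILTER_OPTS is intentionally unquoted so it splits into separate options.
  vcftools --gzvcf "$IN_VCF" $FILTER_OPTS --recode --recode-INFO-all --out "$OUT_PREFIX"
fi
```

Note that `--max-missing` takes the fraction of genotypes that must be present, so 0.9 corresponds to at most 10% missing data per locus.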
qsub 00_scripts/02_snps/04_combine_gvcf.pbs
We provide all the scripts needed to explore the data further (differential expression, co-expression network analysis, outlier SNPs, plasticity quantification).