COVID Phylogeny
Creating a tool for phylogenetic analysis of COVID-19 sequence data
For the time being, there is a #phylogeny channel on the Slack group (check out the [email protected] group for the invitation link). During the BioHackathon, we'll update this section.
Please check out the Datasets and Tools page.
If you have any new resources in mind, please add them there directly.
Tools (brainstorm section) - particular tools can be added on the Resources page
- multiple sequence alignment tools, e.g. Clustal Omega, MUSCLE, MAFFT
- phylogenetic inference tools, e.g. PhyML, RAxML, IQ-TREE, MrBayes (Bayesian), BEAST or BEAST2 (Bayesian)
- sequence evolution rate analysis tools, e.g. PAML, HyPhy (phyphy: Python wrapper for HyPhy)
- tree visualization tools, e.g. ETE Toolkit (Python API; has a wrapper for PAML)
- Working on the phylogeny of COVID-19 (similar to this analysis, and more connected to this article in terms of receptors and conserved sites).
- To be implemented as a rerunnable workflow for when new sequence data become available
- Easily deployable, runnable in public cloud
- Connected to other COVID-19 analysis workflows and their emerging I/O standards
- Comparing phylogenies and compositional features (e.g. G+C, k-mer, and codon composition)
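As a rough illustration of the compositional features mentioned in the last item, here is a minimal Python sketch that computes G+C content, k-mer counts, and in-frame codon counts for a single sequence. The function names and the toy sequence are hypothetical, and the choice to skip windows containing ambiguous bases (such as N) is an assumption, not a project decision.

```python
from collections import Counter

UNAMBIGUOUS = set("ACGT")


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    counts = Counter(base for base in seq.upper() if base in UNAMBIGUOUS)
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total if total else 0.0


def kmer_counts(seq: str, k: int) -> Counter:
    """Count overlapping k-mers, skipping windows with an ambiguous base."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= UNAMBIGUOUS:
            counts[kmer] += 1
    return counts


def codon_counts(cds: str) -> Counter:
    """Count in-frame codons of a coding sequence, reading from position 0."""
    cds = cds.upper()
    counts = Counter()
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if set(codon) <= UNAMBIGUOUS:
            counts[codon] += 1
    return counts


# Toy sequence (not real SARS-CoV-2 data), with one ambiguous base.
toy = "ATGGCCNATTAA"
print(gc_content(toy))
print(kmer_counts(toy, 2).most_common(3))
print(codon_counts(toy))
```

Normalizing these counts to frequencies would give per-genome feature vectors that could then be compared across sequences alongside the trees.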
The current list of SARS-CoV-2 sequences in GenBank can be used for this purpose. If developed as a workflow, it can connect to the "main" public sequence resource deliverable/task, and possibly also to the biostatistics and Machine Learning ones.
For the technical implementation, it would make sense to build this as a rerunnable workflow (e.g. in Snakemake or CWL), which also connects it to the Workflows activity. As the available sequence data continue to grow, some analysis steps (for example, BranchSiteREL or similar analyses) will become computationally expensive, so we should plan for scaling out to HPC or cloud infrastructure.
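To make the "rerunnable" idea concrete, the following Python sketch shows the make-style staleness check that workflow engines such as Snakemake perform: a step is rerun only if its output is missing or older than its inputs. This is an illustration of the principle, not a replacement for Snakemake or CWL. The file names and the two-step chain (MAFFT alignment, then IQ-TREE inference) are assumptions, as is the `iqtree` executable name, which may be `iqtree2` depending on the installed version. The sketch only plans commands; it does not execute them.

```python
import shlex
from pathlib import Path


def is_stale(output: Path, inputs: list[Path]) -> bool:
    """True if the output is missing or older than any input."""
    if not output.exists():
        return True
    out_mtime = output.stat().st_mtime
    return any(inp.stat().st_mtime > out_mtime for inp in inputs)


def plan(sequences: Path, workdir: Path) -> list[dict]:
    """Describe a linear two-step pipeline (hypothetical file names)."""
    alignment = workdir / "aligned.fasta"
    # By default IQ-TREE writes its tree to <alignment>.treefile.
    tree = workdir / "aligned.fasta.treefile"
    seq_q, aln_q = shlex.quote(str(sequences)), shlex.quote(str(alignment))
    return [
        {"name": "align", "inputs": [sequences], "output": alignment,
         "cmd": f"mafft --auto {seq_q} > {aln_q}"},
        {"name": "tree", "inputs": [alignment], "output": tree,
         "cmd": f"iqtree -s {aln_q}"},
    ]


def pending_steps(steps: list[dict]) -> list[dict]:
    """Return the steps to rerun, assuming each step feeds the next."""
    pending, upstream_rerun = [], False
    for step in steps:
        if upstream_rerun or is_stale(step["output"], step["inputs"]):
            pending.append(step)
            upstream_rerun = True  # everything downstream must rerun too
    return pending


if __name__ == "__main__":
    # Dry run: print the commands that would be executed.
    for step in pending_steps(plan(Path("sequences.fasta"), Path("results"))):
        print(f"[{step['name']}] {step['cmd']}")
```

When new sequences are added to the input file, its modification time changes and both steps become pending again. A real workflow engine adds what this sketch lacks: parallel execution, cluster and cloud backends, and per-step software environments.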
- Fotis Psomopoulos
- Rutger Vos
- Tomas Masson
- Haruo Suzuki
- Festus Nyasimi
- Salvador Capella-Gutierrez
- Yrjö Koski
- Erik Garrison