COVID Phylogeny

Creating a tool for phylogenetic analysis of COVID-19 sequence data

Communication

For the time being, there is a #phylogeny channel on the Slack group (check out the [email protected] group for the invitation link). During the BioHackathon, we'll update this section.

Resources

Workflows

Data

Tools (brainstorm section)

Multiple sequence alignment tools, e.g. MUSCLE, MAFFT
Phylogenetic inference tools, e.g. RAxML
Evolutionary rate analysis tools, e.g. PAML, HyPhy
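
As a starting point for discussion, the first two tool categories could be chained roughly as follows. This is a minimal sketch, not an agreed design: the executable names (`mafft`, `raxmlHPC`), the `GTRGAMMA` model, the fixed seed, and all function names are assumptions for illustration. RAxML binaries in particular are often installed under other names (e.g. a threaded build), so the name would need adjusting per installation.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def mafft_command(fasta: Path) -> List[str]:
    """Build a MAFFT call; MAFFT writes the alignment to stdout."""
    return ["mafft", "--auto", str(fasta)]


def raxml_command(alignment: Path, run_name: str, seed: int = 12345) -> List[str]:
    """Build a RAxML call for a single ML tree search.

    The model and seed are placeholder choices (assumptions), not recommendations.
    """
    return [
        "raxmlHPC",
        "-s", str(alignment),  # input alignment
        "-n", run_name,        # suffix used for RAxML output files
        "-m", "GTRGAMMA",      # substitution model (assumed)
        "-p", str(seed),       # random seed for the parsimony starting tree
    ]


def run_pipeline(fasta: Path, alignment: Path, run_name: str) -> None:
    """Align the sequences, then infer a tree from the alignment."""
    for tool in ("mafft", "raxmlHPC"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    # Capture MAFFT's stdout in the alignment file.
    with alignment.open("w") as out:
        subprocess.run(mafft_command(fasta), stdout=out, check=True)
    subprocess.run(raxml_command(alignment, run_name), check=True)
```

Keeping command construction separate from execution makes the steps easy to inspect and to port into a workflow engine later.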

Ideas for projects

Working on the phylogeny of COVID-19 (similar to this analysis, and more connected to this article in terms of receptors and conserved sites).
To be implemented as a rerunnable workflow for when new sequence data become available
Easily deployable and runnable in a public cloud
Connected to other COVID-19 analysis workflows and their emerging I/O standards

The current list of SARS-CoV-2 sequences in GenBank can be used for this purpose. If developed as a workflow, the project can connect to the "main" public sequence resource deliverable/task, and possibly also to the biostatistics and machine learning ones.

For the technical implementation, it would make sense to build this as a rerunnable workflow (e.g. Snakemake or CWL), which also connects it to the Workflows activity. As the amount of available sequence data continues to grow, some analysis steps will become computationally expensive (for example, running BranchSiteREL or similar analyses). Hence, we should plan for scaling out to HPC or cloud infrastructure.
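
To illustrate what "rerunnable" could mean in practice, here is a minimal sketch of a make-style freshness check: a step runs again only if an output is missing or an input is newer than the outputs. Workflow engines such as Snakemake handle this kind of bookkeeping themselves, so this is only meant to show the idea; the function name and file names are hypothetical.

```python
from pathlib import Path
from typing import Iterable


def needs_rerun(inputs: Iterable[Path], outputs: Iterable[Path]) -> bool:
    """Return True if a workflow step should be executed again.

    A step is stale when any output is missing, or when the newest input
    was modified after the oldest output was written.
    """
    outputs = list(outputs)
    if not outputs or any(not p.exists() for p in outputs):
        return True
    oldest_output = min(p.stat().st_mtime for p in outputs)
    # With no inputs at all, existing outputs are never considered stale.
    newest_input = max(
        (p.stat().st_mtime for p in inputs), default=float("-inf")
    )
    return newest_input > oldest_output
```

For example, `needs_rerun([Path("sequences.fasta")], [Path("tree.nwk")])` would return `True` after a fresh sequence download, so that only the affected (and potentially expensive) steps are repeated.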
