Add lineage-1A and lineage-2 phylogenetic trees #61

Draft
wants to merge 4 commits into
base: main
from

Collaborator

@j23414 j23414 commented Jan 21, 2025

Description of proposed changes

Add lineage-1A and lineage-2 trees

Current Draft

The rooting for Lineage-1A looks particularly terrible to me:

These builds correspond to the following config file:

Suggestions for rooting and for the molecular clock rate are welcome.

You can run the build using:

git clone https://github.com/nextstrain/WNV.git
cd WNV
git checkout add-lineage1a-lineage2-trees

# Build lineage-1A and lineage-2 trees
nextstrain build phylogenetic --configfile defaults/config.yaml defaults/lineage-1A/config.yaml
nextstrain build phylogenetic --configfile defaults/config.yaml defaults/lineage-2/config.yaml
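
Because the two commands differ only in the lineage directory, they can also be driven from a loop. The sketch below is a convenience wrapper and not part of this PR: `build_command` and the `DRY_RUN` switch are hypothetical names introduced here for illustration. By default it only prints the commands; setting `DRY_RUN=0` would run them and assumes the Nextstrain CLI is installed.

```shell
#!/bin/sh
# Sketch: print (or run) the build command for each lineage in this PR.
# DRY_RUN is a hypothetical switch; it defaults to 1 so nothing is executed.
set -eu

DRY_RUN="${DRY_RUN:-1}"

build_command() {
  # $1 = lineage directory name under defaults/ (for example lineage-1A)
  printf 'nextstrain build phylogenetic --configfile defaults/config.yaml defaults/%s/config.yaml' "$1"
}

for lineage in lineage-1A lineage-2; do
  cmd="$(build_command "$lineage")"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"   # show what would be run
  else
    $cmd          # word-splitting is intended: run the command as written
  fi
done
```

Run from the root of the WNV checkout; with `DRY_RUN=1` it prints the same two commands listed above.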

Related issue(s)

Checklist

  • Pick a lineage-2 reference, flag any INDEL regions
  • Pick an appropriate root for lineage-1A phylogenetic tree
view alignment WNV_reference_comparison
  • Checks pass

Update config values that are specific to lineage-1A and lineage-2, including:
* Adding auspice_config.json with different titles
* Subsetting the include.txt to the individual lineages
* Adding the lineage-2 reference.gb file from RefSeq
* Clearing any fixed clock value from the global build

Added two GitHub Actions to support building both lineage-1A and
lineage-2 phylogenetic trees. For the most part these take in the config values
from the global (possibly to be renamed all-lineages) config file but use different
references, different clock rates, different include strains, and different
build titles.
@j23414 j23414 linked an issue Jan 21, 2025 that may be closed by this pull request
@DOH-LMT2303
Collaborator

Lineage 2
Lineage 2 was primarily found in sub-Saharan Africa and Madagascar, but emerged in Europe in 2004.
Visser, et al. 2024
Hungary/04 (DQ116961) represents the first isolate of this clade.
Barzon, et al. 2015
WNV lineage 2 strains are divided into clades 2a and 2b, of which 2a is further subdivided into clusters A and B. WNV lineage 2a is currently dominant in Europe; sub-cluster A strains are found in northwest Europe, and sub-cluster B strains in southeast Europe.
Visser, et al. 2024

Datasets with genomes classified into clades
Two datasets with over 200 sequences, already classified into clades in other papers, were used to propose a new nomenclature.
Santos, et al. 2023

Collaborator Author

j23414 commented Jan 28, 2025

Thanks, @DOH-LMT2303, for reading the literature to identify the earliest samples per lineage in the effort to find a suitable root!

I noticed that DQ116961 had a metadata date of XXXX-XX-XX, which I've updated to 2004-XX-XX. As you continue your literature reading, feel free to add date (or other) annotations to:

No pressure, though. I'd suggest prioritizing the rooting search (especially for lineage 1A). If you have the time and energy, updating annotations in the data would also be helpful.
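
As a rough sketch of what such a date annotation does, the snippet below applies one to a toy metadata table. The three-column, tab-separated annotation layout (accession, field, value), the temporary files, and the two-column metadata table are assumptions made for illustration; the actual annotation file is not named in this thread.

```shell
#!/bin/sh
# Sketch under stated assumptions: temporary, hypothetical files stand in
# for the real annotation and metadata files.
set -eu

annotations="$(mktemp)"
metadata="$(mktemp)"

# One annotation per line: accession <TAB> field <TAB> new value.
printf 'DQ116961\tdate\t2004-XX-XX\n' > "$annotations"

# Toy metadata table holding the fully ambiguous date mentioned above.
printf 'accession\tdate\nDQ116961\tXXXX-XX-XX\n' > "$metadata"

# Read the annotations first, then overwrite the date column (column 2)
# of any metadata row whose accession has a date annotation.
updated="$(awk -F '\t' -v OFS='\t' '
  NR == FNR { if ($2 == "date") fix[$1] = $3; next }
  FNR > 1 && ($1 in fix) { $2 = fix[$1] }
  { print }
' "$annotations" "$metadata")"

printf '%s\n' "$updated"
rm -f "$annotations" "$metadata"
```

Keeping corrections in a separate annotation table, rather than editing the metadata directly, means they can be reapplied each time the metadata is regenerated.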

Collaborator Author

j23414 commented Feb 1, 2025

This work is on hold until we get clarification of priorities across the different working teams.

Collaborator Author

j23414 commented Feb 25, 2025

Just a note that this PR is not blocked but is on hold until the next release of Augur, which will include the prune-outgroup functionality.

The Lineage 1a tree looks much better with a defined outgroup:

Development

Successfully merging this pull request may close these issues.

Use lineages for build targets