Duplication of Netherlands/Overijssel_2/2020 #174

Closed
pvanheus opened this issue Mar 14, 2020 · 2 comments

Comments

@pvanheus

The virus name Netherlands/Overijssel_2/2020 is associated with two accessions, EPI_ISL_414550 and EPI_ISL_414460, with the former dated later than the latter. I don't see an easy way to exclude one of these apparent duplicates, and the duplicate name causes processing problems.

@trvrb
Member

trvrb commented Mar 15, 2020

Hi @pvanheus. We've just merged #59, which should fix this duplicate-name issue. After downloading from GISAID, run

./scripts/normalize_gisaid_fasta.sh data/gisaid_cov2020_sequences.fasta data/sequences.fasta

and then proceed with snakemake or nextstrain build, but let us know if this doesn't resolve things.

@trvrb trvrb closed this as completed Mar 15, 2020
@pvanheus
Author

Thanks, I have my own Python script that does the same as that script.
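
For readers hitting the same problem, here is a minimal sketch of that kind of deduplication step. It is not the script mentioned above, and it does not reproduce whatever `normalize_gisaid_fasta.sh` does. It assumes the goal is to keep only the first record for each virus name, and that the name is the header text before the first `|`. The function name `dedupe_fasta` is hypothetical.

```python
def dedupe_fasta(lines):
    """Yield FASTA lines, keeping only the first record seen for each name."""
    seen = set()
    keep = False  # whether lines of the current record should be emitted
    for line in lines:
        if line.startswith(">"):
            # Assumption: the virus name is the header text before the first "|".
            name = line[1:].strip().split("|")[0]
            keep = name not in seen
            seen.add(name)
        if keep:
            yield line
```

It can be applied to an open file handle, for example by passing `dedupe_fasta(handle)` to `writelines` on an output file. Keeping the first occurrence is an arbitrary choice; if the later-dated accession is the one to retain, the records would need to be sorted or compared before filtering.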
