adds pangolin 4.3.1 with new pdata 1.30 #1052
Does anyone have the bandwidth to review this PR?
It looks like the tests work.
# download assembly for a BA.1 from Florida (https://www.ncbi.nlm.nih.gov/biosample?term=SAMN29506515 and https://www.ncbi.nlm.nih.gov/nuccore/ON924087)
# run pangolin in usher analysis mode
RUN datasets download virus genome accession ON924087.1 --filename ON924087.1.zip && \
unzip ON924087.1.zip && rm ON924087.1.zip && \
mv -v ncbi_dataset/data/genomic.fna ON924087.1.genomic.fna && \
rm -vr ncbi_dataset/ README.md && \
pangolin ON924087.1.genomic.fna -o ON924087.1-usher && \
column -t -s, ON924087.1-usher/lineage_report.csv

# test specific for new lineage, XBB.1.16, introduced in pangolin-data v1.19
# using this assembly: https://www.ncbi.nlm.nih.gov/nuccore/2440446687
# biosample here: https://www.ncbi.nlm.nih.gov/biosample?term=SAMN33060589
# one of the sample included in initial pango-designation here: https://github.com/cov-lineages/pango-designation/issues/1723
RUN datasets download virus genome accession OQ381818.1 --filename OQ381818.1.zip && \
unzip -o OQ381818.1.zip && rm OQ381818.1.zip && \
mv -v ncbi_dataset/data/genomic.fna OQ381818.1.genomic.fna && \
rm -vr ncbi_dataset/ README.md && \
pangolin OQ381818.1.genomic.fna -o OQ381818.1-usher && \
column -t -s, OQ381818.1-usher/lineage_report.csv

# testing another XBB.1.16, trying to test scorpio functionality. Want pangolin to NOT assign lineage based on pango hash match.
# this test runs as expected, uses scorpio to check for constellation of mutations, then assign using PUSHER placement
RUN datasets download virus genome accession OR177999.1 --filename OR177999.1.zip && \
unzip -o OR177999.1.zip && rm OR177999.1.zip && \
mv -v ncbi_dataset/data/genomic.fna OR177999.1.genomic.fna && \
rm -vr ncbi_dataset/ README.md && \
pangolin OR177999.1.genomic.fna -o OR177999.1-usher && \
column -t -s, OR177999.1-usher/lineage_report.csv

## test for BA.2.86
# virus identified in MI: https://www.ncbi.nlm.nih.gov/nuccore/OR461132.1
RUN datasets download virus genome accession OR461132.1 --filename OR461132.1.zip && \
unzip -o OR461132.1.zip && rm OR461132.1.zip && \
mv -v ncbi_dataset/data/genomic.fna OR461132.1.genomic.fna && \
I think this list of tests is getting very long, and some of what we WERE testing for isn't being tested in this image.
What if we did something like this:
# testing the following lineages:
# XBB.1.16 OR177999.1
# BA.2.86 OR461132.1
# JN.2 (BA.2.86 sublineage); JN.2 is an alias of B.1.1.529.2.86.1.2 OR598183.1 (NY CDC Quest sample)
# JQ.1 (BA.2.86.3 sublineage); JQ.1 is an alias of B.1.1.529.2.86.3.1 OR716684.1
# JN.1.22 (BA.2.86.x sublineage; full unaliased lineage is B.1.1.529.2.86.1.1.22) PP189069.1
# JN.1.48 (BA.2.86.x sublineage; full unaliased lineage is B.1.1.529.2.86.1.1.48) PP218754.1
# and so on...
RUN datasets download virus genome accession OQ381818.1,OR177999.1,OR461132.1,OR598183.1,OR716684.1,PP189069.1,PP218754.1,PP770375.1,PQ073669.1,PQ034842.1 && \
unzip ncbi_dataset.zip && \
pangolin ncbi_dataset/data/genomic.fna && \
column -t -s, lineage_report.csv
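If the tests are consolidated like this, the build would still only print the report rather than fail on a wrong call. A minimal sketch of one way to assert the expected lineage per accession follows. Note that `check_lineage` is a hypothetical helper, not part of pangolin or this image, and it assumes the `taxon` field of `lineage_report.csv` begins with the accession and that the assigned lineage appears as its own comma-delimited field.

```shell
#!/usr/bin/env bash
# Sketch only: check_lineage is a hypothetical helper, not part of pangolin.
# Usage: check_lineage <lineage_report.csv> <accession> <expected_lineage>
check_lineage() {
  local report="$1" accession="$2" expected="$3"
  # Keep only rows whose first field starts with the accession (optionally
  # quoted). index() does a literal match, so dots in the accession are safe.
  # Then require the expected lineage as an exact comma-delimited field;
  # grep -F keeps the dots in lineage names literal as well.
  if awk -v acc="$accession" \
       'index($0, acc) == 1 || index($0, "\"" acc) == 1' "$report" \
     | grep -F -q ",${expected},"; then
    echo "PASS: ${accession} assigned ${expected}"
  else
    echo "FAIL: ${accession} not assigned ${expected}" >&2
    return 1
  fi
}
```

After the `pangolin` step, a call such as `check_lineage lineage_report.csv OR461132.1 BA.2.86` would return non-zero on a mismatch and so stop the test stage. The accession and lineage pairs would come from the comment list above; whether each assembly still resolves to exactly that lineage may depend on the pangolin-data version.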
Thank you for putting this together! I'm going to deploy this to dockerhub and quay to staphb/pangolin using the tags '4.3.1-pdata-1.30' and 'latest'
Changes from previous dockerfile
unzip -o option to overwrite files named the same (like md5sum.txt that comes with genomes downloaded using ncbi datasets). These files are unnecessary to keep around so I just opted to overwrite them each time a .zip file is unzipped & files are extracted.
code diff: