Scorpio lineage replacement results in "Unassigned" lineage. #449
Comments
We observed something similar. With the Constellations v0.1.10 update, we saw the number of "Unassigned" sequences in our database triple. A breakdown of how many of each lineage went to "Unassigned" with the latest update is below. Can you verify that this is working as intended? The changes seem more far-reaching than the v0.1.10 update note suggests, and the increase in "Unassigned" sequences is substantial. Also, could you clarify under what conditions Scorpio will override a lineage call with "Unassigned"? I was under the impression that Scorpio/Constellations were focused on refining VoC classifications, but it seems to be doing quite a bit more here.
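For anyone who wants to produce the same kind of breakdown from their own data, here is a minimal sketch. It assumes two pangolin output CSVs for the same sequences, one from before and one from after the update, each with `taxon` and `lineage` columns. The function names and file names are hypothetical, not part of pangolin.

```python
import csv
from collections import Counter


def read_lineages(path):
    """Map each taxon to its reported lineage from one pangolin output CSV."""
    with open(path, newline="") as handle:
        return {row["taxon"]: row["lineage"] for row in csv.DictReader(handle)}


def newly_unassigned(before, after):
    """Count the previous lineages of taxa whose call changed to 'Unassigned'."""
    return Counter(
        old_lineage
        for taxon, old_lineage in before.items()
        if old_lineage != "Unassigned" and after.get(taxon) == "Unassigned"
    )
```

Usage would look something like `newly_unassigned(read_lineages("report_old.csv"), read_lineages("report_new.csv")).most_common()`, where both file names are placeholders for your own reports.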
We removed the 'probable' constellation definition for BA.* sublineages. This was originally intended to avoid false negatives when Omicron was first spreading and the SNP profile of the sublineages was a lot more distinguishable. At this stage, particularly when BA.2, BA.4 and BA.5 are so similar, we now need to prioritise avoiding false positives in the case of missing SNPs. If you check some of the more recent issues with constellations, you'll see people reporting that probable definitions from scorpio had been leading to mis-calls and inappropriate overwriting of UShER assignments. This is why we've now removed probable definitions. The sequences that will have switched to unassigned are those that don't meet the SNP thresholds defined within scorpio constellations; previously they may have been picked up by the 'probable' definitions, but now they cannot be.
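To illustrate why missing SNPs can push a sequence below a threshold, here is a toy sketch. It is not scorpio's code: the function, the `min_alt` and `max_ref` parameter names, and the numbers are all assumptions chosen for illustration. Each defining site is summarised as `"alt"` (mutation present), `"ref"` (reference allele) or `"amb"` (N or otherwise uncallable).

```python
from collections import Counter


def meets_thresholds(site_states, min_alt, max_ref):
    """Hypothetical rule: enough alt alleles and not too many ref alleles."""
    counts = Counter(site_states)
    return counts["alt"] >= min_alt and counts["ref"] <= max_ref


# Ten hypothetical defining sites, with made-up thresholds.
good_coverage = ["alt"] * 8 + ["amb"] * 2
poor_coverage = ["alt"] * 5 + ["amb"] * 5

print(meets_thresholds(good_coverage, min_alt=6, max_ref=2))  # True
print(meets_thresholds(poor_coverage, min_alt=6, max_ref=2))  # False
```

In the second case no site contradicts the lineage, yet the sequence still fails because too few sites could be called. A looser 'probable' rule would have accepted it.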
Hi Áine, thank you for the explanation. Just to be clear – the "Unassigned" I was referring to was the
If I understood correctly, in cases where scorpio calls "Unassigned" it should fall back to UShER calls and not overwrite them? It seems that scorpio is still overwriting UShER assignments, but it is now overwriting them with "Unassigned". In the example I posted, UShER assigns BA.2 (3/3), but the lineage field shows "Unassigned". I believe an unintended effect of this change is that pangolin still replaces the Is this an intended effect of this change? If so, could you elaborate on why an UShER assignment should not be trusted if scorpio cannot pinpoint a lineage? Thank you!
We've been seeing this for unambiguous UShER calls as well: affected samples with no reported conflict nearly outnumber affected samples that do have a conflict.
Hi team--is there any update on this issue--the final
Wanted to post this here. At Colorado we are seeing the issue discussed above, where previously assigned lineages are now unassigned. However, we are only seeing it on our ONT runs using the ARTIC V4.1 primers, not on our Illumina runs using the ARTIC V3 primers. We have also noticed that the sequences being called as unassigned have low coverage (~10x read depth) between 22,475-22,775, which corresponds to AA 305-405 in the Spike protein. Also, when using the --skip-scorpio flag, it seems like most of these unassigned sequences get assigned to BA.2 or its sublineages and a handful get assigned to BA.4. We are prioritizing getting these sequences up on GISAID so we can provide GISAID IDs. Here are those GISAID IDs.
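For reference, the genome-to-Spike conversion above can be checked with a short sketch. It assumes coordinates on the Wuhan-Hu-1 reference (NC_045512.2), where the S coding sequence spans positions 21563-25384; the function name is hypothetical.

```python
SPIKE_START = 21563  # first nucleotide of the S CDS in NC_045512.2 (assumed reference)
SPIKE_END = 25384    # last nucleotide of the S CDS, including the stop codon


def spike_codon(genome_pos):
    """Return the 1-based Spike codon number containing a genome position."""
    if not SPIKE_START <= genome_pos <= SPIKE_END:
        raise ValueError(f"position {genome_pos} is outside the S gene")
    return (genome_pos - SPIKE_START) // 3 + 1


print(spike_codon(22475), spike_codon(22775))  # 305 405
```

Those codons fall inside the receptor-binding domain, so low depth in this window is likely to hide a number of Omicron-defining sites.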
We are also seeing sequences with "Unassigned" for the lineage. About 1/3 of our recent sequences that passed QC have an unassigned lineage. We're using Midnight 1200 primers, sequencing on an Illumina MiSeq, and running pangolin with the StaPH-B Docker image. GISAID IDs for some of our sequences with this problem:
We might be seeing a trend towards fewer unassigned sequences with ONT's Midnight V3 primers, but it's a little early to tell.
It's really difficult to find a combination of mutation lists, thresholds and specific allele rules that gives both sensitivity and specificity when distinguishing between BA.2, BA.4 and BA.5 in the presence of Ns and false reversions in different regions from different sequencing methods. @aineniamh has many demands on her time, but she is actively working on this now, and I will try to help in whatever ways I can, being less familiar with scorpio/constellations. Thanks @molly-hetheringtonrauth and @karenbobier for the specific examples! Those will be really helpful for testing. In the meantime, one thing we've been considering is possibly making scorpio override pangoLEARN but not UShER. If you are running pangolin with the default UShER mode, consider adding the
In the latest release (4.1) we no longer overwrite UShER calls if they don't meet the scorpio checks, as we're confident in the UShER calls. In pangoLEARN mode scorpio is still in place as before and will eliminate false positives. Closing this issue as I believe the latest version of pangolin will resolve your issue.
Dear pangolin developers and users,
After updating to the latest version (4.0.6), I started to observe a high number of "Unassigned" samples that had been assigned lineages successfully in previous versions. These unassigned calls occur even when the samples pass the QC check.
I noticed that when Scorpio replaces the UShER lineage inference, the "lineage" field becomes "Unassigned". To confirm this, I disabled Scorpio and the lineage field was properly populated (see below).
With Scorpio:
Without Scorpio:
Is anyone else experiencing this?
Thank you.
Versions:
pangolin (v4.0.6)
pangolin-data (v1.8)
constellations (v0.1.10)
scorpio (v0.3.17)
pangolin-assignment (v1.8)