Feature/36558/st george old import mising records #123

Open
wants to merge 29 commits into
base: develop
Collaborator

@NImeson NImeson commented Oct 31, 2024

What?
Adds functionality to import UCS variants and to record them with the test status 10 value. Expands the regular expressions to account for missing SRIs. Handles missing SRIs where there is an abnormal variant but no gene to assign it to.

Why?
There are 64 records that are not currently being imported from the raw files into MBIS / CASREF. These records are largely abnormal results, which we really do want to capture.

Need to broaden the regex that assigns a teststatus of 2 to include fields that contain strings such as 'IVS' or 'Del X 5-7'.
Include functionality to import UCS variants and give them a teststatus 10 outcome.
Incorporate functionality to import abnormal variants where the gene is unknown (also create additional empty tests for BRCA1+2 where these are full screen).
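
For illustration only, the status rules described above could be sketched as follows. This is a minimal Python sketch, not the importer's code: the pattern names, the function `assign_teststatus`, the exact regexes, and the choice to check for UCS before the exon-style pattern are all assumptions made for the example.

```python
import re

# Hypothetical pattern (not the importer's real regex): matches intron-style
# names such as "IVS12" and exon deletion/duplication text such as "Del X 5-7".
EXON_STYLE_RE = re.compile(
    r"\bIVS\s*\d+|\b(?:del|dup)\w*\s*(?:x|ex|exons?)\s*\d+(?:\s*-\s*\d+)?",
    re.IGNORECASE,
)

# Hypothetical pattern for a field flagged as UCS.
UCS_RE = re.compile(r"\bUCS\b", re.IGNORECASE)


def assign_teststatus(raw_field):
    """Return 10 for UCS fields, 2 for exon/intron-style variants, else None."""
    if raw_field is None:
        return None
    # Assumption: a UCS flag takes priority over the variant nomenclature.
    if UCS_RE.search(raw_field):
        return 10
    if EXON_STYLE_RE.search(raw_field):
        return 2
    return None
```

A field such as 'Del X 5-7' or 'IVS12+1G>A' would get teststatus 2 under this sketch, while a field containing 'UCS' would get teststatus 10.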

How?
Added a method to create BRCA1/2 records as standard when we don't know which FS gene is in question.
Added a method to handle a variant when we don't know the gene it belongs to.
Expanded the exon regex to account for additional variant nomenclatures, e.g. IVS, DEL X 5-7.
Added functionality to import UCS variants and give them the test status 10 value.
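
As a rough picture of the unknown-gene handling described in this list, here is a minimal Python sketch. The record shape (plain dicts), the function `build_records`, and its parameters are hypothetical and are not taken from the importer; the sketch only shows variants being kept without a gene and empty BRCA1/2 tests being added for full screen records.

```python
# Genes that get an empty test on a full screen when the gene is unknown.
FULL_SCREEN_GENES = ("BRCA1", "BRCA2")


def build_records(variants, gene=None, full_screen=False):
    """Build one record per variant; add empty BRCA1/2 tests if gene is unknown.

    Hypothetical helper: records are plain dicts with 'gene' and 'variant' keys.
    """
    if gene is not None:
        # Gene is known: attach every variant to it.
        return [{"gene": gene, "variant": v} for v in variants]

    # Gene is unknown: keep each variant, with no gene assigned.
    records = [{"gene": None, "variant": v} for v in variants]
    if full_screen:
        # Full screen: also create an empty test for each of BRCA1 and BRCA2.
        records.extend({"gene": g, "variant": None} for g in FULL_SCREEN_GENES)
    return records
```

With two variants, no gene and `full_screen=True`, this sketch yields four records: two gene-less variant records plus one empty test each for BRCA1 and BRCA2.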

Testing?

All missing SRIs are now being imported with the exception of one, which has a 'nil' field (this would need more substantial changes to the importer). Variants for each missing SRI are being imported to some extent (it's difficult to fully import some, as the way they are written conflicts with importer rules).

Variant counts between the existing importer and this version show additional SRIs across the board, mostly the expected missing SRIs but also extra ones. The extra counts have been investigated and the importer is performing correctly with these records.

Additional QA checks were completed, looking at the number of variant entries across a variety of metrics. Testing has shown that the changes now include more cDNA variants, protein variants and exon/intron variants than previously.
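
A count comparison of the kind described above could be sketched as follows. This is an illustrative Python sketch, not the QA scripts actually used: it assumes each run is available as a list of (SRI, variant) pairs, and the function names are hypothetical.

```python
from collections import Counter


def count_by_sri(rows):
    """Count variant rows per SRI; rows are (sri, variant) pairs."""
    return Counter(sri for sri, _variant in rows)


def compare_runs(old_rows, new_rows):
    """Report SRIs added, dropped, or with a changed variant count."""
    old, new = count_by_sri(old_rows), count_by_sri(new_rows)
    return {
        # SRIs present only in the new run (e.g. previously missing records).
        "added": sorted(set(new) - set(old)),
        # SRIs present only in the old run (should normally be empty).
        "dropped": sorted(set(old) - set(new)),
        # SRIs in both runs whose variant count differs: (old, new).
        "changed": {
            sri: (old[sri], new[sri])
            for sri in old
            if sri in new and old[sri] != new[sri]
        },
    }
```

SRIs that appear under "added" but are not on the expected missing list are the extra ones that would need investigating by hand.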

@NImeson NImeson marked this pull request as draft October 31, 2024 10:50
@NImeson NImeson marked this pull request as ready for review January 27, 2025 11:07
Contributor

@shilpigoeldev shilpigoeldev left a comment

I think we are also missing the test case we discussed today: the multi_variants_no_gene fullscreen case.

@NImeson NImeson requested a review from shilpigoeldev January 30, 2025 14:17
Contributor

@shilpigoeldev shilpigoeldev left a comment

LGTM now, but this is pending Fiona's confirmation on QA, so I can approve and merge once Fiona's approval comes in. Thanks @NImeson!
