VS 396 clinvar grabs too many values #7823
Codecov Report
@@ Coverage Diff @@
## ah_var_store #7823 +/- ##
================================================
Coverage ? 51.397%
Complexity ? 26413
================================================
Files ? 2170
Lines ? 164837
Branches ? 17775
================================================
Hits ? 84721
Misses ? 74715
Partials ? 5401
616eed2 to a624d03
a624d03 to 52e7468
def test_clinvar_inclusion(self):
    clinvar_swap = [{ 'id': 'RCV01', \
Do you want to add a test case for a case where the id does NOT start with 'RCV'?
Yes, and I think line 88 should handle that case.
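A test for the non-'RCV' case might look roughly like the sketch below. The helper `select_rcv_entries` is a hypothetical stand-in, not a function from the script, and the assumption that entries with other id prefixes are skipped is inferred from this thread rather than confirmed by the diff.

```python
import unittest


def select_rcv_entries(clinvar_entries):
    """Hypothetical stand-in: keep only entries whose id starts with 'RCV'."""
    return [e for e in clinvar_entries if e.get("id", "").startswith("RCV")]


class TestClinvarIdPrefix(unittest.TestCase):
    def test_non_rcv_id_is_excluded(self):
        # 'VCV02' is an illustrative id that does NOT start with 'RCV'.
        entries = [
            {"id": "RCV01", "significance": ["benign"]},
            {"id": "VCV02", "significance": ["pathogenic"]},
        ]
        kept = select_rcv_entries(entries)
        self.assertEqual([e["id"] for e in kept], ["RCV01"])
```

In the real test suite the assertion would go against the row produced by the script instead of this helper.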
@@ -143,6 +143,7 @@ def get_gnomad_subpop(gnomad_obj):
    max_an = None
    max_af = None
    max_subpop = ""
    ## TODO will there ever be unexpected values in gnomad_subpop (values not in gnomad_ordering) ?
Is this TODO more like "defend against possible unexpected values appearing in gnomad_subpop"?
Yes, I think that's a good distinction.
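One way to make that defence explicit is to separate known from unexpected subpopulations before computing anything. This is only a sketch: the `GNOMAD_ORDERING` list below is a made-up placeholder for the script's `gnomad_ordering`, and `split_known_subpops` is a hypothetical helper.

```python
# Placeholder ordering for illustration; the script's real gnomad_ordering may differ.
GNOMAD_ORDERING = ["afr", "amr", "eas", "fin", "nfe", "sas"]


def split_known_subpops(subpops, ordering=GNOMAD_ORDERING):
    """Return (known, unexpected): known follows `ordering`, unexpected is sorted."""
    present = set(subpops)
    known = [name for name in ordering if name in present]
    unexpected = sorted(present - set(ordering))
    return known, unexpected


known, unexpected = split_known_subpops(["nfe", "afr", "xyz"])
if unexpected:
    # Whether to warn, skip, or fail here is a design choice left to the script.
    print(f"unexpected gnomad subpopulations: {unexpected}")
```

Failing loudly is safer if an unknown subpopulation would silently change which one is reported as the maximum. A warning is enough if unknown values can simply be ignored.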
row["clinvar_classification"] = ordered_significance_values # special sorted array
updated_dates.sort(key=lambda date: datetime.strptime(date, "%Y-%m-%d")) # note: method is in-place, and returns None
row["clinvar_last_updated"] = updated_dates[-1] # most recent date
row["clinvar_phenotype"] = sorted(phenotypes) # union of all phenotypes
indent inconsistency issues
scripts/variantstore/wdl/extract/create_variant_annotation_table.py
ordered_significance_values.extend(values_not_accounted_for) # add any values that aren't in significance_ordering to the end
row["clinvar_id"] = clinvar_ids # array
row["clinvar_classification"] = ordered_significance_values # special sorted array
updated_dates.sort(key=lambda date: datetime.strptime(date, "%Y-%m-%d")) # note: method is in-place, and returns None
Thank you for the comment. I forget that every time I've been away from Python for a while, and it never fails to bite me 🙂
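For anyone else who trips over this, a small illustration of the difference between `list.sort` (in place, returns `None`) and `sorted` (returns a new list). The dates are made-up sample values.

```python
from datetime import datetime


def parse(date):
    return datetime.strptime(date, "%Y-%m-%d")


updated_dates = ["2021-03-01", "2020-12-15", "2022-01-09"]

# list.sort mutates the list and returns None, so never assign its result.
result = updated_dates.sort(key=parse)
most_recent = updated_dates[-1]

# sorted() leaves its input untouched and returns a new list.
original = ["2021-03-01", "2020-12-15", "2022-01-09"]
newest_first = sorted(original, key=parse, reverse=True)
```

If only the latest date is needed, `max(updated_dates, key=parse)` would avoid sorting at all.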
…le.py Co-authored-by: Miguel Covarrubias <[email protected]>
…gatk into rc-vs-396-clinvar-greedy
👍
Currently, the ClinVar annotation grabs all values in the variants section of the annotation JSON. This is wrong because Nirvana includes ClinVar values from all overlapping variants.
This change limits the ClinVar values to the correct variant only
and adds tests for the change.
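As a rough illustration of restricting ClinVar values to the correct variant, the sketch below keeps only entries whose alleles match the variant's own alleles. The field names `refAllele`, `altAllele`, and `clinvar`, and the function `clinvar_entries_for_variant`, are assumptions made for this example. The actual matching rule used in this PR may differ.

```python
def clinvar_entries_for_variant(variant):
    """Keep ClinVar entries whose ref/alt alleles equal the variant's alleles.

    Field names here are assumed for illustration, not taken from the diff.
    """
    ref = variant.get("refAllele")
    alt = variant.get("altAllele")
    return [
        entry
        for entry in variant.get("clinvar", [])
        if entry.get("refAllele") == ref and entry.get("altAllele") == alt
    ]


# Made-up annotation: the second entry belongs to an overlapping variant.
variant = {
    "refAllele": "A",
    "altAllele": "G",
    "clinvar": [
        {"id": "RCV01", "refAllele": "A", "altAllele": "G", "significance": ["benign"]},
        {"id": "RCV02", "refAllele": "AT", "altAllele": "A", "significance": ["pathogenic"]},
    ],
}
matching = clinvar_entries_for_variant(variant)
```

With this sample input only the `RCV01` entry is kept. The overlapping deletion's entry is dropped rather than merged into the row.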