False Duplication VCF Export #570
base: main
Still some to do, but the unfurling and the existing header code could use a look-over now.
@@ -51,6 +51,136 @@ def filter_liftover_to_false_dups(
    return ht


def _v4_false_dup_unfurl_annotations(
is this an unused function that needs to be deleted?
    "--overwrite",
    help="Option to overwrite existing custom liftover table.",
    action="store_true",
)
parser.add_argument(
    "--test",
Neither `--overwrite` nor `--test` is referenced anywhere in the script.
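One way to resolve this is to have `main` actually consume both flags. The following is only a sketch: the `--test` help text, the `plan_write` helper, and the `.test` path suffix are assumptions made for illustration, and the dictionary stands in for the real Hail read/write calls.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="False dup liftover (sketch).")
    parser.add_argument(
        "--overwrite",
        help="Option to overwrite existing custom liftover table.",
        action="store_true",
    )
    parser.add_argument(
        "--test",
        # Hypothetical help text; the real one is not shown in the diff.
        help="Run on a small subset and write to a test location.",
        action="store_true",
    )
    return parser


def plan_write(args: argparse.Namespace) -> dict:
    """Collect the write settings that main() would pass on.

    Hypothetical helper: it stands in for the real table write so that
    both flags are visibly used.
    """
    return {
        # Assumed convention: test runs are written to a separate path.
        "path_suffix": ".test" if args.test else "",
        "overwrite": args.overwrite,
    }
```

If neither flag ends up affecting behavior, dropping them from the parser is the simpler fix.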
logger = logging.getLogger("false_dup_genes")
logger.setLevel(logging.INFO)

FALSE_DUP_GENES = ["KCNE1", "CBS", "CRYAA"]
This constant is already defined in create_false_dup_liftover.py. Constants that already exist should be imported rather than redefined. That said, I think all of the false dup code could be combined into one script, with arguments that select whether to create the Table or export the VCF.
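A rough sketch of what the combined script could look like. The flag names `--create-table` and `--export-vcf` and both step functions are hypothetical placeholders, not names from this PR; the constant is defined inline only so the example runs on its own.

```python
import argparse

# In the combined script the constant would live in exactly one place.
# If the scripts stay separate, it would be imported from
# create_false_dup_liftover.py instead of being redefined (the exact
# import path is an assumption).
FALSE_DUP_GENES = ["KCNE1", "CBS", "CRYAA"]


def create_table(genes):
    """Hypothetical placeholder for the liftover Table creation step."""
    return f"create table for {', '.join(genes)}"


def export_vcf(genes):
    """Hypothetical placeholder for the VCF export step."""
    return f"export VCF for {', '.join(genes)}"


def main(argv=None):
    parser = argparse.ArgumentParser(description="False dup genes (sketch).")
    parser.add_argument("--create-table", action="store_true")
    parser.add_argument("--export-vcf", action="store_true")
    args = parser.parse_args(argv)

    # Run only the steps that were requested, in pipeline order.
    steps = []
    if args.create_table:
        steps.append(create_table(FALSE_DUP_GENES))
    if args.export_vcf:
        steps.append(export_vcf(FALSE_DUP_GENES))
    return steps
```

With this layout, passing both flags runs the Table creation first and the export second, and each step can still be run alone.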
    :param ht: Release Hail Table
    :param vcf_info_reorder: Order of VCF INFO fields
    :return: Hail Table prepared for validity checks and export
    """
Suggested change:
-     :param ht: Release Hail Table
-     :param vcf_info_reorder: Order of VCF INFO fields
-     :return: Hail Table prepared for validity checks and export
-     """
+     :param ht: Release Hail Table of false dup genes.
+     :param vcf_info_reorder: Order of VCF INFO fields.
+     :return: Hail Table prepared for validity checks and export.
+     """
    :return: Hail Table prepared for validity checks and export
    """
    logger.info(
        "Unfurling nested gnomAD frequency annotations and add to INFO field..."
Suggested change:
-         "Unfurling nested gnomAD frequency annotations and add to INFO field..."
+         "Unfurling nested gnomAD frequency annotations and adding to INFO field..."
    return vcf_info_dict


def _joint_filters(ht: hl.Table) -> hl.Table:
_joint_filters -> prepare_joint_filters
        variant_qc_filter="RF",
    )

    custom_filter_dict = {
see previous note about considering missing to be PASS
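For reference, a plain-Python sketch of the distinction being raised, assuming the earlier note means that a variant with missing filters should be written as PASS. The `filter_column` helper and its `missing_is_pass` switch are hypothetical; it follows the VCF convention that an empty filter set renders as `PASS` and an unevaluated one as `.`.

```python
from typing import Optional, Set


def filter_column(filters: Optional[Set[str]], missing_is_pass: bool = True) -> str:
    """Render a VCF FILTER value from a set of filter names (illustrative only)."""
    if filters is None:
        # Missing filters: either treat as PASS (the assumption here)
        # or emit "." to mean that filters were not evaluated.
        return "PASS" if missing_is_pass else "."
    if not filters:
        # An empty set means every filter was applied and none failed.
        return "PASS"
    return ";".join(sorted(filters))
```

In the Hail code itself, the equivalent decision would be made on the `filters` annotation before export.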
    }


def populate_subset_info_dict(
unused function?
    return vcf_info_dict


def populate_info_dict(
unused function?
def main(args):
    ht = hl.read_table(get_false_dup_genes_path(release_version="4.0"))
    ht = prepare_false_dup_ht_for_validation(ht)
    header_dict = {
Where does `header_dict` get used? And why is `prepare_vcf_filter_header` called twice?
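A small sketch of the shape this could take: build the filter header once, put it into the header dictionary, and return that dictionary so it is actually consumed. The stub below is a hypothetical stand-in, since the real signature of `prepare_vcf_filter_header` is not visible in this diff, and `build_header_dict` is an invented name.

```python
def prepare_vcf_filter_header_stub() -> dict:
    """Hypothetical stand-in for prepare_vcf_filter_header."""
    return {"RF": {"Description": "placeholder description"}}


def build_header_dict(info_dict: dict) -> dict:
    """Assemble the VCF header metadata, computing the filter header once."""
    filter_header = prepare_vcf_filter_header_stub()  # single call, reused below
    return {"info": info_dict, "filter": filter_header}


# The returned dictionary is what would then be handed to the export step,
# for example as the `metadata` argument of `hl.export_vcf`.
header_dict = build_header_dict({"AC": {"Description": "placeholder description"}})
```

If the second call is needed for a different set of arguments, giving the two results distinct names would make that intent clearer.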
STILL NEED: rest of populated headers for info fields and subsets
Spaghetti Code for FalseDup
Code to take the False Duplication Hail Table (covering three chr21 genes), convert it to a VCF, verify it, and export it.
03-07-24: still needs verification and header added, but just wanted to get the PR opened