VS-776. Update to latest version of VQSR Lite. #8269
…c annotations right now.
Correct for the newly renamed output file.
…ateToLatestVQSRLite # Conflicts: # .dockstore.yml # scripts/variantstore/wdl/GvsCreateFilterSet.wdl
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## ah_var_store #8269 +/- ##
================================================
Coverage ? 42.749%
Complexity ? 23842
================================================
Files ? 2197
Lines ? 167119
Branches ? 18006
================================================
Hits ? 71442
Misses ? 90265
Partials ? 5412
…ateToLatestVQSRLite
…ateToLatestVQSRLite
WRONG! FYI: it was failing because I hadn't ported the test data. Now it passes!
…ateToLatestVQSRLite
Assuming all the changes outside of scripts/variantstore/wdl are exactly what's on master?
gatk_docker = "us.gcr.io/broad-gatk/gatk:4.4.0.0",
annotations = ["AS_QD", "AS_MQRankSum", "AS_ReadPosRankSum", "AS_FS", "AS_MQ", "AS_SOR"],
resource_args = "--resource:hapmap,training=true,calibration=true gs://gcp-public-data--broad-references/hg38/v0/hapmap_3.3.hg38.vcf.gz --resource:omni,training=true,calibration=true gs://gcp-public-data--broad-references/hg38/v0/1000G_omni2.5.hg38.vcf.gz --resource:1000G,training=true,calibration=false gs://gcp-public-data--broad-references/hg38/v0/1000G_phase1.snps.high_confidence.hg38.vcf.gz --resource:mills,training=true,calibration=true gs://gcp-public-data--broad-references/hg38/v0/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz --resource:axiom,training=true,calibration=false gs://gcp-public-data--broad-references/hg38/v0/Axiom_Exome_Plus.genotypes.all_populations.poly.hg38.vcf.gz",
extract_extra_args = "-L ${interval_list} --use-allele-specific-annotations",
score_extra_args = "-L ${interval_list} --use-allele-specific-annotations",
extract_runtime_attributes = {"command_mem_gb": 27},
train_runtime_attributes = {"command_mem_gb": 27},
score_runtime_attributes = {"command_mem_gb": 15},
There's a lot more hard-coded here than I would have expected. Are we confident we'll never want to override any of this?
This is a good point.
All set - added ability to pass runtime block as input.
Added test files.
one nit and a question, otherwise 👍🏻 now that tests are passing
# These are the SNP and INDEL annotations used for VQSR Classic, the order matters.
Array[String] vqsr_classic_indel_recalibration_annotations = ["AS_FS", "AS_ReadPosRankSum", "AS_MQRankSum", "AS_QD", "AS_SOR"]
Array[String] vqsr_classic_snp_recalibration_annotations = ["AS_QD", "AS_MQRankSum", "AS_ReadPosRankSum", "AS_FS", "AS_MQ", "AS_SOR"]
nit: since these are being hard-coded, maybe don't even pass them as inputs, just have them hard-coded in the task?
Yeah - I like that. It wasn't really clear to me why these were parameterized. It would seem unlikely that we would change this sort of thing.
# reference files
# Axiom - Used only for indels
# Classic: known=false,training=true,truth=false
what does this comment (and the ones like it, below) mean?
Classic: known=false,training=true,truth=false
That was from when I was trying to make sure that I had the right settings for both classic and lite. I don't think it's informative here (it was more of a one-time idiot check I did), so I will get rid of them.
Update to latest version of VQSR Lite
Refactor GvsCreateFilterSet.wdl to move VQSR Classic code to its own WDL