-
Notifications
You must be signed in to change notification settings - Fork 596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performed a round of ablation on new annotation-based filtering tools. #8131
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Codecov Report
@@ Coverage Diff @@
## master #8131 +/- ##
===============================================
+ Coverage 86.362% 86.626% +0.264%
+ Complexity 39551 38919 -632
===============================================
Files 2362 2336 -26
Lines 186121 182603 -3518
Branches 20305 20062 -243
===============================================
- Hits 160738 158181 -2557
+ Misses 18236 17379 -857
+ Partials 7147 7043 -104 |
2268ee6
to
61250ef
Compare
@koncheto-broad this is one of the VQSR-lite PRs you will want to eventually rebase on. It's still awaiting review (I was waiting until the dust from updating to Java 17 in #8035 settles), but if anyone from your team wants to take a first crack, feel free! Not too many code changes, so hopefully it should be pretty manageable. Just so it's all in one place: your #8157 GVS branch is currently rebased on #7954, which contains the "serial SNP-then-indel" version of the Joint Genotyping WDL (written by Megan for Ultima) and the Java code for the tools. Some minor updates were made to the Java code in #8049 and the WDL was rewritten by me to do SNPs and indels in a single pass in #8074. (EDIT: I was originally confused here, the WDL that was replaced in this PR simply ran SNPs and indels separately, rather than serially—thanks to George for correcting me here.) The PR here makes relatively minor updates to both the Java code and the WDL and might require very minor updates to GVS code or JSON configurations. And finally, the larger PR at #8132 adds a Pure Java BGMM backend. As we discussed during my mobbing presentation, this is provided merely as a convenience for those users that might not be able to control their python environment (hopefully a small number, these days!), so getting it merged is probably less urgent and should not affect any GVS work. |
Oh, also note that there might be some variable-name references in the Javadocs for the VQSR-lite tools that are not rendered properly in online docs; see #8146 for more context. However, if you're just looking at the Javadocs via IntelliJ, everything should look fine. |
Ooh, thanks for the heads up Sam! It looks like, after George is done and we have done some head to head comparisons, this is definitely one of the PRs we'll need to yank over |
61250ef
to
bda84e0
Compare
bda84e0
to
d1dbe69
Compare
…n and fixed other minor documentation issues.
@meganshand I think this PR should go in the same GATK release needed for your broadinstitute/warp#1076 PR. Would you mind reviewing? There are a couple of very minor argument changes that will bubble up to the WDL level---namely, the changes to the arguments for specifying the python scripts and allele-specific mode outlined in the bullets above. @koncheto-broad @jonn-smith @droazen @MonicaDou24 might also want to take note. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! A few very minor questions.
...g/broadinstitute/hellbender/tools/walkers/vqsr/scalable/LabeledVariantAnnotationsWalker.java
Show resolved
Hide resolved
...g/broadinstitute/hellbender/tools/walkers/vqsr/scalable/LabeledVariantAnnotationsWalker.java
Show resolved
Hide resolved
@@ -121,34 +121,33 @@ | |||
* | |||
* <ul> | |||
* <li> | |||
* Scored VCF file and index. The VCF will not be gzipped if the {@value DO_NOT_GZIP_VCF_OUTPUT_LONG_NAME} | |||
* Scored VCF file and index. The VCF will not be gzipped if the "--do-not-gzip-vcf-output" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming there's a reason that all of the links were replaced in the documentation and that wasn't a mistake, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see #8146 and discussion linked above. I left the @link
tags since they are more parseable, even when not rendered. Unfortunately, I think removing these tags will make the docs harder to maintain, but it seems like that issue is a wontfix at this point! :(
Thanks for the speedy review, @meganshand! |
#8131) * Performed a round of ablation on new annotation-based filtering tools. * Removed Javadoc tags unsupported by Barclay in VETS tool documentation and fixed other minor documentation issues.
useAlleleSpecificAnnotations
argument; we instead infer whether to run in allele-specific mode from the VCF header.