Added warning for unsorted genotype field #7887
@@ -472,6 +478,16 @@ protected GenomicsDBOptions getGenomicsDBOptions() {
     public void onTraversalStart() {
         final Map<String, VCFHeader> vcfHeaders = Collections.singletonMap(getDrivingVariantsFeatureInput().getName(), getHeaderForVariants());

+        List<String> genotypeField = getHeaderForVariants().getGenotypeSamples();
+        if(!ParsingUtils.isSorted(genotypeField)){
+            logger.warn("Detected unsorted genotype fields on input. This could result in very slow traversal for large files.");
Re #7732: given how dangerous the risk of 10x slower runs is for very large VCF files, I would recommend making this warning much louder and harder to miss. It should behave like the warning in HaplotypeCaller.onTraversalStart() so that it is much more visible.
I would also change the text here from "This could result in very slow traversal for large files" to "SelectVariants will sort the genotypes on output which will result in very slow traversal on large inputs", or something along those lines.
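To make the suggestion concrete, here is a rough sketch of the check plus a louder message. It is plain Java with no GATK or htsjdk dependencies, so `UnsortedSampleBanner`, `isSorted`, and `buildBanner` are hypothetical names, and the asterisk-framed banner is only my assumption of what a "hard to miss" warning could look like, not a copy of what HaplotypeCaller prints. The sortedness test assumes natural `String` ordering of sample names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of GATK: illustrates the check and a louder warning.
final class UnsortedSampleBanner {

    /** Returns true if each element is >= its predecessor under natural ordering. */
    public static <T extends Comparable<? super T>> boolean isSorted(final Iterable<T> items) {
        T previous = null;
        for (final T current : items) {
            if (previous != null && previous.compareTo(current) > 0) {
                return false; // found an out-of-order adjacent pair
            }
            previous = current;
        }
        return true; // empty and single-element inputs count as sorted
    }

    /** Builds the warning as separate lines so each can be passed to logger.warn(). */
    public static List<String> buildBanner(final String toolName) {
        // Java 8 compatible way to repeat a character 80 times.
        final String bar = String.join("", Collections.nCopies(80, "*"));
        return Arrays.asList(
                bar,
                "* " + toolName + " detected unsorted genotype samples in the input header.",
                "* " + toolName + " will sort the genotypes on output which will result in"
                        + " very slow traversal on large inputs.",
                bar);
    }
}
```

In the tool this would presumably reduce to something like `if (!isSorted(samples)) buildBanner("SelectVariants").forEach(logger::warn);` inside `onTraversalStart()`, but that wiring is untested here.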
Ideally this warning could be contextual and only happen if we would otherwise not unpack the genotypes, but that might be difficult to check for.
Honestly, tying the warning to the file size or the number of genotypes in the first place might also be worth it.
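One possible shape for that, as a sketch only: gate the warning on the number of samples in the header. `SampleCountGate`, `shouldWarn`, and the threshold parameter are hypothetical, and no particular cutoff is implied by this PR or by #7732; the caller would have to pick one.

```java
import java.util.List;

// Hypothetical helper, not part of GATK: warn only when the header is both large and unsorted.
final class SampleCountGate {

    /**
     * Returns true when the sample list has at least minSamples entries and is not in
     * natural String order. The size check runs first so small inputs skip the scan.
     */
    public static boolean shouldWarn(final List<String> samples, final int minSamples) {
        if (samples.size() < minSamples) {
            return false; // small inputs: the sorting cost is assumed to be negligible
        }
        for (int i = 1; i < samples.size(); i++) {
            if (samples.get(i - 1).compareTo(samples.get(i)) > 0) {
                return true; // large and unsorted
            }
        }
        return false; // large but already sorted
    }
}
```

This only covers the genotype-count half of the idea; tying it to file size would need information that the header alone does not provide.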
yeah, make the size of the warning font be dependent on the number of genotypes.
Codecov Report
@@ Coverage Diff @@
## master #7887 +/- ##
===============================================
- Coverage 86.936% 86.935% -0.000%
- Complexity 36945 36953 +8
===============================================
Files 2221 2221
Lines 173803 173821 +18
Branches 18775 18777 +2
===============================================
+ Hits 151097 151112 +15
- Misses 16075 16077 +2
- Partials 6631 6632 +1
This looks good to me.
@@ -489,8 +510,7 @@ public void onTraversalStart() {

         // Look at the parameters to decide which analysis to perform
         discordanceOnly = discordanceTrack != null;
-        if (discordanceOnly) {
-            logger.info("Selecting only variants discordant with the track: " + discordanceTrack.getName());
+        if (discordanceOnly) {logger.info("Selecting only variants discordant with the track: " + discordanceTrack.getName());
The spacing/indentation here should be fixed before merging.
^ I agree with David. Can you revert this, please?
Fixes #7732