Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ValidateVariants silently does no validation in use common cases #5862

Closed
cmnbroad opened this issue Apr 5, 2019 · 8 comments
Closed

ValidateVariants silently does no validation in use common cases #5862

cmnbroad opened this issue Apr 5, 2019 · 8 comments

Comments

@cmnbroad
Copy link
Collaborator

cmnbroad commented Apr 5, 2019

There are several cases where ValidateVariants does no actual validation, and issues no warning message. This includes the default case, where the minimal set of required args is provided (these are examples from the doc, which should be updated when this is fixed):

gatk ValidateVariants -V some.vcf
gatk ValidateVariants -V some.vcf -R some.fasta

Either of these silently results in no validation and no warning message, despite the entire VCF being decoded and traversed, because the default validation type is "ALL", which includes validation type "IDS". But IDS requires a dbsnp arg, and none was provided, so the code short-circuits out. The default case should probably do whatever validation it can, but at a minimum a warning should be logged. Ironically, if you provide an exclusion on the command line via --validation-type-to-exclude IDS, then validation is done.

Another no-op case is --validation-type-to-exclude ALL (also recommended in the doc), which also should probably be rejected, or at least logged, since it silently does no validation and reports no errors. This tripped up this user, and resulted in a downstream BCF issue.

@jonn-smith
Copy link
Collaborator

When we test this, we should make sure it flags AD when there is only 1 field in there.

@samuelklee
Copy link
Contributor

Not sure if this is related to #4642, but might be good to look into that as well!

@jonn-smith
Copy link
Collaborator

I'm guessing that they are related (given the description @cmnbroad gave of #4642)

@cmnbroad
Copy link
Collaborator Author

cmnbroad commented Apr 8, 2019

Yup, its another variation of the same issue.

@jonn-smith
Copy link
Collaborator

I recommend we error out when provided with --validation-type-to-exclude ALL. It doesn't make sense - why call the validator if you're not going to validate?

@cmnbroad
Copy link
Collaborator Author

cmnbroad commented Jun 4, 2019

I didn't realize before that there is no "include" (opt-in) validation type arg, only "exclude". So I'm not sure what the purpose of having "ALL" is in the first place, if the only thing you can usefully do with it is exclude it.

I think the best longer term fix would be to add an "--validation-type-to-include" arg, and have it default to the everything except for IDs, and then construct the actual types based on merging include/exclude args. But thats a bigger change then just fixing the current (silent do-nothing) default behavior, and requires more error checking for conflicting args.

Lets start with changing it so that in the default (no args) case, we log a warning message saying that IDs will be left out since no IDS were provided, and proceed with the remaining validations. Then if we want to get more ambitious we can talk about making the bigger change.

Also, as part of the initial fix, it might be a good idea to change calculateValidationTypesToApply so that it doesn't modify the excludeTypes list directly, since this is the list provided by the user, and instead uses it's own temporary list.

@cmnbroad
Copy link
Collaborator Author

cmnbroad commented Jun 4, 2019

And one more instance of a user hitting this.

@jamesemery
Copy link
Collaborator

Closed Via #5984

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants