Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added an explicit override to RMSMappingQuality annotation to enable old code fallback. #6060

Merged
merged 4 commits into from
Jul 31, 2019

Conversation

jamesemery
Copy link
Collaborator

After spending some time to resolve this users issue, https://gatkforums.broadinstitute.org/gatk/discussion/24134/gatk4-rmsmappingquality-results-differ-between-v4-0-0-0-and-v4-1-1-0/p1, it became clear that the issue was that the user simply mismatched her versions of gatk, which seems to have caused their MQ annotations to tank. The user didn't notice the warnings of this fact until we had already nearly found the issue by debugging. I propose that we upgrade the warning to an exception with explicit override to make it harder for this issue to slip past people in the future.

@jamesemery jamesemery requested a review from ldgauthier July 26, 2019 20:11
@jamesemery jamesemery force-pushed the je_makeRMSMappingQualityWarningEVENLOUDER branch from 0eafbd1 to bae4381 Compare July 26, 2019 20:13
@jamesemery
Copy link
Collaborator Author

@ldgauthier Do you have any objection to this branch? I would like to save ourselves the trouble of having to debug issues related to this mismatching in the future.

Copy link
Contributor

@droazen droazen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Back to @jamesemery with my comments

@@ -55,6 +57,10 @@
public static final int NUM_LIST_ENTRIES = 2;
public static final int SUM_OF_SQUARES_INDEX = 0;
public static final int TOTAL_DEPTH_INDEX = 1;
public static final String RMS_MAPPING_QUALITY_OLD_BEHAVIOR_OVERRIDE_ARGUMENT = "allow-old-rms-mapping-quality-annotation-data";

@Argument(fullName = RMS_MAPPING_QUALITY_OLD_BEHAVIOR_OVERRIDE_ARGUMENT, doc="Override to allow old RMSMappingQuality annotatated VCFs to function", optional=true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

annotatated -> annotated

@@ -124,6 +130,13 @@ private String makeRawAnnotationString(final List<Allele> vcAlleles, final Map<A
rawMQdata = vc.getAttributeAsString(getRawKeyName(), null);
}
else if (vc.hasAttribute(getDeprecatedRawKeyName())) {
if (!allowOlderRawKeyValues) {
throw new UserException.BadInput("Presence of '-"+getDeprecatedRawKeyName()+"' annotation is detected. This GATK version expects key "
+ getRawKeyName() + " with an long tuple of sum of squared MQ values and total reads over variant "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an long tuple -> a tuple

@@ -326,6 +327,7 @@ private void runGenotypeGVCFSAndAssertSomething(String input, File expected, Lis
final ArgumentsBuilder args = new ArgumentsBuilder();
args.addReference(new File(reference))
.addArgument("V", input)
.addArgument(RMSMappingQuality.RMS_MAPPING_QUALITY_OLD_BEHAVIOR_OVERRIDE_ARGUMENT)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an actual test case covering the new argument, instead of just adding it unconditionally to this helper method for the GenotypeGVCFs tests.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^I will add a test asserting that it spits out a user exception, its worth noting that the GenotypeGVCFs integration tests have a mixture of old and new RMSMappingQuality annotations attached to them, so this behavior has already been tested per se.

throw new UserException.BadInput("Presence of '-"+getDeprecatedRawKeyName()+"' annotation is detected. This GATK version expects key "
+ getRawKeyName() + " with an long tuple of sum of squared MQ values and total reads over variant "
+ "genotypes as the value. This could indicate that the provided input was produced with an older version of GATK. " +
"Use the argument '--"+RMS_MAPPING_QUALITY_OLD_BEHAVIOR_OVERRIDE_ARGUMENT+"' to override and attempt the deprecated MQ calculation.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you add a warning that this is unlikely to work, and is likely to produce incorrect results?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be able to produce correct results, but I must have screwed something up. Clearly it doesn't work 100% of the time, so that warning is a great idea. I'll put it on my tech debt list to see if I can fix it.

Copy link
Contributor

@ldgauthier ldgauthier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix!

@jamesemery
Copy link
Collaborator Author

@droazen responded to your comments

@jamesemery jamesemery assigned droazen and unassigned jamesemery Jul 30, 2019
Copy link
Contributor

@droazen droazen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 looks good -- merge once tests pass

@droazen droazen assigned jamesemery and unassigned droazen Jul 31, 2019
@jamesemery jamesemery merged commit baab529 into master Jul 31, 2019
@jamesemery jamesemery deleted the je_makeRMSMappingQualityWarningEVENLOUDER branch July 31, 2019 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants