Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of harness-level errors in wptrunner #10444

Merged
merged 1 commit into from
Apr 15, 2018

Conversation

jgraham
Copy link
Contributor

@jgraham jgraham commented Apr 12, 2018

Split the internal handling of errors during a test into two cases;
INTERNAL-ERROR that is produced if there's an exception in the harness
and ERROR that is for exceptions reported by the test. Both are still
reported as ERROR to the user, but in the INTERNAL-ERROR case the
runner is always restarted, just like EXTERNAL-TIMEOUT, since we
assume that the internal state is compromised somehow.


This change is Reviewable

@jgraham jgraham requested a review from gsnedders April 12, 2018 13:13
@jgraham
Copy link
Contributor Author

jgraham commented Apr 12, 2018

@jugglinmike I think you might be interested in this.

Copy link
Member

@foolip foolip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % docs, but I'd also like @jugglinmike to have a look.

Does any documentation of the different statuses exist? Having a description of "INTERNAL-ERROR" under docs/ (added in this PR) would make it easier to review, and benefits everyone trying to understand it in the future.

@foolip
Copy link
Member

foolip commented Apr 12, 2018

Also, do we need to run this through Gecko's CI to have confidence that this works?

@jgraham
Copy link
Contributor Author

jgraham commented Apr 12, 2018

We don't currently have much documentation on the wptrunner internals, so there's nowhere obvious to add this. It isn't ever exposed to the user (it gets mapped down to ERROR before output because mozlog only supports a limited set of statuses, in order for consistency between users).

@jgraham
Copy link
Contributor Author

jgraham commented Apr 12, 2018

https://treeherder.mozilla.org/#/jobs?repo=try&revision=890c9d39ed8c4239340954a1aed87fa3c02ba38e&selectedJob=173288584 is the try run started by our bot, although that doesn't include all tests by default. I can do another try run with everything if that looks OK.

@jgraham jgraham force-pushed the wptrunner_internal_error branch from 2d1b290 to 2bd39e3 Compare April 12, 2018 16:00
@@ -563,10 +563,12 @@ def test_ended(self, test, results):
# TODO: consider changing result if there is a crash dump file

# Write the result of the test harness
result_subns = {"INTERNAL-ERROR": "ERROR",
"EXTERNAL-TIMEOUT": "TIMEOUT"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word "result" is being applied to the object that contains this status string. What do you think of status_subns or status_map?

@jugglinmike
Copy link
Contributor

@foolip's request for documentation seems reasonable to me, but I also understand @jgraham's reluctance to bloat usage instructions with implementation details. A few ideas:

James writes, "It isn't ever exposed to the user (it gets mapped down to ERROR before output because mozlog only supports a limited set of statuses, in order for consistency between users)." If this detail was included as an in-line comment, we could address Philip's concern about "everyone trying to understand it in the future" without adding to the information we present to the average user

This patch overloads the string "ERROR" a bit. Depending on the context, it could mean "test error, not harness error", but it could also mean "test error or harness error". This could be confusing for others. Since the latter definition is unrestricted, it more closely matches the string. Would it be too onerous to introduce a new value dedicated to the "test error, not harness error" status? (This would also help future contributors get their bearings since the distinction James is making would be explicitly reflected in the code.)

Thanks for taking the lead on this, James!

@jgraham
Copy link
Contributor Author

jgraham commented Apr 12, 2018

Adding a new error status to mozlog is hard, because it involves updating it, making a release, updating all formatters to ensure that they don't choke on the new value, and hoping you didn't miss any external ones.

It's about now you realise why it is that people like languages with enums and irrefutable pattern matching… mozlog would be great in Rust ;)

I'm happy to add a comment about what's going on in the source though.

Split the internal handling of errors during a test into two cases;
INTERNAL-ERROR that is produced if there's an exception in the harness
and ERROR that is for exceptions reported by the test. Both are still
reported as ERROR to the user, but in the INTERNAL-ERROR case the
runner is always restarted, just like EXTERNAL-TIMEOUT, since we
assume that the internal state is compromised somehow.
@jgraham jgraham force-pushed the wptrunner_internal_error branch from 2bd39e3 to 712be3d Compare April 12, 2018 17:17
@jugglinmike
Copy link
Contributor

Adding a new error status to mozlog is hard, because it involves updating it,
making a release, updating all formatters to ensure that they don't choke on
the new value, and hoping you didn't miss any external ones.

I expected the path to updating mozlog would be a difficult one, but I also
don't think it's necessarily appropriate. The distinction we're discussion
doesn't necessarily make much sense in a general-purpose logging utility. What
I meant to suggest was the introduction of a new string value for internal use,
one to mirror "INTERNAL-ERROR" (maybe "TEST-ERROR") that would be likewise
translated to "ERROR" for integration with mozlog. My thinking was when reading
the source (and debugging runtime values), seeing that value would be more
helpful because you wouldn't have to wonder, "is this describing a test error?
Or has the mozlog translation already occurred, meaning it might be describing
an internal error?"

It's about now you realise why it is that people like languages with enums
and irrefutable pattern matching… mozlog would be great in Rust ;)

Still looking for an excuse to learn that language...

@jgraham
Copy link
Contributor Author

jgraham commented Apr 12, 2018

Oh, I see. The "ERROR" string comes pretty much directly from testharness.js, so I'd prefer not to change that internally just to change it back again later.

@jugglinmike
Copy link
Contributor

This looks good to me. I don't know how to interpret the Treeherder UI, though. Does it make sense to move forward with that complete test run, now?

@jgraham jgraham merged commit 6736e3f into master Apr 15, 2018
@gsnedders gsnedders deleted the wptrunner_internal_error branch April 17, 2018 01:12
@gsnedders
Copy link
Member

We don't currently have much documentation on the wptrunner internals, so there's nowhere obvious to add this.

tools/wptrunner/docs?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants