wptrunner: Always restart on test error #10186
Because a harness error may describe an unrecoverable browser state, the browser should always be restarted when such an error is encountered (even in the presence of the `--no-restart-on-unexpected` flag).
Am in favour. |
Build PASSED. Started: 2018-03-27 00:57:17 |
I suspect this will have performance implications. The typical cause of an I think if you want to do this, you should make a Incidentally the example where the harness window gets closed is supposed to be handled. Does that not work correctly? Also if you know of specific problematic tests it's possible to set |
(see other comment)
Until now, I had considered "ERROR" to be reserved for cases where the test If "ERROR" is an acceptable status for failing tests, how does it differ from
It does not. In case it helps:
In the WPT Dashboard project, we do not control the test |
A single test failing means that an
That situation is not very different from the one that a vendor is in. I'm not suggesting you put expectations in your expectation metadata (but you could!), but that you use the other features, e.g. the ability to disable specific problematic tests or to force a restart after known problematic tests. Having control of the implementation doesn't affect the process around using these things. |
Speaking theoretically, would you agree that it would be better if tests did
You're right, of course. We can document the rationale for a given expectation, |
Yep, much of the design of testharness.js was about avoiding this (or at least isolating different |
So, I'm trying to work out what the core of the problem is here. Do harness errors and some other bad things all result in "ERROR" as the test-level status, or are we talking about the harness status being "ERROR" instead of "OK"? Can reftests be "ERROR"? Is this stuff documented anywhere? :) I actually thought that "always restart on test error" was already the default, but clearly that's not the case. Most existing harness errors are probably no more suggestive of a needed restart than a failed test. Is it possible to separate out the "test window gone" case and treat that as a more serious kind of error and restart for just that? Alternatively, how many harness errors are there, can we fix 90% of them, and does it then make sense to restart on "ERROR" always? |
The WPTDashboard project enabled the WPT CLI's
To quote James above:
So the problem we're experiencing with Edge and Safari is definitely a violation (By the way, James: I can no longer reproduce the behavior using Firefox, so |
I created #10444 as a better alternative here. There may be better ways to deal with the general problem of e.g. the window being closed unexpectedly, but I think that approach will improve the reliability without too much additional overhead in the case where the errors are at the test level. |
Closing this on the basis that the INTERNAL-ERROR changes will fix things. |
Because a harness error may describe an unrecoverable browser state, the
browser should always be restarted when such an error is encountered
(even in the presence of the `--no-restart-on-unexpected` flag).

@jgraham @gsnedders If `wpt run --no-restart-on-unexpected` is used to execute
a test file like this:
Then after that test fails with the status "ERROR", all subsequent tests fail
for the same reason. By interpreting the error as unrecoverable, we can avoid
invalidating test runs in these cases.
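
To make the proposed rule concrete, here is a minimal Python sketch of the restart decision described above. It is an illustration only and not wptrunner's actual code: the function name `should_restart` and its parameters are hypothetical, and the real runner considers more statuses than shown here.

```python
def should_restart(status, expected, restart_on_unexpected=True):
    """Decide whether to restart the browser after one test result.

    Hypothetical helper for illustration; not part of wptrunner's API.
    """
    # Proposed rule: an "ERROR" status may describe an unrecoverable
    # browser state, so restart regardless of the flag.
    if status == "ERROR":
        return True
    # Otherwise, restart only when the result is unexpected and the
    # user has not passed --no-restart-on-unexpected.
    return restart_on_unexpected and status != expected


# With --no-restart-on-unexpected (restart_on_unexpected=False):
print(should_restart("ERROR", "OK", restart_on_unexpected=False))
print(should_restart("FAIL", "PASS", restart_on_unexpected=False))
```

Under this sketch, only the "ERROR" case forces a restart when the flag is set; an ordinary unexpected "FAIL" still does not.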
That's a contrived example, though. As an alternative, we might be able to
address it by detecting a missing window and applying the status "CRASH". I
started researching this possibility, but it started to look like this logic
would be dependent on the browser, executor, and/or transport. I'm also
thinking that there are many more subtle ways the environment could become
corrupted, and we'd probably never be able to detect them all.
In contrast, what I'm proposing here is much more heavy-handed. There are many
kinds of harness errors that don't require full restarts, so this patch will
make running WPT with `--no-restart-on-unexpected` slower. My inclination is to
prefer correctness and completeness over speed since test execution time
(unlike correctness) can be improved by consumers by using more machines. I
also get the sense that not many consumers have a use for this particular flag,
so I'm guessing that we wouldn't want to take on the complexity overhead of a
more nuanced solution.
What do you think?