Add "expects" (not just "asserts") to testharness #9180
FYI @foolip @alvestrand, we talked about this and I filed this bug.
@henbos, would this pattern work for you?

```js
async_test(t => {
  var track = something;
  track.onload = t.step_func(event => {
    assert_stuff();
    test(() => {
      assert_things_about(event.transceiver);
    }, 'inner test');
    assert_more_stuff();
  });
}, 'outer test');
```

It would show up as two tests, but if the transceiver bits fail, the outer test will still continue.
Oh that's interesting! That would probably do it, yes.
@henbos, can you try it and see if it's good enough? The trouble with adding "expects" is that we'd have to figure out how to report at least the failures separately, and this approach does achieve that. I'm going to label this as roadmap, and if it still seems good enough when it's revisited in 60 days, we'll close it.
I advise against authoring tests in such a way that the number of sub-tests is dependent on the implementation status. Doing so makes results very hard to interpret:
The number of subtests can generally be expected to increase with each commit made to WPT, so "correctness" is always a moving target. But for the use cases above, we're holding the WPT revision constant. It's important that the "number of subtests" is a consistent denominator in these cases, so that everyone shares the same expectations.

Of the alternatives listed by @henbos, the first seems most conducive to both of those use cases. Writing multiple tests that reach some intermediate state before testing different expectations may involve more code, but any maintainability concerns with this practice are traditionally addressed by defining "setup" functions.

To be honest, I don't know how frequently the pattern @foolip suggested is used in WPT. It may be that the ship has already sailed on this. But it's something I'd be interested in investigating in the coming weeks, especially in terms of the https://wpt.fyi project.
#8894 is an example of this being done in practice, which I had no hand in encouraging, FWIW. In this case, the number of subtests doesn't depend on implementation status. However, I also think the ship has pretty much sailed on achieving a consistent number of subtests irrespective of implementation status; it is just too easy to write tests that fail to achieve this.

My rule of thumb is rather that there should be some failing test if all is not well. If fixing that test causes new failing tests to appear, that may be surprising, but if there is a true dependency, the alternative is to set up all tests up front and fail them in the right places (or just rely on timeouts, which is slower).

With wpt.fyi, since we can take the union of all tests and subtests, I think we have a path forward for making such comparisons that still allows writing tests with dependencies in the most straightforward manner. I for one would rather LGTM a single test with such a dependency than two with shared setup code if that setup code is async and not super fast. But it's up to the people maintaining each part of WPT to make a judgement on this; I don't think we need a policy as such.
It should be easy to make subtests fail when a feature is not supported rather than skip them entirely; it's just a matter of where to place the prerequisite condition: in the subtest or outside of it.
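For illustration, a minimal sketch of the two placements (the addTransceiver feature check and the test names are made up for this example, not taken from an actual WPT file):

```js
// Prerequisite outside the subtest: an implementation without the feature
// never creates the subtest, so it is silently missing from the results.
if ('addTransceiver' in RTCPeerConnection.prototype) {
  test(() => {
    // ... transceiver assertions ...
  }, 'transceiver details');
}

// Prerequisite inside the subtest: an implementation without the feature
// gets an explicit FAIL instead of a missing result.
test(() => {
  assert_true('addTransceiver' in RTCPeerConnection.prototype,
              'addTransceiver should be supported');
  // ... transceiver assertions ...
}, 'transceiver details');
```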
Sorry, the event might not fire, so it is a concern. But maybe this can be solved with a helper utility function? Hmm.
Is there anything preventing the design of something like this?
As long as you don't have subtests inside of subtests...
Thinking out loud - how about something like this?
@henbos, those patterns already work, except it's not called subtest but just test, async_test or promise_test. You can do exactly that with the existing functions.
That's what I was thinking about with "set up all tests up front". Again, the downside is that you have to explicitly cause them to fail later if they can't be run, or it'll be a timeout. I tend to think just creating the tests at the end of another is better overall.
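A rough sketch of the "set up all tests up front" variant, assuming a hypothetical WebRTC-style setup (the `pc` connection and the test names are invented for illustration): the dependent test is created unconditionally and explicitly failed when its prerequisite turns out to be missing.

```js
// Both subtests exist regardless of what the implementation supports.
const trackTest = async_test('track event fires when a track is added');
const transceiverTest = async_test('track event exposes a transceiver');

// ... create an RTCPeerConnection `pc` and add a track on the remote side ...

pc.ontrack = trackTest.step_func_done(event => {
  assert_true(event.track instanceof MediaStreamTrack);

  if (event.transceiver !== undefined) {
    transceiverTest.step(() => {
      assert_true(event.transceiver instanceof RTCRtpTransceiver);
    });
    transceiverTest.done();
  } else {
    // Explicitly fail the dependent test rather than letting it time out.
    transceiverTest.step(() =>
        assert_unreached('no transceiver on the track event'));
  }
});
```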
I think the conclusion is that this is supported by existing constructs using the right patterns? |
Reopening to reconsider this per https://groups.google.com/a/chromium.org/forum/m/#!topic/ecosystem-infra/HuqN39mcBNk |
There have been some similar concerns in WebKit, since this is one thing the js-test harness supports. Wrapping every assertion in a test() would do the trick.
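A hypothetical helper along those lines (the name `expect`, the `pc` connection and the test names are invented for illustration) could simply wrap an assertion in its own test(), so a failure produces a FAIL line without aborting the enclosing test:

```js
// Hypothetical helper: run `fn` as its own subtest so that a failing
// assertion is reported but does not abort the caller.
function expect(fn, name) {
  test(fn, name);
}

async_test(t => {
  // ... create an RTCPeerConnection `pc` and trigger a remote track ...
  pc.ontrack = t.step_func_done(event => {
    // A plain assert still aborts the outer test on failure.
    assert_true(event.track instanceof MediaStreamTrack);

    // This one only fails its own subtest; the outer test keeps going.
    expect(() => {
      assert_true(event.transceiver instanceof RTCRtpTransceiver);
    }, 'track event exposes a transceiver');
  });
}, 'ontrack fires when a track is added');
```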
Talking with @jgraham, we might want to promote this issue to an RFC issue once RFC issues exist.
@youennf, I think that would make sense, yes. https://github.com/foolip/rfcs is the proposal for how that will look. A concrete change proposal to evaluate would be great even before that process is in place, though. |
friendly ping. @foolip it looks like this is |
Downgrading. web-platform-tests/rfcs#1 is in progress, and that might then be the forum to hammer this out; after that it'll need a champion who isn't me :)
In WPT it would be useful to "expect" something, resulting in a PASS or FAIL line in the -expected.txt, without necessarily aborting the test if it FAILs the way "asserts" do.
For example, in WebRTC, features are often shipped with partial support. In particular, WebRTC has gone through three stages for how to handle media - "addStream & getLocalStreams" (deprecated), "addTrack & getSenders" (current) and now also "addTransceiver & getTransceivers" (recent addition).
I'm simplifying here for the sake of argument, but the majority of behaviors can be tested with "was a track added?"-style tests, while a fully spec-compliant test is of the "was a track and a transceiver added?" kind.
When an "assert" aborts the test we lose all the test coverage for the related behaviors. For example, Chrome fails all tests related to "ontrack" firing because the test for "ontrack fires when a track is added" (which is supported) also asserts "track events contain a transceiver" (Chrome does not support transceivers yet).
Another example is "when a track is added we get track stats" failing if "all stats related to tracks are fully implemented" fails.
In order to use WPT as our main test tool, we have to either:
I've focused on 1) and have rewritten a large number of tests related to tracks. Being able to do 2) would be nice in some cases. I've seen examples of 3), but this is bad because you end up with in-code TODOs and with limiting what is tested based on particular browser implementations.