Treat 'PRECONDITION_FAILED' as 'PASS' for interop scoring purposes #178
Conversation
'PRECONDITION_FAILED' means that the condition tested by `assert_implements_optional` is `false`. For instance, a test might exercise the same feature multiple times with different video/audio codecs that are optional; if a codec is not supported, 'PRECONDITION_FAILED' is returned as the status (see https://web-platform-tests.org/writing-tests/testharness-api.html#optional-features).

This is different from the API itself not being supported: in that case, evaluating the condition passed to `assert_implements_optional` would throw an exception, and the status would be 'ERROR'.
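For concreteness, here is a minimal sketch of the kind of subtest being described; the codec string and the use of `promise_setup` are illustrative assumptions, not copied from an actual wpt test:

```js
// Illustrative only (made-up codec string; not a real wpt test).
// The optional-codec check runs before the subtests:
//  - WebCodecs present but the codec unsupported: assert_implements_optional()
//    sees `false` and the run ends with status PRECONDITION_FAILED.
//  - WebCodecs missing entirely: VideoDecoder is undefined, evaluating the
//    condition throws, and the run ends with status ERROR instead.
promise_setup(async () => {
  const support = await VideoDecoder.isConfigSupported({codec: 'vp09.00.10.08'});
  assert_implements_optional(support.supported, 'vp09 decoding supported');
});

promise_test(async () => {
  // ... assertions that actually exercise decoding with the optional codec ...
}, 'decode with an optional codec');
```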
This change considers every 'PRECONDITION_FAILED' as a 'PASS'.

Is there merit in making a fix to aggregate this differently for the WebCodecs category? I'd just like to be certain that there aren't scenarios where …
Usually we recommend …
Just treating these like passes makes me nervous; it seems very misleading to claim that not supporting a feature amounts to passing tests for that feature (in particular for something like webcodecs, if you only support, say, one media format out of four, you could get a 75% score without supporting the feature at all).

We could instead just ignore the subtests with a PRECONDITION_FAILED result, which would make the per-browser score work better (you'd be scored on the features you do support rather than on the ones you don't). However, it would "break" the overall interop score, since that's computed from the total number of subtests.

For webcodecs in particular, I wonder if the best solution would be to have an "interop" mode for the tests which picks a supported video format and then uses that for all the subtests (this is approximately what authors are expected to do). We could keep the per-format variants in the main wpt suite in case there are format-specific problems.
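To make the arithmetic concrete, here is a rough sketch of the scoring policies being compared. The counting is deliberately simplified and is not taken from the actual interop scoring code:

```js
// Simplified illustration, not the real scoring code. One feature, four codec
// variants; the browser supports only the first codec and fails its test.
const results = ['FAIL', 'PRECONDITION_FAILED', 'PRECONDITION_FAILED', 'PRECONDITION_FAILED'];
const isPass = r => r === 'PASS';

// This PR: count PRECONDITION_FAILED as PASS.
const asPass = results.filter(r => isPass(r) || r === 'PRECONDITION_FAILED');
console.log(asPass.length / results.length);                 // 0.75, despite no real passes

// Alternative: drop PRECONDITION_FAILED subtests from the denominator.
const scored = results.filter(r => r !== 'PRECONDITION_FAILED');
console.log(scored.filter(isPass).length / scored.length);   // 0 out of 1 scored subtest
// ...but each browser now has a different subtest total, which "breaks" an
// overall interop score computed from the total number of subtests.

// Status quo: PRECONDITION_FAILED simply doesn't count as a pass.
console.log(results.filter(isPass).length / results.length); // 0
```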
In the case of WebCodecs, the precondition contains calls to the WebCodecs API, so if the browser does not support WebCodecs, you'd get an error instead of PRECONDITION_FAILED.
I'm also fine with this if this is an easy solution (but it sounds like it isn't).
This sounds similar to just excluding from Interop the test variants for codecs that are not supported across all browsers (which is somewhat what I'm suggesting in #375). This solution is also fine by me.
We also don't correctly handle tests that return …
If there's a single format we expect all participating engines to support, I think only including that format in Interop is the right way forward here.
I'm also fine with this, though I have a slight preference for keeping all of the formats that are expected to be widely supported, instead of just one. The reason is that different formats sometimes have slightly different decoding/encoding rules that are reflected in the tests, so it would be nice to keep that coverage. I honestly have no strong opinion though; I would also be fine with keeping just one format.
Yes, sorry, s/one/one or more/. I just mean that if the intersection of formats that all participants will implement is non-empty, we should restrict the Interop-included tests to that intersection rather than trying to figure out how to handle the cases where some implementations intentionally don't implement a format.
How can you get that without supporting the feature at all? If you don't support the feature at all, you'll get 0%, provided the tests fail for non-support of Web Codecs but precondition-fail for non-support of a given codec. Or are you considering the media format as part of the feature?
I think, yes, that would be ideal, but it also requires more work than altering how the scoring works (and we still have this problem in the codebase; we need to deal with …).
Hypothetically, one could just implement the API for checking codec support, always return unsupported, and then it would look like you were passing every test. I think done deliberately that would be considered bad-faith behaviour, but one can imagine similar situations arising if we unconditionally treat 'PRECONDITION_FAILED' as 'PASS'.

If there's a set of codecs that all of Blink/Gecko/WebKit support, I think the easiest solution here would just be to remove tests for other codecs from the Interop set, since it's apparent that we're not going to get interop for codecs that not every browser implements. Unless that leaves us with no tests, I think it's likely to be easier than creating a set of tests that pick a supported codec in each browser and use that.
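A sketch of that hypothetical, purely to illustrate the concern (no real engine does this; the stub is made up):

```js
// Hypothetical minimal "implementation" that exposes the support query but
// supports nothing. Not real browser code.
globalThis.VideoDecoder = class VideoDecoder {
  static async isConfigSupported(config) {
    return {supported: false, config};  // always claim the codec is unsupported
  }
  // ... no actual decoding implemented ...
};
// Every codec-specific subtest would then stop at its
// assert_implements_optional() precondition with PRECONDITION_FAILED, and a
// scoring rule that counts that status as PASS would award a full score.
```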
It seems important to resolve this if it's affecting Interop 2023 scores in an unreasonable way, but I don't see a clear conclusion in web-platform-tests/interop#383. Did y'all agree on something that would work here that we can go and implement? Of the options I skimmed, updating/splitting the tests or filtering out PRECONDITION_FAILED seems the most practical. But that would leave the question of what wpt.fyi should show, and it should stay in sync with the interop scoring scripts.
The conclusion that we parameterize/separate tests with optional features and come to consensus on what's included in Interop seemed reasonable to me, since Interop is already a project of consensus.
Yes. I don't think we can include tests for optional features in Interop without getting additional agreement on what all participants will actually implement, and targeting the tests specifically at that agreement (e.g. we might have agreement that no one will implement the optional feature, or everyone will, or that there will be two behaviours allowed by the tests in Interop).