Fix behaviour of failing conformance tests with `--accept`. #6714

kwxm · 2024-11-27T13:44:48Z

The Agda evaluator isn't always completely up to date with the Haskell one, and because of this we have a feature that allows us to mark some conformance tests as being expected to fail. This uses expectFail from Test.Tasty.ExpectedFailure. For example, some of the bitwise builtins use a special size measure for budgeting that the metatheory doesn't know about at the time of writing and when you run cabal test agda-conformance you get things like this

Test suite agda-conformance: RUNNING...
UPLC evaluation tests
  evaluation
    builtin
      semantics
        writeBits
          case-32
            case-32 (evaluation): OK (0.07s)
            case-32 (budget):     FAIL (expected) (0.09s)
            [etc. etc. etc.]

(the problem being that there are golden files containing the expected budget and the Agda evaluator currently produces results that disagree with these). When you run the haskell-confomance tests these all pass.

However (as discovered by @Unisay), if you run cabal test agda-conformance --test-option --accept then the incorrect budget results are accepted and the golden files are updated accordingly. This is arguably a bug in Test.Tasty.ExpectedFailure.

Now Test.Testy.ExpectedFailure also provides ignoreTest and when you use that instead of expectFail the output changes to

       ...
       writeBits
          case-32
            case-32 (evaluation): OK (0.06s)
            case-32 (budget):     IGNORED

Futhermore, if you use --accept then the golden files are not accepted, which is what we want. The disadvantage to doing this is that with expectFailure, when we fix the Agda evaluator to behave identically to the Haskell one, the tests which are expected to fail now start to pass, which causes an error and reminds us to remove the formerly problematic tests from the list of tests which are expected to fail; with ingoreTest they just continue to be ignored, so we might forget to update the list of problematic tests.

We should workout whether the behaviour with expectFail and --accept really is a bug and report it if so. In the meantime we should maybe use ignoreTest instead of expectFail.

The text was updated successfully, but these errors were encountered:

kwxm · 2024-12-02T13:28:08Z

Issue reported to maintainers here: nomeata/tasty-expected-failure#27.

kwxm added Test status: triaged Internal labels Nov 27, 2024

kwxm mentioned this issue Dec 3, 2024

Kwxm/conformance/improve filename check #6730

Open

kwxm self-assigned this Dec 3, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix behaviour of failing conformance tests with `--accept`. #6714

Fix behaviour of failing conformance tests with `--accept`. #6714

kwxm commented Nov 27, 2024 •

edited

Loading

kwxm commented Dec 2, 2024 •

edited

Loading

Fix behaviour of failing conformance tests with --accept. #6714

Fix behaviour of failing conformance tests with --accept. #6714

Comments

kwxm commented Nov 27, 2024 • edited Loading

kwxm commented Dec 2, 2024 • edited Loading

Fix behaviour of failing conformance tests with `--accept`. #6714

Fix behaviour of failing conformance tests with `--accept`. #6714

kwxm commented Nov 27, 2024 •

edited

Loading

kwxm commented Dec 2, 2024 •

edited

Loading