Integrate Test262 #622

jugglinmike · 2017-11-13T16:26:56Z

Integrate Test262

This patch is intended to resolve gh-510. It introduces three new dependencies:

Test262 (fetched by git SHA via npm)
test262-stream - A Node.js API for traversing the Test262 test suite (this creates the test code based on the contents of the Test262 directory)
results-interpreter - A Node.js transform stream for interpreting streaming test results (this implements the code necessary to interpret the results in terms of a whitelist file)

These projects were created to reduce duplication (both in code and in maintenance effort) between Test262 consumers. They're based on my experience integrating that project with JSHint and Babylon. Referring to Test262 via a git SHA allows this project's maintainers to control when the tests should be updated.

For this initial version, I've simply appended the test execution to the project's existing npm test command. On my system, Acorn takes about 40 seconds to run all the tests, and I'm not sure if that's "fast enough" to warrant its inclusion here. Alternatively, we could define a dedicated npm script and configure the CI environment to run it in a separate job. I'll hold off on any optimizations until I hear back from the maintainers.

Retrieve the Test262 project from its canonical location on GitHub.com.
Use the test262-stream module to create tests in accordance with that
project's interpretation guidelines. Use the results-interpreter
project to maintain a "whitelist" file of tests which Acorn is currently
expected to fail.

Retrieve the Test262 project from its canonical location on GitHub.com. Use the `test262-stream` module to create tests in accordance with that project's interpretation guidelines. Use the `results-interpreter` project to maintain a "whitelist" file of tests which Acorn is currently expected to fail.

adrianheine · 2017-11-13T17:07:27Z

Hi, thanks for the PR! A few observations:

I don't think we want to run that as part of the normal tests, since they are really fast and already cover a lot. I think we will rather add new cases to the normal test suite where necessary. Running test262 on the CI is great, though.
You added all failing tests to the white-list, while I skipped all tests with features that acorn does not support on purpose. The latter is much faster and less arbitrary. Tests on the white-list actually correspond to bugs in my PR.

jugglinmike · 2017-11-14T02:51:34Z

I don't think we want to run that as part of the normal tests, since they are
really fast and already cover a lot. I think we will rather add new cases to
the normal test suite where necessary. Running test262 on the CI is great,
though.

Got it. I've reverted my changes to the test npm script and inserted a new script, test-262. I've updated the project's .travis.yml file to use the "Build Matrix" feature. This will cause TravisCI to run npm test in all the Node.js environments which it previously ran that command, and it will also run Test262 in dedicated build using Node.js version 8 specifically.

You added all failing tests to the white-list, while I skipped all tests with
features that acorn does not support on purpose. The latter is much faster
and less arbitrary. Tests on the white-list actually correspond to bugs in my
PR.

I've implemented equivalent behavior using the list of excluded feature flags from your patch. Note that this patch's whitelist is much smaller. I've compare the console output from both, and I think the other version has some extra entries.

Here's the output from gh-615;

Testing complete.
Summary:
 ✔ 43747 valid programs parsed without error
 ✔ 2713 invalid programs produced a parsing error
 ✔ 352 invalid programs did not produce a parsing error (and allowed by the whitelist file)
 ✔ 109 valid programs produced a parsing error (and allowed by the whitelist file)
 ✔ 9888 programs were skipped

And here's the output from this patch:

Testing complete.
Summary:
 ✔ 43747 valid programs parsed without error
 ✔ 2713 invalid programs produced a parsing error
 ✔ 352 invalid programs did not produce a parsing error (and allowed by the whitelist file)
 ✔ 109 valid programs produced a parsing error (and allowed by the whitelist file)

Finally, I've modified the git URL used to reference Test262 to ensure that it can be successfully retrieved from the TravisCI environment.

Thanks for the review!

jugglinmike · 2017-11-14T15:40:49Z

@adrianheine Thinking a bit more about this, I'm wondering if it would be
better to run the tests for those feature flags despite the lack of support. As
you've pointed out, this will introduce a number of additional entries in the
whitelist file. It's true that these entries won't describe bugs in Acorn, but
that fact can be communicated in the whitelist file itself using dedicated
comments.

The reason I feel this might be a better solution is that it will help with
maintenance in the long term. I'm thinking of the day when (for instance)
BigInt reaches stage 4, and Acorn implements support for it. At that time, the
folks involved with the project may forget that the appropriate Test262 tests
have been skipped. They will of course add Acorn-specific tests, but they will
move on without the benefit of the now-relevant Test262 tests.

If we extend the whitelist with expected failures here today, then when support
is added, the new Test262 "job" will fail on TravisCI (since tests that we
marked as "expected to fail" will begin to pass). At that point, they can be
automatically removed using the --update-whitelist flag, and from then on,
Acorn will move forward enjoying any additional coverage provided by Test262.

The only drawback to this is that every time Acorn updates its version of
Test262, there may be new expected failures to account for. This is a bit of an
additional maintenance burden, but it is largely mitigated by that same
--update-whitelist flag.

What do you think?

jugglinmike · 2017-11-16T03:25:24Z

Hi there Kenny! I should start by saying that this is an attempt to satisfy a
request of the project maintainers (see gh-510), so they may have different
reasons for wanting this. But I'm happy to share my own perspective in case
you're curious.

Many of the tests are for features that have not yet reached stage 4. Other
projects have different requirements when it comes to implementation, so they
are particularly interested in those early-stage tests. As we've been
discussing above, it is Acorn's prerogative to whitelist (or even skip
outright) whatever class of tests aren't appropriate for its goals.

All the tests include meta-data describing how implementers should interpret
them. You can read more about this in the file named INTERPRETING.md of the
Test262 project, but the test262-stream module was designed so that folks
wouldn't have to study that so closely to benefit from Test262.

A vast majority of the tests are for runtime semantics. From a parser's
perspective, the semantics are impossible to verify. But that test material is
still valuable as a sort of "positive syntax" test.

When it comes to triaging issues from confused contributors, it's probably best
to take a "wait and see" approach. The situation you've described sounds like a
problem that could be addressed by clear documentation, but we'll be in a
better position to write that if/when we learn what is confusing about the
whitelist file.

Regarding the unfound files: it looks as though you are running the tests from
a Windows machine. Because the whitelist hard-codes file paths using the POSIX
separator (the forward slash character--/), then the paths generated by folks
running from Windows will not match up. I think we can resolve this in the
test-262.js script's ID generation logic--something like .replace(/\\/g, '/'). I can experiment from a Windows VM this weekend.

jugglinmike · 2017-12-03T23:17:33Z

Hi @isiahmeadows. I'm wondering if/how our recent conversation reflects your interest in running Test262 within Acorn. Your participation has already set some improvements in motion, and I'm sure that formally integrating the test suite in Acorn will lead to still more. Actually, that's one of my reasons for making this contribution: many ECMAScript interpreters (both runtimes and parsers) stand to benefit from Acorn's participation. Is there anything more I can do to help see this through to master?

dead-claudia · 2017-12-03T23:47:34Z

@jugglinmike You're probably talking to the wrong person. I have little to no involvement in any active parser project; I only wrote the script for a friend, and it was my troubles in adapting the tests that spurred me to file the Test262 issue.

The script itself in the gist, linked to from that issue, does come with basic instructions on how to integrate it. It's pretty straightforward.

jugglinmike · 2017-12-04T01:28:07Z

Oh, my mistake! I guess maybe we should hear from @marijnh then.

Marijn: this is my attempt to resolve gh-510. Isiah has identified a few patterns in Test262 where parsing tests are expressed with eval. The effect on Acorn is that those tests don't improve coverage. They are still technically correct from a parsing standpoint, however. Because of this, I don't think either Isiah or myself consider the issue to be a blocker (please correct me if I'm wrong there, Isiah). The test execution code is configured to tolerate failures based on a whitelist, and the efficacy of this strategy is proven by the new "job" in the project's continuous integration system.

What do you think?

adrianheine · 2017-12-06T13:43:12Z

As it currently stands, this implementation takes 3-4 times as long as #615, but this could be fixed with bocoup/test262-stream#2. I still think skipping is better than whitelisting for tests for features acorn does not support, but I see your point with regards to not forgetting to remove the skip once the feature is implemented. I would just hope that people keep these tests in mind. Aside from that there is not much of a difference between both pull requests.

dead-claudia · 2017-12-06T14:37:27Z

@adrianheine For context, the script that's being referenced here is in this gist. And although it wasn't mentioned here, there is an option to skip various features as well as an option to skip entire globs (relative to the test262 directory). Just thought I'd clear that up.

adrianheine · 2017-12-06T14:42:34Z

I was referring to this pull request, not your script. Thanks for the link, though.

marijnh · 2017-12-28T17:24:53Z

I've decided to go with #615 instead, since it appears to involve less new moving parts.

jugglinmike · 2018-01-02T01:27:58Z

Yeah, I'm interested in limiting complexity, too. I recently submitted a patch to halve the dependencies in test262-stream (upon which Adrian's implementation now relies). Like any programmer, I'd like to limit duplication in code and in effort, but the important thing is that Acorn is running Test262! That's sure to make both projects better (and by extension, all the other consumers of Test262). I'll keep you posted about advances in test262-stream that this project could benefit from.

jugglinmike mentioned this pull request Nov 13, 2017

Add script for running test262 tests #615

Closed

fixup! Integrate Test262

155ce4b

jugglinmike force-pushed the test262 branch from 4f44e15 to bd94e0f Compare November 14, 2017 02:22

fixup! Integrate Test262

3442d84

jugglinmike force-pushed the test262 branch from bd94e0f to 3442d84 Compare November 14, 2017 02:24

fixup! Integrate Test262

2a47a7a

jugglinmike force-pushed the test262 branch from bf5b002 to 2a47a7a Compare November 14, 2017 02:41

jugglinmike mentioned this pull request Nov 19, 2017

Integrate Test262 jquery/esprima#1880

Closed

marijnh closed this Dec 28, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Integrate Test262 #622

Integrate Test262 #622

jugglinmike commented Nov 13, 2017

adrianheine commented Nov 13, 2017

jugglinmike commented Nov 14, 2017

jugglinmike commented Nov 14, 2017

jugglinmike commented Nov 16, 2017

jugglinmike commented Dec 3, 2017

dead-claudia commented Dec 3, 2017

jugglinmike commented Dec 4, 2017

adrianheine commented Dec 6, 2017 •

edited

Loading

dead-claudia commented Dec 6, 2017 •

edited

Loading

adrianheine commented Dec 6, 2017

marijnh commented Dec 28, 2017

jugglinmike commented Jan 2, 2018

Integrate Test262 #622

Integrate Test262 #622

Conversation

jugglinmike commented Nov 13, 2017

adrianheine commented Nov 13, 2017

jugglinmike commented Nov 14, 2017

jugglinmike commented Nov 14, 2017

jugglinmike commented Nov 16, 2017

jugglinmike commented Dec 3, 2017

dead-claudia commented Dec 3, 2017

jugglinmike commented Dec 4, 2017

adrianheine commented Dec 6, 2017 • edited Loading

dead-claudia commented Dec 6, 2017 • edited Loading

adrianheine commented Dec 6, 2017

marijnh commented Dec 28, 2017

jugglinmike commented Jan 2, 2018

adrianheine commented Dec 6, 2017 •

edited

Loading

dead-claudia commented Dec 6, 2017 •

edited

Loading