Integrate Test262 #83

jugglinmike · 2018-09-23T21:20:42Z

Retrieve the Test262 project from its canonical location on GitHub.com.
Use the test262-stream module to create tests in accordance with that
project's interpretation guidelines. Use the test-interpreter project
to maintain a expectations file of tests which Esprima is currently
expected to fail.

This patch introduces two new npm dependencies during development:

test262-stream - A Node.js readable stream for traversing the Test262 test suite (this creates the test code based on the contents of the Test262 directory)
results-interpreter - A Node.js writable stream for interpreting streaming test results (this implements the code necessary to interpret the results in terms of an expectations file)

These projects were created to reduce duplication (both in code and in maintenance effort) between Test262 consumers. They're both used by JSHint, Babel, and Esprima. Acorn uses just test262-stream, along with many folks testing JavaScript runtimes through test262-harness.

As noted in the process output, the expectations file can be automatically updated by specifying the --update-expectations command-line flag. It's also safe to manually insert comments, so you folks can keep notes about the failures in-line.

Here's the summary of the results:

✔ 49969 valid programs parsed without error
✔ 2493 invalid programs produced a parsing error
✔ 2879 invalid programs did not produce a parsing error (and allowed by the expectations file)
✔ 11223 valid programs produced a parsing error (and allowed by the expectations file)

This is meant as a starting point. In addition to the feedback you folks have for me, there are a few details which I'd like to work out:

Where should the Test262 repository be placed? The "examples" directory would make sense conceptually, but all of its contents are currently interpreted as valid literal program code, so that won't work with the current setup.
How can we parse script code? I haven't been able to find documentation for determining the top-level parsing goal, though it seems like it currently defaults to "module code."

Retrieve the Test262 project from its canonical location on GitHub.com. Use the `test262-stream` module to create tests in accordance with that project's interpretation guidelines. Use the `test-interpreter` project to maintain a expectations file of tests which Esprima is currently expected to fail.

jugglinmike · 2019-03-30T02:07:49Z

@lee-dohm @maxbrunsfeld This patch can no longer be merged to do conflicting changes on the master branch. I can see about resolving them, but are you still interested in integrating Test262?

jugglinmike · 2019-06-08T23:48:57Z

@lee-dohm @maxbrunsfeld this patch was meant as a follow-up to our discussion in the language-javascript project. Is there someone else I should be talking to about it?

mjambon · 2021-04-14T00:24:06Z

Hey @jugglinmike. I'm following up on this very old issue and pull request.

A couple of us working on semgrep are now members of the tree-sitter-javascript and tree-sitter-typescript projects, as well as grammars for various other languages. We are particularly interested in having reliable parsers and fixing errors before our customers encounter them.

I think we would greatly benefit from testing tree-sitter-javascript against Test262. Let me see if I can answer your original questions:

Where should the Test262 repository be placed? The "examples" directory would make sense conceptually, but all of its contents are currently interpreted as valid literal program code, so that won't work with the current setup.

Given that test and examples already have a standard structure across all tree-sitter projects, it should go to another folder. Maybe we just call it test262 unless someone has a suggestion for something more abstract like misc or extra or other-tests. We can always rename it later since it not something users should rely on.

How can we parse script code? I haven't been able to find documentation for determining the top-level parsing goal, though it seems like it currently defaults to "module code."

I don't know about javascript scripts. How are they different from any other javascript file? Is it just that first line is a magic number like #!/usr/bin/env node? At the moment, tree-sitter supports only one entry point per grammar, which the first rule occurring in the grammar, as generated by grammar.js. Here, it's program:

  rules: {
    program: $ => seq(
      optional($.hash_bang_line),
      repeat($.statement)
    ),

An optional hashbang line is supported so maybe this answers the question about scripts.

I'd be in favor of moving forward with this PR. Our shared concerned is maintenance. I don't see an issue here since this is not an intrusive change. Even if running the Test262 suite for some reason becomes a problem, we can always disable it.

Let's see what others have to say. I'll follow up in a week if there's no reaction.

Cc: my colleague at r2c @aryx, and @maxbrunsfeld

jugglinmike · 2021-04-14T04:01:32Z

How can we parse script code? I haven't been able to find documentation for
determining the top-level parsing goal, though it seems like it currently
defaults to "module code."

I don't know about javascript scripts. How are they different from any other
javascript file? Is it just that first line is a magic number like
#!/usr/bin/env node?

Hi there, @mjambon. "Script" and "Module" are two parsing goals for JavaScript
programs. The first is how JavaScript interpreters (including browsers,
Node.js, and parsers) historically parsed all JavaScript programs. The second
was introduced by ES2015. The biggest differences between the two are: Modules
allow export/import declarations (Scripts do not), and Modules are always
strict mode code (Scripts require a "use strict" directive to be interpreted
in strict mode).

These two goals cannot be differentiated by inspecting the program source code
itself; interpreters have to receive that information out-of-band. For
browsers, that's done by adding type="module" to the HTML <script> element.
For Node.js, that's done either through the use of the .mjs filename
extension or through configuration in the package.json manifest. Parsers have
their own ways of accepting this information.

Some Test262 tests are written specifically for the Module parsing goal. When
running the tests, you'll need to tell make tree-sitter-javascript aware of
this. For instance, Test262 might include one test for Script code which is
expected to produce a SyntaxError:

export default 0;

And it might have another test with the same source, but that's expected to
pass as Module code.

In order to pass both of those tests, you'll nee to make
tree-sitter-javascript aware of the parsing goal.

You can read Test262's INTERPRETING.md
document to learn
more about how to interpret the tests. The test262-stream project that this
patch uses takes care of some of those details (e.g. making "strict mode" and
"non strict mode" versions of tests), but the top-level parsing goal is
something that needs a custom integration. Here's an example for how we did
that with the JSHint
linter.

aryx · 2021-04-14T06:57:34Z

How long does it take to parse test262? I remember this was a really huge repository with a huge number of files (like > 10 000 if I remember correctly).

aryx · 2021-04-14T06:58:55Z

test262 could be part of a parsing testsuite, like the one we have in semgrep with a list of top 20 github repo for each language.

mjambon · 2021-04-14T19:54:55Z

Thanks for the explanation, @jugglinmike.

For instance, Test262 might include one test for Script code which is expected to produce a SyntaxError

Tree-sitter is focused on text-editor syntax highlighting, and semgrep uses it for searching for patterns in code. In general, we're ok with invalid code that parses without an error. A problem would be valid code that doesn't parse or valid code that's misinterpreted, producing an incorrect parse tree.

That's why I think the tree-sitter-javascript parser, with its single entrypoint, should be able to parse both scripts and modules. If a user wants to make sure there are no export and such in a script, this can be done after parsing, by inspecting the parse tree (or CST - concrete syntax tree) produced by tree-sitter.

Earlier, you showed this result:

[TP] ✔ 49969 valid programs parsed without error
[TN] ✔ 2493 invalid programs produced a parsing error
[FP] ✔ 2879 invalid programs did not produce a parsing error (and allowed by the expectations file)
[FN] ✔ 11223 valid programs produced a parsing error (and allowed by the expectations file)

So if we run our grammar on Test262, we'll pay attention only to lines TP and FN. We'll try minimize FN (false negatives) and maximize TP (true positives). In other words, we don't care whether invalid programs are reported as valid or invalid.

For IDE support, it's actually best to parse as much as possible of an invalid program. Tree-sitter has an error recovery mechanism which allows producing a partial parse tree, with special ERROR nodes for the portions that it can't parse. This mechanism is also useful to tolerate errors due to the grammar being incomplete.

mjambon · 2021-04-26T18:25:13Z

script/test262.js

@@ -0,0 +1,172 @@
+#!/usr/bin/env node


Before we can accept this PR, this source file should include a description of:

what it's intended for (including when should it run, who should run it),

what it does,

how it's achieving it,

how it should be maintained and evolve (including how to manage the file(s) listing expected failures)

I think the most important is to ensure:

a PR is blocked when there's a regression or a progression;

print what to do in case of a regression or unexpected failure: how to locate the source snippet, how to mark it as an expected failure if needed;

print what to do if a test that used to fail now passes: how to remove it from the list of expected failures

jugglinmike mentioned this pull request Sep 23, 2018

Testing with Test262 atom/language-javascript#593

Open

mjambon reviewed Apr 26, 2021

View reviewed changes

amaanq force-pushed the master branch from 87d35c5 to d9b3a58 Compare April 5, 2024 13:01

amaanq closed this Apr 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Integrate Test262 #83

Integrate Test262 #83

jugglinmike commented Sep 23, 2018

jugglinmike commented Mar 30, 2019

jugglinmike commented Jun 8, 2019

mjambon commented Apr 14, 2021

jugglinmike commented Apr 14, 2021

aryx commented Apr 14, 2021

aryx commented Apr 14, 2021

mjambon commented Apr 14, 2021 •

edited

Loading

mjambon Apr 26, 2021

mjambon Apr 26, 2021

Integrate Test262 #83

Integrate Test262 #83

Conversation

jugglinmike commented Sep 23, 2018

jugglinmike commented Mar 30, 2019

jugglinmike commented Jun 8, 2019

mjambon commented Apr 14, 2021

jugglinmike commented Apr 14, 2021

aryx commented Apr 14, 2021

aryx commented Apr 14, 2021

mjambon commented Apr 14, 2021 • edited Loading

mjambon Apr 26, 2021

Choose a reason for hiding this comment

mjambon Apr 26, 2021

Choose a reason for hiding this comment

mjambon commented Apr 14, 2021 •

edited

Loading