Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate Test262 #83

Closed
wants to merge 1 commit into from
Closed

Conversation

jugglinmike
Copy link

Retrieve the Test262 project from its canonical location on GitHub.com.
Use the test262-stream module to create tests in accordance with that
project's interpretation guidelines. Use the test-interpreter project
to maintain a expectations file of tests which Esprima is currently
expected to fail.


This patch introduces two new npm dependencies during development:

  • test262-stream - A Node.js readable stream for traversing the Test262 test suite (this creates the test code based on the contents of the Test262 directory)
  • results-interpreter - A Node.js writable stream for interpreting streaming test results (this implements the code necessary to interpret the results in terms of an expectations file)

These projects were created to reduce duplication (both in code and in maintenance effort) between Test262 consumers. They're both used by JSHint, Babel, and Esprima. Acorn uses just test262-stream, along with many folks testing JavaScript runtimes through test262-harness.

As noted in the process output, the expectations file can be automatically updated by specifying the --update-expectations command-line flag. It's also safe to manually insert comments, so you folks can keep notes about the failures in-line.

Here's the summary of the results:

✔ 49969 valid programs parsed without error
✔ 2493 invalid programs produced a parsing error
✔ 2879 invalid programs did not produce a parsing error (and allowed by the expectations file)
✔ 11223 valid programs produced a parsing error (and allowed by the expectations file)

This is meant as a starting point. In addition to the feedback you folks have for me, there are a few details which I'd like to work out:

  • Where should the Test262 repository be placed? The "examples" directory would make sense conceptually, but all of its contents are currently interpreted as valid literal program code, so that won't work with the current setup.
  • How can we parse script code? I haven't been able to find documentation for determining the top-level parsing goal, though it seems like it currently defaults to "module code."

Retrieve the Test262 project from its canonical location on GitHub.com.
Use the `test262-stream` module to create tests in accordance with that
project's interpretation guidelines. Use the `test-interpreter` project
to maintain a expectations file of tests which Esprima is currently
expected to fail.
@jugglinmike
Copy link
Author

@lee-dohm @maxbrunsfeld This patch can no longer be merged to do conflicting changes on the master branch. I can see about resolving them, but are you still interested in integrating Test262?

@jugglinmike
Copy link
Author

@lee-dohm @maxbrunsfeld this patch was meant as a follow-up to our discussion in the language-javascript project. Is there someone else I should be talking to about it?

@mjambon
Copy link
Contributor

mjambon commented Apr 14, 2021

Hey @jugglinmike. I'm following up on this very old issue and pull request.

A couple of us working on semgrep are now members of the tree-sitter-javascript and tree-sitter-typescript projects, as well as grammars for various other languages. We are particularly interested in having reliable parsers and fixing errors before our customers encounter them.

I think we would greatly benefit from testing tree-sitter-javascript against Test262. Let me see if I can answer your original questions:

Where should the Test262 repository be placed? The "examples" directory would make sense conceptually, but all of its contents are currently interpreted as valid literal program code, so that won't work with the current setup.

Given that test and examples already have a standard structure across all tree-sitter projects, it should go to another folder. Maybe we just call it test262 unless someone has a suggestion for something more abstract like misc or extra or other-tests. We can always rename it later since it not something users should rely on.

How can we parse script code? I haven't been able to find documentation for determining the top-level parsing goal, though it seems like it currently defaults to "module code."

I don't know about javascript scripts. How are they different from any other javascript file? Is it just that first line is a magic number like #!/usr/bin/env node? At the moment, tree-sitter supports only one entry point per grammar, which the first rule occurring in the grammar, as generated by grammar.js. Here, it's program:

  rules: {
    program: $ => seq(
      optional($.hash_bang_line),
      repeat($.statement)
    ),

An optional hashbang line is supported so maybe this answers the question about scripts.

I'd be in favor of moving forward with this PR. Our shared concerned is maintenance. I don't see an issue here since this is not an intrusive change. Even if running the Test262 suite for some reason becomes a problem, we can always disable it.

Let's see what others have to say. I'll follow up in a week if there's no reaction.

Cc: my colleague at r2c @aryx, and @maxbrunsfeld

@jugglinmike
Copy link
Author

How can we parse script code? I haven't been able to find documentation for
determining the top-level parsing goal, though it seems like it currently
defaults to "module code."

I don't know about javascript scripts. How are they different from any other
javascript file? Is it just that first line is a magic number like
#!/usr/bin/env node?

Hi there, @mjambon. "Script" and "Module" are two parsing goals for JavaScript
programs. The first is how JavaScript interpreters (including browsers,
Node.js, and parsers) historically parsed all JavaScript programs. The second
was introduced by ES2015. The biggest differences between the two are: Modules
allow export/import declarations (Scripts do not), and Modules are always
strict mode code (Scripts require a "use strict" directive to be interpreted
in strict mode).

These two goals cannot be differentiated by inspecting the program source code
itself; interpreters have to receive that information out-of-band. For
browsers, that's done by adding type="module" to the HTML <script> element.
For Node.js, that's done either through the use of the .mjs filename
extension or through configuration in the package.json manifest. Parsers have
their own ways of accepting this information.

Some Test262 tests are written specifically for the Module parsing goal. When
running the tests, you'll need to tell make tree-sitter-javascript aware of
this. For instance, Test262 might include one test for Script code which is
expected to produce a SyntaxError:

export default 0;

And it might have another test with the same source, but that's expected to
pass as Module code.

In order to pass both of those tests, you'll nee to make
tree-sitter-javascript aware of the parsing goal.

You can read Test262's INTERPRETING.md
document
to learn
more about how to interpret the tests. The test262-stream project that this
patch uses takes care of some of those details (e.g. making "strict mode" and
"non strict mode" versions of tests), but the top-level parsing goal is
something that needs a custom integration. Here's an example for how we did
that with the JSHint
linter.

@aryx
Copy link
Contributor

aryx commented Apr 14, 2021

How long does it take to parse test262? I remember this was a really huge repository with a huge number of files (like > 10 000 if I remember correctly).

@aryx
Copy link
Contributor

aryx commented Apr 14, 2021

test262 could be part of a parsing testsuite, like the one we have in semgrep with a list of top 20 github repo for each language.

@mjambon
Copy link
Contributor

mjambon commented Apr 14, 2021

Thanks for the explanation, @jugglinmike.

For instance, Test262 might include one test for Script code which is expected to produce a SyntaxError

Tree-sitter is focused on text-editor syntax highlighting, and semgrep uses it for searching for patterns in code. In general, we're ok with invalid code that parses without an error. A problem would be valid code that doesn't parse or valid code that's misinterpreted, producing an incorrect parse tree.

That's why I think the tree-sitter-javascript parser, with its single entrypoint, should be able to parse both scripts and modules. If a user wants to make sure there are no export and such in a script, this can be done after parsing, by inspecting the parse tree (or CST - concrete syntax tree) produced by tree-sitter.

Earlier, you showed this result:

[TP] ✔ 49969 valid programs parsed without error
[TN] ✔ 2493 invalid programs produced a parsing error
[FP] ✔ 2879 invalid programs did not produce a parsing error (and allowed by the expectations file)
[FN] ✔ 11223 valid programs produced a parsing error (and allowed by the expectations file)

So if we run our grammar on Test262, we'll pay attention only to lines TP and FN. We'll try minimize FN (false negatives) and maximize TP (true positives). In other words, we don't care whether invalid programs are reported as valid or invalid.

For IDE support, it's actually best to parse as much as possible of an invalid program. Tree-sitter has an error recovery mechanism which allows producing a partial parse tree, with special ERROR nodes for the portions that it can't parse. This mechanism is also useful to tolerate errors due to the grammar being incomplete.

@@ -0,0 +1,172 @@
#!/usr/bin/env node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we can accept this PR, this source file should include a description of:

  • what it's intended for (including when should it run, who should run it),
  • what it does,
  • how it's achieving it,
  • how it should be maintained and evolve (including how to manage the file(s) listing expected failures)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the most important is to ensure:

  • a PR is blocked when there's a regression or a progression;
  • print what to do in case of a regression or unexpected failure: how to locate the source snippet, how to mark it as an expected failure if needed;
  • print what to do if a test that used to fail now passes: how to remove it from the list of expected failures

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants