Support weval-based ahead-of-time compilation of JavaScript. #91
Conversation
(force-pushed from 32cd5eb to 57c9b75)
When the `WEVAL` option is turned on (`cmake -DWEVAL=ON`), this PR adds:

- Integration into the CMake machinery to fetch a PBL+weval-ified version of SpiderMonkey artifacts;
- Likewise, to fetch weval binaries;
- A rule to pre-build a compilation cache of IC bodies specific to the StarlingMonkey build, so weval can use this cache and spend time only on user-provided code on first run;
- Integration in `componentize.sh`.

When built with:

```
$ mkdir build/; cd build/
$ cmake .. -DCMAKE_BUILD_TYPE=Release -DUSE_WASM_OPT=OFF -DWEVAL=ON
$ make
```

we can then do:

```
$ build/componentize.sh file.js --aot -o file.wasm
$ wasmtime serve -S cli=y file.wasm
```

Using the Richards Octane benchmark adapted slightly with a `main()` for the HTTP server world [1], I get the following results:

```
% build/componentize.sh richards.js --aot -o weval.wasm
Componentizing richards.js into weval.wasm
[ verbose weval progress output ]
% wasmtime serve -S cli=y weval.wasm
Serving HTTP on http://0.0.0.0:8080/
stdout [0] :: Log: Richards: 676
stdout [0] :: Log: ----
stdout [0] :: Log: Score (version 9): 676
% wasmtime serve -S cli=y base.wasm
Serving HTTP on http://0.0.0.0:8080/
stdout [0] :: Log: Richards: 189
stdout [0] :: Log: ----
stdout [0] :: Log: Score (version 9): 189
```

[1]: https://gist.github.com/cfallin/4b18da12413e93f7e88568a92d09e4b7
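For comparison, `base.wasm` above is presumably the same application componentized by a non-weval build. A minimal sketch of producing it, assuming a second build directory configured without `-DWEVAL=ON` (the directory name is a placeholder):

```
$ mkdir build-base/; cd build-base/
$ cmake .. -DCMAKE_BUILD_TYPE=Release -DUSE_WASM_OPT=OFF
$ make
$ cd ..
$ build-base/componentize.sh richards.js -o base.wasm
```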
I believe this is now ready for review, except possibly for CI testing integration: I haven't yet added a job to run all tests with wevaling enabled. Happy to hear early feedback though (e.g.: should we require a separate […]
@cfallin I've pushed two new commits here: one to enable testing the new weval option, and one to add a new CI run that tests against it. I'm getting 7/10 passes, with the failing tests as follows:

- The integration test `timers/setInterval-handler-parameter-not-callable`, which seems to be related to catching an error.
- Two e2e tests where there is invalid JS syntax / a top-level error: the output gets stuck (per `tests/e2e/tla-err/stderr.log`) and never actually prints the expected error.

We definitely need to improve the above error printing in general; we should at least ensure some error printing here. We can also diverge the error printing between Wizer and Weval no problem, and create a separate expectation for Weval; happy to help with that too.
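For anyone reproducing these failures locally, the runs look roughly like the following. This is a sketch that assumes the suite is driven through CTest from the build directory, which may not match the exact harness wiring:

```
$ cd build/
$ cmake .. -DCMAKE_BUILD_TYPE=Release -DWEVAL=ON && make
$ ctest                                   # integration + e2e suites (exact target name assumed)
$ cat ../tests/e2e/tla-err/stderr.log     # inspect the stuck output from the failing e2e test
```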
Thanks @guybedford! At least the expected-output failure above is due to weval's progress-output verbosity; I can silence that by default. I'll look into the setInterval failure as well.
This commit updates to weval v0.2.7, which has no output by default, allowing the expected-failure tests checking syntax error messages to pass; we now pass 9/10 integration tests. It also updates the flags on `componentize.sh`: a `--verbose` flag to allow the user to see weval progress messages (perhaps useful if the build is taking a long time, or to see the stats on function specializations); and removal of the `--aot` flag, because there is only one valid setting for a build configuration and it is now baked into the shell script automatically.
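With the updated flag scheme, invocations from a weval-enabled build would look like this (a sketch based on the flags described above; file names are placeholders):

```
$ build/componentize.sh app.js -o app.wasm            # AOT is implied by the WEVAL build configuration
$ build/componentize.sh app.js --verbose -o app.wasm  # additionally show weval progress and specialization stats
```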
Two issues discovered while debugging the testsuite in StarlingMonkey:

- bytecodealliance/gecko-dev#54: fixes handling of conditionals by weval when LLVM applies if-conversion in a new place; use of a weval intrinsic for value-specialization / subcontext splitting should make this more robust.
- bytecodealliance/gecko-dev#55: fixes missing interpreter-PC values in stack frames up the stack during unwind, caused by a too-aggressive optimization trick in weval'd bodies.

With these two changes, this version of SpiderMonkey allows the StarlingMonkey integration test suite to pass in bytecodealliance/StarlingMonkey#91.
I've got all integration tests passing locally with the changes in bytecodealliance/spidermonkey-wasi-embedding#19 (and a weval update pushed here); there are three review-merge-rebase cycles required to get this PR up to date (bytecodealliance/gecko-dev#54, bytecodealliance/gecko-dev#55, bytecodealliance/spidermonkey-wasi-embedding#19, then update the commit hash here), but once all that's done this should be green too.
@guybedford tests are green now; I think this is ready for final review?
@cfallin previously when I tried to build the web platform tests test case I was getting this error:

```
thread '<unnamed>' panicked at src/eval.rs:1362:25:
PC is a runtime value: Runtime(Some(v12146))
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```
Now when I try to build the web platform tests, I get a silent failure without any stderr.
Was this issue resolved, or did the work to remove the logging also disable legitimate stderr logging?
@guybedford the error was resolved by the fix in bytecodealliance/gecko-dev#54; none of my changes should have disabled legitimate stderr output. I can take a look at WPT; I hadn't tried running those tests yet.
Thanks for confirming the panic was also resolved. In that case, it must be something else unrelated. Of course, it would be great to get the WPT passing as well, but since we've established no regressions let's go ahead and land for now. |
This PR pulls in my work to use "weval", the WebAssembly partial evaluator, to perform ahead-of-time compilation of JavaScript using the PBL interpreter we previously contributed to SpiderMonkey. This work has been merged into the BA fork of SpiderMonkey in bytecodealliance/gecko-dev#45, bytecodealliance/gecko-dev#46, bytecodealliance/gecko-dev#47, bytecodealliance/gecko-dev#48, bytecodealliance/gecko-dev#51, bytecodealliance/gecko-dev#52, bytecodealliance/gecko-dev#53, bytecodealliance/gecko-dev#54, bytecodealliance/gecko-dev#55, and then integrated into StarlingMonkey in bytecodealliance/StarlingMonkey#91.

The feature is off by default; it requires a `--enable-experimental-aot` flag to be passed to `js-compute-runtime-cli.js`. This requires a separate build of the engine Wasm module to be used when the flag is passed.

This should still be considered experimental until it is tested more widely. The PBL+weval combination passes all jit-tests and jstests in SpiderMonkey, and all integration tests in StarlingMonkey; however, it has not yet been widely tested in real-world scenarios.

Initial speedups we are seeing on Octane (CPU-intensive JS benchmarks) are in the 3x-5x range. This is roughly equivalent to the speedup that a native JS engine's "baseline JIT" compiler tier gets over its interpreter, and it uses the same basic techniques -- compiling all polymorphic operations (all basic JS operators) to inline-cache sites that dispatch to stubs depending on types. Further speedups can be obtained eventually by inlining stubs from warmed-up IC chains, but that requires warmup.

Important to note is that this compilation approach is *fully ahead-of-time*: it requires no profiling or observation or warmup of user code, and compiles the JS directly to Wasm that does not do any further codegen/JIT at runtime. Thus, it is suitable for the per-request isolation model (new Wasm instance for each request, with no shared state).
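As a usage sketch: the `--enable-experimental-aot` flag is quoted from the description above, but the positional input/output arguments shown here are assumed, not confirmed:

```
$ js-compute-runtime-cli.js --enable-experimental-aot input.js output.wasm
```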