test: improve test infrastructure #554

abrown · 2024-11-21T23:43:41Z

This change represents a rather large redesign of how `wasi-libc` builds and runs its tests. Initially, #346 retrieved the `libc-test` repository and built a subset of those tests to give us some amount of test coverage. Later, because there was no way to add custom C tests, #522 added a `smoke` directory which allowed this. But (a) each of these test suites was built and run separately and (b) it was unclear how to add more tests flexibly: some tests should only run on `*p2` or `*-threads` targets, for example.

This change reworks all of this so that all tests are built the same way, in the same place. For downloaded tests like those from `libc-test`, I chose to add "stub tests" that `#include` the original version. This not only keeps all enabled tests in one place, it also allows us to add "directives": C comments that the `Makefile` uses to filter out tests for certain targets or to add special compile, link, or run flags. The rudimentary scripts that process these directives, along with other Bash logic I moved out of the `Makefile`, now live in the `scripts` directory.
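To make the directive idea concrete, here is a minimal sketch of how a script might pull directives out of a stub test. It is not the PR's actual implementation: the regular expression, the `parse_directives` helper, the `-DEXAMPLE_FLAG` flag, and the `upstream/example.c` include path are all hypothetical, and the `//!` prefix assumes the form adopted later in review.

```python
import re

# Hypothetical directive syntax, modeled on the README excerpt in this PR:
#   //! add-flags.py(CFLAGS): <flags>
DIRECTIVE = re.compile(
    r"^//!\s*(?P<script>[\w.-]+)\((?P<arg>\w+)\):\s*(?P<value>.*)$"
)


def parse_directives(source: str) -> list[tuple[str, str, str]]:
    """Return (script, argument, value) for each directive comment in `source`."""
    found = []
    for line in source.splitlines():
        match = DIRECTIVE.match(line.strip())
        if match:
            found.append((match["script"], match["arg"], match["value"].strip()))
    return found


# A made-up stub test: two directives, one ordinary comment, and an
# #include of the (hypothetical) downloaded original.
STUB_TEST = """\
//! filter.py(TARGET_TRIPLE): !wasm32-wasip2
//! add-flags.py(CFLAGS): -DEXAMPLE_FLAG
// An ordinary comment, ignored by the parser.
#include "upstream/example.c"
"""

directives = parse_directives(STUB_TEST)
```

Because directives are plain C comments, the stub still compiles unchanged; only the build scripts give them meaning.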

Finally, all of this is explained more clearly in an updated `README.md`. The hope is that better documentation will make it easier for drive-by contributors to either dump in new C tests for regressions they find or enable more `libc-test` tests. By my current count, we only enable 40/75 of `libc-test`'s functional tests, 0/228 math tests, 0/69 regression tests, and 0/79 API tests. Though many of these may not apply to WASI programs, it would be nice to explore how many more can be enabled to increase `wasi-libc`'s test coverage. This change should explain how to do that and, with directives, make it possible to condition how the tests compile and run.

sbc100

Wow! Very nice setup. Kind of reminds me of llvm's lit/filecheck system.

sbc100 · 2024-11-21T23:58:18Z

test/README.md

+// filter.py(TARGET_TRIPLE): !wasm32-wasip2
+// add-flags.py(CFLAGS): ...
+// add-flags.py(LDFLAGS): ...
+// add-flags.py(RUN): ...


Would this be more readable if we drop the .py here?

Also perhaps a special prefix such as "//!" would be good to distinguish directives from normal comments?

abrown

Agreed on the //! bit. Still mulling over the .py suggestion: one thing that always bugs me with infrastructure stuff like this is that I don't know where to look if something goes wrong. I figured if I put the .py extension on there most of us would think, "oh, I see, this logic is from some file... let's bust out find..." But that may not be as clear as I think?
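For readers unfamiliar with the filter directive discussed in this thread, the following is a rough sketch of how a value like `!wasm32-wasip2` could be evaluated against the current target. The semantics are an assumption, not taken from the PR: a leading `!` is read as "exclude this triple" and a bare triple as "only this triple". The `should_build` name is hypothetical.

```python
def should_build(filter_value: str, target_triple: str) -> bool:
    """Decide whether a test is built for `target_triple`.

    Assumed semantics (not confirmed by the PR): a leading `!` excludes
    the named triple; a bare triple restricts the test to that triple.
    """
    value = filter_value.strip()
    if value.startswith("!"):
        return target_triple != value[1:].strip()
    return target_triple == value


# Under these assumptions, the README example skips only wasm32-wasip2.
skipped_on_p2 = not should_build("!wasm32-wasip2", "wasm32-wasip2")
built_on_p1 = should_build("!wasm32-wasip2", "wasm32-wasip1")
```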

test/scripts/failed-tests.sh

sbc100 · 2024-11-22T00:05:05Z

test/scripts/run-test.sh

+    echo "$ENGINE $WASM" > cmd.sh
+    chmod +x cmd.sh
+    ./cmd.sh &> output.log
+    [ $? -eq 0 ] || echo "Test failed" >> output.log


It looks like the string "Test failed" at the end of output.log is the signal that the test failed? Is that right?

abrown

It's actually if there's any output at all; appending this string just makes sure of that.
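The convention described here (a test fails if its log contains any output, and a non-zero exit forces the log to be non-empty) can be sketched as follows. This is an illustrative Python rendering under those stated assumptions, not the contents of `run-test.sh`; the `run_test` helper and the log file names are hypothetical, and the Python interpreter stands in for the Wasm engine.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_test(command: list[str], log: Path) -> bool:
    """Run `command`, write its combined output to `log`, and report success."""
    result = subprocess.run(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    output = result.stdout
    if result.returncode != 0:
        # Guarantee the log is non-empty even if the test printed nothing.
        output += "Test failed\n"
    log.write_text(output)
    # A test passes only when its log is empty.
    return log.stat().st_size == 0


with tempfile.TemporaryDirectory() as tmp:
    # A command that exits 0 silently, and one that exits 3 silently.
    quiet_pass = run_test([sys.executable, "-c", "pass"], Path(tmp) / "pass.log")
    silent_fail = run_test(
        [sys.executable, "-c", "raise SystemExit(3)"], Path(tmp) / "fail.log"
    )
    fail_log = (Path(tmp) / "fail.log").read_text()
```

One consequence of this design is that a test that exits successfully but prints anything is also counted as a failure.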

sbc100 · 2024-11-22T00:05:16Z

test/scripts/run-test.sh

+    ./cmd.sh &> output.log
+    [ $? -eq 0 ] || echo "Test failed" >> output.log
+popd > /dev/null
+


Trailing newline here.

abrown

All of my files have that to avoid GitHub's red-line warnings.

To avoid a `Makefile` dependency issue, a previous commit moved the download directory inside the build directory; this change propagates that to the CI configuration. Also, it is no longer necessary to clean up anything between runs: both the `build` and `run` directories are subdivided by target triple, so repeated builds will not interfere.

abrown · 2024-11-23T00:14:22Z

> Wow! Very nice setup. Kind of reminds me of llvm's lit/filecheck system.

Heh, a poor man's version unfortunately, but it should work ok to enable more tests. I think one future thing we'll want to add is the ability to express "if we're compiling target X, then add flags Y and Z." But I didn't bother with that just yet.

sbc100 reviewed Nov 22, 2024

abrown added 6 commits November 21, 2024 18:50

fix: avoid using pushd and popd

355683d

review: update comment from *.err to *.log

22815af

review: use //! for directive comments

86bb7a1

fix: compile wasm32-wasip2 target correctly

0626461

review: drop the .c suffix on test directories

bcbd60f

abrown force-pushed the improve-test-infrastructure branch from e4d1534 to bcbd60f Compare November 23, 2024 00:11

abrown marked this pull request as ready for review November 23, 2024 00:20

abrown requested a review from sbc100 November 23, 2024 00:20
