Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: improve test infrastructure #554

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

abrown
Copy link
Collaborator

@abrown abrown commented Nov 21, 2024

This change represents a rather large re-design in how wasi-libc builds and runs its tests. Initially, #346 retrieved the libc-test repository and built a subset of those tests to give us some amount of test coverage. Later, because there was no way to add custom C tests, #522 added a smoke directory which allowed this. But (a) each of these test suites was built and run separately and (b) it was unclear how to add more tests flexibly--some tests should only run on *p2 targets or *-threads targets, e.g.

This change reworks all of this so that all tests are built the same way, in the same place. For downloaded tests like those from libc-test, I chose to add "stub tests" that #include the original version. This not only keeps all enabled tests in one place, it also allows us to add "directives," C comments that the Makefile uses to filter out tests for certain targets or add special compile, link or run flags. These rudimentary scripts, along with other Bash logic I moved out of the Makefile now live in the scripts directory.

Finally, all of this is explained more clearly in an updated README.md. The hope with documenting this a bit better is that it would be easier for drive-by contributors to be able to either dump in new C tests for regressions they may find or enable more libc-tests. As of my current count, we only enable 40/75 of libc-test's functional tests, 0/228 math tests, 0/69 regression tests, and 0/79 API tests. Though many of these may not apply to WASI programs, it would be nice to explore how many more of these tests can be enabled to increase wasi-libc's test coverage. This change should explain how to do that and, with directives, make it possible to condition how the tests compile and run.

This change represents a rather large re-design in how `wasi-libc`
builds and runs its tests. Initially, WebAssembly#346 retrieved the `libc-test`
repository and built a subset of those tests to give us some amount of
test coverage. Later, because there was no way to add custom C tests,
 WebAssembly#522 added a `smoke` directory which allowed this. But (a) each of
these test suites was built and run separately and (b) it was unclear
how to add more tests flexibly--some tests should only run on `*p2`
targets or `*-threads` targets, e.g.

This change reworks all of this so that all tests are built the same
way, in the same place. For downloaded tests like those from
`libc-test`, I chose to add "stub tests" that `#include` the original
version. This not only keeps all enabled tests in one place, it also
allows us to add "directives," C comments that the `Makefile` uses to
filter out tests for certain targets or add special compile, link or run
flags. These rudimentary scripts, along with other Bash logic I moved
out of the Makefile now live in the `scripts` directory.

Finally, all of this is explained more clearly in an updated
`README.md`. The hope with documenting this a bit better is that it
would be easier for drive-by contributors to be able to either dump in
new C tests for regressions they may find or enable more libc-tests. As
of my current count, we only enable 40/75 of libc-test's functional
tests, 0/228 math tests, 0/69 regression tests, and 0/79 API tests.
Though many of these may not apply to WASI programs, it would be nice to
explore how many more of these tests can be enabled to increase
wasi-libc's test coverage. This change should explain how to do that
and, with directives, make it possible to condition how the tests
compile and run.
Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! Very nice setup. Kind of reminds me of llvm's lit/filecheck system.

test/README.md Outdated Show resolved Hide resolved
test/README.md Outdated
// filter.py(TARGET_TRIPLE): !wasm32-wasip2
// add-flags.py(CFLAGS): ...
// add-flags.py(LDFLAGS): ...
// add-flags.py(RUN): ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be more readable if we drop the .py here?

Also perhaps a special prefix such as "//!" would be good to distinguish directives from normal comments?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed on the //! bit. Still mulling over the .py suggestion: one thing that always bugs me with infrastructure stuff like this is that I don't know where to look if something goes wrong. I figured if I put the .py extension on there most of us would think, "oh, I see, this logic is from some file... let's bust out find..." But that may not be as clear as I think?

test/scripts/failed-tests.sh Show resolved Hide resolved
echo "$ENGINE $WASM" > cmd.sh
chmod +x cmd.sh
./cmd.sh &> output.log
[ $? -eq 0 ] || echo "Test failed" >> output.log
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the string "Test failed" at the end of output.log is the signal that the test failed? Is that right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually if there's any output at all; appending this string just makes sure of that.

./cmd.sh &> output.log
[ $? -eq 0 ] || echo "Test failed" >> output.log
popd > /dev/null

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trailing newline here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of my files have that to avoid GitHub's red-line warnings.

To avoid a `Makefile` dependency issue, a previous commit changed the
location of the download directory to live inside the build directory;
this change propagates that to the CI configuration. Also, it is no
longer necessary to clean up anything between runs: both the `build` and
`run` directory are subdivided by target triple so repeated builds will
not interfere.
@abrown
Copy link
Collaborator Author

abrown commented Nov 23, 2024

Wow! Very nice setup. Kind of reminds me of llvm's lit/filecheck system.

Heh, a poor man's version unfortunately, but it should work ok to enable more tests. I think one future thing we'll want to add is the ability to express "if we're compiling target X, then add flags Y and Z." But I didn't bother with that just yet.

@abrown abrown marked this pull request as ready for review November 23, 2024 00:20
@abrown abrown requested a review from sbc100 November 23, 2024 00:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants