-
Notifications
You must be signed in to change notification settings - Fork 801
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor test apps to use unit-test framework #4014
Open
bennylp
wants to merge
90
commits into
master
Choose a base branch
from
unittest-framework
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…e new framework (reason: because parallel flag is not set, doh!)
…, skip essential tests, list tests
…ce returning value
…use of errors when running on Windows virtual machine
…t wait the previous test to complete)
…ing, and running unit test. This can be used by all test apps. pjlib-util-test has been ported to use this utilities
…sential test because it does not exist in features test (in pjlib-test)
…up from 45m originally to 15m using 10 worker threads
…4:30 minutes with 10 worker threads, from 45:42m originally)
…provements due to exclusive tests
… automatic error reporting) hopefully make it easier to use
…e with unit-test logging (see unittest.md)
…args are set in GitHub action variables
Merged
…since binding fails occasionally
… into unittest-framework
For failed pjsip-test on Mac, it is because of After fixing this, it's never stuck again in ~15 runs here, but occasionally (~20%) will fail with:
|
That failures happen here as well, and I tried to fix it in 58602a4 |
…test-framework
… into unittest-framework
… into unittest-framework
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR contains modifications to PJSIP test apps (
pjlib-test
,pjlib-util-test
,pjnath-test
,pjmedia-test
, andpjsip-test
) to use the new unit test framework (#4007) with the main objective to make them complete faster.Note: I recommend not to immediately delete the branch after merging to track possible future problems due to the very large changes in the test files.
Timing Results
Let's get straight to it. Below are the test time improvements from the original with the new framework using several worker thread settings.
GitHub CI timings:
Notes on timing
Settings with three worker threads (totalling four threads with the main thread) are significant because our GitHub runners mostly use 4 VCPU according to this article.
Some tests cannot be made faster than certain limit with more worker threads, because that is the longest test case duration in that test.
General look and feel
All test apps have common look and feel with uniform command line options, which look something like this:
The test outputs are also uniform, which look something like this:
Running the tests
With Makefile build system, it is easier to run the tests with the
make
command. TheMakefile
accepts two environment variables:CI_ARGS
contains arguments for the test apps, andCI_MODE
to indicate we're running under GitHub CI (#3374). Sample invocation:Otherwise (e.g. on Windows) run each of the app directly. Use
-h
to get help.GitHub CI modifications
CI_UBUNTU_ARGS
,CI_WIN_ARGS
,CI_MAC_ARGS
, andCI_MODE
. UseCI_XXX_ARGS
to control run-time arguments for the test apps, especially the number of worker threads, which should be equal to the number of vCPU of the GitHub action runners minus one (because the main thread also runs the test cases).Tips on troubleshooting errors
When the logging does not convey sufficient info about the error, use
--log-no-cache
to display logs as they are written, most likely with-w 0
to disable worker thread to avoid cluttering the output.But sometimes, problem only arises with specific worker thread number and test orders. In this case, troubleshooting will be challenging indeed. :) Use
-v, --verbose
to display when tests are started/ended. This way you can know what tests were started when the failed test was running. After that, you can try running only these tests rather than all tests to reproduce the problem.Test shuffling (
--shuffle
arg) is used by default on GitHub CI via repository variables (see above). To reproduce the error, make note of the seed value used when running the (failed) test (it is printed in the output), and re-run the test (locally) using--shuffle --seed N
args.The
--stop-err
option is useful to avoid waiting for all tests to complete when debugging an error.Open issues
Reproducibility
As mentioned above, we're supposed to be able to reproduce the test sequence by using
--shuffle
and specific--seed
value. But as it turns out, this is not the case. Even with the same seed, the test sequence can be different on different machine. We already use our own psudo random number generator inunittest.c
, but this doesn't seem to fix the problem.Intermitten crashes
There is an intermitten crashes in pjsip's
regc_test
. This could be related totdata
being dec-ref-ed more than it should (even though the test result is success). The exact cause however is still not known.Test app modifications
General
There is a new utility file in
pjlib/src/pjlib-test/test_util.h
that is shared by all test apps to parse command line arguments, show usage, register tests, and control the unit testing process.The main front-end files (
main.c
) were modified to be more nice as command line apps.The main modification in test body (
test.c
) is to use the unit-test framework.Some test codes were changed, replacing manual checks with
PJ_TEST_XXX()
macros, mainly to test the usage of these macros and to make the test nicer. But since it made the PR very big, I didn't continue the effort, unless when it was necessary for debugging some problems.In general, large tests needed to be split into smaller ones to make them run in parallel. But major problems arose, mainly because the tests share global states or manipulate common objects.
More specific changes are discussed below.
pjlib-test
notespjlib-test
has "special" arrangements intest.c
, because it needs to test the unit-test (UT) framework first, before running the rest of the test using the UT framework. But before testing the UT framework, it needs to test the components needed by the UT framework such as list, fifobuf, and OS. And so on. That's why the test output is different than the rest of the test apps.Other than that, the modifications to the test functions are not too major, at least compared to pjnath-test and pjsip-test, and I think the test time is quite satisfactory.
pjlib-util-test
notesWe couldn't speed up more because tests such as
resolver_test()
andhttp_client_test()
takes about three minutes to complete and they couldn't be split up without major effort due to the use of global states. Since the test time is already quite satisfactory, I didn't pursue further optimizations.pjnath-test
notespjnath-test
requires large modifications to make the tests run in parallel as follows:mem
pool factory since many tests validate the memory leak in the pool factory, therefore having a single pool factory will not workserver.c
so that server can be instantiated multiple times simultaneously (this was the motivation behind API to get DNS server's bound address to allow specifying zero as port number #3999).ice_test
,turn_sock_test
,concur_test
) into individual test for each configuration, making them parallelable.As the result, there are 70 smaller test items in
pjnath-test
, and with 7 worker threads, we can save 40 minutes of test time!pjmedia-test
notespjmedia-test
has the least modifications because it has very few tests. The original duration was 4m18.691s, and has come down a little to 2m8.363s with 1 worker thread.Having said that, some minor modifications were done:
pjmedia_endpt_create()
withpjmedia_endpt_create2()
(similarly..destroy()
with..destroy2()
) inmips_test()
andcodec_test_vectors()
, to avoid inadvertently initializingpjmedia_aud_subsys
which on Ubuntu emits lots of debugging messages during initialization (although the messages should have been suppressed in the code).printf
with log in jbuf test to make the output tidy, and renamedjbuf_main
function name tojbuf_test
to be consistent.pjsip-test
notespjsip-test
has also gone through the biggest and most difficult modifications to make the tests parallelable, which involves:pjsip_tpselector
to bind transaction (andtdata
in case of stateless request) with specific loop transport, otherwise the transaction/tdata may find other instance of loop transporttsx_uac_test
failed because UA layer has now been registered before the test)tsx_basic_test
,tsx_uac_test
,tsx_uas_test
to take the index to parameters rather than the parameter itself to make the test output more informative.