Skip unsupported regression tests #615

schuessf · 2023-02-28T14:17:39Z

Currently many regression tests still fail (see #611). Therefore the Jenkins status ("unstable") is not a good indicator. However, some of the tests are expected to fail (and always have), because they only pass for some settings/toolchains (e.g. due to overapproximation).

This PR provides the possibility to mark those tests as skipped after running them. To this end, we store all such tests (each consisting of file, settings, and toolchain) along with a verdict in a separate file. If a test fails with this verdict, it is marked as skipped, otherwise as failed.
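To illustrate the mechanism, here is a minimal sketch, not the code of this PR. It assumes the whitespace-separated line format visible in the `.skip` diff in this thread (file, settings, toolchain, verdict, with `//` comment lines). The class and method names (`SkipListSketch`, `parse`, `decide`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: all names are hypothetical and not taken from the PR. */
final class SkipListSketch {
    enum Outcome { PASSED, SKIPPED, FAILED }

    /** One entry of the skip file: input file, settings, toolchain, expected verdict. */
    static final class Entry {
        final String file;
        final String settings;
        final String toolchain;
        final String verdict;

        Entry(String file, String settings, String toolchain, String verdict) {
            this.file = file;
            this.settings = settings;
            this.toolchain = toolchain;
            this.verdict = verdict;
        }
    }

    /** Parses whitespace-separated entries; blank lines and "//" comments are ignored. */
    static List<Entry> parse(List<String> lines) {
        List<Entry> entries = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("//")) {
                continue;
            }
            String[] parts = line.split("\\s+");
            if (parts.length != 4) {
                throw new IllegalArgumentException("Malformed skip entry: " + line);
            }
            entries.add(new Entry(parts[0], parts[1], parts[2], parts[3]));
        }
        return entries;
    }

    /** A failing test is skipped only if an entry matches both the test and the observed verdict. */
    static Outcome decide(List<Entry> entries, String file, String settings,
            String toolchain, boolean passed, String observedVerdict) {
        if (passed) {
            return Outcome.PASSED;
        }
        for (Entry e : entries) {
            if (e.file.equals(file) && e.settings.equals(settings)
                    && e.toolchain.equals(toolchain) && e.verdict.equals(observedVerdict)) {
                return Outcome.SKIPPED;
            }
        }
        return Outcome.FAILED;
    }
}
```

In this sketch, a listed test that fails with a different verdict than the recorded one is still reported as failed, which is the point of storing the verdict alongside the test.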

There are still some open points to discuss:

Currently, the files containing the verdicts need to be in the same folder as the settings or toolchains (and not next to the input file itself!), so that the skipped-file can be parsed beforehand.
In the current version, the verdicts are the results as strings (e.g. TIMEOUT, UNKNOWN, EXCEPTION_OR_ERROR, ...). This has two possible issues. First, we need to convert these strings to enums, which might fail. Second, these categories might not be as precise as wanted. Therefore, we could instead use the description (e.g. Unable to prove ... Reason: overapproximation of ...) as the verdict in this skipped-file.
And of course, whether it is reasonable to mark the given tests as skipped.
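The first issue (string-to-enum conversion that might fail) can be sketched as follows. This is only an illustration: the `Verdict` enum is a hypothetical stand-in containing just the three result names mentioned above, and `parseVerdict` is an assumed helper name. The only library behavior relied on is that `Enum.valueOf` throws `IllegalArgumentException` for an unknown name.

```java
import java.util.Locale;
import java.util.Optional;

/** Sketch only: the enum and method names are hypothetical. */
final class VerdictParsingSketch {
    /** Hypothetical subset of result categories, named after the examples in this thread. */
    enum Verdict { TIMEOUT, UNKNOWN, EXCEPTION_OR_ERROR }

    /** Returns the parsed verdict, or an empty Optional if the text names no known category. */
    static Optional<Verdict> parseVerdict(String text) {
        if (text == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(Verdict.valueOf(text.trim().toUpperCase(Locale.ROOT)));
        } catch (IllegalArgumentException e) {
            // valueOf rejects names that are not enum constants.
            return Optional.empty();
        }
    }
}
```

Returning an empty `Optional` lets the caller decide whether an unparsable verdict in the skipped-file should abort the test run or merely be reported.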

danieldietsch · 2023-02-28T15:07:32Z

When you run a test suite with skipped tests, can you infer from the test suite result alone which tests were skipped? For example by looking into the TEST-*.xml?
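One way such a check could look is sketched below. It assumes the report follows the common JUnit/Surefire XML layout, in which a skipped test appears as a `testcase` element with a `skipped` child. Whether the reports produced by this PR actually look like that is exactly the open question here. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Sketch only: assumes the usual <testcase><skipped/></testcase> report layout. */
final class SkippedFromReportSketch {
    /** Returns the "name" attribute of every test case that has a <skipped> child. */
    static List<String> skippedTestNames(String reportXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(reportXml.getBytes(StandardCharsets.UTF_8)));
            NodeList cases = doc.getElementsByTagName("testcase");
            List<String> skipped = new ArrayList<>();
            for (int i = 0; i < cases.getLength(); i++) {
                Element testCase = (Element) cases.item(i);
                if (testCase.getElementsByTagName("skipped").getLength() > 0) {
                    skipped.add(testCase.getAttribute("name"));
                }
            }
            return skipped;
        } catch (Exception e) {
            // Wrap parser and I/O errors so callers need no checked exceptions.
            throw new IllegalStateException("Could not read test report", e);
        }
    }
}
```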

trunk/examples/programs/FloatingPoint/regression/c/.skip

trunk/examples/programs/regression/c/.skip

danieldietsch · 2023-02-28T15:12:29Z

trunk/examples/programs/regression/c/.skip

+BitwiseOperations02.c KojakC-Reach-32Bit-Default.epf KojakC.xml UNKNOWN
+
+// Timeout
+BitwiseOperations01.c BlockEncodingV2AutomizerC-FP-MaxSaneBE.epf BlockEncodingV2AutomizerC.xml TIMEOUT


These worked with MaxSaneBE at some point. Does increasing the timeout help here?

danieldietsch · 2023-02-28T15:16:36Z

...urce/Library-UltimateTest/src/de/uni_freiburg/informatik/ultimate/test/UltimateTestCase.java

@@ -183,7 +184,12 @@ public void test() {
 				}
 				if (th != null) {
 					message += " (Ultimate threw an Exception: " + th.getMessage() + ")";
+					if (result == TestResult.IGNORE) {
+						skipTest(message, th);


Are you sure it's a good idea to throw a new exception and not keep the old one? At least as a wrapped exception?

schuessf · 2023-02-28

@maul-esel wrote this 🙂

maul-esel · 2024-08-20

Ok, so this was a long time ago, but: from my reading of the code, it seems like we (now) do keep the old exception th as a wrapped exception, right?
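For readers unfamiliar with the pattern under discussion, here is a minimal sketch of keeping the original throwable as the cause of the newly thrown exception. `SkippedTestException` is a hypothetical type and this `skipTest` is not the method from `UltimateTestCase.java`. The sketch only relies on the standard `RuntimeException(String, Throwable)` constructor.

```java
/** Sketch only: the exception type and helper are hypothetical, not the PR's code. */
final class WrappedSkipSketch {
    /** Hypothetical exception that signals a skipped test. */
    static final class SkippedTestException extends RuntimeException {
        SkippedTestException(String message, Throwable cause) {
            // Passing the cause keeps the original stack trace reachable via getCause().
            super(message, cause);
        }
    }

    /** Throws the skip signal while preserving the original throwable as its cause. */
    static void skipTest(String message, Throwable th) {
        throw new SkippedTestException(message, th);
    }
}
```

With the cause attached, a stack trace printed for the skip signal also shows the original exception in its "Caused by" section.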

...c/de/uni_freiburg/informatik/ultimate/test/junitextension/testfactory/FactoryTestRunner.java

danieldietsch · 2023-04-23T08:15:30Z

@schuessf Is there a reason why this isn't merged?

schuessf · 2024-08-20T09:27:43Z

Okay, I revisited this PR after a long time 🙃 and updated a few files. These are the downsides and open questions:

What should be the name of the files that specify the tests to be skipped (currently: .skip)? Should it be a fixed name, or do we want to allow some pattern?
Currently the files have to be in the same folder as the settings or toolchains, but not next to the input file itself. This allows us to parse the file only once beforehand and not for every test itself.
The verdict in the .skip files can currently only be the result (e.g. UNKNOWN, TIMEOUT, ...), but not a substring of the more detailed error message. Supporting the latter would allow a more precise check of whether the test should indeed be marked as skipped (or whether an unexpected error occurred).

If we can agree on these questions, this should be ready to be merged -- and finally all our tests might pass 😉

danieldietsch · 2024-08-20T09:49:24Z

I would say:

Keep .skip, do not allow different names. Easier to find with find and the like.
Keep it in the same folder as settings or toolchains. Just one concept, would be more complex if we had different ones. And parsing once is of course also important :)
It would be nice to allow regexes here; then we could also exclude all results with .*.

schuessf · 2024-08-20T10:42:28Z

I updated the code to use regular expressions that also match on the message and not only the result (7b67a7e), without changing the .skip files yet.
If we also want to use regular expressions in the .skip files, I think the current space-separated format is not optimal (since there might also be spaces inside the regexes). Maybe we can also change the format to be more "category-based", to avoid duplicating the same regexes.
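A rough sketch of such a regex-based check follows. It is not the code from 7b67a7e. It assumes that the expression must match either the whole result name or the whole message (the actual matching mode is not stated in this thread), and the names `RegexSkipSketch` and `shouldSkip` are hypothetical.

```java
import java.util.regex.Pattern;

/** Sketch only: names and the full-match semantics are assumptions. */
final class RegexSkipSketch {
    /** True if the regex fully matches the result name or, if present, the detailed message. */
    static boolean shouldSkip(String regex, String result, String message) {
        // DOTALL lets ".*" span line breaks in multi-line messages.
        Pattern pattern = Pattern.compile(regex, Pattern.DOTALL);
        if (pattern.matcher(result).matches()) {
            return true;
        }
        return message != null && pattern.matcher(message).matches();
    }
}
```

Under this assumption, `.*` skips a test regardless of its result, while an expression such as `Unable to prove .*overapproximation.*` targets one specific reason. That second expression contains spaces, which is why it cannot be stored in a space-separated line without some form of quoting.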

danieldietsch · 2024-08-20T10:47:31Z

Yeah, using them in the .skip files would be needed. Category-based would also be nice. Perhaps use YAML or something for which we already have a parser?

maul-esel

Thanks for reviving this! 🎉

See below for a few comments and questions.

...eTest/src/de/uni_freiburg/informatik/ultimate/test/decider/SafetyCheckTestResultDecider.java

...Test/src/de/uni_freiburg/informatik/ultimate/regressiontest/AbstractRegressionTestSuite.java

trunk/examples/concurrent/bpl/regression/skip.yml

...Test/src/de/uni_freiburg/informatik/ultimate/regressiontest/AbstractRegressionTestSuite.java

trunk/examples/concurrent/pthreads/races/regression/skip.yml

trunk/examples/programs/regression/bpl/skip.yml

...Test/src/de/uni_freiburg/informatik/ultimate/regressiontest/AbstractRegressionTestSuite.java

danieldietsch · 2024-08-28T19:30:36Z

Just to bump, so that it does not lie dormant for too long :)

schuessf · 2024-08-29T21:02:51Z

Just to bump, so that it does not lie dormant for too long :)

There are some conflicts now. I will try to resolve them next week and finalize this PR 😉

schuessf · 2024-09-04T11:51:09Z

I just discovered some bugs and restarted the nightly. I think this PR should be ready now.

Just one thing I recently discovered: all tests that are marked as skipped also show up as "success" in Jenkins (and maybe even in JUnit?), e.g. see here and here. @maul-esel do you know why this is happening?

maul-esel · 2024-09-07T12:20:26Z

I do not know. When running tests in Eclipse, it looks fine. But I get the same phenomenon when running tests via the maven command line:

Tests run: 3, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 27.804 s - in de.uni_freiburg.informatik.ultimate.regressiontest.generic.RegressionTestSuite
I_my-test_regression_sum_invariant.bpl S_my-test_regression_ReachSafety-32bit-Automizer.epf T_my-test_regression_ReachSafety.xml(de.uni_freiburg.informatik.ultimate.regressiontest.generic.RegressionTestSuite) skipped
I_my-test_regression_sum_invariant.bpl S_my-test_regression_ReachSafety-32bit-Automizer.epf T_my-test_regression_ReachSafety.xml(de.uni_freiburg.informatik.ultimate.regressiontest.generic.RegressionTestSuite)  Time elapsed: 26.044 s
I_my-test_regression_sum_invariant.bpl S_my-test_regression_ReachSafety-32bit-GemCutter.epf T_my-test_regression_ReachSafety.xml(de.uni_freiburg.informatik.ultimate.regressiontest.generic.RegressionTestSuite)  Time elapsed: 1.751 s

(the third line should not be there)

schuessf requested review from maul-esel, Heizmann and danieldietsch February 28, 2023 14:17

danieldietsch approved these changes Feb 28, 2023

schuessf force-pushed the wip/dk/skip-unsupported2 branch 2 times, most recently from 46204d4 to d43de7f Compare August 20, 2024 09:02

maul-esel reviewed Aug 20, 2024

maul-esel and others added 15 commits September 2, 2024 10:03

add support for ignoring tests after running them

8b4ea8f

preliminary support for test result deciders to skip tests

12536f9

First attempt to check if a test case is ignored

f3ad4f9

Add possibility to ignore test failures to SafetyCheckTestResultDecider

8fae4cd

Don't overwrite IGNORE result

a03eee8

Skip overflow tests as well

2547848

Ignore safety tests in various folders

8cbef81

Ignore some more tests and move to the correct location

688f032

Ignore another test

da2b1a0

Improve parsing of ignore file

1f787f0

Restructure ignore files, add comments with reason

522e294

Add expected verdict to ignored.txt, only skip with that verdict

ed8ccb9

Fix some expected verdicts

dbd795a

Catch case where String cannot be converted to enum

bcbff56

Minor: Restructure method

adde04c

schuessf added 13 commits September 2, 2024 10:04

Skip tests with overapproximation

75786af

Fix build in new regression tests

fcfd79a

Throw exception when testsuite does not support skipping tests

fa6fc5f

Update .skip files

b03532f

Skip data race tests

9189631

Fix updating wrong .skip file

2467d4a

Use regex to check if test should be marked as skipped

c17b485

Fix typo

989a4f2

Remove unnecessary .skip file

b72f878

Use YAML for skip files

b4ea028

Add missing space

ebef1df

Use proper YAML syntax for skip files

ab8bd81

Only allow skip files next to the tests itself

362adb7

schuessf force-pushed the wip/dk/skip-unsupported2 branch from 62e5a22 to 362adb7 Compare September 2, 2024 08:07

schuessf added 10 commits September 2, 2024 10:09

Use .testignore as filename

a782a21

Split .testignore properly for subfolders

594d115

Allow skipping termination tests

f845e42

Add missing import

261c3fa

Add newline

10f0efb

Add checks for no overriden result

1fbf05f

Add newline

9a97c39

Avoid parentheses in regex

1f5cc6a

Add timeout

c9d2c0b

Skip termination tests properly

95ffcc8

Add separate method to override verdict

ecb5fa7

maul-esel approved these changes Sep 9, 2024

schuessf merged commit a9b967e into dev Sep 9, 2024
3 of 4 checks passed

schuessf deleted the wip/dk/skip-unsupported2 branch September 9, 2024 20:35
