You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We've had a handful of bugs in the past which only reproduced with a particular seed, such as 716f007 or d8e5351.
These can happen because we obfuscate code in different ways depending on the seed. For instance, a bug might only appear when one of the literal obfuscators is chosen on a particular string; if we have 8 such literal obfuscators, one single seed from a go test run has 12.5% chance of triggering the failure.
CI does tend to catch these unlikely flakes at some point; each CI run spins up the tests on three platforms, and we also run jobs on master after merge, so each test gets at least 6 different test runs. Still, multiple flakes have gone unnoticed for weeks or months in the past.
To catch them faster, I propose that we add a test-seeds CI job that runs all the tests with a series of "default seeds". For example, it could be 32 seeds ranging from -seed=001122334400 to -seed=00112233441A. We could teach the tests to inject such a default seed any time a test runs garble build without specifying a seed, and we could pass it around via an env var.
This still wouldn't make flakes impossible, but if we test with 32 seeds rather than 1, then we're roughly 32 times more likely to find such flakes.
The number of seeds to test with is up for debate. 32 is probably a bit optimistic; the CI job could take a long time. Perhaps we could start with 10.
Something else we could do is use go test -short on those seeds; most integration tests do their basic testing as part of -short, and do extra expensive tests such as double-checking go build as part of the non-short build. Meaning that we'd basically already get all the benefit of many seeds with -short, but testing with each seed would be significantly faster.
The text was updated successfully, but these errors were encountered:
I realise that the two examples I pasted above were bugs in test code, and not in garble itself, but I still think this is important to ensure the software is stable. For instance, in #461 (comment) I ran into a bug that only triggered with one literal obfuscator, so I had to keep running the tests with different seeds to get lucky. I very nearly submitted a PR with the bug because I initially had not noticed it; having CI triple-check many seeds would have been nice.
most integration tests do their basic testing as part of -short, and do extra expensive tests such as double-checking go build as part of the non-short build.
Speaking of which, go test -race takes a long time on CI; go test takes ~3m on Linux, and it's ~8m for go test -race. We could cut a good two or three minutes from the CI time by using go test -short -race instead; following the same logic as above, we shouldn't lose much race coverage at all.
We've had a handful of bugs in the past which only reproduced with a particular seed, such as 716f007 or d8e5351.
These can happen because we obfuscate code in different ways depending on the seed. For instance, a bug might only appear when one of the literal obfuscators is chosen on a particular string; if we have 8 such literal obfuscators, one single seed from a
go test
run has 12.5% chance of triggering the failure.CI does tend to catch these unlikely flakes at some point; each CI run spins up the tests on three platforms, and we also run jobs on master after merge, so each test gets at least 6 different test runs. Still, multiple flakes have gone unnoticed for weeks or months in the past.
To catch them faster, I propose that we add a
test-seeds
CI job that runs all the tests with a series of "default seeds". For example, it could be 32 seeds ranging from-seed=001122334400
to-seed=00112233441A
. We could teach the tests to inject such a default seed any time a test runsgarble build
without specifying a seed, and we could pass it around via an env var.This still wouldn't make flakes impossible, but if we test with 32 seeds rather than 1, then we're roughly 32 times more likely to find such flakes.
The number of seeds to test with is up for debate. 32 is probably a bit optimistic; the CI job could take a long time. Perhaps we could start with 10.
Something else we could do is use
go test -short
on those seeds; most integration tests do their basic testing as part of-short
, and do extra expensive tests such as double-checkinggo build
as part of the non-short build. Meaning that we'd basically already get all the benefit of many seeds with-short
, but testing with each seed would be significantly faster.The text was updated successfully, but these errors were encountered: