Our testing tools have served us well, and there is plenty in them that we absolutely need to keep.
The test suite has grown large, though, and its sheer size is putting pressure on the tooling, which is
starting to show cracks.
Besides actual bugs (double execution), which we have hopefully fixed, our design is reaching its limits.
We have serious issues with test isolation and test concurrency, leading to situations where
it is hard or outright impossible to figure out which test is causing a cascading failure, or why adding a new test
file that passes on its own breaks everything.
Furthermore, because we are not prescriptive about certain things, a number of anti-patterns have emerged among test authors:
- defer being used instead of t.Cleanup
- not calling cleanup routines before running tests
- outright forgetting to clean up resources
- the Cwd for binary calls defaults to the current working directory of the test process - this causes a variety of issues,
as it may well be read-only (lima); it should default to a temp directory
- manipulating the environment directly - which has side-effects for other tests
- tests becoming big blobs of interleaved setup, cleanup, and actual test routines - making them hard to read and making it
hard to figure out what is actually being tested
- repetitive subtests that shadow testutil.T, leading to confusing code
- ... or not dereferencing the current test in a loop
- in-test homegrown abstractions being inherently repetitive, with the same boilerplate over and over again
- structuring tests and subtests being left as an exercise to the developer - leading to a wide variety of approaches
and complex boilerplate
- hard to debug: many of the assert methods give no feedback whatsoever on which command actually
failed, or on its output or environment
- icmd.Expected showing its limits, assuming that there is only one error you can test, or that you can only
test the output if exitCode == 0
- running commands with binaries other than the current target (e.g. not calling base.Cmd) being left as an exercise
to the developer, leading to all of the issues above (Chdir, Env manipulation, no debugging output, etc.)
- no-parallel being the default - unless specified otherwise - which should be the other way around
- very rarely testing stderr in case of error - partly because we do not use typed errors, but also because it is
cumbersome / not obvious / limited
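
To illustrate the working-directory and environment items above, here is a minimal sketch using only the standard
library. The isolatedCommand and hasEnv helpers are hypothetical names introduced for illustration and are not part
of the existing or new tooling. The idea is to give each command its own directory and its own copy of the
environment instead of calling os.Chdir or os.Setenv in the test process. Inside a real test, t.TempDir() would
typically replace os.MkdirTemp, since it is cleaned up automatically.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// isolatedCommand (hypothetical helper) builds a command that runs in dir with
// a private copy of the environment, so the test process is never mutated.
func isolatedCommand(dir string, extraEnv map[string]string, name string, args ...string) *exec.Cmd {
	cmd := exec.Command(name, args...)
	cmd.Dir = dir          // explicit working directory instead of os.Chdir
	cmd.Env = os.Environ() // fresh copy of the current environment
	for key, value := range extraEnv {
		// For duplicate keys, os/exec uses the last entry, so these override.
		cmd.Env = append(cmd.Env, key+"="+value)
	}
	return cmd
}

// hasEnv (hypothetical helper) reports whether entry ("KEY=value") is in env.
func hasEnv(env []string, entry string) bool {
	for _, e := range env {
		if e == entry {
			return true
		}
	}
	return false
}

func main() {
	dir, err := os.MkdirTemp("", "isolated-")
	if err != nil {
		fmt.Println("cannot create temp dir:", err)
		return
	}
	defer os.RemoveAll(dir)

	cmd := isolatedCommand(dir, map[string]string{"EXAMPLE_FLAG": "1"}, "go", "version")
	fmt.Println("runs in temp dir:", cmd.Dir == dir)
	fmt.Println("child sees EXAMPLE_FLAG:", hasEnv(cmd.Env, "EXAMPLE_FLAG=1"))
	_, leaked := os.LookupEnv("EXAMPLE_FLAG")
	fmt.Println("leaked into test process:", leaked)
}
```

Because nothing global is changed, commands built this way do not interfere with tests running in parallel.
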
This new tooling offers a set of abstractions that should address all of these and encourage more expressive, better structured,
better isolated, more debuggable tests.
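
As a rough illustration of the expectation and debuggability points (several errors, output checked regardless of
the exit code, failures that report what was actually seen), here is a minimal sketch. Every name in it (result,
expected, check, contains) is hypothetical and is not the API of the new tooling. Matching errors by message in
stderr is a simplifying assumption, since we do not use typed errors.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// result is a hypothetical record of a finished command.
type result struct {
	ExitCode int
	Stdout   string
	Stderr   string
}

// expected is a hypothetical expectation: an exit code, any number of errors
// whose messages must appear in stderr, and an optional output check that is
// applied whatever the exit code is.
type expected struct {
	ExitCode int
	Errors   []error
	Output   func(stdout string) error
}

// check returns nil when res meets exp, or an error that lists every unmet
// expectation along with the captured stdout and stderr.
func check(res result, exp expected) error {
	var problems []string
	if res.ExitCode != exp.ExitCode {
		problems = append(problems, fmt.Sprintf("exit code: got %d, want %d", res.ExitCode, exp.ExitCode))
	}
	for _, e := range exp.Errors {
		if !strings.Contains(res.Stderr, e.Error()) {
			problems = append(problems, fmt.Sprintf("stderr does not mention %q", e.Error()))
		}
	}
	if exp.Output != nil {
		if err := exp.Output(res.Stdout); err != nil {
			problems = append(problems, err.Error())
		}
	}
	if len(problems) == 0 {
		return nil
	}
	return fmt.Errorf("%s\nstdout: %q\nstderr: %q", strings.Join(problems, "; "), res.Stdout, res.Stderr)
}

// contains builds an output check that requires sub to appear in stdout.
func contains(sub string) func(string) error {
	return func(stdout string) error {
		if !strings.Contains(stdout, sub) {
			return fmt.Errorf("stdout does not contain %q", sub)
		}
		return nil
	}
}

// Example errors, also hypothetical.
var (
	errNotFound = errors.New("not found")
	errDenied   = errors.New("permission denied")
)

func main() {
	res := result{ExitCode: 1, Stdout: "partial output", Stderr: "fatal: not found; permission denied"}
	exp := expected{ExitCode: 1, Errors: []error{errNotFound, errDenied}, Output: contains("partial")}
	fmt.Println("matching expectation:", check(res, exp))
	fmt.Println("mismatch report:", check(res, expected{ExitCode: 0}))
}
```
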
Signed-off-by: apostasie <[email protected]>