
Restructuring data-driven tests for better encapsulation and less fragility #394

Open
vreuter opened this issue Dec 1, 2024 · 0 comments
Labels: good first issue, help wanted, testing

Comments

@vreuter (Collaborator) commented Dec 1, 2024

Right now, we have a variety of tests in src/test/scala that are backed by small data files (usually CSV or JSON) in src/test/resources. Many of these tests and files are intimately coupled: test expectations and assertions hinge on small differences in data within one resource file relative to another. The file names typically allude to what those differences and the corresponding expectations are, and the test(s) referencing a data file describe the expectation, but this linkage is inherently fragile.

What if a tiny change is made to a data file -- will it break the associated test(s)? What if the test logic changes -- will the assertions now fail? What if the implementation logic changes -- are the relationships under test still expected to hold? These questions are difficult to answer for any combination of (code, tests, data), but the physical separation of test data from test logic makes them harder still.

We should come up with abstractions that better encapsulate the relationship between test data and test logic/expectations, and then use those abstractions to couple the data and the logic more closely. Ideally, we move to a model where each test function/assertion in a particular test class resembles an executor: it takes a bundle of data and expectations, plus a way to pull "observations" out of the result of the function call under test, and then passes or fails based on comparing observations to expectations. This would be table-driven parameterization, with each input something like (data, expectations, extractions): the data are passed to the function under test, and the expectations and extractions pair up 1:1, so that each extraction, applied to the result of the function under test on the given data, yields an observation to check against its corresponding expectation. This is already our model in several spots, but we should adopt it and encode it more thoroughly; a sketch of what such an abstraction might look like follows below.
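A minimal sketch of the kind of abstraction this could take, in plain Scala. The names here (DataDrivenCase, runCase) are hypothetical, not an existing API in this repo; this is just one shape the (data, expectations, extractions) bundle and its executor might take.

```scala
// Hypothetical bundle of (data, expectations, extractions) for one table-driven case.
final case class DataDrivenCase[In, Out](
  description: String,
  data: In,                        // input handed to the function under test
  checks: List[(Out => Any, Any)]  // (extraction, expectation) pairs, matched 1:1
)

object DataDrivenCase {
  /** Apply the function under test to the case's data, then compare each
    * extraction's observation against its paired expectation, collecting
    * every mismatch rather than stopping at the first.
    */
  def runCase[In, Out](f: In => Out)(c: DataDrivenCase[In, Out]): List[String] = {
    val result = f(c.data)
    c.checks.zipWithIndex.flatMap { case ((extract, expected), i) =>
      val observed = extract(result)
      if (observed == expected) Nil
      else List(s"${c.description} [check $i]: expected $expected, observed $observed")
    }
  }
}
```

Each test class would then hold a table of such cases plus a single assertion, e.g. iterating the table and requiring that runCase yields no failure messages for any case; whether that iteration uses ScalaTest's table-driven property checks or a plain foreach is a detail for the suite.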

@vreuter added the good first issue, help wanted, and testing labels on Dec 1, 2024