RFC. Support for resetting function-scoped fixtures #12596
Comments
The scope name "item" is misleading, and having a scope per function call is also misleading, as "function" is the scope per call. This really needs a design scope-out: practically, this is a type of dynamic parametrization, and we need proper ways of reporting it and actually remaking fixtures of the function scope, for sanity. The whole idea of a "multi-executor" needs to be fleshed out in a more nuanced manner, as we need to set out for correct reporting, and need to consider plugins like rerunfailures (which is currently messy wrt setupstate) and repeat (which is messy wrt abusing fixture parameters for re-execution). I don't want to deter the attempt, but the introduction of package scope, while it was not quite complete, turned out to carry some massive pain points that are not clearly solved, and the "multi-executor" idea seems to me like something with potential for similar problems. I want to avoid the mistakes we made with package scope. Additionally, we are also working on a definition scope (which is per test definition, and applies so that multiple parametrizations of the same item may use it).
Thanks for the feedback @RonnyPfannschmidt! I understand your concerns about adding a new scope, but I also think it fits quite naturally if we can find a good name and well-understood semantics for it.
Just to clarify: The intended semantics for the new scope is "thing that is executed and reported by pytest", do you think it should be called
It can be seen as dynamic parametrization, yes. But I'm not convinced we need pytest to report it as such, the status quo of repeated executions without pytest's knowledge is working well with the exception of function fixtures. I worry that aiming too high here will just lead to no progress, because that's hard.
Thanks! I'll look into these two presently, to see if there's enough overlap to classify them in the same way and reuse ideas. The concept itself is just "something that wraps the test function to run it several times in a single pytest test", which is well established by hypothesis and doesn't need much more fleshing-out?
Understood. Note, the "multi executor" idea is not new, only the term I coined for it, to aim for (limited) generality. Read it as "hypothesis" if that is clearer.
That should be fine, it would be orthogonal to this proposal, with
Part of me is getting the impression that we shouldn't have a new scope; instead we should enable a "function/item" to be run any number of times while rerunning its function-scoped setup entirely. This avoids issues like tmpdir setup/teardown and other confusions, as it would all be about being able to run an item multiple times for different reasons (repeat, flakiness, different hypothesis datasets). However, that type of rerunning is something that needs a change in the runtestprotocol, which is one of the harder changes to the internals.
I checked these
Yeah, that would work. However: A new
The
...ouch.
The assessment of repeat is incorrect, as the repeats it does create new test ids, which is problematic. For example, setting repeat to 1000 on a test suite of 1000 tests ends up with 1,000,000 test ids for rerun failures in the pytest cache.
Just to leave my first impressions after reading this thread: my gut feeling aligns with @RonnyPfannschmidt. I would really like to avoid introducing a new user-facing scope, as it seems really complicated to explain/understand for the average user, because in order for it to make sense you first need to introduce the concept of "multi-executor", which is a bit niche and not something users encounter on a day-to-day basis. I also agree that, given this is something plugins deal with, perhaps we can greatly simplify things by providing plugins with the ability to reset the fixture caches -- we can introduce a new experimental method to class `Node`:

```python
def reset_fixture_cache(self, names: Iterable[str], recursive: bool = False) -> None:
    """
    EXPERIMENTAL

    Can be called by plugins in order to reset the internal cache of the given
    fixture names, which will force their execution again the next time they
    are requested. Should be used with care by plugins which execute a test
    function multiple times.

    :param names:
        Names of the fixtures whose cached values should be discarded.
    :param recursive:
        If True, also reset all fixtures of the same name (as fixtures might
        overwrite other fixtures with the same name in different locations).
    """
```

I have not looked into the details, but I suspect this is trivial/simple to implement and would solve all the issues with plugins that need to execute the same test function more than once (hypothesis, pytest-repeat, etc).
Thanks for your suggestion @nicoddemus! I very much agree on the user-facing aspect of such a new scope, and the proposed documentation says only the following:
That said, your alternative proposal of resetting only specified fixtures would work without introducing a new scope. Such a method would definitely be useful, and enough to implement basic support for proper function-scoped fixtures in hypothesis. However, I worry that it would be more error prone, as it's hard to protect against accidentally depending on a resetting fixture:
The problem here is that
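The hazard being discussed can be illustrated with a small, self-contained sketch (hypothetical names, modeling the fixture cache as a dictionary rather than using pytest internals): a fixture outside the reset set keeps a value derived from a fixture that was reset, so the two go stale relative to each other.

```python
# "outer" is not in the reset set but cached a value derived from "inner",
# which is. After the reset, "inner" is fresh while "outer" still holds the
# stale derivation -- the accidental-dependency problem.
values = {}

def get(name, factory):
    if name not in values:
        values[name] = factory()
    return values[name]

def reset(names):
    for name in names:
        values.pop(name, None)

counter = iter(range(1000))
inner_factory = lambda: next(counter)                      # stands in for a function-scoped fixture
outer_factory = lambda: get("inner", inner_factory) + 100  # depends on inner

get("inner", inner_factory)                 # inner == 0
stale = get("outer", outer_factory)         # outer == 100, derived from inner == 0
reset(["inner"])                            # reset only inner, as a plugin might
fresh_inner = get("inner", inner_factory)   # inner == 1 now
stale_outer = get("outer", outer_factory)   # still 100: out of sync with inner
```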
Why would this be a problem? Unless I'm missing something, resetting
I suppose the fear is that a reset has extra messy effects on hidden dependencies that will see effects from it, and the answer to that, in the case of the proposed API, is "well, too bad".
Ah I see. Well yeah, but I suppose for plugins like Hypothesis, they would be responsible for going over all the "function"-scoped fixtures of a test and resetting them all. While I understand this might not sound ideal, I think it is still a step in the right direction because:
Introducing an entire new scope is much more risky because we would be releasing a user-facing feature directly; it would introduce much more code churn, documentation, etc. Also, one thing does not necessarily exclude the other: we can introduce the
Apologies for the late response, I've been travelling. Thanks for the constructive ideas, I'd be happy to try implementing @nicoddemus' proposed idea. But I also realize from your back-and-forth that I don't understand the details sufficiently yet, so there will likely be some iterations. One thing in particular: after invalidating the relevant fixtures, what would be the right entry point for recreating them (without starting a new subtest or anything like that)? In the draft PR I did this explicitly (in the same order as initial creation), but I suspect this may not be enough here, since the closure-to-reset may be a subset of the closure of the test function, and hence some fixtures outside the scope may be torn down. [edit: If there is no such entry point, that's fine too, just so that I don't go too far down the wrong trouser leg.]
My current understanding is that normal item setup should recreate them as needed.
What's the problem this feature will solve?
Support for per-execution fixtures in executors that wrap each test function and run it multiple times (for different examples) has been a long-standing issue, for example:
The motivating example is hypothesis, but in an effort to keep this issue more general I'll just refer to it as a "multi-executor" here. I'm not familiar with other multi-executors, but if there are any then the chosen solution should hopefully be usable by those as well.
Some of the reasons that this has been stalled are that the example executions are not up-front enumerable, and that the test item should still be failed/passed and reported as a unit — hence, not a good fit for subtests or other declarative-style approaches.
I propose a simpler solution, which is to expose a method that the multi-executor can call in-between examples, to explicitly signal the start of a new example and reset function-scoped fixtures.
Describe the solution you'd like
My proposed solution is, briefly:
1. A new scope `item`, in-between `function` and `class`. The new scope is equivalent to `function` unless running with a multi-executor, but is scoped to the test item so that it is not reset between repeated executions of the function for the same test item. Few fixtures will have the `item` scope, but an example of a fixture having natural `item` scope is `request`. This new scope is not required for the correctness of the proposed functionality, but since each test item can be executed hundreds of times, there are performance implications that make it highly desirable (plus, it's a helpful concept for explaining multi-executors in documentation).
2. A new method on `FixtureRequest`, to be called by the multi-executor when function-scoped fixtures should be reset. I realize this is backwards control flow from the usual pytest hooks, but it's only an integration point for multi-executors, not anything that end users will have to care about. It's also possible to encapsulate the backwards call itself in a pytest-supplied decorator/wrapper; such a wrapper is actually implemented as proof-of-concept in the PR tests, and could be used as an alternative integration point: execute the wrapper multiple times instead of the original function, and the wrapper takes care of resetting fixtures and passing their new values to the function.
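The wrapper-style integration point can be sketched as follows. This is a hypothetical, self-contained model (a dictionary cache standing in for pytest's fixture machinery; `make_cache` and `run_examples` are illustrative names, not a real pytest API): the wrapper runs the test function once per example, resetting the function-scoped fixtures in between so each example sees fresh values.

```python
# Hypothetical wrapper-style multi-executor: reset fixtures, re-resolve
# their values, then run the test function, once per example.
def make_cache(factories):
    cache = {}

    def get(name):
        if name not in cache:
            cache[name] = factories[name]()
        return cache[name]

    def reset(names):
        for name in names:
            cache.pop(name, None)

    return get, reset


def run_examples(test_fn, get, reset, fixture_names, examples):
    """Run test_fn once per example, with fixtures reset before each run."""
    results = []
    for example in examples:
        reset(fixture_names)  # signal the start of a new example
        args = [get(name) for name in fixture_names]
        results.append(test_fn(example, *args))
    return results


ids = iter(range(1000))
get, reset = make_cache({"fresh_id": lambda: next(ids)})
results = run_examples(lambda ex, fid: (ex, fid), get, reset, ["fresh_id"], ["a", "b", "c"])
```

All repeated executions still happen inside one pytest test item, which matches the requirement that the item be failed/passed and reported as a unit.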
The rest are implementation details, shown in the working prototype (with tests) in PR #12597.
A builtin fixture (`tmp_path`) was not compatible with resets, due to expecting to always have run the end-of-item `pytest_runtest_makereport` before teardown. The fix was easy, but it is an indication that there may be other such incompatible fixtures in the wild. I believe this is ok, as they are not impacted unless running with a multi-executor, which was never safe before, and (opt-out) disallowed by at least hypothesis.
What do you think?
Additional context
There is also interest in scopes finer-grained than `function`, to support competing dependency-mutating fixtures. This is different from, and not addressed by, this proposal.