Conflicting fixtures #7053
unfortunately the fixture system is completely unaware of such conflicts, and so far no one has set out to come up with a comprehensible way of handling "unique" resource configurations and their dependency/optimization
I'm still a little unclear as to how everything ties together on your end, but I'll present a solution, and you can tell me if I'm missing something. It sounds like you have several tests that are set up pretty much the same way, except the config is different. It also sounds like you have some tests that should only apply for some configs, but not others. It also looks like your example fixtures are showing that you have 2 files, and they will either have a 1 or a 2 written to them, to show that the fixtures would conflict because one could overwrite the work of another. You also mentioned that it takes a long time to prepare one of these configs, but that will be handled as a pleasant side effect of the solution I'm presenting.

Side note: I'm a bit of a testing purist, so before I dive in, I want to address this:
The difference between one test and another that do the same setup and take the same measurements (i.e. assertions) is the independent variables. If you have different independent variables, you have different tests. Differences in configs (in this case at least) constitute different independent variables, which means they are different tests, because you'd be testing something different. When you treat them as different tests in the eyes of pytest, things get easier. This means each test is already defined and given a separate nodeid, regardless of which config is active. Then, when you only want to run certain tests, because you only want to use certain configs at that moment, you can just "select" those tests using this technique. With that in mind, on to the solution.

Back to it

If you have a common setup with only some variables being changed, and all the tests are the same, then fixture parameterization is the answer. Simply parameterize the right fixture, and everything extends from there. Pytest will automatically handle one param set at a time, so if that parameterized fixture has the right scope, it will only run the minimum number of times that it needs to, and it will finish one param set before moving on to the next. In your case, it sounds like that would mean something like parameterizing a session-scoped fixture with the different config starting values (i.e. something that you can pass in to some function/class to kick off the config generation easily). Some abstraction may be required for you to more easily set this up. The tricky bit now is that you may have certain tests that shouldn't be running against certain config params. This could lead you to using runtime logic to handle them.

Solution

Going off of your example, let's say you have, at your top level:

config_permutations = [
(Config("file1", "1"),),
(Config("file1", 2"),),
(Config("file2", "1"),),
(Config("file2", 2"),),
(Config("file1", "1"), Config("file2","1")),
(Config("file1", "1"), Config("file2","2")),
(Config("file1", "2"), Config("file2","1")),
(Config("file1", "2"), Config("file2","2")),
]
@pytest.fixture(scope="session", params=config_permutations)
def write_configs(request):
    for config in request.param:
        config.write_to_disk()

This will parameterize every test it applies to. Unfortunately, this includes the tests that shouldn't run for some of those param sets. But that's just because we aren't done yet. Now let's say you have some tests that should only run for certain configs:

config_permutations = [
(Config("file1", "1"), Config("file2","1")),
(Config("file1", "1"), Config("file2","2")),
]
@pytest.fixture(scope="session", params=config_permutations)
def write_configs(request):
for config in request.param:
config.write_to_disk() Now this You can repeat this approach for all the other tests that can't run under all config variations. If you're using (Resolvable) CaveatThe only issue, is that the alternate fixture versions will probably run multiple times, even if only one config is being "selected". By that I mean This can definitely be solved, and there's more than one solution for this part. However, I'll let you read through and see if you can find the answer (maybe something with singletons 🤔 ). If you're not sure, let me know and I can give some ideas. |
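For concreteness, a minimal sketch of where such an override could live; the directory name and the Config import path are made up, and only the fixture-overriding mechanics are the point:

# tests/file1_and_file2/conftest.py  (illustrative location)
# Redefining write_configs here overrides the top-level, fully parameterized
# version for just the tests collected under this directory, so they only
# ever see the param sets they can actually run against.
import pytest

from tests.helpers import Config  # hypothetical home of the Config helper used above

config_permutations = [
    (Config("file1", "1"), Config("file2", "1")),
    (Config("file1", "1"), Config("file2", "2")),
]

@pytest.fixture(scope="session", params=config_permutations)
def write_configs(request):
    for config in request.param:
        config.write_to_disk()

With pytest's normal fixture-overriding rules, the name match is what matters; the top-level fixture itself doesn't need to change.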
Thank you very much for your replies. @SalmonMode I made an oversimplified example. Our test configuration is mostly based on one config file of ~1000 bytes, each byte having 256 different possibilities. But there are more ways to configure the system as well, so any sorting based on config is not doable, and any permutation of possible configs will be too big for any test system. One of the beauties of pytest is that we can write requirements right next to the test with fixtures or marks, and we would really like to use that. I have looked into the source code and into some options but have not come up with a good solution. This works but will not minimize the number of options, so option1 in the example below will be called three times instead of two. I also have to write "option1" three times for each test.
Something like this is what I would like. But it would require a larger change to pytest so it is probably not doable soon (unless someone else more into pytest can/will help).
I wouldn't worry about pytest handling all the possible config permutations. If you are worried, it might be worth a look to see if your testing approach is apt, as there may be tests that either don't need to be run at all, or they could be structured far more efficiently. Rest assured, though, as there's almost always a pure pytest solution, and I'm happy to help find it. After reading through your original post again, it also sounds like the hardware only has one config "file", so to speak, rather than multiple. This makes things quite a bit simpler. But correct me if I'm wrong. Ultimately, you have a large number of configurations, and then you have a large amount of different hardware. Some pieces of hardware are incompatible with some configurations. The part that's confusing me is where you said:
I originally thought you wanted to run all tests on all hardware, under every config that each piece of hardware is compatible with. I'm starting to think that isn't the case. Are you looking to only run each test function once, but certain tests can only be run on certain hardware with certain configurations?
You can access marker values from fixtures using request.node.get_closest_marker:

@pytest.fixture
def parsed_args(request):
    marker = request.node.get_closest_marker("cli_args")
    if marker is None:
        args = ...  # some suitable default for unmarked tests
        # could also raise an error instead?
    else:
        args = marker.args[0]  # the list passed to the mark
    return mod.parse_args(args)

@pytest.mark.cli_args(['--verbose', 'filename'])
def test_verbose(parsed_args):
    mod.run(parsed_args)
That shouldn't be necessary. Aside from the built-in ... If the answer to my question above is "yes", then I think the most straightforward route would be to define each test for each applicable hardware configuration. Using ... If the fixtures for the configs are parameterized, you can give the params ids. With that in place, you can simply target specific hardware configurations for a single test run, in order to make sure they all happen together and the device is only configured/restarted once per config. I understand the concern about having to do ... I also highly recommend against manipulating the way pytest collects tests, as every time I've seen that done, it's only resulted in a more complex and difficult-to-maintain system. Note: filtering tests the way I described happens after pytest has collected all the tests. They are separate phases, and filtering is a primary feature of pytest, so it is automatically accommodated for in all aspects of pytest, so long as it's done through the standard means. Modifying how pytest collects tests isn't accommodated for in all aspects of pytest, so while it's technically supported, the burden of maintaining that is placed on the user, and it's expected that they know what they're doing and are aware of all its implications. This is part of why I avoid modifying how it collects tests.
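For concreteness, a minimal sketch of parameterizing with readable ids; the config values and id strings are made up, and only the params/ids mechanics are the point:

import pytest

HW_CONFIGS = {
    "config_small": {"mode": 1},
    "config_large": {"mode": 2},
}

@pytest.fixture(scope="session", params=list(HW_CONFIGS.values()), ids=list(HW_CONFIGS.keys()))
def hw_config(request):
    # apply the config to the device here (restart it, etc.), then hand it to the tests
    return request.param

def test_boots(hw_config):
    assert hw_config["mode"] in (1, 2)

Because those ids end up in each test's nodeid, an invocation like pytest -k config_small runs only that slice, and the device is configured just once for that run.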
Closing this as an inactive question.
why not have an option to group tests that require incompatible fixtures apart? maybe have a scope "tests that use this fixture"?

from contextlib import contextmanager
import pytest

dogs = set()

@contextmanager
def expensive_operation_with_dog_named(name):
    dogs.add(name)
    yield
    dogs.remove(name)

@pytest.fixture(scope="tests that use this fixture")
def charlie():
    with expensive_operation_with_dog_named("charlie"):
        yield

@pytest.fixture(scope="tests that use this fixture")
def buddy():
    with expensive_operation_with_dog_named("buddy"):
        yield

def test_charlie(charlie):
    assert dogs == {"charlie"}

def test_buddy(buddy):
    assert dogs == {"buddy"}

def test_charlie_again(charlie):
    assert dogs == {"charlie"}

in my case ...
@oakkitten you can do this by leveraging scope already. You can use a sub-package or just a module to house the tests that need that expensive fixture, and have the fixture set to that scope (e.g. package if it's for the package, or module if you're using a module). A fixture will only run once a test that requests it is starting to execute, unless it has autouse set to true. So in this case, it looks like ...
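A minimal sketch of that module-scoped arrangement, assuming the dogs set and the expensive_operation_with_dog_named helper from the snippet above live in an importable helper module (the module name is made up):

# test_charlie.py -- only the tests that need the "charlie" setup live here
import pytest

from dog_helpers import dogs, expensive_operation_with_dog_named  # hypothetical helper module

@pytest.fixture(scope="module")
def charlie():
    # expensive setup runs once for this whole module
    with expensive_operation_with_dog_named("charlie"):
        yield

def test_charlie(charlie):
    assert dogs == {"charlie"}

def test_charlie_again(charlie):
    assert dogs == {"charlie"}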
i can, but it's rather inconvenient. it's perhaps ok if the kind of dog is one of the most important qualities of a test, but what if it's rather insignificant? you are now grouping tests by some insignificant quality of theirs. and what if you have a test that needs not only a dog but also a cat, which is also expensive? i suppose you'd have to have nested test structures now and it's just getting more complicated. grouping tests by incompatible fixtures automatically, rather than manually, seems like a cleaner solution and not too difficult to implement... why not?

edit: here's an awkward idea that at least works

from contextlib import contextmanager
import pytest

dogs = set()

@contextmanager
def expensive_operation_with_dog_named(name):
    dogs.add(name)
    yield
    dogs.remove(name)

@pytest.fixture(scope="session", params=["charlie", "buddy"])
def dog(request):
    with expensive_operation_with_dog_named(request.param):
        yield

@pytest.fixture()
def charlie(dog):
    if dogs != {"charlie"}:
        pytest.skip()

@pytest.fixture()
def buddy(dog):
    if dogs != {"buddy"}:
        pytest.skip()

def test_charlie(charlie):
    assert dogs == {"charlie"}

def test_buddy(buddy):
    assert dogs == {"buddy"}

def test_charlie_again(charlie):
    assert dogs == {"charlie"}
Ah, I think I understand now. Parameterizing is great for when you want to make sure different contexts/input data result in the same behavior and output/state. For example, parameterizing browsers for e2e testing, as one would still expect identical behavior in different browsers (from the perspective of the test, at least). But if you do it only because it lets you repeat yourself less (DRY), and are triggering different behaviors because of different inputs (e.g. Charlie vs Buddy), then it's less clean. It violates DAMP and KISS. Based on the fundamentals of how the framework operates, it either implies test cases that don't actually exist, and dependencies that shouldn't be, or it requires more complex logic (complex meaning more diverse behavior in a smaller amount of code), and complexity is the enemy, or you manage to have both problems. The cleanest possible solution is having two fixtures. The first example you gave is actually quite possible, and very clean. Assuming that first snippet you provided was the entire test file, you can just make the scope for charlie and buddy be "module", and you're done. The ideal is grouping tests by the behavior under test, which is why we use fixtures to control that grouping, as a fixture often represents the triggering of a particular behavior, no matter how insignificant.
Also, if you have a test that needs both cat and dog, then you can have the relevant cat fixtures depend on the relevant dog fixtures, so you can capitalize on the dog fixture having already been run. I like to think of it like a big XML document, so you'd have your cat tag inside the dog tag. If you have some tests that only needed the cat stuff, but not the dog stuff, you can still have them be inside the dog stuff, but then if you run them in isolation, they'll still be executing the dog stuff when they don't need to. And of course, the tests that only need the cat stuff would still be running with unnecessary dependencies when running the full test suite, which introduces confounding variables. There is no optimal solution where a test framework can be aware of these kinds of dependencies and switch them on and off based on when a test happens to be executing, because then your tests, by definition, won't be idiomatic or deterministic. Edit: instead of "deterministic", the correct word to use is "linearizable" (for anyone reading in the future) |
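A minimal sketch of that nesting, with placeholder setup; the only point is that cat requests dog, so the dog context is guaranteed to already be in place:

import pytest

@pytest.fixture(scope="module")
def dog():
    # expensive dog setup would go here
    yield "buddy"

@pytest.fixture(scope="module")
def cat(dog):
    # cat setup happens inside the already-established dog context
    yield "tom"

def test_needs_both(cat, dog):
    assert (dog, cat) == ("buddy", "tom")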
i only used parametrizing to demonstrate that this problem can be easily solved. it's the wrong tool to use, but the end result is that the expensive operation is run the minimal number of times and the structure of the tests does not depend on pytest internals. all without changing pytest code!
if you do that, all tests but the first one fail:

def test_buddy(buddy):
>       assert dogs == {"buddy"}
E       AssertionError: assert {'buddy', 'charlie'} == {'buddy'}

def test_charlie_again(charlie):
>       assert dogs == {"charlie"}
E       AssertionError: assert {'buddy', 'charlie'} == {'charlie'}

did you have something different in mind?
if cat is unrelated to dog, this would lead to some very awkward code. there wouldn't be only "tom" now, but "tom and buddy". for buddy, charlie and no dog that's 3 fixtures instead of one?
here's a working example with cats and dogs, i just copy-pasted half the code:

from contextlib import contextmanager
import pytest

dogs = set()

@contextmanager
def expensive_operation_with_dog_named(name):
    dogs.add(name)
    yield
    dogs.remove(name)

@pytest.fixture(scope="session", params=["charlie", "buddy"])
def dog(request):
    with expensive_operation_with_dog_named(request.param):
        yield

@pytest.fixture()
def charlie(dog):
    if dogs != {"charlie"}:
        pytest.skip()

@pytest.fixture()
def buddy(dog):
    if dogs != {"buddy"}:
        pytest.skip()

cats = set()

@contextmanager
def expensive_operation_with_cat_named(name):
    cats.add(name)
    yield
    cats.remove(name)

@pytest.fixture(scope="session", params=["tom", "simba"])
def cat(request):
    with expensive_operation_with_cat_named(request.param):
        yield

@pytest.fixture()
def tom(cat):
    if cats != {"tom"}:
        pytest.skip()

@pytest.fixture()
def simba(cat):
    if cats != {"simba"}:
        pytest.skip()

def test_charlie_tom(charlie, tom):
    assert dogs == {"charlie"}
    assert cats == {"tom"}

def test_charlie_simba(charlie, simba):
    assert dogs == {"charlie"}
    assert cats == {"simba"}

def test_buddy_tom(buddy, tom):
    assert dogs == {"buddy"}
    assert cats == {"tom"}

def test_buddy_simba(buddy, simba):
    assert dogs == {"buddy"}
    assert cats == {"simba"}

this seems very deterministic and as idiomatic as it can get? a test simply depends on a cat and a dog, and you can put tests in any order and cats and dogs in any order. pytest is even smart enough to run the expensive operation a total of 5 times! (it would be nice to have a say about which fixture is more expensive, a cat or a dog, so that pytest could minimize the number of times it's run. in this example, pytest runs, in order, charlie, tom, buddy, simba, charlie. reordering stuff gives a different order, but pytest seems to be bent on making more dogs than cats. apparently, the order depends on the name of the parametrized fixture. it seems that the name that's closer to ...
Ah I see why changing the scope didn't fix it. Unfortunately, your solution isn't deterministic, because it's dependent on test execution order, in that the changes to the system state are not the same for a given test if that test is run in isolation versus the entire suite being run. You should never rely on test execution order, because it means that you're likely introducing unnecessary dependencies and confounding variables, aren't engineering a test around a given behavior effectively, and lose the ability to run your tests deterministically. Pytest also doesn't have a set way of ordering tests, and how it happens to order them is up to the current implementation in certain areas of the code. Even if you pinned pytest to a specific version and identified a (currently) deterministic sorting order for the tests, you'd still be beholden to the order you define your tests in and how they're named, which will very quickly become a nightmare to maintain and extend, and also wouldn't be idiomatic, nor would it be Pythonic. Attempting to "repair" state after a given test is tempting, but unfortunately means you're still manipulating the state in various ways differently. It also means you're adding complexity and assuming that you will be perfect at undoing what was done, which is almost never the case. For example, if using Selenium and you want to share the same browser between two different tests to save time, you might attempt to wipe the history, cookies, and cache, but that doesn't work as browsers are incredibly complex, and there will always be something left behind. That's why it's recommended to always make a fresh driver session for each test. I understand the desire to speed up very slow tests, but there's always a better way. Can you provide some more context on what Edit: instead of "deterministic", the correct word to use is "linearizable" (for anyone reading in the future) |
but it's not? my very idea is to not rely on test structure (or order) to get the same results. this working example (again, this is only to demonstrate that pytest can easily do it) demonstrates that you can have tests in any order and have test arguments in any order, really you can have anything in any order here and it will work the same way any time.
now this is an unrelated problem that already exists in pytest. how tests are run is already depending on the names of the fixtures. it would be best to create another issue for this, though.
i'm assuming you are talking about |
The tests are dependent on test execution order in that they could only be considered deterministic based on test execution order being deterministic. Determinism is dependent on both behavior and outcome. If I run your tests as a suite, it's sort of like doing ... That said, given this is more meant to be representative of an IRC network starting/stopping, it's a different story. But I have to ask: why not have a package-level fixture in a conftest that launches the IRC network for the tests that depend on it, and have all the tests that depend on it be the only ones in that package?
Not quite. It's not actually a problem. Pytest doesn't actually care about order, or naming, and never will. I only mentioned it because, as I said, the determinism of your tests is dependent on test execution order. The only thing you can guarantee related to test order, is test batching. Tests within a given scope will always be run together (e.g. all the tests in a class will be run together, but in an effectively random order). The execution ordering within that group is irrelevant. Order of tests should always be considered nondeterministic, but not the order of fixtures. Fixture order is determined by scope, which fixtures request which other fixtures, and whether or not a fixture is autouse. Scope can be leveraged to ensure, not an ordering of tests, but a common context under which a group of tests can run. In other words, it lets you perform your "arrange" steps, and then the action, and then run multiple asserts against that resulting state without having to repeat the steps. The order of the asserts shouldn't matter, because the asserts should be using non-state-changing queries only. They can also be used to ensure that immutable resources (or effectively immutable resources, i.e. those that could be mutated, but aren't) are only created/calculated once for the entire scope, which can save a lot of time. For example, starting an IRC network for several tests to leverage. I'm guessing, but I think these last two bits are the secret ingredient you need. Granted, I'm only going off the context that some tests require your IRC network, while others don't. |
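A minimal sketch of that package-level idea, with a hypothetical helper standing in for whatever actually starts the IRC network:

# tests/irc/conftest.py -- every test in this package can request the network
import pytest

from mytesthelpers import start_irc_network  # hypothetical helper

@pytest.fixture(scope="package")
def irc_network():
    network = start_irc_network()   # started once for the whole package
    yield network
    network.stop()                  # torn down after the last test in the package

Tests outside this package never pay the startup cost, and tests inside it share one running network.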
i'm not sure i understand the general idea. you definitely can make tests that depend on their order, and it's definitely a bad idea, isn't it? e.g.

foo = set()

def test_a():
    assert foo == set()
    foo.add("a")

def test_b():
    assert foo == set()

here the tests are executed in the order that they are defined and the order matters. in the examples that i posted, the tests are executed in the order that pytest defines for parametrized fixtures and the definition order doesn't matter... this is the default and at least partially documented behavior of pytest, isn't it? if there's a problem with this behavior, well, it's already affecting existing tests, isn't it?
Yes, absolutely. But what I'm saying is that, while it's not your intent to make the tests dependent on test execution order, they still are, because that's the only way they can be considered deterministic. In order for them to be deterministic, a latter test always has to have the setup left behind by the tests that ran before it. In order for a test to be considered deterministic, it has to have the exact same setup when run as part of a test suite as it does when run in isolation. Since you were saying that the set being updated was just a placeholder for a network starting and stopping, this doesn't apply, as the state of the running system would not have been modified from one test to another. Had this been an actual object, then it would have applied.
It is not. At least, not exactly. Pytest really only cares about grouping, not order. As you parameterize a fixture, you create a set of bubbles, where each bubble is one param set. Everything that was affected by that parameterization effectively gets a copy made and inserted into each bubble. These bubbles are now groupings. If you parameterize something that is affected by parameterization, then more bubbles get made, and copies of each of those bubbles are copied into the previous bubble, like a fractal. Pytest is basically only concerned about maintaining the ordering of groupings, so each of those bubbles gets executed in one shot, but it's not particularly concerned about the order within a bubble. The groupings are pretty much determined by dependencies, so as long as you specify all dependencies explicitly (e.g. fixture a depends on b and c, fixture b depends on c and d, and fixture c depends on fixture d, giving you d -> c -> b -> a), then you'll have complete control over the order of operations for any given test, but not the test execution order. One other issue is that you're leveraging the global namespace for setting up dependencies. In pytest, the global namespace isn't really supposed to be used. Everything is meant to be done through fixtures. By stepping outside the fixture system, things become a tad janky.
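A minimal sketch of that d -> c -> b -> a chain; the string building is only there to make the resulting setup order visible:

import pytest

@pytest.fixture
def d():
    return "d"

@pytest.fixture
def c(d):
    return d + "c"

@pytest.fixture
def b(c, d):
    return c + "b"

@pytest.fixture
def a(b, c):
    return b + "a"

def test_chain(a):
    # every dependency is explicit, so the setup order is unambiguous
    assert a == "dcba"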
to reiterate a bit on our conversation from earlier: first of all, regarding the cats and dogs example where expensive operations were run 6 or 7 times. i tried several python versions but i only got 5 expensive operations. if you launch this repl.it, it should produce...
i would like to know how i can reproduce the behavior where 6 or 7 expensive operations are run.

regarding determinism: what i understand by it is that tests that run in the same environment and with the same command line arguments should always produce the same results. the idea is, if there's an error in the test, you can catch it, and if you fix it, you can verify that it is fixed. if you look at my silly example here, these tests produce the same results every time, unless you run them in isolation. so these tests are deterministic. the fact that in isolation these tests produce different results is only a matter of my incompetence, not a fault within the test system itself. as far as i know, this holds for my other examples as well. so the tests in them are deterministic. again, while my examples do work and are deterministic, they are only mock examples that only demonstrate that pytest can already do it. if this is ever properly implemented, pytest could guarantee some of the minor details. this would only be a different guarantee in place of the existing one, one arising from the fact that parametrized fixtures are executed deterministically.

regarding the complexity: while my silly example does use branching logic (...

regarding the scoped solution: all in all, there are three levels of abstraction in the tests here:
so considering your example, there are multiple problems:

class TestConfigA:
    @pytest.fixture(scope='class')
    def charybdis_config(self):
        return {"dog": "charlie", "cat": "tom"}

first of all, there's "TestConfigA" (or "TestCharlieTom", or "configs_1121_1122", from earlier comments). if you are lucky, you can assign a meaningful name here. but if you are unlucky, "TestCharlieTom" is as good as it gets. so you are now stuck with a useless name and you might as well add a comment explaining why you need it in the first place. now the reader of the test is aware of the minor details of how dogs work. this is a good example of an abstraction violation. then, you have cats and dogs in one place, even though these are unrelated. that would be a, uh... violation of separation of concerns? furthermore, this is a violation of don't repeat yourself. and finally, this code runs the expensive operation 8 times. as it's nested, as far as i can see, it can't be optimized to run the expensive operation 5 times, which is the possible minimum. in this particular example, this is now just as slow as simply having function-scoped fixtures. i don't think there's any advantage at all here.
Run this to install the branch of pytest I've been working on with the maintainers that resolves some fundamental bugs in how fixtures are cleaned up. It's just waiting on some code review (and probably a migration strategy), but it will eventually be merged in.
Then put a print inside
A test must have the same result, and operate exactly the same when run in isolation as when run as part of a suite. If they don't, then they are different tests because one relies on other tests that came before it and the other does not. Whether or not you consider this deterministic is not important. What is important is that it's an inconsistent test. Developers will be depending on a test operating consistently between these two contexts so they can iterate rapidly. Many developers will be running a single test in isolation because it failed and they're trying to test a potential fix. If it passes in isolation, but fails as part of a suite, that will only cause frustration and cost valuable time.
Do not depend on this. This would be banking on implementation and behavior that isn't specifically intended by pytest. It just happens to work out that way at this current moment.
It's in a fixture, and therefore part of your test. You're not using a plugin. You're using a shortcut that raises an exception after an expensive operation has been performed.
The test's name would usually be a combination of the nested namespaces it's in, with each level providing more context, and eventually explaining the arrange/act/assert being done. There's no real test case here to put into words, so these are all placeholder names. So this is moot and irrelevant to the discussion. The point of my examples was to demonstrate a concept surrounding using logical structures to create more concrete isolations between tests. If you want to provide me with an actual test that you want to run, I'd be happy to show how I would name things.
This is the test scenario you provided. If you don't want them tested together, you'd have to come up with another scenario. If you're talking about the fact that I used a dict to house them together, remember that it's a quick mockup with placeholder data. I could have used two fixtures if that would've made more sense to you. But I can only do so much with a limited, abstract example scenario.
As I mentioned before, DRY is only good as long as you don't violate DAMP and KISS. Code golf is fun, but not helpful for writing actual code. DRY is not a law; it's a reminder that if you're repeating complex blocks of code often, it likely means you need to abstract. The argument that I'm not following DAMP in my examples is moot, because I wasn't demonstrating actually descriptive and meaningful locations, but rather where and how they could be used in such a structure.
Honestly, this is starting to come off as a little combative, so I'd like to apologize if you felt I've come off that way towards you, and insert a reminder that my goal here is to set you up with something that is sustainable, to keep you from trying to prematurely optimize, to identify better ways to optimize, and to help point you in the right direction for a testing mindset. I had to leave our chat earlier, but I have another solution I wanted to present but didn't have the time. My philosophy is that you should never define a test that will never run, because it signals that there's a deeper issue. It indicates you're in an XY problem. You're focused on parameterizing fixtures to be more DRY, when DRY is a very bad thing when it comes to test definitions. But the solution may have nothing to do with fixtures at all. You said you were working with IRC networks and that spinning multiple ones up wasn't a problem. Depending on what exactly you're testing, you could consider those networks infrastructure, and infrastructure should not be established by your tests. They should assume it's already in place, and just rely on configs to be pointed at it. So rather than focusing on how to most optimally spin up and shut down the networks, just spin up the ones you want to run tests against, and kick off the relevant tests. You can even mark your tests based on the networks they're supposed to run against, so you can launch them with something like the sketch below. That said, this really does depend on a lot of context I just don't have, so YMMV. If I had more context, I may be able to provide more useful suggestions.
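A sketch of that mark-based selection; the marker names are made up and would need to be registered (e.g. under markers in the pytest config) to avoid warnings:

import pytest

@pytest.mark.network_a
def test_join_channel():
    ...

@pytest.mark.network_b
def test_topic_change():
    ...

With the corresponding network already running, an invocation along the lines of pytest -m network_a then picks up only the tests meant for it.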
i can confirm this regression. it's probably a non-issue, as pytest git master doesn't have it and the branch ... either way, this is a development branch on a forked repo. i'm not sure how valuable it can be in the scope of this discussion. regardless of that, the tests still pass, don't they?
up to this point you were criticizing my proposal saying that it is not deterministic. so i suppose the question whether or not the relevant tests are deterministic or not is an important one. do you consider these tests non-deterministic? the question whether or not these tests are consistent, that is if they can be run in isolation with the same result or not, as i understand, is not related at all to the question whether or not they are deterministic or not. you are saying that the tests in my proposal are inconsistent. would you mind showing this inconsistency? this seems to work fine:
neither my proposal, nor my silly example, depends on the fact that parametrized fixtures are executed deterministically.
since we are talking here about the ordering of fixture execution (which i am not depending on), something that is probably determined solely by python code, and perhaps the file system, i don't see how pytest wouldn't have to go out of its way to make tests non-deterministic, by e.g. using random. again, this is quite irrelevant to my proposal.
again, this is a silly example that is only meant to demonstrate that pytest can easily do it. it is not in any way production level (or even alpha level) code. it is a single-file proof of concept. in the world of real tests it doesn't exist. it is not found on any abstraction level of the tests. conceptually, it's a part of a plugin (that doesn't exist yet) or pytest itself (if this issue is resolved). i'll repeat, it is not real code! just a proof of concept. sorry if this sounded rude, if it did, i didn't intend for it to be that way. but it seems to me that you are discussing my toy example as a real-life solution to my trying to run a few irc networks with incompatible fixtures. and i'm just in a weird position of defending my very much abstract proposal to solve the issue of conflicting fixtures. i don't really have a problem in my tests. while launching irc networks is expensive, my tests take less than a minute to run all in all. i'm not going to use paramtrization to speed up tests. again, using parametrization to optimize this is silly. it's just a proof of concept. i hope i made myself clear. if i did, you can probably ignore the rest of this, but just to be completely clear:
the whole point of my proposal is optimization. currently i'm running one network instance per test and it works just fine. this won't solve any other problems, as there aren't any. this can only serve to speed things up.
again, the use of parametrizing is only a proof of concept. it's not real code. i'm not really focusing on parametrizing to be more DRY. in fact, i'm not trying to be DRY at all. it's a silly proof of concept. it only has to work. it's not even that DRY as i do have to repeat myself a bit.
i can't easily spin up more than one. which is exactly the problem. i wouldn't mind spinning up several if i could, that's what i might have said. it's just that so many things are baked into the service that it's hard to run more than one instance. P.S. i still think that this kind of conversation is best suited for irc |
Some stuff went down a little while back, so this got put on the backburner, but it's actually a pretty serious issue IMO. It was actually merged in for 5.3.3 and then we had to back it out in 5.3.4 because of an implementation detail (I mistakenly used a
To clarify, I'm not saying the current implementation of that behavior is nondeterministic. I'm saying the way the ordering happens to break down within one of those bubbles in a given release isn't intended despite being deterministic. Another deterministic implementation could be made in another release, or another version of python could change something that causes that ordering to be different, yet still deterministic given that release and version of Python. My point here is that it's not something you should depend on, because it can change just from updating something. I'm not saying you're doing this, just cautioning you because your phrasing implied you wanted to rely on it.
That seems to be the core of the miscommunication, then. People usually come asking for solutions, rather than discussing potential solutions, so that's my default approach. I agree this would best be continued elsewhere. If you want to talk more, you can find me in the Selenium slack/IRC 😁 |
so to recap on our previous conversation, in which we, i think, came to a certain kind of an agreement. suppose you have the following tests that use fixtures of the dog and cat kind:

def test_charlie_simba(charlie, simba): ...
def test_buddy_tom(buddy, tom): ...
def test_buddy_simba(buddy, simba): ...
def test_charlie_tom(charlie, tom): ...

notice that this one requests the same fixtures, only in a different order:

def test_simba_buddy(simba, buddy): ...

and this test that doesn't use fixtures at all:

def test_tree(): ...

you could have these tests organized in the following simple way, using regular function-scoped fixtures. note how every fixture is set up in the order that pytest guarantees, and also how tests only use the fixtures they need. when now, the ...

P.S. this kind of scope is perhaps not a scope at all and might be a step towards not using scopes at all.
A few comments:
It may seem tricky at first, but only because naming things in programming is typically very challenging, and we're using nebulous concepts such as "dogs" and "cats" here. In programming, namespaces and logical structures are used for organization, and the same is used here. Namespaces, provided through logical structures, allow you to represent branches in context. In the case of testing, that means dependencies and the behavior under test. For example, you could have a package named after the behavior it covers. The end result could be something like this:
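For instance, a hedged sketch with made-up names and assertions, just to show the shape of a behavior-first namespace:

import pytest

class TestLogin:
    @pytest.fixture(scope="class")
    def login_response(self):
        # "arrange" and "act" once for everything in this namespace; placeholder data
        return {"status": 200, "location": "/dashboard"}

    def test_succeeds(self, login_response):
        assert login_response["status"] == 200

    def test_redirects_to_dashboard(self, login_response):
        assert login_response["location"] == "/dashboard"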
Quite the contrary. Pytest being able to optimize the logical-structure-based approach is more of a secondary effect of the structure. The main focus of the structuring is to organize the tests in a coherent and conventional way. Finding tests becomes easy because I can just follow the namespaces to locate a certain collection of tests or even fixtures, and even identify if there's a hole in coverage just from that (e.g. having a ...). You also get a few cool benefits, like being able to easily target a collection of tests for execution. For example, if someone modified the login behavior, you can just do ... Engineers use namespaces and logical structures like this because it's the best way of organizing things around functionality/behavior when writing code. That is why I recommend using it here. Not because of optimization (even though pytest provides it in this case), but because after almost 2 centuries of programming (I know, hard to believe, but it's true), this really is the best way of handling this that we've come up with, and it's the convention in every other aspect of programming. Some things to note:
or
Then this tells you upfront what your structure should generally look like. What form each of these take is up to you and your preferences/needs, but namespaces are an incredibly useful tool and a perfect solution for this. |
i think these two points boil down to the same thing. sure, if your test structure and the properties of your fixtures are similar, you can have the structure and naming as in your example. but this only fits the simplest cases. to give a counterexample, suppose your code is making animal food and accessories, so your test structure looks like...

now if you were to reuse the fixture ...
i doubt anyone would find this acceptable.

P.S. this is a completely different topic, but instead of classes, pytest could use nested functions, e.g.

def tests_with_charlie(charlie):
    def test_charlie_name():
        assert charlie.name == "charlie"

    def test_charlie_age():
        assert charlie.age == 7

with this approach, it would be technically possible to get rid of scopes completely. you can see that both tests use the same fixture, so it would only have to be set up once, e.g.

@pytest.fixture(reusable=True)
def charlie():
    ...

the major downside to this is that it would involve much more “magic”
I think that may be the piece of info you're missing. "Uproot" is a very strong word. It's really just cutting and pasting (and maybe some removing ...). There's no need to restructure other fixtures or even the fixture you want to be available in another area. As long as the fixture is in a namespace directly above where you want to have access to it, then you can request it. This is what the conftest.py files are for. The scope of the moved fixture doesn't even have to change. Where the fixture is defined has no real impact (other than coincidence) on when in the order of operations a fixture will execute. So you'd only have to move that one fixture. For example, if I go with your first example of ... (I added the ...
Then I could have a ... It's super handy, and I do want to emphasize that you can have more than one conftest.py.
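A minimal sketch of that cut-and-paste, with made-up names: the fixture keeps its scope, it just lives one level up so several test modules can request it:

# tests/conftest.py -- shared by test_name.py, test_age.py, etc.
import pytest

@pytest.fixture(scope="module")
def charlie():
    # expensive setup; still runs at most once per module that requests it
    yield "charlie"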
i'm not sure how this can work? if you are to use the same instance of the fixture in those different places ...
you can't use a class-scoped fixture, or even a module-scoped one, can you?
Not the same instance, no. My solution is just for providing the same fixture, not for providing the same instance of the fixture. But that brings us back to relying on logical structures and the scopes tied to them. Taking a look back at your original comment with more context, it looks like what you are looking for isn't really possible, or at least wouldn't result in something congruent. Your examples up until now have only had two fixtures with the ...
if we are not optimizing, we can just use function-scoped fixtures...
what exactly is unpredictable?
i don't think it can? how do you mean? |
You absolutely could use function-scoped fixtures exclusively. However, I never said don't optimize. Optimization is just a secondary concern. My primary goal is organization, and scopes larger than that level can help build a mental model by showing how dependencies are shared. But structuring them this way also makes optimizing trivial, because I've already laid out things based on dependencies, so I just have to specify whether a fixture should run once for everything under its umbrella, or if it should re-execute for every iteration on a certain scope. For example, to adapt my solution above so that ...

Pytest doesn't really optimize for you. But it gives you the tools that make optimizing very easy through scopes and structures. In other words, it doesn't do the optimization work for you; you line the dominoes up, and it follows along. That's what's being talked about here (although I think I should probably update that bit in the docs to make it clearer). Things also just get a little janky at the function level, so I try to stay away from that.
When there isn't a clear, explicit chain of dependencies in fixtures, pytest has to make assumptions about which fixture should execute before another. For example, in this case:

@pytest.fixture
def x():
    pass

@pytest.fixture
def y(x):
    pass

def test_thing(y, x):
    pass

it's clear that x has to execute before y, because y requests it. But in this case:

@pytest.fixture
def x():
    pass

@pytest.fixture
def y():
    pass

def test_thing(y, x):
    pass

it's unclear which should execute first, so pytest has to pick one to happen first. However, no matter how many other tests are also run, those other tests can't influence the order that pytest chooses for that test, and of the possible ways it could choose, none of them really conflict with each other. The proposed scope, though, is a different story. Consider the following fixtures and tests:

@pytest.fixture(scope="only the tests that use this fixture")
def x():
    pass

@pytest.fixture(scope="only the tests that use this fixture")
def y():
    pass

@pytest.fixture(scope="only the tests that use this fixture")
def z():
    pass

def test_x(x):
    pass

def test_x_y(x, y):
    pass

def test_y(y):
    pass

def test_y_z(y, z):
    pass

def test_z(z):
    pass

def test_z_x(z, x):
    pass

The order of operations for just calling ...
with these groupings:

<x>
  <test_x />
  <y>
    <test_x_y />
  </y>
  <z>
    <test_z_x />
  </z>
</x>

with these groupings:

<y>
  <test_y />
  <x>
    <test_x_y />
  </x>
  <z>
    <test_y_z />
  </z>
</y>

with these groupings:

<z>
  <test_z />
  <x>
    <test_z_x />
  </x>
  <y>
    <test_y_z />
  </y>
</z>

While the sorting within the outermost group may vary, the nesting is predictable. These are all in direct conflict with each other. What makes it unpredictable is the fact that a given test can be absolutely certain what its order of operations will be, but only some of the time, as pytest would then consider all tests that it would be attempting to run before deciding what the OoO for that test would be. For example, in this case, if ...
but it is clear? according to the documentation, ...
but this is wrong? you have fixture ...
similarly, for ...
and for ...
or for all tests ...
there's no ambiguity whatsoever?
Unfortunately, no. I see where that's referenced in the docs, but it's actually incorrect, and it's quite easy to prove. If those fixtures were module-scoped, and there was another test function with the order reversed, it would be ambiguous. This comes up every once in a while, but nothing outside of the fixture scope/dependency system is intended to control order of operations, because it would fall apart too easily otherwise. If you need something to happen before something else, the only reliable way to do this is to either have the latter thing request the former thing, rely on the latter thing having a smaller scope then the former thing, or have the former thing be autouse and make sure the latter thing isn't requested by any autouse things (that last one isn't too dependable for the same reason). The fact that it works is merely coincidence and a result of a deterministic fixture pipeline (deterministic in the literal sense). The requested fixtures are stored as a list, and iteration over a list is deterministic (this is actually touching the area of that serious bug we talked about, and I believe this or something related came up on one of the tickets). I should make a note to fix that... There's actually a lot of stuff in there I should touch up haha
There is nothing to indicate z can't be in x, unless certain tests aren't executed, which isn't always the case, hence the ambiguous ambiguity. Like I said, unless you rely on scope or an explicit dependency chain (or autouse, but that's kinda iffy), then the rules become ambiguous. |
so, you are saying that the documentation is wrong, and that ambiguity is acceptable? and then you are arguing against a proposal that does away with ambiguity?
however wrong, the pytest documentation is still indicating that. also, the text of my proposal is indicating that. again, as i mentioned, how the tests are actually run is quite orthogonal to it. if you welcome ambiguity, your conflicting examples in the comment before the last one will do the job. if you don't welcome it, you can have a very predictable test structure. |
The documentation is incorrect, yes, but the ambiguity isn't a question of acceptability. I'm saying that ambiguity is unavoidable for pytest as a framework, as I demonstrated with the bit about swapping the order and making the fixtures module-scoped, and the example with the x, y, and z fixtures.
No, I'm arguing against a proposal that allows the order of operations of tests to be affected by the existence/execution of other tests. I abhor ambiguity. And that's exactly why I don't want pytest making decisions about the order of operations on my behalf. Your proposal forces pytest to make decisions on your behalf with regards to the order of operations, and do so differently based on which tests are running. Since it's making decisions for you, that means there's still ambiguity. It only becomes less ambiguous in the context of you executing a select subset of your tests, but even then, it requires pytest to consider all the tests that will be executed for that test run before it (or you) can know how to make it less ambiguous. Pytest provides all the tools necessary that allow us to eliminate ambiguity as it stands. The tools don't guarantee ambiguity will be eliminated, but that's unavoidable with most things in programming. If someone uses the fixture scopes as they are now, and leaves ambiguity, that's not good either. If you don't want ambiguity, the only way to actually solve it is to establish clear dependency chains. |
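A minimal sketch of what such an explicit chain looks like for the earlier x/y pair; once y requests x, there is only one possible setup order no matter which other tests run:

import pytest

@pytest.fixture
def x():
    return ["x"]

@pytest.fixture
def y(x):
    x.append("y")
    return x

def test_y_x(y):
    # x is guaranteed to have run first, because y explicitly depends on it
    assert y == ["x", "y"]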
okay, just to be clear. do you find this test acceptable?

@pytest.fixture
def x():
    pass

@pytest.fixture
def y():
    pass

def test_y_x(y, x):
    pass

you are saying:
this is also true about my proposal, isn't it? so if you find this acceptable, should you not find my proposal acceptable?
i think i showed how the order of execution of tests in my proposal, regardless of the above, does not in any way change the setup plan of individual fixtures. do you mean something else by “order of operations of tests”?
would you explain what decisions pytest would be making? if a fixture ...
i don't see why it's unavoidable in principle. if you just remove explicit scopes altogether, you won't have this problem
Ah, no. That I do not find acceptable, because there is ambiguity: it's unclear whether x or y should execute first. My goal with the x/y/z example was to demonstrate that your proposal doesn't get rid of ambiguity, because pytest would still have to decide which fixtures execute before the others, and it would change based on the tests that would be running for that test run.
I see what you're saying, but your proposal does not do enough to eliminate ambiguity because a given test can still have a different fixture execution order depending on what other tests are running at that time. At that point, it wouldn't bring anything to the table we don't already have with how fixtures currently work. As I demonstrated in the x/y/z example, pytest still has to decide which fixture executes before another. In that x/y/z example, if I run
with these groupings:

<z>
  <test_z />
  <x>
    <test_z_x />
  </x>
  <y>
    <test_y_z />
  </y>
</z>

and if I run ...

with these groupings:

<y>
  <test_y />
  <x>
    <test_x_y />
  </x>
  <z>
    <test_y_z />
  </z>
</y>

It's not specified whether ... Pytest had to decide which would execute before the other in both cases, because the information was not provided by the programmer.
Again, the documentation is wrong, and needs to be corrected. Fixture request order in a given test/fixture signature simply cannot guarantee fixture execution order, and this can be demonstrated like so:

@pytest.fixture(scope="module")
def order():
    return []

@pytest.fixture(scope="module")
def x(order):
    order.append("x")

@pytest.fixture(scope="module")
def y(order):
    order.append("y")

def test_x_y(x, y, order):
    assert order == ["x", "y"]

def test_y_x(y, x, order):
    assert order == ["y", "x"]

If either test is run in isolation, they will pass. But if run as a suite, one will always fail, despite each one providing what you believe to be clear instructions on which fixture should execute first.
Removing explicit scopes would reduce everything to the function scope level, as pytest wouldn't be able to assume which fixtures don't need to be re-executed for certain groups of tests, and wouldn't eliminate the ambiguity. Pytest would still have to decide for you which fixtures go before others if clear dependencies aren't established explicitly. If everything were reduced to the function level because explicit scopes were removed, and the order fixtures are requested in fixture/test signatures did control execution order, then it still wouldn't eliminate ambiguity because of autouse fixtures. For example:

@pytest.fixture
def order():
    return []

@pytest.fixture(autouse=True)
def y(order):
    order.append("y")

@pytest.fixture(autouse=True)
def x(order):
    order.append("x")

def test_order(order):
    assert order == ["y", "x"]

If explicit scopes and autouse fixtures were eliminated, and fixture request order did control fixture execution order, then fixture execution order could be determined in exactly the same way as MRO, and only then would ambiguity be eliminated (an algorithm other than MRO could be used, but then it would just be inconsistent with Python as a whole), because you'd be forced to explicitly state what is dependent on what, and in what order. But that's already a requirement if you want to eliminate ambiguity with how pytest currently works. If you then make a new proposal after eliminating scope and autouse fixtures to somehow mark fixtures so that they know which groups of tests they don't have to re-execute between, you'd have gone full circle and reimplemented scopes. That's why the potential for ambiguity is unavoidable in pytest.
Hello. Thanks for a great test framework. I have not worked long with pytest and I'm sorry if there already is a way to handle this.
I want to use pytest for testing on hardware where we can have several configurations. It is easy to set configurations before each test, but it unfortunately takes up to a minute (as our hw needs to be restarted after each config change). So to save time I want to group tests that have the same configurations.
I can easily use both marks and fixtures to set or act on configurations, but unfortunately I have not found any way to tell pytest which configurations of our hardware are incompatible.
Here is an example where we get into problems:
The fixtures will group tests to minimize the number of open fixtures. However, they do not know that the fixtures set_config_1_to_1 and set_config_1_to_2 cannot both be active at the same time.
I would also like this to work through an entire session so we can run all tests in our repos with one config, then change config, run the next config, and so on. Currently my solution is to handle these configurations outside pytest and run the applicable ones based on whatever configurations are active.
That has the downside that tests are run multiple times, as many combinations of configurations are applicable for some of our tests, leading to longer test times.
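One hedged way to move that "run only the applicable tests" step into pytest itself is a small conftest hook: tests would declare the config they need with a hypothetical requires_config mark, and anything not matching the currently active config (here an illustrative --active-config option) gets deselected:

# conftest.py (illustrative sketch)

def pytest_addoption(parser):
    parser.addoption("--active-config", action="store", default=None,
                     help="name of the hardware config currently applied")

def pytest_collection_modifyitems(config, items):
    active = config.getoption("--active-config")
    if active is None:
        return
    selected, deselected = [], []
    for item in items:
        marker = item.get_closest_marker("requires_config")
        if marker is None or active in marker.args:
            selected.append(item)
        else:
            deselected.append(item)
    if deselected:
        config.hook.pytest_deselected(items=deselected)
        items[:] = selected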