Implement a strategy for handling (async) generator cleanup #265
Issue python-trio#23 was closed as a duplicate of python-trio#265. Have the docs link to the open issue.
Here's an idea: what if we scope async generators to the task that first iterated them? That is: the task that first iterates a generator becomes responsible for it, and any of its generators that are still alive when it exits get closed at that point.
There's some subtlety here as far as treating the nested child sorta-task like a task for these purposes, and scoping asyncgens created there to the direct-parent nursery rather than to the lifetime of the overarching task that opened that nursery, but I think that's mostly details. This is only a partial solution; it still doesn't deal correctly with asyncgens that create nurseries/cancel scopes.
From reading the discussion in #264, it seems like the desired behavior here would be to limit each turn of agen() to 0.8 seconds, and the whole thing (including time spent by caller() when agen() is suspended) to 6 seconds. That gives you a shorter-lived cancel scope (in caller()) more deeply in the cancel scope stack than a longer-lived one (in agen()). I guess context variables are supposed to solve this? I feel vaguely like it should be solvable without them, but can't currently think of how. I guess cancel scopes could detect that they were created inside an asyncgen (via... stack introspection? ugh) and mark themselves accordingly.
Thinking about this more:
Expanding on that last bullet point: we want to distinguish
from
(the second should interrupt the sleep while the first should not). Luckily, the firstiter hook is called when the coroutine returned by asend() is constructed, before the send(None) is performed and thus before any of the async generator code is executed. So if we add asyncgens to the current task's bookkeeping from inside the firstiter hook, we can tell those two cases apart.

I'm sure the devil will be in the details, but can you see any missing pieces in the overall idea?

None of this helps with regular non-async generators that create a cancel scope. But since there's not much point to making a cancel scope that you can't put any awaits in, I don't think there's any use for such a generator besides decorating it with @contextmanager, and we've established we like the existing behavior in that case.
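To make the hook half of that concrete, here's a minimal sketch (the per-task `asyncgens` set is invented bookkeeping for illustration, not a real attribute of trio's Task):

```python
import sys
import weakref

import trio

def firstiter(agen):
    # Runs synchronously when asend()/__anext__ constructs its coroutine,
    # before any of the generator's code executes, so "current task" here
    # really is the task performing the first iteration.
    task = trio.lowlevel.current_task()
    task.asyncgens.add(weakref.ref(agen))  # invented per-task bookkeeping

def finalizer(agen):
    # Called by the GC for an un-exhausted generator; a real version
    # would arrange for it to be aclose()d in a safe task context.
    ...

sys.set_asyncgen_hooks(firstiter=firstiter, finalizer=finalizer)
```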
Even more wrinkles: asyncgens can iterate over other asyncgens! I don't have time to think through the implications of this right now; mostly leaving a note here for my future self. I can't immediately come up with any reasons this would invalidate the approach I'm proposing, but it probably does at least mean we need to be careful about the order in which we apply the asyncgen cancel scopes to the cancel scope stack.

Also, `ag_await is None` is not enough to know an asyncgen is suspended at a yield -- it could just be exhausted. The condition we want for is-currently-suspended-via-yield-to-caller is `ag_await is None and ag_frame is not None`.
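As a helper, that check would be something like:

```python
def suspended_at_yield(agen):
    # ag_frame goes to None once the generator is exhausted; ag_await is
    # non-None while it's blocked in an inner await rather than parked
    # at a yield expression.
    return agen.ag_await is None and agen.ag_frame is not None
```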
**Random technical notes**

I was definitely going to ask about bpo-32526; that's a pretty obscure issue, so nice catch :-).

It's also legal to create an async generator in one task and then iterate it in another task. Actually I've even used this in an example... in the live-coded happy eyeballs, I cheat a little bit, because I claim that it's doing the same thing as the twisted happy eyeballs code, but twisted's version actually supports incremental results:

```python
async def fake_incremental_getaddrinfo(*args, **kwargs):
    for target in await getaddrinfo(*args, **kwargs):
        yield target
```

Then in the happy eyeballs code, each child task, instead of indexing into a pre-resolved list of targets, does:

```python
try:
    target = await targets_aiter.__anext__()
except StopAsyncIteration:
    return
```

You also need to do a bit of fiddling with the bookkeeping around that.

I suppose one could keep a global list of all living async generators, and scan them whenever a cancel scope is created or a task traps, but this would be way too expensive. (It might even be too expensive if we could restrict to a single task's generators... these are pretty performance-sensitive operations.) Another potential way to map cancel scopes to async generators would be to use the frame objects involved.

**But let's step back a bit**

The bigger question is what semantics we're even going for here. I think in this issue we should try to focus on just the GC issues, since they're simpler. (I added a link back to here from #264 though so the comments don't get forgotten.)

If you want to work on this, then probably the first thing that would be useful would be to figure out exactly what Python is doing right now, with no hooks defined :-). I wrote down a guess in the first post, but I haven't checked the details (which may require reading CPython's source and stuff).

There's also a simple thing we could do that might be better than what we do now, not sure: guarantee that all async generators will get aclose()d before the run loop exits.
Oh, I should also note that if we do decide to start using the async gen hooks, then as part of that we'll probably want to implement support for those hooks in the async_generator library.
Thanks for the happy-eyeballs example; that does seem like something we shouldn't gratuitously forbid. python-trio/async_generator#14 has the GC hooks support, and AFAICT behaves identically to native async generators whether or not a finalizer hook is installed (and your hypothesis in the first post on this thread matches the results of my investigation).

GC: I like the idea of doing aclose_forcefully() to finalize a generator, if no one put an aclosing() block around it to make sure the potentially-blocking operations in its cleanup don't leak out of the cancel scope intended for them. As far as where to do this (for purposes of exception propagation), how do you feel about "the nearest enclosing task that's still around"? That is, we'd install a firstiter hook that adds a weakref to the async generator object to a list in the current Task, as above, but instead of closing async generators when their task completes, we'd just roll them up to be async generators of the parent task. And the finalizer hook would make it so aclose_forcefully() gets called in the context of the task in whose asyncgens list this generator currently lives (at the time of its GC, whatever that might be).

Cancel scopes: Unfortunately, you can't weakly reference a frame object. I guess we could still keep a set of the cancel scopes opened inside each async generator, and have the yielding side take them off the stack while the generator is parked at a yield,
which might be implemented something like the following sketch.
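(Here `CancelScope.suspend()` and `resume()` are hypothetical APIs that deactivate and reactivate a scope without exiting it, and `_current_asyncgen_scopes()` is invented bookkeeping; none of this exists in trio today.)

```python
from contextlib import contextmanager

from async_generator import yield_ as _plain_yield_

@contextmanager
def _scopes_suspended(scopes):
    # Take this generator's cancel scopes off the effective stack while
    # control is yielded back to the caller, and put them back on the
    # next asend().
    for scope in reversed(scopes):
        scope.suspend()      # hypothetical API
    try:
        yield
    finally:
        for scope in scopes:
            scope.resume()   # hypothetical API

async def yield_(value=None):
    scopes = _current_asyncgen_scopes()  # invented bookkeeping lookup
    with _scopes_suspended(scopes):
        return await _plain_yield_(value)
```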
And then we just have to implement CancelScope.suspend().
Er, nix the yield_from_ bit; clearly we do want to keep our cancel scopes active until the delegate actually yields something. But we could make async_generator.yield_from_ take an optional yield_ parameter, and say e.g.:
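Perhaps something like this sketch (the `yield_` keyword argument to `yield_from_` is the proposed extension, not an existing API, and `_scopes_suspended`/`_current_asyncgen_scopes` are the invented helpers from the sketch above):

```python
from async_generator import async_generator, yield_, yield_from_

async def _suspending_yield_(value=None):
    # Used only for the delegate's actual yields, so our cancel scopes
    # stay active while the delegate runs between yields.
    with _scopes_suspended(_current_asyncgen_scopes()):
        return await yield_(value)

@async_generator
async def outer():
    # Delegate to another asyncgen, routing its yields through the
    # suspending wrapper. make_inner_agen is a stand-in for whatever
    # asyncgen is being delegated to.
    await yield_from_(make_inner_agen(), yield_=_suspending_yield_)
```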
Another idea: still assuming we make async_generator.yield_from_ support a user-specified yield_ function, we could move the cancel scope "stack switching" to the side doing the iteration rather than the side doing the yielding, and then people can write native-syntax async generators if they don't care about 3.5 compatibility:
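Something along these lines, perhaps (every helper here is invented bookkeeping for the stack switching described above):

```python
async def iterate_with_own_scopes(agen):
    # The iterating side reinstalls the generator's own cancel scopes
    # around each turn, and takes them back off the stack whenever the
    # generator yields back to us.
    while True:
        _push_asyncgen_scopes(agen)        # invented bookkeeping
        try:
            value = await agen.__anext__()
        except StopAsyncIteration:
            return
        finally:
            _pop_asyncgen_scopes(agen)     # invented bookkeeping
        yield value
```

Consumers would then write `async for x in iterate_with_own_scopes(native_agen()): ...` while the generator itself uses native syntax.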
So interestingly, in aiohttp we were getting a very strange issue where a task was abandoned, followed by a surprising cleanup (demonstrated in a gist). Note how the generator except/finally doesn't have the same task as when it was entered, and further that it's a completely different task that I did not create.
@thehesiod yes, that gist is showing the asyncgen gc hooks working: the generator's aclose() gets scheduled by asyncio's finalizer hook in a brand-new task created just for that purpose, which is why the except/finally sees a different task than the one that was iterating the generator.
This is unrelated, but do we have a workaround for PEP 533 currently? Specifically, for the case where an async generator gets abandoned partway through iteration.

The proposal sounds great BTW -- how it should've been from the beginning. Thank you!
@devxpy We don't have a general workaround, and I don't think there's any prospect of 533 making it into 3.8. But I think the discussion in #638 may be getting closer to an acceptable workaround – in particular starting here: #638 (comment)

The idea is instead of having a generator that you start and stop and then have to figure out how to handle it being abandoned in the "stop" state, you run the "generator" as a concurrent task that sends values to the loop.
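For concreteness, that pattern with present-day trio APIs looks something like this (the function names and the zero-size buffer are arbitrary choices):

```python
import trio

async def produce(send_channel):
    # The "generator": a real task, so its cleanup runs inside the
    # nursery's scope instead of at some arbitrary GC point.
    async with send_channel:
        for i in range(5):
            await trio.sleep(0.1)  # stand-in for real async work
            await send_channel.send(i)

async def main():
    async with trio.open_nursery() as nursery:
        send_channel, receive_channel = trio.open_memory_channel(0)
        nursery.start_soon(produce, send_channel)
        async with receive_channel:
            async for value in receive_channel:  # the consuming "loop"
                print(value)

trio.run(main)
```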
With another year+ of experience, I think we can see better what's going on here. There are two problems: (1) async generators that get garbage collected while the run loop is still running, and (2) async generators that are still alive when the run loop exits.
The whole "discard the exception" thing makes me wince, but that's how Python handles exceptions in `__del__`.

Implementing this is straightforward, though there are some fiddly details that have to be taken care of. In particular: with the async generator hooks you have to handle both the case where the objects are GCed while the run loop is running, and the case where they outlive the run loop (in which case you forcibly "GC" them when the run loop exits). And you have to be prepared for the hooks to be called from any thread, so you need to use a thread-safe way of getting back into the run loop, like `TrioToken.run_sync_soon`.
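The rough shape of that machinery (`run_sync_soon` and `RunFinishedError` are real trio APIs; the rest is invented bookkeeping, not trio's actual implementation):

```python
import sys
import weakref

import trio

def install_asyncgen_hooks(token):
    alive = weakref.WeakSet()  # every asyncgen this run loop is tracking

    def firstiter(agen):
        # Track it so any survivors can be forcibly finalized at loop exit.
        alive.add(agen)

    def finalizer(agen):
        # The GC may fire this from any thread, possibly after the run
        # loop has exited, so the only safe move is to bounce the work
        # into the run loop through the token.
        try:
            token.run_sync_soon(_finalize_in_system_task, agen)  # invented
        except trio.RunFinishedError:
            # Run loop already gone; a real implementation forcibly
            # finalizes still-alive generators during loop shutdown
            # precisely so that this branch stays rare.
            pass

    sys.set_asyncgen_hooks(firstiter=firstiter, finalizer=finalizer)
    return alive
```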
Some discussion of this starting here: https://gitter.im/python-trio/general?at=5c5e2cb5d1e3093ab5eb34ad |
Example of a user getting very confused by async generator/async with/trio/gc interactions: https://gitter.im/python-trio/general?at=5d475c7be939ab2447f895aa

The generator code was pretty simple:

```python
async def read_next_line_from_file_async(self, file_path):
    async with await trio.open_file(file_path) as f:
        async for line in f:
            yield line
```

When it got garbage collected, the GC injected `GeneratorExit` at the `yield`.

If Trio had used the async gen GC hooks to move the cleanup into a pre-cancelled system task, then that would have solved their issue completely.
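For contrast, the deterministic spelling that sidesteps the problem, using `aclosing` from the async_generator library (`contextlib.aclosing` on 3.10+ works the same; this assumes a free-function version of the generator above):

```python
from async_generator import aclosing

async def print_first_match(file_path, needle):
    async with aclosing(read_next_line_from_file_async(file_path)) as lines:
        async for line in lines:
            if needle in line:
                print(line)
                break
    # aclose() has already run here, inside our task and cancel scopes,
    # instead of at some arbitrary later GC point.
```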
I wonder if this can be of any help: https://bugs.python.org/issue35409 (Async generator might re-throw GeneratorExit on aclose()). I came across it in https://github.com/vxgmichel/aiostream/blob/master/aiostream/aiter_utils.py#L179:L185.
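The defensive trick there amounts to something like this (a paraphrase, not aiostream's exact code):

```python
async def safe_aclose(agen):
    # bpo-35409: if the generator is already unwinding from a thrown-in
    # GeneratorExit, aclose() can re-raise that GeneratorExit; swallow
    # it so the caller's own cleanup keeps going.
    try:
        await agen.aclose()
    except GeneratorExit:
        pass
```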
I'm working on implementing this. My current plan works like you've suggested by default, but also allows users to create "walled gardens" in which async generator cleanup behaves in a more structured-concurrency-ish way.
I think this offers a nice middle ground between totally acausal/"afterthought" cleanup on the one hand, and strictly writing `async with aclosing(...)` around every generator on the other.
I dunno. The basic cleanup feature is obviously necessary. The "walled garden" idea is interesting, but seems way more tentative/experimental. And that's a lot of different APIs you're suggesting there :-). (I'm particularly dubious of the timeout stuff... there's a lot of room for obscure issues if folks start expecting that objects can keep running after being GC'ed.) So I think I'd like to see the basic feature land first and then think about extensions (possibly in a new issue). (NB it should also be possible to experiment with cleanup managers in tricycle; just overwrite trio's default asyncgen hooks with extended versions.)
There's a problem with generators and async generators: they don't necessarily iterate to the end. Sometimes they get suspended in the middle, and then garbage collected. When this happens, their `__del__` method `throw`s in a `GeneratorExit` exception, so that `finally` blocks and `__exit__` methods get a chance to run. BUT, `__del__` is very weird: it can be run in the middle of any arbitrary python bytecode instruction. It's like a signal handler but even more so.

So for example, consider code along these lines:
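(A representative sketch; the point is just that the `finally` block touches trio's task-local state.)

```python
import trio

async def agen_fn():
    try:
        yield
    finally:
        # Runs whenever the GC happens to collect the abandoned
        # generator -- maybe in another task, maybe in no task at all.
        print("cleanup ran in task:", trio.lowlevel.current_task())

async def main():
    agen = agen_fn()
    await agen.asend(None)  # start the generator, then abandon it

trio.run(main)
```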
Here the generator's `finally` block gets run at some totally arbitrary time, so it could in principle print any task, or crash because no task is currently active, or because it doesn't even run in the trio thread. And similar problems arise for pretty much any access to trio APIs. (Note that even without trio, the same thing happens for `threading.current_thread()`, accessing thread-local storage, etc. But in trio the use of this kind of implicit thread-bound state is much more common, since the run loop itself is accessed via thread-local storage.)

PEP 533 would be the real fix, but that seems to be stalled for now.
For regular generators, we just need to document this; I can't see what else we can do. There are some related issues in #264 that also need documentation; probably they should go in the same section of the manual.
For async generators, this problem is both better and worse. The thing about async generators is that the gc can't actually clean them up directly: it would like to do the same thing as it does for regular generators and have `__del__` throw in `GeneratorExit`, but this doesn't really work, because for an async generator you should use `athrow` and let the coroutine runner iterate it, but `__del__` doesn't have access to a coroutine runner. So PEP 525 defines some hooks that trio can set, to get (a) notification that an async generator has been started, and (b) notification when an async generator is ready for garbage collection.

The intention was that these hooks are used to let the event loop take care of doing `gen.athrow(GeneratorExit)` to emulate how regular generators work. But as we saw above, regular generators are already broken so... we don't want to emulate them. Being able to re-enter trio's run loop doesn't help, because we need a task context. And this is even more true for async generators, since the whole point of an async generator is that it can do async-y things, and if you want to `await` something you need to make sure you have a timeout set, which means you really really really need to make sure that this happens in the original task context where the user was expecting it to happen. Letting an `await` sneak outside of its `with fail_after` is definitely not OK.

So we aren't going to use these hooks in the way they were originally intended. But, until PEP 533 becomes a thing, maybe we can at least use these hooks for harm reduction.
Some options:
- Don't define the hooks at all. In this case I'm not 100% sure what Python does, but I think it does something like: throw in `GeneratorExit` and iterate once, and if it yields again (e.g. because there's an `async with` or a `finally: await ...`) then it gives up and prints an error message. This is basically like synchronous generators, but maybe a little less friendly. It's not awful, but maybe we can do better.

- Use the `firstiter` hook to forbid the use of async generators, unless we have some explicit evidence that someone is taking responsibility for cleaning them up. For example, we could make it so that our @acontextmanager marks its async generators as being cleaned up (though 3.7's `contextlib.asynccontextmanager` is going to be a problem...), and provide an `aclosing` context manager that does the same. This approach was pioneered by Curio, which goes a step further and introspects the generator's bytecode, and if it doesn't see any dangerous operations (`yield` or `yield from` inside an `except`, `finally`, `with`, or `async with`), then it allows those async generators as well. I really don't want to have this kind of bytecode introspection in trio. OTOH without it then async comprehensions like `(2 * i async for i in aiter)` become essentially unusable.

- Use the `finalizer` hook to catch when an async generator is cleaned up, and run it in a context that (a) is carefully set up so that any attempt to access trio state generates a clear error message, and (b) any attempt to `yield` causes a clear error / crashes the program (as opposed to just being printed to stderr, like regular `__del__` does). (If doing this we'd probably also want a `firstiter` hook like asyncio's, so we can force all async generators to be finalized when the run loop exits. This is just a minor refinement, though.)

The good thing about the `finalizer` approach is that it has no false negatives: all async generators that can be run safely are allowed to run. The downside is that it has false positives: if an async generator is written poorly, then it's possible that it will be allowed to run and everything will look fine in tests, but then in production it blows up (e.g. because in tests it's always iterated to the end, and in production someone breaks out of a loop so the generator gets collected while only half-finished). The `firstiter` approach is the opposite: it only allows generators to run if it can be sure that they're safe in all circumstances.

(If PEP 533 is too much for 3.7, then maybe a smaller ask would be set_gen_hooks() that work the same way but for synchronous generators?)