Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempt to recover from cancel scope nesting violations #1061

Merged
merged 5 commits into from
Jun 4, 2019

Conversation

oremanj
Copy link
Member

@oremanj oremanj commented May 21, 2019

When exiting an outer cancel scope before some inner cancel scope nested inside it, Trio would previously raise a confusing chain of internal errors. One common source of this misbehavior was an async generator that yielded within a cancel scope. Now the mis-nesting condition will be noticed and in many cases the user will receive a single RuntimeError with a useful traceback. This is a best-effort feature, and it's still possible to fool Trio if you try hard enough, such as by closing a nursery's cancel scope from within some child task of the nursery.

Fixes #882. Relevant to #1056, #264.

When exiting an outer cancel scope before some inner cancel scope nested inside it, Trio would previously raise a confusing chain of internal errors.
One common source of this misbehavior was an async generator that yielded within a cancel scope.
Now the mis-nesting condition will be noticed and in many cases the user will receive a single RuntimeError with a useful traceback.
This is a best-effort feature, and it's still possible to fool Trio if you try hard enough, such as by closing a nursery's cancel scope
from within some child task of the nursery.

Fixes python-trio#882. Relevant to python-trio#1056, python-trio#264.
@oremanj oremanj requested a review from njsmith May 21, 2019 08:17
@codecov
Copy link

codecov bot commented May 21, 2019

Codecov Report

Merging #1061 into master will increase coverage by <.01%.
The diff coverage is 94%.

@@            Coverage Diff             @@
##           master    #1061      +/-   ##
==========================================
+ Coverage   99.15%   99.16%   +<.01%     
==========================================
  Files         102      102              
  Lines       12374    12465      +91     
  Branches      923      936      +13     
==========================================
+ Hits        12270    12361      +91     
  Misses         83       83              
  Partials       21       21
Impacted Files Coverage Δ
trio/_core/tests/test_run.py 100% <100%> (ø) ⬆️
trio/_core/_run.py 97.75% <87.5%> (+0.12%) ⬆️

@belm0
Copy link
Member

belm0 commented May 27, 2019

For the lazy, would you post an example traceback with and without this change?

@oremanj
Copy link
Member Author

oremanj commented May 29, 2019

$ cat t.py
import trio
from contextlib import ExitStack

@trio.run
async def example():
    outer = trio.CancelScope()
    inner = trio.CancelScope()
    with ExitStack() as stack:
        stack.enter_context(outer)
        with inner:
            stack.close()

With this change:

Traceback (most recent call last):
  File "t.py", line 4, in <module>
    @trio.run
  File "/Users/oremanj/rc/trio/trio/_core/_run.py", line 1687, in run
    raise runner.main_task_outcome.error
  File "t.py", line 11, in example
    stack.close()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/contextlib.py", line 519, in close
    self.__exit__(None, None, None)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/contextlib.py", line 511, in __exit__
    raise exc_details[1]
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/contextlib.py", line 496, in __exit__
    if cb(*exc_details):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/contextlib.py", line 377, in _exit_wrapper
    return cm_exit(cm, exc_type, exc, tb)
  File "/Users/oremanj/rc/trio/trio/_core/_ki.py", line 167, in wrapper
    return fn(*args, **kwargs)
  File "/Users/oremanj/rc/trio/trio/_core/_run.py", line 464, in __exit__
    raise remaining_error_after_cancel_scope
RuntimeError: Cancel scope stack corrupted: attempted to exit <trio.CancelScope at 0x1043c7320, active> in <Task '__main__.example' at 0x1043df208> that's still within its child <trio.CancelScope at 0x1043c7388, active>

Typically this is caused by one of the following:
  - yielding within a generator or async generator that's opened a cancel
    scope or nursery (unless the generator is a @contextmanager or
    @asynccontextmanager); see https://github.com/python-trio/trio/issues/638
  - manually calling __enter__ or __exit__ on a trio.CancelScope, or
    __aenter__ or __aexit__ on the object returned by trio.open_nursery();
    doing so correctly is difficult and you should use @[async]contextmanager
    instead, or maybe [Async]ExitStack
  - using [Async]ExitStack to interleave the entries/exits of cancel scopes
    and/or nurseries in a way that couldn't be achieved by some nesting of
    'with' and 'async with' blocks
  - using the low-level coroutine object protocol to execute some parts of
    an async function in a different cancel scope/nursery context than
    other parts
If you don't believe you're doing any of these things, please file a bug:
https://github.com/python-trio/trio/issues/new

Without this change:

$ python t.py
Traceback (most recent call last):
  File "/Users/oremanj/rc/triovenv37/lib/python3.7/site-packages/trio/_core/_run.py", line 1430, in run
    run_impl(runner, async_fn, args)
  File "/Users/oremanj/rc/triovenv37/lib/python3.7/site-packages/trio/_core/_run.py", line 1579, in run_impl
    runner.task_exited(task, final_outcome)
  File "/Users/oremanj/rc/triovenv37/lib/python3.7/site-packages/trio/_core/_run.py", line 1050, in task_exited
    task._cancel_stack[-1]._remove_task(task)
  File "/Users/oremanj/rc/triovenv37/lib/python3.7/site-packages/trio/_core/_run.py", line 347, in _remove_task
    self._tasks.remove(task)
KeyError: <Task '__main__.example' at 0x105074940>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "t.py", line 4, in <module>
    @trio.run
  File "/Users/oremanj/rc/triovenv37/lib/python3.7/site-packages/trio/_core/_run.py", line 1436, in run
    ) from exc
trio.TrioInternalError: internal error in trio - please file a bug!
Exception ignored in: <function Nursery.__del__ at 0x104e3f620>
Traceback (most recent call last):
  File "/Users/oremanj/rc/triovenv37/lib/python3.7/site-packages/trio/_core/_run.py", line 638, in __del__
AssertionError:

# of a cancel status that's been closed. This is used to permit
# recovery from mis-nested cancel scopes (well, at least enough
# recovery to show a useful traceback).
abandoned_by_misnesting = attr.ib(default=False, init=False, repr=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a new public API?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CancelStatus isn't part of the public API, so abandoned_by_misnesting and encloses() aren't either. They are non-underscore-prefixed because they're safe to use by other code in _run without worrying about CancelStatus's invariants.

trio/_core/_run.py Show resolved Hide resolved
trio/_core/_run.py Outdated Show resolved Hide resolved
@oremanj
Copy link
Member Author

oremanj commented Jun 4, 2019

Codecov report is nonsensical on this one too.

@njsmith
Copy link
Member

njsmith commented Jun 4, 2019

Looks good – let's see if it helps :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failing to close a CancelScope produces super confusing errors
3 participants