[BUG]: Python thread finalizer can cause thread-local storage to be cleared prematurely in some situations #362

Closed
2 tasks done
dagardner-nv opened this issue Aug 16, 2023 · 0 comments · Fixed by #364 or #365
Assignees
Labels
bug Something isn't working

@dagardner-nv
Contributor

Version

23.11

Which installation method(s) does this occur on?

Docker, Conda, Source

Describe the bug.

When the thread finalizer (https://github.com/nv-morpheus/MRC/blob/branch-23.11/python/mrc/_pymrc/src/executor.cpp#L125) is called, it decrements the GIL, which releases all objects held in thread-local storage.

The issue is that if one of the objects being released also declares a finalizer (https://github.com/cupy/cupy/blob/fd54d95af0110e6bdf840da7a833357e72fbb43b/cupy/fft/_cache.pyx#L152) that in turn calls gc.collect (https://github.com/cupy/cupy/blob/fd54d95af0110e6bdf840da7a833357e72fbb43b/cupy/fft/_cache.pyx#L128), then garbage collection runs while the thread state is in the middle of being destroyed.
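
The pattern described above can be sketched in pure Python. This is only an illustration of the shape of the problem, not a reproducer: it uses a thread created by Python, where the teardown is handled safely. The names `CacheLike`, `_tls`, `worker` and `events` are hypothetical and do not come from MRC or CuPy.

```python
import gc
import threading

events = []   # order in which things happened
idents = {}   # thread identifiers, recorded for comparison


class CacheLike:
    """Hypothetical stand-in for an object kept in thread-local storage."""

    def __del__(self):
        # The finalizer itself triggers a full garbage collection,
        # which is the pattern the issue describes.
        idents["finalizer"] = threading.get_ident()
        events.append("finalizer start")
        gc.collect()
        events.append("finalizer end")


_tls = threading.local()


def worker():
    # Only thread-local storage holds a reference to the object, so it is
    # released when the thread's local state is torn down.
    _tls.cache = CacheLike()
    idents["worker"] = threading.get_ident()
    events.append("worker done")


t = threading.Thread(target=worker)
t.start()
t.join()
gc.collect()  # safety net for interpreters that release the object lazily

print(events)
print(idents)
```

On CPython the finalizer is expected to run on the exiting worker thread, after `worker` has returned, so `gc.collect` executes while that thread's state is being cleared. In a thread created by Python this is harmless; the failure reported here needs the same sequence to occur in a thread created by C++.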

Minimum reproducible example

No response

Relevant log output

No response

Full env printout

No response

Other/Misc.

No response

Code of Conduct

  • I agree to follow MRC's Code of Conduct
  • I have searched the open bugs and have found no duplicates for this bug report
@dagardner-nv dagardner-nv added the bug Something isn't working label Aug 16, 2023
@dagardner-nv dagardner-nv moved this from Todo to In Progress in Morpheus Boards Aug 16, 2023
dagardner-nv added a commit to dagardner-nv/MRC that referenced this issue Aug 17, 2023
…age collector prior to finalizing the thread
dagardner-nv added a commit to dagardner-nv/MRC that referenced this issue Aug 17, 2023
…hon garbage collector prior to finalizing the thread"

This reverts commit 50abe96.
rapids-bot bot pushed a commit to nv-morpheus/utilities that referenced this issue Aug 17, 2023
dagardner-nv added a commit to dagardner-nv/MRC that referenced this issue Aug 18, 2023
@mdemoret-nv mdemoret-nv removed their assignment Aug 22, 2023
@rapids-bot rapids-bot bot closed this as completed in #365 Sep 25, 2023
rapids-bot bot pushed a commit that referenced this issue Sep 25, 2023
* PR replicates issue #362, and will trigger a pybind11 internal error using an un-patched version of pybind11
* Only run IWYU on files changed in PR

Note:
* This bug requires the code in question to be run in a thread created by C++.

fixes #362

Authors:
  - David Gardner (https://github.com/dagardner-nv)

Approvers:
  - Michael Demoret (https://github.com/mdemoret-nv)

URL: #365
@github-project-automation github-project-automation bot moved this from In Progress to Done in Morpheus Boards Sep 25, 2023