-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use a weakref finalizer to stop notifier thread #143
base: branch-0.36
Are you sure you want to change the base?
Use a weakref finalizer to stop notifier thread #143
Conversation
When creating our own application context with the python notifier thread enabled we previously had to arrange to stop it before dropping the reference to the context we created. The consequence of not doing so is that our application hangs in exit waiting for a thread to join that never will. Applying the principle of least surprise, an object should clean up all of its resources when dropping out of scope. To ensure this change Application.stop_notifier_thread to a staticmethod and attach it as a weakref finalizer when starting the notifier thread. This way, we don't have to handle the case where there is no notifier thread, nor do we have to remember to call the stop function manually. This additionally allows us to clean up the top-level reset function since there is no longer a need to stop the notifier "manually" during reset.
This looks like a great solution, thanks for working on it @wence- . I see some tests are failing though because they explicitly call |
I reinstated the top-level call, and will then see if removal also works fine. |
How do we want to handle stopping the notifier in distributed-ucxx? It seems like there it might make sense to still stop the notifier explicitly, rather than waiting for the global application context to go out of scope in atexit teardown. |
Yeah, that was the only way I found to stop the notifier thread "on time" for distributed, if you have other ideas how to make that better they're certainly welcome. If you're worried about the top-level call exclusively, we should be ok removing that and then calling it directly from the |
It seems that all progress thread tests are hanging during the first one (probably after it completes), maybe there's some selfref still in place. Let me know if you'd like a rubber duck to debug this at some point. |
67b17b4
to
e5170e7
Compare
I figured out the problem. The test_benchmark_cluster test relies on an explicit call to |
So I just dropped the removal of |
Hmm, just the one test run failed, but no obvious error. |
Seems like this was an exit code |
I was not yet able to repro. |
When creating our own application context with the python notifier thread enabled we previously had to arrange to stop it before dropping the reference to the context we created. The consequence of not doing so is that our application hangs in exit waiting for a thread to join that never will.
Applying the principle of least surprise, an object should clean up all of its resources when dropping out of scope. To ensure this change Application.stop_notifier_thread to a staticmethod and attach it as a weakref finalizer when starting the notifier thread. This way, we don't have to handle the case where there is no notifier thread, nor do we have to remember to call the stop function manually.
This additionally allows us to clean up the top-level reset function since there is no longer a need to stop the notifier "manually" during reset.