Avoid a crash when there is a timeout when shutting down the Dask cluster #2580

Open: bouweandela wants to merge 2 commits into main

bouweandela (Member) commented Nov 15, 2024

Description

Avoid a crash when shutting down the Dask cluster times out. I have seen this happen recently on Levante at DKRZ. It does not appear to be a real problem, as the cluster is shut down after the tool completes running.


bouweandela added the bug label Nov 15, 2024

codecov bot commented Nov 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.66%. Comparing base (a328578) to head (be726ed).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2580   +/-   ##
=======================================
  Coverage   94.66%   94.66%           
=======================================
  Files         251      251           
  Lines       14287    14290    +3     
=======================================
+ Hits        13525    13528    +3     
  Misses        762      762           

bouweandela marked this pull request as ready for review November 15, 2024 09:34

valeriupredoi (Contributor) left a comment

LGTM bud, maybe tell the user what to do when the cluster fails to shut down? I.e. don't panic, the process will be terminated by the host, etc.

bouweandela (Member, Author) commented Nov 21, 2024

How about "Timeout while trying to shut down the cluster at %s, you may want to check it was stopped."?
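
For readers following along, here is a minimal sketch of the pattern being discussed. It is not the diff in this pull request. It assumes the shutdown timeout surfaces as Python's built-in TimeoutError and that the cluster object exposes a scheduler_address attribute. StubCluster and close_cluster are hypothetical names used only for illustration.

```python
import logging

logger = logging.getLogger(__name__)


class StubCluster:
    """Hypothetical stand-in for a cluster whose close() times out."""

    # Assumed attribute; used only to fill in the warning message.
    scheduler_address = "tcp://127.0.0.1:8786"

    def close(self):
        # Simulate the shutdown timeout described in this pull request.
        raise TimeoutError("shutdown timed out")


def close_cluster(cluster):
    """Close the cluster, logging a warning instead of crashing on timeout.

    Returns True if the cluster closed cleanly and False if it timed out.
    """
    try:
        cluster.close()
    except TimeoutError:
        # Assumption: the timeout arrives as the built-in TimeoutError.
        logger.warning(
            "Timeout while trying to shut down the cluster at %s, "
            "you may want to check it was stopped.",
            cluster.scheduler_address,
        )
        return False
    return True
```

Calling close_cluster(StubCluster()) returns False and emits the warning rather than raising, so a run that has otherwise finished is not reported as a failure.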
