Define "GPU" as a worker resource #1401
base: branch-24.12
Conversation
```python
resources = dict(pair.split("=") for pair in resources)
resources = valmap(float, resources)
gpu_resources = valmap(int, itemfilter(lambda item: item[0] == "GPU", resources))
resources = valmap(float, itemfilter(lambda item: item[0] != "GPU", resources))
```
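To make the intended split concrete, here is a pure-Python sketch of the parsing above, using dict comprehensions in place of toolz's `valmap`/`itemfilter`; the function name and the exact key-based predicate are my assumptions about the intent:

```python
def parse_resources(pairs):
    """Parse "NAME=VALUE" strings into (gpu_resources, other_resources).

    Assumption: "GPU" counts stay ints, while every other resource
    (e.g. "MEMORY") is coerced to float, as Distributed stores them.
    """
    resources = dict(pair.split("=") for pair in pairs)
    gpu_resources = {k: int(v) for k, v in resources.items() if k == "GPU"}
    other_resources = {k: float(v) for k, v in resources.items() if k != "GPU"}
    return gpu_resources, other_resources


gpu, other = parse_resources(["GPU=2", "MEMORY=10e9"])
# gpu == {"GPU": 2}, other == {"MEMORY": 10e9}
```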
@jacobtomlinson you wrote this section originally; I presume mapping the values to float was done to support a definition such as "MEMORY". Since I don't see any tests or any other explicit mention in Dask-CUDA, could you confirm whether that's right, and whether you can think of a more robust way to handle types here than the "GPU" / not-"GPU" split I wrote above?
Honestly, I don't remember the reasoning. I think Dask handles these values as floats; as you say, it's to support values like MEMORY or other arbitrary quantities.
```python
if "resources" in worker_kwargs:
    if "GPU" not in worker_kwargs["resources"]:
        worker_kwargs["resources"]["GPU"] = 1
else:
    worker_kwargs["resources"] = {"GPU": 1}
```
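The defaulting above can also be expressed as a small helper; this is only a sketch of the same behavior, and the function name is hypothetical:

```python
def ensure_gpu_resource(worker_kwargs):
    # Hypothetical helper: make sure each CUDA worker advertises a
    # "GPU" resource, without overwriting one the user already set.
    resources = worker_kwargs.setdefault("resources", {})
    resources.setdefault("GPU", 1)
    return worker_kwargs


ensure_gpu_resource({})                         # {"resources": {"GPU": 1}}
ensure_gpu_resource({"resources": {"GPU": 2}})  # user's value is preserved
```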
I wonder if it makes sense to use `{"cuda": 1}` instead (i.e., use "cuda" as the resource name for a CUDA-capable GPU)? That would align with the convention used in deep learning frameworks such as PyTorch.
I wouldn't mind personally, but the keyword GPU is documented in the Distributed worker resources docs, so it seems like the more universal and better-documented option.
Okay, nice. That's good enough for me.
Add "GPU" as a worker resource to each CUDA worker. This should help users identify whether the available workers provide a GPU resource.
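As a sketch of how a user might do that identification, assuming the usual shape of `client.scheduler_info()` in Distributed (where each worker entry carries a `resources` mapping), filtering for GPU-capable workers could look like this; the helper name is hypothetical:

```python
def workers_with_resource(scheduler_info, name="GPU"):
    # Hypothetical helper: return addresses of workers whose
    # scheduler_info entry advertises the given resource.
    return [
        addr
        for addr, info in scheduler_info["workers"].items()
        if name in info.get("resources", {})
    ]


# Example with a hand-built, scheduler_info-shaped dict:
info = {
    "workers": {
        "tcp://10.0.0.1:8786": {"resources": {"GPU": 1}},
        "tcp://10.0.0.2:8786": {"resources": {}},
    }
}
# workers_with_resource(info) -> ["tcp://10.0.0.1:8786"]
```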