cuda.cudart.getLocalRuntimeVersion() raises RuntimeError: Failed to dlopen libcudart.so.12 #89

Closed
Matt711 opened this issue Sep 10, 2024 · 10 comments
Labels: bug (Something isn't working), cuda.bindings (Everything related to the cuda.bindings module), P0 (High priority - Must do!)

Comments

@Matt711
Member

Matt711 commented Sep 10, 2024

Is this a bug? getLocalRuntimeVersion() fails for me in a CUDA 11.8 environment. I'm asking because the API call is listed in the cuda-python 11.8 release notes.

In the source code, we're hard-coding libcudart.so.12. Is that right?

Repro

In [1]: from cuda import cudart

In [2]: cudart.getLocalRuntimeVersion()
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[2], line 1
----> 1 cudart.getLocalRuntimeVersion()

File ~/.conda/envs/rapids/lib/python3.11/site-packages/cuda/cudart.pyx:24961, in cuda.cudart.getLocalRuntimeVersion()

File ~/.conda/envs/rapids/lib/python3.11/site-packages/cuda/ccudart.pyx:2365, in cuda.ccudart.getLocalRuntimeVersion()

File ~/.conda/envs/rapids/lib/python3.11/site-packages/cuda/_lib/ccudart/ccudart.pyx:2121, in cuda._lib.ccudart.ccudart._getLocalRuntimeVersion()

RuntimeError: Failed to dlopen libcudart.so.12
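
For anyone hitting the same error, a quick way to see which runtime libraries the dynamic loader can actually find is sketched below. This is only a diagnostic under stated assumptions: the helper name `loadable_sonames` and the candidate list are hypothetical, and the exact library file names depend on how the CUDA toolkit was packaged (CUDA 11 toolkits commonly ship the runtime as `libcudart.so.11.0`).

```python
import ctypes

# Candidate names are an assumption; adjust them to match your installation.
CANDIDATES = ("libcudart.so.11.0", "libcudart.so.11", "libcudart.so.12")


def loadable_sonames(candidates=CANDIDATES):
    """Return the subset of candidate library names that can be loaded."""
    found = []
    for soname in candidates:
        try:
            ctypes.CDLL(soname)  # uses dlopen on Linux
        except OSError:
            continue  # not installed, or not on the loader search path
        found.append(soname)
    return found


print(loadable_sonames())
```

In a CUDA 11.8 environment, `libcudart.so.12` would not be expected in the printed list, which would match the `Failed to dlopen libcudart.so.12` message above.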
@github-actions github-actions bot added the triage Needs the team's attention label Sep 10, 2024
@Matt711
Member Author

Matt711 commented Sep 10, 2024

xref rmm/1675

@leofang
Member

leofang commented Sep 10, 2024

It seems to be a backport mistake that we should fix:

    # Load
    handle = dlfcn.dlopen('libcudart.so.12', dlfcn.RTLD_NOW)
    if handle == NULL:
        with gil:
            raise RuntimeError(f'Failed to dlopen libcudart.so.12')
    __cudaRuntimeGetVersion = dlfcn.dlsym(handle, 'cudaRuntimeGetVersion')

@Matt711 how urgent is this?

@leofang leofang added bug Something isn't working and removed triage Needs the team's attention labels Sep 10, 2024
@leofang leofang added this to the cuda-12-RC1, cuda-11-RC1 milestone Sep 10, 2024
@leofang leofang added the P0 High priority - Must do! label Sep 10, 2024
@Matt711
Member Author

Matt711 commented Sep 10, 2024

> @Matt711 how urgent is this?

Not urgent. We already have a workaround using numba.cuda. I also don't mind working on this @leofang, if you could point me in the right direction.
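
The thread does not show the workaround itself. One plausible shape, assuming it relies on Numba's `cuda.runtime.get_version()` (which reports the runtime version as a `(major, minor)` tuple), is sketched here. The wrapper name `runtime_version_via_numba` is hypothetical, and this is not necessarily what the linked rmm change does.

```python
def runtime_version_via_numba():
    """Return the CUDA runtime version as (major, minor), or None if unavailable."""
    try:
        from numba import cuda

        return tuple(cuda.runtime.get_version())
    except Exception:
        # Numba is not installed, or it could not locate a usable CUDA runtime.
        return None


print(runtime_version_via_numba())
```

Returning `None` instead of raising keeps the caller free to fall back to another detection method.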

@leofang
Member

leofang commented Sep 10, 2024

Thanks, @Matt711. The offending code that I linked to above is from the 11.8.x branch, so ideally we can just change the lines referencing libcudart.so.12 to .11. But we're transitioning to a new development/release process, so let me check with @vzhurba01 later today first and get back to you.
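
To illustrate the idea of loading a runtime library that matches the branch rather than a hard-coded `libcudart.so.12`, here is a rough pure-Python equivalent using `ctypes`. It is a sketch, not the actual Cython patch: the function names and the soname list passed in are assumptions, while `cudaRuntimeGetVersion` is the real runtime entry point and the version encoding (1000 * major + 10 * minor) is the one CUDA uses.

```python
import ctypes


def decode_cuda_version(encoded):
    """Split CUDA's integer version (1000 * major + 10 * minor) into (major, minor)."""
    return encoded // 1000, (encoded % 1000) // 10


def local_runtime_version(sonames):
    """Try each library name in order and return the encoded runtime version.

    Raises RuntimeError if none of the names can be loaded.
    """
    for soname in sonames:
        try:
            lib = ctypes.CDLL(soname)
        except OSError:
            continue  # try the next candidate name
        version = ctypes.c_int(0)
        status = lib.cudaRuntimeGetVersion(ctypes.byref(version))
        if status != 0:  # 0 is cudaSuccess
            raise RuntimeError(f"cudaRuntimeGetVersion failed with error {status}")
        return version.value
    raise RuntimeError(f"Failed to dlopen any of {list(sonames)}")
```

A caller on the 11.8.x line might pass something like `["libcudart.so.11.0"]` and then feed the result to `decode_cuda_version`; the exact file name to use depends on the installation.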

@leofang
Member

leofang commented Sep 10, 2024

@Matt711 we discussed this and will try to get a new 11.8.x release out next week, with this bug fixed and perhaps also #75 backported.

@Matt711
Member Author

Matt711 commented Sep 11, 2024

Thanks @leofang

@wence-

wence- commented Sep 30, 2024

@leofang, @vzhurba01 did this backport/release occur?

@vzhurba01
Collaborator

Not yet. The wheels and conda packages are currently going through pre-release validation. I'll update this issue once posting is complete.

@vzhurba01
Collaborator

FYI I've updated the repo with the fix under the patch release 11.8.4 (tag v11.8.4).

I created a new issue, #139, to track the wheels/conda uploads for this patch release. I'm thinking of keeping this issue open until they are uploaded, and then giving notice before closing.

@leofang leofang mentioned this issue Oct 8, 2024
@leofang leofang added the cuda.bindings Everything related to the cuda.bindings module label Oct 10, 2024
@vzhurba01
Collaborator

https://pypi.org/project/cuda-python/11.8.4/
https://anaconda.org/nvidia/cuda-python/files?version=11.8.4

Issue #139 is now complete: both PyPI and Conda (nvidia channel) have been updated with the 11.8.4 patch. Thank you all for your patience.
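
To confirm that an environment picked up the patch, the installed package version can be compared against 11.8.4. The sketch below assumes the distribution is installed under the name `cuda-python` (as in the PyPI link above); the helper names are hypothetical, and the comparison is only meaningful for environments on the 11.8.x line.

```python
import re
from importlib import metadata

FIXED_IN = (11, 8, 4)  # patch release named in this thread


def parse_release(version):
    """Parse the leading numeric part of a version string into a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(part) for part in match.group().split(".")) if match else ()


def at_least(version, minimum=FIXED_IN):
    """Return True if the version string is at or above the given minimum."""
    return parse_release(version) >= minimum


def installed_cuda_python_is_patched():
    """Return True/False for the installed cuda-python, or None if it is absent."""
    try:
        return at_least(metadata.version("cuda-python"))
    except metadata.PackageNotFoundError:
        return None


print(installed_cuda_python_is_patched())
```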
