[BUG] Cutlass python does not detect GPU #1919
Comments
Can you please list the version of |
@jackkosaian Here is everything I could find installed in Python related to CUDA and NVIDIA; my CUDA Python version is 12.6.1: `[email protected]@altek1:~$ pip3.10 list | grep nvidia [notice] A new release of pip is available: 24.0 -> 24.3.1 Executing nvidia-smi and nvcc --version: |
Thanks. Does the issue also occur if you install the version of the CUTLASS Python interface available on PyPI? This can be installed via: pip install nvidia-cutlass (You'll probably need to uninstall any version of the CUTLASS Python interface you may have previously installed via |
@jackkosaian Apparently it was something related to Python 3.10, because I got the same error even with the command you suggested. When I repeated all the steps with pip3 and Python 3.8, the error did not appear. However, when running the following example, the program detects the GPU but seems to use it only partially, at 1-6% utilization. With all the tests it is even lower, at 0-1% and only 300 MB out of 32 GB. I wonder if this is normal. `import cutlass plan = cutlass.op.Gemm(element=np.float16, layout=cutlass.LayoutType.RowMajor) |
Interesting on the Python 3.10 vs. 3.8 finding. Regarding GPU utilization: the GEMM you're running and those in unit tests are not large enough to show high utilization of the GPU via |
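To put the utilization question in numbers, here is a rough back-of-the-envelope sketch. It assumes a peak rate of about 125 TFLOP/s, which is NVIDIA's published FP16 Tensor Core figure for the V100 (SXM2); real kernels reach only part of that, so treat the times as lower bounds rather than measurements. The function names are made up for this illustration.

```python
# Rough estimate of how much work a square FP16 GEMM represents.
# ASSUMPTION: 125 TFLOP/s is the published peak FP16 Tensor Core rate of a
# V100 (SXM2); actual kernels achieve only a fraction of it.
PEAK_FLOPS = 125e12


def gemm_flops(m: int, n: int, k: int) -> int:
    """A GEMM performs m*n*k multiply-adds, i.e. 2*m*n*k floating-point ops."""
    return 2 * m * n * k


def ideal_seconds(m: int, n: int, k: int, peak: float = PEAK_FLOPS) -> float:
    """Lower bound on kernel time if the GPU ran at its peak rate."""
    return gemm_flops(m, n, k) / peak


def operand_bytes(m: int, n: int, k: int, bytes_per_element: int = 2) -> int:
    """Memory for A (m x k), B (k x n), and C and D (m x n each) in FP16."""
    return (m * k + k * n + 2 * m * n) * bytes_per_element


for size in (1024, 8192):
    micros = ideal_seconds(size, size, size) * 1e6
    mib = operand_bytes(size, size, size) / 2**20
    print(f"{size}^3 GEMM: >= {micros:.1f} us at peak, {mib:.0f} MiB of operands")
```

Under these assumptions the 1024 x 1024 x 1024 problem needs only tens of microseconds of GPU time and 8 MiB of operand data, so a single run occupies the device for a tiny fraction of a monitoring tool's sampling interval. Most of the wall-clock time goes to Python, compilation, and host-device copies.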
Describe the bug
I am trying to build CUTLASS Python from source and use it.
My environment consists of Ubuntu 18.04, CUDA 11.8, an NVIDIA Tesla V100 (Volta) GPU, Python 3.10, make 3.19, and GCC 9.4.0.
I successfully built and compiled CUTLASS following the guidelines here. Now I want to build the CUTLASS Python interface so that I can use PyTorch with CUTLASS. However, when I follow the guidelines in /python, it fails because it does not detect the GPU.
Steps/Code to reproduce bug
I have executed
`pip install -e .`
in the root directory /cutlass, and it works fine because pip detects and compiles CUTLASS; in fact, if I run `pip list | grep nvidia`, it shows `nvidia-cutlass 3.6.0.0`. However, running a test or this basic example fails:
`import cutlass
import numpy as np
plan = cutlass.op.Gemm(element=np.float16, layout=cutlass.LayoutType.RowMajor)
A, B, C, D = [np.ones((1024, 1024), dtype=np.float16) for i in range(4)]
plan.run(A, B, C, D)`
And the output error is:
File "/mnt/beegfs/gap/[email protected]/cutlass/test/python/cutlass/conv2d/test.py", line 4, in <module>
plan = cutlass.op.Gemm(element=np.float16, layout=cutlass.LayoutType.RowMajor)
File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/op/gemm.py", line 224, in __init__
super().__init__(cc=cc, kernel_cc=kernel_cc)
File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/op/op.py", line 72, in __init__
self.cc = cc if cc is not None else device_cc()
File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/backend/utils/device.py", line 77, in device_cc
device = cutlass.device_id()
File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/__init__.py", line 176, in device_id
initialize_cuda_context()
File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/__init__.py", line 163, in initialize_cuda_context
raise RuntimeError(f"cudaFree failed with error {err}")
RuntimeError: cudaFree failed with error 3
Or if I run the tests:
`======================================================================
ERROR: conv2d_sm80 (unittest.loader._FailedTest)
ImportError: Failed to import test module: conv2d_sm80
Traceback (most recent call last):
File "/usr/lib/python3.10/unittest/loader.py", line 436, in _find_test_path
module = self._get_module_from_name(name)
File "/usr/lib/python3.10/unittest/loader.py", line 377, in _get_module_from_name
__import__(name)
File "/mnt/beegfs/gap/[email protected]/cutlass/test/python/cutlass/conv2d/conv2d_sm80.py", line 50, in <module>
@unittest.skipIf(device_cc() < cc, 'Device compute capability is invalid for SM80 tests.')
File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/backend/utils/device.py", line 77, in device_cc
device = cutlass.device_id()
File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/__init__.py", line 176, in device_id
initialize_cuda_context()
File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/__init__.py", line 163, in initialize_cuda_context
raise RuntimeError(f"cudaFree failed with error {err}")
RuntimeError: cudaFree failed with error 3
Ran 1 test in 0.002s
FAILED (errors=1)
Traceback (most recent call last):
File "/mnt/beegfs/gap/[email protected]/cutlass/test/python/cutlass/conv2d/run_all_tests.py", line 44, in <module>
raise Exception('Test cases failed')
Exception: Test cases failed
[email protected]@altek1:~/cutlass/test/python/cutlass/conv2d$ python3.10 test.py
Traceback (most recent call last):
File "/mnt/beegfs/gap/[email protected]/cutlass/test/python/cutlass/conv2d/test.py", line 4, in <module>
plan = cutlass.op.Gemm(element=np.float16, layout=cutlass.LayoutType.RowMajor)
File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/op/gemm.py", line 224, in __init__
super().__init__(cc=cc, kernel_cc=kernel_cc)
File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/op/op.py", line 72, in __init__
self.cc = cc if cc is not None else device_cc()
File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/backend/utils/device.py", line 77, in device_cc
device = cutlass.device_id()
File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/__init__.py", line 176, in device_id
initialize_cuda_context()
File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/__init__.py", line 163, in initialize_cuda_context
raise RuntimeError(f"cudaFree failed with error {err}")
RuntimeError: cudaFree failed with error 3`
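The number in `cudaFree failed with error 3` is a CUDA runtime `cudaError_t` value. As a reading aid, the sketch below maps a few of those values to their names. The table is a small hand-copied subset of the enum in the CUDA runtime headers, and `describe_cuda_failure` is a hypothetical helper written for this example, not part of CUTLASS. It only names the error; it does not say what caused it here.

```python
import re

# Illustrative subset of CUDA runtime cudaError_t values (not exhaustive).
CUDA_RUNTIME_ERRORS = {
    0: "cudaSuccess",
    1: "cudaErrorInvalidValue",
    2: "cudaErrorMemoryAllocation",
    3: "cudaErrorInitializationError",
    35: "cudaErrorInsufficientDriver",
    100: "cudaErrorNoDevice",
    101: "cudaErrorInvalidDevice",
}


def describe_cuda_failure(message: str) -> str:
    """Extract the numeric code from a 'failed with error N' message and name it."""
    match = re.search(r"failed with error (\d+)", message)
    if match is None:
        return "no numeric CUDA error code found"
    code = int(match.group(1))
    return CUDA_RUNTIME_ERRORS.get(code, f"unknown cudaError_t value {code}")


print(describe_cuda_failure("RuntimeError: cudaFree failed with error 3"))
```

Code 3 is `cudaErrorInitializationError`, meaning the runtime could not initialize at all, which is different from code 100 (`cudaErrorNoDevice`), where no GPU is visible.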
I would like to know if I omitted any step, because I didn't modify the CUDA or PATH variables; CMake detected CUDA automatically. I just ran:
$ mkdir build && cd build
$ cmake .. -DCUTLASS_NVCC_ARCHS=70
$ make -j$(nproc)
And it worked fine.
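For context on the `70` in `-DCUTLASS_NVCC_ARCHS=70`: the V100 has compute capability 7.0, and the value is the major and minor digits joined together. The sketch below also mirrors the `device_cc() < cc` condition from the `conv2d_sm80` traceback, assuming `device_cc()` returns an integer in the same form. Both helper names are invented for this example.

```python
def arch_from_capability(major: int, minor: int) -> int:
    """Compute capability 7.0 becomes 70, the form passed to CUTLASS_NVCC_ARCHS."""
    return major * 10 + minor


def should_skip(device_cc: int, required_cc: int) -> bool:
    """Same comparison as the unittest.skipIf condition in the traceback."""
    return device_cc < required_cc


v100 = arch_from_capability(7, 0)
print(v100)                   # arch value for a V100
print(should_skip(v100, 80))  # whether SM80-only tests would be skipped
```

If that assumption holds, the SM80 conv2d tests would be skipped on a V100 once the GPU is detected, so they are not a good signal for whether the installation works.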
Any help would be appreciated.
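Since `pip` and `python3.10` do not necessarily belong to the same interpreter, one quick check is to ask the interpreter itself which NVIDIA-related packages it can see. This is a minimal sketch using only the standard library; the keyword list and the `related_packages` name are choices made for this example.

```python
import sys
from importlib import metadata

# Substrings used to pick out relevant distributions (an arbitrary choice).
KEYWORDS = ("nvidia", "cuda", "cutlass")


def related_packages(keywords=KEYWORDS):
    """Return {name: version} for installed distributions matching any keyword."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name and any(key in name.lower() for key in keywords):
            found[name] = dist.version
    return dict(sorted(found.items()))


if __name__ == "__main__":
    # Shows which interpreter is running and what it has installed.
    print("interpreter:", sys.executable)
    print("python version:", sys.version.split()[0])
    for name, version in related_packages().items():
        print(f"{name}=={version}")
```

Running the same file with each interpreter in turn (for example `python3.10` and `python3.8`) shows whether both see the same `nvidia-cutlass` and CUDA Python versions.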