[BUG] Cutlass python does not detect GPU #1919

Open
IzanCatalan opened this issue Nov 5, 2024 · 5 comments
Labels: ? - Needs Triage, bug (Something isn't working)


@IzanCatalan

Describe the bug
I am trying to build the CUTLASS Python interface from source.
My environment is Ubuntu 18.04, CUDA 11.8, an NVIDIA Tesla V100 (Volta) GPU, Python 3.10, CMake 3.19, and GCC 9.4.0.
I successfully built and compiled CUTLASS following the guidelines here. Now I want to build the CUTLASS Python interface so that I can use PyTorch with CUTLASS. However, when I follow the guidelines in /python, it fails because it does not detect the GPU.

Steps/Code to reproduce bug
I executed `pip install -e .` in the root directory /cutlass, and that part works fine: pip detects and builds CUTLASS, and `pip list | grep nvidia` shows nvidia-cutlass 3.6.0.0. However, running the tests, or even this basic example, fails:

```python
import cutlass
import numpy as np

plan = cutlass.op.Gemm(element=np.float16, layout=cutlass.LayoutType.RowMajor)
A, B, C, D = [np.ones((1024, 1024), dtype=np.float16) for i in range(4)]
plan.run(A, B, C, D)
```

And the output error is:
File "/mnt/beegfs/gap/[email protected]/cutlass/test/python/cutlass/conv2d/test.py", line 4, in <module> plan = cutlass.op.Gemm(element=np.float16, layout=cutlass.LayoutType.RowMajor) File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/op/gemm.py", line 224, in __init__ super().__init__(cc=cc, kernel_cc=kernel_cc) File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/op/op.py", line 72, in __init__ self.cc = cc if cc is not None else device_cc() File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/backend/utils/device.py", line 77, in device_cc device = cutlass.device_id() File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/__init__.py", line 176, in device_id initialize_cuda_context() File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/__init__.py", line 163, in initialize_cuda_context raise RuntimeError(f"cudaFree failed with error {err}") RuntimeError: cudaFree failed with error 3

Or if I run the tests:

```
======================================================================
ERROR: conv2d_sm80 (unittest.loader._FailedTest)

ImportError: Failed to import test module: conv2d_sm80
Traceback (most recent call last):
  File "/usr/lib/python3.10/unittest/loader.py", line 436, in _find_test_path
    module = self._get_module_from_name(name)
  File "/usr/lib/python3.10/unittest/loader.py", line 377, in _get_module_from_name
    __import__(name)
  File "/mnt/beegfs/gap/[email protected]/cutlass/test/python/cutlass/conv2d/conv2d_sm80.py", line 50, in <module>
    @unittest.skipIf(device_cc() < cc, 'Device compute capability is invalid for SM80 tests.')
  File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/backend/utils/device.py", line 77, in device_cc
    device = cutlass.device_id()
  File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/__init__.py", line 176, in device_id
    initialize_cuda_context()
  File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/__init__.py", line 163, in initialize_cuda_context
    raise RuntimeError(f"cudaFree failed with error {err}")
RuntimeError: cudaFree failed with error 3


Ran 1 test in 0.002s

FAILED (errors=1)
Traceback (most recent call last):
  File "/mnt/beegfs/gap/[email protected]/cutlass/test/python/cutlass/conv2d/run_all_tests.py", line 44, in <module>
    raise Exception('Test cases failed')
Exception: Test cases failed
[email protected]@altek1:~/cutlass/test/python/cutlass/conv2d$ python3.10 test.py
Traceback (most recent call last):
  File "/mnt/beegfs/gap/[email protected]/cutlass/test/python/cutlass/conv2d/test.py", line 4, in <module>
    plan = cutlass.op.Gemm(element=np.float16, layout=cutlass.LayoutType.RowMajor)
  File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/op/gemm.py", line 224, in __init__
    super().__init__(cc=cc, kernel_cc=kernel_cc)
  File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/op/op.py", line 72, in __init__
    self.cc = cc if cc is not None else device_cc()
  File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/backend/utils/device.py", line 77, in device_cc
    device = cutlass.device_id()
  File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/__init__.py", line 176, in device_id
    initialize_cuda_context()
  File "/mnt/beegfs/gap/[email protected]/cutlass/python/cutlass/__init__.py", line 163, in initialize_cuda_context
    raise RuntimeError(f"cudaFree failed with error {err}")
RuntimeError: cudaFree failed with error 3
```

I would like to know if I omitted any step, because I didn't modify the CUDA or PATH variables; CMake detected everything automatically. I just did:

```
mkdir build && cd build
cmake .. -DCUTLASS_NVCC_ARCHS=70
make -j$(nproc)
```

and it worked fine.

Any help would be appreciated.

@IzanCatalan added the labels `? - Needs Triage` and `bug` on Nov 5, 2024
@jackkosaian
Contributor

Can you please list the version of cuda-python installed on your system?

@IzanCatalan
Author

IzanCatalan commented Nov 5, 2024

@jackkosaian Here is everything installed via pip that I could find related to CUDA and NVIDIA; my cuda-python version is 12.6.1:

```
[email protected]@altek1:~$ pip3.10 list | grep nvidia
nvidia-cublas-cu11 11.11.3.6
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu11 11.8.87
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu11 11.8.89
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu11 11.8.89
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu11 9.1.0.70
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu11 10.9.0.58
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu11 10.3.0.86
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu11 11.4.1.48
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu11 11.7.5.86
nvidia-cusparse-cu12 12.1.0.106
nvidia-cutlass 3.6.0.0 /mnt/beegfs/gap/[email protected]/cutlass
nvidia-nccl-cu11 2.20.5
nvidia-nccl-cu12 2.18.1
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu11 11.8.86
nvidia-nvtx-cu12 12.1.105

[notice] A new release of pip is available: 24.0 -> 24.3.1
[notice] To update, run: python3.10 -m pip install --upgrade pip
[email protected]@altek1:~$ pip3.10 list | grep cuda
cuda-python 12.6.1
nvidia-cuda-cupti-cu11 11.8.87
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu11 11.8.89
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu11 11.8.89
nvidia-cuda-runtime-cu12 12.1.105
```

Output of `nvidia-smi` and `nvcc --version`:

[screenshot of `nvidia-smi` and `nvcc --version` output attached in the original issue]
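(Since this environment mixes cu11 and cu12 wheels with a CUDA 11.8 toolkit and cuda-python 12.6.1, the following is a minimal sketch, assuming cuda-python is importable, of how one could check which driver and runtime versions the Python process actually sees.)

```python
# Minimal sketch, assuming cuda-python is installed: report the driver and
# runtime versions visible to this Python process, to spot a possible
# CUDA 11.8 toolkit vs. cuda-python 12.x mismatch.
from cuda import cudart

err, driver_version = cudart.cudaDriverGetVersion()
err, runtime_version = cudart.cudaRuntimeGetVersion()
print("driver:", driver_version, "runtime:", runtime_version)  # e.g. 11080 means 11.8
```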

@jackkosaian
Contributor

Thanks.

Does the issue also occur if you install the version of the CUTLASS Python interface available on PyPI?

This can be installed via:

`pip install nvidia-cutlass`

(You'll probably need to uninstall any version of the CUTLASS Python interface you may have previously installed via `pip install -e .`)

@IzanCatalan
Author

@jackkosaian Apparently it was something related to Python 3.10, because I got the same error even with the command you suggested. When I repeated all the steps with pip3 and Python 3.8, the error did not appear. However, when running the example below, the GPU does run the program but seems to be only partially used, with just 1-6% utilization. With the full test suite it is even lower, 0-1% and only 300 MB out of 32 GB of memory. I wonder if this is normal.

```python
import cutlass
import numpy as np

plan = cutlass.op.Gemm(element=np.float16, layout=cutlass.LayoutType.RowMajor)
A, B, C, D = [np.ones((1024, 1024), dtype=np.float16) for i in range(4)]
plan.run(A, B, C, D)
```

@jackkosaian
Contributor

Interesting on the Python 3.10 vs. 3.8 finding.

Regarding GPU utilization: the GEMM you're running and those in the unit tests are not large enough to show high utilization of the GPU via nvidia-smi (assuming that's how you're measuring compute and memory utilization).
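(For illustration, here is a hedged sketch using the same CUTLASS Python API as the example above, but with a larger, hypothetical problem size and repeated launches; the size and loop count are illustrative and not from this thread. Run repeatedly like this, the kernel should show noticeably higher utilization in nvidia-smi than the single 1024×1024 GEMM.)

```python
# Sketch only: same API as the thread's example, but a larger, hypothetical
# problem size launched repeatedly so utilization becomes visible in nvidia-smi.
import cutlass
import numpy as np

N = 8192  # illustrative size, not from the original thread
plan = cutlass.op.Gemm(element=np.float16, layout=cutlass.LayoutType.RowMajor)
A, B, C, D = [np.ones((N, N), dtype=np.float16) for _ in range(4)]

for _ in range(50):        # repeated launches keep the GPU busy long enough to observe
    plan.run(A, B, C, D)
```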
