pyNVML won't work on a Jetson, is there a workaround #400
Comments
So we use pynvml in two places:
|
A Jetson with a discrete GPU is extremely unlikely; for practical purposes they don't exist outside of NVIDIA DRIVE units. We could easily wrap the affinity functionality in a statement such as |
It might be possible to detect affinity through |
Alternatively, if there is only one GPU on a Jetson, does device affinity do anything? |
Theoretically there is no device affinity on a Jetson: the GPU and CPU share the same chunk of RAM and don't have to communicate over the PCI bus. |
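For illustration, here is a minimal sketch of what wrapping the affinity lookup could look like. It is not the dask-cuda implementation: the helper names `unpack_bitmask`, `all_cpus` and `cpu_affinity_or_all` are made up for this example, and it assumes NVML reports affinity as 64-bit mask words. If pynvml is missing or NVML cannot be initialised (as on a Jetson), it falls back to every CPU the process may use.

```python
import os


def unpack_bitmask(words, bits_per_word=64):
    """Turn a sequence of integer bitmask words into a list of CPU indices."""
    cpus = []
    for word_index, word in enumerate(words):
        for bit in range(bits_per_word):
            if (int(word) >> bit) & 1:
                cpus.append(word_index * bits_per_word + bit)
    return cpus


def all_cpus():
    """CPUs this process may run on; used when NVML is unavailable."""
    if hasattr(os, "sched_getaffinity"):
        return sorted(os.sched_getaffinity(0))
    return list(range(os.cpu_count() or 1))


def cpu_affinity_or_all(device_index):
    """Ask NVML for the CPUs close to a GPU, else fall back to every CPU."""
    try:
        import pynvml  # may be absent, or present without a usable NVML

        pynvml.nvmlInit()
        handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
        n_words = -(-(os.cpu_count() or 1) // 64)  # ceiling division
        words = pynvml.nvmlDeviceGetCpuAffinity(handle, n_words)
        return unpack_bitmask(words) or all_cpus()
    except Exception:
        # Deliberately broad for this sketch: ImportError, NVMLError, etc.
        return all_cpus()
```

On a board with unified memory the fallback is harmless, since there is no PCI locality to exploit in the first place.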
Do any of the Jetson boards have multiple GPUs, @JasonAtNvidia? Note that dask-cuda targets a one-process-per-GPU model for parallelism, so if none of the boards have multiple GPUs you may not have much use for dask-cuda anyway. If there are multi-GPU Jetsons, is there a reliable way to query whether the system is running on a Jetson? We can certainly add some conditions and work around pyNVML; we do something similar for the DGXs in dask-cuda/dask_cuda/tests/test_dgx.py Lines 30 to 40 in 8d42f27
|
There are Jetson boards with multiple GPU capability; DRIVE units are the most common. They have a Xavier SoM and a Turing daughter board. The Linux for Tegra (L4T) distribution has a file in |
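A rough sketch of a platform check built on that idea follows. The function name `looks_like_tegra` is hypothetical, and the marker file locations are left to the caller on purpose, because the exact path is not given here. The aarch64 test alone is not sufficient (plenty of non-Jetson machines are aarch64), so it is only used to rule systems out.

```python
import os
import platform


def looks_like_tegra(marker_paths, machine=None):
    """Heuristic: aarch64 machine AND at least one marker file exists.

    marker_paths: caller-supplied candidate files that only exist on the
    target distribution (no specific path is assumed by this sketch).
    machine: override for platform.machine(), mainly useful for testing.
    """
    machine = machine or platform.machine()
    if machine != "aarch64":
        return False
    return any(os.path.isfile(path) for path in marker_paths)
```

A caller would pass the release file mentioned above as the single entry in `marker_paths` and take the non-NVML code path whenever the function returns True.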
Sorry for the late reply here, @JasonAtNvidia. When you say multiple GPU capability, you're saying that you can address each process with
As long as we can choose each GPU correctly, these should let us detect the platform reliably so we can work around the current NVML limitation. As soon as you confirm we can indeed use |
@pentschev I do not have a multi-GPU Jetson device to test with, but I can verify that CUDA_VISIBLE_DEVICES=0 succeeds and CUDA_VISIBLE_DEVICES=1 fails with an error that no device is found. I will try to find a multi-GPU device to test with. |
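For anyone who wants to repeat this check on other boards, the probing can be scripted along the following lines. This is only a sketch: `probe_device` and `ECHO_PROBE` are names invented for this example, and the default probe merely echoes the variable so the script runs anywhere. A real probe would replace it with a short snippet that initialises a CUDA library and exits non-zero when no device is found.

```python
import os
import subprocess
import sys

# Placeholder probe: it only echoes the variable. A real probe would import a
# CUDA library here and exit with a non-zero status when no device is found.
ECHO_PROBE = "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"


def probe_device(index, probe_code=ECHO_PROBE):
    """Run probe_code in a child process restricted to one device index.

    Returns (succeeded, stdout) where succeeded means exit status 0.
    """
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(index))
    result = subprocess.run(
        [sys.executable, "-c", probe_code],
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stdout.strip()
```

Running the probe in a child process matters because CUDA libraries generally read CUDA_VISIBLE_DEVICES once at initialisation, so changing it inside an already-initialised process has no effect.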
@JasonAtNvidia I just pushed #402, which should work with Tegra. I don't have access to a Tegra device, so it would be great if you could test it when you have a chance. |
@pentschev
|
@JasonAtNvidia those are the correct functions. It would be interesting to know if you can go any further to do some Dask computation as well, but as I mentioned before, you won't see any benefits in using dask-cuda with a single GPU vs just using the library (e.g., CuPy, cuDF, etc.) you're trying to compute with alone. |
This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d. |
This issue has been labeled |
There is no NVML library on aarch64 NVIDIA Jetson. That will break many libraries relying on it, such as cuxfilter. The geospatial and cuxfilter libraries are among the most requested for Jetson and I'd love to make them work. Is there a way to use Numba functions to replace pyNVML in this library?
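To make the question concrete, here is a sketch of the kind of queries Numba can answer without NVML: device count and memory on the current device. The helper names are made up for this example, and I have not confirmed the behaviour on a Jetson. Both helpers return a neutral value when Numba or a CUDA driver is missing.

```python
def cuda_device_count():
    """Number of CUDA devices Numba can see, or 0 if CUDA is unusable."""
    try:
        from numba import cuda

        if not cuda.is_available():
            return 0
        return len(cuda.gpus)
    except Exception:
        # Numba not installed, or the driver failed to initialise.
        return 0


def current_device_memory():
    """Return (free_bytes, total_bytes) for the current device, or None."""
    try:
        from numba import cuda

        if not cuda.is_available():
            return None
        free, total = cuda.current_context().get_memory_info()
        return int(free), int(total)
    except Exception:
        return None
```

As far as I know Numba has no equivalent of the CPU affinity query, so that part would still need a separate fallback.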