NVMLError(ret) pynvml.nvml.NVMLError_InvalidArgument: Invalid Argument #168

Open
GabrielZZZ opened this issue Oct 13, 2022 · 3 comments

@GabrielZZZ

Hi,

I am using Ubuntu 20.04 with an NVIDIA RTX 3090. When I follow the instructions to train the model, it always fails with this error:
  File "/opt/conda/lib/python3.8/site-packages/pynvml/nvml.py", line 366, in check_return
    raise NVMLError(ret)
pynvml.nvml.NVMLError_InvalidArgument: Invalid Argument
Does anyone know any possible solutions? That would be very helpful.

@Gienapp

Gienapp commented Nov 7, 2022

Hi, I'm having the same problem. Did you come up with a solution?

@GabrielZZZ
Author

I am afraid not. It appears the forum is not very active.

@Gienapp

Gienapp commented Nov 8, 2022

For me, the solution was to change the value of nproc_per_node in the training command from 8 to the number of GPUs my server actually has.
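
A likely explanation, though not confirmed in this thread, is that each launched process queries NVML for a GPU by its local rank, and a rank beyond the last physical GPU is rejected as an invalid argument. Here is a minimal sketch for picking a safe value before launching. The helper names (count_gpus, safe_nproc, nproc_flag) are hypothetical and not part of this project, and counting GPUs by parsing `nvidia-smi -L` is just one way to do it:

```python
import shutil
import subprocess


def count_gpus() -> int:
    """Count GPUs by parsing `nvidia-smi -L`; return 0 if the tool is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return 0
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return 0
    # Each physical GPU is listed on a line that starts with "GPU <index>:".
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))


def safe_nproc(requested: int, available: int) -> int:
    """Never start more processes per node than there are GPUs."""
    if available < 1:
        raise ValueError("no GPUs detected")
    return min(requested, available)


def nproc_flag(requested: int, available: int) -> str:
    """Build the nproc_per_node flag to pass in the training command."""
    return f"--nproc_per_node={safe_nproc(requested, available)}"


if __name__ == "__main__":
    gpus = count_gpus()
    # 8 is the value from the original training instructions.
    print(nproc_flag(8, gpus) if gpus else "no GPUs detected")
```

On a machine with a single RTX 3090, this would replace the 8 from the instructions with 1.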
