the GPU memory usage sometimes spikes unexpectedly #37
@ak01user How are you running it on GPU? I have a 3080 and it still looks like it's using the CPU; is there anything specific I need to do or modify?
@satvikpendem Hi, you only need to modify the `providers` parameter, but please make sure you have installed onnxruntime-gpu:

```python
session = InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])  # path to your Kokoro ONNX model
kokoro_instance = Kokoro.from_session(session, "./data/voices.json")
```
Thanks, got it. Having some issues running CUDA on Windows, but I'll try WSL and see. How much faster is the GPU for you versus the CPU for generating the audio file?
I have encapsulated the API using FastAPI. I simulated 100 requests, each generating 8 seconds of audio: it took 136 seconds in CPU mode and 41 seconds on the GPU. My machine has an i7-13700KF CPU and an RTX 4090 GPU.
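The benchmark described above can be sketched with a stdlib-only harness. This is a minimal sketch of the measurement approach, not the actual FastAPI/kokoro-onnx code: `synthesize` is a placeholder stub standing in for the real TTS call, and the sleep duration is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def synthesize(request_id: int) -> float:
    """Stand-in for a TTS inference call; sleeps briefly instead of running a model."""
    time.sleep(0.01)  # placeholder for real inference latency
    return 8.0        # pretend each request yields 8 seconds of audio

def benchmark(n_requests: int = 100, workers: int = 8) -> float:
    """Fire n_requests concurrent 'TTS' calls and report total wall time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        audio_seconds = list(pool.map(synthesize, range(n_requests)))
    elapsed = time.perf_counter() - start
    print(f"{n_requests} requests, {sum(audio_seconds):.0f}s of audio "
          f"in {elapsed:.2f}s wall time")
    return elapsed

if __name__ == "__main__":
    benchmark()
```

Swapping the stub for a real `kokoro_instance.create(...)` call would reproduce the 136 s vs. 41 s comparison by running the same harness once per provider.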
I see the issue. In the meantime, I did:

```shell
python3 -m venv .venv
.venv/bin/pip install -U kokoro-onnx
.venv/bin/pip uninstall onnxruntime
.venv/bin/pip install onnxruntime-gpu
.venv/bin/python hello.py
```

Works great with CUDA. The TensorRT provider throws an error about needing to infer shapes, but when I tried that it didn't seem to work.
Good finding! Thanks.

```shell
uv pip uninstall onnxruntime
uv pip install onnxruntime-gpu
```

Then run it. Let me know if it still overrides it.
I released a new version with GPU support.
@thewh1teagle Yes, uv does still override it, because it detects that onnxruntime (non-GPU) is in the dependencies list and automatically installs it before running. Only my method seems to work to override the dependency with the GPU version. For your example, are you doing
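One workaround sometimes suggested for this kind of package substitution (an assumption on my part, not something confirmed in this thread) is uv's `override-dependencies` setting: pin the CPU package behind an impossible environment marker so the resolver never installs it, and declare the GPU package directly.

```toml
# pyproject.toml — hypothetical sketch; the project name and versions
# are placeholders, and the override trick is an assumption about uv.
[project]
name = "kokoro-gpu-demo"
version = "0.1.0"
dependencies = [
    "kokoro-onnx",
    "onnxruntime-gpu",
]

[tool.uv]
# Mask the CPU wheel so kokoro-onnx's onnxruntime dependency is never resolved.
override-dependencies = [
    "onnxruntime; sys_platform == 'never'",
]
```

If this works, `uv run` should no longer reinstall the CPU `onnxruntime` over the GPU build.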
After the last update, if you install
Well done, this project is truly amazing and very user-friendly. I tried using it on a GPU with the CUDAExecutionProvider, but during inference the GPU memory usage sometimes spikes unexpectedly. Why is that? The GPU is a 4090 with 24 GB.