
the GPU memory usage sometimes spikes unexpectedly #37

Open
ak01user opened this issue Jan 15, 2025 · 9 comments

Labels
bug Something isn't working

Comments

@ak01user

ak01user commented Jan 15, 2025

[image attached]
Well done, this project is truly amazing and very user-friendly. I tried using it on a GPU with the CUDAExecutionProvider, but during inference the GPU memory usage sometimes spikes unexpectedly. Why is that? The GPU is a 4090 with 24 GB.

@satvikpendem

@ak01user How are you running it on the GPU? I have a 3080 and it still looks like it's using the CPU. Is there anything specific I need to do or modify?

@ak01user
Author

ak01user commented Jan 16, 2025

Hi, you only need to modify the providers parameter, but please make sure you have installed onnxruntime-gpu @satvikpendem

```python
from onnxruntime import InferenceSession
from kokoro_onnx import Kokoro

providers = ["CUDAExecutionProvider"]  # select CUDA explicitly

session = InferenceSession(
    "./data/kokoro-v0_19.onnx", providers=providers
)

kokoro_instance = Kokoro.from_session(session, "./data/voices.json")
```
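
On the original memory-spike question: onnxruntime's CUDAExecutionProvider also accepts provider options that bound how its CUDA memory arena grows. A minimal sketch using onnxruntime's documented CUDA provider options; the 8 GB cap is illustrative, not a value from this thread:

```python
from onnxruntime import InferenceSession

# Provider options are passed as (name, options) tuples; values are illustrative.
providers = [
    (
        "CUDAExecutionProvider",
        {
            "device_id": 0,
            "gpu_mem_limit": 8 * 1024 * 1024 * 1024,     # cap the CUDA arena at ~8 GB
            "arena_extend_strategy": "kSameAsRequested",  # grow only as much as requested
        },
    ),
    "CPUExecutionProvider",  # fallback if CUDA is unavailable
]

session = InferenceSession("./data/kokoro-v0_19.onnx", providers=providers)
```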

@satvikpendem

satvikpendem commented Jan 16, 2025

Thanks, got it. I'm having some issues running CUDA on Windows, but I'll try WSL and see. How much faster is the GPU for you versus the CPU when generating the audio file?

@ak01user
Author

I have encapsulated the API using FastAPI. I simulated 100 requests, each generating 8 seconds of audio. It took 136 seconds in CPU mode and 41 seconds on the GPU. My device has an i7-13700KF CPU and an RTX 4090 GPU.
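
A minimal sketch of that kind of FastAPI wrapper, assuming the session/Kokoro setup from the earlier snippet and kokoro-onnx's create() returning samples plus a sample rate; the route name, voice, and WAV handling here are illustrative, not the actual service:

```python
import io

import soundfile as sf
from fastapi import FastAPI, Response
from onnxruntime import InferenceSession
from kokoro_onnx import Kokoro

app = FastAPI()

session = InferenceSession(
    "./data/kokoro-v0_19.onnx", providers=["CUDAExecutionProvider"]
)
kokoro = Kokoro.from_session(session, "./data/voices.json")

@app.post("/tts")
def tts(text: str):
    # Generate audio samples, encode them as WAV in memory, and return the bytes.
    samples, sample_rate = kokoro.create(text, voice="af", speed=1.0)
    buf = io.BytesIO()
    sf.write(buf, samples, sample_rate, format="WAV")
    return Response(content=buf.getvalue(), media_type="audio/wav")
```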

thewh1teagle added the bug (Something isn't working) label Jan 17, 2025
@satvikpendem

satvikpendem commented Jan 17, 2025

I see the issue: uv automatically overrides and installs the CPU version (onnxruntime) and uses that when you run uv run hello.py, even if you install the GPU version, because only the CPU version is specified in the kokoro-onnx dependencies. @thewh1teagle is there a way to add a flag to install the GPU version instead?

In the meantime, I did:

```sh
python3 -m venv .venv
.venv/bin/pip install -U kokoro-onnx
.venv/bin/pip uninstall onnxruntime
.venv/bin/pip install onnxruntime-gpu
.venv/bin/python hello.py
```

It works great with CUDA. The TensorRT provider throws an error about needing to infer shapes, but when I tried that it didn't seem to work.
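
For the TensorRT shape error, onnxruntime ships a symbolic shape inference tool that is typically run over the model first; this is a sketch of the usual invocation, and whether it resolves this particular model's error is unverified:

```sh
# Annotate the model with inferred shapes, then point the TensorRT provider
# at the new file (output path is illustrative).
python -m onnxruntime.tools.symbolic_shape_infer \
    --input ./data/kokoro-v0_19.onnx \
    --output ./data/kokoro-v0_19.shaped.onnx \
    --auto_merge
```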

@thewh1teagle
Owner

> I see the issue: uv automatically overrides and installs the CPU version (onnxruntime) and uses that when you run uv run hello.py, even if you install the GPU version, because only the CPU version is specified in the kokoro-onnx dependencies.

Good find! Thanks.
You can use pip directly with uv, for instance:

```sh
uv pip uninstall onnxruntime
uv pip install onnxruntime-gpu
```

Then run it and let me know if it still overrides it.
As for a long-term solution in kokoro-onnx, I thought about adding an option to install it with pip install kokoro-onnx[gpu], which would pull in onnxruntime-gpu instead. I need to see how optional dependencies (extras) work best with uv.
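
For reference, the usual way to express that option is a packaging extra; a minimal pyproject.toml sketch (the extra name and layout are illustrative, not what kokoro-onnx actually ships):

```toml
[project]
name = "kokoro-onnx"
dependencies = [
    "onnxruntime",        # default CPU runtime
]

[project.optional-dependencies]
gpu = [
    "onnxruntime-gpu",    # installed via: pip install kokoro-onnx[gpu]
]
```

One caveat: an extra adds onnxruntime-gpu alongside the base onnxruntime dependency rather than replacing it, which is part of why resolvers like uv need care here.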

@thewh1teagle
Owner

@satvikpendem

I released a new version with GPU support.
See examples/with_gpu.py
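
If the release follows the extra-based approach discussed above, installation would look roughly like this (assumed commands, not copied from the release notes):

```sh
pip install -U "kokoro-onnx[gpu]"
python examples/with_gpu.py
```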

@satvikpendem

satvikpendem commented Jan 18, 2025

> Then run it and let me know if it still overrides it.

@thewh1teagle Yes, uv does still override it because it detects that onnxruntime (non-GPU) is in the dependency list and automatically installs it before running. Only my method seems to work to override the dependency with the GPU version. For your example, are you doing uv pip install kokoro-onnx[gpu], and does that work fine on GPU with CUDA?

@thewh1teagle
Owner

> Yes, uv does still override it because it detects that onnxruntime (non-GPU) is in the dependency list and automatically installs it before running. Only my method seems to work to override the dependency with the GPU version. For your example, are you doing uv pip install kokoro-onnx[gpu], and does that work fine on GPU with CUDA?

After the last update, does it work if you install kokoro-onnx[gpu]?
