Fix support for lower VRAM GPUs #4

nicolaschan · 2024-09-02T18:04:23Z

I tried running this model on an RTX 4090 with 24GB of VRAM, but the model does not fit. I ran into two issues with the current script, both of which this PR resolves:

1. The script tries to check for lower VRAM GPUs, but the check runs after the .to("cuda") call has already loaded the weights into VRAM. The OOM exception is raised there, before the low VRAM check is ever reached.
2. Even with 24GB of VRAM, model offload runs out of memory, so we should use sequential offload instead. This slows down inference but brings peak memory usage down to ~2GB.
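
For reviewers, here is a rough sketch of the ordering this PR is going for: decide on placement before any weights are moved to the GPU, and fall back to sequential offload when VRAM is short. It is not the actual diff. It assumes the script uses a Hugging Face diffusers-style pipeline object exposing `to()` and `enable_sequential_cpu_offload()`; the helper names `detect_vram_gb`, `choose_placement`, `place_pipeline` and the `full_load_min_gb` threshold are hypothetical and only for illustration.

```python
def detect_vram_gb():
    """Return total VRAM of GPU 0 in GiB, or None if no CUDA device is visible."""
    try:
        import torch  # imported lazily so the decision logic works without torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


def choose_placement(vram_gb, full_load_min_gb):
    """Pick a placement strategy BEFORE any weights are moved to the GPU.

    full_load_min_gb is a hypothetical threshold: the VRAM needed to hold the
    whole model on the GPU. The PR only establishes that 24GB is not enough.
    """
    if vram_gb is None:
        return "cpu"
    if vram_gb >= full_load_min_gb:
        return "cuda"
    # Model offload also ran out of memory on 24GB, so go straight to
    # sequential offload: slower inference, much lower peak memory.
    return "sequential_offload"


def place_pipeline(pipe, strategy):
    """Apply the chosen strategy to a pipeline that is still on the CPU."""
    if strategy == "cuda":
        pipe.to("cuda")
    elif strategy == "sequential_offload":
        # Do not call .to("cuda") first; offload manages device placement.
        pipe.enable_sequential_cpu_offload()
    return pipe
```

Usage would look something like `place_pipeline(pipe, choose_placement(detect_vram_gb(), full_load_min_gb=...))`, called right after the pipeline is loaded and in place of the unconditional .to("cuda"). The threshold value is left open here because the PR does not state how much VRAM a full load needs.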

Fix support for lower VRAM GPUs

6b2ae0d

