Qwen2VL 2B & 7B OOM #1390
Can someone reproduce this before we investigate further? edd, maybe?
I'm also hitting an unexpected OOM error when trying to train with Qwen. I'm using a free Colab T4 for continued pretraining, which I've done with Mistral 7B in the past with no problems, but doing the same with Qwen 2.5 3B runs out of memory. Memory use before I run …

Result of the …

This was with a max sequence length of only 2048; when I initially tried it with a higher value, it gave me OOM on the exact same line of code, but without getting far enough to output the …

P.S. I also want to add that I was using …
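For context, a minimal sketch of the kind of setup being described, assuming Unsloth's standard `FastLanguageModel` loader with 4-bit weights and the 2048 sequence length mentioned above; the model id and settings are illustrative, not copied from the missing screenshots:

```python
import torch
from unsloth import FastLanguageModel

max_seq_length = 2048  # the value reported above

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-3B",  # assumed model id, for illustration only
    max_seq_length=max_seq_length,
    dtype=None,          # auto-detects float16 on a T4
    load_in_4bit=True,   # QLoRA-style 4-bit base weights
)

# Print GPU memory after loading to compare against the point where the OOM hits.
print(f"Allocated: {torch.cuda.memory_allocated() / 1024**3:.2f} GiB")
print(f"Reserved:  {torch.cuda.memory_reserved() / 1024**3:.2f} GiB")
```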
When fine-tuning a Qwen2 model on an A100 (80GB), I get OOMs.
This is surprising given a batch size of 1, small images (256 x 256), and 4-bit training. With the same data, it's possible to train LLAMA3 11B with a batch size of 8 and only 15 GB of memory consumed.
Traceback:
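A minimal sketch of the reported configuration, assuming Unsloth's `FastVisionModel` loader for Qwen2-VL; the model id, gradient-checkpointing choice, and memory probes are assumptions added for illustration, not taken from the report:

```python
import torch
from unsloth import FastVisionModel

# Assumed reproduction of the reported setup: Qwen2-VL in 4-bit, batch size 1.
model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Qwen2-VL-7B-Instruct",        # assumed model id
    load_in_4bit=True,
    use_gradient_checkpointing="unsloth",  # cuts activation memory
)
FastVisionModel.for_training(model)

# Track peak memory around a single step to localize where the OOM happens.
torch.cuda.reset_peak_memory_stats()
# ... run one training step here with per_device_train_batch_size=1 ...
print(f"Peak allocated: {torch.cuda.max_memory_allocated() / 1024**3:.2f} GiB")
```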