Qwen 2.5 Support is here!

There are some issues with Qwen 2.5 models which Unsloth has fixed!

Kaggle Base model finetuning notebook: https://www.kaggle.com/code/danielhanchen/kaggle-qwen-2-5-unsloth-notebook/notebook
Kaggle Instruct model finetuning notebook: https://www.kaggle.com/code/danielhanchen/kaggle-qwen-2-5-conversational-unsloth
Colab finetuning notebook: https://colab.research.google.com/drive/1Kose-ucXO1IBaZq5BvbwWieuubP7hxvQ?usp=sharing
Colab conversational notebook: https://colab.research.google.com/drive/1qN1CEalC70EO1wGKhNxs1go1W9So61R5?usp=sharing

EOS token issues

Qwen 2.5 Base models (0.5b all the way until 72b) - EOS token should be <|endoftext|> not <|im_end|>. The base models <|im_end|> is actually untrained, so it'll cause NaN gradients if you use it. You should re-pull the tokenizer from source, or you can download fixed base models from https://huggingface.co/unsloth if that helps.

Chat template issues

Qwen 2.5 Base models should NOT have a chat_template, this will actually cause errors especially in Unsloth's finetuning notebooks, since I check if untrained tokens exist in the chat template to counteract NaN gradients.
Do NOT use Qwen 2.5's chat template for the base models. This will cause NaN gradients!

4bit uploaded models

Qwen 2.5 0.5b 4bit 0.5b Instruct 0.5b 4bit Instruct 0.5b
Qwen 2.5 1.5b 4bit 1.5b Instruct 1.5b 4bit Instruct 1.5b
Qwen 2.5 3b 4bit 3b Instruct 3b 4bit Instruct 3b
Qwen 2.5 7b 4bit 7b Instruct 7b 4bit Instruct 7b
Qwen 2.5 14b 4bit 14b Instruct 14b 4bit Instruct 14b
Qwen 2.5 32b 4bit 32b Instruct 32b 4bit Instruct 32b
Qwen 2.5 72b 4bit 72b Instruct 72b 4bit Instruct 72b

What's Changed

Phi 3.5 by @danielhanchen in #940
Phi 3.5 by @danielhanchen in #941
Fix DPO by @danielhanchen in #947
Phi 3.5 bug fix by @danielhanchen in #955
Cohere, Bug fixes by @danielhanchen in #984
Gemma faster inference by @danielhanchen in #987
Bug fixes by @danielhanchen in #1004
Update README.md by @danielhanchen in #1033
Update README.md by @danielhanchen in #1036
fix: chat_templates.py bug by @NazimHAli in #1048

New Contributors

@NazimHAli made their first contribution in #1048

Full Changelog: August-2024...September-2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Qwen 2.5 Support