
[Badcase]: Loss does not drop when using Liger Kernel at Qwen2.5 #921

Open
Se-Hun opened this issue Sep 19, 2024 · 2 comments
Se-Hun commented Sep 19, 2024

Has this been raised before?

Description

Hello,

In my case:

I am trying to instruction-tune Qwen2.5-14B-Instruct with Liger Kernel.

Here is my question:

I know that Liger Kernel is supported in the dev version of Hugging Face transformers. However, when training the Qwen2.5 model with Liger Kernel, the loss value does not drop. Is Qwen2.5 not supported yet?
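For reference, a minimal sketch of the manual patching path, assuming Liger Kernel's `apply_liger_kernel_to_qwen2` entry point (Qwen2.5 reuses the Qwen2 architecture, so the Qwen2 patch is the relevant one); the training loop itself is omitted:

```python
# Sketch, not from this report: the patch must run before the model
# is instantiated so the Qwen2 modeling classes are already swapped.
# Qwen2.5 reuses the Qwen2 architecture, so the Qwen2 patch applies.
from liger_kernel.transformers import apply_liger_kernel_to_qwen2
from transformers import AutoModelForCausalLM, AutoTokenizer

apply_liger_kernel_to_qwen2()  # swaps in Liger's fused RMSNorm, RoPE, SwiGLU, cross-entropy

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-14B-Instruct",  # model named in this report
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-14B-Instruct")
```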

jklj077 (Collaborator) commented Sep 19, 2024

Hi, unfortunately, I don't think we have any experience with it. Would you mind reporting this to the Liger Kernel maintainers?

jklj077 changed the title from "[Question]: Loss does not drop when using Liger Kernel at Qwen2.5" to "[Badcase]: Loss does not drop when using Liger Kernel at Qwen2.5" on Sep 25, 2024
jklj077 (Collaborator) commented Sep 26, 2024

Hi, could you please provide more info on this? Did you use a framework like axolotl, LLaMA-Factory, or something else? Which chat template and which script did you run? transformers has released a version that supports Liger Kernel; could you give it a try?
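A minimal sketch of that released-transformers path, assuming transformers >= 4.45, where `TrainingArguments` exposes a `use_liger_kernel` flag that tells `Trainer` to apply the Liger patches itself; the dataset and hyperparameters below are placeholders, not values from this issue:

```python
# Sketch of enabling Liger Kernel through the released Trainer;
# the `use_liger_kernel` flag landed in transformers v4.45.
from transformers import Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="qwen2.5-14b-sft",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=1e-5,
    num_train_epochs=1,
    bf16=True,
    logging_steps=10,       # monitor whether the loss actually decreases
    use_liger_kernel=True,  # Trainer patches the model with Liger kernels
)

trainer = Trainer(
    model=model,                  # e.g. Qwen2.5-14B-Instruct loaded as usual
    args=args,
    train_dataset=train_dataset,  # placeholder instruction-tuning dataset
)
trainer.train()
```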
