
[Badcase]: Loss does not drop when using Liger Kernel at Qwen2.5 #921

Open
Se-Hun opened this issue Sep 19, 2024 · 2 comments
Se-Hun commented Sep 19, 2024

Has this been raised before?

Description

Hello,

In my case:

I am trying to instruction-tune Qwen2.5-14B-Instruct with Liger Kernel.

Here is my question:

I know that Liger Kernel is supported in the dev version of Hugging Face transformers. However, when training the Qwen2.5 model with Liger Kernel, the loss value does not drop. Is Qwen2.5 not supported yet?
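For reference, a minimal sketch of the manual patching path, assuming Liger Kernel's `apply_liger_kernel_to_qwen2` entry point (Qwen2.5 reuses the Qwen2 architecture, so the Qwen2 patch is the relevant one); the training loop itself is omitted:

```python
# Sketch, not from this report: the patch must run before the model
# is instantiated so the Qwen2 modeling classes are already swapped.
# Qwen2.5 reuses the Qwen2 architecture, so the Qwen2 patch applies.
from liger_kernel.transformers import apply_liger_kernel_to_qwen2
from transformers import AutoModelForCausalLM, AutoTokenizer

apply_liger_kernel_to_qwen2()  # swaps in Liger's fused RMSNorm, RoPE, SwiGLU, cross-entropy

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-14B-Instruct",  # model named in this report
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-14B-Instruct")
```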

jklj077 (Collaborator) commented Sep 19, 2024

Hi, unfortunately, I don't think we have any experience with it. Would you mind reporting this to the Liger Kernel maintainers?

jklj077 changed the title from "[Question]: Loss does not drop when using Liger Kernel at Qwen2.5" to "[Badcase]: Loss does not drop when using Liger Kernel at Qwen2.5" on Sep 25, 2024
jklj077 (Collaborator) commented Sep 26, 2024

Hi, could you please provide more info on this? Did you use a framework like axolotl, LLaMA-Factory, or something else? Which chat template and which script did you run? transformers has released a version that supports Liger Kernel; could you give it a try?
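A minimal sketch of that released-transformers path, assuming transformers >= 4.45, where `TrainingArguments` exposes a `use_liger_kernel` flag that tells `Trainer` to apply the Liger patches itself; the dataset and hyperparameters below are placeholders, not values from this issue:

```python
# Sketch of enabling Liger Kernel through the released Trainer;
# the `use_liger_kernel` flag landed in transformers v4.45.
from transformers import Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="qwen2.5-14b-sft",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=1e-5,
    num_train_epochs=1,
    bf16=True,
    logging_steps=10,       # monitor whether the loss actually decreases
    use_liger_kernel=True,  # Trainer patches the model with Liger kernels
)

trainer = Trainer(
    model=model,                  # e.g. Qwen2.5-14B-Instruct loaded as usual
    args=args,
    train_dataset=train_dataset,  # placeholder instruction-tuning dataset
)
trainer.train()
```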
