
fix: implement llama3 RoPE scaling type and fix converter #1751

Merged
merged 2 commits into OpenNMT:master from ebraraktas:fixllama3.1-conversion on Aug 12, 2024

Conversation

ebraraktas
Contributor

Fixes #1745.

The implementation is ported from transformers.
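
For reference, here is a rough sketch of the Llama 3 RoPE scaling rule as implemented in transformers. The field names (`factor`, `low_freq_factor`, `high_freq_factor`, `original_max_position_embeddings`) come from the model's `rope_scaling` config; this is a simplified NumPy illustration, not the converter code itself:

```python
import numpy as np

def llama3_scale_inv_freq(inv_freq, factor, low_freq_factor,
                          high_freq_factor, original_max_position_embeddings):
    """Rescale RoPE inverse frequencies the way Llama 3.1 scaling does."""
    low_freq_wavelen = original_max_position_embeddings / low_freq_factor
    high_freq_wavelen = original_max_position_embeddings / high_freq_factor
    wavelen = 2 * np.pi / inv_freq

    # High frequencies (short wavelengths) are kept as-is;
    # low frequencies (long wavelengths) are divided by `factor`.
    scaled = np.where(wavelen > low_freq_wavelen, inv_freq / factor, inv_freq)

    # Frequencies in between are smoothly interpolated between the two regimes.
    smooth = (original_max_position_embeddings / wavelen - low_freq_factor) / (
        high_freq_factor - low_freq_factor)
    smoothed = (1 - smooth) * inv_freq / factor + smooth * inv_freq
    is_medium = (wavelen <= low_freq_wavelen) & (wavelen >= high_freq_wavelen)
    return np.where(is_medium, smoothed, scaled)
```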

@ebraraktas
Contributor Author

While implementing this, I saw that the rotary embedding layer is shared among the layers of Llama3, and the Hugging Face implementation has been refactored to use that shared instance. Maybe we can implement this in CTranslate2 as well to reduce memory usage.

@minhthuc2502
Collaborator

minhthuc2502 commented Aug 2, 2024

Hello, thank you for your PR. I will fix the CI soon, and then you can rebase on master. What do you mean by "the rotary embedding layer is shared among layers of Llama3 and the Hugging Face implementation is refactored"? Can you add the link here?

@ebraraktas
Contributor Author

Thanks for the comment; I will rebase once you fix it. @minhthuc2502

For RoPE sharing (see the sketch after this list):

  • rotary_emb will be removed from LlamaAttention, see this comment.
  • position_embeddings (a tuple of cos and sin) will be generated from the input embeddings at the beginning of inference inside LlamaModel.forward and passed to the layers that use it.
  • position_embeddings will then be passed as an input to LlamaAttention.forward, implemented here.
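
A minimal sketch of that pattern (a simplified pseudo-module with illustrative names, not the actual transformers code):

```python
import torch

class LlamaModelSketch(torch.nn.Module):
    def __init__(self, layers, rotary_emb):
        super().__init__()
        self.layers = torch.nn.ModuleList(layers)
        self.rotary_emb = rotary_emb  # single shared rotary embedding module

    def forward(self, hidden_states, position_ids):
        # cos/sin are computed once per forward pass ...
        position_embeddings = self.rotary_emb(hidden_states, position_ids)  # (cos, sin)
        for layer in self.layers:
            # ... and passed to every decoder layer (and on to its attention),
            # instead of each attention layer owning its own rotary_emb.
            hidden_states = layer(hidden_states, position_embeddings=position_embeddings)
        return hidden_states
```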

@minhthuc2502
Collaborator

Ah, I understand your point. Actually, we can keep the current architecture, because doing this in CTranslate2 would require more changes than in HF.

@minhthuc2502
Collaborator

Could you rebase on the master branch, please?

@ebraraktas
Contributor Author

ebraraktas commented Aug 7, 2024

As in this PR, a "no space left" error occurred during the Docker step of the CI/CD. @minhthuc2502

@minhthuc2502 minhthuc2502 merged commit a386cbd into OpenNMT:master Aug 12, 2024
13 checks passed
@ebraraktas ebraraktas deleted the fixllama3.1-conversion branch September 9, 2024 08:06
Successfully merging this pull request may close these issues.

Llama 3.1 support please?