More on the model support #733
Replies: 2 comments
-
Yes, CodeLlama sample configurations are here: https://github.com/OpenAccess-AI-Collective/axolotl/tree/main/examples/code-llama
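(For reference, the configs in that directory follow axolotl's standard YAML schema. Below is a minimal QLoRA sketch: the field names are axolotl's, but the specific values and the dataset path are illustrative assumptions, not copied from the repo.)

```yaml
# Minimal sketch of an axolotl QLoRA config for CodeLlama.
# Field names follow axolotl's schema; values are illustrative.
base_model: codellama/CodeLlama-7b-hf
model_type: LlamaForCausalLM
tokenizer_type: CodeLlamaTokenizer

load_in_4bit: true            # QLoRA: 4-bit quantized base weights
adapter: qlora
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true      # attach LoRA adapters to all linear layers

datasets:
  - path: mhenrichsen/alpaca_2k_test   # placeholder dataset for this sketch
    type: alpaca

sequence_len: 4096
micro_batch_size: 2
gradient_accumulation_steps: 4
num_epochs: 3
learning_rate: 0.0002
optimizer: paged_adamw_32bit

bf16: true
gradient_checkpointing: true
flash_attention: true
output_dir: ./qlora-out
```

Training is then launched with something like `accelerate launch -m axolotl.cli.train examples/code-llama/7b/qlora.yml` (exact file paths depend on the repo version).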
-
@winglian, thanks for the answer, it helps. The GPU memory requirements for the smaller models (both for fine-tuning and inference) are well documented, but I do not see the same stated clearly for bigger models like 34B or 70B. For example, consider this:
Here, it says that to use a 34B model I would need 40+ GB of VRAM. Is this for inference only, or for fine-tuning too? Also, does axolotl handle multi-GPU training? (I believe yes, because I saw the DeepSpeed configs.)
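(For context on the multi-GPU point: axolotl hooks into DeepSpeed through a single config key plus the accelerate launcher. A minimal sketch, assuming the repo's bundled DeepSpeed JSONs keep their `deepspeed/` path; the memory figure is back-of-envelope arithmetic, not a measured number:)

```yaml
# Main levers for fitting a 34B fine-tune into a given VRAM budget
# (illustrative; the JSON path assumes the repo's bundled deepspeed/ dir).
load_in_4bit: true               # QLoRA: ~0.5 bytes/param, so 34B base weights are ~17 GB
gradient_checkpointing: true     # trade extra compute for activation memory
deepspeed: deepspeed/zero2.json  # ZeRO-2 shards optimizer state across GPUs
```

With the `deepspeed` key set, `accelerate launch -m axolotl.cli.train your_config.yml` runs the job across all visible GPUs.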
-
So, I have these two models to fine-tune: CodeLlama and Mistral.
CodeLlama uses the same architecture as Llama, while Mistral is fairly new in the community. So, my question is: can I use axolotl to fine-tune these two models?
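(A sketch of what this would look like, assuming axolotl's standard config schema; the Hugging Face model ID below is an assumption for illustration:)

```yaml
# Sketch: the fields that change when pointing axolotl at Mistral
# instead of CodeLlama (model ID assumed for illustration).
base_model: mistralai/Mistral-7B-v0.1
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer   # Mistral reuses the Llama tokenizer class
```

Everything else in the config (adapter, datasets, optimizer settings) can stay as in the CodeLlama sketch above.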