
converting LORA to ggml to gguf #3953

Closed
xcottos opened this issue Nov 5, 2023 · 11 comments

@xcottos

xcottos commented Nov 5, 2023

Hi everybody,

I have a Hugging Face model (https://huggingface.co/andreabac3/Fauno-Italian-LLM-13B) that I would like to convert to GGUF.

It is a LoRA model, and I was able to convert it to GGML using convert-lora-to-ggml.py.

Now when I try to convert it to GGUF with convert-llama-ggml-to-gguf.py, the script does not recognise the magic number (b'algg') of the GGML file produced by the first conversion.

What am I doing wrong?

Thank you
Luca

@KerfuffleV2
Collaborator

KerfuffleV2 commented Nov 5, 2023

You don't need to convert the LoRA from GGML to GGUF. I think what you may be doing wrong is trying to load the LoRA with --model or -m. The way LoRAs work is that you load the base model and apply the LoRA on top of it. So in addition to what you linked, you'll also need the base model in GGUF format to apply the LoRA to. Then you'll pass -m base_model.gguf --lora your_lora.bin when actually loading the model.

edit: I think this should work as the base model: https://huggingface.co/TheBloke/LLaMA-13b-GGUF
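As a rough sketch of the load step (file names assumed; substitute whichever GGUF base model and converted LoRA file you actually have):

./main -m models/llama-13b.Q8_0.gguf --lora models/loras/ggml-adapter-model.bin -p "your prompt here"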

@xcottos
Author

xcottos commented Nov 5, 2023

Thank you Kerfuffle, let me process your answer (I'm quite a newbie with LLMs) and I will come back to you once I make progress.

Thank you again for the explanation
Luca

@YukiTomita-CC

Hello, I am also facing the same problem.

I was attempting to use a different LoRA adapter, but for now, I followed the previous conversation and downloaded two models. I put TheBloke/LLaMA-13b-GGUF into the llama.cpp/models directory and andreabac3/Fauno-Italian-LLM-13B into the llama.cpp/models/loras directory. After that, I ran the main command as follows:

./main -m models/llama-13b.Q8_0.gguf --lora models/loras/adapter_model.bin --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n 256 -p "The conversation between human and AI assistant.\n[|Human|] Qual'è il significato della vita?\n[|AI|] "

However, the result was as follows (with prior output omitted for brevity):

....................................................................................................
llama_new_context_with_model: n_ctx      = 4096
llama_new_context_with_model: freq_base  = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_new_context_with_model: kv self size  = 3200.00 MB
llama_build_graph: non-view tensors processed: 924/924
llama_new_context_with_model: compute buffer total size = 364.63 MB
llama_apply_lora_from_file_internal: applying lora adapter from 'models/loras/adapter_model.bin' - please wait ...
llama_apply_lora_from_file_internal: unsupported file version
llama_init_from_gpt_params: error: failed to apply lora adapter
main: error: unable to load model

I am running the latest code in a Docker container based on the ubuntu:22.04 image.
/# make --version | head -1
GNU Make 4.3
/# g++ --version | head -1
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

I apologize if I missed any documentation and am not using this correctly. If I could successfully use the LoRA adapter in llama.cpp, it would make a significant difference to my project.
I am grateful for this repository and the support provided. Any help would be greatly appreciated.

@KerfuffleV2
Collaborator

@yuki-tomita-127

Hello, I am also facing the same problem.

Did you already convert the LoRA using convert-llama-ggml-to-gguf.py? I think your problem is different from the other person since it sounds like you missed that step.

@YukiTomita-CC

@KerfuffleV2

Thank you for your response.

I apologize for the lack of detail in my previous post. I have attempted to use convert-llama-ggml-to-gguf.py. Below are the steps I have taken, but I encounter an error when converting from LoRA to GGML to GGUF.

  1. I used convert-lora-to-ggml.py to convert the original LoRA adapter.
python convert-lora-to-ggml.py models/loras

Output:

<Output omitted>
Converted models/loras/adapter_config.json and models/loras/adapter_model.bin to models/loras/ggml-adapter-model.bin

This seems to have worked successfully.

  2. I then used convert-llama-ggml-to-gguf.py to convert the LoRA adapter from GGML to GGUF.
python convert-llama-ggml-to-gguf.py --input models/loras/ggml-adapter-model.bin --output models/loras/ggml-adapter-model.gguf

Output:

* Using config: Namespace(input=PosixPath('models/loras/ggml-adapter-model.bin'), output=PosixPath('models/loras/ggml-adapter-model.gguf'), name=None, desc=None, gqa=1, eps='5.0e-06', context_length=2048, model_metadata_dir=None, vocab_dir=None, vocabtype='spm')

=== WARNING === Be aware that this conversion script is best-effort. Use a native GGUF model if possible. === WARNING ===

- Note: If converting LLaMA2, specifying "--eps 1e-5" is required. 70B models also need "--gqa 8".
* Scanning GGML input file
Traceback (most recent call last):
  File "/workspaces/llama.cpp/convert-llama-ggml-to-gguf.py", line 453, in <module>
    main()
  File "/workspaces/llama.cpp/convert-llama-ggml-to-gguf.py", line 430, in main
    offset = model.load(data, 0)
  File "/workspaces/llama.cpp/convert-llama-ggml-to-gguf.py", line 190, in load
    offset += self.validate_header(data, offset)
  File "/workspaces/llama.cpp/convert-llama-ggml-to-gguf.py", line 175, in validate_header
    raise ValueError(f"Unexpected file magic {magic!r}! This doesn't look like a GGML format file.")
ValueError: Unexpected file magic b'algg'! This doesn't look like a GGML format file.

This error occurs when I do so, which I believe is the same result that @xcottos experienced.

@KerfuffleV2
Collaborator

KerfuffleV2 commented Nov 6, 2023

@yuki-tomita-127

Oh, I'm very sorry. I meant to write convert-lora-to-ggml.py there. My mistake. convert-llama-ggml-to-gguf.py is for converting actual models from GGML to GGUF.

So just to be clear, you'll use convert-lora-to-ggml.py to convert the original HuggingFace-format (or whatever) LoRA to the correct format. After that, you don't need any further conversion steps (like from GGML to GGUF). You can load the output of convert-lora-to-ggml.py with --lora in the main example, etc.
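In other words, the whole flow looks roughly like this (paths assumed from your earlier commands):

# convert the HF LoRA adapter; this writes ggml-adapter-model.bin next to the inputs
python convert-lora-to-ggml.py models/loras

# load the base GGUF model and apply the converted adapter directly; no GGML-to-GGUF step
./main -m models/llama-13b.Q8_0.gguf --lora models/loras/ggml-adapter-model.bin -p "your prompt here"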

@Galunid
Collaborator

Galunid commented Nov 6, 2023

@yuki-tomita-127

./main -m models/llama-13b.Q8_0.gguf --lora models/loras/adapter_model.bin --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n 256 -p "The conversation between human and AI assistant.\n[|Human|] Qual'è il significato della vita?\n[|AI|] "

In your launch command, shouldn't you change models/loras/adapter_model.bin to models/loras/ggml-adapter-model.bin?

@KerfuffleV2
Collaborator

In your launch command, shouldn't you change

Their problem was that I accidentally told them the wrong script, so they weren't able to produce the converted adapter at all.

@YukiTomita-CC

@KerfuffleV2 @Galunid

Indeed, passing the output of convert-lora-to-ggml.py to the --lora option in the main command worked!
I hadn't understood the file format conversion. I really appreciate the series of answers you've provided!

Also, I'd like to apologize if I've taken over the thread a bit; @xcottos was the one who started this discussion. Sorry about that.

@github-actions github-actions bot added the stale label Mar 19, 2024

github-actions bot commented Apr 2, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as completed Apr 2, 2024
@HougeLangley

MediaBrain-SJTU/MING#20

Same for me
