
Errors loading ggml models #107

Open
NasonZ opened this issue May 10, 2023 · 3 comments

NasonZ commented May 10, 2023

Hello,
I'm just starting to explore the models made available by gpt4all, but I'm having trouble loading a few of them.

My environment details:

  • Ubuntu==22.04
  • Python==3.10.10
  • pygpt4all==1.1.0
  • llama-cpp-python==0.1.48

Code to reproduce error (vicuna_test.py):

from pygpt4all.models.gpt4all import GPT4All
model = GPT4All('./models/ggml-vicuna-7b-1.1-q4_2.bin')  # or any of the other models

Issue:

I can't seem to load some of the models listed at https://github.com/nomic-ai/gpt4all-chat#manual-download-of-models. The models I've failed to load are:

  • ggml-gpt4all-j.bin
  • ggml-gpt4all-j-v1.3-groovy.bin
  • ggml-vicuna-7b-1.1-q4_2.bin
  • ggml-stable-vicuna-13B.q4_2.bin

As shown below, the ggml-gpt4all-l13b-snoozy.bin model loads without issue. I also managed to load this version: https://huggingface.co/mrgaang/aira/blob/main/gpt4all-converted.bin.

# Working example - ggml-gpt4all-l13b-snoozy.bin

$ python vicuna_test.py 
llama_model_load: loading model from './models/ggml-gpt4all-l13b-snoozy.bin' - please wait ...
llama_model_load: n_vocab = 32000
...
llama_model_load: ggml ctx size = 101.25 KB
llama_model_load: mem required  = 9807.93 MB (+ 3216.00 MB per state)
llama_model_load: loading tensors from './models/ggml-gpt4all-l13b-snoozy.bin'
llama_model_load: model size =  7759.39 MB / num tensors = 363
llama_init_from_file: kv self size  =  800.00 MB
# loads without error
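
For completeness, here is a minimal generation sketch against the model that does load, following pygpt4all's documented callback API (the prompt and n_predict values here are just placeholders):

from pygpt4all.models.gpt4all import GPT4All

def new_text_callback(text):
    # stream each new token to stdout as it is generated
    print(text, end="")

model = GPT4All('./models/ggml-gpt4all-l13b-snoozy.bin')
model.generate("Once upon a time, ", n_predict=55, new_text_callback=new_text_callback)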

Errors loading the listed models:

# gpt4all-j-v1.3-groovy

$ python vicuna_test.py 
llama_model_load: loading model from './models/ggml-gpt4all-j-v1.3-groovy.bin' - please wait ...
llama_model_load: invalid model file './models/ggml-gpt4all-j-v1.3-groovy.bin' (too old, regenerate your model files or convert them with convert-unversioned-ggml-to-ggml.py!)
llama_init_from_file: failed to load model
Segmentation fault (core dumped)

# vicuna-7b

$ python vicuna_test.py 
llama_model_load: loading model from './models/ggml-vicuna-7b-1.1-q4_2.bin' - please wait ...
llama_model_load: n_vocab = 32000
...
llama_model_load: type    = 1
llama_model_load: invalid model file './models/ggml-vicuna-7b-1.1-q4_2.bin' (bad f16 value 5)
llama_init_from_file: failed to load model
Segmentation fault (core dumped)

# ggml-gpt4all-j.bin and ggml-stable-vicuna-13B.q4_2.bin produced the same error

Can anyone advise on how to resolve these issues?


NasonZ commented May 10, 2023

I'm pretty sure the issue is with GPT4All, as I can load all of the models mentioned above with:

from llama_cpp import Llama
model_path = "models/ggml-vicuna-7b-1.1-q4_2.bin"
model = Llama(model_path=model_path)
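
For comparison, completion through llama-cpp-python's call-style API looks something like this (a sketch continuing from the model loaded above; the prompt and parameters are arbitrary):

# returns an OpenAI-style completion dict with a "choices" list
output = model("Q: Name the planets in the solar system. A: ", max_tokens=64, stop=["Q:"])
print(output["choices"][0]["text"])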

Are there any major differences when loading the model through Llama instead of GPT4All?


385olt commented May 10, 2023

I have the same issue.

My environment:

  • Ubuntu 22.04
  • Python 3.10.6
  • pygpt4all 1.1.0
  • pygptj 2.0.3
  • pyllamacpp 2.1.3

Code:

from pygpt4all.models.gpt4all import GPT4All

model = GPT4All(
    './ggml-mpt-7b-chat.bin',
    prompt_context="The following is a conversation between Jim and Bob. Bob is trying to help Jim with his requests by answering the questions to the best of his abilities. If Bob cannot help Jim, then he says that he doesn't know."
)

I can load ggml-gpt4all-l13b-snoozy.bin and https://huggingface.co/mrgaang/aira/blob/main/gpt4all-converted.bin, and they work fine, but the following models fail to load:

  • ggml-mpt-7b-chat.bin
  • ggml-vicuna-7b-1.1-q4_2.bin
  • ggml-wizardLM-7b.q4_2.bin

Loading ggml-wizardLM-7b.q4_2.bin and ggml-vicuna-7b-1.1-q4_2.bin gives:

llama_model_load: invalid model file './ggml-wizardLM-7b.q4_2.bin' (bad f16 value 5)
llama_init_from_file: failed to load model

Loading ggml-mpt-7b-chat.bin gives:

./ggml-mpt-7b-chat.bin: invalid model file (bad magic [got 0x67676d6d want 0x67676a74])
	you most likely need to regenerate your ggml files
	the benefit is you'll get 10-100x faster load times
	see https://github.com/ggerganov/llama.cpp/issues/91
	use convert-pth-to-ggml.py to regenerate from original pth
	use migrate-ggml-2023-03-30-pr613.py if you deleted originals
llama_init_from_file: failed to load model
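
For reference, the "magic" is just the first four bytes of the model file, which llama.cpp reads as a little-endian uint32. A quick stdlib-only sketch to check what a file actually contains (the expected value comes from the error message above):

def read_magic(path):
    # llama.cpp reads the container magic as a little-endian uint32
    # from the first 4 bytes of the file
    with open(path, "rb") as f:
        return int.from_bytes(f.read(4), "little")

print(hex(read_magic("./ggml-mpt-7b-chat.bin")))
# current llama.cpp expects 0x67676a74 ('ggjt'); any other value means an
# older or different container format that needs converting first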

I tried using llama.cpp/migrate-ggml-2023-03-30-pr613.py on ggml-mpt-7b-chat.bin, but got the error:

File "/.../llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 133, in read_tokens
    word = fin.read(length)
ValueError: read length must be non-negative or -1

I tried using llama.cpp/convert-unversioned-ggml-to-ggml.py to fix this, but got another error:

File "/.../llama.cpp/convert-unversioned-ggml-to-ggml.py", line 29, in write_header
    raise Exception('Invalid file magic. Must be an old style ggml file.')
Exception: Invalid file magic. Must be an old style ggml file.

I tried using llama.cpp/migrate-ggml-2023-03-30-pr613.py on ggml-wizardLM-7b.q4_2.bin, but got the message:

./ggml-wizardLM-7b.q4_2.bin: input ggml has already been converted to 'ggjt' magic

I have no idea how to fix this or why it happens.


emilaz commented May 11, 2023

I'm having the same issue on

  • Windows 10
  • Python 3.11.3
  • pygpt4all 1.1.0

and the models

  • gpt4all-lora-quantized.bin
  • ggml-gpt4all-j-v1.3-groovy.bin

both result in:

llama_model_load: invalid model file './gpt4all-lora-quantized.bin' (too old, regenerate your model files or convert them with convert-unversioned-ggml-to-ggml.py!)
llama_init_from_file: failed to load model
