Error while loading custom finetuned QLoRA model in 4-bit: size mismatch for model.mm_projector.readout.0.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]).
#71 · Open
ApoorvFrontera opened this issue on Aug 7, 2024 · 3 comments
Hi Team,
I have successfully finetuned a QLoRA adapter on a custom dataset. When I load it in full precision, it loads and works well.
But this takes too much time and GPU memory at inference, so I wanted to load the model in 4-bit precision by passing the load_4bit parameter:
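A rough sketch of the loading call (the helper name and return values are assumed from the LLaVA-style builder in videollama2/model/builder.py, and the paths are placeholders; the exact signature may differ):

```python
# Rough sketch only: helper name and return values assumed to follow
# videollama2/model/builder.py; paths below are placeholders.
from videollama2.model.builder import load_pretrained_model

model_path = "checkpoints/videollama2-qlora-custom"   # hypothetical QLoRA checkpoint dir
model_base = "mistralai/Mistral-7B-Instruct-v0.2"     # base model used for finetuning

tokenizer, model, processor, context_len = load_pretrained_model(
    model_path,
    model_base,
    model_name="VideoLLaMA2-Mistral-lora",  # "lora" in the name routes to the LoRA branch
    load_4bit=True,                         # 4-bit load that triggers the error below
)
```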
While running this I get the following error:

RuntimeError: Error(s) in loading state_dict for Videollama2MistralForCausalLM:
size mismatch for model.mm_projector.readout.0.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]).
size mismatch for model.mm_projector.readout.2.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]).
I have debugged and understood the cause: when load_4bit=True is passed, the VideoLLaMA2 model params are loaded as 4-bit params. The LLM weights are initialized from the base model (in this case mistralai/Mistral-7B-Instruct-v0.2) in 4-bit, but the mm_projector is not (I am guessing it is initialized with random values, but wrapped in the Params4bit class).
Line: https://github.com/DAMO-NLP-SG/VideoLLaMA2/blob/main/videollama2/model/builder.py#L76
model = Videollama2MistralForCausalLM.from_pretrained(model_base, low_cpu_mem_usage=True, config=lora_cfg_pretrained, **kwargs)
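The [8388608, 1] shape in the error matches bitsandbytes' 4-bit packing: 4096 × 4096 = 16,777,216 values packed two per byte gives 8,388,608 bytes. A small illustration of the packing (a sketch, not VideoLLaMA2 code; needs a CUDA GPU):

```python
# Illustration of 4-bit packing, not VideoLLaMA2 code (requires a CUDA GPU).
import torch
import bitsandbytes as bnb

# What the checkpoint holds for the projector: a full-precision matrix.
fp_weight = torch.randn(4096, 4096, dtype=torch.float16)
print(fp_weight.shape)        # torch.Size([4096, 4096])

# What the 4-bit-loaded model holds: Params4bit quantizes on .cuda() and packs
# two 4-bit values per byte into a flat uint8 tensor.
packed = bnb.nn.Params4bit(fp_weight, requires_grad=False, quant_type="nf4").cuda()
print(packed.shape)           # torch.Size([8388608, 1])
print(4096 * 4096 // 2)       # 8388608
```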
Now, to initialize the weights for the mm_projector, we try to load them from the previously saved non_lora_trainables.bin, which was stored in full precision format.
Line: https://github.com/DAMO-NLP-SG/VideoLLaMA2/blob/main/videollama2/model/builder.py#L101
model.load_state_dict(non_lora_trainables, strict=False)
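Since the checkpoint stores the projector in full precision, one direction I was wondering about (unverified) is to keep mm_projector out of 4-bit quantization when load_4bit=True, so its parameters stay full precision and the saved state dict can be copied in directly. Transformers' BitsAndBytesConfig supports excluding modules via llm_int8_skip_modules (which also applies to 4-bit loading); the quantization config in builder.py would need to be edited along these lines (a sketch, assuming builder.py constructs a BitsAndBytesConfig for load_4bit):

```python
# Sketch of a possible workaround (unverified): exclude mm_projector from 4-bit
# quantization so the full-precision weights in non_lora_trainables.bin load as-is.
# Assumes builder.py builds a BitsAndBytesConfig when load_4bit=True.
import torch
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    llm_int8_skip_modules=["mm_projector"],  # keep the projector un-quantized
)
# ...then pass quantization_config through kwargs to from_pretrained in builder.py
```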
Debugging Outputs:
- Model params
- Previously saved non_lora_trainables param
Please advise on how to resolve this.