
Error while loading custom finetuned QLoRA model in 4-bit: size mismatch for model.mm_projector.readout.0.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]). #71

Open
ApoorvFrontera opened this issue Aug 7, 2024 · 3 comments

Comments

@ApoorvFrontera

Hi Team,

I have successfully finetuned a QLoRA adapter on a custom dataset. When I load it in full precision, it loads and works well.

However, full-precision inference takes too much time and GPU memory, so I wanted to load the model in 4-bit precision by passing the load_4bit parameter:

model_path = '/home/apoorv/development/videollama2/VideoLLaMA2/work_dirs/videollama2_vllava/finetune_videollama2_vllava_qlora'
model_name = get_model_name_from_path(model_path)
tokenizer, model, processor, context_len = load_pretrained_model(model_path, None, model_name, load_4bit=True)

While running this I get the following error:

RuntimeError: Error(s) in loading state_dict for Videollama2MistralForCausalLM:
	size mismatch for model.mm_projector.readout.0.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]).
	size mismatch for model.mm_projector.readout.2.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]).

I have debugged this and understood the cause: when load_4bit=True is passed, the VideoLLaMA2 model parameters are created as 4-bit params. The LLM weights are initialized from the base model (in this case mistralai/Mistral-7B-Instruct-v0.2) in 4-bit, but the mm_projector is not; I am guessing it is initialized with random values, wrapped in the Params4bit class.

Line: https://github.com/DAMO-NLP-SG/VideoLLaMA2/blob/main/videollama2/model/builder.py#L76
model = Videollama2MistralForCausalLM.from_pretrained(model_base, low_cpu_mem_usage=True, config=lora_cfg_pretrained, **kwargs)
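
One direction I have been wondering about at this point in the builder (an untested sketch on my end, assuming the load_4bit kwargs boil down to a BitsAndBytesConfig): excluding the projector from quantization via llm_int8_skip_modules, so that its full-precision weights could later be loaded without a shape mismatch.

# Untested sketch: keep mm_projector in fp16 while the LLM itself is quantized to 4-bit.
# llm_int8_skip_modules also applies to 4-bit loading in recent transformers versions.
# The bnb_4bit_* values below are assumptions; match whatever the builder already uses.
import torch
from transformers import BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type='nf4',
    llm_int8_skip_modules=['mm_projector'],  # assumption: matches the projector's module name
)

model = Videollama2MistralForCausalLM.from_pretrained(
    model_base,
    low_cpu_mem_usage=True,
    config=lora_cfg_pretrained,
    quantization_config=quant_config,
)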

Then, to initialize the mm_projector weights, the builder loads them from the previously saved non_lora_trainables.bin, which was stored in full precision.

Line: https://github.com/DAMO-NLP-SG/VideoLLaMA2/blob/main/videollama2/model/builder.py#L101
model.load_state_dict(non_lora_trainables, strict=False)
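
Alternatively (another untested sketch on my end, not code from the repo), the fp16 projector weights might be re-quantized by hand into the already-created Linear4bit modules instead of going through load_state_dict with mismatched shapes:

# Untested sketch: wrap the saved fp16 projector weights in Params4bit and let the move to
# the GPU do the 4-bit packing, mirroring how bitsandbytes converts fp16 layers to 4-bit.
# The key prefix in non_lora_trainables and the quant_type are assumptions; adjust to match.
import torch
import bitsandbytes as bnb

for name, module in model.named_modules():
    if 'mm_projector' in name and isinstance(module, bnb.nn.Linear4bit):
        device = module.weight.device
        weight_key = f'model.{name}.weight'  # e.g. 'model.model.mm_projector.readout.0.weight'
        if weight_key in non_lora_trainables:
            fp16_w = non_lora_trainables[weight_key].to(torch.float16)
            module.weight = bnb.nn.Params4bit(
                fp16_w, requires_grad=False, quant_type='nf4'  # match the builder's quant_type
            ).to(device)  # quantization/packing happens on the move to the GPU
        bias_key = f'model.{name}.bias'
        if bias_key in non_lora_trainables and module.bias is not None:
            module.bias.data.copy_(non_lora_trainables[bias_key])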

Debugging Outputs

Model params

(Pdb) model.model.mm_projector.readout
Sequential(
  (0): Linear4bit(in_features=4096, out_features=4096, bias=True)
  (1): GELU(approximate='none')
  (2): Linear4bit(in_features=4096, out_features=4096, bias=True)
)
(Pdb) model.model.mm_projector.readout[0]
Linear4bit(in_features=4096, out_features=4096, bias=True)
(Pdb) model.model.mm_projector.readout[0].weight
Parameter containing:
Parameter(Params4bit([[193],
            [108],
            [250],
            ...,
            [231],
            [107],
            [ 92]], device='cuda:1', dtype=torch.uint8))
(Pdb) model.model.mm_projector.readout[0].weight.shape
torch.Size([8388608, 1])

Previously saved non_lora_trainables param

(Pdb) non_lora_trainables['model.model.mm_projector.readout.0.weight'].shape
torch.Size([4096, 4096])
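
For reference, the two shapes are consistent with 4-bit packing: 4096 × 4096 values at two 4-bit values per uint8 byte is exactly 8,388,608 bytes. A quick standalone check (my own snippet, assuming bitsandbytes and a GPU are available):

# Quick sanity check of the shapes: quantizing a 4096x4096 fp16 tensor to 4-bit packs
# two values per byte, which matches the [8388608, 1] uint8 storage shown above.
import torch
import bitsandbytes as bnb

w_fp16 = torch.randn(4096, 4096, dtype=torch.float16)
w_4bit = bnb.nn.Params4bit(w_fp16, requires_grad=False).cuda()  # packed on the move to GPU

print(w_4bit.shape)      # torch.Size([8388608, 1])
print(4096 * 4096 // 2)  # 8388608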

Please advise on how to resolve this.

@ApoorvFrontera
Author

Hi Team,

Any help is highly appreciated.

Thanks :)

@clownrat6
Member

You can check issue #78.

