Fine-tuning Qwen-1_8B-Chat-Int4 with LoRA and Q-LoRA: merging the model fails with ValueError: Cannot merge LORA layers when the model is gptq quantized
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
当前行为 | Current Behavior
The Qwen-1_8B-Chat-Int4 model was fine-tuned with both the LoRA and Q-LoRA methods. Merging the fine-tuned model fails with:

`ValueError: Cannot merge LORA layers when the model is gptq quantized`
期望行为 | Expected Behavior
Resolve this error so that the fine-tuned model can be merged and saved.
复现方法 | Steps To Reproduce
Run `python qwen_lora_merge.py`, where the script contains:
```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

path_to_adapter = "/home/ren/Finetuning/Qwen-1.8-chat/"
new_model_directory = "/home/ren/Finetuning/llm_model/Qwen-1_8B-Chat-Int4_law"

# Load the fine-tuned adapter on top of its base model.
model = AutoPeftModelForCausalLM.from_pretrained(
    path_to_adapter,  # path to the output directory
    device_map="auto",
    trust_remote_code=True,
).eval()

# This call raises: ValueError: Cannot merge LORA layers when the model is gptq quantized
merged_model = model.merge_and_unload()

# max_shard_size and safe_serialization are not strictly necessary;
# they control checkpoint sharding and saving in the safetensors format, respectively.
merged_model.save_pretrained(new_model_directory, max_shard_size="2048MB", safe_serialization=True)
```
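For reference, if the merge succeeded, the saved directory would then be loaded like any ordinary checkpoint. A minimal sketch, assuming the `new_model_directory` path from the script above and that the tokenizer files have also been saved or copied into that directory:

```python
# Sketch only: how the merged checkpoint would be consumed once merging works.
from transformers import AutoModelForCausalLM, AutoTokenizer

new_model_directory = "/home/ren/Finetuning/llm_model/Qwen-1_8B-Chat-Int4_law"

# Assumes tokenizer files were saved or copied into new_model_directory as well.
tokenizer = AutoTokenizer.from_pretrained(new_model_directory, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    new_model_directory,
    device_map="auto",
    trust_remote_code=True,
).eval()

# chat() comes from Qwen's remote code.
response, _ = model.chat(tokenizer, "你好", history=None)
print(response)
```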
运行环境 | Environment

- OS: Ubuntu 20.04
- Python: 3.10
- Transformers: 4.37.2
- PyTorch: 2.2.1
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`): 11.8
备注 | Anything else?
no
If this one-step approach makes you uneasy, or it gets in the way of plugging the model into downstream applications, you can instead merge and save the model first (LoRA supports merging; Q-LoRA does not) and then load the new model in the regular way.

Q-LoRA does not support merging.
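Since the adapter in this report sits on top of the GPTQ-quantized Int4 base and therefore cannot be merged, one workaround is to keep using the adapter directly at inference time. A minimal sketch, reusing the adapter path from the reproduction script; the `chat()` call assumes Qwen's remote code, and whether the tokenizer lives in the adapter directory is an assumption:

```python
# Sketch only: use the Q-LoRA adapter without merging it into the base weights.
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

path_to_adapter = "/home/ren/Finetuning/Qwen-1.8-chat/"  # adapter output directory from the report

# Assumes the tokenizer was saved alongside the adapter; otherwise load it from the base model.
tokenizer = AutoTokenizer.from_pretrained(path_to_adapter, trust_remote_code=True)

# Loads the GPTQ-quantized base model plus the adapter; no merge_and_unload() call.
model = AutoPeftModelForCausalLM.from_pretrained(
    path_to_adapter,
    device_map="auto",
    trust_remote_code=True,
).eval()

response, _ = model.chat(tokenizer, "你好", history=None)
print(response)
```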