Fine-tuning Qwen-1_8B-Chat-Int4 with LoRA and Q-LoRA: merging the model fails with ValueError: Cannot merge LORA layers when the model is gptq quantized
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
当前行为 | Current Behavior
The Qwen-1_8B-Chat-Int4 model was fine-tuned with both the LoRA and Q-LoRA methods. Merging the fine-tuned model fails with:

`ValueError: Cannot merge LORA layers when the model is gptq quantized`
期望行为 | Expected Behavior
Resolve this error so that the fine-tuned model can be merged and saved.
复现方法 | Steps To Reproduce
Run `python qwen_lora_merge.py`, where the script contains:
```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

path_to_adapter = "/home/ren/Finetuning/Qwen-1.8-chat/"
new_model_directory = "/home/ren/Finetuning/llm_model/Qwen-1_8B-Chat-Int4_law"

# Load the fine-tuned adapter on top of its base model.
model = AutoPeftModelForCausalLM.from_pretrained(
    path_to_adapter,  # path to the output directory
    device_map="auto",
    trust_remote_code=True,
).eval()

# This call raises: ValueError: Cannot merge LORA layers when the model is gptq quantized
merged_model = model.merge_and_unload()

# max_shard_size and safe_serialization are not strictly necessary;
# they control checkpoint sharding and saving in the safetensors format, respectively.
merged_model.save_pretrained(new_model_directory, max_shard_size="2048MB", safe_serialization=True)
```
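For reference, if the merge succeeded, the saved directory would then be loaded like any ordinary checkpoint. A minimal sketch, assuming the `new_model_directory` path from the script above and that the tokenizer files have also been saved or copied into that directory:

```python
# Sketch only: how the merged checkpoint would be consumed once merging works.
from transformers import AutoModelForCausalLM, AutoTokenizer

new_model_directory = "/home/ren/Finetuning/llm_model/Qwen-1_8B-Chat-Int4_law"

# Assumes tokenizer files were saved or copied into new_model_directory as well.
tokenizer = AutoTokenizer.from_pretrained(new_model_directory, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    new_model_directory,
    device_map="auto",
    trust_remote_code=True,
).eval()

# chat() comes from Qwen's remote code.
response, _ = model.chat(tokenizer, "你好", history=None)
print(response)
```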
运行环境 | Environment

- OS: Ubuntu 20.04
- Python: 3.10
- Transformers: 4.37.2
- PyTorch: 2.2.1
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`): 11.8
备注 | Anything else?
no
If this one-step approach makes you uneasy, or it gets in the way of plugging the model into downstream applications, you can instead merge and save the model first (LoRA supports merging; Q-LoRA does not) and then load the new model in the regular way.

Q-LoRA does not support merging.
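Since the adapter in this report sits on top of the GPTQ-quantized Int4 base and therefore cannot be merged, one workaround is to keep using the adapter directly at inference time. A minimal sketch, reusing the adapter path from the reproduction script; the `chat()` call assumes Qwen's remote code, and whether the tokenizer lives in the adapter directory is an assumption:

```python
# Sketch only: use the Q-LoRA adapter without merging it into the base weights.
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

path_to_adapter = "/home/ren/Finetuning/Qwen-1.8-chat/"  # adapter output directory from the report

# Assumes the tokenizer was saved alongside the adapter; otherwise load it from the base model.
tokenizer = AutoTokenizer.from_pretrained(path_to_adapter, trust_remote_code=True)

# Loads the GPTQ-quantized base model plus the adapter; no merge_and_unload() call.
model = AutoPeftModelForCausalLM.from_pretrained(
    path_to_adapter,
    device_map="auto",
    trust_remote_code=True,
).eval()

response, _ = model.chat(tokenizer, "你好", history=None)
print(response)
```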