用Lora和DDP微调LLaMA2-Chat

在八卡上以int4和int8精度微调Llama-2-7-chat模型。或者多机多卡以fb16精度微调Llama-2-7-chat模型。

用Lora和deepspeed微调LLaMA2-Chat

在八卡A800 上以int4和int8精度微调Llama-2-70b-chat模型。或者多机A800以fb16精度微调Llama-2-70b-chat模型

数据源采用了alpaca格式，由train和validation两个数据源组成。

1、显卡要求

Llama-2-70b-chat需要八卡A800，或者更多。 Llama-2-7b-chat需要八卡。

2、Clone源码

git clone https://github.com/yangzhipeng1108/llama-2-70b-chatbot.git
cd llama-2-70b-chatbot

3、安装依赖环境

# 创建虚拟环境
conda create -n llama2 python=3.9 -y
conda activate llama2
# 下载github.com上的依赖资源（需要反复试才能成功，所以单独安装）
export GIT_TRACE=1
export GIT_CURL_VERBOSE=1
pip install git+https://github.com/PanQiWei/AutoGPTQ.git -i https://pypi.mirrors.ustc.edu.cn/simple --trusted-host=pypi.mirrors.ustc.edu.cn
pip install git+https://github.com/huggingface/peft -i https://pypi.mirrors.ustc.edu.cn/simple
pip install git+https://github.com/huggingface/transformers -i https://pypi.mirrors.ustc.edu.cn/simple
# 安装其他依赖包
pip install -r requirements.txt -i https://pypi.mirrors.ustc.edu.cn/simple
# 验证bitsandbytes
python -m bitsandbytes

4、下载原始模型

python model_download.py --repo_id daryl149/llama-2-70b-chat-hf

5、扩充中文词表

# 使用了https://github.com/ymcui/Chinese-LLaMA-Alpaca.git的方法扩充中文词表
# 扩充完的词表在merged_tokenizes_sp（全精度）和merged_tokenizer_hf（半精度）
# 在微调时，将使用--tokenizer_name ./merged_tokenizer_hf参数
python merge_tokenizers.py \
  --llama_tokenizer_dir ./models/daryl149/llama-2-70b-chat-hf \
  --chinese_sp_model_file ./chinese_sp.model

6、微调参数说明

有以下几个参数可以调整：

参数	说明	取值
load_in_bits	模型精度	4和8，如果显存不溢出，尽量选高精度8
block_size	token最大长度	首选2048，内存溢出，可选1024、512等
per_device_train_batch_size	训练时每块卡每次装入批量数	只要内存不溢出，尽量往大选
per_device_eval_batch_size	评估时每块卡每次装入批量数	只要内存不溢出，尽量往大选
include	使用的显卡序列	如两块：localhost:1,2（特别注意的是，序列与nvidia-smi看到的不一定一样）
num_train_epochs	训练轮数	至少3轮

7、微调

chmod +x finetune-lora.sh
# 微调
./finetune-lora.sh
# 微调（后台运行）
pkill -9 -f finetune-lora
nohup ./finetune-lora.sh > train.log  2>&1 &
tail -f train.log

8、测试

CUDA_VISIBLE_DEVICES=0 python generate.py \
    --base_model './models/daryl149/llama-2-70b-chat-hf' \
    --lora_weights 'output/checkpoint-800' \
    --load_8bit #不加这个参数是用的4bit

Name		Name	Last commit message	Last commit date
Latest commit History 58 Commits
data		data
ds		ds
utils		utils
README.md		README.md
accuracy.py		accuracy.py
app.py		app.py
chinese_sp.model		chinese_sp.model
example_chat_dialogue.py		example_chat_dialogue.py
example_text_dialogue.py		example_text_dialogue.py
finetune-lora-ddp-int8.py		finetune-lora-ddp-int8.py
finetune-lora-ddp.py		finetune-lora-ddp.py
finetune-lora-ddp.sh		finetune-lora-ddp.sh
finetune-lora-ds-int.py		finetune-lora-ds-int.py
finetune-lora-ds.py		finetune-lora-ds.py
finetune-lora-fp16.py		finetune-lora-fp16.py
finetune-lora-int8.py		finetune-lora-int8.py
finetune-lora.py		finetune-lora.py
finetune-lora.sh		finetune-lora.sh
generate.py		generate.py
main.py		main.py
merge_tokenizers.py		merge_tokenizers.py
model_download.py		model_download.py
multi-node-finetune-lora-ddp.sh		multi-node-finetune-lora-ddp.sh
multi-node-finetune-lora-ds.sh		multi-node-finetune-lora-ds.sh
multi-node-finetune-lora.sh		multi-node-finetune-lora.sh
requirements.txt		requirements.txt
run_7b.sh		run_7b.sh
run_7b_lora.sh		run_7b_lora.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

用Lora和DDP微调LLaMA2-Chat

用Lora和deepspeed微调LLaMA2-Chat

1、显卡要求

2、Clone源码

3、安装依赖环境

4、下载原始模型

5、扩充中文词表

6、微调参数说明

7、微调

8、测试

About

Releases

Packages

Languages

yangzhipeng1108/llama-2-70b-chatbot

Folders and files

Latest commit

History

Repository files navigation

用Lora和DDP微调LLaMA2-Chat

用Lora和deepspeed微调LLaMA2-Chat

1、显卡要求

2、Clone源码

3、安装依赖环境

4、下载原始模型

5、扩充中文词表

6、微调参数说明

7、微调

8、测试

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages