Commit
Update README.md
Removed the llama.cpp usage instructions
onekid authored Jul 28, 2024
1 parent dad43e9 commit 591b6ed
Showing 1 changed file with 0 additions and 39 deletions.
39 changes: 0 additions & 39 deletions README.md
@@ -33,42 +33,3 @@ python -m bitsandbytes
test-unlora.py tests inference before fine-tuning
fine-tuning.py fine-tunes the model on the dataset
If running fine-tuning.py locally fails because gcc.exe cannot compile, try downloading and extracting llvm-windows-x64.zip and adding the llvm bin directory to the system PATH environment variable (see the sketch below).
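A minimal sketch of that workaround, assuming the archive is extracted to C:\llvm (a hypothetical location) and the scripts are launched directly with python from the repository root; this only sets PATH for the current cmd session (the system-level PATH can also be edited permanently):

REM add the llvm bin directory to PATH for this session (hypothetical extract location C:\llvm)
set PATH=%PATH%;C:\llvm\bin
REM run the scripts: inference test before fine-tuning, then fine-tuning itself
python test-unlora.py
python fine-tuning.py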
3. 4-bit quantization requires installing llama.cpp; the steps are as follows:
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make GGML_CUDA=1 (on a Linux machine without a GPU, use plain make)

# obtain the official LLaMA model weights and place them in ./models
ls ./models
llama-2-7b tokenizer_checklist.chk tokenizer.model
# [Optional] for models using BPE tokenizers
ls ./models
<folder containing weights and tokenizer json> vocab.json
# [Optional] for PyTorch .bin models like Mistral-7B
ls ./models
<folder containing weights and tokenizer json>

# install Python dependencies
python3 -m pip install -r requirements.txt

# Convert the model to GGUF FP16 format (run from ./llama.cpp)
python convert-hf-to-gguf.py ../outputs --outfile ./mymodel/ggml-model-f16.gguf --outtype f16

# 4-bit quantization (using the Q4_K_M method) (run from ./llama.cpp)
./llama-quantize ./mymodel/ggml-model-f16.gguf ./mymodel/ggml-model-Q4_K_M.gguf Q4_K_M

# update the gguf filetype to current version if older version is now unsupported
./llama-quantize ./models/mymodel/ggml-model-Q4_K_M.gguf ./models/mymodel/ggml-model-Q4_K_M-v2.gguf COPY

# Run the model directly
./llama-cli -m ./models/mymodel/ggml-model-Q4_K_M.gguf -n 128
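If llama.cpp was built with GGML_CUDA=1, inference can also offload layers to the GPU via llama-cli's -ngl flag; the layer count below is only an illustrative value:

# offload up to 32 transformer layers to the GPU (illustrative; requires a CUDA build)
./llama-cli -m ./models/mymodel/ggml-model-Q4_K_M.gguf -n 128 -ngl 32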

Interactive mode:
# default arguments using a 7B model
./examples/chat.sh

# advanced chat with a 13B model
./examples/chat-13B.sh

# custom arguments using a 13B model
./llama-cli -m ./models/13B/ggml-model-q4_0.gguf -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt
