Based on LLMFarm, we can run MiniCPM on iOS devices. Note that the models run on iOS are quantized to 4-bit and may lose some performance as a result. The original models can be found here.
Deploy MiniCPM on iOS
The first method is to download our converted model directly; you can then skip the subsequent conversion steps.
The second method is to download the original model from Hugging Face and follow the steps below to convert and quantize it (the steps are also consolidated into a single script after the list).
- Download the model
- git clone https://github.com/OpenBMB/llama.cpp.git
- cd llama.cpp && make -j8
- python3 convert.py ${hf_model_dir} --vocab-type hfft --outtype f32
- ./quantize ${hf_model_dir}/ggml-model-f32.gguf ${output_dir}/minicpm-q4_1.gguf q4_1
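For convenience, the conversion steps above can be collected into one script. This is a minimal sketch: the model and output directories are placeholders to replace with your own paths, and it assumes python3 with llama.cpp's Python requirements installed.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Placeholder paths -- replace with your own.
hf_model_dir=/path/to/MiniCPM-hf   # directory with the downloaded Hugging Face checkpoint
output_dir=/path/to/output
mkdir -p "${output_dir}"

# Build OpenBMB's llama.cpp fork.
git clone https://github.com/OpenBMB/llama.cpp.git
cd llama.cpp
make -j8

# Convert the Hugging Face checkpoint to a full-precision (f32) GGUF file.
python3 convert.py "${hf_model_dir}" --vocab-type hfft --outtype f32

# Quantize the f32 GGUF down to 4-bit (q4_1) for on-device use.
./quantize "${hf_model_dir}/ggml-model-f32.gguf" "${output_dir}/minicpm-q4_1.gguf" q4_1
```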
- git clone https://github.com/OpenBMB/LLMFarm-MiniCPM.git
- cd LLMFarm-MiniCPM && git submodule update --init --recursive
- Open the project in Xcode
- Set Signing & Capabilities
- Select a device: My Mac or your iPhone
- Run (or build from the command line; see the sketch after this list)
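If you prefer building from the command line instead of pressing Run in Xcode, xcodebuild can drive the same build. The scheme name and team ID below are assumptions; check the project's actual values under Signing & Capabilities in Xcode.

```bash
# Command-line build for a connected iOS device (alternative to Run in Xcode).
# Run this from the LLMFarm-MiniCPM directory. "LLMFarm" as the scheme name and
# YOUR_TEAM_ID are assumptions -- adjust to what the project actually defines.
xcodebuild -scheme LLMFarm \
           -destination 'generic/platform=iOS' \
           DEVELOPMENT_TEAM=YOUR_TEAM_ID \
           build
```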
- Add a chat
- Select a model (a quick way to sanity-check the quantized model on your Mac is shown below)
- Set the template to CPM
- Start chatting
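Before starting a chat on the phone, it can be worth sanity-checking the quantized model on your Mac with the main binary built in the conversion step; a quick smoke test, assuming you are still in the llama.cpp directory and ${output_dir} is the path used above:

```bash
# Run a short generation against the quantized GGUF to confirm it loads and decodes.
./main -m "${output_dir}/minicpm-q4_1.gguf" -p "Hello, world" -n 32
```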