Name		Name	Last commit message	Last commit date
parent directory ..
ggml-llama-wasm		ggml-llama-wasm
run-ggml-llama-inference		run-ggml-llama-inference
README.md		README.md

README.md

GGML Llama via WASI-NN

WasmEdge Runtime implements the WASI-NN proposal. This example demonstrates how to chat with Llama2 model driven by WasmEdge wasi-nn-ggml backend.

To run this example, the operating system should be Ubuntu-20.04 or above on x86_64 target.

Now let's build and run this example.

Install rustup and Rust

Go to the official Rust webpage and follow the instructions to install rustup and Rust.

It is recommended to use Rust 1.71 or above in the stable channel.

Then, add wasm32-wasi target to the Rustup toolchain:
```
rustup target add wasm32-wasi
```
Install openblas
```
apt install -y libopenblas-dev
```

Install WasmEdge Runtime

Use the following command to install WasmEdge Runtime and the wasi_nn-ggml plugin:

# NOTICE that the installation script needs `sudo` access

# install wasmedge to the directory /usr/local/
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- -v 0.14.0
source $HOME/.wasmedge/env

For users in China mainland (中国大陆地区), try the following command to install WasmEdge Runtime if failed to run the command above
# NOTICE that the installation script needs `sudo` access

bash install_zh.sh -v 0.14.0
source $HOME/.wasmedge/env

Download example

git clone [email protected]:second-state/wasmedge-rustsdk-examples.git
cd wasmedge-rustsdk-examples/ggml-llama-via-wasinn

Build the ggml-llama-wasm wasm app
```
cargo build -p ggml-llama-wasm --target wasm32-wasi --release
```
If the command runs successfully, you can find the ggml-llama-wasm.wasm file in the target/wasm32-wasi/release directory.

Download the Llama2 model of GGUF format

curl -LO https://huggingface.co/second-state/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_M.gguf

Build & run the run-ggml-llama-wasm app

cargo run -p run-ggml-llama-inference -- .:. target/wasm32-wasi/release/ggml-llama-wasm.wasm default

If the command runs successfully, you can try the multi-turn conversations like below:

[Question]:
What's the capital of France?
[Answer]:
The capital of France is Paris.
[Question]:
What about Norway?
[Answer]:
The capital of Norway is Oslo.
[Question]:
How many planets are in the solar system?
[Answer]:
There are 8 planets in the solar system: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ggml-llama-via-wasinn

ggml-llama-via-wasinn

README.md

GGML Llama via WASI-NN

Files

ggml-llama-via-wasinn

Directory actions

More options

Directory actions

More options

Latest commit

History

ggml-llama-via-wasinn

Folders and files

parent directory

README.md

GGML Llama via WASI-NN