This repository has been archived by the owner on Oct 11, 2024. It is now read-only.

docs(readme): update readme
frgfm committed Mar 27, 2024
1 parent b4cd2f3 commit 8ca39f2
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions scripts/ollama/README.md
@@ -22,7 +22,7 @@ with the following LLMs (cf. [Ollama hub](https://ollama.com/library)):

and the following quantization formats: q3_K_M, q4_K_M, q5_K_M.

- This [benchmark](./benchmark_result.csv) was performed over 5 iterations on 4 different sequences, including on a **laptop** to better reflect the performance that common users can expect.
+ This [benchmark](latency.csv) was performed over 5 iterations on 4 different sequences, including on a **laptop** to better reflect the performance that common users can expect.
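The benchmark protocol described above (several timed iterations over several prompt sequences) can be sketched as follows. This is a minimal illustration, not the repository's actual `evaluate_latency.py`; the `generate` callable and the returned statistics are hypothetical stand-ins for a real request to the Ollama API.

```python
import statistics
import time


def benchmark_latency(generate, sequences, iterations=5):
    """Time a text-generation callable over several prompts.

    `generate` is any callable taking a prompt string; in the real
    script this would wrap a call to an Ollama endpoint.
    """
    latencies = []
    for prompt in sequences:
        for _ in range(iterations):
            start = time.perf_counter()
            generate(prompt)
            latencies.append(time.perf_counter() - start)
    return {
        "mean_s": statistics.mean(latencies),
        "median_s": statistics.median(latencies),
    }


# Dummy "model" standing in for an LLM call, and 4 toy sequences
stats = benchmark_latency(lambda p: p.upper(), ["a", "b", "c", "d"])
print(sorted(stats))  # ['mean_s', 'median_s']
```

Averaging over repeated runs per sequence smooths out warm-up and scheduling noise, which matters on consumer laptops.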

## Run it on your hardware

@@ -39,7 +39,7 @@ docker compose exec -T ollama ollama pull MODEL

And run the evaluation:
```
- docker compose exec -T evaluator python evaluate.py MODEL
+ docker compose exec -T evaluator python scripts/ollama/evaluate_latency.py MODEL
```

### Remote instance
@@ -50,10 +50,10 @@ docker compose up -d evaluator --wait
```
And run the evaluation by targeting your remote instance:
```
- docker compose exec -T evaluator python evaluate.py MODEL --endpoint http://HOST:PORT
+ docker compose exec -T evaluator python scripts/ollama/evaluate_latency.py MODEL --endpoint http://HOST:PORT
```
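The command above passes a positional model name plus an optional `--endpoint` flag. A hypothetical sketch of how such a CLI surface might be parsed (the default endpoint below assumes the compose service named `ollama` on Ollama's standard port, which is not confirmed by this diff):

```python
import argparse

# Hypothetical CLI sketch: positional model, optional remote endpoint.
parser = argparse.ArgumentParser(description="Evaluate LLM latency")
parser.add_argument("model", help="Ollama model to evaluate")
parser.add_argument(
    "--endpoint",
    default="http://ollama:11434",  # assumed local compose service
    help="Ollama API endpoint to target",
)

# Targeting a remote instance, as in the command above
args = parser.parse_args(["mistral:7b", "--endpoint", "http://HOST:PORT"])
print(args.model, args.endpoint)
```

Keeping the endpoint as a flag with a local default lets the same script serve both the local-compose and remote-instance workflows.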

- *All script arguments can be checked using `python scripts/ollama/evaluate_perf.py --help`*
+ *All script arguments can be checked using `python scripts/ollama/evaluate_latency.py --help`*

### Others

