This repository has been archived by the owner on Oct 11, 2024. It is now read-only.

docs(benchmark): add throughput evaluation for NVIDIA L4 GPUs (#138)
* docs(benchmark): add throughput for NVIDIA L4

* build(docker): update docker orchestration for ollama bench

* docs(readme): add GPU reference to benchmark readme
frgfm authored Mar 27, 2024
1 parent f1c9bf5 commit 64c88bc
Showing 3 changed files with 7 additions and 1 deletion.
1 change: 1 addition & 0 deletions scripts/ollama/README.md
@@ -10,6 +10,7 @@ We ran our tests on the following hardware:
- [NVIDIA GeForce RTX 3070](https://www.nvidia.com/fr-fr/geforce/graphics-cards/30-series/rtx-3070-3070ti/) ([Scaleway GPU-3070-S](https://www.scaleway.com/en/pricing/?tags=compute))
- [NVIDIA A10](https://www.nvidia.com/en-us/data-center/products/a10-gpu/) ([Lambda Cloud gpu_1x_a10](https://lambdalabs.com/service/gpu-cloud#pricing))
- [NVIDIA A10G](https://www.nvidia.com/en-us/data-center/products/a10-gpu/) ([AWS g5.xlarge](https://aws.amazon.com/ec2/instance-types/g5/))
- [NVIDIA L4](https://www.nvidia.com/en-us/data-center/l4/) ([Scaleway L4-1-24G](https://www.scaleway.com/en/pricing/?tags=compute))

*The laptop hardware setup includes an [Intel(R) Core(TM) i7-12700H](https://ark.intel.com/content/www/us/en/ark/products/132228/intel-core-i7-12700h-processor-24m-cache-up-to-4-70-ghz.html) for the CPU*

2 changes: 1 addition & 1 deletion scripts/ollama/docker-compose.yml
@@ -22,7 +22,7 @@ services:
       retries: 3

   evaluator:
-    image: quackai/evaluator:latest
+    image: quackai/llm-evaluator:latest
     build: .
     depends_on:
       ollama:
5 changes: 5 additions & 0 deletions scripts/ollama/latency.csv
@@ -20,3 +20,8 @@ deepseek-coder:6.7b-instruct-q3_K_M,A10G (AWS g5.xlarge),99.83,35.41,84.47,1.69
pxlksr/opencodeinterpreter-ds:6.7b-Q4_K_M,A10G (AWS g5.xlarge),212.08,86.58,79.02,3.35
dolphin-mistral:7b-v2.6-dpo-laser-q4_K_M,A10G (AWS g5.xlarge),187.2,62.24,75.91,1
dolphin-mistral:7b-v2.6-dpo-laser-q3_K_M,A10G (AWS g5.xlarge),102.36,34.29,81.23,1.02
deepseek-coder:6.7b-instruct-q4_K_M,NVIDIA L4 (Scaleway L4-1-24G),213.46,76.24,49.97,1.01
deepseek-coder:6.7b-instruct-q3_K_M,NVIDIA L4 (Scaleway L4-1-24G),118.87,43.35,54.72,1.31
pxlksr/opencodeinterpreter-ds:6.7b-Q4_K_M,NVIDIA L4 (Scaleway L4-1-24G),225.62,60.21,49.39,1.9
dolphin-mistral:7b-v2.6-dpo-laser-q4_K_M,NVIDIA L4 (Scaleway L4-1-24G),211.52,72.76,47.27,0.58
dolphin-mistral:7b-v2.6-dpo-laser-q3_K_M,NVIDIA L4 (Scaleway L4-1-24G),120.13,41.09,51.9,0.71
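
One way to read the new rows is to compare them against the A10G rows for the same models. The sketch below is only an illustration: the CSV header is not part of this hunk, so the field names (`metric_1` to `metric_4`) are placeholders rather than the file's real column names, and the choice of `metric_3` for the comparison is arbitrary. The rows are copied verbatim from the hunk above, and only models that appear for both GPUs in this hunk are compared.

```python
import csv
import io

# Placeholder field names: the real header row is outside this hunk.
FIELDS = ["model", "hardware", "metric_1", "metric_2", "metric_3", "metric_4"]

# Rows copied verbatim from the latency.csv hunk (context + added lines).
ROWS = """\
deepseek-coder:6.7b-instruct-q3_K_M,A10G (AWS g5.xlarge),99.83,35.41,84.47,1.69
pxlksr/opencodeinterpreter-ds:6.7b-Q4_K_M,A10G (AWS g5.xlarge),212.08,86.58,79.02,3.35
dolphin-mistral:7b-v2.6-dpo-laser-q4_K_M,A10G (AWS g5.xlarge),187.2,62.24,75.91,1
dolphin-mistral:7b-v2.6-dpo-laser-q3_K_M,A10G (AWS g5.xlarge),102.36,34.29,81.23,1.02
deepseek-coder:6.7b-instruct-q4_K_M,NVIDIA L4 (Scaleway L4-1-24G),213.46,76.24,49.97,1.01
deepseek-coder:6.7b-instruct-q3_K_M,NVIDIA L4 (Scaleway L4-1-24G),118.87,43.35,54.72,1.31
pxlksr/opencodeinterpreter-ds:6.7b-Q4_K_M,NVIDIA L4 (Scaleway L4-1-24G),225.62,60.21,49.39,1.9
dolphin-mistral:7b-v2.6-dpo-laser-q4_K_M,NVIDIA L4 (Scaleway L4-1-24G),211.52,72.76,47.27,0.58
dolphin-mistral:7b-v2.6-dpo-laser-q3_K_M,NVIDIA L4 (Scaleway L4-1-24G),120.13,41.09,51.9,0.71
"""

A10G = "A10G (AWS g5.xlarge)"
L4 = "NVIDIA L4 (Scaleway L4-1-24G)"


def load(text):
    """Index rows by (model, hardware) and convert the numeric fields to float."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=FIELDS)
    return {
        (row["model"], row["hardware"]): {k: float(row[k]) for k in FIELDS[2:]}
        for row in reader
    }


def ratio(table, model, metric, hw_a, hw_b):
    """Return the value of `metric` on hw_a divided by its value on hw_b."""
    return table[(model, hw_a)][metric] / table[(model, hw_b)][metric]


table = load(ROWS)

# Compare only models that have a row for both GPUs in this hunk.
for model in sorted({m for m, hw in table if hw == L4}):
    if (model, A10G) in table:
        print(f"{model}: L4/A10G metric_3 = {ratio(table, model, 'metric_3', L4, A10G):.2f}")
```

Once the real header is known, passing it as `FIELDS` (or dropping `fieldnames` and reading the full file) would make the same comparison work on any column.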
