feat: Add vLLM counter metrics access through Triton (#7493)
Report vLLM counter metrics through Triton server
yinggeh authored Aug 16, 2024
1 parent 5320009 commit e7c8e7b
Showing 2 changed files with 10 additions and 0 deletions.
build.py — 4 additions, 0 deletions

```diff
@@ -1806,6 +1806,10 @@ def backend_clone(
         os.path.join(build_dir, be, "src", "model.py"),
         backend_dir,
     )
+    clone_script.cpdir(
+        os.path.join(build_dir, be, "src", "utils"),
+        backend_dir,
+    )
 
     clone_script.comment()
     clone_script.comment(f"end '{be}' backend")
```
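The new `cpdir` call ships the backend's `src/utils` directory alongside `model.py`, which is where the metrics glue for this change lives. As a rough sketch (not the backend's actual code), counters like these are built on the Triton Python backend's custom metrics API; the metric name and labels below are assumptions based on the vllm_backend README:

```python
# Sketch only: this runs inside a Triton Python backend model, where
# triton_python_backend_utils is provided by the server (not pip-installable).
import triton_python_backend_utils as pb_utils

# A MetricFamily is registered once; Metric() creates a labeled instance.
# The name and labels here are illustrative assumptions.
prompt_tokens_family = pb_utils.MetricFamily(
    name="vllm:prompt_tokens_total",
    description="Number of prefill tokens processed.",
    kind=pb_utils.MetricFamily.COUNTER,  # monotonically increasing counter
)
prompt_tokens = prompt_tokens_family.Metric(
    labels={"model": "vllm_model", "version": "1"}
)

# After each engine step, add the newly processed prefill tokens.
prompt_tokens.increment(128)
```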
docs/user_guide/metrics.md — 6 additions, 0 deletions

```diff
@@ -378,3 +378,9 @@ Further documentation can be found in the `TRITONSERVER_MetricFamily*` and
 The TRT-LLM backend uses the custom metrics API to track and expose specific metrics about
 LLMs, KV Cache, and Inflight Batching to Triton:
 https://github.com/triton-inference-server/tensorrtllm_backend?tab=readme-ov-file#triton-metrics
+
+### vLLM Backend Metrics
+
+The vLLM backend uses the custom metrics API to track and expose specific metrics about
+LLMs to Triton:
+https://github.com/triton-inference-server/vllm_backend?tab=readme-ov-file#triton-metrics
```
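Once a server built with this change is running, the counters surface on Triton's Prometheus metrics endpoint (port 8002 by default). A minimal sketch of checking for them, assuming a local server and the `vllm:`-prefixed metric names documented in the vllm_backend README:

```python
# Minimal sketch: scrape Triton's metrics endpoint and print vLLM metrics.
# Assumes the default metrics port (8002) and a locally running server.
from urllib.request import urlopen

body = urlopen("http://localhost:8002/metrics").read().decode("utf-8")
for line in body.splitlines():
    # Prometheus text format: "# HELP"/"# TYPE" comments plus one sample
    # per line; this catches both for vLLM-reported metrics.
    if "vllm:" in line:
        print(line)
```

Expected output would include counter samples such as `vllm:prompt_tokens_total{...} 128`, though the exact names and labels depend on the backend version.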