docs: Update README.md #63
base: main
Conversation
counter_prompt_tokens
# Number of generation tokens processed.
counter_generation_tokens
# Counter of prefill tokens processed.
Do we need both this section and the one right below it? Why not just use the below section instead?
vLLM stats are reported by the metrics endpoint in fields that are prefixed with vllm:. For example, the metrics reported by Triton will look similar to the following:
# HELP vllm:prompt_tokens_total Number of prefill tokens processed.
# TYPE vllm:prompt_tokens_total counter
vllm:prompt_tokens_total{model="vllm_model",version="1"} 10
# HELP vllm:generation_tokens_total Number of generation tokens processed.
# TYPE vllm:generation_tokens_total counter
vllm:generation_tokens_total{model="vllm_model",version="1"} 16
...
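For illustration only (not part of this PR): a minimal sketch of scraping these vLLM metrics from Triton's Prometheus endpoint in Python. The URL assumes Triton's default metrics port, 8002; adjust it for your deployment.

import urllib.request

# Assumed default Triton metrics endpoint; change host/port as needed.
METRICS_URL = "http://localhost:8002/metrics"

with urllib.request.urlopen(METRICS_URL) as resp:
    body = resp.read().decode("utf-8")

# Keep only the vLLM-specific series, including their # HELP / # TYPE lines.
for line in body.splitlines():
    if "vllm:" in line:
        print(line)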
I'd rather have another section listing all supported metrics. The sample metrics output is just hard to follow IMO.
And for example, changing counter_prompt_tokens to vllm:prompt_tokens_total allows the reader to easily locate the corresponding output in the section below.
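As a sketch, the renamed section might then read like this (assuming only the two counters quoted above; the comment text is taken from the existing diff and HELP strings):

# Number of prefill tokens processed.
vllm:prompt_tokens_total
# Number of generation tokens processed.
vllm:generation_tokens_total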
@oandreeva-nv do you mind driving this PR if you have time?
How about we create a dedicated doc for metrics, so that our front-facing README stays nice and uncluttered?
There will be changes to the vLLM backend metrics in upcoming releases. Converting this PR to draft. @oandreeva-nv @rmccorm4
List metrics as vllm:* names instead of the variable names.