I'm using a vLLM model with the `simple_evaluate` function on a benchmark task, and I'm very interested in extracting vLLM metrics such as time to first token, time in queue, etc. I've been reading through the source code to see whether this is currently supported, and I found the following in `lm_eval/models/vllm_causallms.py`:
```python
for output, context in zip(cont, context):
    generated_text = output.outputs[0].text
    res.append(generated_text)
    ...
```
I was wondering if the following could be added as an option for vLLM models:
```python
for output, context in zip(cont, context):
    generated_text = output.outputs[0].text
    generated_metrics = output.metrics  # extracting vLLM metrics
    res.append(generated_text)
    res.append(generated_metrics)  # appending metrics to results
    ...
```
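For reference, here is a minimal standalone sketch (outside of lm-eval) of what these per-request metrics look like in vLLM, assuming a vLLM version where `RequestOutput.metrics` is populated with a `RequestMetrics` object exposing fields like `arrival_time`, `first_token_time`, and `time_in_queue`; the model name below is just a placeholder:

```python
from vllm import LLM, SamplingParams

# Placeholder model; any model supported by vLLM works here.
llm = LLM(model="facebook/opt-125m")
outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=16))

for output in outputs:
    text = output.outputs[0].text
    metrics = output.metrics  # RequestMetrics; may be None depending on engine/version

    if metrics is not None and metrics.first_token_time is not None:
        # Time to first token: timestamp of the first token minus request arrival.
        ttft = metrics.first_token_time - metrics.arrival_time
        print(f"{text!r}: ttft={ttft:.3f}s, time_in_queue={metrics.time_in_queue}s")
```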
Thanks! This feature would be greatly appreciated.