I'm using a vLLM model with the `simple_evaluate` function on a benchmark task, and I'm very interested in extracting vLLM metrics such as time to first token, time in queue, etc. I've been reading through the source code to see whether this is currently supported, and I found the following in `lm_eval/models/vllm_causallms.py`:
```python
for output, context in zip(cont, context):
    generated_text = output.outputs[0].text
    res.append(generated_text)
    ...
```
I was wondering if the following could be added as an option for vLLM models:
```python
for output, context in zip(cont, context):
    generated_text = output.outputs[0].text
    generated_metrics = output.metrics  # extracting vLLM metrics
    res.append(generated_text)
    res.append(generated_metrics)  # appending metrics to results
    ...
```
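For reference, here is a minimal standalone sketch (outside of lm-eval) of what these per-request metrics look like in vLLM, assuming a vLLM version where `RequestOutput.metrics` is populated with a `RequestMetrics` object exposing fields like `arrival_time`, `first_token_time`, and `time_in_queue`; the model name below is just a placeholder:

```python
from vllm import LLM, SamplingParams

# Placeholder model; any model supported by vLLM works here.
llm = LLM(model="facebook/opt-125m")
outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=16))

for output in outputs:
    text = output.outputs[0].text
    metrics = output.metrics  # RequestMetrics; may be None depending on engine/version

    if metrics is not None and metrics.first_token_time is not None:
        # Time to first token: timestamp of the first token minus request arrival.
        ttft = metrics.first_token_time - metrics.arrival_time
        print(f"{text!r}: ttft={ttft:.3f}s, time_in_queue={metrics.time_in_queue}s")
```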
Thanks! This feature would be greatly appreciated.