fix: Fix L0_backend_vllm* jobs #62

Merged: 3 commits from yinggeh-fix-L0-backend-vllm into main on Sep 4, 2024

Conversation

yinggeh (Contributor) commented on Sep 3, 2024

What does the PR do?

Fix L0_backend_vllm jobs. Metric outputs vary across platforms with non-greedy sampling, so move the checks for sampling-parameter-related metrics into a separate test case.
Broken pipeline ID: 109056988
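
Below is a minimal sketch of the split described above, assuming a unittest-style test class like the one named in the Test plan. The helper get_vllm_metrics() is hypothetical and stubbed so the sketch is self-contained; the asserted values are taken from the diff in this PR. Treat it as an illustration, not the PR's actual code.

import unittest

class VLLMTritonMetricsTest(unittest.TestCase):
    def get_vllm_metrics(self):
        # Hypothetical helper: the real test parses Triton's metrics output
        # into a dict of metric name -> value; stubbed here for illustration.
        return {
            "vllm:prompt_tokens_total": 18,
            "vllm:generation_tokens_total": 48,
        }

    def test_vllm_metrics(self):
        # Greedy sampling: token counts are deterministic, so exact values
        # can be asserted on every platform.
        metrics = self.get_vllm_metrics()
        self.assertEqual(metrics["vllm:prompt_tokens_total"], 18)
        self.assertEqual(metrics["vllm:generation_tokens_total"], 48)

    def test_custom_sampling_params(self):
        # Non-greedy sampling produces platform-dependent outputs, so the
        # sampling-parameter-related metric checks live here and avoid
        # asserting exact token counts.
        metrics = self.get_vllm_metrics()
        self.assertIn("vllm:generation_tokens_total", metrics)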

Checklist

  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated GitHub labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

Check the conventional commit type box here and add the label to the GitHub PR.

  • fix

Related PRs:

n/a

Where should the reviewer start?

n/a

Test plan:

VLLMTritonMetricsTest.test_custom_sampling_params

  • CI Pipeline ID: 18003888

Caveats:

n/a

Background

n/a

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

n/a

yinggeh added the bug (Something isn't working) label on Sep 3, 2024
yinggeh self-assigned this on Sep 3, 2024
yinggeh requested a review from oandreeva-nv on September 3, 2024 at 18:35
@@ -134,7 +124,7 @@ def test_vllm_metrics(self):
 # vllm:prompt_tokens_total
 self.assertEqual(metrics_dict["vllm:prompt_tokens_total"], 18)
 # vllm:generation_tokens_total
-self.assertEqual(metrics_dict["vllm:generation_tokens_total"], 188)
+self.assertEqual(metrics_dict["vllm:generation_tokens_total"], 48)
Collaborator

Could you please update the comment above with the expected answer and explain why we check 48? I would like to avoid magic numbers in the code, to speed up debugging in the future.

Contributor Author

I didn't find a way to print generation tokens directly from the vLLM engine, and the calculation of generation_tokens_total isn't straightforward (see https://github.com/vllm-project/vllm/blob/da1f7cc12a12ea4a744d26122e9a13ea4b3f4c7b/vllm/engine/llm_engine.py#L1086-L1088).
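
For reference, a minimal sketch (an assumption, not this repository's test code) of reading a counter such as vllm:generation_tokens_total from Triton's Prometheus metrics endpoint, which defaults to http://localhost:8002/metrics on a locally running server:

import requests

def read_counter(name, url="http://localhost:8002/metrics"):
    # Prometheus text format: "<name>{labels} <value>" or "<name> <value>".
    for line in requests.get(url).text.splitlines():
        if line.startswith(name):
            return float(line.rsplit(" ", 1)[-1])
    return None

print(read_counter("vllm:generation_tokens_total"))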

Collaborator

How did you define 48 as the expected number of tokens?

Contributor Author

Assuming vLLM reports the correct number of tokens in the current version (0.5.3-post1), this test makes sure the number stays consistent.
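
For example, the comment the reviewer asked for could look something like this (a sketch of possible wording, not the PR's actual change):

# vllm:generation_tokens_total
# The expected value pins the count reported by vLLM 0.5.3-post1 for this
# prompt set; it is a consistency check against the engine rather than a
# hand-derived number.
self.assertEqual(metrics_dict["vllm:generation_tokens_total"], 48)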

Collaborator

Amazing, thanks for this update!

oandreeva-nv (Collaborator) left a comment

LGTM, thanks!

yinggeh merged commit 501f74d into main on Sep 4, 2024
3 checks passed
yinggeh deleted the yinggeh-fix-L0-backend-vllm branch on September 4, 2024 at 01:33
yinggeh restored the yinggeh-fix-L0-backend-vllm branch on September 4, 2024 at 01:33
nvda-mesharma removed the bug (Something isn't working) label on Sep 4, 2024
yinggeh deleted the yinggeh-fix-L0-backend-vllm branch on September 18, 2024 at 12:52