
feat: Auto unload model if vLLM health check failed #73

Merged: 18 commits into main on Dec 5, 2024
Conversation

@kthui (Contributor) commented Nov 19, 2024

What does the PR do?

Call the vLLM health check API upon receiving each request, and automatically unload the model via BLS if the health check fails.
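The pattern described above can be sketched as follows. This is a minimal, self-contained illustration, not the backend's actual code: `engine` stands in for vLLM's AsyncLLMEngine (whose `check_health()` raises when the engine is unhealthy), `unload` stands in for the BLS model-unload call, and `HealthTrackingModel`, `FakeEngine`, and `demo` are hypothetical names introduced here for illustration.

```python
import asyncio


class HealthTrackingModel:
    """Per-request health check with one-shot auto unload (illustrative sketch)."""

    def __init__(self, engine, unload):
        self.engine = engine    # stand-in for vLLM's AsyncLLMEngine
        self.unload = unload    # stand-in for the BLS model-unload call
        self.healthy = True

    async def _check_health(self):
        # Once marked unhealthy, skip further engine checks and extra unloads.
        if not self.healthy:
            return False
        try:
            await self.engine.check_health()  # raises if the engine is unhealthy
        except Exception:
            self.healthy = False
            self.unload()  # auto unload the model on the first failed check
        return self.healthy

    async def execute(self, request):
        # Health is verified upon receiving each request, before inference.
        if not await self._check_health():
            return "error: model is unhealthy"
        return await self.engine.generate(request)


class FakeEngine:
    """Engine stub whose health check starts failing after `fail_after` calls."""

    def __init__(self, fail_after):
        self.calls = 0
        self.fail_after = fail_after

    async def check_health(self):
        self.calls += 1
        if self.calls > self.fail_after:
            raise RuntimeError("engine dead")

    async def generate(self, request):
        return f"ok: {request}"


async def demo():
    unloaded = []
    model = HealthTrackingModel(FakeEngine(fail_after=1),
                                lambda: unloaded.append(True))
    r1 = await model.execute("a")  # healthy: request is served
    r2 = await model.execute("b")  # check fails: unload triggered, request rejected
    r3 = await model.execute("c")  # already unhealthy: rejected, no second unload
    return r1, r2, r3, len(unloaded)


results = asyncio.run(demo())
```

Note the one-shot guard: after the first failure the model is marked unhealthy, so later requests are rejected immediately without re-checking the engine or issuing duplicate unload calls.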

Checklist

  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated the GitHub labels field.
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • Verified copyright is correct on all changed files.
  • Added a succinct git squash message before merging.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

Check the applicable conventional commit type below and add the corresponding label to the GitHub PR.

  • build
  • ci
  • docs
  • feat
  • fix
  • perf
  • refactor
  • revert
  • style
  • test

Related PRs:

N/A

Where should the reviewer start?

Start with the documentation, then move on to the code, and finally the test case.

Test plan:

New tests are added to verify the health check, auto model unload, and enabling/disabling of the check.

  • CI Pipeline ID: 20829546

Caveats:

N/A

Background

N/A

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

N/A

* [WIP] vLLM check_health() is async

* [WIP] Fix model name query

* [WIP] Health check may only be enabled when instance count is 1
@kthui kthui added the PR: feat A new feature label Nov 26, 2024
@kthui kthui marked this pull request as ready for review November 26, 2024 20:34
@kthui kthui changed the title feat: Model to Auto Unload if vLLM Health Check Failed feat: Auto unload model if vLLM health check failed Nov 26, 2024
@kthui kthui merged commit b594d07 into main Dec 5, 2024
3 checks passed
@kthui kthui deleted the jacky-vllm-health branch December 5, 2024 02:04