[V1] Refactor model executable interface for all text-only language models #10374

ywang96 · 2024-11-15T21:50:13Z

This PR refactors the interface of all text-only decoder language models to support the V1 VLM re-architecture and torch.compile. In particular, all model implementations in vLLM will need to meet the following requirements:

- get_input_embeddings(input_ids) is implemented in XYZModel
- get_input_embeddings(input_ids) is implemented in XYZForCausalLM, XYZForConditionalGeneration, and/or XYZForClassification
- inputs_embeds is a parameter (with default None) in the forward function signature of XYZModel and of XYZForCausalLM, XYZForConditionalGeneration, and/or XYZForClassification
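The three requirements above can be sketched as follows. This is a minimal illustration, not code from the PR: ToyModel and ToyForCausalLM are hypothetical stand-ins for XYZModel and XYZForCausalLM, plain Python lists stand in for tensors so the sketch runs without PyTorch, and the forward signature is reduced to the two arguments the description mentions (real vLLM models take additional arguments).

```python
from typing import List, Optional

Embedding = List[float]  # stand-in for one row of a tensor


class ToyModel:
    """Hypothetical stand-in for XYZModel: owns the token embedding table."""

    def __init__(self, vocab_size: int, hidden_size: int) -> None:
        # Deterministic toy table: token t maps to [t, t, ..., t].
        self.embed_tokens = [[float(t)] * hidden_size for t in range(vocab_size)]

    def get_input_embeddings(self, input_ids: List[int]) -> List[Embedding]:
        # Requirement 1: the inner model exposes the embedding lookup.
        return [list(self.embed_tokens[t]) for t in input_ids]

    def forward(
        self,
        input_ids: Optional[List[int]],
        inputs_embeds: Optional[List[Embedding]] = None,
    ) -> List[Embedding]:
        # Requirement 3: inputs_embeds defaults to None; when the caller
        # supplies it, the embedding lookup is skipped entirely.
        if inputs_embeds is not None:
            hidden_states = inputs_embeds
        else:
            hidden_states = self.get_input_embeddings(input_ids)
        # Decoder layers would run here; the toy just adds 1.0 to every value.
        return [[x + 1.0 for x in row] for row in hidden_states]


class ToyForCausalLM:
    """Hypothetical stand-in for XYZForCausalLM: wraps the inner model."""

    def __init__(self, vocab_size: int, hidden_size: int) -> None:
        self.model = ToyModel(vocab_size, hidden_size)

    def get_input_embeddings(self, input_ids: List[int]) -> List[Embedding]:
        # Requirement 2: the wrapper delegates to the inner model.
        return self.model.get_input_embeddings(input_ids)

    def forward(
        self,
        input_ids: Optional[List[int]],
        inputs_embeds: Optional[List[Embedding]] = None,
    ) -> List[Embedding]:
        return self.model.forward(input_ids, inputs_embeds=inputs_embeds)


lm = ToyForCausalLM(vocab_size=4, hidden_size=2)
from_ids = lm.forward([1, 3])
embeds = lm.get_input_embeddings([1, 3])
from_embeds = lm.forward(None, inputs_embeds=embeds)
```

In this sketch both call paths produce the same hidden states. That equivalence is what would let a caller compute embeddings separately (for example, to merge them with other inputs) and then pass them in through inputs_embeds.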

This PR is a prerequisite for applying #9871 to all multimodal models in vLLM.

Signed-off-by: Roger Wang <[email protected]>

github-actions · 2024-11-15T21:50:23Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

- Add the ready label to the PR
- Enable auto-merge

🚀

DarkLight1337

LGTM!

After you refactor the multi-modal models, can you update the base interface for vLLM models (class VllmModel) to define get_input_embeddings and also update the required arguments of forward, then add a test to ensure that all of our models implement it?
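A check of the kind suggested here could look roughly like the sketch below. It is an assumption-laden illustration rather than vLLM's actual test: implements_embedding_interface and the three toy classes are hypothetical names, and the check uses only the standard-library inspect module to verify that a class defines get_input_embeddings and that its forward accepts inputs_embeds with a default of None.

```python
import inspect


def implements_embedding_interface(model_cls: type) -> bool:
    """Return True if model_cls exposes the interface described in this PR.

    Hypothetical helper: checks for a callable get_input_embeddings and for
    a forward method whose inputs_embeds parameter defaults to None.
    """
    if not callable(getattr(model_cls, "get_input_embeddings", None)):
        return False
    forward = getattr(model_cls, "forward", None)
    if not callable(forward):
        return False
    param = inspect.signature(forward).parameters.get("inputs_embeds")
    # A parameter without a default has default == inspect.Parameter.empty,
    # which is not None, so it is rejected here.
    return param is not None and param.default is None


class GoodModel:
    """Toy class that satisfies the interface."""

    def get_input_embeddings(self, input_ids):
        return input_ids

    def forward(self, input_ids, inputs_embeds=None):
        return inputs_embeds if inputs_embeds is not None else input_ids


class NoEmbeddingMethod:
    """Toy class missing get_input_embeddings."""

    def forward(self, input_ids, inputs_embeds=None):
        return input_ids


class NoDefault:
    """Toy class whose inputs_embeds has no default value."""

    def get_input_embeddings(self, input_ids):
        return input_ids

    def forward(self, input_ids, inputs_embeds):
        return inputs_embeds


results = {
    cls.__name__: implements_embedding_interface(cls)
    for cls in (GoodModel, NoEmbeddingMethod, NoDefault)
}
```

A real test would iterate over the registered model classes instead of toy ones; how those classes are enumerated is outside what this conversation specifies.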

ywang96 · 2024-11-16T07:12:56Z

> LGTM!
>
> After you refactor the multi-modal models, can you update the base interface for vLLM models (class VllmModel) to define get_input_embeddings and also update the required arguments of forward, then add a test to ensure that all of our models implement it?

@DarkLight1337 Thanks for the review! Yes, if we're going to enforce this interface, I will update those model interfaces once the refactoring is done for all related models.

I also want others to take a look at this PR before we proceed, so I disabled auto-merge and CI for now.

WoosukKwon

LGTM. Thanks for doing this!

update interface

dbd856f

Signed-off-by: Roger Wang <[email protected]>

ywang96 added 7 commits November 15, 2024 13:56

typo

98ad853

Signed-off-by: Roger Wang <[email protected]>

add all

b113ec2

Signed-off-by: Roger Wang <[email protected]>

update

bb24c3a

Signed-off-by: Roger Wang <[email protected]>

fix mamba

e92d165

Signed-off-by: Roger Wang <[email protected]>

fix xverse

719e34c

Signed-off-by: Roger Wang <[email protected]>

add qwen

37a9ac8

Signed-off-by: Roger Wang <[email protected]>

update

151415a

Signed-off-by: Roger Wang <[email protected]>

ywang96 marked this pull request as ready for review November 16, 2024 04:01

DarkLight1337 approved these changes Nov 16, 2024

DarkLight1337 enabled auto-merge (squash) November 16, 2024 06:10

github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Nov 16, 2024

ywang96 disabled auto-merge November 16, 2024 07:09

ywang96 requested review from youkaichao and WoosukKwon November 16, 2024 07:13

ywang96 removed the ready label (ONLY add when PR is ready to merge/full CI is needed) Nov 16, 2024

WoosukKwon approved these changes Nov 16, 2024

ywang96 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Nov 17, 2024

Merge branch 'main' into v1-input-interface

95d877b

DarkLight1337 reviewed Nov 17, 2024

vllm/model_executor/models/gpt_bigcode.py (outdated, resolved)

Update vllm/model_executor/models/gpt_bigcode.py

1dc5901

DarkLight1337 enabled auto-merge (squash) November 17, 2024 03:47

DarkLight1337 merged commit 643ecf7 into vllm-project:main Nov 17, 2024
50 checks passed

lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Nov 18, 2024

[V1] Refactor model executable interface for all text-only language m…

7fa97cf

…odels (vllm-project#10374) Signed-off-by: Roger Wang <[email protected]> Signed-off-by: Linkun Chen <[email protected]>

DarkLight1337 mentioned this pull request Nov 18, 2024

[Core] generate from input embeds #6869

Open
