
Support TRTLLM model in the benchmark script #442

Merged · 9 commits into main · Dec 4, 2023
Conversation

nv-hwoo (Contributor) commented Nov 29, 2023

  • Support TRTLLM (e.g. ensemble) model in the benchmark script
  • Use Triton vLLM backend and update the outdated doc

matthewkotila (Contributor)
Does this PR essentially include the work of #412?

nv-hwoo (Contributor, Author) commented Nov 29, 2023

@matthewkotila I guess it sort of does 😅 I forgot about that PR. The intention was to update the doc since the tutorial guide no longer builds its own vllm container and just relies on the vllm backend.

matthewkotila (Contributor)

@nv-hwoo: @matthewkotila I guess it sort of does 😅 I forgot about that PR. The intention was to update the doc since the tutorial guide no longer builds its own vllm container and just relies on the vllm backend.

That's totally ok if it does, I just wanted to know if the linked PR can be closed, that's all 🙏

@@ -420,6 +420,13 @@ def profile(args, export_file):
f"--input-data={INPUT_FILENAME} "
f"--profile-export-file={export_file} "
)
if args.model == "ensemble": # TRT-LLM
Contributor:
Are these TRT-LLM specific options or options specific to any ensemble?

Contributor:

I think it's a way of detecting if the model is trtllm

Contributor:

Absolutely, but will any other model use "ensemble" as its top model?
It feels like we're trying to use a reserved word as a variable, of sorts.

Contributor:

Maybe others will? It's not an ideal way of detecting it, hence my suggestion here:

#442 (comment)

Contributor Author:

Added --backend argument to take in backend type as command line option.

Contributor:

What is the behavior when the wrong backend is used?

Contributor Author:

Good point. I think we should throw an error when the user specifies an unsupported backend type.

Contributor Author:

@debermudez Added the check.
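For illustration, the `--backend` option and its validation could look roughly like the sketch below. This is an assumption about the shape of the change, not the PR's actual code: the names `SUPPORTED_BACKENDS` and `build_parser`, and the defaults, are hypothetical, and argparse's `choices` is used here as one way to implement the "error on unsupported backend" check.

```python
import argparse

# Hypothetical sketch of the --backend option discussed in this thread.
# argparse's `choices` rejects unsupported values automatically, so
# profile.py no longer has to infer the backend from the model name.
SUPPORTED_BACKENDS = ["vllm", "trtllm"]

def build_parser():
    parser = argparse.ArgumentParser(description="LLM benchmark profiler (sketch)")
    parser.add_argument("-m", "--model", default="vllm_model",
                        help="Name of the model to profile")
    parser.add_argument("--backend", choices=SUPPORTED_BACKENDS, default="vllm",
                        help="Which backend serves the model (vllm or trtllm)")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args(["-m", "ensemble", "--backend", "trtllm"])
    # Backend-specific branches can now key off args.backend rather than args.model.
    if args.backend == "trtllm":
        print("adding TRT-LLM specific options")
```

With `choices`, an invocation such as `profile.py --backend foo` exits with an error message listing the supported values, which covers the unsupported-backend case raised above; `-m` remains purely the model name.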

return input_data


def main(args):
input_data = construct_input_data(args)
if args.model == "ensemble":
Contributor:

Do we want this to be trtllm?
Do we have other backends that we plan to support that will also use ensemble?

Contributor:

It's a good point. This isn't an ideal way of detecting that the model is TRT-LLM. A better approach might be to let the user of profile.py specify which backend they're using (vllm/trtllm).

Contributor:

I assumed that was the point of the -m option.
Is that not the case?

Contributor:

-m provides the model name, which is different from the backend the model uses.

Contributor:

I appreciate the clarification.
The terms are a bit overloaded, so it's great to have this cleared up.

Contributor:

I agree with your suggestion.

@nv-hwoo nv-hwoo requested a review from debermudez December 1, 2023 17:25
@nv-hwoo nv-hwoo merged commit c8c5f14 into main Dec 4, 2023
3 checks passed
@nv-hwoo nv-hwoo deleted the hwoo-support-trtllm branch December 4, 2023 17:14