README.md: Add 3rd Party Inference Speed Dashboard #2244

Open
wants to merge 1 commit into
base: main
from

Conversation

matichon-vultureprime
Contributor

Hi TensorRT-LLM team,

We are an NVIDIA Inception startup, and I would like to add a community resource link to our inference speed dashboard.

The dashboard covers:

  • Precision and quantization modes: FP16, FP8, INT8 weight-only, and INT4 weight-only.
  • Model architectures: Llama3-8b, Gemma2-27b, RecurrentGemma-9b, and Mamba2-2.7b.
  • Batch sizes from 1 to 32 (up to 128 for Mamba), with input and output lengths from 32 to 4096.

I also plan to publish the website source code and the benchmark script by the end of this year.

hchings added the documentation label (Improvements or additions to documentation) on Sep 27, 2024
Superjomn requested a review from kaiyux on October 14, 2024 09:58
Superjomn added the triaged label (Issue has been triaged by maintainers) on Oct 14, 2024
@laikhtewari
Collaborator

Hi @matichon-vultureprime, we're discussing the best way to manage community highlights -- thanks for the PR and your patience!

Labels
documentation Improvements or additions to documentation triaged Issue has been triaged by maintainers
4 participants