TensorRT-LLM Requests #632

Open
24 of 41 tasks
ncomly-nvidia opened this issue Dec 11, 2023 · 15 comments

Comments

@ncomly-nvidia
Collaborator

ncomly-nvidia commented Dec 11, 2023

Hi all, this issue will track the feature requests you've made to TensorRT-LLM & provide a place to see what TRT-LLM is currently working on.

Last update: Jan 14th, 2024
🚀 = in development

Models

Decoder Only

Encoder / Encoder-Decoder

Multi-Modal

Other

Features & Optimizations

KV Cache

Quantization

Sampling

Workflow

Front-ends

Integrations

Usage / Installation

Platform Support

@teis-e

teis-e commented Apr 4, 2024

Please add CohereAI!!

CohereForAI/c4ai-command-r-plus

@EwoutH

EwoutH commented Apr 22, 2024

Llama 3 would be great (both 8B and 70B): #1470

Maybe quantized to 8 or even 4 bit.

@StephennFernandes

Currently, Llama 3 throws a bunch of errors when converting to TensorRT-LLM.

Any idea about support for Llama 3?

@EwoutH

EwoutH commented Apr 23, 2024

Phi-3-mini should be amazing! Such a small 3.8B model could run quantized on many GPUs, with as little as 4GB VRAM.
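
A rough back-of-the-envelope check of that figure, as a minimal sketch: it counts weight storage only and assumes every parameter is stored at the stated bit width. KV cache, activations, quantization scales, and runtime overhead are deliberately left out, so real usage will be higher. The function name `weight_memory_gib` is hypothetical and not part of TensorRT-LLM.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 2**30


# Parameter count quoted in the comment above (3.8B).
PHI3_MINI_PARAMS = 3.8e9

for bits in (16, 8, 4):
    gib = weight_memory_gib(PHI3_MINI_PARAMS, bits)
    print(f"{bits}-bit weights: {gib:.2f} GiB")
```

Under these assumptions, 4-bit weights come to a little under 2 GiB, which leaves some headroom on a 4GB card for the KV cache and runtime buffers.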

@oscarbg

oscarbg commented May 4, 2024

+1 for Phi-3

@user-0a

user-0a commented May 18, 2024

+1 for Command R Plus!

CohereForAI/c4ai-command-r-plus

@khan-yin

khan-yin commented Jun 25, 2024

Hello @ncomly-nvidia, I am a student interested in the project! Are there any recent good-first-issue feature requests under Features & Optimizations? 🤣

@chenpinganan

+1 for OpenBMB/MiniCPM-V-2

@FenardH

FenardH commented Aug 5, 2024

Any news on support for the Jetson platform? Thanks in advance.

@anubhav-agrawal-mu-sigma

Requesting support for Meta's m4t v2 model, similar to the existing Whisper support.

@johnnynunez

How is it going for Jetson AGX? It would be nice if everything were compatible before the Jetson Thor launch.

@ampdot-io

Llama 3.2 multimodal vision models anytime soon?

@hello-11
Collaborator

cc @laikhtewari for vis.

@johnnynunez

Congrats, NVIDIA: https://www.jetson-ai-lab.com/tensorrt_llm.html

@Mavericky-j

> Any news on support for Jetson platform? Thanks in advance.

You can refer to the v0.12-jetson branch.

16 participants