Issues: NVIDIA/TensorRT-LLM

Issues list

Error while importing tensorrt_llm
#2380 opened Oct 26, 2024 by Aaryanverma
build bert: build does not load model (labels: bug)
#2379 opened Oct 26, 2024 by Alireza3242
2 of 4 tasks
FP8 Conversion failure when using Mixtral 8x7B with use_fp8_rowwise (labels: bug)
#2377 opened Oct 25, 2024 by ValeGian
2 of 4 tasks
TPOT=0 without In-flight Batching in benckmark (labels: benchmark, performance issue, question, triaged)
#2374 opened Oct 25, 2024 by mltloveyy
Bug in build bert (labels: bug, build, triaged)
#2373 opened Oct 24, 2024 by Alireza3242
XQA kernel works slower with fp8 kv than with fp16 kv on H100 (labels: performance issue, question, triaged)
#2372 opened Oct 24, 2024 by ttim
4 tasks
How to integrate Multi-LoRA Setup at Inference with NVIDIA Triton / TensorRT-LLM? I built the engine... (labels: build, question, triaged)
#2371 opened Oct 24, 2024 by JoJoLev
return_log_probs slow down generation (labels: bug, Investigating, performance issue)
#2367 opened Oct 24, 2024 by Desmond819
4 tasks
fast-forward tokens in logits post processor (labels: feature request, runtime, triaged)
#2365 opened Oct 23, 2024 by mmoskal
Inconsistent Results Between Python Runtime and Python-Binding-C++ When Running TRT-LLM Multimodel (labels: bug, runtime, triaged)
#2362 opened Oct 22, 2024 by Oldpan
2 of 4 tasks
c++ inference example (labels: question, runtime)
#2361 opened Oct 22, 2024 by scuizhibin
openai_server error (labels: question, triaged)
#2357 opened Oct 19, 2024 by imilli
convert_checkpoint report error (labels: bug, build, triaged)
#2356 opened Oct 19, 2024 by imilli
Build and run nvidia/Llama-3_1-Nemotron-51B-Instruct on a single A100 80Gb (labels: quantization, question, triaged)
#2355 opened Oct 19, 2024 by edesalve
qwen, tensorrt-llm=0.12.0 (labels: question, runtime)
#2353 opened Oct 18, 2024 by yanglongbiao
2 of 4 tasks
[Question] Int8 Gemm's perf degraded in real models. (labels: quantization, question, triaged)
#2351 opened Oct 18, 2024 by foreverlms
free_gpu_memory_fraction not working for examples/apps/openai_server.py (labels: bug, triaged)
#2350 opened Oct 18, 2024 by anaivebird
2 of 4 tasks
unknown flag: --trt_root (labels: build, Investigating, triaged)
#2348 opened Oct 17, 2024 by Gu0725
trtllm-bench "No module named 'tensorrt_llm.bench.datamodels'" in v0.13.0 (labels: benchmark, bug, triaged)
#2347 opened Oct 17, 2024 by activezhao
2 of 4 tasks