Issues: NVIDIA/TensorRT-LLM
[Issue Template]Short one-line summary of the issue #270
#783
opened Jan 1, 2024 by
juney-nvidia
Open
Issues list
What does "weights_scaling_factor_2" mean in safetensor results of awq_w4a8
Investigating
Low Precision
Issue about lower bit quantization, including int8, int4, fp8
triaged
Issue has been triaged by maintainers
#2561
opened Dec 11, 2024 by
gujiewen
Correctly setting the --max_encoder_input_len when using trtllm-build with mllama models
#2558
opened Dec 10, 2024 by
here4dadata
batch-manager.md is removed since https://github.com/NVIDIA/TensorRT-LLM/pull/2532
#2556
opened Dec 10, 2024 by
spacewander
int8 slower than bf16 on A100
bug
Something isn't working
Investigating
Low Precision
Issue about lower bit quantization, including int8, int4, fp8
triaged
Issue has been triaged by maintainers
#2553
opened Dec 9, 2024 by
ShuaiShao93
4 tasks
[feature request] Can we add H200 in infer_cluster_key() method?
#2552
opened Dec 9, 2024 by
dongluw
trtllm-serve does not support dynamic batching like tritonserver
#2549
opened Dec 7, 2024 by
Alireza3242
Performance issue with long context
bug
Something isn't working
#2548
opened Dec 6, 2024 by
ShuaiShao93
4 tasks
trtllm-bench failed
bug
Something isn't working
#2545
opened Dec 6, 2024 by
dingjingzhen
2 of 4 tasks
Encoding error in stream response from Triton server
bug
Something isn't working
#2544
opened Dec 6, 2024 by
Wonder-donbury
3 of 4 tasks
lora doesn't work when kv_cache is disabled
Investigating
Lora/P-tuning
triaged
Issue has been triaged by maintainers
#2543
opened Dec 5, 2024 by
ShuaiShao93
4 tasks