An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction
We built our model using the tritonserver 24.10 container while the base server's CUDA was at 12.4. We use a cloud provider for our GPU infrastructure; they updated their CUDA version to 12.7, and our trt-build model stopped working (the error reported a CUDA mismatch).
But we are still using tritonserver 24.10, so should it matter? If we use Triton 24.10 in compatibility mode with CUDA 12.4, 12.5, and 12.7, do we need three different trt builds? And a fourth for 12.6? Is Triton really that sensitive to the CUDA version?
Expected behavior
If the Triton container version is the same and the GPU configuration is the same, it should just work.
Actual behavior
launch_triton_server fails.
Additional notes
Triton server container version: 24.10
GPU: H100 SXM 80GB
Base server CUDA during build: 12.4
Server CUDA after update: 12.7
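For anyone trying to reproduce this, here is a minimal sketch of how the two CUDA versions above could be recorded and compared. It assumes `nvidia-smi` is on the PATH and prints `CUDA Version: X.Y` in its header (that value is the highest CUDA version the installed driver supports, not the toolkit version inside the container). The function names and the hard-coded build version are hypothetical, chosen only for illustration, and the script only reports the difference; it does not decide whether a rebuild is needed.

```python
import re
import shutil
import subprocess


def parse_cuda_version(text):
    """Return (major, minor) from text containing 'CUDA Version: X.Y', else None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def compare_cuda(build, runtime):
    """Classify how the build-time and run-time CUDA versions differ."""
    if build == runtime:
        return "same"
    if build[0] != runtime[0]:
        return "major-differs"
    return "minor-differs"


def host_cuda_version():
    """Read the driver-reported CUDA version from nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return parse_cuda_version(result.stdout)


if __name__ == "__main__":
    build_version = (12, 4)  # assumption: CUDA version on the host when the engine was built
    runtime_version = host_cuda_version()
    if runtime_version is None:
        print("Could not read a CUDA version from nvidia-smi")
    else:
        print("build:", build_version, "runtime:", runtime_version)
        print("difference:", compare_cuda(build_version, runtime_version))
```

Running this on the host before and after the provider's update, and again inside the container, would show which of the versions actually changed.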