[GPU][FP16][NVIDIA L4][SM89] Inference failed because of missing fp16 kernel with specific values of "s" and "d" #19259
Labels
ep:CUDA
issues related to the CUDA execution provider
model:transformer
issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.
platform:windows
issues related to the Windows platform
Describe the issue
While trying to run inference on a model on new NVIDIA hardware (the L4 graphics card), I encountered an error because of a missing kernel.
A simple search in the ORT repository confirms there is indeed no kernel for that SM version (89) with the specified values of `s` and `d`. I can run this ONNX model just fine on my own GPU (which is not as new), as well as on the previous NVIDIA GPU in this product line, the T4 card. I understand that these kernels are generated, but I don't know how to do this, or where to request it, so that the model runs correctly. Would it be possible to generate them?
To reproduce
I'll try to provide a model shortly, but I am not sure a model is needed for this issue to be solved.
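In the meantime, here is a minimal sketch of the C# call pattern that hits the error. The model path, input name, and shape are placeholders, not the actual failing model:

```csharp
using System.Collections.Generic;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

// Enable the CUDA execution provider on device 0.
using var options = new SessionOptions();
options.AppendExecutionProvider_CUDA(0);

// "model_fp16.onnx" stands in for the FP16 transformer model
// that triggers the missing-kernel error.
using var session = new InferenceSession("model_fp16.onnx", options);

// Placeholder input; the real model's input name and shape will differ.
var inputIds = new DenseTensor<long>(new long[1 * 128], new[] { 1, 128 });
var inputs = new List<NamedOnnxValue>
{
    NamedOnnxValue.CreateFromTensor("input_ids", inputIds),
};

// On an L4 (SM 8.9) this Run call fails with the missing FP16 kernel;
// the same model runs fine on a T4.
using var results = session.Run(inputs);
```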
Urgency
No response
Platform
Windows
OS Version
11
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.16.3
ONNX Runtime API
C#
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
11.8