Can onnxruntime.quantization.quantize_dynamic() work with onnx-trt? #21169
Labels
ep:TensorRT
issues related to TensorRT execution provider
quantization
issues related to quantization
stale
issues that have not been addressed in a while; categorized by a bot
Describe the issue
Hi,
I'd like to try quantize_dynamic() on our models, but I noticed that it inserts DynamicQuantizeLinear nodes into the graph, and onnx-trt doesn't support that op yet. Is there an official ONNX TensorRT plugin that supports it, or is there any other workaround?
Thanks,
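For reference, here is a minimal sketch of how the inserted nodes can be listed after quantization. The helper names (dynamic_quant_nodes, quantize_and_list) and the file paths passed to them are hypothetical and only for illustration. The stand-in graph at the bottom is made up so that the snippet runs without a real model; it is not output from our models.

```python
from types import SimpleNamespace


def dynamic_quant_nodes(model, op_type="DynamicQuantizeLinear"):
    """Return the names of top-level graph nodes with the given op type.

    `model` is an onnx.ModelProto (or anything exposing .graph.node).
    Nested subgraphs (If/Loop bodies) are not visited in this sketch.
    """
    return [node.name for node in model.graph.node if node.op_type == op_type]


def quantize_and_list(fp32_path, int8_path):
    """Hypothetical helper: quantize a model dynamically, then list the
    DynamicQuantizeLinear nodes found in the quantized graph."""
    # Imported lazily so dynamic_quant_nodes() works without these packages.
    import onnx
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(fp32_path, int8_path, weight_type=QuantType.QInt8)
    return dynamic_quant_nodes(onnx.load(int8_path))


# Stand-in for a quantized model (made-up node names, illustration only).
fake_model = SimpleNamespace(
    graph=SimpleNamespace(
        node=[
            SimpleNamespace(name="x_QuantizeLinear", op_type="DynamicQuantizeLinear"),
            SimpleNamespace(name="matmul_quant", op_type="MatMulInteger"),
            SimpleNamespace(name="relu", op_type="Relu"),
        ]
    )
)
found = dynamic_quant_nodes(fake_model)
print(found)
```

With a real model, quantize_and_list("model.onnx", "model.int8.onnx") would report which nodes the TensorRT parser is likely to reject, assuming the files exist at those placeholder paths.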
To reproduce
N/A
Urgency
No response
Platform
Linux
OS Version
20.04
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.17.0
ONNX Runtime API
Python
Architecture
X86
Execution Provider
CUDA
Execution Provider Library Version
CUDA 12.2