[Performance] TensorrtEP bad allocation #18887
Labels
- ep:CUDA: issues related to the CUDA execution provider
- ep:TensorRT: issues related to the TensorRT execution provider
- platform:windows: issues related to the Windows platform
- quantization: issues related to quantization
- stale: issues that have not been addressed in a while; categorized by a bot
Describe the issue
This is my session code:
Errors as follows:
Task Manager screenshot:
To reproduce
First: pre-process the model, following the onnxruntime quantization sample (pre-processing)
Second: quantize_static
Next: create the session, ready to infer:
install onnxruntime:
pip install onnxruntime-gpu
Windows version reported as:
conda:
Urgency
No response
Platform
Windows
OS Version
win11
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.16.3
ONNX Runtime API
Python
Architecture
X64
Execution Provider
TensorRT
Execution Provider Library Version
CUDA 11.8, cudnn-windows-x86_64-8.9.7.29_cuda11-archive, TensorRT-8.6.0.12.for.cuda-11.8
Model File
ONNX exported from GFPGAN
Is this a quantized model?
Yes