[Training] On device training doesn't work with INT8 Models #19078
Labels
ep:CUDA
issues related to the CUDA execution provider
platform:mobile
issues related to ONNX Runtime mobile; typically submitted using template
training
issues related to ONNX Runtime training; typically submitted using template
Describe the issue
I am re-training some ONNX models from the ONNX Model Zoo repo, specifically the quantized ResNet50 with the INT8 data type. However, when creating the training artifacts following the onnx-runtime-training-examples repo, I get the following error:
I would like to know how to resolve this. Is there any way of retraining or doing transfer learning with ORT?
For reference, my code looks like this:
To reproduce
I am running onnxruntime built from source with CUDA 11.2, GCC 9.5, CMake 3.27, and Python 3.8 on Ubuntu 20.04.
Urgency
As soon as possible
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
onnxruntime-training 1.17.0+cu112
PyTorch Version
None
Execution Provider
CUDA
Execution Provider Library Version
Cuda 11.2