[Training] Whether to support weight per_channel QAT #19241
Labels
ep:CUDA — issues related to the CUDA execution provider
quantization — issues related to quantization
stale — issues that have not been addressed in a while; categorized by a bot
training — issues related to ONNX Runtime training; typically submitted using template
Describe the issue
The model weights are quantized per channel:
When running QAT with onnxruntime-training, the error below is reported. Does onnxruntime-training support per_channel QAT?
To reproduce
Same as the description above: quantize the model weights per channel, then run QAT with onnxruntime-training.
Urgency
No response
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.16.3
PyTorch Version
1.10
Execution Provider
Default CPU, CUDA
Execution Provider Library Version
No response