[Mobile] QNN failed to finalize QNN graph for attention layer #21221
Labels
ep:QNN, platform:mobile, quantization
Describe the issue
I trained a QAT (quantization-aware training) self-attention model with PyTorch FX. The model runs with libQnnCpu.so, but QNN fails to finalize the graph with libQnnHtp.so.
I run the model on Linux x86.
QNN: 2.20.0.240223
ERROR message:
To reproduce
I wrote a minimal reproduction: PyTorch code that generates "test_int8.onnx", and C++ code that runs it.
I have only tested on Linux x86, but I expect the same failure on Android.
Initialize this model in C++:
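For reference, the session setup looks roughly like the sketch below (a minimal sketch using the ONNX Runtime C++ API; the `backend_path` value and the model filename `test_int8.onnx` follow the description above, everything else is an assumption about the reporter's code):

```cpp
#include <onnxruntime_cxx_api.h>

#include <string>
#include <unordered_map>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "qnn_repro");
  Ort::SessionOptions so;

  // Select the QNN backend library. Per the report, swapping
  // "libQnnHtp.so" for "libQnnCpu.so" lets the same model load.
  std::unordered_map<std::string, std::string> qnn_options{
      {"backend_path", "libQnnHtp.so"}};
  so.AppendExecutionProvider("QNN", qnn_options);

  // QNN graph finalization happens during session creation;
  // this is where the reported error is raised.
  Ort::Session session(env, "test_int8.onnx", so);
  return 0;
}
```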
Urgency
No response
Platform
Android
OS Version
Linux
ONNX Runtime Installation
Built from Source
Compiler Version (if 'Built from Source')
gcc9
Package Name (if 'Released Package')
None
ONNX Runtime Version or Commit ID
8c26898
ONNX Runtime API
C++/C
Architecture
X64
Execution Provider
Other / Unknown
Execution Provider Library Version
QNN: 2.20.0.240223