[Performance] optimizers fail to detect optimization patterns #19423
Labels
converter:dynamo
issues related to supporting the PyTorch Dynamo exporter
performance
issues related to performance regressions
Describe the issue
dynamo exports models with opset 18. Some optimized ops are missing after optimization (FusedMatMul), and consecutive Reshape nodes are still present. The following patterns were detected while exporting a llama attention model. They were found by doing a side-by-side comparison of the results (comparing shape, type, and content).
Every result is described as follows:
Left side: TorchScript exporter
Right side: dynamo exporter
Reshape + Reshape -> Reshape
Mul + Transpose + Mul -> Mul + Transpose
Reshape + MatMul + Reshape + Div -> FusedMatMul
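The first pattern above (collapsing consecutive Reshape nodes into one) can be sketched as a simple graph-rewrite pass. The `Node` class and `fuse_consecutive_reshapes` helper below are hypothetical illustrations over a toy graph representation, not ONNX Runtime's actual optimizer API; the sketch also assumes each intermediate tensor has a single consumer, so the inner Reshape is safe to drop.

```python
# Hypothetical sketch of the "Reshape + Reshape -> Reshape" rewrite.
# `Node` is a toy stand-in for an ONNX graph node, not onnxruntime's API.
from dataclasses import dataclass


@dataclass
class Node:
    op_type: str
    inputs: list   # tensor names consumed by this node
    outputs: list  # tensor names produced by this node


def fuse_consecutive_reshapes(nodes):
    """Replace Reshape(Reshape(x, s1), s2) with Reshape(x, s2).

    Valid because only the final target shape matters; the
    intermediate shape is dead. Assumes `nodes` is in topological
    order and each intermediate tensor has a single consumer.
    """
    # Map each tensor name to the node that produces it.
    producer = {out: n for n in nodes for out in n.outputs}
    fused, removed = [], set()
    for n in nodes:
        if n.op_type == "Reshape":
            # Walk up a chain of Reshape producers to the original input.
            src = n.inputs[0]
            prev = producer.get(src)
            while prev is not None and prev.op_type == "Reshape":
                removed.add(id(prev))
                src = prev.inputs[0]
                prev = producer.get(src)
            if src != n.inputs[0]:
                # Keep this node's target shape, read from the chain's root.
                n = Node("Reshape", [src, n.inputs[1]], n.outputs)
        fused.append(n)
    return [n for n in fused if id(n) not in removed]
```

A real optimizer would additionally check graph outputs and multi-consumer tensors before deleting the inner node; the other two patterns (Mul/Transpose reordering and the FusedMatMul contraction) follow the same match-and-rewrite shape with more preconditions.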
To reproduce
-- to be updated soon --
Urgency
No response
Platform
Linux
OS Version
Ubuntu 22.04
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
9f68a27
ONNX Runtime API
Python
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
CUDA 11.8
Model File
No response
Is this a quantized model?
Yes