[Transformers Optimizer] CLIP-ViT encoder attention not getting fused #21208
Comments
Can you specify the transformers version as well? They made a lot of changes recently.
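For reference, a quick way to report the relevant versions (a minimal sketch; it only assumes the standard transformers, onnxruntime, and torch packages are installed):

```python
# Print the library versions relevant to this issue report.
import torch
import transformers
import onnxruntime

print("transformers:", transformers.__version__)
print("onnxruntime:", onnxruntime.__version__)
print("torch:", torch.__version__)
```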
Hi,
I also tried with transformers 4.28.1.
I'll run the test again with PyTorch 1.13.1 later. Edit: my mistake, 4.28.1 is much older than either of the two PRs that added CLIP attention fusion.
Can you try your code with the nightly ORT package instead of the stable ORT 1.18.0 package? New fusions for CLIP were recently added in this PR and aren't in the stable ORT package currently. Alternatively, you can try the "from source" instructions in the PR's description.
Sweet! I assumed that PR was already part of the stable release. It's looking much better with the nightly build.
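One way to confirm the difference is to count the fused attention nodes in the optimized graph. A sketch, where optimized.onnx is a placeholder for the optimizer's output file:

```python
# Count attention-related nodes in an optimized ONNX graph.
# "optimized.onnx" is a placeholder for the optimizer's output.
from collections import Counter

import onnx

graph = onnx.load("optimized.onnx").graph
counts = Counter(node.op_type for node in graph.node)
# Fusion may emit either Attention or MultiHeadAttention nodes.
print("Attention:", counts.get("Attention", 0))
print("MultiHeadAttention:", counts.get("MultiHeadAttention", 0))
```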
Describe the issue
I'm trying to optimize the vision encoder of a CLIP model exported from HuggingFace Transformers, but the attention subgraphs don't get fused.
I tried with a BERT model using the vanilla (eager) implementation of attention, and the optimizer is able to fuse the operations (it doesn't when the model uses SDPA, but that's another story).
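As an aside, recent transformers releases let you pin the attention implementation at load time, which makes the eager-vs-SDPA comparison reproducible. A sketch (the attn_implementation argument requires a sufficiently new transformers version, newer than the 4.28.1 discussed above):

```python
# Load BERT with the vanilla ("eager") attention implementation,
# whose subgraph pattern the ORT optimizer recognizes; "sdpa"
# exports a scaled_dot_product_attention pattern instead.
from transformers import BertModel

eager_model = BertModel.from_pretrained(
    "bert-base-uncased", attn_implementation="eager"
)
sdpa_model = BertModel.from_pretrained(
    "bert-base-uncased", attn_implementation="sdpa"
)
```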
To reproduce
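A minimal sketch of the export-and-optimize flow described above. The checkpoint name, input shape, and head/hidden dimensions are assumptions for illustration (openai/clip-vit-base-patch32, whose vision tower uses 12 heads and hidden size 768); adjust for other checkpoints:

```python
# Export the CLIP vision encoder to ONNX, then run the ORT
# transformers optimizer on it. Checkpoint and dimensions are
# assumptions for illustration (ViT-B/32: 12 heads, hidden 768).
import torch
from transformers import CLIPVisionModel
from onnxruntime.transformers import optimizer

model = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32")
model.config.return_dict = False  # export tuple outputs, not a ModelOutput
model.eval()

dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model,
    dummy,
    "clip_vision.onnx",
    input_names=["pixel_values"],
    output_names=["last_hidden_state", "pooler_output"],
    dynamic_axes={"pixel_values": {0: "batch"}},
    opset_version=14,
)

opt_model = optimizer.optimize_model(
    "clip_vision.onnx", model_type="clip", num_heads=12, hidden_size=768
)
opt_model.save_model_to_file("clip_vision_opt.onnx")
print(opt_model.get_fused_operator_statistics())
```

Per the discussion above, the CLIP fusions land only in builds newer than 1.18.0, so the printed statistics differ between the stable and nightly packages.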
Outputs:

No Attention fused operators.

Urgency
No response
Platform
Linux
OS Version
Ubuntu 20.04
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.18.0
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response