Disable gemm activation for non-float data types #19612
Conversation
Based on my understanding, there is another bug: the transformer will also fuse into FusedGemm for the CUDA EP, but FusedGemm has no CUDA implementation. Could you check that both nodes are assigned to the CPU EP before the fusion?
AFAIK GemmActivationFusion already checks GetCompatibleExecutionProviders(), and graph_transformer_utils.cc initializes the GemmActivationFusion transformer with only the cpu_ep as compatible.
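For context, a minimal sketch of how that registration restricts the fusion to the CPU EP (the variable names and exact call site are assumptions, not verbatim from graph_transformer_utils.cc):

```cpp
// Illustrative sketch only: GemmActivationFusion is constructed with a
// compatible-EP set containing just the CPU EP, so the optimizer framework
// skips nodes assigned to other providers such as CUDA.
#include "core/optimizer/gemm_activation_fusion.h"

InlinedHashSet<std::string_view> cpu_ep = {onnxruntime::kCpuExecutionProvider};
transformers.emplace_back(std::make_unique<GemmActivationFusion>(cpu_ep));
```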
Description
Disable gemm activation for non-float data types
Motivation and Context
When a float16 model contains a Gemm+Relu subgraph, gemm_activation_fusion kicks in, eliminating the two nodes and replacing them with a FusedGemm node. FusedGemm, however, is only registered for the float data type, so the model fails to load.
Disable the fusion for non-float data types.
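A minimal sketch of the kind of guard this change adds inside the fusion pass (the variable names and exact placement are assumptions; the real check lives in gemm_activation_fusion.cc):

```cpp
// Hedged sketch: skip the fusion unless the Gemm input element type is float32,
// since the FusedGemm kernel is only registered for float.
const NodeArg* input_def = node.InputDefs()[0];
const auto* type_proto = input_def->TypeAsProto();
if (type_proto == nullptr || !type_proto->has_tensor_type() ||
    type_proto->tensor_type().elem_type() != ONNX_NAMESPACE::TensorProto_DataType_FLOAT) {
  continue;  // leave the Gemm+Relu subgraph untouched for float16/double/etc.
}
```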