
Add fusion patterns for conformer-transducer model #18461

Merged: 6 commits merged into main on Nov 19, 2023
Conversation

apsonawane (Contributor)

Description

Add the conformer-transducer model type to the optimizer. This PR adds pattern matches for the attention subgraphs shown below.

Unfused attention:

![ct_unfused](https://github.com/microsoft/onnxruntime/assets/111780983/46c71ed8-67e0-4607-85b1-bcadba5a2956)

Fused attention:

![ct_fused](https://github.com/microsoft/onnxruntime/assets/111780983/fbb91c96-0d4b-4f0b-8674-1ae3b9b9a92e)
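To illustrate what a fusion pass like this does conceptually, here is a minimal sketch: it scans a node list for an unfused attention chain (MatMul, Softmax, MatMul) and replaces it with a single fused node. The `Node` class and `fuse_attention` helper are hypothetical stand-ins; the real pass in this PR is `FusionConformerAttention`, which matches a much richer ONNX subgraph.

```python
# Illustrative sketch only: how a fusion pass rewrites an unfused attention
# chain (MatMul -> Softmax -> MatMul) into one fused node. Node and
# fuse_attention are hypothetical; the real logic is FusionConformerAttention.
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    op_type: str
    name: str


def fuse_attention(nodes: List[Node]) -> List[Node]:
    """Replace each MatMul -> Softmax -> MatMul chain with a fused Attention node."""
    fused: List[Node] = []
    i = 0
    while i < len(nodes):
        if (
            i + 2 < len(nodes)
            and nodes[i].op_type == "MatMul"
            and nodes[i + 1].op_type == "Softmax"
            and nodes[i + 2].op_type == "MatMul"
        ):
            # Collapse the three matched nodes into a single fused node.
            fused.append(Node("Attention", nodes[i].name + "_fused"))
            i += 3
        else:
            fused.append(nodes[i])
            i += 1
    return fused


graph = [Node("MatMul", "qk"), Node("Softmax", "probs"), Node("MatMul", "ctx"), Node("Relu", "act")]
print([n.op_type for n in fuse_attention(graph)])  # ['Attention', 'Relu']
```

The real matcher also checks tensor shapes, scaling constants, and mask handling before fusing; this sketch only shows the structural rewrite.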

@tianleiwu (Contributor)

Please add a test case for the attention fusion; otherwise we cannot prevent regressions in the future.
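The kind of regression test being requested checks that the fused attention is numerically equivalent to the unfused subgraph. A minimal NumPy sketch of that parity check (not the actual onnxruntime test, which compares ONNX models before and after optimization):

```python
# Minimal parity-check sketch: the fused attention must match the
# step-by-step (unfused) computation. Plain NumPy, not the real ONNX test.
import numpy as np


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def unfused_attention(q, k, v):
    # Step-by-step graph: MatMul -> scale -> Softmax -> MatMul
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    probs = softmax(scores)
    return probs @ v


def fused_attention(q, k, v):
    # "Fused" single expression; must be numerically identical to the
    # unfused path for the fusion to be safe.
    return softmax(q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])) @ v


rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((2, 4, 8)) for _ in range(3))
np.testing.assert_allclose(unfused_attention(q, k, v), fused_attention(q, k, v), rtol=1e-6)
print("fused/unfused parity OK")
```

In the actual test added to the PR, the same idea applies at the model level: run the unoptimized and optimized ONNX models on identical inputs and assert the outputs match within tolerance.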

@apsonawane apsonawane force-pushed the asonawane/conformer branch 2 times, most recently from 8fc8a31 to fd8d044 Compare November 17, 2023 10:33
from typing import List

import numpy as np
import onnx

Check notice: Code scanning / CodeQL (Note, test)

Module is imported with 'import' and 'import from':
Module 'onnx' is imported with both 'import' and 'import from'.
Module 'onnxruntime.test.onnx' is imported with both 'import' and 'import from'.
class ConformerOnnxModel(BertOnnxModel):
def __init__(self, model, num_heads, hidden_size):
super().__init__(model, num_heads, hidden_size)
self.attention_mask = AttentionMask(self)

Check warning: Code scanning / CodeQL

Overwriting attribute in super-class or sub-class: assignment overwrites attribute attention_mask, which was previously defined in superclass BertOnnxModel.
def __init__(self, model, num_heads, hidden_size):
super().__init__(model, num_heads, hidden_size)
self.attention_mask = AttentionMask(self)
self.attention_fusion = FusionConformerAttention(self, self.hidden_size, self.num_heads, self.attention_mask)

Check warning: Code scanning / CodeQL

Overwriting attribute in super-class or sub-class: assignment overwrites attribute attention_fusion, which was previously defined in superclass BertOnnxModel.
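The two CodeQL warnings flag re-assigning attributes that `BertOnnxModel.__init__` already set. One common way to avoid this pattern is to have the base class build the attribute through a factory method that the subclass overrides, so each attribute is written exactly once. The class names below are simplified stand-ins, not the actual onnxruntime classes:

```python
# Sketch of the pattern CodeQL flags, and one way to restructure it.
# Class and attribute names are simplified stand-ins for
# BertOnnxModel / ConformerOnnxModel.
class BaseModel:
    def __init__(self, num_heads):
        self.num_heads = num_heads
        self.attention_fusion = "base-fusion"  # later overwritten by subclass


class ConformerModel(BaseModel):
    def __init__(self, num_heads):
        super().__init__(num_heads)
        # Overwrites the attribute the base class just set -> CodeQL warning.
        self.attention_fusion = "conformer-fusion"


class BaseModelHooked:
    def __init__(self, num_heads):
        self.num_heads = num_heads
        self.attention_fusion = self.make_attention_fusion()

    def make_attention_fusion(self):
        return "base-fusion"


class ConformerModelHooked(BaseModelHooked):
    # Override the factory method instead of re-assigning the attribute,
    # so each attribute is assigned in exactly one place.
    def make_attention_fusion(self):
        return "conformer-fusion"


print(ConformerModel(8).attention_fusion)        # conformer-fusion
print(ConformerModelHooked(8).attention_fusion)  # conformer-fusion
```

Both variants produce the same final state; the hooked version just avoids the wasted base-class assignment that triggers the warning.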
@apsonawane apsonawane force-pushed the asonawane/conformer branch 2 times, most recently from d2ea8e5 to 93e9bda Compare November 18, 2023 12:00
@apsonawane apsonawane merged commit 97cc40d into main Nov 19, 2023
90 of 91 checks passed
@apsonawane apsonawane deleted the asonawane/conformer branch November 19, 2023 07:39
kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request Mar 22, 2024