Add Zipformer recipe for GigaSpeech #1254
yfyeung commented on Sep 14, 2023 (edited)
- Add Zipformer recipe for GigaSpeech
- Upload checkpoints, exported models, logs, and bpe models trained using the GigaSpeech XL dataset to https://huggingface.co/yfyeung/icefall-asr-gigaspeech-zipformer-2023-10-17
- Update https://github.com/k2-fsa/icefall/blob/master/egs/gigaspeech/ASR/RESULTS.md
- Update https://github.com/k2-fsa/icefall/blob/master/README.md
- Update https://github.com/yfyeung/icefall/blob/phone2/egs/gigaspeech/ASR/README.md
@yfyeung do you have a pretrained model you could share?
@desh2608 Sure, check https://huggingface.co/yfyeung/icefall-asr-gigaspeech-zipformer-2023-10-17
README.md
Outdated
@@ -148,8 +148,11 @@ in the decoding.

### GigaSpeech

We provide two models for this recipe: [Conformer CTC model][GigaSpeech_conformer_ctc]
and [Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss][GigaSpeech_pruned_transducer_stateless2].
We provide three models for this recipe: [Zipformer]
OK, I removed it. It's redundant.
Thanks!
This PR gives the state-of-the-art WER for GigaSpeech in icefall!
Left some minor comments.
@@ -0,0 +1,444 @@
# Copyright 2021 Piotr Żelasko
Is it possible to replace it with a symlink?
I have modified this file. Previously, the GigaSpeech XL splits had to be merged; this version uses lhotse.mux instead.
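For readers unfamiliar with multiplexing, the idea behind that reply can be sketched in plain Python: instead of merging the splits into one file up front, items are drawn lazily from each split in a weighted random order. This is only an illustrative stand-in written with the standard library, not lhotse's implementation; the function name `mux`, the weights, and the split contents below are assumptions made for the example.

```python
import random
from typing import Iterable, Iterator, Optional, Sequence, TypeVar

T = TypeVar("T")


def mux(
    *sources: Iterable[T],
    weights: Optional[Sequence[float]] = None,
    seed: int = 0,
) -> Iterator[T]:
    """Lazily interleave several iterables by weighted random sampling.

    Illustrative sketch only; this is not lhotse's code.
    """
    rng = random.Random(seed)
    iters = [iter(s) for s in sources]
    w = list(weights) if weights is not None else [1.0] * len(iters)
    if len(w) != len(iters):
        raise ValueError("need exactly one weight per source")
    while iters:
        # Pick which source to draw from next, proportionally to its weight.
        idx = rng.choices(range(len(iters)), weights=w, k=1)[0]
        try:
            yield next(iters[idx])
        except StopIteration:
            # This source is exhausted: drop it and keep sampling the rest.
            del iters[idx]
            del w[idx]


# Hypothetical stand-ins for per-split cut manifests.
splits = [
    [f"split0-cut{i}" for i in range(3)],
    [f"split1-cut{i}" for i in range(5)],
]
mixed = list(mux(*splits, weights=[3, 5], seed=0))
```

Every item appears exactly once, and the relative order inside each split is preserved, so nothing has to be concatenated or shuffled on disk beforehand.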
@@ -0,0 +1,436 @@
#!/usr/bin/env python3
Can you replace it with a symlink?
Sure.
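As a minimal sketch of what such a replacement involves, the snippet below swaps a byte-identical copy of a file for a relative symlink to the original. The helper name `replace_with_symlink` and the directory and file names are hypothetical and chosen only for the demonstration; they are not the paths touched by this PR.

```python
import os
import tempfile
from pathlib import Path


def replace_with_symlink(duplicate: Path, original: Path) -> None:
    """Replace `duplicate` with a relative symlink pointing at `original`.

    Refuses to act unless both files have identical contents.
    """
    if duplicate.read_bytes() != original.read_bytes():
        raise ValueError(f"{duplicate} differs from {original}; keep the copy")
    # A relative target keeps the link valid when the repository is moved.
    target = os.path.relpath(original, start=duplicate.parent)
    duplicate.unlink()
    duplicate.symlink_to(target)


# Demonstration in a temporary directory (all names are hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "recipe_a").mkdir()
    (root / "recipe_b").mkdir()
    original = root / "recipe_a" / "model.py"
    duplicate = root / "recipe_b" / "model.py"
    original.write_text("print('shared')\n")
    duplicate.write_text("print('shared')\n")

    replace_with_symlink(duplicate, original)

    is_link = duplicate.is_symlink()
    link_target = os.readlink(duplicate)
    same_text = duplicate.read_text() == original.read_text()
```

The content check is the reason a modified copy, such as a file with an updated link in its comments, stays a regular file.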
@@ -0,0 +1,775 @@
#!/usr/bin/env python3
Can you replace it with a symlink?
Sure.
@@ -0,0 +1,522 @@
#!/usr/bin/env python3
Can you replace it with a symlink?
I changed some comments in it, such as the Hugging Face link.
@@ -0,0 +1,280 @@
#!/usr/bin/env python3
# Copyright 2021-2023 Xiaomi Corporation (Author: Fangjun Kuang, Zengwei Yao)
Can you replace it with a symlink?
Sure.
@@ -0,0 +1,436 @@
#!/usr/bin/env python3
Can you replace it with a symlink?
Sure.
@@ -0,0 +1,273 @@
#!/usr/bin/env python3
Can you replace it with a symlink?
Sure.
@@ -0,0 +1,240 @@
#!/usr/bin/env python3
Can you replace it with a symlink?
Sure.
- normal-scaled model, number of model parameters: 65549011, i.e., 65.55 M

You can find a pretrained model, training logs, decoding logs, and decoding results at:
<https://huggingface.co/yfyeung/icefall-asr-gigaspeech-zipformer-2023-10-17>
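The "65.55 M" figure in the excerpt above is the raw parameter count divided by one million and rounded to two decimals. A small sketch of that arithmetic follows; the helper name `format_param_count` is hypothetical, and the thread does not show how the raw count itself was obtained.

```python
def format_param_count(num_params: int) -> str:
    """Express a raw parameter count in millions with two decimal places."""
    return f"{num_params / 1e6:.2f} M"


# 65549011 is the count quoted in RESULTS.md for the normal-scaled model.
# For a PyTorch model, such a count is commonly computed as
# sum(p.numel() for p in model.parameters()); that step is an assumption here.
summary = format_param_count(65549011)
```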
Could you upload the TensorBoard log to https://wandb.ai/site and post a link to it here?
OK.
By the way, the link has not been posted yet. Please leave a message when you think this PR is ready to merge.
OK, I have posted that.
- normal-scaled model, number of model parameters: 65549011, i.e., 65.55 M

You can find a pretrained model, training logs, decoding logs, and decoding results at:
<https://huggingface.co/yfyeung/icefall-asr-gigaspeech-zipformer-2023-10-17>
Could you add a CI test for your model?
OK, I have added it.
Please re-review this PR. @csukuangfj
Thanks!
I want a brand new streaming English model