Init commit for swbd #1146
typo fixed
removed commented scripts
fixed a KeyError that occurred during the computation of WER for nbest-oracle
Updated nbest-oracle WERs
Lower WERs reported
Thanks, this looks very promising. There are also 3 other stale PRs for SWBD; one of them is mine, and you can use it for data preparation with Lhotse. It also uses Fisher and ports most of the text normalization etc. from Kaldi to Python. See here: https://github.com/k2-fsa/icefall/pull/184/files
Thank you so much! This is really helpful!
Let me see if I can have access to Fisher and test this.
On Jul 17, 2023, at 21:47, Piotr Żelasko ***@***.***> wrote:
Thanks, this looks very promising. There are also 3 other stale PRs for SWBD
<https://github.com/k2-fsa/icefall/pulls?q=is%3Apr+is%3Aopen+swbd>, one of
them is mine -- you can use it for data preparation with Lhotse. It also
uses Fisher and ports most of text norm etc from Kaldi to Python. See here:
https://github.com/k2-fsa/icefall/pull/184/files
@JinZr any progress on this? I was looking for a pretrained English narrowband Zipformer model.
Dear Desh,
This PR is pending review now and should be ready to merge at any moment. I
intend to make a separate PR for Zipformer, as there are still some
parameters to tune.
Best Regards
On Thu, Sep 28, 2023 at 22:28, Desh Raj ***@***.***> wrote:
@JinZr <https://github.com/JinZr> any progress on this? I was looking for
some pretrained English narrowband zipformer model.
Thanks!
@desh2608 Dear Desh, the PR has been merged; please check. Thank you! Best
Hello, I was wondering about the sampling rate of the swbd data in the recipe: the original swbd data is 8 kHz, but here you are specifying 16 kHz instead. Is there a reason for this? Thanks.
Dear Huangruizhe,
The resampling here is for the MUSAN-based data augmentation that we apply
later. You can skip the resampling stage if you want to drop the
MUSAN part.
Hope this helps.
Best Regards
Jin
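
For readers following the thread, here is a minimal sketch of why the two sampling rates have to agree before noise can be mixed in. It is an illustration only, not the recipe's or Lhotse's implementation: the helper names `rms` and `mix_at_snr` are hypothetical, and the signals are plain Python lists. MUSAN is distributed at 16 kHz, so 8 kHz Switchboard audio cannot be added to it sample by sample until one of the two is resampled.

```python
import math


def rms(x):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def mix_at_snr(speech, speech_sr, noise, noise_sr, snr_db):
    """Add noise to speech at the requested SNR (hypothetical helper).

    Samples can only be added one-to-one when both signals use the same
    sampling rate, so a mismatch is rejected up front.
    """
    if speech_sr != noise_sr:
        raise ValueError(
            f"sampling rates differ ({speech_sr} vs {noise_sr}); resample first"
        )
    # Repeat the noise if it is too short, then cut it to the speech length.
    reps = -(-len(speech) // len(noise))  # ceiling division
    noise = (noise * reps)[: len(speech)]
    # Scale the noise so that rms(speech) / rms(scaled noise) matches snr_db.
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(speech, noise)]


# Toy signals: a 440 Hz tone as "speech" and a deterministic pseudo-noise.
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
noise = [((t * 7919) % 101) / 50.0 - 1.0 for t in range(400)]
mixed = mix_at_snr(speech, 16000, noise, 16000, snr_db=10.0)
```

Calling `mix_at_snr(speech, 8000, noise, 16000, 10.0)` raises a `ValueError`, which mirrors the situation described above: either the speech is upsampled to 16 kHz or the noise is downsampled to 8 kHz before mixing.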
On Fri, Oct 13, 2023 at 00:42, huangruizhe ***@***.***> wrote:
Hello, I was wondering about the sampling rate of swbd data in the recipe:
https://github.com/k2-fsa/icefall/blob/master/egs/swbd/ASR/local/compute_fbank_swbd.py#L108
The original swbd data is 8kHz
<https://github.com/lhotse-speech/lhotse/blob/master/lhotse/recipes/switchboard.py#L4>,
but here
<https://github.com/k2-fsa/icefall/blob/master/egs/swbd/ASR/local/compute_fbank_swbd.py#L108>
you are specifying 16kHz instead. Is there any reason we do this? Thanks.
Thanks for the explanation. Would it be more natural to downsample MUSAN to 8kHz instead (see, e.g., pyannote/pyannote-audio#300)? If we are using upsampled 16kHz swbd audio to compute features -- fbanks here -- we are probably wasting the 8kHz~16kHz part in the feature, as it contains no information except noise. Could this degrade the performance?
Dear Huangruizhe,
I haven’t run any experiments on the performance, but I suppose
that makes sense, especially when you want a system for narrowband
scenarios.
I’ll work on that later.
Best Regards
Jin
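
The concern raised in this exchange can be illustrated with a small, self-contained sketch. Audio sampled at 8 kHz carries content only up to 4 kHz, so after upsampling to 16 kHz the added band (4 to 8 kHz in frequency terms) holds essentially nothing. The code below is a toy under stated assumptions: it uses a naive DFT and spectrum zero-padding rather than the resampler used by the recipe, the tone frequencies are arbitrary, and `dft`, `upsample_2x`, and `band_energy` are hypothetical helper names.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (fine for a toy example)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def upsample_2x(x):
    """Band-limited 2x upsampling by zero-padding the spectrum.

    Assumes len(x) is even and the signal is periodic within the window.
    """
    n = len(x)
    half = n // 2
    spectrum = dft(x)
    # Keep the original positive and negative frequencies and put zeros in
    # between; the Nyquist bin of the input is dropped.
    padded = spectrum[:half] + [0j] * (n + 1) + spectrum[half + 1:]
    m = 2 * n
    # Inverse DFT; dividing by n (not m) preserves the signal amplitude.
    return [
        (sum(padded[k] * cmath.exp(2j * math.pi * k * t / m) for k in range(m)) / n).real
        for t in range(m)
    ]


def band_energy(x, sample_rate, lo_hz, hi_hz):
    """Energy of the DFT bins whose frequency lies in [lo_hz, hi_hz)."""
    n = len(x)
    spectrum = dft(x)
    return sum(
        abs(spectrum[k]) ** 2
        for k in range(n // 2 + 1)
        if lo_hz <= k * sample_rate / n < hi_hz
    )


# Two arbitrary tones below 4 kHz, sampled at 8 kHz (64 samples).
narrowband = [
    math.sin(2 * math.pi * 625 * t / 8000)
    + 0.5 * math.sin(2 * math.pi * 2500 * t / 8000)
    for t in range(64)
]
wideband = upsample_2x(narrowband)  # 128 samples at 16 kHz

low = band_energy(wideband, 16000, 0, 4000)
high = band_energy(wideband, 16000, 4000, 8001)
print(f"fraction of energy in 4-8 kHz: {high / (low + high):.2e}")
```

In this idealized setting the upper band is empty up to floating-point error. A practical resampler uses a finite filter and real telephone speech contains noise, so the upper band would not be exactly zero there, but it would still carry no speech information, which is the point made in the comment above.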
On Fri, Oct 13, 2023 at 01:39, huangruizhe ***@***.***> wrote:
Thanks for the explanation. Is it more natural to downsample MUSAN to 8kHz
instead? E.g., pyannote/pyannote-audio#300
<pyannote/pyannote-audio#300>
If we are using upsampled 16kHz swbd audio to compute features -- fbanks
here -- we are probably wasting the 8kHz~16kHz part in the feature, as it
contains no information except noise. By speculation, will it degrade the
performance?
Yes, I agree. Looking forward to the updates.
Dear huangruizhe,
Sorry to have kept you waiting. A new narrowband setup has been created in
#1391; the PR is currently pending
results from the latest experiments.
Stay tuned!
best
jin
On Fri, Oct 13, 2023 at 2:04 AM huangruizhe ***@***.***> wrote:
Yes, I agree. Looking forward to the updates.
This pull request creates the initial commit for the Switchboard dataset. It is currently very rough and produces a high Word Error Rate, so it needs further improvement and refinement.
For more information, please refer to the README and RESULTS markdown files.