Multi_zh-Hans Recipe #1238
Conversation
script for fbank computation not done yet
@csukuangfj all requested changes have been applied, thank you!
Hi, I was exploring this recipe and noticed that the Chinese modeling unit has been adjusted from char to high-freq-char + byte-symbol. I'm curious about the rationale behind this change and would love to understand the thought process. Could you please shed some light on why this decision was made? Moreover, I'm wondering if this modification is expected to improve performance in any way? If so, could you elaborate on how and in which specific scenarios this might be beneficial? Thank you very much for your time and for sharing this valuable recipe! Looking forward to your insights.
Dear User,
Thanks for raising the question!
The motivation for this modeling-unit partitioning protocol is that when we looked into the histogram of characters exported from all the corpora involved, we found that the distribution of the characters was highly unbalanced, i.e. a small portion of the characters accounts for most of the appearances in the training data.
Of course you can use the vanilla char-based modeling solution, but the main concern is that you will eventually end up with an output layer with an enormous number of parameters (I can't recall the exact number), and most of them might not be properly trained.
Best Regards
Jin
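To make the idea concrete, here is a minimal sketch (not icefall's actual implementation; vocabulary size, coverage threshold, and byte-symbol spelling are assumptions) of a high-freq-char + byte-symbol scheme: characters covering most of the training data keep their own tokens, while rare characters fall back to their UTF-8 byte symbols, which bounds the output-layer size:

```python
# Illustrative sketch of a high-freq-char + byte-fallback tokenizer.
# This is NOT the recipe's real code; names and thresholds are made up.
from collections import Counter


def build_vocab(corpus, coverage=0.99):
    """Keep the most frequent characters until `coverage` of all
    character occurrences in `corpus` is reached."""
    counts = Counter(ch for line in corpus for ch in line)
    total = sum(counts.values())
    vocab, acc = set(), 0
    for ch, n in counts.most_common():
        vocab.add(ch)
        acc += n
        if acc / total >= coverage:
            break
    return vocab


def tokenize(text, vocab):
    """High-frequency characters stay as single tokens; any other
    character is expanded into symbols like <0xE4> for its UTF-8 bytes."""
    tokens = []
    for ch in text:
        if ch in vocab:
            tokens.append(ch)
        else:
            tokens.extend(f"<0x{b:02X}>" for b in ch.encode("utf-8"))
    return tokens
```

With this scheme the output layer only needs the high-frequency characters plus at most 256 byte symbols, instead of one unit per character ever seen in the corpora, so rare characters no longer each claim a poorly trained row of the softmax.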
This PR includes scripts for training a Zipformer model on multiple Chinese datasets.
Included Training Sets
Included Test Sets