[G2pPack + Phonetic Assistant] Give same phonetic result for uppercase and lowercase graphemes #1209

lottev1991 · 2024-07-08T02:39:24Z

Currently, in the Phonetic Assistant as well as the DiffSinger G2P phonemizers, uppercase graphemes get a different phonetic result than lowercase graphemes. This is inconvenient, since the end user may capitalize a word in one place and not in another. If the end user wants to use a different pronunciation, they can use number suffixes, e.g. the(1).

In theory, this issue could affect any G2P-powered function (such as phonemizers), but in practice it currently only affects the Phonetic Assistant as well as the DiffSinger G2P phonemizers.

What this PR does NOT do

Affect SP and AP (this has been tested). If they are defined in the dictionary, or the dictionary contains no graphemes, they will work normally. Note that they have to be defined in their uppercase form in the dsdict if there are any conflicting graphemes (e.g. lowercase sp and/or ap); however, this is already the case today.
Affect capitalized graphemes that are manually defined in a custom dsdict (e.g. KA vs. ka). Related to the above, these are left untouched, so you can still distinguish by capitalization manually if so desired.
Affect phonemes. This affects G2P graphemes (e.g. words) only.
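
The lookup order described above can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not the code from this PR: `lookup`, `custom_dict` and `g2p_predict` are hypothetical names, and the assumption that a lowercase dictionary entry also matches capitalized input is inferred from the note about conflicting sp/ap entries.

```python
def lookup(grapheme, custom_dict, g2p_predict):
    """Resolve a grapheme to a phoneme list (hypothetical helper).

    custom_dict: hand-written entries (e.g. from a dsdict), keyed by grapheme.
    g2p_predict: fallback predictor; it only ever sees lowercase input.
    """
    # An exact, case-sensitive hit wins, so SP, AP or KA stay distinct
    # from sp, ap or ka when both spellings are defined.
    if grapheme in custom_dict:
        return custom_dict[grapheme]

    # Otherwise fold the case, so "The" and "the" give the same result.
    folded = grapheme.lower()
    if folded in custom_dict:
        return custom_dict[folded]

    # Nothing defined by hand: let the G2P model predict from lowercase.
    return g2p_predict(folded)
```

With a dictionary that defines both KA and ka, `lookup("KA", ...)` returns the KA entry while `lookup("Ka", ...)` falls through to ka, which matches the behavior listed above.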

stakira · 2024-09-01T20:52:02Z

It's not always correct to do this. Acronyms like CIA should be pronounced differently.

lottev1991 · 2024-09-01T22:37:50Z

It works like this in the classic phonemizers as well, so I wanted it to work the same across the board. Perhaps I could ignore all-caps instances though.

stakira · 2024-09-03T00:38:43Z

That should be a decision per phonemizer. If it's a Japanese one, where ka, KA and Ka should all be treated the same, sure. For English, uppercase and lowercase shouldn't be treated the same.
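
One way to picture a per-phonemizer decision, combined with the all-caps exception suggested earlier in the thread, is the sketch below. It is only an illustration in Python under stated assumptions: `CasePolicy` and `normalize` are hypothetical names, nothing here is taken from the OpenUtau codebase, and keeping only all-caps tokens is a narrower rule than never folding case for English.

```python
from enum import Enum


class CasePolicy(Enum):
    """Hypothetical per-phonemizer setting for grapheme case handling."""
    FOLD_ALL = "fold_all"            # e.g. Japanese: ka, KA and Ka are the same
    KEEP_ALL_CAPS = "keep_all_caps"  # e.g. English: CIA stays an acronym


def normalize(grapheme, policy):
    """Return the form of the grapheme that is passed on to G2P."""
    # Multi-letter all-caps tokens are left alone under KEEP_ALL_CAPS so
    # they can be handled as acronyms; single letters such as "I" still fold.
    if (policy is CasePolicy.KEEP_ALL_CAPS
            and len(grapheme) > 1
            and grapheme.isupper()):
        return grapheme
    return grapheme.lower()
```

Under this sketch a Japanese phonemizer would pick `FOLD_ALL`, while an English one would pick `KEEP_ALL_CAPS`, so "The" still folds to "the" but "CIA" is passed through unchanged.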

lottev1991 added 3 commits July 8, 2024 04:20

No longer distinguish between uppercase and lowercase letters

b980d36

Fix for Phonetic Assistant too

634e396

Switch capitalization fix to G2pPack

ead094c

lottev1991 changed the title ~~[G2P Remapper + Phonetic Assistant] Give same phonetic result for uppercase and lowercase graphemes~~ [G2pPack + Phonetic Assistant] Give same phonetic result for uppercase and lowercase graphemes Jul 9, 2024

lottev1991 added 6 commits July 23, 2024 20:25

Merge branch 'stakira:master' into G2pCapitalLetterFix

8b040de

Merge branch 'stakira:master' into G2pCapitalLetterFix

ee89beb

Merge branch 'stakira:master' into G2pCapitalLetterFix

e523b32

Merge branch 'stakira:master' into G2pCapitalLetterFix

c41c1ae

Merge branch 'stakira:master' into G2pCapitalLetterFix

ca5dcbf

Merge branch 'stakira:master' into G2pCapitalLetterFix

2ed9948

stakira force-pushed the master branch from 73824ad to 443ea52 Compare August 3, 2024 20:03
