Guided Synthesis #252

Merged on Mar 10, 2022 (32 commits)

Commits
a73892b
forced alignment, f0 extraction and entry point
Patchethium Dec 28, 2021
28cf7c2
Merge branch 'master' into guided_synthesis
Patchethium Dec 28, 2021
a060398
kind of finished
Patchethium Dec 28, 2021
f7a3713
change julius4seg, doesn't seem to help
Patchethium Dec 29, 2021
6b0651f
run pysen format
Patchethium Dec 29, 2021
f1a663a
add speaker id to api
Patchethium Dec 29, 2021
668df80
run pysen format
Patchethium Dec 29, 2021
ad4bdbd
add accent_phrase api, finish
Patchethium Dec 30, 2021
ea95405
add request parameter
Patchethium Dec 30, 2021
6dff2ec
improve error handling
Patchethium Dec 30, 2021
34eec39
run pysen format
Patchethium Dec 30, 2021
a0cba4d
add parameters
Patchethium Dec 30, 2021
90e41e2
run pysen format
Patchethium Dec 30, 2021
e889207
a little boundary check
Patchethium Dec 30, 2021
c98c8be
add normalization for different WAV format
Patchethium Dec 31, 2021
1c6d96e
run format
Patchethium Dec 31, 2021
2d74993
run format
Patchethium Dec 31, 2021
ca356df
Merge branch 'master' into guided_synthesis
Patchethium Dec 31, 2021
f088176
move synthesis and accent phrase to synthesis engine
Patchethium Dec 31, 2021
cf18c3c
add test for mock
Patchethium Dec 31, 2021
98d387c
change url for apis
Patchethium Dec 31, 2021
48b629f
simplify
Patchethium Dec 31, 2021
061483c
error type
Patchethium Jan 11, 2022
fc45886
Merge branch 'master' into guided_synthesis
Patchethium Jan 24, 2022
0e26bbb
do something
Patchethium Feb 21, 2022
365ed92
do something
Patchethium Feb 21, 2022
29427d9
run format
Patchethium Feb 21, 2022
ddc6537
Merge branch 'master' into guided_synthesis
Patchethium Feb 21, 2022
ca6df3b
resolve conflict
Patchethium Feb 21, 2022
730917f
add usage to README
Patchethium Feb 22, 2022
3522370
Merge branch 'master' into guided_synthesis
Patchethium Feb 27, 2022
9b75c6c
add comments and experimental flag for guided api
Patchethium Mar 9, 2022
run format
Patchethium committed Dec 31, 2021

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
commit 2d749930373a9ba1348f0415cf69c590cbe86a65
26 changes: 13 additions & 13 deletions run.py
@@ -215,10 +215,10 @@ def accent_phrases(
     summary="Create Audio Query Guided by External Audio",
 )
 def guided_accent_phrase(
-    kana: str = Form(...),
-    speaker_id: int = Form(...),
-    normalize: int = Form(...),
-    audio_file: UploadFile = File(...),
+    kana: str = Form(...),  # noqa: B008
+    speaker_id: int = Form(...),  # noqa: B008
+    normalize: int = Form(...),  # noqa: B008
+    audio_file: UploadFile = File(...),  # noqa: B008
 ):
     try:
         accent_phrases = guided.accent_phrase(
@@ -448,15 +448,15 @@ def _synthesis_morphing(
     summary="Audio synthesis guided by external audio and phonemes",
 )
 def guided_synthesis(
-    kana: str = Form(...),
-    speaker_id: int = Form(...),
-    normalize: int = Form(...),
-    audio_file: UploadFile = File(...),
-    stereo: int = Form(...),
-    sample_rate: int = Form(...),
-    volumeScale: float = Form(...),
-    pitchScale: float = Form(...),
-    speedScale: float = Form(...),
+    kana: str = Form(...),  # noqa: B008
+    speaker_id: int = Form(...),  # noqa: B008
+    normalize: int = Form(...),  # noqa: B008
+    audio_file: UploadFile = File(...),  # noqa: B008
+    stereo: int = Form(...),  # noqa: B008
+    sample_rate: int = Form(...),  # noqa: B008
+    volumeScale: float = Form(...),  # noqa: B008
+    pitchScale: float = Form(...),  # noqa: B008
+    speedScale: float = Form(...),  # noqa: B008
 ):
     try:
         wave = guided.synthesis(
2 changes: 1 addition & 1 deletion voicevox_engine/guided.py
@@ -36,7 +36,7 @@
     "tmp.wav",
 ]

-_JULIUS_DICTATION_URL = "https://github.com/julius-speech/dictation-kit/archive/refs/tags/dictation-kit-v4.3.1.tar.gz"
+_JULIUS_DICTATION_URL = "https://github.com/julius-speech/dictation-kit/archive/refs/tags/dictation-kit-v4.3.1.tar.gz"  # noqa: B950
 JULIUS_DICTATION_DIR = os.environ.get(
     "JULIUS_DICTATION_DIR",
     # they did put two "dictation-kit"s in extracted folder name