Replies: 1 comment
-
Follow-up: When I re-run TTS on the file whose voice changes to "emma", the voice stays "emma", and sometimes never switches back to "William". This happens only with this one file. Its contents are:
Is there something in this text that is 'doing' something?
-
Hello, I am new to Tortoise, so I may be overlooking something obvious, but I notice that when I read in a text file with
occasionally, the voice switches to "emma" and then switches back to "William" on the next sentence in the same file. I created a new voice using samples of myself and used that, which sounded great, but again, occasionally my new voice would have an Indian accent (which I do not have). Here is an example comparison.
Accent example:
https://github.com/neonbjb/tortoise-tts/assets/56138158/5f4d7a2f-55fb-4ce5-befb-6e77f2d31a20
2_Q.mp4
emma/william example:
2_R.mp4
Is this something that can be "fixed" in some way?
There's another weird quirk where it occasionally repeats one or two previous words, so the text "It's apparent to all" comes out as "It's apparent apparent to all". The repeated words tend to be longer ones, three syllables or more.
Also, what determines the time it takes to process some text? The following sentence:
took 15m44s and the file is 241.6 KB, but the sentence
took only 4m43s and is 727 KB.
So, what makes a file about three times as large, with 25% more text, get processed more than 3x faster than the smaller file? (Note: this is also the file that voice-switched to "emma". Are some voices 'faster' than others?)
In these examples, the exact same code was used; the only difference was the text file. Nothing else was running on the system or the GPU (GTX 4090Ti).
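For reference, the arithmetic behind those two runs can be checked with a short script. This is only a sketch: it assumes "Kb" means kilobytes of output audio, and the function and variable names are made up for illustration and are not part of Tortoise.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a wall-clock duration such as 15m44s to seconds."""
    return minutes * 60 + seconds


# Figures quoted above; sizes are assumed to be kilobytes of output audio.
slow_s = to_seconds(15, 44)   # run that produced the smaller file
fast_s = to_seconds(4, 43)    # run that produced the larger file
slow_kb = 241.6
fast_kb = 727.0

# How much longer the slow run took, and how much bigger the fast run's file is.
time_ratio = slow_s / fast_s
size_ratio = fast_kb / slow_kb

# Output produced per second of processing, for each run.
slow_rate = slow_kb / slow_s
fast_rate = fast_kb / fast_s
rate_ratio = fast_rate / slow_rate

print(f"slow run: {slow_s} s, fast run: {fast_s} s")
print(f"time ratio (slow / fast): {time_ratio:.2f}x")
print(f"size ratio (large / small): {size_ratio:.2f}x")
print(f"output rate: {slow_rate:.3f} KB/s vs {fast_rate:.3f} KB/s")
print(f"rate ratio: {rate_ratio:.1f}x")
```

Measured per kilobyte of output rather than per sentence, the gap between the two runs is considerably wider than the raw processing times suggest.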