random voices and accents appearing #791

tholonia · 2024-06-11T16:56:34Z

Hello, I am new to Tortoise, so I may be overlooking something obvious, but I notice that when I read in a text file with

scripts/tortoise_tts.py -p fast -v william < file1.txt
scripts/tortoise_tts.py -p fast -v william < file2.txt
...

occasionally, the voice switches to "emma" and then switches back to "william" on the next sentence in the same file. I created a new voice using samples of myself and used that, which sounded great, but again, occasionally my new voice would have an Indian accent (which I do not have). Here is an example comparison.

Accent example:
https://github.com/neonbjb/tortoise-tts/assets/56138158/5f4d7a2f-55fb-4ce5-befb-6e77f2d31a20

2_Q.mp4

emma/william example:

2_R.mp4

Is this something that can be "fixed" in some way?

There's another weird quirk where it occasionally repeats one or two of the previous words, so the text "It's apparent to all" comes out as "It's apparent apparent to all". The repeated words tend to be longer ones, three syllables or more.
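To track down which clips have this repetition without listening to every one, a rough check like the following might help. It is only a sketch and rests on an assumption: that a text transcript of the generated audio is available from some separate speech-to-text tool, which is not part of Tortoise. The function name `find_repeats` is hypothetical.

```python
import re

def find_repeats(transcript, max_len=2):
    """Return (word_index, phrase) pairs where a run of 1..max_len words
    is immediately followed by an identical run."""
    # Lowercase and keep only letters and apostrophes, so "It's" stays one word.
    words = re.findall(r"[a-z']+", transcript.lower())
    hits = []
    for n in range(1, max_len + 1):
        for i in range(len(words) - 2 * n + 1):
            if words[i:i + n] == words[i + n:i + 2 * n]:
                hits.append((i, " ".join(words[i:i + n])))
    return hits

# The transcript string here is the example from the post, typed in by hand.
print(find_repeats("It's apparent apparent to all"))
```

Note that this would also flag legitimate doubled words in the source text (such as "had had"), so comparing against the same check run on the input text is probably worthwhile.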

Also, what determines the time it takes to process some text? The following sentence:

I'd like to establish some new definitions. I'd like to use the term 'awareness' as the archetype and 'consciousness' as an instance of awareness.

took 15m44s and produced a 241.6 KB file, but the sentence

Yes, if all of reality is a product of awareness, then everything that exists can be viewed as an instance of consciousness, manifesting in varying degrees and forms throughout the universe.

took only 4m43s and produced a 727 KB file.
So, what makes a file about three times as large, with roughly 25% more text, get processed more than 3x faster than the smaller file? (Note: This is also the file that voice-switched to "emma". Are some voices "faster" than others?)
In these examples, the exact same code was used; the only difference was the text file. Nothing else was running on the system or the GPU (GTX 4090Ti).
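The ratios in that comparison can be checked with a few lines of arithmetic. This sketch uses only the figures quoted above (text, wall-clock time, output size); the labels and the derived rates are mine, and reading file size as a stand-in for audio length assumes both files use the same format and bitrate.

```python
# Figures copied from the post; the per-character and per-second rates are derived.
runs = [
    {
        "label": "definitions sentence",
        "text": ("I'd like to establish some new definitions. I'd like to use the term "
                 "'awareness' as the archetype and 'consciousness' as an instance of awareness."),
        "seconds": 15 * 60 + 44,
        "size_kb": 241.6,
    },
    {
        "label": "reality sentence",
        "text": ("Yes, if all of reality is a product of awareness, then everything that "
                 "exists can be viewed as an instance of consciousness, manifesting in "
                 "varying degrees and forms throughout the universe."),
        "seconds": 4 * 60 + 43,
        "size_kb": 727.0,
    },
]

for run in runs:
    chars = len(run["text"])
    print(f'{run["label"]}: {chars} chars, {run["seconds"]} s, '
          f'{run["seconds"] / chars:.2f} s/char, {run["size_kb"] / run["seconds"]:.2f} KB/s')

slow, fast = runs
time_ratio = slow["seconds"] / fast["seconds"]      # how much longer the slow run took
size_ratio = fast["size_kb"] / slow["size_kb"]      # how much larger the fast run's file is
text_ratio = len(fast["text"]) / len(slow["text"])  # how much more text the fast run had
print(f"time ratio {time_ratio:.2f}x, size ratio {size_ratio:.2f}x, text ratio {text_ratio:.2f}x")
```

If file size does track audio length, the slower run took over three times as long to produce roughly a third as much audio, so the gap per second of output is larger than the raw times suggest.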

tholonia · 2024-06-11T17:24:23Z

Follow-up: When I re-run TTS on the file that voice-changes to "emma", the voice stays "emma" and sometimes never switches back to "william". This happens only with this file. Its contents are:

Yes, that's correct. Perception and observation are products of awareness. Awareness allows for the detection and interpretation of sensory information, making observation an act of awareness that influences quantum phenomena

Is there something in this text that is 'doing' something?
