Voice Cloning: When to Fine-Tune Pretrained TTS Models and How Much Data is Needed? #102
Unanswered
ClaudiuFilip110
asked this question in
Q&A
Replies: 1 comment
Hi, I'm an ML engineer with a few years of experience, but I'm new to TTS and audio ML. My background is primarily in NLP and LLMs. I'm now working on a voice cloning project, and the transition into TTS has been a bit confusing.
Here's my plan. If you have any tips or suggestions, please feel free to add them here.
- English voice cloning
- Voice cloning in another language
I'm planning to train my own model from scratch (or from a checkpoint).
P.S. Any tips are welcome. As I said, I'm quite the novice when it comes to anything related to audio ML.