Describe the bug
I have the following code:
# I generate the target audio file with cloning
path = self.model.tts_to_file(text=text, speaker_wav=speaker_wav, language=language,
                              file_path=f"/tmp/output_{output_random}.wav")
# To increase the cloning quality, I force the generated audio to be converted with the original speaker audio.
# I don't know if this makes sense or not, but the bug exists :-)
self.conversion.voice_conversion_to_file(path, speaker_wav, file_path=new_output_path)
To Reproduce
Generate an audio file with cloning.
Take the generated audio and use the conversion method with the source audio.
Expected behavior
No response
Logs
Traceback (most recent call last):
File "/root/.pyenv/versions/3.12.6/lib/python3.12/site-packages/cog/server/worker.py", line 349, in _predict
result = predict(**payload)
^^^^^^^^^^^^^^^^^^
File "/src/predict.py", line 55, in predict
self.conversion.voice_conversion_to_file(path, speaker_wav, file_path=new_output_path)
File "/root/.pyenv/versions/3.12.6/lib/python3.12/site-packages/TTS/api.py", line 377, in voice_conversion_to_file
wav = self.voice_conversion(source_wav=source_wav, target_wav=target_wav)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.pyenv/versions/3.12.6/lib/python3.12/site-packages/TTS/api.py", line 358, in voice_conversion
wav = self.voice_converter.voice_conversion(source_wav=source_wav, target_wav=target_wav)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.pyenv/versions/3.12.6/lib/python3.12/site-packages/TTS/utils/synthesizer.py", line 257, in voice_conversion
output_wav = self.vc_model.voice_conversion(source_wav, target_wav)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.pyenv/versions/3.12.6/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/root/.pyenv/versions/3.12.6/lib/python3.12/site-packages/TTS/vc/models/freevc.py", line 527, in voice_conversion
g_tgt = self.enc_spk_ex.embed_utterance(wav_tgt)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.pyenv/versions/3.12.6/lib/python3.12/site-packages/TTS/vc/modules/freevc/speaker_encoder/speaker_encoder.py", line 163, in embed_utterance
partial_embeds = self(mels).cpu().numpy()
^^^^^^^^^^
File "/root/.pyenv/versions/3.12.6/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.pyenv/versions/3.12.6/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.pyenv/versions/3.12.6/lib/python3.12/site-packages/TTS/vc/modules/freevc/speaker_encoder/speaker_encoder.py", line 68, in forward
_, (hidden, _) = self.lstm(mels)
^^^^^^^^^^^^^^^
File "/root/.pyenv/versions/3.12.6/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.pyenv/versions/3.12.6/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.pyenv/versions/3.12.6/lib/python3.12/site-packages/torch/nn/modules/rnn.py", line 917, in forward
result = _VF.lstm(input, hx, self._flat_weights, self.bias, self.num_layers,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Input and parameter tensors are not at the same device, found input tensor at cpu and parameter tensor at cuda:0
Environment
- coqui-ai-TTS latest version
- Linux
- CUDA 12.4.1
- Python 3.12
Additional context
No response
Please share the full code so that the issue can be reproduced. That said, what you're trying to do is probably not very useful: the FreeVC model isn't very good for a lot of applications.
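For context, the RuntimeError in the logs says that the input to the speaker encoder's LSTM was on the CPU while the LSTM weights were on cuda:0. In plain PyTorch, the usual remedy for this kind of mismatch is to move the input to whatever device holds the module's parameters before the forward pass. The following is only a minimal sketch of that idea, not the library's actual fix: `to_module_device` is a hypothetical helper, and the toy LSTM and tensor sizes are arbitrary stand-ins rather than the real FreeVC speaker encoder.

```python
import torch
from torch import nn


def to_module_device(tensor: torch.Tensor, module: nn.Module) -> torch.Tensor:
    """Move `tensor` to the device that holds `module`'s parameters.

    Hypothetical helper for illustration; it is not part of the TTS package.
    """
    device = next(module.parameters()).device
    return tensor.to(device)


# Toy stand-in for the speaker encoder's LSTM (sizes are arbitrary).
lstm = nn.LSTM(input_size=40, hidden_size=8, batch_first=True)
if torch.cuda.is_available():
    lstm = lstm.cuda()  # parameters now live on the GPU, as in the report

# Input created on the CPU, like the mel frames in the traceback.
mels = torch.zeros(1, 10, 40)

# When the module is on the GPU, calling lstm(mels) directly raises the
# "not at the same device" RuntimeError. Aligning devices first avoids it.
_, (hidden, _) = lstm(to_module_device(mels, lstm))
```

On a machine without CUDA the sketch still runs, since both the module and the input stay on the CPU and the helper is a no-op.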