Command:

tts --text "Czesc, jestem syntezator mowy" --model_name "tts_models/pl/mai_female/vits" --vocoder_name "vocoder_models/universal/libri-tts/wavegrad" --out_path /output/
Environment: My system Python is too new for TTS, so I run it in Docker. Here is my Dockerfile:
FROM python:3.7
RUN python3 -m pip install TTS
# run bash
CMD ["/bin/bash"]
output:
 > tts_models/pl/mai_female/vits is already downloaded.
 > vocoder_models/universal/libri-tts/wavegrad is already downloaded.
 > Using model: vits
 > Setting up Audio Processor...
 | > sample_rate:22050
 | > resample:False
 | > num_mels:80
 | > log_func:np.log10
 | > min_level_db:0
 | > frame_shift_ms:None
 | > frame_length_ms:None
 | > ref_level_db:None
 | > fft_size:1024
 | > power:None
 | > preemphasis:0.0
 | > griffin_lim_iters:None
 | > signal_norm:None
 | > symmetric_norm:None
 | > mel_fmin:0
 | > mel_fmax:None
 | > pitch_fmin:None
 | > pitch_fmax:None
 | > spec_gain:20.0
 | > stft_pad_mode:reflect
 | > max_norm:1.0
 | > clip_norm:True
 | > do_trim_silence:False
 | > trim_db:60
 | > do_sound_norm:False
 | > do_amp_to_db_linear:True
 | > do_amp_to_db_mel:True
 | > do_rms_norm:False
 | > db_level:None
 | > stats_path:None
 | > base:10
 | > hop_length:256
 | > win_length:1024
 > initialization of speaker-embedding layers.
 > initialization of language-embedding layers.
 > Vocoder Model: wavegrad
 > Text: Czesc, jestem syntezator mowy
 > Text splitted to sentences.
['Czesc, jestem syntezator mowy']
Traceback (most recent call last):
  File "/usr/local/bin/tts", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/TTS/bin/synthesize.py", line 439, in main
    reference_speaker_name=args.reference_speaker_idx,
  File "/usr/local/lib/python3.7/site-packages/TTS/utils/synthesizer.py", line 393, in tts
    vocoder_input = self.vocoder_ap.normalize(mel_postnet_spec.T)
  File "/usr/local/lib/python3.7/site-packages/TTS/utils/audio/processor.py", line 286, in normalize
    raise RuntimeError(" [!] Mean-Var stats does not match the given feature dimensions.")
RuntimeError:  [!] Mean-Var stats does not match the given feature dimensions.
root@1affd8f1442e:/#
nvidia-smi output:

Fri Jul 26 19:51:50 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 2080 ...    Off |   00000000:01:00.0  On |                  N/A |
| N/A   51C    P8              2W /   80W |     718MiB /   8192MiB |     10%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2959      G   /usr/bin/gnome-shell                          372MiB |
|    0   N/A  N/A      3236    C+G   /usr/bin/xwaylandvideobridge                   13MiB |
|    0   N/A  N/A      3548      G   /usr/bin/Xwayland                              11MiB |
|    0   N/A  N/A      3787      G   /usr/libexec/xdg-desktop-portal-gnome          51MiB |
|    0   N/A  N/A      4010    C+G   /usr/libexec/mutter-x11-frames                 10MiB |
|    0   N/A  N/A     13250      G   /usr/bin/gnome-clocks                          35MiB |
|    0   N/A  N/A     19016      G   ...ures=SpareRendererForSitePerProcess        130MiB |
|    0   N/A  N/A     20426      G   /usr/bin/evolution                              2MiB |
|    0   N/A  N/A     24450      G   ...bin/plasma-browser-integration-host          2MiB |
|    0   N/A  N/A     26279      G   /usr/libexec/kactivitymanagerd                  2MiB |
+-----------------------------------------------------------------------------------------+
I followed the instructions in the README. Am I missing something?
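My tentative reading of the traceback, in case it helps with triage: the error is raised in `normalize` when the array handed to the vocoder's audio processor does not have the feature dimension it expects. The snippet below is only a sketch of the kind of shape check I assume happens there. The function `check_feature_dims` and its logic are hypothetical, not the library's code; only `num_mels:80` and `fft_size:1024` come from the log above. If the VITS model produces a waveform rather than an 80-bin mel spectrogram (I believe it is an end-to-end model, but I have not verified how TTS handles this), a check like this would fail once `--vocoder_name` is passed:

```python
# Hypothetical stand-in for the shape check the traceback points at
# (TTS/utils/audio/processor.py, normalize). The name and logic here are
# assumptions for illustration, not the library's actual implementation.

def check_feature_dims(shape, num_mels=80, fft_size=1024):
    """Return which kind of features the shape looks like, or raise."""
    n_features = shape[0]
    if n_features == num_mels:
        return "mel"
    if n_features == fft_size // 2:
        return "linear"
    raise RuntimeError(
        " [!] Mean-Var stats does not match the given feature dimensions."
    )


if __name__ == "__main__":
    # An 80-bin mel spectrogram with 200 frames passes the check.
    print(check_feature_dims((80, 200)))

    # A waveform-shaped array (1 x N samples) does not.
    try:
        check_feature_dims((1, 44100))
    except RuntimeError as err:
        print("raised:", err)
```

If that guess is right, the mismatch would come from combining this model with a separate vocoder rather than from the Docker setup, but I would appreciate confirmation.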