unable to generate anything #785

Open
gucio321 opened this issue Jul 26, 2024 · 0 comments

gucio321 commented Jul 26, 2024

Command: tts --text "Czesc, jestem syntezator mowy" --model_name "tts_models/pl/mai_female/vits" --vocoder_name "vocoder_models/universal/libri-tts/wavegrad" --out_path /output/

Environment:
My system Python is too new, so I use Docker. Here is my Dockerfile:

Dockerfile
FROM python:3.7

RUN python3 -m pip install TTS

# run bash
CMD ["/bin/bash"]

Output:

console output
 > tts_models/pl/mai_female/vits is already downloaded.
 > vocoder_models/universal/libri-tts/wavegrad is already downloaded.
 > Using model: vits
 > Setting up Audio Processor...
 | > sample_rate:22050
 | > resample:False
 | > num_mels:80
 | > log_func:np.log10
 | > min_level_db:0
 | > frame_shift_ms:None
 | > frame_length_ms:None
 | > ref_level_db:None
 | > fft_size:1024
 | > power:None
 | > preemphasis:0.0
 | > griffin_lim_iters:None
 | > signal_norm:None
 | > symmetric_norm:None
 | > mel_fmin:0
 | > mel_fmax:None
 | > pitch_fmin:None
 | > pitch_fmax:None
 | > spec_gain:20.0
 | > stft_pad_mode:reflect
 | > max_norm:1.0
 | > clip_norm:True
 | > do_trim_silence:False
 | > trim_db:60
 | > do_sound_norm:False
 | > do_amp_to_db_linear:True
 | > do_amp_to_db_mel:True
 | > do_rms_norm:False
 | > db_level:None
 | > stats_path:None
 | > base:10
 | > hop_length:256
 | > win_length:1024
 > initialization of speaker-embedding layers.
 > initialization of language-embedding layers.
 > Vocoder Model: wavegrad
 > Text: Czesc, jestem syntezator mowy
 > Text splitted to sentences.
['Czesc, jestem syntezator mowy']
Traceback (most recent call last):
  File "/usr/local/bin/tts", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/TTS/bin/synthesize.py", line 439, in main
    reference_speaker_name=args.reference_speaker_idx,
  File "/usr/local/lib/python3.7/site-packages/TTS/utils/synthesizer.py", line 393, in tts
    vocoder_input = self.vocoder_ap.normalize(mel_postnet_spec.T)
  File "/usr/local/lib/python3.7/site-packages/TTS/utils/audio/processor.py", line 286, in normalize
    raise RuntimeError(" [!] Mean-Var stats does not match the given feature dimensions.")
RuntimeError:  [!] Mean-Var stats does not match the given feature dimensions.
root@1affd8f1442e:/# 
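
The traceback ends in a dimension check inside the vocoder's audio processor (`self.vocoder_ap.normalize`). As a rough illustration of what such a check does, here is a minimal Python sketch. It is not the actual TTS source: the function name `check_feature_dims` and the 513-row input are hypothetical, and only the value 80 (`num_mels` in the log above) and the error message are taken from the output.

```python
# Hypothetical sketch, not the real TTS implementation: it only illustrates
# how a statistics/feature dimension mismatch can produce the error above.

NUM_MELS = 80  # num_mels as printed in the Audio Processor log


def check_feature_dims(features, expected_rows=NUM_MELS):
    """Return features unchanged if the row count matches, else raise."""
    if len(features) != expected_rows:
        raise RuntimeError(
            " [!] Mean-Var stats does not match the given feature dimensions."
        )
    return features


# 80 rows x 4 frames: matches the expected dimension, so the check passes.
matching = [[0.0] * 4 for _ in range(NUM_MELS)]
check_feature_dims(matching)

# 513 rows is an arbitrary, assumed mismatch: the check raises.
mismatched = [[0.0] * 4 for _ in range(513)]
try:
    check_feature_dims(mismatched)
except RuntimeError as err:
    print(err)
```

If this reading is right, the array handed to the vocoder does not have the number of rows its statistics expect. Whether that is caused by the particular model and vocoder combination in the command is not something the log alone confirms.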

nvidia-smi output
Fri Jul 26 19:51:50 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 2080 ...    Off |   00000000:01:00.0  On |                  N/A |
| N/A   51C    P8              2W /   80W |     718MiB /   8192MiB |     10%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2959      G   /usr/bin/gnome-shell                          372MiB |
|    0   N/A  N/A      3236    C+G   /usr/bin/xwaylandvideobridge                   13MiB |
|    0   N/A  N/A      3548      G   /usr/bin/Xwayland                              11MiB |
|    0   N/A  N/A      3787      G   /usr/libexec/xdg-desktop-portal-gnome          51MiB |
|    0   N/A  N/A      4010    C+G   /usr/libexec/mutter-x11-frames                 10MiB |
|    0   N/A  N/A     13250      G   /usr/bin/gnome-clocks                          35MiB |
|    0   N/A  N/A     19016      G   ...ures=SpareRendererForSitePerProcess        130MiB |
|    0   N/A  N/A     20426      G   /usr/bin/evolution                              2MiB |
|    0   N/A  N/A     24450      G   ...bin/plasma-browser-integration-host          2MiB |
|    0   N/A  N/A     26279      G   /usr/libexec/kactivitymanagerd                  2MiB |
+-----------------------------------------------------------------------------------------+

I followed the instructions from the README. Am I missing something?
