Merge branch 'sounddevice'
PeterBowman committed Dec 12, 2023
2 parents 89473c4 + 21eef38 commit c74f814
Showing 5 changed files with 210 additions and 141 deletions.
34 changes: 24 additions & 10 deletions programs/speechSynthesis/README.md
@@ -2,32 +2,46 @@

## Installation

Through pip:
Note that **Python 3.9+ is required**. Through pip:

```bash
pip3 install mycroft-mimic3-tts
pip install piper-tts
```

Alternatively, install from sources: https://github.com/MycroftAI/mimic3
Alternatively, install from sources: <https://github.com/rhasspy/piper>

## Download voice models

All voice data is located in a separate repository: https://github.com/MycroftAI/mimic3-voices
All voice data is hosted on Hugging Face: <https://huggingface.co/rhasspy/piper-voices>

To manually issue the download of all Spanish voices, run:
By default, `speechSynthesis` assumes `--context speechSynthesis --from speechSynthesis.ini`, i.e. it will spawn a `ResourceFinder` instance and look for a `speechSynthesis.ini` placed in a `speechSynthesis/` directory following the [YARP data directory specification](https://www.yarp.it/latest/yarp_data_dirs.html). The default context and configuration file can be overridden via command line arguments, although this shouldn't be necessary. Voice models need to be downloaded, either manually or via `piper`, into the same directory as the .ini configuration file.

It is advised to import the `speechSynthesis` context after installing the speech repository:

```bash
mimic3-download 'es_ES/*'
yarp-config context --import speechSynthesis
```

In case the process gets stuck, download and unpack the files into `${HOME}/.local/share/mycroft/mimic3/voices`. However, you'll probably need to download the *generator.onnx* file separately (via GitHub) since it is handled by Git LFS.
This command will copy the installed context into a writable user-local path such as `$HOME/.local/share/yarp/contexts/speechSynthesis`. Change into this directory and run `piper` (see examples below) to automatically download the voice models, or download them manually from the Hugging Face repository and place them here.
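
As a quick sanity check after importing the context, the following sketch lists the voice models present in the user-local context directory. It assumes the default Linux layout mentioned above (honouring `YARP_DATA_HOME` and `XDG_DATA_HOME` when they are set) and that models are stored as `*.onnx` files; the helper names `context_dir` and `list_voice_models` are hypothetical and not part of YARP or piper.

```bash
# Resolve the user-local YARP context directory (assumed default layout).
context_dir() {
    local data_home="${YARP_DATA_HOME:-${XDG_DATA_HOME:-$HOME/.local/share}/yarp}"
    printf '%s/contexts/%s\n' "$data_home" "${1:-speechSynthesis}"
}

# List voice models (*.onnx) stored in the context directory.
# Prints nothing if the directory does not exist or holds no models.
list_voice_models() {
    local dir
    dir="$(context_dir "$1")"
    [ -d "$dir" ] || return 0
    find "$dir" -maxdepth 1 -name '*.onnx' | sort
}

list_voice_models speechSynthesis
```

An empty listing suggests that the models still need to be downloaded into that directory.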

The following command prints nothing: it simply downloads the model (if available on Hugging Face) and then blocks the terminal, since it expects input from stdin (kill it with Ctrl+C once the download is complete):

```bash
piper --model es_ES-davefx-medium
```

## Troubleshooting

Try this:
Try this (requires `pip install aplay`):

```bash
echo "hola, me llamo teo y tengo 10 años" | piper --model es_ES-davefx-medium --output-raw | aplay -r 22050 -f S16_LE -t raw -
```
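
The `aplay` flags in the command above describe the raw stream: `-r 22050` is the sample rate in Hz and `-f S16_LE` means signed 16-bit little-endian samples (2 bytes each). As a rough worked example, and assuming a single channel (the `aplay` default when `-c` is not given), the playback duration of a raw capture can be estimated from its size in bytes. The function name `raw_duration_ms` is hypothetical.

```bash
# Estimate the duration (in milliseconds) of raw S16_LE mono audio.
# $1: size in bytes, $2: sample rate in Hz (defaults to 22050).
raw_duration_ms() {
    local bytes="$1" rate="${2:-22050}" bytes_per_sample=2
    echo $(( bytes * 1000 / (rate * bytes_per_sample) ))
}

raw_duration_ms 44100   # 44100 bytes at 22050 Hz, 2 bytes per sample: prints 1000
```

If playback sounds too fast or too slow, a mismatch between the `-r` value and the sample rate of the model is a likely cause.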

Alternatively, keep the application open while reading from stdin:

```bash
mimic3 --voice es_ES/m-ailabs#tux "hola, me llamo teo y tengo 10 años"
piper --model es_ES-davefx-medium --output-raw | aplay -r 22050 -f S16_LE -t raw -
```

To enable GPU acceleration, run `pip3 install onnxruntime-gpu` and issue the `mimic3` command with `--cuda`. The `speechSynthesis` app also accepts this parameter.
To enable GPU acceleration, run `pip install onnxruntime-gpu` and issue the `piper` command with `--cuda`. The `speechSynthesis` app also accepts this parameter.
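
Before passing `--cuda`, it may help to confirm that the installed ONNX Runtime actually exposes a CUDA execution provider. The sketch below queries `onnxruntime.get_available_providers()` through `python3` and reports the result; the helper name `cuda_status` is hypothetical, and the check assumes that `python3` resolves to the same environment in which `piper` was installed.

```bash
# Report whether a space-separated provider list includes the CUDA provider.
cuda_status() {
    case " $1 " in
        *" CUDAExecutionProvider "*) echo "CUDA available" ;;
        *) echo "CUDA not available" ;;
    esac
}

# Ask ONNX Runtime for its providers; fall back to an empty list on failure
# (e.g. python3 or onnxruntime missing from the current environment).
providers="$(python3 -c 'import onnxruntime; print(" ".join(onnxruntime.get_available_providers()))' 2>/dev/null || true)"
cuda_status "$providers"
```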