Copy of the US English HTS demo for statistical parametric speech synthesis.

MattShannon/hts-demo-en-US-cmudict

hts-demo-en-US-cmudict

This repository contains a copy of the US English HTS demo. The HTS demos are designed to demonstrate the capabilities of the HMM-based Speech Synthesis System (HTS) for statistical parametric speech synthesis.

Please note that this repository is an unofficial copy of the US English HTS demo and is not endorsed in any way by the HTS working group, which maintains the HTS demos. Accordingly, any questions or bug reports should be directed to the HTS working group using the contact details given below.

Overview

Starting from a corpus of text data and speech audio data from a human speaker, this software trains a statistical parametric speech synthesis system designed to imitate that speaker. This trained system can then be used to synthesize speech audio for new pieces of text data.

The text data may be provided in one of two forms:

  • simple text files
  • Festival utterance files

The speech data should be provided in the form of raw 48 kHz 16-bit audio files, which are essentially just wav files with the header information removed.
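As an illustration of what "header information removed" means, the following is a minimal sketch that strips the header from a wav file. It assumes the input is already 48 kHz 16-bit PCM and uses the canonical 44-byte wav header with no extra chunks. The function name wav_to_raw is hypothetical and is not part of the HTS demo.

```shell
#!/bin/sh
# Hypothetical helper (not part of the HTS demo): convert a wav file to raw
# audio by dropping its header.
# Assumption: the file has the canonical 44-byte PCM header and is already
# 48 kHz 16-bit audio. No resampling or format conversion is performed.
wav_to_raw() {
    in=$1
    out=$2
    # tail -c +45 starts output at byte 45, skipping the first 44 bytes.
    tail -c +45 "$in" > "$out"
}

# Example usage (file names are placeholders):
#   wav_to_raw recording.wav data/raw/recording.raw
```

Wav files written by some tools carry additional header chunks, in which case the fixed 44-byte offset is wrong and a dedicated audio conversion tool is the safer choice.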

Modifications from the official version

This repository contains some minor modifications from the official version of the US English HTS demo:

  • it does not include a corpus of text data and speech data for training the synthesis system. The corpus of data used by the official version is based on the CMU ARCTIC corpus for speaker SLT and may be obtained by downloading the official version of the US English HTS demo from the HTS demo website.
  • it adds this README file
  • it adds the License file
  • it removes the configure script since a suitable script can be automatically generated from the provided configure.ac file using autoconf. This follows standard version control practices.

License

Please see the file License for details of the license and warranty for hts-demo-en-US-cmudict.

Installation

The source code is hosted in the hts-demo-en-US-cmudict GitHub repository. To obtain the latest source code using git:

git clone https://github.com/MattShannon/hts-demo-en-US-cmudict.git

hts-demo-en-US-cmudict depends on the following software packages:

If STRAIGHT vocoding is used (recommended for better quality, though it is not available under a permissive license) then the following software packages are also required:

The versions of these software packages required can be found in INSTALL.

There are two high-level choices that need to be made when setting up this directory:

  • whether to provide the text data for the corpus you wish to train on as simple text files (in which case the configure variable USEUTT should be set to 0) or as Festival utterance files (in which case USEUTT should be set to 1). Using Festival utterance files is the default.
  • whether to use the STRAIGHT vocoder (in which case the configure variable USESTRAIGHT should be set to 1) or the non-STRAIGHT vocoder (in which case USESTRAIGHT should be set to 0). Using the non-STRAIGHT vocoder is the default, but using the STRAIGHT vocoder is recommended if possible for better quality.

To set up this directory:

  • add appropriate text files in data/txt (if USEUTT is 0), Festival utterance files in data/utts (if USEUTT is 1) and raw audio files in data/raw for the corpus you wish to use during training (see above for details of the formats used and details of where to obtain the processed CMU ARCTIC corpus typically used with this HTS demo)
  • generate the configure script from configure.ac using autoconf
  • follow the instructions for the official version included in INSTALL, setting the USEUTT and USESTRAIGHT configure variables appropriately. If you are using a corpus other than the CMU ARCTIC corpus for speaker SLT, you may wish to also specify the LOWERF0, UPPERF0, DATASET and SPEAKER configure variables.
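The set-up steps above can be sketched as follows. This is an illustrative outline only: the helper check_corpus is hypothetical and not part of the HTS demo, and the configure invocation shows just the two variables discussed here. Any further options, such as the locations of the dependencies, are described in INSTALL and are omitted.

```shell
#!/bin/sh
# Hypothetical helper (not part of the HTS demo): check that the corpus
# directories match the chosen USEUTT value before configuring.
# Arguments: $1 = repository root, $2 = USEUTT (0 or 1).
check_corpus() {
    root=$1
    useutt=$2
    if [ "$useutt" = 1 ]; then
        text_dir="$root/data/utts"   # Festival utterance files
    else
        text_dir="$root/data/txt"    # simple text files
    fi
    for dir in "$text_dir" "$root/data/raw"; do
        # Fail if the directory is missing or empty.
        if [ -z "$(ls -A "$dir" 2>/dev/null)" ]; then
            echo "missing or empty: $dir" >&2
            return 1
        fi
    done
    echo "corpus layout OK for USEUTT=$useutt"
}

# Typical sequence once the corpus is in place, run from the repository root:
#   check_corpus . 1
#   autoconf                              # generates configure from configure.ac
#   ./configure USEUTT=1 USESTRAIGHT=0    # plus the options listed in INSTALL
# Then continue with the remaining steps described in INSTALL.
```

Passing variables as NAME=value arguments to configure is standard autoconf behaviour. LOWERF0, UPPERF0, DATASET and SPEAKER can be supplied in the same way when training on a different corpus.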

Bugs

Please use the HTS users mailing list to submit bugs related to the US English HTS demo, preferably after verifying that the bug still occurs with the most recent official version available from the HTS demo website. Bugs specifically about this copy of the HTS demo can be submitted to the issue tracker.

Contact

The author of the US English HTS demo is the HTS working group. More information is available on the HTS website and from the HTS users mailing list. The host of the hts-demo-en-US-cmudict GitHub repository is Matt Shannon.
