TIMIT Preprocessor

timit-preprocessor extracts MFCC vectors and phones from the TIMIT dataset for further use in speech recognition.

Overview

The TIMIT corpus of read speech is designed to provide speech data for acoustic-phonetic studies and for the development and evaluation of automatic speech recognition systems. More information is available on the website or the Wiki. The instructions and scripts used here are built upon timit-preprocessor. make_dataset.py relies on kaldi-io-for-python.
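For orientation, TIMIT ships phone labels in .PHN files, where each line holds a start sample, an end sample, and a phone symbol, and the audio is sampled at 16 kHz. The following is a minimal sketch of reading that layout; the function names `parse_phn` and `to_seconds` are hypothetical and are not taken from this repository's scripts, and the example lines are only illustrative.

```python
from typing import List, Tuple

SAMPLE_RATE = 16000  # TIMIT audio is sampled at 16 kHz


def parse_phn(text: str) -> List[Tuple[int, int, str]]:
    """Parse the contents of a .PHN file into (start, end, phone) tuples.

    Hypothetical helper for illustration; boundaries are sample indices.
    """
    segments = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, end, phone = parts
        segments.append((int(start), int(end), phone))
    return segments


def to_seconds(segments: List[Tuple[int, int, str]]) -> List[Tuple[float, float, str]]:
    """Convert sample-index boundaries to seconds."""
    return [(s / SAMPLE_RATE, e / SAMPLE_RATE, p) for s, e, p in segments]


if __name__ == "__main__":
    # Illustrative lines in the .PHN layout
    example = "0 3050 h#\n3050 4559 sh\n4559 5723 ix\n"
    for start, end, phone in to_seconds(parse_phn(example)):
        print(f"{start:.3f} {end:.3f} {phone}")
```

Keeping the boundaries as sample indices until the last moment avoids rounding errors when they are later aligned with feature frames.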

Dependencies

You must have downloaded the TIMIT dataset. You must have a compiled version of Kaldi.

Install Kaldi first by following the instructions in its INSTALL file:

(1) go to tools/ and follow the INSTALL instructions there.

(2) go to src/ and follow the INSTALL instructions there.

After you run the scripts that INSTALL in tools/ tells you to run, the following reminder is displayed. Run the script it mentions.

Kaldi Warning: IRSTLM is not installed by default anymore. If you need IRSTLM, use the script extras/install_irstlm.sh

Preprocessing

Steps

Activate the Python environment that matches the requirement.txt file.

$ source ../../pyenv/bin/activate

Edit the default values of the variables KALDI_ROOT, TIMIT_ROOT, and DATA_OUT in the Makefile to match your installation. Alternatively, leave the defaults as they are and pass the locations to make as arguments.

$ make KALDI_ROOT=abc/kaldi  TIMIT_ROOT=abc/timit DATA_OUT=abc/out ...
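If you drive the build from a script, it can help to check the locations before calling make. The sketch below assumes only the three variable names shown above; the helper `build_make_command` is hypothetical and not part of this repository, and it checks nothing beyond the existence of the two input directories.

```python
import os
from typing import List


def build_make_command(kaldi_root: str, timit_root: str, data_out: str,
                       target: str = "") -> List[str]:
    """Return the make command line after checking the input directories.

    Hypothetical helper: DATA_OUT is not checked because it may not exist yet.
    """
    for name, path in (("KALDI_ROOT", kaldi_root), ("TIMIT_ROOT", timit_root)):
        if not os.path.isdir(path):
            raise FileNotFoundError(f"{name} does not point to a directory: {path}")
    cmd = [
        "make",
        f"KALDI_ROOT={kaldi_root}",
        f"TIMIT_ROOT={timit_root}",
        f"DATA_OUT={data_out}",
    ]
    if target:
        cmd.append(target)  # e.g. "convert"; omit to build the default target
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, which fails loudly if make exits with an error.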

Run the following commands (here without location arguments):

$ make convert
$ make -j 4

Note 1: Noisy .wav files will be created alongside the clean TIMIT ones.

Note 2: In case of errors, display the remaining steps:

$ make -n

and try to debug them one by one.

Note 3: For serious problems you can always contact us in the [issues] section.

Acknowledgment

Some of the code in TIMIT Preprocessor comes from the following repo: TIMIT Preprocessor
