Releases: uncommoncode/clean_spoken_digits

v1.0.0

19 Aug 06:23

Initial release containing 1,200 test-set clips and 6,000 training-set clips, each labeled with the digit, speaker ID, speed, language code, gender, and other metadata.

The release provides the raw 16 kHz WAV files in the .zip, as well as preprocessed mel-spaced spectrograms with 8 or 32 mel bins, already split into training and test sets for convenience. The spectrogram files store their features as float16 to save space; the readme documents the format in more detail.
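
As a rough illustration of the float16 trade-off, the sketch below casts a stand-in spectrogram to float16 and back with NumPy. It does not read the release files: the array shape, the random data, and the helper name `upcast_features` are all assumptions for illustration, and the actual file layout is whatever the readme describes.

```python
import numpy as np


def upcast_features(features_f16: np.ndarray) -> np.ndarray:
    """Convert float16 features to float32 before doing numerical work.

    Hypothetical helper, not part of the release.
    """
    return features_f16.astype(np.float32)


# Stand-in for one clip's spectrogram: 32 mel bins x 50 frames.
# The shape and values are made up; real features come from the release files.
rng = np.random.default_rng(0)
spec_f32 = rng.random((32, 50), dtype=np.float32)

# Storing as float16 halves the size relative to float32.
spec_f16 = spec_f32.astype(np.float16)
restored = upcast_features(spec_f16)

print(spec_f32.nbytes, spec_f16.nbytes)  # 6400 3200
print(float(np.max(np.abs(restored - spec_f32))))  # small rounding error
```

Upcasting to float32 after loading is a common precaution, since accumulating sums or running a model directly in float16 can lose precision or overflow.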