Word Generator

Given a dataset of phonetic representations of words in a language, it should be possible to build a model that generates new words that sound like that language but don't actually exist.
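As a rough illustration of the idea, here is a minimal sketch of one possible model: a Markov chain over phonemes. This is an assumption made for illustration and not necessarily what script.py does. The names `train`, `generate`, and `generate_new` are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical boundary markers, not part of any phoneme set.
START, END = "<s>", "</s>"

def train(pronunciations, order=2):
    """Record which phoneme follows each context of `order` phonemes (order >= 1)."""
    table = defaultdict(list)
    for phones in pronunciations:
        padded = [START] * order + list(phones) + [END]
        for i in range(order, len(padded)):
            table[tuple(padded[i - order:i])].append(padded[i])
    return table

def generate(table, order=2, rng=random, max_len=20):
    """Sample one phoneme sequence, stopping at END or after max_len phonemes."""
    context = (START,) * order
    out = []
    while len(out) < max_len:
        nxt = rng.choice(table[context])
        if nxt == END:
            break
        out.append(nxt)
        context = context[1:] + (nxt,)
    return out

def generate_new(table, known, order=2, rng=random, tries=1000):
    """Sample until the result is not in `known` (a set of phoneme tuples)."""
    for _ in range(tries):
        word = tuple(generate(table, order, rng))
        if word and word not in known:
            return list(word)
    return None  # gave up: every sample was an existing word
```

A higher `order` tends to give output closer to real words but leaves fewer unseen combinations to find, so the value is worth experimenting with.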

For now the dataset is the CMU Pronouncing Dictionary; this is subject to change.
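For reference, a minimal sketch of reading entries in the classic CMUdict text layout, assuming `;;;` comment lines, a word followed by space-separated ARPAbet phonemes, and alternate pronunciations marked like `HELLO(1)`. The function name `parse_cmudict` is hypothetical, and other releases of the dictionary may differ in these details.

```python
import re

def parse_cmudict(lines):
    """Return a list of (word, phonemes) pairs from CMUdict-style lines."""
    entries = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and ';;;' comment lines.
        if not line or line.startswith(";;;"):
            continue
        word, *phones = line.split()
        # Drop the '(1)', '(2)', ... suffix used for alternate pronunciations.
        word = re.sub(r"\(\d+\)$", "", word)
        entries.append((word, phones))
    return entries
```

The phoneme lists keep their stress digits (for example `AH0`, `OW1`), so a model can either use them or strip them.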

Use

Run script.py. It loads the CMUDict dataset and lets you start generating brand new words.

Words are generated in IPA format. If you, like me, don't know how to read IPA, Amazon's Polly service can generate speech from the symbols so you can hear what your new word sounds like. Switch to the SSML tab and enter the following tag:

<phoneme alphabet="ipa" ph="YOUR IPA TEXT HERE"></phoneme>
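If you want to build that tag from a generated word instead of pasting it by hand, a small helper along these lines should work. The name `phoneme_tag` and the `wrap` parameter are hypothetical; `wrap=True` adds the `<speak>` root element that SSML documents use, in case the text box does not already contain one.

```python
from xml.sax.saxutils import quoteattr

def phoneme_tag(ipa, wrap=False):
    """Build an SSML phoneme tag for an IPA string, escaping it for use in an attribute."""
    # quoteattr returns the value already wrapped in quotes and XML-escaped.
    tag = f"<phoneme alphabet=\"ipa\" ph={quoteattr(ipa)}></phoneme>"
    return f"<speak>{tag}</speak>" if wrap else tag

print(phoneme_tag("həloʊ"))
```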

The ARPAbet Wikipedia page also has a useful table of ARPAbet/IPA symbols to spoken sounds, so you can try to piece together the word yourself if Polly has trouble with it.
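Since CMUdict stores pronunciations in ARPAbet, the correspondence can also be written as a lookup table. The sketch below is one common mapping, not necessarily the one script.py uses. It drops stress digits, treats unstressed `AH0` as a schwa, and maps `ER` to a single symbol regardless of stress, all of which are simplifying assumptions. The name `arpabet_to_ipa` is hypothetical.

```python
ARPABET_TO_IPA = {
    # Vowels
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ", "AY": "aɪ",
    "EH": "ɛ", "ER": "ɝ", "EY": "eɪ", "IH": "ɪ", "IY": "i", "OW": "oʊ",
    "OY": "ɔɪ", "UH": "ʊ", "UW": "u",
    # Consonants
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "F": "f", "G": "ɡ",
    "HH": "h", "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n",
    "NG": "ŋ", "P": "p", "R": "ɹ", "S": "s", "SH": "ʃ", "T": "t",
    "TH": "θ", "V": "v", "W": "w", "Y": "j", "Z": "z", "ZH": "ʒ",
}

def arpabet_to_ipa(phones):
    """Convert a list of ARPAbet phonemes (with optional stress digits) to an IPA string."""
    out = []
    for p in phones:
        if p == "AH0":
            out.append("ə")  # unstressed AH is conventionally a schwa
            continue
        base = p.rstrip("012")  # strip the stress marker
        out.append(ARPABET_TO_IPA[base])
    return "".join(out)

print(arpabet_to_ipa("HH AH0 L OW1".split()))  # prints həloʊ
```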