Homer MarkovText

Text Generation through Markov Chains

This is a custom Markov chain text engine that generates passages in the style of Homer.

Training Data

This repository includes data sets collected from the web:

1. Homer's Iliad
2. Homer's Odyssey

Generating a Dictionary

To generate a dictionary file, you'll need to run the genMarkovDict.py script as follows:


python genMarkovDict.py -k (the order of the markov chain; i.e. do you generate one word at a time or pairs of words) -i (input file with wild card) -d (output dictionary file)

For example, the following generates a dictionary of order 2 (the chain is built from pairs of words) from every input file that matches the wildcard pattern:
python genMarkovDict.py -k 2 -i "Data - Oddysey*.*" -d homerdict.txt
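
To make the idea of chain order concrete, here is a minimal sketch of how an order-k dictionary could be built. It is not the code in genMarkovDict.py: the function name `build_markov_dict`, the in-memory layout (tuple of k words mapped to a list of follower words), and the sample sentence are all assumptions made for illustration.

```python
from collections import defaultdict


def build_markov_dict(text, k=2):
    """Map each run of k consecutive words to the words seen right after it.

    Hypothetical helper for illustration; the real script may store its
    dictionary differently.
    """
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - k):
        state = tuple(words[i:i + k])      # the current k-word state
        chain[state].append(words[i + k])  # the word that followed it
    return dict(chain)


# Illustrative sample text, not read from the repository's data files.
sample = "sing o goddess the anger of achilles son of peleus"

order1 = build_markov_dict(sample, k=1)
print(order1[("of",)])         # ['achilles', 'peleus']

order2 = build_markov_dict(sample, k=2)
print(order2[("son", "of")])   # ['peleus']
```

With k=1 the word "of" has two possible followers, while with k=2 the pair "son of" has only one. A higher order therefore tends to reproduce the source text more closely, and a lower order produces more varied output.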

Generating Text

To generate the actual text, you'll need to run the genMarkovText.py script as follows:

python genMarkovText.py -w (maximum number of words in sentence) -n (number of sentences to generate) -d (source dictionary file)

For example, the following generates 5 sentences, each with a maximum of 20 words (if the end of a sentence is found first, the sentence stops at that last word):
python genMarkovText.py -w 20 -n 5 -d homerdict.txt
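
The generation step can be sketched in the same spirit. This is an illustration under stated assumptions, not the code in genMarkovText.py: the function `generate_sentence`, the toy dictionary, and the rule that a word ending in ".", "!" or "?" marks the end of a sentence are all hypothetical.

```python
import random


def generate_sentence(chain, max_words, rng=random):
    """Walk the chain from a random starting state.

    Stops at max_words, at a word that ends a sentence, or at a state
    with no recorded followers. Hypothetical helper for illustration.
    """
    state = rng.choice(list(chain))
    words = list(state)[:max_words]
    while len(words) < max_words:
        if words[-1].endswith((".", "!", "?")):
            break                          # assumed end-of-sentence rule
        followers = chain.get(state)
        if not followers:
            break                          # dead end: nothing ever followed
        next_word = rng.choice(followers)
        words.append(next_word)
        state = state[1:] + (next_word,)   # slide the k-word window
    return " ".join(words)


# Toy order-2 dictionary written by hand, not loaded from homerdict.txt.
toy_chain = {
    ("the", "anger"): ["of"],
    ("anger", "of"): ["achilles", "peleus."],
    ("of", "achilles"): ["son"],
    ("achilles", "son"): ["of"],
    ("son", "of"): ["peleus."],
}

rng = random.Random(0)  # seeded so repeated runs give the same output
for _ in range(5):
    print(generate_sentence(toy_chain, max_words=20, rng=rng))
```

Here `max_words` plays the role of `-w` and the loop count plays the role of `-n`. Picking uniformly from the follower list keeps frequent followers more likely, because each occurrence in the training text is stored as a separate list entry.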

Credits

Special thanks to Pubs Abayasiri for open sourcing their code and making this as easy as possible! Thanks also to MIT for hosting an online resource with the complete Homer texts, which required only some specific preprocessing.