So far you have used basic models to understand and predict words. In this next task, your goal is to use every resource available to you (the Data Science Specialization, resources on the web, or your own creativity) to improve predictive accuracy while, where possible, reducing computational runtime and model complexity. Be sure to hold out a test set to evaluate the new, more creative models you build.
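As one way to picture the hold-out step, here is a minimal sketch in Python. It assumes the corpus is already loaded as a list of sentences; the function name `split_holdout`, the 10% test fraction, and the fixed seed are illustrative choices, not part of the assignment.

```python
import random

def split_holdout(sentences, test_fraction=0.1, seed=42):
    """Shuffle a copy of the sentences and split off a held-out test set.

    test_fraction and seed are illustrative defaults (assumptions).
    """
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    shuffled = list(sentences)         # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)

# Toy stand-in for a real corpus (hypothetical data).
corpus = [f"sentence number {i}" for i in range(100)]
train, test = split_holdout(corpus)
print(len(train), len(test))
```

The test sentences should stay out of every training and tuning step, so that accuracy measured on them reflects text the model has not seen.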
- Explore new models and data to improve your predictive model.
- Evaluate your new predictions on both accuracy and efficiency.
- What are some alternative data sets you could consider using?
- What are ways in which the n-gram model may be inefficient?
- What are the most commonly missed n-grams? Can you think of a reason why they are missed, and a way to fix it?
- What have other people tried in order to improve their models?
- Can you estimate how uncertain you are about the words you are predicting?
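The evaluation and uncertainty questions above can be sketched with a toy bigram model in Python. This is only an illustration under stated assumptions: whitespace tokenization, raw counts with no smoothing or backoff, top-3 accuracy as the accuracy measure, and the entropy of the next-word distribution as one possible uncertainty estimate. All function names and the sample sentences are hypothetical.

```python
import math
import time
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count next-word frequencies for each preceding word."""
    counts = defaultdict(Counter)
    for s in sentences:
        tokens = s.lower().split()     # naive whitespace tokenizer (assumption)
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def predict(counts, prev, k=3):
    """Return up to k most frequent next words, or [] for an unseen word."""
    dist = counts.get(prev)
    return [w for w, _ in dist.most_common(k)] if dist else []

def entropy_bits(counts, prev):
    """Entropy of the next-word distribution; higher means more uncertain."""
    dist = counts.get(prev)
    if not dist:
        return None                    # unseen context: no estimate available
    total = sum(dist.values())
    return -sum((c / total) * math.log2(c / total) for c in dist.values())

def evaluate(counts, sentences, k=3):
    """Measure top-k accuracy and average time per prediction on held-out text."""
    hits = total = 0
    start = time.perf_counter()
    for s in sentences:
        tokens = s.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            total += 1
            hits += nxt in predict(counts, prev, k)
    elapsed = time.perf_counter() - start
    return {
        "top_k_accuracy": hits / total if total else 0.0,
        "seconds_per_prediction": elapsed / total if total else 0.0,
    }

# Hypothetical toy data standing in for the training and held-out sets.
train_sents = ["the cat sat on the mat", "the cat ate the fish", "the dog sat on the rug"]
test_sents = ["the cat sat on the rug"]

model = train_bigrams(train_sents)
report = evaluate(model, test_sents)
print(report)
print(entropy_bits(model, "cat"), entropy_bits(model, "sat"))
```

Logging the held-out bigrams for which `predict` returns nothing or the wrong words is one simple way to find the most commonly missed n-grams, and the size of the count table gives a rough view of model complexity.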