Designed a content-based filtering model in PyTorch that recommends movies to users based on their previous movie ratings. The model was trained on the open-source MovieLens dataset (linked below) and tuned in the model_tuning.ipynb notebook.
During tuning, different combinations of linear layers, embedding dimensions, dropout layers, and batch normalization layers were tested to balance model complexity against generalization ability. The goal was to find a model that:
- Neither overfit nor underfit the data
- Produced predicted ratings with the same distribution as the ratings in the original data
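
The two criteria above could be checked along the following lines. This is only a minimal sketch in plain Python, not the code from model_tuning.ipynb: the function names (`snap_to_half_star`, `rating_distribution`, `total_variation_distance`, `generalization_gap`) are hypothetical, and the choice of total variation distance as the comparison metric is an assumption made for illustration. It relies on MovieLens ratings being given in half-star steps from 0.5 to 5.0.

```python
from collections import Counter

# MovieLens ratings use half-star steps from 0.5 to 5.0.
RATING_VALUES = [x / 2 for x in range(1, 11)]


def snap_to_half_star(value):
    """Clip a continuous prediction to [0.5, 5.0] and round it to a half-star value."""
    clipped = min(5.0, max(0.5, value))
    return round(clipped * 2) / 2


def rating_distribution(ratings):
    """Return the share of ratings at each half-star value, in RATING_VALUES order."""
    counts = Counter(snap_to_half_star(r) for r in ratings)
    total = len(ratings)
    return [counts[v] / total for v in RATING_VALUES]


def total_variation_distance(p, q):
    """Compare two distributions: 0.0 means identical, 1.0 means no overlap."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def generalization_gap(train_loss, val_loss):
    """Validation loss minus training loss (assumed metric for spotting overfitting)."""
    return val_loss - train_loss


# Illustrative values only, not results from this project.
actual = [4.0, 4.0, 3.5, 5.0]
predicted = [3.9, 4.2, 3.4, 4.6]
distance = total_variation_distance(
    rating_distribution(actual), rating_distribution(predicted)
)
print(distance)  # 0.25
```

A distance near 0 suggests the predicted ratings follow the original distribution. A large positive generalization gap points toward overfitting, while high training and validation losses together point toward underfitting.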
All code in this repository was written by me.
Medium Article: https://medium.com/@scottpitcher_/intermediate-machine-learning-project-customer-recommendation-system-8e6f20e15477
Note: the ml-latest-small.zip file was used instead of the main .zip file to save storage.