The Roger-Ebert Bot predicts the late, renowned movie critic's ratings with regression algorithms

crystal-ctrl/regression_project

The Roger-Ebert Bot

ML Regression Project

Goal

The goal of this project is to use a linear regression model to predict the film ratings that the late, renowned movie critic Roger Ebert would give if he were alive today. I obtained my primary and secondary datasets through web scraping. Using numerical and categorical features, along with some feature engineering, I built several linear regression models. After multiple rounds of 5-fold cross-validation, modifications to the datasets, and more feature engineering, my final linear regression model with lasso regularization has an R-squared value of 0.396 and a mean absolute error of 0.55. In layman's terms, the predictions are off by about half a star on average.
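As a rough illustration of the evaluation described above, the following minimal sketch fits a lasso-regularized linear regression and scores it with 5-fold cross-validation, reporting R-squared and mean absolute error. It is not the project's actual code: the data are synthetic stand-ins for the scraped features, and the `alpha` value, feature count, and variable names are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 500 "films" with 8 numeric features.
# The real project uses scraped features, which are not reproduced here.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
true_coef = np.array([0.6, -0.4, 0.3, 0.0, 0.0, 0.2, 0.0, 0.0])
y = 2.5 + X @ true_coef + rng.normal(scale=0.7, size=500)  # star-like target

# Scale features before the lasso so the penalty treats them comparably.
# alpha=0.05 is an arbitrary illustrative value, not the project's setting.
model = make_pipeline(StandardScaler(), Lasso(alpha=0.05))

# 5-fold cross-validation, scoring both R-squared and (negated) MAE.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_validate(
    model, X, y, cv=cv, scoring=("r2", "neg_mean_absolute_error")
)

mean_r2 = scores["test_r2"].mean()
# scikit-learn reports MAE as a negative score, so flip the sign.
mean_mae = -scores["test_neg_mean_absolute_error"].mean()

print(f"Mean R-squared over 5 folds: {mean_r2:.3f}")
print(f"Mean absolute error over 5 folds: {mean_mae:.3f}")
```

Because the target is on a star scale, the mean absolute error reads directly as the average number of stars by which a prediction misses.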

To learn more, see my blog post and presentation slides.

Workflow

  1. Data Acquisition
    a. Ebert Data
    b. Other Data
  2. EDA, Feature Engineering & Selection
  3. Regression Modeling & Testing
    a. More Feature Engineering
  4. Attempt on Random Forest Regressor
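
For the last workflow step, a random forest regressor can be cross-validated the same way as a linear model. The sketch below is only an assumption about how such an attempt might look: the data are synthetic, and `n_estimators=200` and the variable names are illustrative rather than taken from the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (assumption), not the scraped Ebert dataset.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y = 2.5 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.7, size=300)

# Hyperparameters here are illustrative defaults, not tuned values.
forest = RandomForestRegressor(n_estimators=200, random_state=0)

# 5-fold cross-validated mean absolute error (sign flipped to be positive).
mae_per_fold = -cross_val_score(
    forest, X, y, cv=5, scoring="neg_mean_absolute_error"
)

print("MAE per fold:", np.round(mae_per_fold, 3))
print("Mean MAE:", round(float(mae_per_fold.mean()), 3))
```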

Technologies

  • BeautifulSoup
  • Selenium
  • Python (Pandas, numpy, pickle)
  • Matplotlib
  • seaborn
  • scikit-learn
  • statsmodels
