The goal of this project is to build a model that accurately predicts the sale price of a house from its features. To do this, we will train a model on a dataset of 2,930 residential properties sold in Ames, Iowa between 2006 and 2010. The dataset contains 79 explanatory variables describing the houses.
We will build and refine models using a combination of feature engineering, feature selection tools, and linear regression. To evaluate each model, we will use the root mean squared error (RMSE) between predicted and observed sale prices. The Kaggle competition using this dataset can be found here.
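As a rough illustration of the evaluation metric, RMSE is the square root of the average squared difference between observed and predicted prices. The sketch below uses only the standard library; the function name `rmse` and the three example prices are made up for illustration and are not taken from the dataset or from the project code.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences of prices."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")
    # Square each prediction error, average the squares, then take the root.
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sale prices in dollars, for illustration only.
observed = [200000, 150000, 320000]
predicted = [210000, 140000, 300000]
score = rmse(observed, predicted)
print(round(score, 2))
```

Because the errors are squared before averaging, a single large miss raises the score more than several small ones, and the result is expressed in the same units as the sale price.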
This dataset was assembled as a tool to help students practice creating predictive models, but the results could also be used to determine which features add the most value to a house. This could benefit anyone designing or remodeling a house.