
Prediction of house prices in King County, USA, using a number of features such as area of living room, waterfront etc. | Involves extensive exploratory data analysis and comprehensive predictive modeling using data pipelines


rafayk330/house-price-prediction


HOUSE PRICE PREDICTION MODEL

PROJECT SYNOPSIS

In this project, I play the role of a Data Analyst for a Real Estate Investment Trust. The Trust would like to start investing in residential real estate. The task at hand is to determine the market price of a house given a set of features. The project predicts housing prices using attributes or features such as square footage, number of bedrooms, number of floors, and so on. A Jupyter notebook has been provided in this repository.


TOOLS USED

This project uses Python, together with a range of Python libraries, for both the analysis and the visualization. The core tools are:

  • Python 3.8 (visualization + analysis)
  • Jupyter Notebook (IDE)

DATASET DESCRIPTION

This dataset contains house sale prices for King County, which includes Seattle. It includes houses sold between May 2014 and May 2015. It was taken from a Kaggle upload (https://www.kaggle.com/harlfoxem/housesalesprediction).

Here is the description of the data:

  • id: A notation for a house
  • date: Date house was sold
  • price: Sale price of the house (the prediction target)
  • bedrooms: Number of bedrooms
  • bathrooms: Number of bathrooms
  • sqft_living: Square footage of the home
  • sqft_lot: Square footage of the lot
  • floors: Total floors (levels) in house
  • waterfront: Whether the house has a view of a waterfront
  • view: Has been viewed
  • condition: How good the condition is overall
  • grade: overall grade given to the housing unit, based on King County grading system
  • sqft_above: Square footage of house apart from basement
  • sqft_basement: Square footage of the basement
  • yr_built: Year the house was built
  • yr_renovated: Year when house was renovated
  • zipcode: Zip code
  • lat: Latitude coordinate
  • long: Longitude coordinate
  • sqft_living15: Living room area in 2015 (implies some renovations); this might or might not have affected the lot size area
  • sqft_lot15: LotSize area in 2015 (implies some renovations)
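
As a rough illustration of how the columns above might be loaded and checked, here is a minimal sketch using pandas. It is not taken from the notebook: the helper name `load_house_data`, the date format, and the two sample rows are assumptions made for illustration, and the sample rows are made up rather than copied from the Kaggle file.

```python
import io

import pandas as pd

# Column names as listed in the dataset description above.
EXPECTED_COLUMNS = [
    "id", "date", "price", "bedrooms", "bathrooms", "sqft_living",
    "sqft_lot", "floors", "waterfront", "view", "condition", "grade",
    "sqft_above", "sqft_basement", "yr_built", "yr_renovated", "zipcode",
    "lat", "long", "sqft_living15", "sqft_lot15",
]


def load_house_data(source):
    """Hypothetical helper: read the CSV, check columns, parse the sale date."""
    df = pd.read_csv(source)
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Assumption: dates are stored as strings such as "20141013T000000".
    df["date"] = pd.to_datetime(df["date"], format="%Y%m%dT%H%M%S")
    # The id column only labels a house, so it is dropped before modeling.
    return df.drop(columns=["id"])


# Two made-up rows standing in for the real file (illustration only).
sample_csv = io.StringIO(
    ",".join(EXPECTED_COLUMNS) + "\n"
    "1,20141013T000000,300000,3,1.0,1180,5650,1.0,0,0,3,7,1180,0,1955,0,98178,47.51,-122.26,1340,5650\n"
    "2,20150225T000000,500000,4,2.5,2570,7242,2.0,0,0,3,7,2170,400,1951,1991,98125,47.72,-122.32,1690,7639\n"
)

houses = load_house_data(sample_csv)
print(houses.dtypes)
```

With the real data, `source` would be the path to the CSV downloaded from the Kaggle page linked above.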

METHODOLOGY

  1. Imports of Libraries and Packages
  2. Import of Dataset
  3. Data Wrangling/Preprocessing
  4. Exploratory Data Analysis
  5. Feature Selection
  6. Model Development
  7. Creation of Data Pipeline
  8. Model Evaluation and Refinement
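
Steps 6 to 8 can be sketched roughly as follows. This is a minimal example under stated assumptions, not the notebook's actual code: it assumes scikit-learn, picks a scaling, polynomial-feature, and linear-regression pipeline as one plausible choice, and trains on synthetic numbers generated on the spot rather than on the King County data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in for the real data (made-up numbers, illustration only).
rng = np.random.default_rng(0)
n = 500
sqft_living = rng.uniform(500, 4000, n)
bedrooms = rng.integers(1, 6, n)
floors = rng.integers(1, 4, n)
X = np.column_stack([sqft_living, bedrooms, floors])
price = (
    150 * sqft_living
    + 10_000 * bedrooms
    + 5_000 * floors
    + rng.normal(0, 20_000, n)  # noise term
)

# Hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.2, random_state=0
)

# The pipeline chains preprocessing and the model into a single estimator,
# so the same transformations are applied at fit and predict time.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("poly", PolynomialFeatures(degree=2, include_bias=False)),
    ("model", LinearRegression()),
])
pipe.fit(X_train, y_train)

# Pipeline.score delegates to the final estimator, which reports R^2.
r2 = pipe.score(X_test, y_test)
print(f"R^2 on held-out data: {r2:.3f}")
```

Bundling the steps in a `Pipeline` keeps the scaler and feature expansion fitted on the training split only, which avoids leaking information from the held-out rows into the evaluation.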

ANALYSIS

All the steps in the analysis have been explained in the Jupyter Notebook for this project. Some examples of visualizations used are as follows:

[Figure: sqft_corr]

[Figure: waterfront_boxplot]
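
The figure names suggest a correlation view of square footage against price and a box plot of price by waterfront status. The numbers behind plots of that kind could be computed along these lines. This is a sketch with pandas only, using a handful of made-up rows, and the variable names `corr_with_price` and `price_by_waterfront` are hypothetical.

```python
import pandas as pd

# Made-up rows for illustration; the real analysis uses the Kaggle dataset.
df = pd.DataFrame({
    "price": [300000, 450000, 520000, 900000, 1200000, 380000],
    "sqft_living": [1100, 1800, 2100, 2900, 3500, 1500],
    "bedrooms": [2, 3, 3, 4, 4, 3],
    "waterfront": [0, 0, 0, 1, 1, 0],
})

# Pearson correlation of every feature with price, strongest first.
corr_with_price = (
    df.corr()["price"].drop("price").sort_values(ascending=False)
)

# Summary statistics per waterfront group: the quartiles here are the
# values a box plot of price by waterfront would draw.
price_by_waterfront = df.groupby("waterfront")["price"].describe()

print(corr_with_price)
print(price_by_waterfront)
```

A ranking like `corr_with_price` is also one simple way to shortlist candidate features before model development.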


SUMMARY AND REFLECTION

This is an intermediate-level project which involves some advanced concepts of Machine Learning and Predictive Modeling in Python using an IDE.

All rights related to the published dataset are reserved by its issuing authority (Kaggle).

The project may be used only as a learning resource; no part of it may be copied for any other use whatsoever.
