Salary Prediction

This repository contains the code for a salary prediction project built on standard human resources (HR) data. The problem statement is: "How can we better predict the expected salary for various positions so that the offered salary aligns with the market?"

Data Used for Analysis

  1. HR Salary Information

Pre-requisites

Option 1: WSL (Windows Subsystem for Linux)

  1. Enable WSL in Windows
  2. Install the Ubuntu app from the Microsoft Store
  3. Create a login and sudo password for Linux

Option 2: Google Colab

  1. Log in to Google Colab
  2. Copy the files from your fork of the GitHub repository to Google Colab
  3. Run the code

Getting Started

  1. Open Windows Subsystem for Linux (Ubuntu app)

  2. Clone the repository

     git clone https://github.com/narquette/salarypredictionportfolio

  3. Make the install script executable and run it

     chmod +x prereq_install.sh
     ./prereq_install.sh

  4. Start Jupyter Notebook

     jupyter notebook --no-browser

  5. Copy the URL printed on the command line into your browser

  6. Run Salary Prediction Notebook.ipynb in the Code folder
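
Before starting the notebook, it can help to confirm that the commands used in the steps above are actually on the PATH. The following is a minimal sketch, not part of the repository; the function name `missing_tools` is hypothetical, and the list of tools is an assumption based on the commands shown above.

```python
import shutil


def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Assumption: these are the commands the Getting Started steps rely on.
    missing = missing_tools(["git", "jupyter"])
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("git and jupyter are available.")
```

If `jupyter` is reported as missing after running `prereq_install.sh`, the install script may not have completed, or the shell may need to be restarted to pick up the new PATH.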

Risk Salary Prediction App

Salary Prediction

  1. Go to the Heroku app
  2. Enter the following values:
    • Miles from Metropolis = 45
    • Years Experience = 10
    • Industry = Health
  3. View Salary Prediction: "The predicted salary is 157420.0"
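
The three inputs and the output message above can be sketched in Python as follows. This is an illustration only: the names `SalaryQuery`, `validate`, and `format_prediction` are hypothetical and are not taken from the app or from HelperFile.py, and the trained model itself is not included. The salary value is simply the example shown above, not something this sketch computes.

```python
from dataclasses import dataclass


@dataclass
class SalaryQuery:
    """The three inputs requested by the app's form (hypothetical container)."""
    miles_from_metropolis: float
    years_experience: float
    industry: str


def validate(query):
    """Return a list of human-readable problems with the inputs (empty if none)."""
    errors = []
    if query.miles_from_metropolis < 0:
        errors.append("Miles from Metropolis must not be negative")
    if query.years_experience < 0:
        errors.append("Years Experience must not be negative")
    if not query.industry.strip():
        errors.append("Industry must not be empty")
    return errors


def format_prediction(salary):
    """Format a predicted salary in the style of the message shown by the app."""
    return f"The predicted salary is {float(salary)}"


if __name__ == "__main__":
    query = SalaryQuery(miles_from_metropolis=45, years_experience=10, industry="Health")
    print(validate(query))  # an empty list means the inputs look usable
    # 157420.0 is the example value from this README, passed in by hand.
    print(format_prediction(157420.0))
```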

Folder Overview

Code

  • Salary Prediction Notebook (all of the code required to produce a final model)
  • HelperFile.py (contains the machine learning class needed to run the Salary Prediction Notebook)

Data

  • Original (original salary data)
  • Cleaned (cleaned salary data)
  • Prediction (the predicted salary data for the test data)

Logs

  • Logs from previous model runs; new log information is also placed here

Models

  • The final model produced from running the notebook

Visualizations

  • Visualizations produced in the exploratory data analysis (EDA) phase
  • Pandas Profile HTML file for the original data set
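
After cloning, the folder layout described above can be checked with a short script. This is a sketch under the assumption that the five top-level folder names on disk match the headings in this overview exactly; `missing_folders` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path

# Assumption: folder names on disk match the headings in the Folder Overview.
EXPECTED_FOLDERS = ["Code", "Data", "Logs", "Models", "Visualizations"]


def missing_folders(repo_root):
    """Return the expected top-level folders that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from the root of the cloned repository.
    missing = missing_folders(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All expected folders are present.")
```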