This repository contains the code for a salary prediction project built on standard human resources (HR) data. The project addresses the question: "How can we better predict the expected salary for various positions so that the offered salary aligns with the market?"
- HR Salary Information
Option 1: WSL (Windows Subsystem for Linux)
- Enable WSL in Windows
- Install the Ubuntu app from the Microsoft Store
- Create a login and sudo password for Linux
Option 2: Google Colab
- Log in to Google Colab
- Copy the forked GitHub files to Google Colab
- Run the code
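For the Colab option, one way to copy a fork into the session is to clone it from a notebook cell. The following is a minimal sketch, assuming the fork keeps the repository name `salarypredictionportfolio`. The helper names `build_clone_command` and `clone_fork` and the `your-github-user` placeholder are hypothetical and not part of the project.

```python
import subprocess


def build_clone_command(github_user, repo="salarypredictionportfolio"):
    """Return the git command that clones a fork of the project.

    `github_user` is a placeholder for the account that owns the fork.
    """
    url = f"https://github.com/{github_user}/{repo}"
    return ["git", "clone", url]


def clone_fork(github_user):
    """Clone the fork into the current working directory (needs git and network access)."""
    subprocess.run(build_clone_command(github_user), check=True)


if __name__ == "__main__":
    # Replace the placeholder with your own GitHub user name before cloning.
    print(" ".join(build_clone_command("your-github-user")))
```

Calling `clone_fork("your-github-user")` in a Colab cell would place the files under `salarypredictionportfolio/` in the session's working directory, provided the fork is public.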
- Open Windows Subsystem for Linux (Ubuntu app)
- Run the following commands to clone the repository and enter its folder
git clone https://github.com/narquette/salarypredictionportfolio
cd salarypredictionportfolio
- Make the install script executable and run it
chmod +x prereq_install.sh
./prereq_install.sh
- Start Jupyter Notebook and open the URL it prints in a Windows browser
jupyter notebook --no-browser
- Run Salary Prediction Notebook.ipynb in the Code folder
Salary Prediction
- Go to the Heroku app
- Enter the following values:
  - Miles from Metropolis = 45
  - Years Experience = 10
  - Industry = Health
- View the salary prediction: "The predicted salary is 157420.0"
Code
- Salary Prediction Notebook (all of the code required to produce a final model)
- HelperFile.py (contains the machine learning class used by the Salary Prediction Notebook)
Data
- Original (original salary data)
- Cleaned (cleaned salary data)
- Prediction (the predicted salary data for the test data)
Logs
- Previous model logs; new log information is also placed here
Models
- The final model produced from running the notebook
Visualizations
- Visualizations produced in the EDA (exploratory data analysis) phase
- Pandas Profiling HTML report for the original data set
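After cloning, a quick check that the folders described above are present can catch an incomplete copy before the notebook is run. This is a minimal sketch, assuming the top-level folder names and casing on disk match the ones listed in this README. `EXPECTED_FOLDERS` and `missing_folders` are illustrative names, not part of the project.

```python
from pathlib import Path

# Top-level folders as named in this README; names and casing on disk are assumed to match.
EXPECTED_FOLDERS = ("Code", "Data", "Logs", "Models", "Visualizations")


def missing_folders(repo_root):
    """Return the expected folders that do not exist under `repo_root`."""
    root = Path(repo_root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from the root of the cloned repository.
    missing = missing_folders(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All expected folders are present.")
```

An empty list from `missing_folders` means every folder named above was found under the given root.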