My final project for the course How to Win a Data Science Competition at Coursera.

diego-fustes/coursera_predict_future_sales

coursera_predict_future_sales

My final project for the Coursera course "How to Win a Data Science Competition", held in October 2018. The project is a competition hosted on Kaggle:

https://www.kaggle.com/c/competitive-data-science-final-project

The project is split into two Jupyter notebooks. final_project_EDA contains the data exploration, while final_project_modelling covers feature engineering, model optimization and ensembling. The notebooks should be self-explanatory. The script data_io.py is used by the notebooks to read the data files from disk, and the configuration file settings.ini holds the data file paths.
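As an illustration of how a helper like data_io.py could pick up the paths from settings.ini, here is a minimal sketch. It is not the actual code in this repository: the function name `read_data_paths`, the section name `paths` and the key names are assumptions made for the example.

```python
import configparser
from pathlib import Path


def read_data_paths(settings_file, section="paths"):
    """Return a dict mapping each key in `section` to a Path.

    Hypothetical helper: the section name and its keys are assumptions,
    not the real layout of this project's settings.ini.
    """
    parser = configparser.ConfigParser()
    # read_file raises an error if the file is missing, unlike parser.read,
    # which silently skips files it cannot open.
    with open(settings_file, encoding="utf-8") as handle:
        parser.read_file(handle)
    # configparser lowercases option names by default.
    return {key: Path(value) for key, value in parser[section].items()}
```

With a settings file containing a `[paths]` section, `read_data_paths("settings.ini")["train"]` would return the path stored under the (hypothetical) `train` key, ready to pass to a reader such as `pandas.read_csv`.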

The notebooks can be executed in a Python 3.6 environment with the libraries listed in requirements.txt. To reproduce the results, copy the competition datasets into the datasets/ folder and run the notebook cells.

The final submission can be found in submission.csv, and the final models are also serialized as pickle files.
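For readers who want to reuse the serialized models, the following is a minimal sketch of a pickle round trip using only the standard library. The helper names `save_model` and `load_model` are hypothetical and do not come from the notebooks, and the file names of the pickles in this repository are not assumed here.

```python
import pickle


def save_model(model, path):
    """Serialize any picklable object (for example a fitted model) to `path`."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


def load_model(path):
    """Load an object previously written with save_model.

    Only unpickle files you trust: loading a pickle can execute arbitrary code.
    """
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Unpickling a fitted model generally requires the same libraries that created it to be installed, ideally in the versions pinned in requirements.txt.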
