This is the analysis I developed for the ML test at Ixpandit.
The four requested points were developed in four Jupyter notebooks using a Miniconda environment.
The repository includes an environment.yml file for installing the required libraries.
Instructions for downloading Miniconda and creating the environment are given below.
Due to its size, the data.csv file was not included in the repo. I assume that the evaluators already have this file.
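Because data.csv is not shipped with the repository, checking for it before opening the notebooks can avoid a confusing error partway through. The following is a minimal sketch, assuming the file sits in the directory the notebooks are run from; that location and the helper name `find_data_file` are assumptions, not part of the repository.

```python
from pathlib import Path


def find_data_file(directory=".", filename="data.csv"):
    """Return the path to the data file, or raise a readable error.

    Both defaults are assumptions: adjust them to wherever your copy lives.
    """
    path = Path(directory) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; copy {filename} there before running the notebooks."
        )
    return path
```

Calling `find_data_file()` at the top of a notebook would fail early with a clear message if the file is missing.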
Given data collected from paid and unpaid loans, the goal is to predict whether a customer will pay or not.
Below are the notebooks developed for each required item. The whole analysis was performed in 48 hours, as specified by the test requirements.
- Exploratory data analysis:
  - Exploratory data analysis.ipynb
- Development of a predictive model:
  - Data processing and Random Forest Classifier.ipynb
- Optimization of hyperparameters:
  - Bayesian optimization for hyperparameters.ipynb
- Natural language processing:
  - Word2Vec + K-means for titles feature extraction.ipynb
# Installation of Miniconda from scratch
- Get and install Miniconda (Miniconda packages might require significant disk space, on the order of gigabytes):
cd your_project/
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
export PATH="/home/user/your_project/miniconda3/bin:$PATH"
(or wherever you decided to install miniconda3)
# Create an environment
- An environment file is provided for version compatibility.
conda env create -f environment.yml
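
After the environment has been created, a short check run from inside it can confirm that the libraries resolve. This is only a sketch: the package names below are illustrative guesses, not read from environment.yml, and the helper name `missing_packages` is hypothetical.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found by the importer."""
    return [name for name in names if find_spec(name) is None]


# Hypothetical list: replace it with the packages named in environment.yml.
ASSUMED_PACKAGES = ["numpy", "pandas", "sklearn"]

if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

An empty result suggests the interpreter in use belongs to the new environment; a non-empty one usually means a different Python is first on the PATH.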