Final project of Data Science Bootcamp Batch 20 at Rakamin Academy. In this project, acting as a team of data scientists at an e-commerce company, we were tasked with predicting customers' purchase intention using machine learning. All files in this repository are original and are kept as a record of our first data science project as a team.
This dataset was obtained from Kaggle (Online Shoppers Purchasing Intention). It consists of feature vectors belonging to 12,330 sessions. The dataset was formed so that each session would belong to a different user in a 1-year period, to avoid any tendency toward a specific campaign, special day, user profile, or period.
The company's website has 3 categories of visitors: returning visitors, new visitors, and other. Returning visitors far outnumber new visitors, but their conversion rate is lower than that of new visitors.
Increase revenue (conversion rate), especially in the returning visitor category, by following up with users who are predicted to make a purchase/transaction, using calls to action and targeted promos.
Build a machine learning model to predict whether a user visiting the website intends to buy.
- Conversion Rate (conversion rate of returning visitors).
- Company revenue before and after modeling.
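The conversion rate metric above can be sketched as the share of sessions in each visitor category that end in a purchase. This is a minimal illustration, not the project's actual code: the column names `VisitorType` and `Revenue`, the helper name `conversion_rate_by_group`, and the toy rows are all assumptions made for the example.

```python
from collections import defaultdict


def conversion_rate_by_group(sessions, group_key="VisitorType", target_key="Revenue"):
    """Return {group: fraction of sessions in that group that ended in a purchase}.

    `sessions` is an iterable of dicts; the key names are assumed, not taken
    from the project code.
    """
    totals = defaultdict(int)
    purchases = defaultdict(int)
    for session in sessions:
        group = session[group_key]
        totals[group] += 1
        if session[target_key]:
            purchases[group] += 1
    return {group: purchases[group] / totals[group] for group in totals}


# Toy rows for illustration only (not real data from the dataset).
sessions = [
    {"VisitorType": "Returning_Visitor", "Revenue": False},
    {"VisitorType": "Returning_Visitor", "Revenue": False},
    {"VisitorType": "Returning_Visitor", "Revenue": True},
    {"VisitorType": "Returning_Visitor", "Revenue": False},
    {"VisitorType": "New_Visitor", "Revenue": True},
    {"VisitorType": "New_Visitor", "Revenue": False},
]

rates = conversion_rate_by_group(sessions)
for visitor_type, rate in rates.items():
    print(f"{visitor_type}: {rate:.0%}")
```

Computing the same figure before and after the model-driven follow-up would give one way to track the first metric over time.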
We divided this project into 4 stages:
- Stage 1 - Preparation: We study the selected project and dataset. The key takeaways from this stage are our role in the project, the problem statement, the goals and objectives we want to achieve, and the business metrics.
- Stage 2 - EDA: We explore the dataset to understand the characteristics of the data. We split the process into 3 steps: data exploration, EDA, and gathering insights from the dataset related to the main problem and objectives.
- Stage 3 - Preprocessing: We clean the data as thoroughly as possible before starting the modeling process. We handle outliers, encode features, and decide which feature engineering to apply.
- Stage 4 - Supervised Learning: We enter the modeling process and explore several algorithms, improving the models based on the features and target we have selected.
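The outlier handling and feature encoding mentioned in Stage 3 can be sketched as follows. The README does not say which techniques were used, so IQR-based clipping and one-hot encoding are assumptions chosen for illustration, and the helper names `clip_outliers_iqr` and `one_hot` are hypothetical.

```python
import statistics


def clip_outliers_iqr(values, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR] (an assumed outlier rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def one_hot(values):
    """Encode a categorical column as one 0/1 indicator dict per row."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


# Toy numeric column with one extreme value (illustrative only).
durations = [1, 2, 3, 4, 100]
clipped = clip_outliers_iqr(durations)

# Toy categorical column (values assumed for the example).
visitor_types = ["Returning_Visitor", "New_Visitor"]
encoded = one_hot(visitor_types)

print(clipped)
print(encoded)
```

Clipping keeps every session in the dataset while limiting the influence of extreme values; dropping outlier rows instead is an equally plausible choice.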