diff --git a/projects/data-analyst-certification.html b/projects/data-analyst-certification.html
index ef8f213..71c97d9 100644
--- a/projects/data-analyst-certification.html
+++ b/projects/data-analyst-certification.html
@@ -1,15 +1,24 @@
 ---
-layout: default
-title: Data Analyst Certification Project
+layout: default
+title: Data Analyst Certification Project
 ---

Data Analyst Certification Project

Website Sales Analysis and Optimisation

+

Introduction

+

This project focuses on analyzing and optimizing website sales using data-driven insights. The main objective is to uncover purchasing patterns, predict sales trends, and provide actionable recommendations for the website's owner.

+

What datasets were used to achieve the objectives of this project?

-

Four datasets: one file containing behavioural data (events.csv), two files containing item properties (item_properties.csv), and one file describing the category tree (category_tree.csv). The data was collected from a real e-commerce website. This is raw data, i.e. without any transformation of the content, but all values were anonymised for confidentiality reasons. The data was freely available on Kaggle.

+

We leveraged four datasets collected from a real e-commerce website:

+ +

All datasets were anonymized to protect user privacy. These datasets, originally from Kaggle, were adapted for this analysis to ensure scalability.

-

Data Volumetrics:

+

Data Volumetrics

- -Kaggle Dataset Link
-Note: Original files were not used due to their volume.

- - -GitHub Link to Used Files
-Note: Files used for the project are a lighter version of the original Kaggle files.

- - -Link to Python Project Code
-Note: This link will take you to the Python code used in this project.

- - -Link to Streamlit Application
-Note: This link will take you to the live Streamlit application for this project.

+Note: A lighter version of the original datasets was used due to volume considerations.

+ +GitHub Link to Project Files
+View Python Project Code
+Explore the Streamlit Application

Project Objectives

The main objectives of this project were to:

@@ -45,23 +44,34 @@

Project Objectives

  • Prepare for Advanced Analytics: Cleanse and transform the data to facilitate advanced analytics, including machine learning and predictive modeling.
  • -

    These objectives aim to leverage data to drive growth and operational efficiency in a dynamic business environment.

    -

    Conclusion and Recommendations

    +
    + "Ensuring product availability in high-demand clusters is crucial for optimizing conversion rates." +

    - Based on the analysis and the results obtained through KMeans clustering, it is clear that user behavior can be segmented into distinct groups. By focusing on clusters 1 and 3, the website owner can target promising market niches for additional sales opportunities. -

    -

    - Ensuring product availability is a critical factor in optimizing conversion rates. Special attention should be given to products in high-demand clusters to ensure stock availability and improve customer satisfaction. -

    -

    - Although linear regression provided some valuable insights, it should not be the sole predictive model, as its performance in transaction prediction was limited. Further refinement of the data and additional variables could help improve prediction accuracy.
    + Based on the KMeans clustering results, user behavior can be segmented into distinct groups, allowing the website owner to focus on clusters 1 and 3 for targeted sales opportunities.

    + +

    - Lastly, the use of Isolation Forest in detecting anomalies in sales data, although interesting, requires further calibration to reduce false positives and enhance its utility in spotting opportunities for cross-selling or targeted promotions.
    + While linear regression provided insights, its predictive performance was limited for transaction forecasting. Additional variables, such as demographic or historical behavior data, may improve model accuracy.

    - For more detailed insights and specific recommendations, you can check the full conclusion report here.
    + The Isolation Forest model, though promising for anomaly detection, requires further fine-tuning to reduce false positives. It can potentially help identify cross-selling opportunities or target specific user segments for promotional activities.

    +

    Key Recommendations:

    + + +

    For more details, you can check the full conclusion report here.

    + +