diff --git a/projects/data-analyst-certification.html b/projects/data-analyst-certification.html
index ef8f213..71c97d9 100644
--- a/projects/data-analyst-certification.html
+++ b/projects/data-analyst-certification.html
@@ -1,15 +1,24 @@
 ---
-layout: default
-title: Data Analyst Certification Project
+layout: default
+title: Data Analyst Certification Project
 ---
Website Sales Analysis and Optimisation
+This project focuses on analyzing and optimizing website sales using data-driven insights. The main objective is to uncover purchasing patterns, predict sales trends, and provide actionable recommendations for the website's owner.
+Four datasets: one file containing behavioural data (events.csv), two files containing item properties (item_properties.csv), and one file describing the category tree (category_tree.csv). The data was collected from a real e-commerce website. This is raw data, i.e. without any transformation of the content, but all values were anonymised for confidentiality reasons. The data was freely available on Kaggle.
+We leveraged four datasets collected from a real e-commerce website:
+All datasets were anonymized to protect user privacy. These datasets, originally from Kaggle, were adapted for this analysis to ensure scalability.
-The main objectives of this project were to:
@@ -45,23 +44,34 @@These objectives aim to leverage data to drive growth and operational efficiency in a dynamic business environment.
-+ "Ensuring product availability in high-demand clusters is crucial for optimizing conversion rates." +
- Based on the analysis and the results obtained through KMeans clustering, it is clear that user behavior can be segmented into distinct groups. By focusing on clusters 1 and 3, the website owner can target promising market niches for additional sales opportunities. -
-- Ensuring product availability is a critical factor in optimizing conversion rates. Special attention should be given to products in high-demand clusters to ensure stock availability and improve customer satisfaction. -
-- Although linear regression provided some valuable insights, it should not be the sole predictive model, as its performance in transaction prediction was limited. Further refinement of the data and additional variables could help improve prediction accuracy. + Based on the KMeans clustering results, user behavior can be segmented into distinct groups, allowing the website owner to focus on clusters 1 and 3 for targeted sales opportunities.
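A minimal sketch of this kind of segmentation, assuming scikit-learn and per-visitor counts of views, add-to-cart events, and transactions as features. The feature choice, the synthetic data, and `n_clusters=4` are illustrative assumptions and are not taken from the project's notebooks:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for per-visitor event counts (hypothetical values,
# not the project's data): columns are views, add-to-carts, transactions.
rng = np.random.default_rng(0)
browsers = rng.poisson(lam=[3, 0.2, 0.0], size=(300, 3))
shoppers = rng.poisson(lam=[15, 3, 0.5], size=(150, 3))
buyers = rng.poisson(lam=[40, 10, 4], size=(50, 3))
features = np.vstack([browsers, shoppers, buyers]).astype(float)

# Scale the features so raw view counts do not dominate the distances.
scaled = StandardScaler().fit_transform(features)

# n_clusters=4 is an assumption; the text only refers to clusters 1 and 3.
model = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = model.fit_predict(scaled)

# Summarise each cluster: size, mean views, and transactions per view.
for k in range(model.n_clusters):
    members = features[labels == k]
    total_views = members[:, 0].sum()
    conversion = members[:, 2].sum() / total_views if total_views else 0.0
    print(f"cluster {k}: n={len(members)}, "
          f"mean_views={members[:, 0].mean():.1f}, conversion={conversion:.3f}")
```

Comparing size and conversion per cluster is one way to decide which segments are worth targeting. Note that cluster numbering is arbitrary and can change between runs or seeds.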
+- Lastly, the use of Isolation Forest in detecting anomalies in sales data, although interesting, requires further calibration to reduce false positives and enhance its utility in spotting opportunities for cross-selling or targeted promotions. + While linear regression provided insights, its predictive performance was limited for transaction forecasting. Additional variables, such as demographic or historical behavior data, may improve model accuracy.
- For more detailed insights and specific recommendations, you can check the full conclusion report here. + The Isolation Forest model, though promising for anomaly detection, requires further fine-tuning to reduce false positives. It can potentially help identify cross-selling opportunities or target specific user segments for promotional activities.
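One possible way to approach that fine-tuning is to vary scikit-learn's `contamination` parameter and watch how many points get flagged. The sketch below uses synthetic daily sales with five injected spikes. The data, the spike values, and the grid of contamination values are all hypothetical:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic daily sales totals (hypothetical): mostly typical days plus
# five injected spikes standing in for genuine anomalies.
rng = np.random.default_rng(42)
sales = rng.normal(loc=100.0, scale=10.0, size=200)
spike_days = np.array([20, 60, 100, 140, 180])
sales[spike_days] = [350.0, 380.0, 400.0, 420.0, 450.0]
X = sales.reshape(-1, 1)

flagged_counts = {}
for contamination in (0.01, 0.025, 0.05, 0.10):
    forest = IsolationForest(
        n_estimators=200, contamination=contamination, random_state=0
    )
    is_outlier = forest.fit_predict(X) == -1  # -1 marks predicted anomalies
    hits = int(is_outlier[spike_days].sum())
    false_positives = int(is_outlier.sum()) - hits
    flagged_counts[contamination] = int(is_outlier.sum())
    print(f"contamination={contamination}: flagged={flagged_counts[contamination]}, "
          f"injected spikes caught={hits}, other days flagged={false_positives}")
```

A higher `contamination` flags more days, so it trades missed anomalies against false positives. On real sales data there are no injected labels, so the flagged days would need manual review or a small labelled sample to judge the setting.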
+Key Recommendations:
+For more details, you can check the full conclusion report here.
+ +