This repository contains a Jupyter notebook that explores and analyzes product listing data for fan shop jerseys from Dick's Sporting Goods. It examines price distributions, brand categorization, city-based segmentation, and listing details to surface patterns and actionable insights.
The repository also includes a dashboard, dashboard.py, which displays information about the cleaned data produced by running the notebook.
The notebook handles a dataset containing listings with attributes such as price, brand, city, and other descriptive elements.
- Price Analysis: Examines the distribution of listing prices and how pricing strategies shape it.
- Brand Recognition: Uses regex and manual checks to categorize products by brand, including special considerations for brands like Nike, Adidas, and Fanatics.
- City Categorization: Analyzes listings to extract and correct city names, including considerations for non-American sports jerseys.
- Listing Length: Investigates whether the length of the product description affects pricing or customer perception.
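
As a rough illustration of the brand recognition and listing length steps above, here is a minimal sketch using only the Python standard library. The field names (`description`, `price`), the regex patterns, the helper names, and the sample rows are all assumptions made for this example; the notebook's actual column names and matching rules may differ.

```python
import re
from statistics import median

# Hypothetical brand patterns; the notebook's real regexes may differ.
BRAND_PATTERNS = {
    "Nike": re.compile(r"\bnike\b", re.IGNORECASE),
    "Adidas": re.compile(r"\badidas\b", re.IGNORECASE),
    "Fanatics": re.compile(r"\bfanatics\b", re.IGNORECASE),
}


def categorize_brand(description: str) -> str:
    """Return the first brand whose pattern matches, else 'Other'."""
    for brand, pattern in BRAND_PATTERNS.items():
        if pattern.search(description):
            return brand
    return "Other"


def median_price_by_brand(listings):
    """Group prices by detected brand and return {brand: median price}."""
    prices = {}
    for item in listings:
        brand = categorize_brand(item["description"])
        prices.setdefault(brand, []).append(item["price"])
    return {brand: median(values) for brand, values in prices.items()}


# Made-up sample rows, for illustration only (not real listing data).
sample = [
    {"description": "Nike Men's Chicago Bulls Jersey", "price": 120.0},
    {"description": "adidas Youth Real Madrid Jersey", "price": 70.0},
    {"description": "Nike Women's Dallas Cowboys Jersey", "price": 100.0},
    {"description": "Fanatics Branded Boston Bruins Jersey", "price": 90.0},
    {"description": "Retro Hockey Jersey", "price": 60.0},
]

# Listing length: character count of each description, stored per row
# so it can later be compared against price.
for row in sample:
    row["description_length"] = len(row["description"])

summary = median_price_by_brand(sample)
print(summary)
```

Listings that match none of the patterns fall into an "Other" bucket, which is where the manual checks mentioned above would typically come in.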
To run this notebook:
- Ensure you have Jupyter installed, or use an online service like Google Colab.
- Open the notebook in your Jupyter environment.
- Install the required libraries listed in requirements.txt (if provided).
- Execute the cells in sequence to view the analysis.
For any additional questions or contributions, please contact [email protected].