Company ABC has invested in top-tier sports teams. Its dataset contains key information about every team that has participated in the Premier League (assume the data covers all teams). It includes the number of goals scored and conceded by each team, the number of times each team has finished in the top two positions, and other relevant details.
Data: Premier League Final Data.csv: The dataset contains information on all the teams that have participated so far in the Premier League tournaments.
Data Dictionary:
- Club: Name of the football club
- Matches: Number of matches the club has played in the Premier League
- Wins: Number of matches won by the club in the Premier League
- Loss: Number of matches lost by the club in the Premier League
- Draws: Number of matches drawn by the club in the Premier League
- Clean Sheets: Number of matches in which the club has prevented the opposing side from scoring
- Team Launch: Year in which the club was founded
- Winners: Number of times the club has won the Premier League
- Runners-up: Number of times the club has finished as runners-up in the Premier League
- Lastplayed: When the club last played in the Premier League
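A first step could be to load the file and compare its columns against the data dictionary above. The sketch below is only an illustration: `load_clubs` and `EXPECTED_COLUMNS` are hypothetical names, and the two rows in `sample_csv` are made-up stand-ins so the snippet runs without the real file. The real CSV may well contain extra columns (for example the goals data mentioned in the introduction), which the check would simply report.

```python
import io

import pandas as pd

# Column names taken from the data dictionary above.
EXPECTED_COLUMNS = [
    "Club", "Matches", "Wins", "Loss", "Draws", "Clean Sheets",
    "Team Launch", "Winners", "Runners-up", "Lastplayed",
]


def load_clubs(source):
    """Read the CSV and report how its columns differ from the data dictionary."""
    df = pd.read_csv(source)
    # Strip stray whitespace from headers so the comparison is not thrown off.
    df.columns = df.columns.str.strip()
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    extra = [c for c in df.columns if c not in EXPECTED_COLUMNS]
    return df, missing, extra


# Made-up stand-in rows (NOT real club data), used only to keep the sketch runnable.
sample_csv = io.StringIO(
    "Club,Matches,Wins,Loss,Draws,Clean Sheets,Team Launch,Winners,Runners-up,Lastplayed\n"
    "Club A,100,50,30,20,35,1900,1,2,2023\n"
    "Club B,38,8,20,10,6,1950,,,2001\n"
)

clubs, missing_cols, extra_cols = load_clubs(sample_csv)
print(clubs.dtypes)
print("missing:", missing_cols, "extra:", extra_cols)
```

With the actual dataset, the call would presumably be `load_clubs("Premier League Final Data.csv")`, assuming the file sits in the working directory.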
The management of Company ABC aims to invest in some of the top-performing clubs in the English Premier League. To aid their decision-making, the analytics department has been tasked with creating a comprehensive report on the performance of various clubs. However, some of the more established clubs are already owned by competitors. As a result, Company ABC wishes to identify the clubs it can approach and potentially invest in to secure a successful and profitable deal.
Note:
- Unauthorised use or distribution of this project is prohibited @dataanalystduo
- The dataset has been downloaded from the internet using multiple sources. All credit for the dataset goes to the original creator of the data.
- Data cleaning is the process of identifying and correcting or removing errors, inconsistencies, and inaccuracies in a dataset.
- Observation writing involves examining the data and noting any notable findings, anomalies, or areas of interest.
- Exploratory Data Analysis (EDA) is the process of examining and visualizing a dataset to understand its main characteristics, such as the distribution of data, the relationships between variables, and any anomalies or patterns that may exist. The goal of EDA is to uncover insights and trends that can help inform further analysis or decision-making. It is often the first step in any data analysis project, as it provides a foundation for more advanced statistical methods and models.
- Treat null values on the basis of domain knowledge, i.e. using domain-specific imputation.
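As a rough illustration of the domain-specific imputation mentioned in the last note, the sketch below fills nulls using two rules. Both rules are assumptions, not something the dataset documentation states: a missing `Winners` or `Runners-up` value is taken to mean the club never finished in that position, and a missing `Draws` value is derived from the identity Matches = Wins + Loss + Draws. The rows are made up, and `impute_with_domain_rules` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

# Made-up rows (NOT real club data) with a few deliberately missing values.
clubs = pd.DataFrame({
    "Club": ["Club A", "Club B", "Club C"],
    "Matches": [100, 38, 76],
    "Wins": [50, 8, 20],
    "Loss": [30, 20, 36],
    "Draws": [20, np.nan, 20],
    "Winners": [1, np.nan, 0],
    "Runners-up": [2, np.nan, np.nan],
})


def impute_with_domain_rules(df):
    """Fill nulls using football-specific rules instead of a generic mean or median."""
    out = df.copy()
    # Assumption: a missing title count means the club never finished there.
    for col in ["Winners", "Runners-up"]:
        out[col] = out[col].fillna(0).astype(int)
    # Every match ends in a win, a loss or a draw, so a missing draw count
    # can be derived as Matches - Wins - Loss.
    derived_draws = out["Matches"] - out["Wins"] - out["Loss"]
    out["Draws"] = out["Draws"].fillna(derived_draws).astype(int)
    return out


print("Nulls per column before imputation:")
print(clubs.isna().sum())

cleaned = impute_with_domain_rules(clubs)
print(cleaned)

# Sanity check: the match counts should add up for every club after imputation.
consistent = (
    cleaned["Wins"] + cleaned["Loss"] + cleaned["Draws"] == cleaned["Matches"]
).all()
print("Match counts consistent:", consistent)
```

The same consistency check can be run before imputation as part of data cleaning, since rows where the counts do not add up point to errors rather than missing values. Whether zero is the right fill for the title columns should be confirmed against the actual data before relying on it.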