I performed exploratory data analysis and implemented a linear regression model on a dataset containing the average life expectancy of 193 countries, along with various features that may affect it (vaccination rates, healthcare expenditure, etc.). The resulting model predicts the average life expectancy of a country from these features.
In addition, the purpose of my data analysis was to answer the following important questions:
• Should a country with a low life expectancy (< 65) increase its healthcare expenditure in order to improve its average lifespan?
• Does life expectancy have a positive or negative correlation with eating habits, lifestyle, exercise, smoking, drinking alcohol, etc.?
• What is the impact of schooling on the lifespan of humans?
• Do densely populated countries tend to have lower life expectancy?
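As a rough illustration of how questions like these can be explored, the sketch below computes the Pearson correlation between life expectancy and a few candidate features. It is not the notebook's actual code: the data is synthetic, and the column names ("Schooling", "Alcohol", "Population", "Life expectancy") are assumptions that may not match the CSV headers exactly.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real dataset; values and column names are assumed.
rng = np.random.default_rng(0)
n = 200
schooling = rng.uniform(4, 18, n)          # years of schooling
alcohol = rng.uniform(0, 12, n)            # alcohol consumption
population = rng.uniform(1e5, 1e8, n)      # population size
# In this toy data, life expectancy depends only on schooling plus noise.
life_expectancy = 45 + 2.0 * schooling + rng.normal(0, 3, n)

df = pd.DataFrame({
    "Schooling": schooling,
    "Alcohol": alcohol,
    "Population": population,
    "Life expectancy": life_expectancy,
})

# Pearson correlation of every feature with the target, strongest first.
correlations = (
    df.corr()["Life expectancy"]
    .drop("Life expectancy")
    .sort_values(ascending=False)
)
print(correlations)
```

A correlation close to +1 or -1 suggests a strong linear relationship, while a value near 0 suggests little or none. Correlation alone does not establish that changing a feature (for example, healthcare expenditure) would change life expectancy.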
• NumPy
• Pandas
• Matplotlib
• Seaborn
• Scikit-learn
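With the libraries listed above, the linear regression step could look roughly like the following sketch. It is an illustration under stated assumptions rather than the notebook's exact code: the two features (schooling and health expenditure), the synthetic data, and the 80/20 train/test split are all hypothetical choices.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the real dataset (assumed features).
rng = np.random.default_rng(42)
n = 300
X = np.column_stack([
    rng.uniform(4, 18, n),   # hypothetical: years of schooling
    rng.uniform(1, 12, n),   # hypothetical: health expenditure (% of GDP)
])
# Toy target: a linear function of the features plus Gaussian noise.
y = 45 + 1.8 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 2, n)

# Hold out 20% of the rows to evaluate the model on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit ordinary least squares on the training rows.
model = LinearRegression()
model.fit(X_train, y_train)

# Evaluate on the held-out rows using the R^2 score.
y_pred = model.predict(X_test)
r2 = r2_score(y_test, y_pred)
print("coefficients:", model.coef_)
print("intercept:", model.intercept_)
print("R^2 on test set:", r2)
```

On the real data, the feature matrix would come from the CSV columns after handling missing values, and the fitted coefficients indicate how the predicted life expectancy changes per unit of each feature.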
The dataset used can be found here:
https://www.kaggle.com/datasets/kumarajarshi/life-expectancy-who
• Contains the source code for the data analysis, visualization, and linear regression model, stored in a Jupyter notebook
• Contains a CSV file detailing the entire dataset used in this project