CSE 351: Introduction to Data Science - Final Project

View our presentation at https://tinyurl.com/cse351project

View the code on Google Colab: https://tinyurl.com/cse351projectcode

Background

What makes people in a country happy?

The World Happiness Report is a landmark survey of the state of global happiness that ranks countries by how happy their citizens perceive themselves to be. The report continues to gain global recognition as governments, organizations, and civil society increasingly use happiness indicators to inform their policy-making decisions. This project lets us gain insight into the state of happiness in the world today.

Datasets

The "World Happiness Report" dataset on Kaggle contains happiness data for countries from 2015 to 2019. We treat the data from 2015 to 2018 as the training set and the 2019 data as the test set. Descriptions of the data fields can be found on the FAQ page of the World Happiness Report at https://worldhappiness.report/faq/
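The year-based split described above can be sketched as follows. This is a minimal illustration on a made-up table, not the project's actual code: the column names `Country`, `Year`, and `Score` are assumptions and may not match the headers in the Kaggle files.

```python
import pandas as pd

# Hypothetical miniature of the merged 2015-2019 table; the column names
# and values are illustrative assumptions, not the Kaggle data.
data = pd.DataFrame({
    "Country": ["A", "B"] * 5,
    "Year": [2015, 2015, 2016, 2016, 2017, 2017, 2018, 2018, 2019, 2019],
    "Score": [7.5, 4.1, 7.4, 4.3, 7.6, 4.2, 7.6, 4.4, 7.7, 4.5],
})

# Years 2015-2018 form the training set; 2019 is held out as the test set.
train = data[data["Year"].between(2015, 2018)]
test = data[data["Year"] == 2019]

print(len(train), len(test))
```

Splitting by year rather than at random keeps the test set strictly later in time than the training data, which matches the setup described above.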

Python Libraries

Pandas - Data Analysis
NumPy - Scientific Computing
Matplotlib - Data Visualization
Seaborn - Statistical Visualization in Matplotlib
scikit-learn - Machine Learning
XGBoost - Gradient Boosting

Exploratory Data Analysis

Merge and clean the data.
What are the central tendencies of the happiness score over the years? Did they increase or decrease?
Which countries have stable rankings over the years? Which countries improved their rankings?
Visualize the relationship between happiness score and other features such as GDP, social support, freedom, etc.
If you were the president of a country, what would you do to make its citizens happier?
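The first few steps above (merging, cleaning, central tendencies, and ranking stability) might look roughly like the sketch below. The per-year tables, their column names, and all values are hypothetical stand-ins for the Kaggle CSV files. Using the standard deviation of a country's rank as a stability measure is one possible choice, not necessarily the one used in the notebook.

```python
import pandas as pd

# Hypothetical per-year tables standing in for the Kaggle CSV files.
# Column names and values are illustrative assumptions.
nan = float("nan")
frames = {
    2015: pd.DataFrame({"Country": ["A", "B", "C"], "Happiness Score": [7.0, 5.0, 3.0]}),
    2016: pd.DataFrame({"Country": ["A", "B", "C"], "Happiness Score": [7.2, 5.1, 3.3]}),
    2017: pd.DataFrame({"Country": ["A", "B", "C"], "Score": [7.1, 5.4, nan]}),
}

cleaned = []
for year, df in frames.items():
    # Harmonize the score column name and tag each row with its year.
    df = df.rename(columns={"Happiness Score": "Score"}).assign(Year=year)
    cleaned.append(df[["Country", "Year", "Score"]])

# Stack the years into one long table and drop rows with a missing score.
merged = pd.concat(cleaned, ignore_index=True).dropna(subset=["Score"])

# Central tendencies of the happiness score for each year.
summary = merged.groupby("Year")["Score"].agg(["mean", "median"])
print(summary)

# Rank countries within each year (1 = happiest), then measure stability as
# the standard deviation of each country's rank across years.
merged["Rank"] = merged.groupby("Year")["Score"].rank(ascending=False, method="min")
rank_spread = merged.groupby("Country")["Rank"].std()
print(rank_spread)
```

A rank spread near zero indicates a country whose ranking stayed stable, while a steadily decreasing rank number over the years indicates an improved ranking.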

Modeling - Machine Learning

Linear Regression - fits a linear relationship between the independent variables x (inputs) and the dependent variable y (output), then uses that relationship to predict y from x.
Random Forest - builds a set of decision trees, each trained on a randomly selected subset of the training set, and combines the predictions of all the trees (by averaging them, for regression).
XGBoost - builds regression trees by gradient boosting, minimizing a regularized (L1 and L2) objective function that combines a convex loss function (based on the difference between the predicted and target outputs) with a penalty term for model complexity (in other words, the complexity of the regression tree functions).
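As a rough illustration of how the first two models can be fit and compared, the sketch below trains scikit-learn's `LinearRegression` and `RandomForestRegressor` on synthetic data. The features, coefficients, noise level, and hyperparameters are all made up for the example and do not reflect the project's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for three happiness features (for example GDP, social
# support, freedom); the weights and noise level are illustrative assumptions.
rng = np.random.default_rng(0)
X_train = rng.uniform(0, 1, size=(400, 3))
X_test = rng.uniform(0, 1, size=(100, 3))
true_weights = np.array([2.0, 1.5, 1.0])
y_train = 3.0 + X_train @ true_weights + rng.normal(0, 0.1, size=400)
y_test = 3.0 + X_test @ true_weights + rng.normal(0, 0.1, size=100)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

rmse = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    # Root mean squared error on the held-out set.
    rmse[name] = float(np.sqrt(mean_squared_error(y_test, predictions)))
    print(f"{name}: RMSE = {rmse[name]:.3f}")
```

Assuming the `xgboost` package is installed, an `xgboost.XGBRegressor` could be added to the `models` dictionary and evaluated by the same loop, since it follows the scikit-learn `fit`/`predict` interface.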
