
Raw data cleaning, data visualization, data analysis, and machine learning for a predictive model.


Gedion-Alemayehu/Suicide-Rates-1985-2016-Analysis-and-Prediction


Suicide-Rates-1985-2016-Analysis-and-Prediction

This project covers raw data cleaning, data visualization, data analysis, and machine learning for a predictive model.

If a country wanted to lower its suicide rate, it would need to consider the variables that contribute to that rate. It could also look at other countries facing the same problem and analyze their commonalities and differences to inform its own approach. In this process, it would be imperative to set a goal, find the variables that correlate strongly with the overall rate, use them to predict it, and compare and contrast the results with the country's existing solutions. My project aims to find the variables that correlate most strongly with the suicide rate in order to better predict future rates, while also identifying the factor that contributes most to the overwhelmingly high number of suicides and thereby narrowing down a possible target for a solution.

After consulting multiple data-set sources, I decided to use Kaggle as my main data source. The data set I chose is the Suicide Rates Data Set. Suicide is a pertinent and perennial problem globally, especially in the Western world. The data set is large and detailed: it contains the recorded number of suicides in 101 countries from 1985 to 2016. It presents multiple variables as columns that add detail about the recorded suicides. It comprises 12 factors, including the country where each record was made, the population of each country, the gross domestic product of each country, the human development index of each country, the age group of each record, the generation to which each record belongs, and the sex for each record. It also presents alternative representations of some of these factors. For instance, in addition to the total population, it includes the number of suicides per 100 thousand people, so that countries can be compared fairly despite differences in population.
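As a rough illustration of the two ideas described above (normalizing suicide counts per 100 thousand people, then measuring how strongly a variable correlates with that rate), here is a minimal sketch in plain Python. It is not the project's actual code: the helper names `rate_per_100k` and `pearson`, the field names in `records`, and the numbers themselves are all assumptions made up for this example and are not taken from the Kaggle data set.

```python
from math import sqrt


def rate_per_100k(suicides, population):
    """Suicides per 100,000 people, so countries of different sizes compare fairly."""
    return suicides * 100_000 / population


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up records for illustration only (NOT values from the data set).
# The field names are hypothetical stand-ins for the real columns.
records = [
    {"suicides": 50, "population": 1_000_000, "gdp_per_capita": 10_000},
    {"suicides": 200, "population": 2_000_000, "gdp_per_capita": 20_000},
    {"suicides": 360, "population": 3_000_000, "gdp_per_capita": 30_000},
]

rates = [rate_per_100k(rec["suicides"], rec["population"]) for rec in records]
gdp = [rec["gdp_per_capita"] for rec in records]

# A coefficient near +1 or -1 suggests a strong linear relationship;
# a value near 0 suggests little linear relationship.
r = pearson(gdp, rates)
print(rates, round(r, 3))
```

With the real data set, the same calculation could be repeated for each candidate variable (for example the human development index) to rank variables by the strength of their correlation with the rate. A correlation on its own does not show that a variable causes the rate to change.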
