Stroke_prediction

Stroke prediction model for the DSTI Python labs project

What this project is for

The objective of this project was to train a machine learning model to predict whether a patient had a stroke, using a data set of 5110 patients. Each patient is one observation. The target variable is stroke (yes/no), and the candidate predictors are demographic variables (e.g., gender, age), lifestyle variables (e.g., smoking), and health history (e.g., hypertension, BMI, glucose). The project covers exploratory data analysis, feature engineering and selection, model training, and model evaluation. A summary project report is also included.
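As a rough illustration of the early steps described above (exploration, a simple feature-engineering step, and a feature screening test), here is a minimal sketch. It is not the project's actual code. The DataFrame `toy` and all of its values are invented for illustration, the column names are assumptions modeled on the variables mentioned above, and median imputation and the chi-square test are example choices rather than the methods confirmed by this README.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Toy stand-in for the real data set: every value here is invented.
toy = pd.DataFrame({
    "age": [67, 45, 80, 33, 59, 72, 25, 61],
    "hypertension": [1, 0, 1, 0, 0, 1, 0, 1],
    "bmi": [36.6, np.nan, 32.5, 24.1, 28.9, np.nan, 22.0, 30.2],
    "stroke": [1, 0, 1, 0, 0, 1, 0, 0],
})

# Exploratory step: class balance of the target and missing values per column.
class_counts = toy["stroke"].value_counts()
missing = toy.isna().sum()

# Feature-engineering step (an assumed choice): fill missing BMI with the median.
toy["bmi"] = toy["bmi"].fillna(toy["bmi"].median())

# Feature-screening step (an assumed choice): chi-square test of independence
# between a binary predictor and the stroke outcome.
table = pd.crosstab(toy["hypertension"], toy["stroke"])
chi2, p_value, dof, expected = stats.chi2_contingency(table)

print(class_counts)
print(missing)
print(f"chi2={chi2:.3f}, p={p_value:.3f}, dof={dof}")
```

With only eight invented rows the test result carries no meaning. The point is the shape of the workflow, which would be applied to the full 5110-patient data set.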

Origin story

This project was conceived as part of a lab course in Machine Learning with Python through the Data Science Tech Institute. It is based on the popular stroke prediction data set on Kaggle (https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset), for which many machine learning examples are available.

Python Libraries and more

See requirements.txt for the project dependencies. At a high level, the critical Python libraries for this project, with the aliases used in the code, are:

numpy (as np)

pandas (as pd)

matplotlib.pyplot (as plt)

seaborn (as sns)

scipy (stats module)
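The aliases listed above match the conventional import statements for these libraries. The block below is a sketch of what the top of a notebook might look like, assuming that the `scipy` entry refers to importing the `stats` submodule. It is not copied from the project's code.

```python
# Conventional aliases for the libraries listed above.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats  # assumption: "scipy stats" means the stats submodule

# Quick check of what was actually imported.
print(np.__name__, pd.__name__, plt.__name__, sns.__name__, stats.__name__)
```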

This project was first implemented in Jupyter Notebook (6.4.11) under Anaconda, using Python 3.7.

How to use this project

As of this writing, 934 published code examples apply machine learning models to this data, so the approach here is not novel. Comments and suggestions are still welcome in the issues section.