Disaster Response Project

Udacity Data Scientist Nanodegree

Andrzej Wodecki, 08.2019

Project overview

The goal of this project is to:

analyze the disaster message data provided by FigureEight
create a model that classifies new incoming messages into a set of pre-defined categories
create a web app that displays key characteristics of the dataset and enables an emergency worker to classify a new message.

Project structure

There are 3 main components of the project:

an ETL (Extract, Transform, Load) pipeline stored in the 'data' subfolder
a modelling component, where the preprocessed data is used to fit and evaluate the final model ('model' subfolder)
a web app, which displays both the data and the classification engine online ('app' subfolder).

ETL pipeline

The data/process_data.py script is used to:

load and merge the 'messages' and 'categories' datasets
perform necessary cleaning and transformations
store the resulting dataframe in a SQLite database file
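
The three ETL steps above can be sketched roughly as follows. This is a minimal illustration, not the code in process_data.py: the shared 'id' column, the table name 'messages', the toy rows and the helper names merge_and_clean and save_to_sqlite are all assumptions made for the example, and an in-memory database stands in for the disaster.db file.

```python
import sqlite3

import pandas as pd


def merge_and_clean(messages, categories):
    """Merge the two tables and drop exact duplicate rows."""
    # Assumption: both tables share an 'id' column to join on.
    merged = messages.merge(categories, on="id", how="inner")
    return merged.drop_duplicates().reset_index(drop=True)


def save_to_sqlite(df, conn, table_name="messages"):
    """Write the dataframe to a SQLite table, replacing any existing one."""
    df.to_sql(table_name, conn, if_exists="replace", index=False)


# Toy stand-ins for the two CSV files (hypothetical rows).
messages = pd.DataFrame(
    {
        "id": [1, 2, 2],
        "message": ["We need water", "Road is blocked", "Road is blocked"],
    }
)
categories = pd.DataFrame(
    {"id": [1, 2], "categories": ["water", "infrastructure"]}
)

clean = merge_and_clean(messages, categories)

# An in-memory database keeps the sketch free of side effects.
conn = sqlite3.connect(":memory:")
save_to_sqlite(clean, conn)
stored = pd.read_sql("SELECT * FROM messages", conn)
print(stored)
```

In the real script the two dataframes would come from pd.read_csv and the connection would point at the database file given on the command line.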

Modelling component

The model/train_classifier.py script is the heart of the solution. The machine learning pipeline implemented there:

Loads data from a database
Splits the data into training and test datasets
Fits the model (applying GridSearchCV)
Evaluates the final model
Exports it as a pickle file.
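
A minimal sketch of such a pipeline is shown below. The README only states that GridSearchCV is used; the choice of CountVectorizer, TfidfTransformer and a multi-output random forest, the parameter grid, and the toy messages with two labels ('water', 'shelter') are assumptions made for illustration, and the model is pickled to memory rather than to model.pkl.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Toy data (hypothetical): each row of Y holds the labels [water, shelter].
texts = [
    "we need drinking water",
    "no clean water in the village",
    "our house collapsed we need shelter",
    "families are sleeping outside without tents",
    "please send water and tents",
    "the weather is nice today",
    "water supply is cut off",
    "we lost our home and need a place to stay",
    "thank you for the update",
    "people need water and shelter urgently",
    "the well is dry and there is nothing to drink",
    "roof destroyed by the storm",
]
Y = np.array(
    [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [0, 0],
     [1, 0], [0, 1], [0, 0], [1, 1], [1, 0], [0, 1]]
)

# Split the data into training and test sets.
X_train, X_test, Y_train, Y_test = train_test_split(
    texts, Y, test_size=0.25, random_state=0
)

# Text features followed by one classifier per output category.
pipeline = Pipeline(
    [
        ("vect", CountVectorizer()),
        ("tfidf", TfidfTransformer()),
        ("clf", MultiOutputClassifier(RandomForestClassifier(random_state=0))),
    ]
)

# Fit the model, searching over a small (assumed) parameter grid.
param_grid = {"clf__estimator__n_estimators": [10, 20]}
search = GridSearchCV(pipeline, param_grid, cv=2)
search.fit(X_train, Y_train)

# Evaluate: fraction of correct predictions for each category.
Y_pred = search.predict(X_test)
accuracy_per_label = (Y_pred == Y_test).mean(axis=0)
print(search.best_params_, accuracy_per_label)

# Export the best model as a pickle (kept in memory here).
model_bytes = pickle.dumps(search.best_estimator_)
restored = pickle.loads(model_bytes)
```

With so few samples the scores are meaningless; the point is only the order of the steps: split, grid search, evaluate, pickle.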

Web application

This final component uses Flask to serve a website that enables an emergency worker to classify a new message. It is stored in the app subfolder and consists of:

run.py, the app that performs the necessary data operations, generates the figures and renders the final website
two templates stored in the templates subfolder: master.html with the main page and its extension (go.html), which displays the classification results for a new message.
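
The classification part of such an app might look roughly like the sketch below. It is not the actual run.py: the '/go' route, the 'query' parameter, the category names and the KeywordModel stand-in for the unpickled classifier are assumptions, and the sketch returns JSON instead of rendering go.html so that it stays self-contained.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical category names; the real ones come from the dataset.
CATEGORY_NAMES = ["water", "shelter"]


class KeywordModel:
    """Stand-in for the unpickled classifier, with a predict() method."""

    def predict(self, messages):
        # Flag a category when its name appears in the message text.
        return [
            [int(name in message.lower()) for name in CATEGORY_NAMES]
            for message in messages
        ]


# In a real app this would be loaded from the pickle file instead.
model = KeywordModel()


@app.route("/go")
def go():
    """Classify the message passed in the 'query' URL parameter."""
    query = request.args.get("query", "")
    labels = model.predict([query])[0]
    classification = dict(zip(CATEGORY_NAMES, labels))
    return jsonify(query=query, classification=classification)


# Exercise the route with Flask's test client, without starting a server.
client = app.test_client()
response = client.get("/go", query_string={"query": "We need water"})
result = response.get_json()
print(result)
```

A template-based version would pass the same query and classification values to render_template rather than to jsonify.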

Implementation

To run the app:

Run the ETL pipeline:
1. go to the data folder
2. run python process_data.py disaster_messages.csv disaster_categories.csv disaster.db to read in the two CSV files, process them and store the result in the disaster.db SQLite file.
Run the ML (Machine Learning) pipeline:
1. go to the model folder
2. run python train_classifier.py ../data/disaster.db model.pkl to execute the ML pipeline, which takes disaster.db as input and stores the final model in model.pkl (a pickle file).
Finally, run the web app:
1. go to the app folder
2. run python run.py and follow the on-screen instructions (just open http://0.0.0.0:3001 in your browser).

Requirements

You will need:

Flask==1.0.2
nltk==3.4
numpy==1.15.4
pandas==0.22.0
plotly==3.4.2
scikit-learn==0.20.1
SQLAlchemy==1.2.14

Acknowledgments

Udacity.com for a great project idea and a 'starter' pack (useful scripts)
FigureEight.com for very good datasets.
