The goal of this project was to wrangle WeRateDogs Twitter data to create interesting and trustworthy analyses and visualizations, and specifically to:
- Perform data wrangling (gathering, assessing, and cleaning) on the three provided sources of data.
- Store, analyze, and visualize the wrangled data.
- Report on 1) data wrangling efforts and 2) data analyses and visualizations.
In addition, as per the project specification, only original tweets/ratings that have images should be used in the analysis (no retweets or replies).
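The filtering rule above can be sketched with pandas. This is a minimal illustration, not the notebook's actual code: the column names `retweeted_status_id`, `in_reply_to_status_id`, and `jpg_url`, the helper name, and the sample rows are all assumptions made for the example.

```python
import pandas as pd


def keep_original_tweets_with_images(tweets: pd.DataFrame) -> pd.DataFrame:
    """Keep rows that are neither retweets nor replies and that have an image.

    Column names are assumptions for this sketch, not taken from the project.
    """
    is_retweet = tweets["retweeted_status_id"].notna()
    is_reply = tweets["in_reply_to_status_id"].notna()
    has_image = tweets["jpg_url"].notna()
    return tweets[~is_retweet & ~is_reply & has_image].reset_index(drop=True)


# Tiny made-up sample, for illustration only.
sample = pd.DataFrame(
    {
        "tweet_id": [1, 2, 3, 4],
        "retweeted_status_id": [None, 10.0, None, None],    # tweet 2 is a retweet
        "in_reply_to_status_id": [None, None, 20.0, None],  # tweet 3 is a reply
        "jpg_url": ["a.jpg", "b.jpg", "c.jpg", None],       # tweet 4 has no image
    }
)

originals = keep_original_tweets_with_images(sample)
print(originals["tweet_id"].tolist())
```

Using boolean masks keeps the three conditions readable and easy to check one at a time during the assessment step.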
This project was completed as part of Udacity's Data Analyst Nanodegree certification.
WeRateDogs is a Twitter account that posts funny or interesting facts and pictures, mainly about dogs.
The data wrangling process (data collection from different sources, data assessment, and cleaning) resulted in two analytics-ready datasets, which were used for a simplified analysis (a full analysis is outside the scope of this exercise).
- Data wrangling on Twitter datasets (Jupyter notebook online HTML version)
- Data wrangling on Twitter datasets (Jupyter notebook online version, 11MB)
- Data wrangling report (PDF)
- Data analysis insights (PDF)
- Basic descriptive statistics
- Timeseries visualization using moving (rolling) averages
- Python
- Jupyter Lab
- Libraries: pandas, numpy, requests, tweepy, datetime, os, json, matplotlib
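The two analysis techniques listed above (basic descriptive statistics and a time series smoothed with a moving average) can be sketched with pandas and matplotlib as follows. This is a hedged illustration only: the series values, the dates, the window size, and the helper names `rolling_mean` and `plot_with_rolling_mean` are made up for the example and are not the project's data or code.

```python
import pandas as pd


def rolling_mean(series: pd.Series, window: int = 3) -> pd.Series:
    """Moving average over `window` consecutive observations.

    The first `window - 1` values are NaN because the window is not yet full.
    """
    return series.rolling(window=window, min_periods=window).mean()


def plot_with_rolling_mean(series: pd.Series, smoothed: pd.Series):
    """Plot the raw series and its moving average on the same axes."""
    # Imported here so the numeric part runs without a plotting backend.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(series.index, series.values, alpha=0.4, label="raw")
    ax.plot(smoothed.index, smoothed.values, label="rolling mean")
    ax.set_xlabel("date")
    ax.set_ylabel(series.name)
    ax.legend()
    return ax


# Made-up daily values, for illustration only.
daily = pd.Series(
    [10.0, 12.0, 11.0, 13.0, 14.0],
    index=pd.date_range("2017-01-01", periods=5, freq="D"),
    name="rating",
)

summary = daily.describe()               # count, mean, std, min, quartiles, max
smoothed = rolling_mean(daily, window=3)
```

Calling `plot_with_rolling_mean(daily, smoothed)` draws both lines. A wider window gives a smoother curve but leaves more NaN values at the start of the series.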