Data Wrangle: Gather, Clean, Assess, and Visualize/ Twitter API -JSON- HTML

hwangmpaula/data-wrangling

Data-Wrangling

Note: This is the report of my wrangling effort. To me, data wrangling is like doing a chore. Imagine I have a messy, dirty room. What do I need first? I get a box to gather all my items. I look around the room and note the clutter and dirt so I can plan. Then I organize my things by sorting them into clothes, papers, books, and electronics.
In this project, I did those chores on data with my computer. I gathered, assessed, and cleaned data to complete my assignment for the Data Analysis Nanodegree at Udacity.

Introduction:

I wrangled data from a popular Twitter account called WeRateDogs, where people view and share cute and funny pictures of their dogs. This report details my wrangling effort for this project.
https://twitter.com/dog_rates?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor

The Goals of the Project:
  • Gather data
  • Assess data
  • Clean data
  • Visualize data
  • Answer the questions
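Other Files in data-wrangling:

Gather: I gathered three data files (twitter_archive_enhanced.csv, image_predictions.tsv, and tweet_json.txt). The repository contains the following files: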

  • wrangle_act.ipynb: code for gathering, assessing, cleaning, analyzing, and visualizing data
  • wrangle_report.pdf: documentation for data wrangling steps: gather, assess, and clean
  • act_report.pdf or act_report.html: documentation of analysis and insights into final data
  • twitter_archive_enhanced.csv: file as given
  • image_predictions.tsv: file downloaded programmatically
  • tweet_json.txt: file constructed via API
  • twitter_archive_master.csv: combined and cleaned data
  • any additional files (e.g. files for additional pieces of gathered data or a database file for the stored clean data)
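
As a rough illustration of how a file like tweet_json.txt could be read back, here is a minimal sketch. It assumes the file stores one JSON object per line and that each object has the keys `id`, `retweet_count`, and `favorite_count`. This README does not confirm that layout or those key names, and the helper `parse_tweet_lines` is hypothetical, not taken from wrangle_act.ipynb.

```python
import json
from typing import Dict, Iterable, List


def parse_tweet_lines(lines: Iterable[str]) -> List[Dict]:
    """Pull a few fields out of line-delimited tweet JSON.

    Assumption: one JSON object per line, with "id", "retweet_count",
    and "favorite_count" keys (not confirmed by this README).
    """
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tweet = json.loads(line)
        rows.append({
            "tweet_id": tweet["id"],
            "retweet_count": tweet.get("retweet_count"),
            "favorite_count": tweet.get("favorite_count"),
        })
    return rows


# Small in-memory sample standing in for the real file contents.
sample = [
    '{"id": 1, "retweet_count": 10, "favorite_count": 25}',
    '',
    '{"id": 2, "retweet_count": 3, "favorite_count": 7}',
]
rows = parse_tweet_lines(sample)
print(rows)
```

With the real file, the same helper could be called as `parse_tweet_lines(open("tweet_json.txt", encoding="utf-8"))`, again under the assumptions above.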

data-wrangling uses several open source projects to work properly:
  • [Jupyter Notebook] and [Python 3] - Jupyter Notebook is an open source tool used here to analyze data with Python code
  • [matplotlib] - used to support the analysis by displaying plots
  • [numpy] - a fundamental package for scientific computing
  • [Pandas] - used to clean, organize, convert, and merge the data
  • [requests] - "Requests is an elegant and simple HTTP library for Python, built for human beings." (source: http://docs.python-requests.org/en/master/)
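
To show what merging the gathered data with Pandas can look like, here is a minimal sketch on tiny in-memory tables. The column names (`tweet_id`, `rating_numerator`, `p1`, `favorite_count`), the helper `build_master`, and the choice of an inner join are all assumptions made for illustration. The actual steps are in wrangle_act.ipynb and may differ.

```python
import pandas as pd

# Tiny stand-ins for the three gathered tables (values are made up).
archive = pd.DataFrame({"tweet_id": [1, 2, 3], "rating_numerator": [12, 13, 11]})
predictions = pd.DataFrame({"tweet_id": [1, 2], "p1": ["golden_retriever", "pug"]})
counts = pd.DataFrame({"tweet_id": [1, 2, 3], "favorite_count": [25, 7, 4]})


def build_master(archive, predictions, counts):
    """Join the three tables on the assumed shared key, tweet_id.

    An inner join keeps only tweets present in every table.
    """
    merged = archive.merge(predictions, on="tweet_id", how="inner")
    merged = merged.merge(counts, on="tweet_id", how="inner")
    return merged


master = build_master(archive, predictions, counts)
print(master)
```

An inner join drops tweets that lack an image prediction. A left join (`how="left"`) would keep them with missing values instead, so the right choice depends on the questions being answered.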

License

MIT
