apanella edited this page Oct 14, 2013 · 7 revisions

For this project, we used three primary sources of data, detailed below.

311 Service Requests from Chicago's Open Data Portal

The City of Chicago publishes data for the twelve most popular service request types on its open data portal.

The data is available from January 2011 onward and is updated daily with new entries. As of summer 2013, more than 4,000,000 311 service requests are available on the data portal.

Most service request types share the same data model, as illustrated by this example of a pothole filling request from 2012:

field | value
CREATION DATE | 07/17/2012
STATUS | Completed
COMPLETION DATE | 07/30/2012
SERVICE REQUEST NUMBER | 12-01275916
TYPE OF SERVICE REQUEST | Pot Hole in Street
CURRENT ACTIVITY | Final Outcome
MOST RECENT ACTION | Pothole Patched
NUMBER OF POTHOLES FILLED ON BLOCK | 1
STREET ADDRESS | 5100 S NATCHEZ AVE
ZIP | 60638
X COORDINATE | 1133903.99013187
Y COORDINATE | 1870063.02512769
Ward | 23
Police District | 8
Community Area | 56
LATITUDE | 41.7996563192028
LONGITUDE | -87.78447500121634
LOCATION | (41.7996563192028°, -87.78447500121634°)

Some fields are redundant (for example, LOCATION repeats LATITUDE and LONGITUDE), but the redundancy lets the data be used without preprocessing in many situations. For instance, the X and Y coordinates carry the same information as longitude and latitude, expressed in a different coordinate reference system that is commonly used in local maps of the Chicago area.
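As a rough illustration of how a record with this data model might be handled, the sketch below parses the two date fields to compute the turnaround time and checks that LOCATION agrees with LATITUDE and LONGITUDE. The dictionary keys mirror the field names in the table above; the helper names (parse_location, days_to_complete) are made up for this example and are not part of the project's scripts.

```python
from datetime import datetime

# A few fields copied from the pothole example above (values kept as strings).
record = {
    "CREATION DATE": "07/17/2012",
    "COMPLETION DATE": "07/30/2012",
    "LATITUDE": "41.7996563192028",
    "LONGITUDE": "-87.78447500121634",
    "LOCATION": "(41.7996563192028°, -87.78447500121634°)",
}

def parse_location(text):
    """Turn '(lat°, lon°)' into a (lat, lon) tuple of floats."""
    lat, lon = text.strip("()").replace("°", "").split(",")
    return float(lat), float(lon)

def days_to_complete(rec):
    """Days between creation and completion (dates assumed to be MM/DD/YYYY)."""
    fmt = "%m/%d/%Y"
    created = datetime.strptime(rec["CREATION DATE"], fmt)
    completed = datetime.strptime(rec["COMPLETION DATE"], fmt)
    return (completed - created).days

lat, lon = parse_location(record["LOCATION"])
print(days_to_complete(record))  # 13
print((lat, lon) == (float(record["LATITUDE"]), float(record["LONGITUDE"])))  # True
```

Open requests would have no completion date, so a real pipeline would need to handle missing values before calling days_to_complete.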

Some fields are specific to the kind of service request, such as "number of potholes filled on block." We do not get into every detail of these ad-hoc fields, as they are self-explanatory.

The Python script munging/get_portal_311.py automatically downloads the most up-to-date request datasets in JSON format for all 12 types.
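The script itself is not reproduced here. As a hedged sketch of what such a download can look like, the example below builds paged request URLs for a Socrata-style JSON endpoint and collects pages until an empty one is returned. The dataset identifier xxxx-xxxx, the page size, the URL layout, and the function names are placeholders chosen for illustration; they are assumptions, not values taken from munging/get_portal_311.py.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Portal host; the "/resource/<id>.json" layout is an assumption of this sketch.
BASE = "https://data.cityofchicago.org/resource"

def page_url(dataset_id, limit=1000, offset=0):
    """Build the URL for one page of a dataset (dataset_id is a placeholder)."""
    query = urlencode({"$limit": limit, "$offset": offset})
    return f"{BASE}/{dataset_id}.json?{query}"

def download_all(dataset_id, limit=1000, fetch=None):
    """Fetch pages until an empty one comes back; returns a list of records."""
    if fetch is None:
        # Default fetcher: download the URL and decode the JSON body.
        fetch = lambda url: json.load(urlopen(url))
    rows, offset = [], 0
    while True:
        page = fetch(page_url(dataset_id, limit, offset))
        if not page:
            return rows
        rows.extend(page)
        offset += limit
```

Passing a custom fetch function makes it possible to try the paging logic without network access, for example with a small list of canned pages.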

311 Service Requests from Chapin Hall and Chicago's Department of Innovation and Technology (DoIT)

Chapin Hall has a much larger dataset of 311 service requests, available for academic research purposes. It dates back to 1998 and contains more than 600 request types, a much longer time period and a much broader set of request types than are available on the data portal, which makes this dataset a goldmine for researchers.

Like the portal datasets, this one is a subset of the City's official 311 database. Because of the way the data was extracted from the city's system, it actually has fewer fields than the open datasets. This could be a big drawback depending on the questions you're trying to answer.

field | value
Date | 01-JAN-08,
Type code | WBT
Type | Hydrant Open
Ward | 3X
Community Area | 2X
Address | 2XXX N XXXXXX AVE CHICAGO, IL 606XX
Location | (113330X.XXXXXXXX,191658X.XXXXXXXX)

(Note: the X's are added to mask the real entry, since it is not part of the City's open data.)
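Because this extract uses different formats from the portal data (for example, dates such as 01-JAN-08 and a single coordinate pair), some normalization is likely needed before the two sources can be combined. The sketch below is one possible way to do it; the function names are hypothetical, and the coordinate values in the usage lines are made-up placeholders, since the real entry is masked.

```python
from datetime import datetime

def parse_chapin_date(text):
    """Parse dates such as '01-JAN-08,' (a trailing comma is tolerated).

    Assumes English month abbreviations; two-digit years follow Python's
    %y rule, so 69-99 map to 1969-1999 and 00-68 map to 2000-2068.
    """
    return datetime.strptime(text.strip().rstrip(","), "%d-%b-%y").date()

def parse_xy(text):
    """Split a '(x,y)' coordinate pair into two floats."""
    x, y = text.strip("()").split(",")
    return float(x), float(y)

print(parse_chapin_date("01-JAN-08,"))      # 2008-01-01
print(parse_chapin_date("15-MAR-98"))       # 1998-03-15
print(parse_xy("(1000000.5,1900000.25)"))   # placeholder coordinates
```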

Big caveat about our 311 data

Neither of these sources of 311 data differentiates between requests coming from citizens and requests entered as work orders by City of Chicago employees, and there is no obvious way to distinguish between the two. This means that any analysis of these requests will pick up on service request trends produced by both residents and city employees.

Census data

The demographic data we used is openly available. It was obtained from the U.S. Census Bureau and from other indirect sources, and comes from the last two censuses (2000 and 2010) as well as from the American Community Survey (2007-2012).

We used Census data at different levels of aggregation, as described below.

Demographics by community area

For each of the 77 Chicago community areas, we collected the following information:

  • Number ID
  • Name
  • Total population
  • Median household income
  • Proportion of Hispanic population
  • Proportion of Black population
  • Proportion of White population
  • Proportion of Asian population
  • Proportion of population of other ethnicity/race

This data comes from the 2010 Census and was downloaded from the website of Rob Paral and Associates.
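Since portal requests carry a Community Area number, they can in principle be matched to these demographics through the Number ID. The sketch below shows one way such a join might look, computing requests per 1,000 residents. All names and every number in it are invented placeholders for illustration, not values from the datasets described here.

```python
from collections import Counter

# Placeholder demographics keyed by community area number (invented values).
areas = {
    56: {"name": "Area A", "population": 30000},
    23: {"name": "Area B", "population": 50000},
}

# Placeholder requests; only the community area field matters for this join.
requests = [
    {"Community Area": 56},
    {"Community Area": 56},
    {"Community Area": 23},
]

def requests_per_1000(requests, areas):
    """Count requests per community area and scale by population."""
    counts = Counter(r["Community Area"] for r in requests)
    return {
        area_id: 1000 * counts[area_id] / info["population"]
        for area_id, info in areas.items()
    }

rates = requests_per_1000(requests, areas)
for area_id, rate in sorted(rates.items()):
    print(area_id, areas[area_id]["name"], round(rate, 3))
```

Normalizing by population is only one possible choice; which denominator is appropriate depends on the question being asked.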

Demographics by census tract

For each of the 800 census tracts in the City of Chicago, we retrieved the following information from the American Community Survey (2007-2011):

  • Tract ID
  • Total population
  • Proportion of males and females
  • Proportion of population divided by age groups (0-14, 15-24, 25-64, 65+)
  • Proportion of population divided by ethnicity/race (White, Black, Asian, Hispanic, Other)
  • Median income
  • Proportion of families below the poverty line
  • Unemployment rate

This data was retrieved using the FactFinder interface on the Census Bureau website, and was organized and cleaned using the two scripts in the munging/ACS_Census directory.