Last year 28 U.S. cities published open data policies, signaling a national move from reactively responding to public records requests to systematically and proactively publishing open data. For most of the last 50 years, Freedom of Information (FOI) laws have ensured that residents have access to information on government activities as a fundamental, democratic right. But now that many of these cities are “setting the default to open,” city staff are left guessing about how to balance these two distinct but equally crucial channels for public access to information.
From a research perspective, the relationship between FOI laws and open data policies can be framed as a question: are open data and FOI laws competitors or complements? Understanding the relationship between open data and public records requests (PRRs) could help city staff better allocate internal resources to improving one channel or the other. City staff need to understand exactly when and why residents still struggle to engage with public information, and then apply resources to improve the way both open data and FOI laws work.
This research aims to answer the following questions:
- Does adopting an open data policy affect the volume of PRRs a city receives? Does the robustness of a city’s open data program affect the magnitude of its effect?
- What types of information are most requested via PRR? Do we see changes in the types of information requested by citizens after the adoption of an open data program?
We originally aimed to answer two other research questions, but were unable to do so due to insufficient available data:
- Does adopting an open data policy affect the variety of requestors submitting PRRs?
- Does adopting an open data policy affect the time it takes cities to complete PRRs?
Version 1.0 (edited 06/28/2018) of our Pre-Analysis Plan can be found at the link.
This project is complete, and all documents in this repository are final. The requirements.txt file was undergoing updates as of 10/9/18 and is up to date as of 12/26/18.
- Blog Posts:
- White Paper (full methodology and results)
This repository is structured with the following directories:
- src/: contains the code for this project. The key directories and files are as follows:
  - data/: contains code for gathering and cleaning data for analysis
    - Census_Data.ipynb: gathers and cleans American Community Survey data from the Census API
    - NR_Scrape.ipynb: scrapes and cleans data from NextRequest online public records request platforms
    - Data_Cleaning.ipynb: cleans data for the research question 1 analysis
    - cities.json: contains metadata on the cities included in the sample
  - analysis/
    - q1_analysis_final.Rmd: R Markdown file containing the analysis for research question 1
    - analysis_for_blogs.R: R file containing additional analysis and plots for the blog posts and white paper
    - PRR_Topic_Popularity_LDA.ipynb: main file containing the analysis for research question 2
    - LDA_Model_Tests.ipynb: contains the full Latent Dirichlet Allocation (LDA) modeling for research question 2
    - plots: contains all plots generated by the analysis_for_blogs.R file
  - results/: contains model output from the q1_analysis_final.Rmd file
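
To make research question 1 more concrete, here is a minimal Python sketch of a before/after comparison of monthly PRR volume. It is not the method used in q1_analysis_final.Rmd, which should be consulted for the actual analysis. The function names, the request dates, and the policy adoption date are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from datetime import date
from statistics import mean


def monthly_counts(request_dates):
    """Count requests per (year, month). Months with no requests are absent."""
    return Counter((d.year, d.month) for d in request_dates)


def before_after_means(request_dates, policy_date):
    """Return mean monthly request counts before and after the adoption month.

    The adoption month itself is excluded from both groups. A side with no
    observed months yields None. Months with zero requests are not counted,
    so this simple sketch overstates the mean if such months exist.
    """
    counts = monthly_counts(request_dates)
    cutoff = (policy_date.year, policy_date.month)
    before = [n for month, n in counts.items() if month < cutoff]
    after = [n for month, n in counts.items() if month > cutoff]
    return (mean(before) if before else None,
            mean(after) if after else None)


# Hypothetical request dates for one city (not real data).
example_requests = (
    [date(2017, 1, 5)] * 3      # January: 3 requests
    + [date(2017, 2, 10)] * 5   # February: 5 requests
    + [date(2017, 4, 2)] * 2    # April: 2 requests
    + [date(2017, 5, 20)] * 4   # May: 4 requests
)
# Hypothetical open data policy adoption date.
example_policy_date = date(2017, 3, 15)

print(before_after_means(example_requests, example_policy_date))
```

A raw comparison like this ignores trends, seasonality, and differences between cities, so it can only suggest whether volume moved after adoption, not whether the policy caused the change.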