Repository for the web scraping services that support the 'Realizing Rights' research program at Brown University. The multithreaded application crawls school district websites to find the location on each site where school board meeting information is kept. It also records whether school district social media links are present. Results are written to a .csv file.
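The following is a minimal sketch of the general approach described above, not the code in this repository. It assumes the page HTML has already been downloaded, and every name in it (`SOCIAL_DOMAINS`, `scan_page`, `crawl`, `to_csv`, the column names, and the "board" keyword heuristic) is hypothetical and chosen only for illustration.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical list of social media domains; the real project may track others.
SOCIAL_DOMAINS = ("facebook.com", "twitter.com", "instagram.com", "youtube.com")


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def is_social(link):
    """True if the link's host is one of SOCIAL_DOMAINS or a subdomain of one."""
    host = (urlparse(link).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in SOCIAL_DOMAINS)


def scan_page(district, html):
    """Build one output row for a district from its page HTML."""
    parser = LinkCollector()
    parser.feed(html)
    # Assumed heuristic: the first link mentioning "board" is the meeting page.
    board_link = next((l for l in parser.links if "board" in l.lower()), "")
    return {
        "district": district,
        "has_social": any(is_social(l) for l in parser.links),
        "board_link": board_link,
    }


def crawl(pages, max_workers=4):
    """Scan pages in parallel threads; `pages` maps district name to HTML."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input items.
        return list(pool.map(lambda item: scan_page(*item), pages.items()))


def to_csv(rows):
    """Serialize the result rows to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["district", "has_social", "board_link"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

A thread pool suits this kind of work because most of the time is spent waiting on network responses rather than computing, so several districts can be processed at once.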
- Python 3
- Conda
- Navigate to repo location and create conda environment
conda env create -f env.yml
- Activate the new environment before running any code locally
conda activate real_right_env
- Save Excel/CSV output of school district information in the /data folder
- Specify the filepath of the data file from the previous step in the source_info dictionary in main.py
- Specify filepath and file name of output data in main.py
- Specify number of districts to process in the program with max_dist_runs
- Run main.py
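
As a rough illustration of the settings described above, the values edited in main.py might look something like the sketch below. Only `source_info` and `max_dist_runs` are named in this README. The dictionary key, the file names, the `output_path` variable, and the `check_settings` helper are all assumptions made for this example, so check main.py for the names it actually uses.

```python
from pathlib import Path

# Hypothetical key and file name; main.py may structure source_info differently.
source_info = {
    "filepath": Path("data") / "district_info.csv",
}

# Hypothetical variable name for the output location.
output_path = Path("data") / "output.csv"

# Number of districts to process in one run.
max_dist_runs = 50


def check_settings(source_info, output_path, max_dist_runs):
    """Return a list of problems found in the settings (empty if none)."""
    problems = []
    if Path(source_info["filepath"]).suffix.lower() not in {".csv", ".xlsx", ".xls"}:
        problems.append("input file should be Excel or CSV")
    if Path(output_path).suffix.lower() != ".csv":
        problems.append("output file should be a .csv")
    if not isinstance(max_dist_runs, int) or max_dist_runs < 1:
        problems.append("max_dist_runs should be a positive integer")
    return problems
```

Running a check like this before starting a long crawl can catch a mistyped path or limit early.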