(New Office) Budget Office (Phase 1): create scraper for this office #159

higorspinto · 2020-05-26T22:44:00Z

Description

The Budget Office is one of the new offices whose datasets need to be ingested into the data portal. To do this, we need a new scraper that crawls and parses the office's available webpages.

https://www2.ed.gov/about/overview/budget/index.html

Acceptance Criteria

We have a functional crawler that crawls through the webpages of the office
We have a functional parser that understands the page structures and generates structured data
Datasets are produced when the scraper is run
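
As a rough illustration of the crawler criterion, the sketch below shows one way to keep a crawl inside the office's section of the site. It uses only the Python standard library rather than whatever framework the pipeline actually uses. The names `LinkCollector`, `in_scope_links`, and `ALLOWED_PREFIX` are hypothetical, and the assumption that every Budget Office page lives under `/about/overview/budget/` is inferred from the URL above and has not been verified against the site.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse

START_URL = "https://www2.ed.gov/about/overview/budget/index.html"
# Assumption: only pages under this path prefix belong to the Budget Office.
ALLOWED_PREFIX = "/about/overview/budget/"


class LinkCollector(HTMLParser):
    """Collect the raw href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def in_scope_links(page_url, html):
    """Return absolute, de-duplicated links that stay inside the office's section."""
    collector = LinkCollector()
    collector.feed(html)
    base = urlparse(page_url)
    seen, result = set(), []
    for href in collector.hrefs:
        # Resolve relative links and drop "#fragment" parts so the same page
        # is not queued twice.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        if parts.netloc != base.netloc or not parts.path.startswith(ALLOWED_PREFIX):
            continue
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result
```

A crawler loop would fetch `START_URL`, pass the response body to `in_scope_links`, and queue the returned URLs. Fetching, retries, and politeness delays are deliberately left out of this sketch.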

Tasks

Identify the possible page structures in the target site
Write one or more parsers that cover as many cases as possible
Test that the scraper runs correctly within the pipeline
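
The parsing task could look roughly like the sketch below, which turns one page into a structured record. It is a minimal example under stated assumptions, not the project's parser: the output keys (`source_url`, `title`, `resources`), the `DatasetPageParser` class, and the `RESOURCE_EXTENSIONS` set are all hypothetical, and the real page structures still need to be identified as described in the first task.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumption: links with these extensions are treated as dataset resources.
RESOURCE_EXTENSIONS = {".xls", ".xlsx", ".csv", ".pdf", ".doc", ".docx", ".zip"}


class DatasetPageParser(HTMLParser):
    """Extract the page title and links to downloadable files."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.title = ""
        self.resources = []
        self._in_title = False
        self._current = None  # resource being built while inside an <a> tag

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href") or ""
            url = urljoin(self.page_url, href)
            ext = os.path.splitext(urlparse(url).path)[1].lower()
            if ext in RESOURCE_EXTENSIONS:
                self._current = {"url": url, "format": ext.lstrip("."), "name": ""}

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._current is not None:
            self._current["name"] += data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._current is not None:
            # Collapse runs of whitespace in the link text.
            self._current["name"] = " ".join(self._current["name"].split())
            self.resources.append(self._current)
            self._current = None


def parse_dataset_page(page_url, html):
    """Return a structured record for one page (hypothetical schema)."""
    parser = DatasetPageParser(page_url)
    parser.feed(html)
    return {
        "source_url": page_url,
        "title": " ".join(parser.title.split()),
        "resources": parser.resources,
    }
```

Pages that follow a different layout would get their own parser, with a small dispatcher choosing between them based on the page structure.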

Jira Card

higorspinto changed the title ~~(New Office) Office of the General Counsel: create scraper for this office~~ (New Office) Office of the General Counsel: create scraper for this office (Phase 1) May 27, 2020

osahon-okungbowa changed the title ~~(New Office) Office of the General Counsel: create scraper for this office (Phase 1)~~ Budget Office (New Office): create scraper for this office May 27, 2020

osahon-okungbowa changed the title ~~Budget Office (New Office): create scraper for this office~~ (New Office) Budget Office (Phase 1): create scraper for this office May 27, 2020

higorspinto commented May 26, 2020 • edited by osahon-okungbowa
