Improve scraper for Office of the Chief Information Officer (Phase 2) #165

higorspinto · 2020-05-27T00:12:39Z

During phase 1 we created a functional scraper for crawling and parsing data from this office. The scraped data was successfully ingested into the data portal.

For phase 2, we need to improve the quality of metadata and data-content for the datasets being generated by the scraper.

https://www2.ed.gov/about/offices/list/ocio/index.html

Acceptance Criteria

There is a marked improvement in the quality of metadata and data-content of the datasets produced by the scraper.
The improved-quality datasets are visible on the data portal.

Tasks

Ensure datasets produced have description metadata
Ensure datasets have publisher metadata
Improve other metadata (use defaults where available)
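
As a rough illustration of the tasks above, the following sketch fills missing or blank metadata fields from a set of defaults and reports which required fields are still empty. It is not taken from the scraper's code: the function names `apply_defaults` and `missing_fields`, the `REQUIRED_FIELDS` tuple, and the dict-based dataset shape are all assumptions made for this example.

```python
from typing import Any, Dict, List, Sequence

# Hypothetical list of fields this issue asks for; adjust to the real schema.
REQUIRED_FIELDS = ("description", "publisher")


def _is_blank(value: Any) -> bool:
    """Treat None and whitespace-only strings as missing."""
    return value is None or (isinstance(value, str) and not value.strip())


def apply_defaults(dataset: Dict[str, Any], defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `dataset` with blank or missing fields filled from `defaults`."""
    result = dict(dataset)  # shallow copy so the scraped record is left untouched
    for key, value in defaults.items():
        if _is_blank(result.get(key)):
            result[key] = value
    return result


def missing_fields(dataset: Dict[str, Any],
                   required: Sequence[str] = REQUIRED_FIELDS) -> List[str]:
    """List the required fields that are still blank after defaults were applied."""
    return [field for field in required if _is_blank(dataset.get(field))]


if __name__ == "__main__":
    # Illustrative values only; real defaults would come from the project's config.
    scraped = {"title": "Example dataset", "publisher": ""}
    defaults = {"publisher": "Office of the Chief Information Officer"}
    cleaned = apply_defaults(scraped, defaults)
    print(cleaned)
    print(missing_fields(cleaned))
```

A check like `missing_fields` could run before ingestion so that datasets without a description or publisher are flagged rather than silently published.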

Jira Card
