[EN] Extraction of Electronic Catalogue of Armenian Cultural Values #12

Open
vvbabayan opened this issue Jun 7, 2023 · 0 comments
Labels
extraction: Tasks that require data extraction (scraping) skills
topic-culture: Tasks dedicated to Armenian culture, language and history

Goal
The goal is to collect the data from the Armenian Treasury website.

Tasks
The sections are listed on the left side of the main page, and many of them contain subsections. For every item, regardless of its section, the same metadata are provided: section, subsection, genre, owning museum, description, material, place of production, link in the catalogue, etc. As output, we welcome any machine-readable file (such as JSON, XML, or a flat CSV) listing the items and the metadata provided for them. After that, you need to download all the images of the exhibits in this dataset and put them in temporary storage, from which the Open Data Armenia team will transfer them to permanent storage.
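As a rough illustration of the flat output described above, here is a minimal Python sketch. The `Exhibit` class, its field names, and the `write_csv` and `to_json` helpers are assumptions made for this example; the task lists the metadata but does not prescribe a schema, and the scraping step itself is not shown because it depends on the site's markup.

```python
import csv
import json
import sys
from dataclasses import asdict, dataclass, fields


@dataclass
class Exhibit:
    # Hypothetical field names derived from the metadata listed in the task.
    section: str
    subsection: str
    genre: str
    museum: str
    description: str
    material: str
    place_of_production: str
    catalogue_url: str
    image_url: str = ""


def write_csv(exhibits, out):
    """Write one row per exhibit to an open text file object."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(Exhibit)])
    writer.writeheader()
    for exhibit in exhibits:
        writer.writerow(asdict(exhibit))


def to_json(exhibits):
    """Serialize exhibits as a JSON array; ensure_ascii=False keeps Armenian text readable."""
    return json.dumps([asdict(e) for e in exhibits], ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Placeholder values only, not real catalogue data.
    sample = [
        Exhibit(
            section="example section",
            subsection="example subsection",
            genre="example genre",
            museum="example museum",
            description="example description",
            material="example material",
            place_of_production="example place",
            catalogue_url="https://example.org/item/1",
        )
    ]
    write_csv(sample, sys.stdout)
```

When writing the CSV to disk, opening the file with `encoding="utf-8"` and `newline=""` avoids mangled Armenian characters and doubled line endings.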

Context
The website consists of various sections that classify the cultural heritage held in the listed museums. The data are available in Armenian only; however, knowledge of the language is not required to accomplish this task. Should you have any questions, ask them in our Telegram chat or contact Valeria on Telegram.

Requirements
A public GitHub repository should be created to store and publish the code and the data under one of the free and open licenses, such as Creative Commons or MIT.

Wishes
It would be best if your code is reusable, that is, if it can be run again by anyone who wants to update the dataset at a later point. For the same reason, we encourage you to comment your code, supplement it with at least a very brief README, and specify the requirements and dependencies necessary to use the code.
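One possible way to keep reruns cheap is to skip images that are already on disk. The sketch below is only an illustration under assumptions: `download_image` and `local_name` are hypothetical helpers, the file name is taken from the last segment of the image URL, and the `fetch` parameter exists so the function can be exercised without network access.

```python
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def local_name(image_url):
    """Derive a file name from the last path segment of the URL."""
    name = Path(urlparse(image_url).path).name
    return name or "unnamed"


def download_image(image_url, target_dir, fetch=urllib.request.urlopen):
    """Download image_url into target_dir unless the file already exists.

    Returns (path, downloaded), where downloaded is False if the file
    was already present and no request was made.
    """
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / local_name(image_url)
    if target.exists():
        return target, False
    with fetch(image_url) as response:
        data = response.read()
    target.write_bytes(data)
    return target, True
```

If two different URLs end in the same file name, this naming scheme would make the second one look already downloaded, so a real scraper may prefer to name files after a catalogue identifier instead.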

Resources

  1. Armenian Treasury website

Prepared by
The Open Data Armenia team prepared this task.
