This repository has been archived by the owner on Jan 25, 2024. It is now read-only.

Online Resources And Data Sets

Mike Caprio edited this page Oct 13, 2016 · 80 revisions

#AMNH Library Systems

These systems are the central focus of the AMNH API Portal challenge and are key resources for many of the challenges - see each challenge for details. Teams working on challenges that use these systems should collaborate with teams working on the AMNH API Portal!


###Library catalog (Sierra)

Sierra is the online catalog for all analog library media. It contains descriptions of books, serials, archives, art, videos, and special collections. Some records link to DSpace and the Biodiversity Heritage Library. It's the best place to begin a search because it crosses formats and locations. Downside: for collections-based material, the descriptions may be too general, so relevant content may not come up in a search.

Sierra example: American Museum novitates.


###Digital Special Collections (Omeka)

Omeka contains digitized images and catalog numbers from the library's vast photo negative and slide collection, primarily used in web publishing. It also includes Rare Books and some archives materials. Pull for image-based results and metadata (location, content, and identities may be included).


###Digital Library (DSpace)

DSpace is a digital repository for AMNH publications such as scientific publications, Annual Reports, and other documents. It also includes some research data sets, manuscripts and dissertations from the Richard Gilder Graduate School. There is duplication of metadata from Sierra, the Library's catalog. It can possibly be used for keyword search of OCR text for relevant hits. This could borrow from the results display made for Snippet Search to highlight the best matches. Publication covers may be used as graphic representations. Images may be searched based on their captions.
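As a rough illustration of the keyword-search idea, the sketch below pulls short excerpts around each match in a block of OCR text and marks the matched term. It does not call DSpace or reproduce Snippet Search; the function name `find_snippets`, its parameters, and the sample text are all hypothetical, and the OCR text is assumed to have been retrieved already as a plain string.

```python
import re


def find_snippets(text, keyword, context=40, marker="**"):
    """Return short excerpts around each case-insensitive match of keyword.

    Hypothetical helper: `context` is the number of characters kept on
    each side of a match, `marker` wraps the matched term.
    """
    snippets = []
    for match in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        start = max(match.start() - context, 0)
        end = min(match.end() + context, len(text))
        excerpt = (
            text[start:match.start()]
            + marker + match.group(0) + marker
            + text[match.end():end]
        )
        snippets.append(excerpt.strip())
    return snippets


# Made-up OCR text, for illustration only.
ocr_text = (
    "The Whitney South Sea Expedition collected birds across Polynesia. "
    "Later the expedition moved west."
)

for snippet in find_snippets(ocr_text, "expedition", context=15):
    print(snippet)
```

A results page could rank these excerpts (for example, by number of matches per document) before displaying them.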


###Special Collections Archives (ArchivesSpace)

ArchivesSpace holds descriptions of archives at both the collection level and the container level, where container-level descriptions are specific to series or folders within a collection. An abbreviated collection description is also available in the Library catalog, but ArchivesSpace goes deeper in describing the materials found in a collection. It can be used to pull relevant data from folder-level descriptions, offering better potential for discovery.


###AMNH Library Authorities (xEAC)

EAC-CPF - Encoded Archival Context for Corporate Bodies, Persons, and Families - is an XML schema. It provides a grammar for encoding the names of creators of archival materials and related information. xEAC is an open-source XForms-based application for creating and managing EAC-CPF collections. The AMNH implementation is a database of museum-related entities: people, departments (some), permanent halls, and expeditions. It provides general information about the entities, with some entries more detailed than others. It includes links to related entities and related materials, providing a very rich resource for entity networks, or identity constellations.

See the Whitney South Sea Expedition. The names and resources have all been hard-coded into the record, but we anticipate a future where the relationships may be pulled together dynamically. Very useful to note: controlled and local versions of the names are identified; geographic locations are listed separately and linked to external databases; and the record includes structured data such as timelines, associated dates, and roles. There is huge potential for visualization using this metadata. Unlike the above resources, xEAC provides information on the who, what, when, where and why of content creators. All identities in the database have unique IDs.
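To give a feel for how an identity constellation could be pulled out of an EAC-CPF record, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The element names (`recordId`, `nameEntry`, `part`, `cpfRelation`, `relationEntry`) and the namespace come from the EAC-CPF 2010 schema. The sample record, its IDs, and the `parse_record` helper are made up for illustration, and actual xEAC exports may be structured differently.

```python
import xml.etree.ElementTree as ET

# Namespaces of the EAC-CPF 2010 schema and XLink.
NS = {"eac": "urn:isbn:1-931666-33-4"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Made-up record for illustration; not an actual AMNH entry.
SAMPLE = """<eac-cpf xmlns="urn:isbn:1-931666-33-4"
         xmlns:xlink="http://www.w3.org/1999/xlink">
  <control><recordId>example-0001</recordId></control>
  <cpfDescription>
    <identity>
      <entityType>corporateBody</entityType>
      <nameEntry><part>Example Expedition</part></nameEntry>
    </identity>
    <relations>
      <cpfRelation cpfRelationType="associative" xlink:href="example-0002">
        <relationEntry>Example Person</relationEntry>
      </cpfRelation>
    </relations>
  </cpfDescription>
</eac-cpf>"""


def parse_record(xml_text):
    """Extract the record ID, names, and related entities from one record."""
    root = ET.fromstring(xml_text)
    relations = [
        {
            "type": rel.get("cpfRelationType"),
            "target": rel.get(XLINK_HREF),
            "label": rel.findtext("eac:relationEntry", namespaces=NS),
        }
        for rel in root.findall(".//eac:cpfRelation", NS)
    ]
    return {
        "id": root.findtext("eac:control/eac:recordId", namespaces=NS),
        "names": [p.text for p in root.findall(".//eac:nameEntry/eac:part", NS)],
        "relations": relations,
    }


record = parse_record(SAMPLE)
# Each relation becomes an edge (source ID, target ID, relation type).
edges = [(record["id"], r["target"], r["type"]) for r in record["relations"]]
print(record["names"], edges)
```

Running this over many records would yield an edge list that a graph or visualization library could consume, since every identity has a unique ID.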


###SNAC, International cooperative for entities

SNAC is also a database for entities. It has been called the "Facebook for dead people" because of the relational networks. The current beta prototype includes hundreds of thousands of records, mostly derived from catalog and finding aid (archives) descriptions. This may not be helpful in this challenge, but we are planning to contribute many of our records to the database. Below is a [rough visualization](http://socialarchive.iath.virginia.edu/ark:/99166/w6wx3qrw?mode=RGraph) of their [Whitney South Sea Expedition](http://socialarchive.iath.virginia.edu/ark:/99166/w6wx3qrw) identity constellation.

Whitney South Sea Expedition identity constellation by SNAC


###Biodiversity Heritage Library (BHL)

AMNH is a participating member of the BHL consortium of natural history and botanical libraries. We contribute digitized publications and field work to be accessed openly in a global "biodiversity commons." There is some overlap between DSpace and BHL. Like DSpace, this is useful for searching content within a resource and pulling graphic elements or images for the results page.


###WordPress, blog platform for special projects (Hidden Collections)

An outreach tool used to highlight archival material housed in our collections and describe our grant project goals. Posts feature unique artifacts and general backgrounds written by student interns. We published lists of entities in spreadsheets and other useful resources as a way of documenting our process of building out our descriptive metadata. This site can be mined for images relating to search queries. Not all images are captioned, but the content is described in the blog posts.


#Computer Vision / Text Reading Implementations