An Omeka Classic plugin originally developed for the digital editions of the European Holocaust Research Infrastructure. The plugin supports the editorial workflow which links document annotations to controlled vocabularies (EHRI, Geonames), EHRI archival descriptions and other resources. The plugin makes it possible to use documents encoded in the TEI P5 XML format to build a rich Omeka presentation.
The plugin allows you to:
- enhance headers of TEI documents with metadata from the EHRI Portal and Geonames
- create Omeka items from uploaded TEI files, with Omeka metadata elements populated via customisable XPath mappings
- associate images and other tertiary files
- create Neatline exhibits from location data and other metadata in the TEI headers
It also adds various view helpers for rendering TEI-derived info and a few Neatline shortcodes for use within SimplePages and ExhibitBuilder text blocks.
The plugin can be used together with the EHRI Omeka Editions Theme.
This plugin contains a command-line tool for looking up entity references in TEI body text and adding enriched canonical entity data to the header <sourceDesc> section. See the tools README file for details.
Since 0.0.3 this functionality is available on file import, with caveats.
An edition consists of a set of master TEI XML documents and associated files which might consist of:
- images
- PDFs
- extra TEIs containing translations etc
The plugin relies on file naming conventions to map uploaded TEIs and associated files to the Dublin Core identifier field of Omeka items. The master TEI XML file and associated TEIs are named as follows:
[dc-identifier]_[langcode].xml
For example, for an item with the identifier abc-123-def-456 in English, the TEI would be named:
abc-123-def-456_EN.xml
Note: an underscore must separate the ISO-639-1 language code from the identifier.
Associated images, PDFs etc must be named using an ascending index number instead of the language code, for example:
abc-123-def-456_01.jpg
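The naming convention can be checked before upload with a short script. The following is a minimal Python sketch, not the plugin's own code: the regular expression, the `ParsedName` type and the `parse_edition_filename` function are hypothetical names, and it assumes a language code is exactly two letters and an index consists only of digits.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: identifier, underscore, then either a two-letter
# language code (TEI files) or a numeric index (images, PDFs etc).
NAME_RE = re.compile(
    r"^(?P<identifier>.+)_(?P<suffix>[A-Za-z]{2}|[0-9]+)\.(?P<ext>[^.]+)$"
)


class ParsedName(NamedTuple):
    identifier: str
    language: Optional[str]  # set for TEI files, e.g. "EN"
    index: Optional[int]     # set for indexed files, e.g. 1
    extension: str


def parse_edition_filename(filename: str) -> Optional[ParsedName]:
    """Split a file name into identifier plus language code or index.

    Returns None when the name does not follow the convention.
    """
    match = NAME_RE.match(filename)
    if match is None:
        return None
    identifier = match.group("identifier")
    suffix = match.group("suffix")
    ext = match.group("ext").lower()
    if suffix.isdigit():
        return ParsedName(identifier, None, int(suffix), ext)
    if ext != "xml":
        return None  # a language code is only expected on TEI XML files
    return ParsedName(identifier, suffix, None, ext)
```

With this sketch, `parse_edition_filename("abc-123-def-456_EN.xml")` yields the identifier `abc-123-def-456` with language `EN`, while a name without the underscore separator returns `None`.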
When uploading master TEI documents the plugin will extract information from the TEI header and use it to populate Omeka metadata fields. These XML-Omeka-element mappings are configurable but the defaults are as follows:
- Identifier
tei:TEI/tei:teiHeader/tei:profileDesc/tei:creation/tei:idno
- Title
tei:TEI/tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title
- Subject
tei:TEI/tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:list/tei:item/tei:name
- Description
tei:TEI/tei:teiHeader/tei:profileDesc/tei:abstract
- Creator
tei:TEI/tei:teiHeader/tei:profileDesc/tei:creation/tei:persName,
tei:TEI/tei:teiHeader/tei:profileDesc/tei:creation/tei:orgName
- Source
tei:TEI/tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:bibl,
tei:TEI/tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:msIdentifier/tei:collection/@ref
- Publisher
tei:TEI/tei:teiHeader/tei:fileDesc/tei:publicationStmt/tei:publisher/tei:ref
- Date
tei:TEI/tei:teiHeader/tei:profileDesc/tei:creation/tei:date/@when
- Rights
tei:TEI/tei:teiHeader/tei:fileDesc/tei:publicationStmt/tei:availability/tei:licence
- Format
tei:TEI/tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:physDesc
- Language
tei:TEI/tei:teiHeader/tei:profileDesc/tei:langUsage/tei:language,
tei:TEI/tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:bibl/tei:textLang
- Coverage
/tei:TEI/tei:teiHeader/tei:profileDesc/tei:creation/tei:placeName
In addition to the DC fields, the plugin will also map the TEI body text to the Text item type metadata Text element and create a new item type TEI with elements Person, Organisation, and Place, to which tei:sourceDesc/tei:listPerson/tei:person/tei:persName, tei:sourceDesc/tei:listOrg/tei:org/tei:orgName, and tei:sourceDesc/tei:listPlace/tei:place/tei:placeName respectively will be mapped.
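To see what a given mapping would pick up from a TEI header, the XPath expressions can be tried outside Omeka. The following is a minimal Python sketch using the standard library, not the plugin's own extraction code: `extract_fields`, `MAPPINGS` and the sample document are hypothetical, and only a few of the default mappings are included. ElementTree supports a limited XPath subset, so a trailing `/@attribute` step is handled by hand.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A small subset of the default mappings, for illustration only.
MAPPINGS = {
    "Title": "tei:TEI/tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title",
    "Date": "tei:TEI/tei:teiHeader/tei:profileDesc/tei:creation/tei:date/@when",
    "Creator": "tei:TEI/tei:teiHeader/tei:profileDesc/tei:creation/tei:persName",
}

# Hypothetical sample document.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Example letter</title></titleStmt>
    </fileDesc>
    <profileDesc>
      <creation>
        <date when="1942-03-01">1 March 1942</date>
        <persName>Jane Doe</persName>
      </creation>
    </profileDesc>
  </teiHeader>
</TEI>"""


def extract_fields(xml_text, mappings):
    """Return {field name: [values]} for each XPath mapping."""
    # ElementTree evaluates paths relative to the element it is called on,
    # so the root is wrapped to let paths starting at tei:TEI resolve.
    wrapper = ET.Element("wrapper")
    wrapper.append(ET.fromstring(xml_text))
    result = {}
    for field, path in mappings.items():
        # Split off a trailing attribute step such as /@when.
        element_path, _, attribute = path.partition("/@")
        # Drop a leading slash; the wrapper already anchors the path.
        element_path = element_path.lstrip("/")
        values = []
        for node in wrapper.findall(element_path, TEI_NS):
            value = node.get(attribute) if attribute else "".join(node.itertext())
            if value and value.strip():
                values.append(" ".join(value.split()))  # normalise whitespace
        result[field] = values
    return result


print(extract_fields(SAMPLE, MAPPINGS))
```

Each field maps to a list because an expression may match several nodes, for example when a document names more than one creator.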
The plugin global options are as follows:
- Default Item Type: Newly-created Omeka items will be assigned this item type
- Template Exhibit: When the plugin creates Neatline exhibits from TEI data it can use an existing Neatline exhibit as a template, from which existing settings and Neatline records will be copied.
The plugin has three main areas of functionality:
- Ingesting, updating, and associating tertiary files with TEI-based Omeka items
- Exporting TEI data and associated files
- Configuring XPath-to-Omeka field mappings
Once TEI files have been created and named correctly, they can be ingested into Omeka. Doing so will create one new Omeka item per master TEI file, with metadata populated as per the above XPath mappings.
Documents can either be ingested one-by-one or as a zip file containing multiple files.
The plugin assumes that any changes to documents are made in the TEI files rather than directly in Omeka. Updated TEI files can be reingested any number of times and the metadata in Omeka fields will be updated accordingly.
If you change XPath-to-Omeka field mappings and have existing Omeka items you can re-extract the harvested data in bulk using this function.
Once Omeka items have been created from the master TEI documents it is possible to upload any associated files, which will be assigned to Omeka items according to the naming convention described above. As with master TEI documents, multiple associated files can be uploaded in a zip.
Note: uploading associated files will fail with an error if the uploaded archive contains files that cannot be paired with an existing Omeka item.
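Since an unpaired file causes the upload to fail, it can help to check an archive's contents against the identifiers of existing items first. The following is a minimal Python sketch under stated assumptions: `find_unpaired` is a hypothetical helper, the set of known identifiers has to be supplied by you, and files are matched on their base name, so any directory part inside the archive is ignored.

```python
import re
from pathlib import PurePosixPath

# Assumed convention: identifier, underscore, then a numeric index or a
# two-letter language code, then the file extension.
ASSOCIATED_NAME = re.compile(
    r"^(?P<identifier>.+)_(?:[A-Za-z]{2}|[0-9]+)\.[^.]+$"
)


def find_unpaired(member_names, known_identifiers):
    """Return archive members that cannot be paired with a known item."""
    unpaired = []
    for member in member_names:
        if member.endswith("/"):
            continue  # directory entry, not a file
        name = PurePosixPath(member).name
        match = ASSOCIATED_NAME.match(name)
        if match is None or match.group("identifier") not in known_identifiers:
            unpaired.append(member)
    return unpaired
```

The member names can be read from a zip with `zipfile.ZipFile("associated.zip").namelist()` from the standard library (the archive name here is only an example); an empty result suggests every file has a matching identifier.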
If ingesting a large number of files at once via a zip archive it is easy to exceed the default limits on PHP's post_max_size and upload_max_filesize settings. Check your php.ini and increase the limits if you find this to be the case.
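A rough pre-flight check can tell you whether an archive is likely to hit these limits. The following is a minimal Python sketch: `php_size_to_bytes` and `fits_upload_limits` are hypothetical helpers, and the default arguments reflect PHP's stock values (8M for post_max_size, 2M for upload_max_filesize), which your server may override. It does not account for the small overhead of the other form fields in the request.

```python
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def php_size_to_bytes(value):
    """Convert PHP shorthand such as "8M" or "512K" to a byte count."""
    value = value.strip()
    unit = value[-1:].upper()
    if unit in _UNITS:
        return int(value[:-1]) * _UNITS[unit]
    return int(value)  # a plain number is already in bytes


def fits_upload_limits(size_bytes, post_max_size="8M", upload_max_filesize="2M"):
    """Return True if a file of size_bytes is within both limits."""
    limit = min(
        php_size_to_bytes(post_max_size),
        php_size_to_bytes(upload_max_filesize),
    )
    return size_bytes <= limit
```

For a real archive, pass `os.path.getsize(path)` as `size_bytes` together with the values from your own php.ini.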
The export function allows you to download single archives containing either:
- the master TEI files, or
- all associated files
This function is mainly for synchronising the data across Omeka instances.
Importing, enriching, and creating Neatline items for a lot of files is quite slow and the admin UI doesn't present any feedback while this is happening.
Integration tests need to be written for more admin functionality. A test-case base class has been created for this purpose.