
eXist-db mashup app


eXist-db mashup application working group

This is a test app built on eXist-db, a native XML database which uses XQuery.

The app uses different data sources with different methods to bring together useful resources for an epigraphic corpus.

The code, without data, has been copied to https://github.com/EpiDoc/OEDUc/pull/1 and is awaiting merge.

A running instance, with data from the EDH dumps, is visible at

http://betamasaheft.aai.uni-hamburg.de:8080/exist/apps/OEDUc/

Preliminary tweaks to the data included:

  • adding an xml:id to the text element to speed up retrieval of items in eXist (the XQuery doing this is in the AddIdToTextElement.xql file; a rough sketch of the approach follows this list)
  • note that there is no Pleiades id in the EDH XML (or in any EAGLE dataset), but there are Trismegistos Geo IDs! This is because the plan was to gather all places of provenance in Trismegistos Geo and map them to Pleiades later. This mapping was started using Wikidata but is far from complete and is currently in need of an update. More details on this on request.
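
A minimal sketch of the kind of update query this involves (the actual code is in AddIdToTextElement.xql; the collection path and the id scheme below are assumptions made for the example):

```xquery
xquery version "3.1";

(: Sketch: add an @xml:id to every tei:text element that lacks one, derived
   here from the document file name, so that items can later be retrieved by
   id quickly. Collection path and id scheme are assumptions. :)
declare namespace tei = "http://www.tei-c.org/ns/1.0";

for $text in collection("/db/apps/OEDUc/data")//tei:text[not(@xml:id)]
let $id := concat("t_", replace(util:document-name($text), "\.xml$", ""))
return
    update insert attribute xml:id { $id } into $text
```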

The features

  • In the list view you can select an item. Each item can be edited normally (create, update, delete)
  • The editor that updates files reproduces, in simple XSLT, part of the Leiden+ logic and conventions, so that you can enter new data or update existing data. It validates the data against the tei-epidoc.rng schema after performing the changes; the plan is to have it validate before the real changes are made.
  • The search simply looks in a number of indexed elements; it is not a full-text index. Range indexes are also set to speed up the queries, besides the other indexes shipped with eXist (a sketch of such a search follows this list).
  • You can create a new entry with the Leiden+ editor and save it. It will first be validated, and if it is not valid you are pointed to the problems. I did not yet have time to add the vocabularies and update the editor here.
  • Once you view an item you will find, in admittedly ugly tables, a first section with metadata, the text, some additional information on persons, and a map:
  • The text rendering exploits some of the parameters of the EpiDoc Stylesheets. You can change the desired value, hit change and see the different output (a sketch of calling the stylesheets with parameters follows this list).
  • The ids of corresponding inscriptions are pulled from the EAGLE ids API here in Hamburg, using Trismegistos data. This app will hopefully soon be moved to Trismegistos itself.
  • The EDH id is instead used to query the EDH API and get the information about persons, which is printed below the text (see the sketch after this list).
  • For each element with a @ref in the XML files you will find the name of the element and a link to the value, e.g. linking to the EAGLE vocabularies.
  • If this is a TM Geo ID, the id is used to query the Wikidata SPARQL endpoint and retrieve the coordinates and the corresponding Pleiades id (provided they are there). The same logic could be used for VIAF, Geonames, etc. There were uploads of ids last year, and an attempt to align the unmatched Pleiades and Trismegistos ids was also made in 2015/16. This lookup is done via an HTTP request directly in the XQuery powering the app (see the sketch after this list).
  • The Pleiades id thus retrieved (which could certainly be obtained in other ways) is then used in JavaScript to query Pelagios and print the map below (taken from the hello world example in the Pelagios repository).
  • At http://betamasaheft.aai.uni-hamburg.de/api/OEDUc/places/all and http://betamasaheft.aai.uni-hamburg.de/api/OEDUc/places/all/void two RESTXQ functions provide the TTL files for Pelagios: the place annotations, at the moment only for the first 20 entries. See rest.xql (and the sketch after this list).
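
As an illustration of the search over indexed elements, something along these lines runs behind the search form (the element choice, collection path and matching logic are assumptions for the example; the range indexes themselves would be declared in the app's collection.xconf):

```xquery
xquery version "3.1";

(: Sketch of a simple search over a few indexed elements; this is plain value
   matching, not a full-text query. Element names and collection path are
   assumptions. :)
declare namespace tei = "http://www.tei-c.org/ns/1.0";

declare function local:search($q as xs:string) as element(tei:TEI)* {
    collection("/db/apps/OEDUc/data")//tei:TEI
        [.//tei:idno[. = $q]
         or .//tei:origPlace[contains(., $q)]
         or .//tei:ab[contains(., $q)]]
};

local:search("Roma")
```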
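The rendering of the text with the EpiDoc Stylesheets can be pictured roughly as follows, using eXist's transform module; the document path, the stylesheet path and the chosen parameter values are assumptions, to be checked against the Stylesheets documentation:

```xquery
xquery version "3.1";

(: Sketch: render an edition div with the EpiDoc Stylesheets, passing some of
   their parameters. Paths and parameter values are assumptions. :)
declare namespace tei = "http://www.tei-c.org/ns/1.0";
declare namespace transform = "http://exist-db.org/xquery/transform";

let $edition := (doc("/db/apps/OEDUc/data/HD000001.xml")//tei:div[@type = "edition"])[1]
let $xsl := doc("/db/apps/OEDUc/resources/epidoc-stylesheets/start-edition.xsl")
let $params :=
    <parameters>
        <param name="edition-type" value="interpretive"/>
        <param name="leiden-style" value="panciera"/>
    </parameters>
return
    transform:transform($edition, $xsl, $params)
```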
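The call to the EDH API for person data can be sketched like this; the endpoint path, the query parameter and the JSON field names are assumptions and should be checked against the EDH API documentation:

```xquery
xquery version "3.1";

(: Sketch: fetch the JSON record for one inscription from the EDH API and pull
   out the person entries. Endpoint path, query parameter and field names
   ("items", "people") are assumptions. :)
let $hd-nr := "HD000001"
let $endpoint := "https://edh-www.adw.uni-heidelberg.de/data/api/inscriptions/search"
let $response := json-doc($endpoint || "?hd_nr=" || $hd-nr)
return
    for $item in $response?items?*
    return $item?people
```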
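The Wikidata lookup can be sketched as an HTTP request built in XQuery with the EXPath http client; the Wikidata properties used (P1958 for the Trismegistos Geo ID, P625 for coordinates, P1584 for the Pleiades id) are my reading of the Wikidata property list and should be double-checked:

```xquery
xquery version "3.1";

import module namespace hc = "http://expath.org/ns/http-client";

(: Sketch: given a Trismegistos Geo ID, ask the Wikidata SPARQL endpoint for
   the coordinates and the Pleiades id of the matching item. The property
   numbers are assumptions to be verified on Wikidata; the wdt: prefix is
   predefined on the Wikidata endpoint. :)
declare function local:tm-geo-lookup($tm-geo-id as xs:string) as item()* {
    let $query :=
        'SELECT ?place ?coords ?pleiades WHERE {
           ?place wdt:P1958 "' || $tm-geo-id || '" .
           OPTIONAL { ?place wdt:P625 ?coords . }
           OPTIONAL { ?place wdt:P1584 ?pleiades . }
         }'
    let $request :=
        <hc:request method="GET"
            href="https://query.wikidata.org/sparql?query={encode-for-uri($query)}">
            <hc:header name="Accept" value="application/sparql-results+xml"/>
        </hc:request>
    return hc:send-request($request)[2]
};

local:tm-geo-lookup("12345")
```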
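And the RESTXQ functions in rest.xql look, schematically, something like this; the module URI, the collection path and the placeholder example.org URIs in the Turtle are assumptions, while the real output follows the Pelagios annotation format:

```xquery
xquery version "3.1";

module namespace oeduc = "https://github.com/EpiDoc/OEDUc/restxq";

(: Sketch, in the spirit of rest.xql: a RESTXQ function serving Turtle
   annotations for Pelagios for the first 20 entries. Module URI, collection
   path and the example.org URIs are assumptions. :)
declare namespace rest = "http://exquery.org/ns/restxq";
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare namespace tei = "http://www.tei-c.org/ns/1.0";

declare
    %rest:GET
    %rest:path("/OEDUc/places/all")
    %output:media-type("text/turtle")
    %output:method("text")
function oeduc:places-all() {
    let $items := subsequence(collection("/db/apps/OEDUc/data")//tei:TEI, 1, 20)
    return string-join((
        '@prefix oa: <http://www.w3.org/ns/oa#> .',
        for $t in $items
        let $id := string(($t//tei:text/@xml:id)[1])
        return
            '<https://example.org/OEDUc/annotations/' || $id || '> a oa:Annotation ; ' ||
            'oa:hasTarget <https://example.org/OEDUc/items/' || $id || '> .'
    ), "&#10;")
};
```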

Future tasks

For the purpose of having a sample app that helps people get started with their projects and see some of the possibilities at work, besides making it a bit nicer, I think it would be useful if it could also have the following, which I did not manage to do:

  • add more data from the EDH API, especially from edh_geography_uri, which Frank has added and which holds the URI of the geographic data; appending .json to this gets the JSON data of the place of finding, which has an "edh_province_uri" with the data about the province (see the sketch after this list)
  • validate before submitting
  • add more support for parameters in the EpiDoc example XSLT (e.g. for the Zotero bibliography contained in div[@type='bibliography'])
  • improve the up-conversion and the editor with more, and more precise, matchings
  • provide functionality to use XPath to search the data
  • add advanced search capabilities to filter results by id, content provider, etc.
  • add images support
  • include all EAGLE data (currently only data from the EDH dumps is in, but the system scales nicely as far as I can judge)
  • include queries to the EAGLE MediaWiki of translations (API currently unavailable)
  • show related items based on any of the values
  • include in the editor the possibility to tag named entities
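
For the first of these tasks, the chain of requests could be sketched like this; apart from the two field names quoted above (edh_geography_uri, edh_province_uri), the JSON layout assumed here is a guess to be checked against the EDH API:

```xquery
xquery version "3.1";

(: Sketch: starting from an EDH API record (a map parsed from JSON), follow
   edh_geography_uri to the JSON data of the place of finding and from there
   edh_province_uri to the province data. The overall JSON layout is an
   assumption; appending .json to the province URI as well is done by analogy
   with the geography URI and may need adjusting. :)
declare function local:province-for($record as map(*)) as map(*)? {
    let $geo-uri := $record?edh_geography_uri
    return
        if (exists($geo-uri)) then
            let $place := json-doc($geo-uri || ".json")
            let $province-uri := $place?edh_province_uri
            return
                if (exists($province-uri))
                then json-doc($province-uri || ".json")
                else ()
        else ()
};
```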