SNAP:DRGN integration for Recogito

Gabriel Bodard edited this page Dec 11, 2020 · 13 revisions

Hack day: Thursday, July 27, 2017 (Institute of Classical Studies, London)

Participants: Jonathan Blaney (IHR, British History Online), Gabriel Bodard (ICS, SNAP), Karl Grossner (Pitt WH Gazetteer, Linked Pasts), Timothy Hill (EDL, Linked Pasts), Wolfgang Schmidle (DAI, Arachne), Rainer Simon (Recogito, Linked Pasts), Gethin Rees (British Library), Valeria Vitale (ICS, Linked Pasts)

Agenda:

This one-day event will bring together a group of developers and scholars with a view to creating a prototype implementation of the Recogito geo-annotation tool for tagging and identifying person-references in target texts. We need to decide on the databases and data formats that this implementation will recognise or require.

Some desiderata for such an interface were discussed at the second unconference day, but the basic idea is:

  1. Recogito currently enables the annotation of a text or image fragment with a place, person, or event reference, but only the place annotations allow disambiguation. (A search interface attempts to identify placenames in Pleiades, Geonames, and the other gazetteers ingested in Pelagios Commons.) Person and event annotations have no search or disambiguation features because Pelagios does not include authority lists for non-geographical entities.
  2. It would be very useful, and in our opinion possible, to enable person disambiguation along the same lines, either by allowing the user to enter a person URI (Wikidata, VIAF, etc.) manually, or by searching in the SNAP:DRGN triplestore in the same way that the Pelagios place data is queried.
  3. Unlike for places, a simple string search on personal names would be almost useless for person data: homonymous persons number in the thousands, whereas homonymous places number in the dozens at most. A person-search feature would therefore have to involve at least a couple of other fields of disambiguating information (place, date, titulature, etc.).
  4. An incomplete and imperfect implementation would already be a useful starting point for further development and would better highlight user needs, desiderata and specific challenges.
  5. Within pedagogical contexts, the most useful application of this feature in Recogito would be the ability to search across a limited database of person data only (e.g. persons in Wikipedia).
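
The point in item 3 about needing more than a name string can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `PersonRecord` shape, the `search_persons` function, the sample names, and the `example.org` URIs are all hypothetical, and none of it reflects the actual SNAP:DRGN data model or the Recogito codebase. It filters an in-memory list, whereas a real implementation would presumably query the triplestore.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PersonRecord:
    # Hypothetical record shape for illustration; not the SNAP:DRGN schema.
    uri: str
    name: str
    place: Optional[str] = None
    floruit_start: Optional[int] = None  # year; negative values are BCE
    floruit_end: Optional[int] = None


def search_persons(
    records: List[PersonRecord],
    name: str,
    place: Optional[str] = None,
    year: Optional[int] = None,
) -> List[PersonRecord]:
    """Match on name, then narrow by whichever extra fields are supplied."""
    matches = []
    for rec in records:
        # Case-insensitive exact match on the name string.
        if rec.name.casefold() != name.casefold():
            continue
        # If a place is given, records without a matching place are dropped.
        if place is not None:
            if rec.place is None or rec.place.casefold() != place.casefold():
                continue
        # If a year is given, it must fall inside the record's date range.
        if year is not None:
            if rec.floruit_start is None or rec.floruit_end is None:
                continue
            if not (rec.floruit_start <= year <= rec.floruit_end):
                continue
        matches.append(rec)
    return matches


# Invented sample data with placeholder URIs.
SAMPLE = [
    PersonRecord("http://example.org/person/1", "Apollonios", "Alexandria", -250, -200),
    PersonRecord("http://example.org/person/2", "Apollonios", "Athens", -150, -100),
    PersonRecord("http://example.org/person/3", "Apollonios", "Alexandria", 100, 150),
]

name_only = search_persons(SAMPLE, "Apollonios")
narrowed = search_persons(SAMPLE, "Apollonios", place="Alexandria", year=-220)
```

With this invented data, the name-only search returns all three records, while adding a place and a year narrows the result to a single candidate. This sketch drops records that lack the requested field; a real interface might prefer to rank such records lower instead of excluding them.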

Outcomes:

  1. We created a fork of the Recogito codebase, under the title ProsopoCogito, in the DigiClass organisation on GitHub.
  2. Code enhancements were made by Rainer Simon and Gethin Rees, and sample person data was provided by Gabriel Bodard and Jonathan Blaney.
  3. Some recognition of SNAP:DRGN data headings is implemented, but more work needs to be done. Volunteers welcome!