Linked data model for EMPlaces

This directory contains diagrams, documentation, and sample data for a proposed linked data model for EMPlaces.

See also links for related information.

NOTE: this is WORK-IN-PROGRESS, and NOT FINAL.

(In due course, the documentation itself may be moved into this page)

TODO

  • Finalize some URIs to be used

  • Finalize prefixes

  • Finish assembling example data for Opole

    • Fix structure and vocabulary for bibliography (using BIBO)
    • Historical hierarchies
    • Fix vocabularies for timespans (PeriodO?)
    • Uncertainties, approximations, etc
    • Related resource, include optional license (see LPiF proposal)
      • also, type of resource?
  • Pin down location/timespan vocabularies (the structure here follows ideas from Topotime/GeoJSON-LDT (also a work-in-progress?) but the vocabularies haven't yet been checked.)

  • Review format for timespan data; align with emerging LPiF activity (loosely aligned, but not the same vocabulary for now)

  • Update Data model notes; resolve remaining TODOs there, as far as possible.

  • Generate data from EMLO data

    • first 500 done; awaiting feedback
  • note 2 kinds of Julian calendars

    • currently using "Old style" and "New style", but this is probably not enough.
    • suggestion that we might add start-of-year date to calendar details, but even that isn't always enough: in some cases 25 Dec start of year occurs in the year BEFORE that indicated?
    • currently handled in Annalist definitions. Can revisit later if needed.
  • Review handling of date uncertainty/approximation, and how it relates to PGiF proposal to use ISO 8601-2 (https://github.com/LinkedPasts/lpif)

    • currently using LPiF-related structures, rather than ISO date formats
  • Type URIs for places that don't have same kind of info (cf. "related places", 20180726-St-Adalbert-example.ttl)

  • Record metadata design (creator, contributor, license, etc.)

    • waiting to see what Timbuctoo provides
  • Record time periods as identified resources (rather than inline blank nodes)

    • Handled in Annalist data; not yet in Opole example
  • URIs for language codes (consider lexvo?)

    • holding pattern in place, and Annalist data. Revisit later.
  • ...

  • Decide how to flag "core data" in structures used for both core and additional data (needed for refresh of core data from source). It seems what is really needed is a reference source indication.

    • A declared set of properties of the central em:Place resource are always core data: see em:Place class declaration.
    • Qualified relations, settings and annotation resources have (optional) em:source properties: the corresponding values are considered to be core data if the em:source/em:link value is the same as the place's em:coreDataRef/em:link value.
    • Note that the place's em:coreDataRef/em:link and corresponding em:source/em:link values refer to the gazetteer defining document, not the gazetteer place Id.
  • Decide on structure for place categories and annotation types (using skos:Concepts)

  • Check that proposed Web Annotation extensions don't break anything (email with Robert Sanderson)

  • Update diagrams

  • Crosswalk between UI and data model

  • Dealing with uncertainty

    • See example, em:competence
    • How to represent?
      • Confidence flag (high/medium/low)? Not really going to work for us?
      • Permutation of (Uncertain, Inferred, Approximate) per EMLO. But be explicit in all cases.
    • Where to represent: calendars, date of hierarchical relations
    • For places: Uncertain, Inferred
    • For dates: Uncertain, Inferred, Approximate (implicit in the timespan value)
  • Characterize calendars (inherited calendars; need to materialize for indexing; need source indication that it is inherited)

    • Handled by em:competence
    • general issue here about materialization of inferred/implied/deduced information.
    • note that name attestations are additions to the primary info; no such for calendars.
  • Review the way that core data is represented
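
The rule noted above for flagging core data (a qualified relation, setting or annotation counts as core data when its em:source/em:link value equals the place's em:coreDataRef/em:link value) could be sketched roughly as follows. This is an illustration only, not the project's implementation: plain Python dicts stand in for RDF resources, the function names and example URLs are hypothetical, a single core data source is assumed, and treating a resource with no em:source as additional data is an assumption.

```python
# Sketch only: plain dicts stand in for RDF resources. The keys mirror the
# em:coreDataRef, em:source and em:link properties named in the notes above.
# Function names and example URLs are hypothetical.

def core_data_link(place):
    """Return the place's em:coreDataRef/em:link value, or None if absent."""
    return place.get("em:coreDataRef", {}).get("em:link")


def is_core_data(place, resource):
    """True if the resource's em:source/em:link equals the place's core data link.

    em:source is optional; a resource without one is treated here as
    additional (non-core) data, which is an assumption of this sketch.
    """
    source_link = resource.get("em:source", {}).get("em:link")
    return source_link is not None and source_link == core_data_link(place)


# Placeholder data: the links refer to a gazetteer defining document,
# not to a gazetteer place Id.
place = {"em:coreDataRef": {"em:link": "http://example.org/gazetteer/about.rdf"}}
from_core_source = {"em:source": {"em:link": "http://example.org/gazetteer/about.rdf"}}
from_other_source = {"em:source": {"em:link": "http://example.org/other/doc.rdf"}}
no_source = {}

print(is_core_data(place, from_core_source))   # True
print(is_core_data(place, from_other_source))  # False
print(is_core_data(place, no_source))          # False
```

Supporting multiple core data sources would presumably mean comparing against a set of em:coreDataRef links rather than a single value; that case is not covered here.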

Notes (questions and ideas)

@@TODO: allocate URIs for attestations (where for our purposes, the oa:Annotation denotes the attestation)?

Priorities (2018-08-01, 2018-08-14)

  1. Modelling of core data to deal with multiple core data sources
    • update diagrams
    • update sample data
  2. Need to start getting data into RDF and/or Timbuctoo
    • e.g. 4 more towns like Opole to comparable detail
    • Create Annalist definitions for data entry
  3. Bulk data into RDF/Timbuctoo
    • e.g. 2000 places core data from GeoNames
    • update core data extractor to generate multi-source structure
    • Wikidata cross-referencing for alternate authorities
  4. Loose ends
    • Modelling and Opole data
    • Related places
    • Historical place record: minimum viable information
      • differs from current place
      • discuss with Arno
  5. Generate EMPlaces data mapped from Annalist
    • Basic mapping seems to work
    • Figure out how to handle local URIs (which are roughly blank nodes) [Note: depends on Annalist code updates in 0.5.13 in development branch]
  6. Annalist data
    • Generate Annalist data from GeoNames; get full update data
    • Review Annalist definitions with Arno
  • Sample bibliographies to enter
  • Dealing with sources (see issues #19, #13, #7 ?)
  • Integration of date conversions

Actions