Historical enrichment: Proof of concept corpus exploitation

Metadata

  • Status: Proposed
  • Type: Specific
  • Work Package: WP3/6
  • Research Coordinators: Katrien Depuydt on behalf of end user group
  • Coordinators for CLARIAH: Jesse de Does
  • Participating Institutes: (what institutes are participating in handling this use case?)
  • End-users: User group for historical enrichment (Prof. dr. L.C.J. Barbiers, prof. dr. K.H. van Dalen-Oskam, prof. dr. J.M. van Koppen, dr. M. Rem and prof. dr. N. van der Sijs, dr. G. Bouma, prof. dr. A. Breitbarth, dr. E. Coussé, prof. dr. F. Van Eynde, prof. dr. A.M.S. van Kemenade, dr. M. Kestemont, drs. M. van der Meulen, dr. G.J. Postma, prof. dr. P.Th. van Reenen, dr. G.J. Rutten, T. Struik MA, dr. F. Van de Velde, M. de Vos MPhil, prof. dr. R. Vosters and dr. C. De Wulf); any historical linguist interested in Dutch
  • Developers: INT
  • Interest Groups: IG Text
  • Task IDs: WP3/WP6: Improved Infrastructure for Historical Dutch

Description

An automatically tagged proof-of-concept diachronic corpus, consisting of a substantial non-OCR subset of Nederlab data with reliable metadata.

What is the research about?

Concrete use cases still have to be defined. One possibility is the historical development of the article in Dutch.

What problem is hindering the research?

  • The lack of a reliably enriched historical corpus

What is needed to do the research?

(How can we go about solving this problem?)

Data

A substantial, open-data, non-OCR subset of Nederlab data with reliable metadata.

Tools

Available state-of-the-art tagging and lemmatization tools, among others:

  • PIE (Kestemont & Manjavacas)
  • DeepFrog
  • Transformer-based approaches (to be selected)

What software and services are involved?

See the tools listed under "Tools" above.

How to evaluate this?

  • By accuracy, precision and recall using ground truth data
  • By usability for the investigation of diachrony-related research questions
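
The first criterion could be computed along the following lines. This is a minimal sketch, assuming token-aligned gold and predicted tag sequences; the function name `evaluate_tags` and the tag labels in the toy example are hypothetical, since this use case does not yet fix a tagset or an evaluation script.

```python
from collections import Counter


def evaluate_tags(gold, predicted):
    """Return overall accuracy and per-tag (precision, recall).

    Assumes both sequences are aligned token by token (an assumption
    of this sketch, not a requirement stated in the use case).
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must be token-aligned")

    gold_counts = Counter(gold)       # how often each tag occurs in the ground truth
    pred_counts = Counter(predicted)  # how often each tag was predicted
    true_pos = Counter(g for g, p in zip(gold, predicted) if g == p)

    accuracy = sum(true_pos.values()) / len(gold) if gold else 0.0

    per_tag = {}
    for tag in sorted(set(gold) | set(predicted)):
        tp = true_pos[tag]
        precision = tp / pred_counts[tag] if pred_counts[tag] else 0.0
        recall = tp / gold_counts[tag] if gold_counts[tag] else 0.0
        per_tag[tag] = (precision, recall)
    return accuracy, per_tag


# Toy example with hypothetical part-of-speech labels.
gold = ["DET", "NOUN", "VERB", "DET", "NOUN"]
pred = ["DET", "NOUN", "VERB", "PRON", "NOUN"]
accuracy, per_tag = evaluate_tags(gold, pred)
print(accuracy)
print(per_tag["DET"])
```

Reporting precision and recall per tag, next to overall accuracy, shows whether errors concentrate in a category of interest, such as articles, which a single accuracy figure can hide.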

References

References to related resources and publications and especially links to related use-cases: