- Status: Proposed
- Type: Specific
- Work Package: WP3/6
- Research Coordinators: Katrien Depuydt on behalf of the end-user group
- Coordinators for CLARIAH: Jesse de Does
- Participating Institutes: (what institutes are participating in handling this use case?)
- End-users: User group for historical enrichment (Prof. dr. L.C.J. Barbiers, prof. dr. K.H. van Dalen-Oskam, prof. dr. J.M. van Koppen, dr. M. Rem and prof. dr. N. van der Sijs, dr. G. Bouma, prof. dr. A. Breitbarth, dr. E. Coussé, prof. dr. F. Van Eynde, prof. dr. A.M.S. van Kemenade, dr. M. Kestemont, drs. M. van der Meulen, dr. G.J. Postma, prof. dr. P.Th. van Reenen, dr. G.J. Rutten, T. Struik MA, dr. F. Van de Velde, M. de Vos MPhil, prof. dr. R. Vosters and dr. C. De Wulf); any historical linguist interested in Dutch
- Developers: INT
- Interest Groups: IG Text
- Task IDs: WP3/WP6: Improved Infrastructure for Historical Dutch
An automatically tagged proof-of-concept diachronic corpus, consisting of a substantial non-OCR subset of Nederlab data with reliable metadata.
Concrete use cases still have to be defined. One possibility is the historical development of the article in Dutch.
- Reliably enriched historical corpus
(How can we go about solving this problem?)
A substantial, open-data, non-OCR subset of Nederlab data with reliable metadata.
Available state-of-the-art tagging and lemmatization tools, including:
- PIE (Kestemont & Manjavacas)
- DeepFrog
- Transformer-based approaches (to be selected)
cf. previous
- By accuracy, precision and recall using ground truth data
- By usability for the investigation of diachrony-related research questions
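The first evaluation criterion could be computed roughly as follows. This is a minimal sketch, not the project's evaluation procedure: it assumes that the ground truth and the tagger output are aligned token by token, and the function name `evaluate_tags` and the tag labels in the example are hypothetical, chosen only for illustration.

```python
from collections import Counter


def evaluate_tags(gold, predicted):
    """Token-level accuracy plus per-tag precision and recall.

    `gold` and `predicted` are equal-length lists of tag strings,
    aligned token by token (an assumption of this sketch).
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    correct = 0
    for g, p in zip(gold, predicted):
        if g == p:
            correct += 1
            tp[g] += 1
        else:
            fp[p] += 1  # the predicted tag was wrongly assigned
            fn[g] += 1  # the gold tag was missed

    accuracy = correct / len(gold) if gold else 0.0

    per_tag = {}
    for tag in sorted(set(gold) | set(predicted)):
        p_den = tp[tag] + fp[tag]
        r_den = tp[tag] + fn[tag]
        per_tag[tag] = {
            "precision": tp[tag] / p_den if p_den else 0.0,
            "recall": tp[tag] / r_den if r_den else 0.0,
        }
    return accuracy, per_tag


# Hypothetical example: one article (DET) mistagged as a pronoun (PRON).
gold = ["DET", "NOUN", "VERB", "DET", "NOUN"]
pred = ["DET", "NOUN", "VERB", "PRON", "NOUN"]
accuracy, per_tag = evaluate_tags(gold, pred)
print(accuracy)
print(per_tag["DET"])
```

Per-tag figures matter here because a use case such as the development of the article depends on how well one specific category is tagged, which overall accuracy can hide.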
References to related resources and publications and especially links to related use-cases: