Historical enrichment: Proof of concept corpus exploitation

Metadata

Status: Proposed
Type: Specific
Work Package: WP3/6
Research Coordinators: Katrien Depuydt on behalf of end user group
Coordinators for CLARIAH: Jesse de Does
Participating Institutes: (what institutes are participating in handling this use case?)
End-users: User group for historical enrichment (prof. dr. L.C.J. Barbiers, prof. dr. K.H. van Dalen-Oskam, prof. dr. J.M. van Koppen, dr. M. Rem, prof. dr. N. van der Sijs, dr. G. Bouma, prof. dr. A. Breitbarth, dr. E. Coussé, prof. dr. F. Van Eynde, prof. dr. A.M.S. van Kemenade, dr. M. Kestemont, drs. M. van der Meulen, dr. G.J. Postma, prof. dr. P.Th. van Reenen, dr. G.J. Rutten, T. Struik MA, dr. F. Van de Velde, M. de Vos MPhil, prof. dr. R. Vosters, and dr. C. De Wulf); any historical linguist interested in Dutch
Developers: INT
Interest Groups: IG Text
Task IDs: WP3/WP6: Improved Infrastructure for Historical Dutch

Description

An automatically tagged proof-of-concept diachronic corpus, consisting of a substantial, non-OCR subset of Nederlab data with reliable metadata.

What is the research about?

Concrete use cases still have to be defined. One possibility is the historical development of the article in Dutch.

What problem is hindering the research?

The lack of a reliably enriched historical corpus.

What is needed to do the research?

(How can we go about solving this problem?)

Data

A substantial, open-data, non-OCR subset of Nederlab data with reliable metadata.

Tools

Available state-of-the-art tagging and lemmatization tools, among others:

PIE (Kestemont & Manjavacas)
DeepFrog
Transformer-based approaches (to be selected)

What software and services are involved?

See the tools listed in the previous section.

How to evaluate this?

By accuracy, precision and recall using ground truth data
By usability for the investigation of diachrony-related research questions
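
The first criterion can be illustrated with a minimal sketch. It assumes that the tagger output and the ground truth are aligned token by token; the function name `evaluate_tags` and the tag labels in the example are hypothetical and do not reflect the tagset or tooling that will actually be used.

```python
from collections import Counter


def evaluate_tags(gold, predicted):
    """Return overall accuracy and per-tag precision and recall.

    Assumes `gold` and `predicted` are tag sequences aligned token by token.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must be aligned token by token")
    if not gold:
        raise ValueError("cannot evaluate empty sequences")

    gold_count = Counter(gold)        # how often each tag occurs in the ground truth
    pred_count = Counter(predicted)   # how often each tag was assigned by the tagger
    true_pos = Counter(g for g, p in zip(gold, predicted) if g == p)

    accuracy = sum(true_pos.values()) / len(gold)

    per_tag = {}
    for tag in sorted(set(gold) | set(predicted)):
        # A tag that was never predicted (or never occurs in gold) gets 0.0.
        precision = true_pos[tag] / pred_count[tag] if pred_count[tag] else 0.0
        recall = true_pos[tag] / gold_count[tag] if gold_count[tag] else 0.0
        per_tag[tag] = {"precision": precision, "recall": recall}
    return accuracy, per_tag


# Hypothetical example: the second article is mistagged as a pronoun.
gold = ["DET", "NOUN", "VERB", "DET", "NOUN"]
predicted = ["DET", "NOUN", "VERB", "PRON", "NOUN"]
accuracy, per_tag = evaluate_tags(gold, predicted)
print(accuracy)
print(per_tag["DET"])
```

Per-tag scores matter here because a use case such as the development of the article depends on how well one specific category is tagged, which overall accuracy can hide.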

References

References to related resources and publications and especially links to related use-cases:

CLARIAH
