4 Computational Linguistics

Gabriel Bodard edited this page Oct 29, 2020 · 23 revisions

Sunoikisis Digital Classics, Fall 2020

Session 4. Computational Linguistics

Thursday Oct 29, 16:00 UK = 17:00 CET

Convenors: Alek Keersmaekers (KU Leuven), Marton Ribary (Surrey), Thea Sommerschield (Oxford)

YouTube link: https://youtu.be/zjkyZUpvhAQ

Slides: Combined slides

Session outline

This session will give a broad overview of computational linguistics as applied to ancient languages, and illustrate some of its methods and techniques through a series of short case studies presented by the authors of the projects shown. Dr Ribary will discuss the importance of pre-processing texts and present a case study of semantic clustering and vector representation in Latin text. Dr Sommerschield will introduce the concept of machine learning and discuss her own Pythia project on machine-assisted restoration of damaged Greek inscriptions. Dr Keersmaekers will walk through a pipeline for automatically analyzing Greek papyrus texts and the specific problems they pose.

Seminar readings

[For discussion in this forum thread.]

Further reading

  • Yannis Assael, Thea Sommerschield & Jonathan Prag. 2019. “Restoring ancient text using deep learning: a case study on Greek epigraphy.” Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, pp. 6368–6375. Available: https://arxiv.org/abs/1910.06262
  • John Bodel. 2012. “Latin Epigraphy and the IT revolution.” Epigraphy and the Historical Sciences, edited by John Davies & John Wilkes. Proceedings of the British Academy 177, pp. 275–296. Available: https://www.academia.edu/2389348/Latin_Epigraphy_and_the_IT_Revolution_2012
  • Giuseppe G. A. Celano, Gregory Crane & Saeed Majidi. 2016. “Part of Speech Tagging for Ancient Greek.” Open Linguistics 1. Available: https://doi.org/10.1515/opli-2016-0020
  • Kristina Gulordava. 2018. Word order variation and dependency length minimisation: a cross-linguistic computational approach. Doctoral thesis, Univ. Genève. Available: https://doi.org/10.13097/archive-ouverte/unige:106855. (Esp. chapter 3, “The DLM principle and word order variability at the language level,” pp. 64–106).
  • Alek Keersmaekers. 2020. “Automatic Semantic Role Labeling in Ancient Greek Using Distributional Semantic Modeling.” Proceedings of 1st Workshop on Language Technologies for Historical and Ancient Languages, pp. 59–67. Available: https://www.aclweb.org/anthology/2020.lt4hala-1.9
  • Alek Keersmaekers. 2020. “Creating a richly annotated corpus of papyrological Greek: The possibilities of natural language processing approaches to a highly inflected historical language.” Digital Scholarship in the Humanities 35(1), pp. 67–82. DOI: https://doi.org/10.1093/llc/fqz004 (Not open access.)
  • Francesco Mambrini & Marco Passarotti. 2012. “Will a Parser Overtake Achilles? First experiments on parsing the Ancient Greek Dependency Treebank.” Proceedings of the Eleventh International Workshop on Treebanks and Linguistic Theories (TLT11), 30 November – 1 December 2012, Lisbon, Portugal. Available: https://publicatt.unicatt.it/handle/10807/37956
  • Barbara McGillivray. 2013. Methods in Latin Computational Linguistics. Brill. (Not open access.)
  • Marton Ribary. 2020. “A Relational Database of Roman Law Based on Justinian’s Digest.” Journal of Open Humanities Data, 6(1), p.5. Available: http://doi.org/10.5334/johd.17

Other resources

Exercise

  1. Using one (or more) of the following resources, familiarise yourself with the principles of treebanking in either Greek or Latin.
  2. Looking at the exercise described in this Treebanking session, create an account on Perseids and attempt to treebank the first few sentences of one of the following: Phaedrus Fable 1.1, Aesop Fable 38, or a Greek or Latin text of your choice that you know very well. Compare your trees with your teammate's before bringing the results to class next week.
  • If you have any technical problems or questions about Treebanking or Arethusa, you may ask in this forum thread and we or your colleagues may be able to help.