4 Computational Linguistics

Sunoikisis Digital Classics, Fall 2020

Session 4. Computational Linguistics

Thursday Oct 29, 16:00 UK = 17:00 CET

Convenors: Alek Keersmaekers (KU Leuven), Marton Ribary (Surrey), Thea Sommerschield (Oxford)

YouTube link: https://youtu.be/zjkyZUpvhAQ

Session outline

This session will give a broad overview of computational linguistics as applied to ancient languages, and illustrate some of its methods and techniques through a series of short case studies presented by the authors of the projects shown. Dr Ribary will discuss the importance of pre-processing texts, and show a specific case study of semantic clustering and vector representation in Latin text. Dr Sommerschield will introduce the concept of machine learning and discuss her own Pythia project on machine-assisted restoration of damaged Greek inscriptions. Dr Keersmaekers will walk through a pipeline for automatically analyzing Greek papyrus texts and the specific problems they pose.

Seminar readings

[For discussion in this forum thread.]

Mike Kestemont & Justin A. Stover. 2016. “The Authorship of the Historia Augusta: Two new computational studies.” Bulletin of the Institute of Classical Studies 59.2. Pp. 140–157. Available: https://onlinelibrary.wiley.com/doi/epdf/10.1111/j.2041-5370.2016.12043.x
Marton Ribary & Barbara McGillivray. 2020. “A Corpus Approach to Roman Law Based on Justinian’s Digest.” Informatics 7, 44. Available: https://doi.org/10.3390/informatics7040044

Other resources

Pythia notebooks and tutorials
Linking Latin project (and especially Latin word embeddings)
Marton's Justinian Digest project, including demo code and Colab notebook
Pedalion, Alek's semantic role labeller on GitHub

Exercise

Using one (or more) of the following resources, familiarise yourself with the principles of treebanking in either Greek or Latin.
- Print guidelines:
  - Latin Guidelines (Bamman et al. 2007)
  - Ancient Greek Guidelines (Celano 2018)
- Video tutorials:

Looking at the exercise described in this Treebanking session, create an account on Perseids and attempt to treebank the first few sentences of one of the following: Phaedrus Fable 1.1, Aesop Fable 38, or a Greek or Latin text of your choice that you know very well. Compare your trees with your teammate's before bringing the results to class next week.

If you have any technical problems or questions about Treebanking or Arethusa, you may ask in this forum thread and we or your colleagues may be able to help.
