Schematizing XML: TEI and Project Constraints
XML can be checked against a schema so that only certain elements, attributes, and attribute values are permitted. An XML schema can also define the order in which elements are expected to appear within the "well-formed" hierarchy of an XML document. This capacity for validation is what has brought us the TEI, and it is why most projects keep specific documentation of the elements, attributes, and document hierarchy used in their XML files.
An XML schema defines two things:
- Available vocabulary used to name elements, attributes, and attribute values.
- Grammar for how the vocabulary is used: rules for nesting, sequencing, etc.
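To make the split between vocabulary and grammar concrete, here is a minimal Python sketch. It is not how the TEI or any real schema language works internally; real projects rely on a schema language such as RELAX NG and a validating parser. The element names (`poem`, `title`, `stanza`, `line`), the `type` attribute and its values, and the `problems` function are all hypothetical, invented for this illustration.

```python
import re
import xml.etree.ElementTree as ET

# Vocabulary (hypothetical): permitted element names and attribute values.
ELEMENTS = {"poem", "title", "stanza", "line"}
ATTRIBUTE_VALUES = {("stanza", "type"): {"couplet", "quatrain"}}

# Grammar (hypothetical): each element's permitted sequence of children,
# written as a regular expression over the space-separated child names.
CONTENT_MODEL = {
    "poem": r"title( stanza)+",   # one title, then one or more stanzas
    "stanza": r"line( line)*",    # one or more lines
    "title": r"",                 # no child elements
    "line": r"",                  # no child elements
}

def problems(xml_text):
    """Return a list of rule violations found in xml_text (empty if none)."""
    found = []
    for element in ET.fromstring(xml_text).iter():
        # Vocabulary check: is this element name permitted at all?
        if element.tag not in ELEMENTS:
            found.append(f"unknown element <{element.tag}>")
            continue
        # Vocabulary check: are the attributes and their values permitted?
        for name, value in element.attrib.items():
            allowed = ATTRIBUTE_VALUES.get((element.tag, name))
            if allowed is None:
                found.append(f"unknown attribute @{name} on <{element.tag}>")
            elif value not in allowed:
                found.append(f"bad value '{value}' for @{name} on <{element.tag}>")
        # Grammar check: do the children appear in a permitted sequence?
        children = " ".join(child.tag for child in element)
        if not re.fullmatch(CONTENT_MODEL[element.tag], children):
            found.append(f"unexpected children in <{element.tag}>: {children or '(none)'}")
    return found

good = '<poem><title>T</title><stanza type="couplet"><line>a</line><line>b</line></stanza></poem>'
bad = '<poem><stanza type="sonnet"><line>a</line></stanza><title>T</title></poem>'
print(problems(good))
print(problems(bad))
```

The second document uses only permitted element names, yet it breaks the grammar (the title comes after the stanza) and uses an attribute value outside the vocabulary, so both problems are reported.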
TEI stands for Text Encoding Initiative. In the most general sense, the TEI is an international and interdisciplinary standard that is widely used by libraries, museums, publishers, and individual scholars to represent all kinds of textual material for online research and teaching. The TEI Consortium is an international organization of scholars whose mission is to develop and maintain guidelines for the digital encoding of literary and linguistic texts. Those guidelines are referred to as the TEI Guidelines, and they are used to structure the schema file that projects can reference from their XML files. The schema and guidelines define the “grammar” for how certain elements and attributes are to be used and provide the rules for nesting, sequencing, etc.
When we check our XML files against a set of schema rules, we are checking their validity. We can use validity checks to make sure we’re spelling element names properly, writing attribute names and values consistently, and nesting elements in a way that makes sense to us and that we want to keep consistent.
Note: an XML document can be well formed but not valid, whereas an XML document cannot be valid if it is not well formed.
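The relationship in this note can be sketched in a few lines of Python. Well-formedness is tested first by simply parsing the document; only a document that parses can go on to be checked for validity. The `ALLOWED_ELEMENTS` set and the `classify` function are hypothetical stand-ins for a real schema and validator, and here "valid" only means "uses permitted element names".

```python
import xml.etree.ElementTree as ET

# Hypothetical project vocabulary, invented for this illustration.
ALLOWED_ELEMENTS = {"letter", "opener", "body", "p", "closer"}

def classify(xml_text):
    """Report whether xml_text is not well formed, well formed only, or valid."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # A document that fails to parse never reaches the validity check.
        return "not well formed"
    for element in root.iter():
        if element.tag not in ALLOWED_ELEMENTS:
            return "well formed but not valid"
    return "valid"

print(classify("<letter><body><p>Hi</p></body></letter>"))    # valid
print(classify("<letter><bodie><p>Hi</p></bodie></letter>"))  # well formed but not valid
print(classify("<letter><body><p>Hi</body></p></letter>"))    # not well formed
```

The misspelled `bodie` element parses without complaint, which is exactly the kind of slip a validity check is meant to catch.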
The lessons and exercises constructed for this course incorporate materials from Dr. Elisa Beshero-Bondar's Digital Humanities courses, the Digital Mitford Coding School, the Text Encoding Initiative's learning resources, GitHub Guides, and the GitHub Help resources. This repository is public-facing; therefore, the lessons and exercises herein are licensed under a CC BY-NC-SA license.