diff --git a/docs/usecases/.ipynb_checkpoints/john1-checkpoint.ipynb b/docs/usecases/.ipynb_checkpoints/john1-checkpoint.ipynb new file mode 100644 index 00000000..f1229f35 --- /dev/null +++ b/docs/usecases/.ipynb_checkpoints/john1-checkpoint.ipynb @@ -0,0 +1,2174 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "ade46dee-230f-43f9-a760-e39bf2996a4f", + "metadata": {}, + "source": [ + "# Examine John 1 " + ] + }, + { + "cell_type": "markdown", + "id": "558528dd-db55-4d8a-9456-4111a7e73528", + "metadata": {}, + "source": [ + "## Table of contents \n", + "\n", + "* 1 - Introduction\n", + "* 2 - Load Text-Fabric app and data\n", + "* 3 - Performing the queries\n", + "* 4 - Display syntax tree \n", + " * 4.1 - View 1: combined view (display all nodes) \n", + " * 4.2 - View 2: syntactic view (no display of wordgroup nodes)\n", + " * 4.3 - View 3: XML source view (no display of clause, phrase, or subphrase nodes) " + ] + }, + { + "cell_type": "markdown", + "id": "9c53b8ad-c1f0-4e97-bcb1-8df75190921c", + "metadata": {}, + "source": [ + "# 1 - Introduction \n", + "##### [Back to TOC](#TOC)" + ] + }, + { + "cell_type": "markdown", + "id": "26579597-855b-41ce-8f55-4cd60faad6c8", + "metadata": {}, + "source": [ + "This Jupyter Notebook examines John 1 verse 1 and shows three views of the data set.\n", + "\n", + "This Text-Fabric data set contains all wg data from the source XML in two forms: its original form (the 'wg' nodes) and its interpreted form (in which each 'wg' node is converted into a 'clause', 'phrase', or 'subphrase' node, depending on the data associated with the original 'wg')." 
+ ] + }, + { + "cell_type": "markdown", + "id": "7b536b93-0396-4af3-9fee-408267a1c80c", + "metadata": {}, + "source": [ + "# 2 - Load Text-Fabric app and data \n", + "##### [Back to TOC](#TOC)" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "d45a2f25-51ec-4826-82b4-8eb4f409fd30", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The autoreload extension is already loaded. To reload it, use:\n", + " %reload_ext autoreload\n" + ] + } + ], + "source": [ + "%load_ext autoreload\n", + "%autoreload 2" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "f88147bc-7d18-496d-a8a1-280500b940e5", + "metadata": {}, + "outputs": [], + "source": [ + "# Loading the Text-Fabric code\n", + "# Note: it is assumed Text-Fabric is installed in your environment.\n", + "from tf.fabric import Fabric\n", + "from tf.app import use" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "a98758d0-9ca9-45bf-9bce-bf37f1ea4b0f", + "metadata": { + "scrolled": true, + "tags": [] + }, + "outputs": [ + { + "data": { + "text/markdown": [ + "**Locating corpus resources ...**" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "app: ~/text-fabric-data/github/saulocantanhede/tfgreek2/app" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "data: ~/text-fabric-data/github/saulocantanhede/tfgreek2/tf/0.5.1" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "\n", + " Text-Fabric: Text-Fabric API 11.4.10, saulocantanhede/tfgreek2/app v3, Search Reference
\n", + " Data: saulocantanhede - tfgreek2 0.5.1, Character table, Feature docs
\n", + "
Node types\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + "\n", + "
Name | # of nodes | # slots/node | % coverage
book | 27 | 5102.93 | 100
chapter | 260 | 529.92 | 100
verse | 7944 | 17.34 | 100
sentence | 8011 | 17.20 | 100
clause | 52242 | 8.56 | 324
wg | 106868 | 6.88 | 533
phrase | 119560 | 2.95 | 256
subphrase | 72845 | 1.00 | 53
word | 137779 | 1.00 | 100
\n", + " Sets: no custom sets
\n", + " Features:
\n", + "
saulocantanhede - tfgreek2/tf\n", + "
\n", + "\n", + "
\n", + "
\n", + "after\n", + "
\n", + "
str
\n", + "\n", + " material after the end of the word\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "appositioncontainer\n", + "
\n", + "
int
\n", + "\n", + " 1 if it is an apposition container\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "articular\n", + "
\n", + "
int
\n", + "\n", + " 1 if the wg has an article\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "before\n", + "
\n", + "
str
\n", + "\n", + " this is XML attribute before\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "book\n", + "
\n", + "
str
\n", + "\n", + " book name (abbreviated), from ref attribute in xml\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "book_short\n", + "
\n", + "
str
\n", + "\n", + " this is XML attribute book_short\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "case\n", + "
\n", + "
str
\n", + "\n", + " grammatical case\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "chapter\n", + "
\n", + "
int
\n", + "\n", + " chapter number, from ref attribute in xml\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "clauseType\n", + "
\n", + "
str
\n", + "\n", + " clause type\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "cls\n", + "
\n", + "
str
\n", + "\n", + " this is XML attribute cls\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "cltype\n", + "
\n", + "
str
\n", + "\n", + " clause type\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "criticalsign\n", + "
\n", + "
str
\n", + "\n", + " this is XML attribute criticalsign\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "crule\n", + "
\n", + "
str
\n", + "\n", + " clause rule (from xml attribute Rule)\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "degree\n", + "
\n", + "
str
\n", + "\n", + " grammatical degree\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "discontinuous\n", + "
\n", + "
int
\n", + "\n", + " 1 if the word is out of sequence in the xml\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "domain\n", + "
\n", + "
str
\n", + "\n", + " domain\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "framespec\n", + "
\n", + "
str
\n", + "\n", + " this is XML attribute framespec\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "function\n", + "
\n", + "
str
\n", + "\n", + " this is XML attribute function\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "gender\n", + "
\n", + "
str
\n", + "\n", + " grammatical gender\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "gloss\n", + "
\n", + "
str
\n", + "\n", + " short translation\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "id\n", + "
\n", + "
str
\n", + "\n", + " xml id\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "junction\n", + "
\n", + "
str
\n", + "\n", + " type of junction\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "lang\n", + "
\n", + "
str
\n", + "\n", + " language the text is in\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "lemma\n", + "
\n", + "
str
\n", + "\n", + " lexical lemma\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "ln\n", + "
\n", + "
str
\n", + "\n", + " ln\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "mood\n", + "
\n", + "
str
\n", + "\n", + " verbal mood\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "morph\n", + "
\n", + "
str
\n", + "\n", + " morphological code\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "nodeId\n", + "
\n", + "
int
\n", + "\n", + " node id (as in the XML source data\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "normalized\n", + "
\n", + "
str
\n", + "\n", + " lemma normalized\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "note\n", + "
\n", + "
str
\n", + "\n", + " annotation of linguistic nature\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "num\n", + "
\n", + "
int
\n", + "\n", + " generated number (not in xml): book: (Matthew=1, Mark=2, ..., Revelation=27); sentence: numbered per chapter; word: numbered per verse.\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "number\n", + "
\n", + "
str
\n", + "\n", + " grammatical number\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "otype\n", + "
\n", + "
str
\n", + "\n", + " \n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "person\n", + "
\n", + "
str
\n", + "\n", + " grammatical person\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "punctuation\n", + "
\n", + "
str
\n", + "\n", + " this is XML attribute punctuation\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "ref\n", + "
\n", + "
str
\n", + "\n", + " biblical reference with word counting\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "referent\n", + "
\n", + "
str
\n", + "\n", + " number of referent\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "rela\n", + "
\n", + "
str
\n", + "\n", + " this is XML attribute rela\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "role\n", + "
\n", + "
str
\n", + "\n", + " role\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "rule\n", + "
\n", + "
str
\n", + "\n", + " syntactical rule\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "strong\n", + "
\n", + "
int
\n", + "\n", + " strong number\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "subjrefspec\n", + "
\n", + "
str
\n", + "\n", + " this is XML attribute subjrefspec\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "tense\n", + "
\n", + "
str
\n", + "\n", + " verbal tense\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "text\n", + "
\n", + "
str
\n", + "\n", + " the text of a word\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "typ\n", + "
\n", + "
str
\n", + "\n", + " this is XML attribute typ\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "type\n", + "
\n", + "
str
\n", + "\n", + " morphological type (on w), syntactical type (on wg)\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "unicode\n", + "
\n", + "
str
\n", + "\n", + " word in unicode characters plus material after it\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "verse\n", + "
\n", + "
int
\n", + "\n", + " verse number, from ref attribute in xml\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "voice\n", + "
\n", + "
str
\n", + "\n", + " verbal voice\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "frame\n", + "
\n", + "
str
\n", + "\n", + " frame\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "oslots\n", + "
\n", + "
none
\n", + "\n", + " \n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "parent\n", + "
\n", + "
none
\n", + "\n", + " parent relationship between words\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "sibling\n", + "
\n", + "
int
\n", + "\n", + " this is XML attribute sibling\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "subjref\n", + "
\n", + "
none
\n", + "\n", + " number of subject referent\n", + "\n", + "
\n", + "\n", + "
\n", + "
\n", + "\n" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "\n", + "\n" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "
Text-Fabric API: names N F E L T S C TF directly usable

" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "# Load the N1904 app and data\n", + "N1904 = use(\"saulocantanhede/tfgreek2\", version=\"0.5.1\", hoist=globals())" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "8a387dd9-50a2-47ce-867b-1e948b31f636", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "# Push the Text-Fabric stylesheet to this notebook (so the output displays properly in a notebook viewer)\n", + "N1904.dh(N1904.getCss())" + ] + }, + { + "cell_type": "markdown", + "id": "3e383881-2356-4704-97fd-c8a676abf6e9", + "metadata": {}, + "source": [ + "Note: to access the feature descriptions click here" + ] + }, + { + "cell_type": "markdown", + "id": "38ba1e92-4459-4e4c-bb07-3b002d1a60a3", + "metadata": {}, + "source": [ + "# 3 - Performing the queries \n", + "##### [Back to TOC](#TOC)" + ] + }, + { + "cell_type": "markdown", + "id": "e0204b00-a2dc-43f2-9968-86b57d29e8f9", + "metadata": {}, + "source": [ + "First, define a query template that selects John 1:1." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "12aed999-b3fb-4d0a-95ad-1b2e2a62debc", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " 0.02s 21 results\n" + ] + } + ], + "source": [ + "VerseQuery = '''\n", + "book book=John\n", + " chapter chapter=1\n", + " verse verse=1\n", + "'''\n", + "\n", + "VerseResults = N1904.search(VerseQuery)" + ] + }, + { + "cell_type": "markdown", + "id": "a429cfff-e434-44be-a6fe-7c234e4e63a9", + "metadata": { + "tags": [] + }, + "source": [ + "# 4 - The syntax tree presentation\n", + "##### [Back to TOC](#TOC)\n", + "\n", + "The data set allows for different types of tree presentation:" + ] + }, + { + "cell_type": "markdown", + "id": "2ddf7f23-8d7e-4bcd-a495-ac89894d3468", + "metadata": {}, + "source": [ + "## 4.1- View 1: combined view (display all nodes)\n", + "##### [Back to TOC](#TOC)\n", + "\n", + "The following will show John 1:1 with all nodes visible." + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "0a51ca20-ca23-4e81-a381-6131be641caa", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "

verse 1" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "

verse
book=Johnchapter=1verse=1
sentence
book=John
clause
book=John
wg
wg
book=John
wg
phrase
phrase
subphrase
Ἐν
book=Johnchapter=1verse=1
subphrase
ἀρχῇ
book=Johnchapter=1verse=1
phrase
phrase
ἦν
book=Johnchapter=1verse=1
wg
phrase
phrase
subphrase
book=Johnchapter=1verse=1
subphrase
Λόγος,
book=Johnchapter=1verse=1
clause
wg
wg
phrase
phrase
καὶ
book=Johnchapter=1verse=1
clause
book=John
wg
wg
wg
book=John
wg
phrase
phrase
subphrase
book=Johnchapter=1verse=1
subphrase
Λόγος
book=Johnchapter=1verse=1
phrase
phrase
ἦν
book=Johnchapter=1verse=1
wg
phrase
phrase
subphrase
πρὸς
book=Johnchapter=1verse=1
wg
phrase
phrase
phrase
subphrase
τὸν
book=Johnchapter=1verse=1
subphrase
Θεόν,
book=Johnchapter=1verse=1
clause
wg
wg
phrase
phrase
καὶ
book=Johnchapter=1verse=1
clause
book=John
wg
wg
wg
book=John
phrase
phrase
Θεὸς
book=Johnchapter=1verse=1
phrase
ἦν
book=Johnchapter=1verse=1
wg
phrase
phrase
subphrase
book=Johnchapter=1verse=1
subphrase
Λόγος.
book=Johnchapter=1verse=1
" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "N1904.show(VerseResults, start=1, end=1, condensed=True, multiFeatures=False)" + ] + }, + { + "cell_type": "markdown", + "id": "6fe4843b-e69c-4c3d-8469-7770f456d084", + "metadata": {}, + "source": [ + "## 4.2- View 2: syntactic view (no display of wordgroup nodes)\n", + "##### [Back to TOC](#TOC)\n", + "\n", + "When the display of word groups is switched off, the tree contains all syntactically relevant detail, presented in a much easier-to-understand manner." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "id": "42203fe4-dc23-4fb5-8aeb-e227c988f2b2", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "

verse 1" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "

verse
book=Johnchapter=1verse=1
sentence
book=John
clause
book=John
phrase
phrase
subphrase
Ἐν
book=Johnchapter=1verse=1
subphrase
ἀρχῇ
book=Johnchapter=1verse=1
phrase
ἦν
book=Johnchapter=1verse=1
phrase
subphrase
book=Johnchapter=1verse=1
subphrase
Λόγος,
book=Johnchapter=1verse=1
clause
phrase
phrase
καὶ
book=Johnchapter=1verse=1
clause
book=John
phrase
phrase
subphrase
book=Johnchapter=1verse=1
subphrase
Λόγος
book=Johnchapter=1verse=1
phrase
ἦν
book=Johnchapter=1verse=1
phrase
subphrase
πρὸς
book=Johnchapter=1verse=1
phrase
subphrase
τὸν
book=Johnchapter=1verse=1
subphrase
Θεόν,
book=Johnchapter=1verse=1
clause
phrase
phrase
καὶ
book=Johnchapter=1verse=1
clause
book=John
phrase
phrase
Θεὸς
book=Johnchapter=1verse=1
phrase
ἦν
book=Johnchapter=1verse=1
phrase
subphrase
book=Johnchapter=1verse=1
subphrase
Λόγος.
book=Johnchapter=1verse=1
" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "N1904.show(VerseResults, start=1, end=1, condensed=True, hiddenTypes={\"wg\"}, multiFeatures=False)" + ] + }, + { + "cell_type": "markdown", + "id": "b1936f5e-cbfd-412c-bfa8-1c5ddfc48d7a", + "metadata": {}, + "source": [ + "## 4.3- View 3: XML source view (no display of clause, phrase, or subphrase nodes)\n", + "##### [Back to TOC](#TOC)\n", + "\n", + "When the display of clause, phrase and subphrase nodes is switched off, the tree is presented 'as found' in the XML source data." + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "id": "8afa10de-6fff-403b-84d2-fc834a96665d", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "

verse 1" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "

verse
book=Johnchapter=1verse=1
sentence
book=John
wg
wg
book=John
wg
Ἐν
book=Johnchapter=1verse=1
ἀρχῇ
book=Johnchapter=1verse=1
ἦν
book=Johnchapter=1verse=1
wg
book=Johnchapter=1verse=1
Λόγος,
book=Johnchapter=1verse=1
wg
καὶ
book=Johnchapter=1verse=1
wg
book=John
wg
book=Johnchapter=1verse=1
Λόγος
book=Johnchapter=1verse=1
ἦν
book=Johnchapter=1verse=1
wg
πρὸς
book=Johnchapter=1verse=1
wg
τὸν
book=Johnchapter=1verse=1
Θεόν,
book=Johnchapter=1verse=1
wg
καὶ
book=Johnchapter=1verse=1
wg
book=John
Θεὸς
book=Johnchapter=1verse=1
ἦν
book=Johnchapter=1verse=1
wg
book=Johnchapter=1verse=1
Λόγος.
book=Johnchapter=1verse=1
" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "N1904.show(VerseResults, start=1, end=1, condensed=True, hiddenTypes={\"clause\",\"phrase\",\"subphrase\"}, multiFeatures=False)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bc140c69-1a21-47fc-899b-461d339d5e2e", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}