Use Glyssen (and/or other tools) to create fully marked up Greek and Hebrew texts #10

tombogle (Maintainer) · 2022-03-10T19:18:05Z

I pulled down the UW Hebrew and Greek Bibles and made a very brief foray into trying to identify the characters for the speaking parts using Glyssen. I quickly realized that my lack of knowledge of the biblical languages and the complete absence of quotation marks in the text were going to make this a very long, slow process. But I did just enough to see that it is possible.
If this process were completed, it would be fairly simple to turn the result into a Glyssen reference text. I haven't tried it yet, but it would probably also be simple to use the Voice Marking Tools to apply that work back to the USFM data. (I think the biggest question is whether it might somehow clobber the \w markup.)
From Glyssen's standpoint, there is probably no compelling reason to have a reference text in the original languages, since the main purposes of the reference text are to a) help a user figure out who is speaking in the Identify Speaking Parts dialog box and b) provide a text in a language of wider communication (LWC) to help the field recording teams keep track of what they are recording (since they typically do not know the target language). However, there might well be other uses for a marked-up original-language text. Once it is produced, someone would need to decide whether it should be included as a separate resource or whether the existing texts should have those milestone markers added. Presumably the latter would be more maintainable, but there might be compelling reasons to keep them separate.
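For anyone who wants to test the \w question before trusting a tool with the real data, here is a minimal Python sketch of a before/after check. It is not how Glyssen or the Voice Marking Tools work. The helper names are hypothetical, the `\qt-s`/`\qt-e` milestones with a `who` attribute follow the USFM 3 quotation milestone syntax (the tools may well emit something different), and the sample verse is made up with simplified `lemma` attributes.

```python
import re

# One USFM word element, e.g. \w εἶπεν|lemma="λέγω"\w*
W_ELEMENT = re.compile(r"\\w .*?\\w\*")


def w_elements(usfm: str) -> list[str]:
    """Return every \\w ...\\w* element in order of appearance."""
    return W_ELEMENT.findall(usfm)


def wrap_with_speaker(usfm: str, who: str) -> str:
    """Surround a span with qt milestones.

    Hypothetical stand-in for whatever a real tool would write back.
    """
    return f'\\qt-s |who="{who}"\\*{usfm}\\qt-e\\*'


def w_markup_preserved(before: str, after: str) -> bool:
    """True if the \\w elements are identical before and after marking."""
    return w_elements(before) == w_elements(after)


# Illustrative sample only, not copied from the UGNT.
verse = r'\w καὶ|lemma="καί"\w* \w εἶπεν|lemma="λέγω"\w*'
marked = wrap_with_speaker(verse, "Jesus")
print(w_markup_preserved(verse, marked))
```

Running the same comparison on a book before and after the speaker data is applied would show fairly quickly whether any \w attributes were dropped or rewritten.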

RobH123 (Maintainer) · 2022-03-10T19:31:11Z

As well as tagging speakers, I'm also interested in tagging pronoun referents, i.e., who "he" is in "And he answered,". This seems like a related concern. I had originally thought of tagging an open literal English text, and that's certainly useful. I then moved towards thinking of Hebrew and Greek as more universal and less English-centric, though I also see your point that English is indeed more helpful to more of the world's population.

BTW, the uW Hebrew UHB is derived from https://hb.openscriptures.org/, and the uW Greek UGNT is derived from a preliminary version from https://greekcntr.org/home/index.htm (where I currently spend most of my working week). They are due to release their first production version within the next few weeks/months.

tombogle (Maintainer, Author) · 2022-03-10T22:16:04Z

Tagging referents does seem like it could be useful - and a lot of work! It would certainly involve a larger set of characters than those needed to identify speakers.
