SourceViewBible #19
Replies: 6 comments 9 replies
-
On April 5, Rob Wiebe <[email protected]> wrote:
-
Wow, thanks Rob Wiebe for that helpful update, and to Tom for the amazingly detailed spot check and comparison! It's clear that there's no one answer to how to do this, and like anything relating to natural language, it's always way more complicated than first expected. To me, one important detail is that the YWAM data isn't freely available / open-licensed (yet?). Perhaps this seems harsh, but other than learning about different possible ways of thinking about technical matters (which Tom did above), that makes it largely irrelevant to this bigger discussion, which is mostly about resources that can be freely offered to the Bible world to use in ways that we maybe can't even imagine yet. (Saying it might be made available just in Paratext is of no personal interest to me. My interest is tagging original Heb and Grk texts so that it may be possible for translations that are aligned to the original text to automatically transfer that data to their translation.) But please do let us know, Rob, if/when you are able to negotiate better licensing terms. And also, don't hesitate to let us know of any limitations or deficiencies you see in the Glyssen data/system.
-
Unless we can come up with an objective standard or automated way to arrive at sufficiently (whatever that means) unique and usable IDs, it's going to take a painful amount of work to come to agreement as to which (arbitrary) approach is "best." I'd personally be willing to spend some time making Glyssen conform to some other standard if it met Glyssen's needs and "everyone" else agreed it was the best standard. (I'd have to get buy-in from my higher-ups, but I'm guessing it would be possible.) But so far, it doesn't look like there is an "everyone" else. It's a lot less motivating to make Glyssen mimic "someone" else. Feels like there are two rather difficult options:
-
In answer to your questions above:
Given that there may be legitimate reasons for various decisions about character IDs, perhaps the "best" approach would be to have a database that is able to store more than one system ("aliases"). The trouble with that, I'm afraid, is that unless you start with a "master" list that has it broken up into the maximum granularity, you'll end up needing a many-to-many map. By introducing Scripture references, a useful many-to-many map still might be possible, but it's probably not trivial.

Looking at the data we have in Glyssen, I see that there are 249 character IDs that represent a group/plurality that speak in more than one verse. Of those, 100 speak in exactly 2 verses. In places where those verses are contiguous or in close proximity, they can generally be treated as a single unique character. That still leaves well over 150 character IDs that would potentially need to be split up into greater granularity to allow for historical uniqueness. In a way, this would be "better" in Glyssen because there is no compelling reason why every time the "Israelites" speak, it should always sound like the same person. (In practice, whenever Glyssen's output is processed into FCBH's Core Script, those distinctions would be erased because they map our character IDs onto their minimal set of voice actors. But Glyssen does have the ability to come up with an optimal distribution of characters to any size cast, where you can specify genders and age groupings.)
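To make the idea a bit more concrete, here is a rough Python sketch of how a reference-scoped alias map might look. Everything in it is an assumption for illustration only: the system names, the character IDs, the `MASTER` and `ALIASES` tables, and the use of a single running integer as a verse index are all hypothetical, and none of it reflects how Glyssen or any other tool actually stores its data.

```python
def cluster_verses(verse_indices, max_gap=1):
    """Group verse indices into runs whose neighbours are at most max_gap apart.

    Each run could become one candidate "granular" character in a master list.
    Verse indices are assumed to be integers in canonical reading order.
    """
    clusters = []
    for v in sorted(set(verse_indices)):
        if clusters and v - clusters[-1][-1] <= max_gap:
            clusters[-1].append(v)   # close enough: same speaker group
        else:
            clusters.append([v])     # too far away: start a new group
    return clusters


# Hypothetical master list at maximum granularity:
# granular ID -> set of verse indices where that character speaks.
MASTER = {
    "israelites@exodus": {101, 102},
    "israelites@judges": {500},
}

# Hypothetical alias table: (system, ID in that system) -> granular IDs.
# One alias may cover several master entries, and one master entry may
# be reachable from several aliases, hence many-to-many.
ALIASES = {
    ("system_a", "Israelites"): {"israelites@exodus", "israelites@judges"},
    ("system_b", "people of Israel"): {"israelites@exodus"},
    ("system_b", "Israelite army"): {"israelites@judges"},
}


def resolve(system, alias_id, verse):
    """Return the granular IDs an alias refers to at a given verse."""
    candidates = ALIASES.get((system, alias_id), ())
    return sorted(m for m in candidates if verse in MASTER[m])


if __name__ == "__main__":
    print(cluster_verses([10, 11, 12, 40, 41, 90], max_gap=2))
    print(resolve("system_a", "Israelites", 500))
```

The point of the sketch is only that the Scripture reference is what disambiguates a coarse ID such as "Israelites"; choosing a sensible `max_gap` and building the master list itself are the non-trivial parts.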
-
If you want unique identifiers for speakers, would the Semantic Dictionary of Biblical Greek and the Semantic Dictionary of Biblical Hebrew work? See https://semanticdictionary.org/. These identifiers are already used in Enhanced Resources and also in the MACULA datasets: https://github.com/Clear-Bible/macula-hebrew. For instance, in SDBH, 3.1.7 is the domain "Names of People"; there is also a domain for "Names of Deities", etc. One downside: the identifiers for NT and OT differ.
-
@RobH123 mentioned that he had corresponded with Rob Wiebe from SourceView (SourceViewReader app, SourceViewBible.com), who is also already parsing Scripture to identify speakers.
I found this app, which I'm assuming is the one, based on the URL, but that URL is now a junk site and is not being kept up. Apparently the app isn't either (last updated January 2017).
Has this moved to a new URL? Is development supposed to resume? Is it open source?