Need an attribute with just the lemma #91
Comments
That makes sense to me. You know more than I about XML schemata. Would we need to modify the schema to make that validate? |
@jonathanrobie I reformatted the file under a new name, https://github.com/translatable-exegetical-tools/Abbott-Smith/blob/master/abbott-smith.tei_lemma.xml, and have merged it into the repository. There were cases where there was no vertical bar in the entry and where there were undefined Strongs, G????. I have attached 2 files with those instances. I am open to changing things if that will work better for you. Let me know. |
Thanks - that's helpful, I hadn't gotten around to this. Is there a need to have two different files in different formats? Is there need for the format that uses the |
Is there a schema? I can't find a file with the extension .dtd, .xsd, .rnc, or .rng. |
I do not believe that there is a schema file. I never found one and built my script based upon the xml file contents. I used a different file since I wasn't sure whether we wanted to keep both formats. We will have to get Todd in the loop to move to 1 file for the new format, since he led the effort to do the manual updates to reflect reality. |
I have always validated against http://www.crosswire.org/OSIS/teiP5osis.2.5.0.xsd (see TEI@xsi:schemaLocation). I have it in my local files but Git is set to ignore it. |
I cannot see any reason not to maintain only one file. It would make things much easier. @destatez Thank you for working on this file. I think we probably should just make the changes directly in abbott-smith.tei.xml. But if you would feel more comfortable looping Todd in, that's okay with me. We started the project together, and it's a good idea to keep key parties updated as changes are made. |
I went through the entries that did not have a bar (most did not have a Strong's number) and fixed those Strong's numbers that were G????. |
I wish we had talked before you appended the letter "a" to these undefined Strongs. We, the Unlocked Greek Lexicon and Unlocked Greek New Testament teams, have adopted an approach by Alan Bunning called Strongs Plus (from the ugl docset: "The Strong’s Plus ID referenced above was initially developed by Alan Bunning, where he took the 4-digit Strong’s ID and appended a zero to create a 5-digit ID. This gave him extra IDs to be able to qualify different word forms than the standard Strong’s. We will be using this Strong’s Plus identification for this project.") When we created the ugl files, we converted all the Strongs IDs to Strongs Plus. In cases where a particular Greek word from A-S was not part of Alan's deliverable spreadsheet (which is the root for ugnt), I put G???? in the xml and, for ugl, assigned them a unique ID and lemma file above G99000, placing them in a class of IDs that the ugl team would review to determine the actual Strongs Plus ID and update the ugl lemma file accordingly. The work that you did for these can and will be used by our team to reassign these. We need to talk about the Strongs Plus ID scheme. It may be better to rewrite the xml file using this convention so that we can all be on the same page. We can even update the delivered A-S xml with the Strongs IDs (in 5-number form) for those that were undefined, which you have defined. I can then re-run my script for the new format and have the best of both worlds. |
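For readers unfamiliar with the scheme, here is a minimal sketch of the Strongs Plus conversion as described in the quoted docset: zero-pad the Strong's number to 4 digits, then append a zero. The function names `to_strongs_plus` and `is_placeholder` are hypothetical, and the exact treatment of the placeholder range above G99000 is an assumption drawn from the comment above, not from the ugl scripts themselves.

```python
import re


def to_strongs_plus(strongs_id: str) -> str:
    """Convert a standard Greek Strong's ID (e.g. 'G26') to a 5-digit ID."""
    match = re.fullmatch(r"G(\d{1,4})", strongs_id)
    if match is None:
        # Undefined IDs such as 'G????' cannot be converted mechanically.
        raise ValueError(f"not a standard Strong's ID: {strongs_id!r}")
    # Zero-pad to 4 digits, then append the extra digit '0'.
    return f"G{int(match.group(1)):04d}0"


def is_placeholder(plus_id: str) -> bool:
    """Assumption: IDs above G99000 are provisional and await review."""
    return int(plus_id[1:]) > 99000
```

Under these assumptions, `to_strongs_plus("G26")` yields `"G00260"`, while an undefined `"G????"` raises a `ValueError` and would need a manually assigned ID.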
We should have a separate issue for extending Strong's. I will open one. |
I'm fine with whatever you think would work best on this @destatez. Thanks for your work. |
In the current format, there is no element or attribute that contains the lemma directly without additional text. That makes it difficult to match in queries without preprocessing.
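To illustrate the difference, here is a rough sketch in Python. It assumes the current entries carry a combined key of the form `lemma|Strongs` in an `n` attribute (the thread only mentions a vertical bar, so this shape is an assumption), and it uses a hypothetical `lemma` attribute for the proposed format. The sample data is invented, and the real file's TEI namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Invented sample; both attribute layouts are assumptions for illustration.
SAMPLE = """<body>
  <entry n="ἀγάπη|G26" lemma="ἀγάπη"/>
  <entry n="λόγος|G3056" lemma="λόγος"/>
</body>"""

root = ET.fromstring(SAMPLE)


def find_by_combined_key(root, lemma):
    # Current format: the key must be split on '|' before comparing.
    return [
        entry
        for entry in root.iter("entry")
        if entry.get("n", "").split("|")[0] == lemma
    ]


def find_by_lemma_attribute(root, lemma):
    # Proposed format: a plain attribute predicate is enough.
    return root.findall(f".//entry[@lemma='{lemma}']")
```

With a dedicated attribute, the match becomes a simple path expression, which is the kind of query that is hard to write against the combined key without preprocessing.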
Instead of:
I would prefer something like: