Should we recommend specifying language tag? #479

Open
mcourtot opened this issue Sep 12, 2017 · 27 comments
Labels
attn: Editorial WG Issues pertinent to editorial activities, such as ontology reviews and principles attn: Operations Committee Issues pertinent to broad Foundry activities, such as policies and guidelines ontology metadata Issues related to ontology metadata policy Issues and discussion related to OBO Foundry policies vote Issue that is open to voting (by whom?)

Comments

@mcourtot
Contributor

mcourtot commented Sep 12, 2017

As per https://groups.google.com/forum/#!topic/obo-discuss/_x1MpwAjHQw, from Peter Midford:

Entering a string for a definition or a synonym in Protege, I'm confronted with choosing a type (xsd:string) or a language (en), but it seems only one is allowed. Looking around in NBO, which I'm updating, it looks like type wins over language. So, my question is whether specifying type or language is the better practice in the OBO community.
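
For context, here is a minimal sketch of why an editor offers only one of the two choices. In the RDF 1.1 data model a literal carries either a language tag (its datatype is then implicitly rdf:langString) or an explicit datatype, and a plain literal is shorthand for xsd:string. The `Literal` class below is a hypothetical illustration, not Protege or OWL API code.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


@dataclass(frozen=True)
class Literal:
    """Hypothetical model of an RDF 1.1 literal (illustration only)."""
    lexical: str
    datatype: Optional[str] = None
    lang: Optional[str] = None

    def __post_init__(self):
        if self.lang is not None:
            # A language-tagged string always has datatype rdf:langString,
            # so it cannot also carry xsd:string or any other datatype.
            if self.datatype not in (None, RDF_LANGSTRING):
                raise ValueError("language tag and explicit datatype are mutually exclusive")
            object.__setattr__(self, "datatype", RDF_LANGSTRING)
            # Language tags compare case-insensitively; store them in lower case.
            object.__setattr__(self, "lang", self.lang.lower())
        elif self.datatype is None:
            # A plain literal is shorthand for an xsd:string literal.
            object.__setattr__(self, "datatype", XSD_STRING)


plain = Literal("assay")
typed = Literal("assay", datatype=XSD_STRING)
tagged = Literal("assay", lang="en")
```

In this model `plain` and `typed` compare equal, while `tagged` is a different literal, which is the distinction the question is about.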

@nlharris
Contributor

nlharris commented Apr 13, 2020

Relates to #325 and #437

@nlharris nlharris added ontology metadata Issues related to ontology metadata policy Issues and discussion related to OBO Foundry policies labels Apr 13, 2020
@nlharris
Contributor

nlharris commented Dec 1, 2020

can someone answer @mcourtot's question?

@alanruttenberg
Member

alanruttenberg commented Dec 1, 2020 via email

@jamesaoverton
Member

I agree with @alanruttenberg that a language tag is better than xsd:string for labels, definitions, synonyms, etc.

In practice, I see more xsd:strings than language tags, but we should push to use language tags.

@matentzn
Contributor

matentzn commented Dec 1, 2020

I agree. I think xsd:string is essentially redundant - there is no good practical reason to annotate strings with xsd:string. I also believe language tags are the way to go here. Just telling the tooling that will be quite a challenge.

@yongqunh
Contributor

yongqunh commented Dec 1, 2020

I agree as well. A language tag is better than xsd:string.

@nlharris
Contributor

nlharris commented Jun 8, 2021

does this recommendation still need to be added somewhere?

@matentzn matentzn added the attn: Operations Committee Issues pertinent to broad Foundry activities, such as policies and guidelines label Jun 9, 2021
@matentzn
Contributor

matentzn commented Jun 9, 2021

I added the Operations Committee tag to just put this up for vote.

I think it's straightforward to vote that we want to use language tags over xsd:string. However, the big question mark is what we want to recommend when comparing "nothing" to @en - there will be a lot of screams of agony if we require all English language labels, definitions etc. to get an @en tag. But maybe that's the way to go to break the dominance of the English language in a truly global world! I would vote for it, and I would volunteer to help the Foundry ontologies migrate. However, there are voices (I am sure @cmungall is one of them) that would say that "@en" on all literals will confuse the users :D But even here - we could say: use @en everywhere, and if your users are confused, export a version of your ontology without language tags. So, two votes:

Suggestion:
Recommend using language tags instead of xsd:string, and add the recommendation to the "common format" principle. This will require changes to the obo2owl format parser and some work on the curation side. We won't require language tags across the board (gene names, people's names, xrefs etc., thanks @alanruttenberg), but ROBOT report will produce a warning if a class in an ontology has a label, synonym or definition that does not have a language tag.

  • 🚀 Yes. It's hard work, but we can overcome the technical challenges and I think it's worth it.
  • 👎 No. It's not worth it at this moment, not unless someone is being paid to do it.
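
To make the proposed warning concrete, here is a rough sketch of the kind of check described in the suggestion: flag any label, synonym or definition that lacks a language tag, and leave other annotations (such as xrefs) alone. This is not ROBOT's implementation. The `Annotation` tuple, the `EX:` identifiers and the choice of `oboInOwl:hasExactSynonym` as the synonym property are assumptions made for illustration.

```python
from typing import Iterable, List, NamedTuple, Optional

# Properties covered by the proposed warning. The label and definition
# properties are named in this thread; the synonym property is an assumption.
CHECKED_PROPERTIES = {
    "rdfs:label",
    "IAO:0000115",               # definition
    "oboInOwl:hasExactSynonym",  # assumed synonym property
}


class Annotation(NamedTuple):
    subject: str
    prop: str
    value: str
    lang: Optional[str] = None


def missing_language_tags(annotations: Iterable[Annotation]) -> List[str]:
    """Return one warning per checked annotation that has no language tag."""
    warnings = []
    for a in annotations:
        if a.prop in CHECKED_PROPERTIES and not a.lang:
            warnings.append(f"WARN {a.subject} {a.prop} {a.value!r}: no language tag")
    return warnings


# Hypothetical data: a tagged label, an untagged definition, and an xref
# that is deliberately outside the scope of the check.
example = [
    Annotation("EX:0000001", "rdfs:label", "assay", "en"),
    Annotation("EX:0000001", "IAO:0000115", "placeholder definition"),
    Annotation("EX:0000001", "oboInOwl:hasDbXref", "EX:123"),
]
warnings = missing_language_tags(example)
```

Only the untagged definition is reported here; the xref passes because the suggestion does not require language tags across the board.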

@matentzn matentzn added the vote Issue that is open to voting (by whom?) label Jun 9, 2021
@cmungall
Contributor

voices (I am sure @cmungall is one of them) that would say that "@en" on all literals will confuse the users

I am all for not confusing users, but I am not sure how this would confuse users, most of whom interact via OLS etc

All seems reasonable on the surface. I think the challenge is with the tooling, not policy. Provide people tools and they will do the right thing.

The main tooling need is in the obo2owl code. If standard sparql updates are provided in odk/robot then it will be easier for maintainers to migrate. But ideally this would be in the owlapi conversion code. That way there is no confusion in having the edit version be different owl than the release version. I don't think adding this to the owlapi is so hard but someone needs to manage the migration process.

I also think you need to give clear guidance on how to migrate. Many ontologies may use Latin terms. Doing a replace-all of string to @en will yield incorrect results. Unless we consider that a Latin term is acceptable in formal English-speaking contexts? Or maybe we should require two labels? Or one label plus an exact synonym?

There are methods to infer whether a term is English or Latin, but this is work we would be putting on ontology developers, many of whom have to balance limited resources against actual requests from curators rather than formal ontologists.
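
One way to picture a migration that avoids the blind replace-all problem is sketched below, under stated assumptions: untagged literals on selected properties get `en`, while values a curator has flagged as non-English (for example Latin names) are left untouched and collected for manual review. The row shape, the `tag_english` function and the `non_english` set are hypothetical; an actual migration would more likely live in the OWL API conversion code or a SPARQL update, as discussed above.

```python
from typing import List, Optional, Set, Tuple

# (subject, property, value, lang); lang is None when no tag is present.
Row = Tuple[str, str, str, Optional[str]]


def tag_english(
    rows: List[Row], properties: Set[str], non_english: Set[str]
) -> Tuple[List[Row], List[Row]]:
    """Add 'en' to untagged literals on `properties`.

    Values listed in `non_english` are kept as they are and also returned
    in a second list so a curator can decide how to tag them.
    """
    migrated: List[Row] = []
    review: List[Row] = []
    for subject, prop, value, lang in rows:
        if prop in properties and lang is None:
            if value in non_english:
                review.append((subject, prop, value, lang))
                migrated.append((subject, prop, value, lang))
            else:
                migrated.append((subject, prop, value, "en"))
        else:
            # Already tagged, or a property outside the migration scope.
            migrated.append((subject, prop, value, lang))
    return migrated, review


rows = [
    ("EX:1", "rdfs:label", "heart", None),
    ("EX:2", "rdfs:label", "Homo sapiens", None),   # flagged as Latin below
    ("EX:3", "rdfs:label", "Krankheit", "de"),
    ("EX:1", "oboInOwl:hasDbXref", "EX:99", None),
]
migrated, review = tag_english(rows, {"rdfs:label"}, {"Homo sapiens"})
```

The sketch does not answer the policy question of whether a Latin name counts as acceptable English; it only keeps such values out of the automatic pass.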

@pbuttigieg
Contributor

In some of our UN work, language tags are very desirable for obvious reasons, and more interoperability efforts are also asking for multilingual support. Supportive of the language tag.

@alanruttenberg
Member

alanruttenberg commented Jun 15, 2021 via email

@matentzn
Contributor

Ok the vote is now ready: a simple yes or no question:
#479 (comment)

@cthoyt
Collaborator

cthoyt commented Jun 15, 2021

Ok the vote is now ready: a simple yes or no question:
#479 (comment)

Not sure if this is planned to become more common but I like the idea that votes take place via github issues / comments.

@matentzn
Contributor

matentzn commented Jul 13, 2021

Open action items:

  • Finalise the vote
  • Find resources to extend the OBO format parser to handle language tags correctly (alternatively, we can recommend stripping them from rdfs:label annotation prior to release using ODK)

@nlharris
Contributor

Looks like the vote so far is 3 yes and 2 "thumbs up" (not mentioned as a voting option, but I think we can assume those are also yeses).

@matentzn
Contributor

Ok, the outcome of the vote here is that we start recommending language tags in place of DT(string) for labels.

Next steps:

  • Add this to common format principle OBO (@nataled @lschriml @nicolevasilevsky EWG)
  • Making an announcement on OBO discuss about new OLS support (OFOC call)
  • Start building QC checks (probably our team)

@nataled
Contributor

nataled commented Feb 15, 2022

Not a pushback on the outcome of the vote, but on the process. One of the early decisions regarding new or changed principles is that they are discussed and voted on in an Operations call before wording is added by the EWG. I see that there was discussion of this during a call, but it's not clear to me that a vote was taken during a call. Has that happened?

@matentzn
Contributor

Sure, we can raise this one more time at the OFOC! Makes sense. I don't remember exactly what happened with respect to the discussion there. So best to just finalise the decision next Tuesday! Thank you @nataled

@matentzn matentzn added the attn: OFOC call Issue to discuss on fortnightly OBO Operations meeting label Feb 15, 2022
@matentzn matentzn removed the attn: OFOC call Issue to discuss on fortnightly OBO Operations meeting label Apr 19, 2022
@ddooley
Contributor

ddooley commented May 12, 2022

Is there any guidance about what language tags are good, and which might be malformed? We're looking into permitted language variants over in FoodOntology/joint-food-ontology-wg#25
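
As a rough first filter for malformed tags, a purely syntactic check can be sketched from the `LANGTAG` production in the Turtle grammar, `'@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*`. This catches typos such as `en_GB`, but it is looser than BCP 47 and does not verify that subtags are registered, so it cannot say which language variants should be permitted. The function name is hypothetical.

```python
import re

# Body of the Turtle LANGTAG production, without the leading '@'.
_LANGTAG = re.compile(r"[a-zA-Z]+(-[a-zA-Z0-9]+)*")


def is_syntactically_valid_langtag(tag: str) -> bool:
    """Return True if `tag` matches the Turtle LANGTAG syntax.

    This is a syntax check only: it does not confirm that the subtags
    exist in the IANA language subtag registry.
    """
    return _LANGTAG.fullmatch(tag) is not None


candidates = ["en", "en-GB", "pt-BR", "en_GB", "en-", "-en", ""]
results = {tag: is_syntactically_valid_langtag(tag) for tag in candidates}
```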

@alanruttenberg
Member

@matentzn matentzn added the attn: OFOC call Issue to discuss on fortnightly OBO Operations meeting label May 1, 2023
@matentzn matentzn added attn: Editorial WG Issues pertinent to editorial activities, such as ontology reviews and principles and removed attn: OFOC call Issue to discuss on fortnightly OBO Operations meeting labels Jun 27, 2023
@matentzn
Contributor

@nataled Action items:

#479 (comment)

@hoganwr
Contributor

hoganwr commented Jun 27, 2023

A couple thoughts, and hopefully I am not stirring a hornet's nest:

  1. Should we be specific about which annotation properties this applies to? rdfs:label, skos:prefLabel, certain things from OMO?
  2. In some cases as Alan R. points out, xsd:string is absolutely the correct type. For example when annotating non-IRI identifiers such as RxCui on classes. Those identifiers are strings of numerals (and I would argue not numbers, but that's not important right now).

@nataled
Contributor

nataled commented Jun 27, 2023

@hoganwr based on #479 (comment) it should be applied to label only (at least for now).

@matentzn I'm going to need some text along with specific instructions to be provided to users. Best if these instructions include directions for both OWL and OBO formats, but if you don't know the latter I can probably figure it out using an OWL-to-OBO converter.

I should mention that I'm becoming increasingly concerned that we are overloading the principles with directives that are quite ancillary to the principle at hand. This language tag thing, for example, while referring to format, is not really related to the format principle, which is about the overall artifact format (OBO, OWL, JSON, etc.) and not about specific fields. I'm thinking we need to separate principles from specific details. I plan on raising this issue in an OFOC call.

@hoganwr
Contributor

hoganwr commented Jun 27, 2023 via email

@matentzn
Contributor

matentzn commented Jun 28, 2023

There will be some iterations on this on the PR @nataled but you can start with this:

- For rdfs:label and IAO:0000115 annotation assertions, we discourage the use of datatype declarations such as `xsd:string`. It is important to note that `xsd:string` is essentially redundant in OWL/RDF, so "assay" and "assay"^^xsd:string should be the exact same thing. However, a lot of tooling may be confused by the difference, so the xsd:string datatype assertion SHOULD be omitted in general for all annotations, and MUST be omitted for rdfs:label and IAO:0000115.
- To designate rdfs:label and IAO:0000115 annotations in a language different from English, a [valid RDF language tag](https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal) MUST be specified, for example, "Krankheit"@de.
- rdfs:label and IAO:0000115 annotation assertions for English content MAY be annotated with an English language tag. If the ontology chooses not to use language tags, a protege:defaultLanguage assertion MUST be added as an ontology annotation.
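
The three draft bullets above can be read as a small set of checks. The sketch below is one interpretation, not agreed wording: an explicit `xsd:string` is an error on rdfs:label and IAO:0000115 and a warning elsewhere, and untagged labels or definitions require a declared default language. The `Assertion` tuple, `check_draft_rules` and the `default_language` argument (standing in for the ontology-level default language annotation) are hypothetical.

```python
from typing import List, NamedTuple, Optional

# Properties for which the draft uses MUST rather than SHOULD.
STRICT = {"rdfs:label", "IAO:0000115"}


class Assertion(NamedTuple):
    prop: str
    value: str
    datatype: Optional[str] = None  # "xsd:string" only when written explicitly
    lang: Optional[str] = None


def check_draft_rules(
    assertions: List[Assertion], default_language: Optional[str]
) -> List[str]:
    """Return problems found under one reading of the draft bullets."""
    problems = []
    untagged_strict = False
    for a in assertions:
        if a.datatype == "xsd:string":
            # MUST be omitted on labels and definitions, SHOULD elsewhere.
            level = "ERROR" if a.prop in STRICT else "WARN"
            problems.append(f"{level}: explicit xsd:string on {a.prop} {a.value!r}")
        if a.prop in STRICT and a.lang is None:
            untagged_strict = True
    if untagged_strict and default_language is None:
        # Untagged English content needs an ontology-level default language.
        problems.append("ERROR: untagged labels or definitions but no default language declared")
    return problems


tagged_only = [
    Assertion("rdfs:label", "assay", lang="en"),
    Assertion("rdfs:label", "Krankheit", lang="de"),
]
typed_label = [Assertion("rdfs:label", "assay", datatype="xsd:string")]
typed_comment = [Assertion("rdfs:comment", "note", datatype="xsd:string")]
```

With these inputs, `tagged_only` passes without a default language, `typed_label` fails on the datatype (and on the missing default language unless one is supplied), and `typed_comment` only produces a warning.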

@alanruttenberg
Member

@matentzn I'm confused. The votes and discussion suggest use of language tags, but the text you suggest effectively says to not use them for English.

@matentzn
Contributor

matentzn commented Jul 2, 2023

@alanruttenberg good point, I forgot to add a note about that. I made it a bit less restrictive now, and added a third bullet on how to deal with English.

allenbaron added a commit to DiseaseOntology/HumanDiseaseOntology that referenced this issue Feb 5, 2024
allenbaron added a commit to DiseaseOntology/HumanDiseaseOntology that referenced this issue Mar 18, 2024
Auto-add 'en' lang tag to all labels, definitions, synonyms, &
comments of DO terms without a language tag (currently excludes
comments as annotations on definitions).

Related to
OBOFoundry/OBOFoundry.github.io#479.