Clarify the semantics of the _alias.dictionary_uri attribute #481
Comments
I think our modest ambition for the 'alias' attributes is to automatically recognise different variants of a data name, such that software can successfully use aliases interchangeably. Any software-relevant semantic differences would require a new data name to be defined, where we might be tolerant in practice to changes that have no practical effect (such as changing the minimum value from 0 to 1, if we know that nobody has used 0). So, from the point of view of software, the particular version pointed to by the URI shouldn't matter. Also, it becomes difficult for dictionary writers to make the 'semantically similar' judgement. I would be in favour therefore of simply pointing to the latest known version that defined the alias, so the wording would be:
|
Seems OK to me; I created PR #485 to address this. I will update the draft PR #483 accordingly. But this also got me thinking that including the full URI for each alias seems like a significant duplication of data. The same two or three URIs will be repeated in almost all definitions (imagine one of the dictionaries moving to a different location). Would it make sense to (eventually) introduce something like |
In theory this would be nice (more normalised, as the DB people like to say), but in practice it just means programmers have to write in an extra step of indirection and readers have to scroll around. Global search and replace makes editing multiple entries trivial. However, even doing it that way, we have created a real workload for ourselves if we are trying to keep up with the latest wwPDB version, so we might want to finesse our definition to state that it is "Absolute URI of a version of the dictionary containing the latest version of the aliased definition", so that if the text of the definition doesn't change (which is true for 99% of the PDB definitions that we alias) then we don't have to update our definition either. |
I see your point. I would further update the proposed phrasing to: "Absolute URI of a version of the dictionary containing the latest compatible version of the aliased definition." But this latest round of discussions also made me realise that we might not always have URIs for specific dictionary versions (e.g. PDB does not seem to provide them for the mmCIF/PDBx dictionaries). I therefore propose the following approach:
If you are OK with this approach, I can update PR #482 to reflect these changes. What do you think? |
Yes, I agree with these suggestions. |
The current version of the DDLm reference dictionary defines the _alias.dictionary_uri attribute as:

However, the definition is a bit imprecise, which prevents this attribute from being effectively used in an automated fashion. For example, the same data name may belong to several versions of the same dictionary, sometimes with slightly different semantics (or at least different human-readable definitions). I suggest we clarify the definition along the lines of:
The main benefit of this would be that we could run automated checks from time to time to see if the definitions from different dictionaries (e.g. mmCIF and CIF_CORE) still match up to the specified level (e.g. we might not require the human-readable definitions to match verbatim, but having the same enumeration ranges would be great). I specifically used "should" in the reformulated definition to indicate that the definitions may become out of sync from time to time, but that one should strive to get them in sync when possible.

Alternatively, we may anchor the URI to a different dictionary version, e.g. the version that originally defined the data item.
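
To make the idea of an automated check a bit more concrete, here is a rough Python sketch of what comparing enumeration ranges across aliased definitions might look like. It is only an illustration under several assumptions: the `Definition` class, the `check_aliases` function, the data names and the `example.org` URI are all made up, and the dictionaries are assumed to be already parsed into plain Python mappings keyed by lower-cased data name (no real CIF/DDLm parser is used). Human-readable descriptions are deliberately not compared.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Definition:
    """Hypothetical, simplified view of one data-name definition."""
    enum_min: Optional[float] = None  # lower bound of the enumeration range
    enum_max: Optional[float] = None  # upper bound of the enumeration range
    description: str = ""             # human-readable text, not compared


# (local data name, aliased data name, URI of the dictionary defining the alias)
Alias = Tuple[str, str, str]


def check_aliases(
    local_defs: Dict[str, Definition],
    aliases: List[Alias],
    dictionaries_by_uri: Dict[str, Dict[str, Definition]],
) -> List[Tuple[str, str, str]]:
    """Return (local name, alias name, problem) for every alias out of sync."""
    problems = []
    for local_name, alias_name, uri in aliases:
        # Data names are case-insensitive, so look them up lower-cased.
        local = local_defs[local_name.lower()]
        foreign = dictionaries_by_uri.get(uri, {}).get(alias_name.lower())
        if foreign is None:
            problems.append((local_name, alias_name, "alias not found at URI"))
        elif (local.enum_min, local.enum_max) != (foreign.enum_min, foreign.enum_max):
            problems.append((local_name, alias_name, "enumeration range differs"))
    return problems


# Made-up example data: the local definition starts at 1, the aliased one at 0.
local_defs = {"_demo.count": Definition(enum_min=1, description="Number of items.")}
foreign_uri = "https://example.org/other_dictionary.dic"
dictionaries_by_uri = {
    foreign_uri: {"_demo_count": Definition(enum_min=0, description="Item count.")}
}
aliases = [("_demo.count", "_demo_count", foreign_uri)]

for problem in check_aliases(local_defs, aliases, dictionaries_by_uri):
    print(problem)
```

With the example data above, the single alias is reported because the two enumeration ranges differ. A real check would also need to decide which other attributes count towards "the specified level" of agreement.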