The _atom_rho_multipole.scat_valence_table and _atom_rho_multipole.scat_core_table data items are assigned the Array container type, but are described as tables in their human-readable descriptions, i.e.:
The table contains the st/l value as the key and the scattering factor
as the value. E.g. {"0.00":"15.65","0.05":"15.32",.....etc }
Furthermore, there is also the ATOM_SCAT_VERSUS_STOL category from the CIF_CORE dictionary which records similar information using separate data items.
Currently, I can think of three different ways this data can be represented and each comes with its pros and cons.
Array
With the "Array" container type the example value would be recorded as:
[ 0.00 15.65 0.05 15.32 ... ]
Pros:
Both the st/l and the scattering values can be explicitly defined as numeric.
Cons:
The key-value pairing is only implied by the order of the elements.
No DDLm way of specifying that the key values must be unique.
No DDLm way of specifying the array must be even-sized.
Numeric values that describe semantically different values are stored in the same data structure.
Fragile -- one missing or misplaced value may break the pairing of subsequent values.
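To illustrate the fragility point, here is a minimal Python sketch of how a reader would have to recover the pairs from the flat array. The function name `pair_flat_array` is hypothetical and not part of any CIF library; the numbers are the ones from the example above.

```python
from typing import List, Tuple


def pair_flat_array(values: List[float]) -> List[Tuple[float, float]]:
    """Pair a flat [stol, scat, stol, scat, ...] list purely by position."""
    if len(values) % 2 != 0:
        # This check lives in application code; DDLm itself cannot express it.
        raise ValueError("odd number of elements, cannot pair st/l with scat")
    return list(zip(values[0::2], values[1::2]))


# Intended order: st/l, scat, st/l, scat
paired_ok = pair_flat_array([0.00, 15.65, 0.05, 15.32])

# Two values misplaced: the length is still even, so nothing is flagged,
# but an st/l value is now silently treated as a scattering factor.
paired_bad = pair_flat_array([0.00, 0.05, 15.65, 15.32])
```

In the second call the result is `[(0.0, 0.05), (15.65, 15.32)]`, which is well-formed but wrong, and nothing in the array itself lets a validator notice.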
Table
With the "Table" container type the example value would be recorded as:
{ "0.00":15.65 "0.05":15.32 ...}
Pros:
Key-value pairs are explicit.
Missing keys or values would cause a syntactic error.
No duplicate key values.
Cons:
Key values are no longer explicitly declared as numeric (all CIF_2.0 table keys are strings), that is, a CIF validator would not automatically detect incorrect data like {"xxx":15.65}.
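As a rough sketch of the extra check this would push onto software, the Python snippet below finds table keys that do not parse as numbers. It assumes the table has already been read into a plain `dict` with string keys; `non_numeric_keys` is a hypothetical helper, not an existing validator API.

```python
from typing import Dict, List


def non_numeric_keys(table: Dict[str, float]) -> List[str]:
    """Return the keys that cannot be interpreted as numbers."""
    bad = []
    for key in table:
        try:
            float(key)  # keys are strings, so numeric intent must be tested
        except ValueError:
            bad.append(key)
    return bad


valid_table = {"0.00": 15.65, "0.05": 15.32}
broken_table = {"xxx": 15.65}

valid_result = non_numeric_keys(valid_table)
broken_result = non_numeric_keys(broken_table)
```

Here `valid_result` is empty and `broken_result` contains `"xxx"`, but only because this check was written by hand; the dictionary definition alone would not trigger it.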
Separate category
When defined as a separate category (i.e. ATOM_RHO_MULTIPOLE_VALENCE_SCAT_VERSUS_STOL) similarly to the ATOM_SCAT_VERSUS_STOL category, the example value would look something like:
This category would have a composite key that consists of cat.atom_label and cat.stol_value where the cat.atom_label is also linked to the _atom_rho_multipole.atom_label data item.
Pros:
Key-value pairs are explicit.
An uneven number of keys and values would cause a syntactic error.
All values can be explicitly declared as numeric (and even with different enumeration ranges, e.g. st/l could be restricted to positive values).
No duplicate key values.
Cons:
An additional level of indirection. Since the values are presented in a separate loop, they might be slightly harder to read or associate with a specific atom site.
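For comparison, a minimal Python sketch of the looped form, with one row per (atom label, st/l) pair. The row type `ScatRow`, the helper `duplicate_keys`, and the label `X1` are all made up for illustration; the real data names would be whatever the new category defines.

```python
from collections import Counter
from typing import List, NamedTuple, Tuple


class ScatRow(NamedTuple):
    atom_label: str  # hypothetical stand-in for cat.atom_label
    stol: float      # hypothetical stand-in for cat.stol_value
    scat: float      # scattering factor at that st/l


def duplicate_keys(rows: List[ScatRow]) -> List[Tuple[str, float]]:
    """Return composite keys (atom_label, stol) that occur more than once."""
    counts = Counter((row.atom_label, row.stol) for row in rows)
    return [key for key, n in counts.items() if n > 1]


rows = [
    ScatRow("X1", 0.00, 15.65),
    ScatRow("X1", 0.05, 15.32),
]

no_duplicates = duplicate_keys(rows)
with_duplicate = duplicate_keys(rows + [ScatRow("X1", 0.00, 15.65)])
```

Because every column is an ordinary typed item, uniqueness of the composite key and range limits on `stol` become things the dictionary can declare, rather than checks each program has to reimplement as above.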
Additional questions
Should the st/l values be associated with a specific atom site identified by the unique atom label, or would it suffice to link them to atom types (as is currently done in the ATOM_SCAT_VERSUS_STOL category from the CIF core dictionary)?
Should the values of st/l or scat be declared as non-negative, i.e. [0, inf]? I guess it's a "no" for the scattering values, but what about st/l? Note that this would also apply to the ATOM_SCAT_VERSUS_STOL category (which currently does not define any limits).
Final remarks
It seems that the two items in question were only introduced after the migration from DDL1 to DDLm and are thus very unlikely to be currently used by any piece of software. That does provide some freedom for refactoring.
Personally, I would probably go with the separate category approach. @jamesrhester, @nautolycus do you have any preference on this?
Thanks for this detailed analysis. I strongly prefer the separate category approach. I'm not convinced that the "Table" data type is very useful in general, as it is exactly equivalent to a category expressed a bit more concisely, but losing all of the benefits of exposing the category to the DDLm machinery (as you point out). I can imagine that it was introduced to avoid repeating the atom type for every line, but if we go down that path any loop with more than one key data name can be turned into a loop with one key data name and a table, and we are recreating a hierarchical model, but poorly.
We should associate with atom type, not site. If atoms have the same nominal type, but different form factors, then a user can create a separate atom type for each one.