Integrating _diffrn.id and _structure.id with the powder dictionary #171
Dammit, James. ;) I thought I had it all figured out, and then I had another think and now I don't know what I think... I think that we've been doing with There are a couple of ways of looking at things. One is to see how the dictionary works: can we cope with having the same Taking the second one first, what are some interesting edge cases to look at the limits? I think that Ian Madsen's (I think it's his) definition is a good starting point: A phase is a crystallographically-distinct material.
Does it make sense for a phase to have more than one structure? In the CIF sense, I think the answer has to be yes. How can you look at, say, corundum at 300 K and 1000 K, with all the concomitant changes in unit cell parameters, atomic coordinates, and displacement parameters? It's still corundum. It's just at a higher temperature. What about the inverse question: can a structure have more than one phase? I think the answer has to be no. If multiple phases have the same structure, then they aren't crystallographically distinct. Now, what about multi-diffractogram experiments?
From this last one, do there exist two structures with (essentially) the same unit cell parameters, the same site occupancies, and the same symmetry, but with different atomic coordinates, that are considered to be two different phases? I.e., what is required to be different for two things to be different phases? Unit cell params, not necessarily. Symmetry, yes. Site occupancies, probably yes. (Again, is Mg1.95Fe0.05SiO4 a different phase to Mg1.96Fe0.06SiO4?) Atomic coordinates, probably yes. I think I'll stop there for now. It's getting late and I need sleep. I'll come back to this later. |
A couple quick comments:
Any physical change changes the structure and possibly the microstructure; how that is modeled, OTOH, is discretionary. Pressure is going to change the lattice parameters for sure.
There is one exception that I can think of for this -- which is more of a nomenclature issue, than a real one -- but in describing a magnetic material, one presents a structure for the atoms and one for the spins. Breaking this into two views of a single entity makes the description more compact, so there is still really only one phase, but CIF sees this as two. |
I think @rowlesmr comment confirms I'm on the right track. I have proposed defining Taking @rowlesmr 's cases from the top:
ie there is no mapping from phase to structure. I am only asserting a mapping from structure to phase, so that's fine.
A structure can be partially defined, just provide values for
If there is a mapping from structure to phase, multiple structures can map to a single phase, or there can be a one-to-one mapping. Both these situations are covered by the proposed definition.
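The many-to-one mapping described here can be illustrated with a small consistency check. This is only a sketch in Python, not part of any CIF tooling: the row layout (a list of dicts), the function names and the ids are all hypothetical, and _structure.phase_id is the data name proposed in this issue rather than an accepted one.

```python
from collections import defaultdict


def phase_of_structures(structure_rows):
    """Return {structure_id: phase_id}; reject a structure tied to two phases."""
    mapping = {}
    for row in structure_rows:
        sid = row["_structure.id"]
        pid = row["_structure.phase_id"]  # proposed data name (assumption)
        # setdefault stores pid on first sight, otherwise returns the stored phase
        if mapping.setdefault(sid, pid) != pid:
            raise ValueError(f"structure {sid!r} maps to more than one phase")
    return mapping


def structures_by_phase(mapping):
    """Invert the mapping: one phase may collect several structures."""
    grouped = defaultdict(list)
    for sid, pid in mapping.items():
        grouped[pid].append(sid)
    return dict(grouped)


# Made-up ids: two structures of one phase, plus a one-to-one case.
rows = [
    {"_structure.id": "corundum_300K", "_structure.phase_id": "corundum"},
    {"_structure.id": "corundum_1000K", "_structure.phase_id": "corundum"},
    {"_structure.id": "other_300K", "_structure.phase_id": "other"},
]
mapping = phase_of_structures(rows)
print(structures_by_phase(mapping))
```

Both situations pass the check; only a structure claiming two different phases is rejected.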
In this situation both diffractogram and phase are important to describe PO. Structure is not directly involved. So structure maps to phase, and phase together with diffractogram determine a particular set of PO parameters. This shows the importance of phase as a concept separate to structure (as do many of the other examples).
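As a rough illustration of that last point, a set of PO parameters could be looked up by the pair (phase, diffractogram), with the structure only used to find its phase. The dictionary layout, the ids, the parameter name and the numbers below are all invented for the sketch.

```python
# Hypothetical layout: PO parameter sets keyed by (phase id, diffractogram id).
po_parameters = {
    ("phase_a", "pattern_1"): {"march_dollase_r": 0.90},
    ("phase_a", "pattern_2"): {"march_dollase_r": 0.85},
}

# Each structure maps to exactly one phase (made-up ids).
structure_to_phase = {"struct_1": "phase_a", "struct_2": "phase_a"}


def po_for(structure_id, diffractogram_id):
    """Find the PO parameters relevant to a structure in a given diffractogram."""
    phase_id = structure_to_phase[structure_id]  # structure -> phase
    return po_parameters[(phase_id, diffractogram_id)]  # (phase, pattern) -> PO


print(po_for("struct_1", "pattern_1"))
print(po_for("struct_2", "pattern_2"))
```

Two structures of the same phase resolve to the same PO set for a given diffractogram, which is why the key involves the phase and not the structure.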
Not sure how you can raise the temperature and have no changes to the structure? In any case, under the proposed definitions a single phase can have multiple structures, and if you want to name the phase differently at some point, that works as well.
At each point in the solid solution, there is a defined structure. Under the proposed definition, you have the flexibility of assigning each structure to a different phase, or to the same phase. The important thing is that the proposed definition doesn't commit you to a particular view of when a phase is no longer the same. It does commit you to only allowing a structure to be associated with one phase. (Of course, even the latter can be worked around by creating a new |
I'm not sure why you say that CIF sees this as two. I'm assuming that the current approach is that a separate As an aside, we haven't exactly bedded down how we want the magnetic structure to relate to the structure as currently defined (ie the bundle of cell, space group, and atomic positions). We can either absorb magnetic structure into |
What about the first. Can we cope with non-unique values of Just FYI:
First, which categories use
There is a preponderance of
Second, which categories use
As long as this is enough to uniquely identify the phase and structure. |
We're going to have to beef up the definitions in I'll try and draw up an example CIF. * Even if it is just to get it right in my head. |
I'm also working on some full examples generated from GSAS-II tutorial data. If the QPA standard is given as a phase id, then you'd have to associate only a single structure with that phase. You could instead give a structure id rather than a phase id, and that structure would be associated with a particular phase. |
I think there are the following types of "project CIFs" generated in GSAS-II:
Not sure we have tutorials covering all of these. Probably all but 3, but that can be generated from the sequential fit tutorial if one only includes the majority phase. There are probably quite a few subcases for 2, if one considers one phase with multiple histograms as different from one histogram with multiple phases, and then cases where not all phases are found in all histograms, and also combined powder/single crystal. |
AcTuAlLy, it's defined as a label for a diffraction data set collected under particular diffraction conditions (see COMCIFS/MultiBlock_Dictionary#17). I think it should label the conditions, so that if many diffractograms are collected under the same set of conditions, then you don't need to repeat yourself. |
Definitely. I think this is where the stress test lies. Taking a temperature-dependent experiment as a baseline (could be time, pressure, magnetic field, any other combination you'd like...)
I don't think we currently have the ability to define a structure or phase that has been co-refined over multiple diffractograms. Is this a thing we want to look at? ( Does core CIF worry about a structure being determined from multiple data sets? |
Well, using The concept of a refinement has not yet been added to core CIF (which means that implicitly the results in a CIF are from a single refinement), so that's the next frontier. You could imagine a pointer in the structure category to a |
Please see below draft of first example: one phase, two measurements. Each measurement is in a separate data block, each set of diffraction conditions is also in a separate data block, all other information is in a single data block. Data blocks are linked using data names linked to Key issue: there is no well-defined value for A better solution is for us to deprecate Notes:
|
I dislike putting the measurement conditions in a separate block from the diffraction pattern data itself. To me they are very much linked and I see little advantage from separating them, so I would go with three blocks here rather than five. Perhaps four, since I like to have something that serves as a TOC. In this case the TOC info can be combined with the Phase block, but with multiple phases, that would need to be free-standing. |
@jamesrhester I think you have a typo, as the X-ray diffrn.id isn't referenced anywhere |
Fixed |
There is indeed no technical reason that values corresponding to the |
And here's an example of Brian's case number 2 with multiple phases, single histogram. There are five blocks: one for each of the two structures*, one for each space group, and one for everything that there is only one of, which in this case is histogram and diffraction conditions. Separating space group from the structure block may seem like overkill in this case, but if you have a sequential fit (the next example I'll post) that means only stating the space group and symops once for each space group. In theory you should also have separate data blocks for each phase, as a phase is distinct from a structure, but in this case those data blocks would contain only the phase identifiers, so I've placed ** remembering that a structure is a combination of cell, space group and atomic positions. Edit: fixed C 2/c space group data block contents.
|
I have mixed feelings on one aspect of this. It is streamlined and elegant, but separating the space group from the phase means that any older code that does not know how to follow the If small, compact, files are the goal, CIF is probably not the answer (but then XML is way worse and that has not stopped anyone from using that). |
Indeed. Fortunately, how the information is distributed over data blocks is quite flexible. The rules are:
I believe Rule 3 is the only practical difference to the status quo for PD data, as exemplified by GSAS-II output. The above rules permit you to literally cut and paste the contents of the space group data blocks into the relevant structure data blocks (changing nothing) and that would still be valid. Such flexibility is not necessarily desirable as it imposes extra burdens on software that has to read all the alternatives, not to mention the legacy issues @briantoby notes. So I think we (the PD standards community) would develop best-practice recommendations for how to distribute information over data blocks. We've already got two recommendations:
and anticipating...
Meanwhile, I'm finding it quite useful to go for "maximum splittage" in these examples as that makes any missing links between categories plainer because it removes the implied link that exists when data names appear in the same data block.
I hope my above comments demonstrate that we can keep legacy codes happy and include references across blocks. While I haven't included block id pointers in the examples, they could also be provided.
Sure, the most important thing is not elegance, but adoption. If none of this is used then we're wasting our time. FWIW I think XML is on the way out, replaced by JSON. |
Note I've just updated the two-phase, one diffractogram example to split the reporting of the phase mass percent to the relevant per-phase data block as dictated by the above rules. |
And here is an example with multiple phases at multiple temperatures, generated from the GSAS-II sequential refinement tutorial, followed by GSAS-II CIF export and lots of editing and rearrangement. Note I've assumed that separating out I have not incorporated any of the newer additions to pdCIF (e.g. preferred orientation) yet. One heuristic for generating these large files is:
|
As there haven't been any objections so far, I'm going to go ahead and submit a PR for |
Core CIF has recently added a few data names for handling more complex datasets that include data collected under different conditions, potentially yielding a variety of structures. These new data names are provided in the multi-block dictionary. The powder dictionary can make use of these.
_diffrn.id
First: new data name _diffrn.id (also found in mmCIF) labels a particular set of experimental conditions (ambient environment, radiation source, crystal specimen). Previously, this information was implicitly linked to a diffractogram by DIFFRN data names appearing in the same data block as the diffractogram. We should make this link explicit by defining _pd_diffractogram.diffrn_id, whose value would refer to the set of diffraction conditions relevant to the diffractogram identified by _pd_diffractogram.id.
_structure.id
Core CIF defines a structure as a combination of the atomic sites, a unit cell, and symmetry. Clearly this is closely related to a crystallographic phase. We should determine the nature of this relationship: either it is
1. _pd_phase.id implies at most one particular _structure.id
2. _structure.id describes at most one specific phase
I suggest (1) is not true, as for each temperature step in a multi-temperature experiment the phase would be considered the same (assuming no phase transitions) but the unit cell would be different. Therefore neither can (3) be true. I believe that (2) is thus a reasonable assertion: any structure that is reported is the structure of a particular phase under particular conditions. This means that the powder dictionary should add a data name _structure.phase_id, identifying which phase the structure relates to. Note that the link between a structure and diffraction conditions is already taken care of by the core data name _structure.diffrn_id.
The above suggestions start to address also the points raised in #164.
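To make the explicit link concrete, here is a minimal Python sketch of how a reader might resolve the conditions for a diffractogram through the proposed _pd_diffractogram.diffrn_id. The in-memory layout, the function name conditions_for, the ids and the temperatures are hypothetical; the sketch assumes that _diffrn.ambient_temperature is the core data name carrying the temperature.

```python
# Sets of diffraction conditions, keyed by their _diffrn.id (made-up values).
diffrn = {
    "cond_300K": {"_diffrn.ambient_temperature": 300},
    "cond_1000K": {"_diffrn.ambient_temperature": 1000},
}

# Diffractograms point at their conditions instead of relying on block placement.
diffractograms = [
    {"_pd_diffractogram.id": "scan_1", "_pd_diffractogram.diffrn_id": "cond_300K"},
    {"_pd_diffractogram.id": "scan_2", "_pd_diffractogram.diffrn_id": "cond_300K"},
    {"_pd_diffractogram.id": "scan_3", "_pd_diffractogram.diffrn_id": "cond_1000K"},
]


def conditions_for(diffractogram_id):
    """Follow _pd_diffractogram.diffrn_id to the matching set of conditions."""
    for row in diffractograms:
        if row["_pd_diffractogram.id"] == diffractogram_id:
            return diffrn[row["_pd_diffractogram.diffrn_id"]]
    raise KeyError(diffractogram_id)


print(conditions_for("scan_2"))
print(conditions_for("scan_3"))
```

Here scan_1 and scan_2 share a single set of conditions, so those conditions are stated once however many diffractograms refer to them.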
Please comment, particularly regarding my understanding of the term "phase".