Linking refinements to the data sets used #344

Open
jamesrhester opened this issue Jan 31, 2023 · 4 comments

@jamesrhester
Contributor

(Issue created so as not to forget it - this was work done as part of thinking about where CELL belongs. I don't think this is an urgent task).

Up until now, the link between a refined structural model and the data it is based on was simply by virtue of being in the same data block. In multi-block scenarios we need to explicitly link a particular model with data. So I propose:

  1. Create _refine.id to identify a refinement.

  2. Create _refine_diffrn.refine_id and _refine_diffrn.diffrn_id: key data names of new REFINE_DIFFRN category, listing the datasets (_diffrn.id) used in the refinement e.g.

loop_
_refine_diffrn.refine_id
_refine_diffrn.diffrn_id
1 xray1
1 neutron1
2 xray1
2 xray2

loop_
_diffrn_radiation.diffrn_id
_diffrn_radiation.type
xray1 xray
xray2 xray
neutron1 neutron

describing two refinements, one which used a neutron and xray dataset and one which used two xray datasets.
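
To make the intended lookup concrete, here is a minimal Python sketch of how a reader might resolve each refinement to its data sets and radiation types. The rows are copied from the two loops above; the function names and the in-memory layout (a list of tuples and a dict) are assumptions for illustration only, not part of any CIF library or of the proposal itself.

```python
from collections import defaultdict

# Rows of the proposed REFINE_DIFFRN loop: (refine_id, diffrn_id).
refine_diffrn = [
    ("1", "xray1"),
    ("1", "neutron1"),
    ("2", "xray1"),
    ("2", "xray2"),
]

# Rows of the DIFFRN_RADIATION loop, keyed on diffrn_id.
diffrn_radiation = {"xray1": "xray", "xray2": "xray", "neutron1": "neutron"}


def datasets_by_refinement(rows):
    """Group diffrn ids by the refinement that used them (hypothetical helper)."""
    grouped = defaultdict(list)
    for refine_id, diffrn_id in rows:
        grouped[refine_id].append(diffrn_id)
    return dict(grouped)


def radiation_by_refinement(rows, radiation):
    """List the distinct radiation types behind each refinement (hypothetical helper)."""
    return {
        refine_id: sorted({radiation[d] for d in diffrn_ids})
        for refine_id, diffrn_ids in datasets_by_refinement(rows).items()
    }


summary = radiation_by_refinement(refine_diffrn, diffrn_radiation)
print(summary)  # {'1': ['neutron', 'xray'], '2': ['xray']}
```

Note that xray1 appears under both refinements, which is why the link has to live in its own two-key category rather than as a single refine_id item on the data set.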

Open question: how would this (if at all) interact with powder diffraction, where the data set is linked to _diffractogram.id rather than _diffrn.id.

@rowlesmr
Collaborator

rowlesmr commented Jul 1, 2023

re powder, it also depends on how you define "refinement".

  1. One diffraction pattern, one or more phases: This is one refinement.
  2. More than one diffraction pattern, all referring to the same one or more phases: This is one refinement.
    • Doesn't matter if mixing X-rays and neutrons, or lab and natl. facility.
  3. More than one diffraction pattern, all referring to different one or more phases, with each diffraction pattern refined independently: This is many refinements.
    • this is just repeating 1. (or possibly 2.) many times
  4. More than one diffraction pattern, all referring to different one or more phases, refined parametrically: Is this one refinement?
    • e.g. I refine a thermal expansion coefficient, and derive cell parameters from this coefficient, which is refined over all diffraction patterns.

@jamesrhester
Contributor Author

jamesrhester commented Jul 5, 2023

This is absolutely an important task.

There's one easy answer: one refinement must include everything that contributes to the calculation of chi^2 (or the quantity being minimised). So if multiple diffraction patterns and phases are involved, then that is one refinement.
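
One possible way to apply this rule mechanically is sketched below in Python: diffraction patterns that share any refined parameter contribute to the same minimised quantity, so they are merged into one refinement. This is only an illustration under that assumption; the function name, the parameter labels and the dict-of-sets input are all hypothetical and not taken from any dictionary or software.

```python
def group_into_refinements(pattern_params):
    """Merge patterns that share at least one refined parameter.

    pattern_params maps a pattern id to the set of refined parameter
    labels that pattern depends on (hypothetical input layout).
    """
    parent = {pattern: pattern for pattern in pattern_params}

    def find(pattern):
        # Follow parent links to the group representative, compressing the path.
        while parent[pattern] != pattern:
            parent[pattern] = parent[parent[pattern]]
            pattern = parent[pattern]
        return pattern

    first_user = {}  # parameter label -> first pattern seen using it
    for pattern, params in pattern_params.items():
        for param in params:
            if param in first_user:
                parent[find(pattern)] = find(first_user[param])
            else:
                first_user[param] = pattern

    groups = {}
    for pattern in pattern_params:
        groups.setdefault(find(pattern), []).append(pattern)
    return sorted(sorted(group) for group in groups.values())


# Case 3 above: each pattern refined independently, nothing shared.
independent = {"p1": {"cell_a_p1"}, "p2": {"cell_a_p2"}}
# Case 4 above: one thermal expansion coefficient refined over all patterns.
parametric = {"p1": {"alpha", "scale_p1"}, "p2": {"alpha", "scale_p2"}, "p3": {"alpha"}}

print(group_into_refinements(independent))  # [['p1'], ['p2']]
print(group_into_refinements(parametric))   # [['p1', 'p2', 'p3']]
```

Under this reading, the parametric case comes out as a single refinement, while the independent patterns stay as separate ones.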

A refinement requires observations and a model. A refinement would be associated with an identifier, which would come from a Set category to enforce no more than one refinement per data block, as that is the current implicit treatment in single crystal/cif_core. One might list all observations (diffractograms, constraints) contributing to the refinement in a separate REFINE_OBS loop.

A model, on the other hand, is the result of a particular refinement. Perhaps we want a separate _model.id to group structure (via #442, perhaps) and restraints into a model, and then the REFINE category simply refers to the model.

Not forgetting restraints and constraints, covered by a separate dictionary but also relevant to particular refinements.

Note these thoughts are relative to cif_core, and haven't touched on powder, for which multiple structures are often refined simultaneously.

@rowlesmr
Collaborator

Open question: how would this (if at all) interact with powder diffraction, where the data set is linked to _diffractogram.id rather than _diffrn.id.

(coming from a powder point-of-view)

_diffrn.id is described as

Unique identifier for a diffraction data set collected under particular diffraction conditions.

It could be better described as

Unique identifier for a set of particular diffraction conditions.

You could then link _diffrn.id and _diffractogram.id through _diffractogram.diffrn_id. I think this way would be preferred, as it keeps the powder implementation in the powder dictionary; _diffrn.diffractogram_id is starting to cross-pollinate.

Other arguments: I feel that _diffrn.id is kind of higher up the foodchain, and diffractogram should refer to it, rather than the other way around. Also, you can have many diffractograms taken at one set of experimental conditions, so this maintains the Setness of both categories.

data_diffraction_conditions
loop_ # just looping for conciseness. Pretend each row is in a different block
_diffrn.id
_diffrn.ambient_temperature
_diffrn.ambient_pressure
A 10 101.3
B 20 101.3
C 50 101.3
D 100 101.3
#...

data_pattern_1
_diffractogram.id 1
_diffractogram.diffrn_id A
#...

data_pattern_2
_diffractogram.id 2
_diffractogram.diffrn_id B
#...

data_pattern_3
_diffractogram.id 3
_diffractogram.diffrn_id B
#...

data_pattern_4
_diffractogram.id 4
_diffractogram.diffrn_id C
#...
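
As a rough check of the many-to-one direction argued for above, the following Python sketch mirrors the example blocks as plain dicts and groups diffractograms by the conditions they point to. The helper name and the data layout are assumptions for illustration; only the ids and values come from the example.

```python
# One entry per _diffrn.id: (ambient_temperature, ambient_pressure) as given above.
diffrn_conditions = {
    "A": (10, 101.3),
    "B": (20, 101.3),
    "C": (50, 101.3),
    "D": (100, 101.3),
}

# One entry per data block: _diffractogram.id -> _diffractogram.diffrn_id.
diffractogram_to_diffrn = {"1": "A", "2": "B", "3": "B", "4": "C"}


def patterns_per_condition(links, conditions):
    """Group diffractogram ids by the diffrn id they reference (hypothetical helper)."""
    unknown = sorted(set(links.values()) - set(conditions))
    if unknown:
        # A diffractogram pointing at an undefined _diffrn.id is a broken link.
        raise KeyError(f"undefined diffrn ids: {unknown}")
    grouped = {diffrn_id: [] for diffrn_id in conditions}
    for pattern_id, diffrn_id in links.items():
        grouped[diffrn_id].append(pattern_id)
    return grouped


grouped = patterns_per_condition(diffractogram_to_diffrn, diffrn_conditions)
print(grouped)  # {'A': ['1'], 'B': ['2', '3'], 'C': ['4'], 'D': []}
```

Each diffractogram carries exactly one diffrn_id, while condition B is shared by two patterns and D by none, so the child-to-parent pointer handles every case without a loop on the diffrn side.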

@jamesrhester
Contributor Author

That _diffractogram.id / _diffrn.id proposal sounds good. A little surprising this isn't in the pdCIF dictionary already.
