-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
28 add md relevant outputs #87
Conversation
@JosePizarro3 @ndaelman-hu Could you do a preliminary brief check of the structure within Right now, all energies fall under Will this work for the quantum case? Looking at the old schema (old quantities that I have not yet converted are commented out in the file) it seems like there are a lot of sub-structures in terms of the energies. How much are these used? Can you conceive a good functionality with flat lists? |
Ok, let me check. I'd like to see how much of the quantum energies were restructured into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some quick takes, not a thorough review.
My main impression is that there's some confusion between classical vs quantum and potential vs energy. Regarding the former, it's very simple: all classical terms have a quantum counterpart. I think it makes more sense to distinguish the systems for which we specify the energy, e.g. electrons, nuclei, both, external.
Also note that magnetism is completely left out.
class ClassicalEnergy(Energy): | ||
""" | ||
Abstract physical property section describing some classical energy of a (sub)system. | ||
""" | ||
|
||
def normalize(self, archive, logger) -> None: | ||
super().normalize(archive, logger) | ||
|
||
if not self.type: | ||
self.type == 'classical' | ||
elif self.type != 'classical': | ||
logger.error( | ||
f"Misidentified type for classical energy." | ||
) | ||
|
||
class QuantumEnergy(Energy): | ||
""" | ||
Abstract physical property section describing some quantum energy of a (sub)system. | ||
""" | ||
|
||
def normalize(self, archive, logger) -> None: | ||
super().normalize(archive, logger) | ||
|
||
if not self.type: | ||
self.type == 'quantum' | ||
elif self.type != 'quantum': | ||
logger.error( | ||
f"Misidentified type for quantum energy." | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tbh, I'm not sure what would constitute the boundary between classical and quantum energies.
Every classical energy has a quantum variant. There are some unique quantum phenomena that don't have a classical counterpart, but it's so few that I would not distinguish at this level.
The only example I can think of now is exchange energy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely understand your point. My reasons for proposing to distinguish between the two were:
- To help clarify the origin of certain energy contributions in the case of a hybrid model. E.g., you may have a QM/MM model and want to store a particular contribution to the total energy from the QM and MM parts of your model separately.
- I will argue against myself here and say that the proper way to do this would be with the method_ref, which is true, but this is a bit opaque for a user. Having the
type
quantity makes it easier to see quickly what level of model one used to get this output.
- I will argue against myself here and say that the proper way to do this would be with the method_ref, which is true, but this is a bit opaque for a user. Having the
- There are certain energy contributions, e.g., kinetic energy, where the standard method of calculation is fundamentally different from a classical or a quantum model.
- Again maybe it is not a good idea to conflate the output and the method too much. In fact this goes against a bunch of my comments where I asked if we should take the reference to a specific calculation method out of the description. So, from this perspective I am on the fence.
- There are many examples where there is not a quantum counterpart to the classical energy. In addition to a few quantum phenomena as you mentioned, in classical simulations it is common to store the contribution to the total energy (usually per particle) from a particular term of the force field.
- E.g., you may want to know the total energy on a particle in a configuration coming from some dihedral potential only.
- In the previous schema it became unclear at times the usage of certain contributions. This is partially due to the inclusion of specific methodology for calculation within the description, but also because I think in some cases people had a certain class of models in mind. So, I was trying to prevent this a little.
- I think this is why you thought there is a confusion between energy and potential. The potential is indeed stored within the FF section of method. But the corresponding energy is the evaluation of that potential or set of potentials for a particular configuration.
All that being said, I definitely am not set on this distinction. Curious to hear your rebuttal given the additional information, and also the thoughts of @Bernadette-Mohr and @JosePizarro3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding your points:
- QM/MM are indeed a mapping between methods and structure. How would you, using the distinction above, distinguish the energy contributions in QM/QM/MM (e.g. CCSD/DFT/Amber) or QM/MM/MM?
- ... QM in the limit of heavy particles converges to CM, without exception. The main difference in maths is that variables and functions in CM become functions (in the Hilbert space) and operators in QM. That is to say, the same physics is presented in higher-order structures. Some new physical phenomena emerge, but all the old ones remain. More importantly, why does the mathematical representation matter here? We're just getting scalars (maybe tensors) in our data.
- Well, that's just plain wrong. At best, these communities are not interested in these contributions, but that's another story. As for your example: the dihedral potential is captured by DFT, just not named as such. ML potentials actually use DFT data to reformulate the physics in terms of the (local) neighborhood.
I agree with your sentiment that a lot of these contributions are model-specific, and there's value in extracting those. Then we need a system to annotate the method terms. I would mentally distinguish it from the contributions of physical phenomena, which are not subordinate to the method used to simulate them. Let's reserve the energies here for what experimentalists could also request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, here's my unqualified opinion:
- Making the different energy contributions storable/accessible is valid, in my opinion, either because they were relevant for the original study or because somebody is interested in that particular aspect when conducting secondary analysis.
- While I agree with @ndaelman-hu that, theoretically, especially QM and MM/MD describe the same physics, each approach has it's strengths and weaknesses.
- We should absolutely provide that information somewhere, but maybe more with methods than with outputs?
One use case that comes to mind is training data for an ML model. If you want to have any faith in your predictions, you should be interested in the underlying distribution or potential issues.
- @JFRudzinski, I'm not sure why you think including the methodology makes things confusing here; streamlining what information we provide where will probably make things clearer, though.
Generally, it seems that the slightly different usage of "potential" vs. "energy" in the different methods is a source of confusion even among us, so I think we should try to prevent this with the users. I remember us discussing something like a "derived" flag, to differentiate quantities that are calculated and aspects that are directly part of or automatically generated by the method. Could something along this line help?
class AngleEnergy(ClassicalEnergy): | ||
""" | ||
Physical property section describing contributions to the potential energy from angle interactions of a (sub)system. | ||
""" | ||
|
||
def normalize(self, archive, logger) -> None: | ||
super().normalize(archive, logger) | ||
|
||
|
||
class DihedralEnergy(ClassicalEnergy): | ||
""" | ||
Physical property section describing contributions to the potential energy from dihedral interactions of a (sub)system. | ||
""" | ||
|
||
def normalize(self, archive, logger) -> None: | ||
super().normalize(archive, logger) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused now, these look more like terms of the potential rather than energy contributions.
Instead, I'd expect rotational and vibrational energies (especially useful in light of the virial theorem), as subtypes of kinetic energy. These modes would classically be described in function of the (dihedral) angle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear I paste my explanation from above:
In classical simulations it is common to store the contribution to the total energy (usually per particle) from a particular term of the force field. E.g., you may want to know the total energy on a particle in a configuration coming from some dihedral potential only. I think this is why you thought there is a confusion between energy and potential. The potential is indeed stored within the FF section of method. But the corresponding energy is the evaluation of that potential or set of potentials for a particular configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the contribution to the total energy (usually per particle) from a particular term of the force field
Yes, in QM we do the same with the Hamiltonian (which may encode more than just the potential). Up to a certain extent, QM codes also report the energy decomposition as such, though not always. The crux is that this is model-specific, by definition. I.e. you're generating a semantic language to describe all physical models. This may be fine, but should then tie back to the method
section.
What I was suggesting is to focus instead on physical processes. I think those are more interoperable and of general interest i. E.g. kinetic energy which breaks apart into translational, rotational, and vibrational is relevant for the virial theorem, entropy, etc. I would therefore make these into subsections.
total energy on a particle in a configuration
Well, this is firstly partitioned by the structure, so you should decide on a mapping there. The rest is method specific data, so I would see to tie to the method definition. Still, one can also analyze (but maybe not easily extract) the dihedral contribution and others in a rotational and vibrational mode.
class ExternalEnergy(ClassicalEnergy): | ||
""" | ||
Physical property section describing contributions to the potential energy from external interactions of a (sub)system. | ||
""" | ||
|
||
def normalize(self, archive, logger) -> None: | ||
super().normalize(archive, logger) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would consider this more of an energy flux, so the dimensionality is slightly different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This term is similar to the contributions coming from specific terms in the force field as defined above. Sometimes you define an external force within the force field (i.e., not linked to a particular atom type + config). Then you may want to again know the contribution to the total energy (e.g., per particle) from this external potential. Perhaps it should be renamed?
So, it's not a flux. Maybe there is a different energy contribution that you are describing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it all depends on where you draw the boundary between your system and the outer world. If the latter interacts with the system, we can describe that as an energy flux, e.g.:
- a thermal bath: heat (a flux of kinetic and radiation energy) is being exchanged.
- an external electric or magnetic field: fluctuations enter the system as EM-waves, i.e. a flux of radiation energy.
Given these examples, we can represent the outer-world in simpler models, abstracting away a lot of complexity. However, I don't think most people consider them part of the system, so, in line with conservation of energy, they interact via a flux.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JFRudzinski: With "external force," do you mean something like, e.g., a biasing potential or a wall in your box? Maybe this makes what we're talking about a bit clearer to @ndaelman-hu.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ow, that makes more sense! Yes, these can't be modeled via flux, as they are more so numerical / statistical tricks instead of physical forces (well, the biasing potential, at least).
I would definitely seek to associate these approaches with the method itself.
class ZeroPointEnergy(QuantumEnergy): | ||
""" | ||
Physical property section describing the zero-point vibrational energy of a (sub)system, | ||
calculated using the method described in zero_point_method. | ||
""" | ||
# ! Someone check this description! | ||
# ? Do we really want to specify the method here? This can't be user-defined? | ||
|
||
def normalize(self, archive, logger) -> None: | ||
super().normalize(archive, logger) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quantum-specific subtype of vibrational energy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you be more specific what you suggest changing in this case? I need your guidance for the quantum contributions (maybe it needs to wait until we figure out whether or not to distinguish Quant/Class though)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name is fine, I'd alter the supertype to ZeroPointEnergy(VibrationalEnergy)
.
This is the lowest vibrational state, which has a non-zero contribution.
class ZeroTemperatureEnergy(QuantumEnergy): | ||
""" | ||
Physical property section describing the total energy of a (sub)system extrapolated to $T=0$, based on a free-electron gas argument. | ||
""" | ||
# ! Someone check this description! | ||
# ? Do we really want to specify the method here? This can't be user-defined? | ||
|
||
def normalize(self, archive, logger) -> None: | ||
super().normalize(archive, logger) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be introduced in the context of a workflow. Would remove it here.
class NuclearRepulsionEnergy(QuantumEnergy): | ||
""" | ||
Physical property section describing the nuclear-nuclear repulsion energy of a (sub)system. | ||
""" | ||
|
||
def normalize(self, archive, logger) -> None: | ||
super().normalize(archive, logger) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just electrostatics for the nuclei. See my take above.
# van_der_waals = SubSection( | ||
# sub_section=EnergyEntry.m_def, | ||
# description=""" | ||
# Contains the value and information regarding the Van der Waals energy. A multiple | ||
# occurence is expected when more than one van der Waals methods are defined. The | ||
# van der Waals kind should be specified in Energy.kind | ||
# """, | ||
# ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can both refer to a set of methodologies and the Casimir effect.
# double_counting = SubSection( | ||
# sub_section=EnergyEntry.m_def, | ||
# categories=[FastAccess], | ||
# description=""" | ||
# Double counting correction when performing Hubbard model calculations. | ||
# """, | ||
# ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While not a physical energy, per se, it is a term in the Hamiltonian used for corrections. This begs the question: do we want to distinguish Hamiltonian terms or physical phenomena?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just repeating myself from above: We should definitely NOT store the Hamiltonian terms here. That goes in the method. However, we may store the evaluation of the Hamiltonian for particular configurations of particles. This should be more clear though for sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just repeating myself from above: We should definitely NOT store the Hamiltonian terms here
I agree with you (see above), but I have the impression that you were doing that anyhow here.
# fermi = Quantity( | ||
# type=np.dtype(np.float64), | ||
# shape=[], | ||
# unit='joule', | ||
# description=""" | ||
# Fermi energy (separates occupied from unoccupied single-particle states) | ||
# """, | ||
# categories=[EnergyTypeReference, EnergyValue], | ||
# ) | ||
|
||
# highest_occupied = Quantity( | ||
# type=np.dtype(np.float64), | ||
# unit='joule', | ||
# shape=[], | ||
# description=""" | ||
# The highest occupied energy. | ||
# """, | ||
# ) | ||
|
||
# lowest_unoccupied = Quantity( | ||
# type=np.dtype(np.float64), | ||
# unit='joule', | ||
# shape=[], | ||
# description=""" | ||
# The lowest unoccupied energy. | ||
# """, | ||
# ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are already accommodated else where in the schema. May be removed here.
# # TODO I suggest ewald is moved to "long range" under electrostatic->energyentry, unless there is some other usage I am misunderstanding | ||
# ewald = SubSection( | ||
# sub_section=EnergyEntry.m_def, | ||
# description=""" | ||
# Contains the value and information regarding the Ewald energy. | ||
# """, | ||
# ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Subtype of electrostatic energy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a major question that I have: How should we deal with sub-contributions? One option is to keep a flat list (flat inside total_energy.contributions at least), and somehow distinguish sub contributions with names and descriptions.
The other option is to nest. If we do this, I think we should be restrictive about the nesting, i.e., specify energy sub-contributions that are relevant for a particular contribution. We should not allow free recursive nesting or sub-contributions that don't make sense theoretically/methodologically
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or you can have both options at once if you use inheritance.
Ok, I took a look on the code. I can send later on some notes on the DFT total energy, and how this could be corrected in other approaches. I also agree with what Nathan said on the confusion of @JFRudzinski your approach is good, so here I am just giving an alternative idea, also for @ndaelman-hu @Bernadette-Mohr about how to deal with contributions to PhysicalProperties. Regarding class definitions, these changes show how annoying is to define multiple contributions using inheritance. In my opinion, I think contributions could be simplified to an abstract class and some identifier (a string/MEnum) used both for
My proposal would be thus something simpler, a string or MEnum (see the abstract class class BaseTotalEnergy(PhysicalProperty):
"""
A base section class defining the abstract concept of `TotalEnergy`. This section is useful when defining contribution to the `TotalEnergy`, as well as the `TotalEnergy` section itself.
"""
value = Quantity(
type=np.float64,
unit='joule',
description="""
The value of the energy.
""",
)
def normalize(self, archive, logger) -> None:
super().normalize(archive, logger)
class ClassicalTotalEnergy(BaseTotalEnergy):
"""
A base section defining the concept of classical `TotalEnergy`. This section is useful for categorizing TotalEnergy in terms of classical or quantum approaches.
"""
name = Quantity(type=MEnum('kinetic', 'potential', 'intermolecular'...),) # more strings here
def normalize(self, archive, logger) -> None:
super().normalize(archive, logger)
class QuantumTotalEnergy(BaseTotalEnergy):
"""
A base section defining the concept of classical `TotalEnergy`. This section is useful for categorizing TotalEnergy in terms of classical or quantum approaches.
"""
name = Quantity(type=MEnum('kinetic', 'Hartree', 'xc', ...),) # more strings here
def normalize(self, archive, logger) -> None:
super().normalize(archive, logger)
class TotalEnergy(BaseTotalEnergy):
"""
... TAKE DESCRIPTION FROM TAXONOMY
"""
iri = '...'
classical_contributions = SubSection(
sub_section=ClassicalTotalEnergy.m_def, repeats=True
)
quantum_contributions = SubSection(
sub_section=QuantumTotalEnergy.m_def, repeats=True
)
def normalize(self, archive, logger) -> None:
super().normalize(archive, logger) The question here is also, at the end of the day, how important are these contributions? I think we can ease the taxonomy by not overclassifying these kind of things. Ah, and I edit to say that the description of the property should then reflect the contributions and what are these. I think we should bring this last point (if agreed) to the taxonomy TF. |
@ndaelman-hu I found this video which might help as a first approach: https://youtu.be/5mxVR1ikQQo?si=Cfxzs0FlKyqrkFuv&t=1424 But I am sure there are better notes somewhere, I just casually browsed Youtube. |
I definitely like the simplicity of this approach. I guess my only concern is losing the descriptions of the contributions. Could these actually simply be set during normalization? Since there will be no description coming from the taxonomy for these contributions? Oh ok, I see now that you suggested adding these to the taxonomy. This would work for physical contributions but not really specific model contributions (see my descriptions about evaluation of terms in the Hamiltonian above). I think for the later we need to add descriptions within the schema. (tbh I also do not know how important these contributions are in general. But I think we still want to have some sort of robust scheme for dealing with them because they will be used at least in some niche cases) |
685fe98
to
2c2f2fd
Compare
Sorry if this is confusing, but the easiest way to apply the new structure was to simply manually edit/add the files starting from the head of develop. So, this PR got killed in the process. We can keep discussing the energies here to close out the open discussion. But after that I created a new PR: #91 |
Yeah sorry, indeed it might be better to do that. |
it was actually quite easy to amend the structure, it's just that I lost the connection to my other changes, but tbh i don't really care since all this is very preliminary anyway |
@Bernadette-Mohr Thank you very much for your input, it is very valuable. Nathan and I just had a meeting in an attempt to settle some aspects of the above discussions. Here is a short outline of our proposal: class Energy(PhysicalProperty):
"""
Base class for all energy properties
"""
value = Quantity(
type=np.float64,
unit='joule',
description="""
The value of the energy.
""",
)
def normalize(self, archive, logger) -> None:
super().normalize(archive, logger)
class TotalEnergy(Energy):
"""
Just the total energy
"""
def normalize(self, archive, logger) -> None:
super().normalize(archive, logger)
class PropertyContribution(PhysicalProperty):
"""
Base class for defining contribution link to some method. Will be used at least for energies and forces
"""
method_contribution_ref = Quantity(
type=MethodContribution,
description="""
""",
a_eln=ELNAnnotation(component='ReferenceEditQuantity'),
)
def normalize(self, archive, logger) -> None:
super().normalize(archive, logger)
class EnergyContribution(PropertyContribution, Energy):
"""
Energy property linked to method contribution
"""
name = Quantity(type=str,) # may be set explicitly or via method_contr_ref
def normalize(self, archive, logger) -> None:
super().normalize(archive, logger) Let me know if there are any comments or questions. My plan is to implement this structure today and then switch to #91 for another review |
new PR for #28 since the first one is somehow broken