Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'develop' of https://github.com/nomad-coe/nomad-schema-plugin-simulation-data into develop
Showing 42 changed files with 2,793 additions and 295 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -56,4 +56,3 @@ jobs: | |
- uses: chartboost/ruff-action@v1 | ||
with: | ||
args: "format . --check --verbose" | ||
version: 0.1.8 |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -122,6 +122,7 @@ celerybeat.pid | |
# Environments | ||
.env | ||
.venv | ||
.pyenv | ||
env/ | ||
venv/ | ||
ENV/ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -131,3 +131,12 @@ If using VSCode, you can add the following snippet to your `.vscode/launch.json` | |
where `${workspaceFolder}` refers to the NOMAD root. | ||
|
||
The settings configuration file `.vscode/settings.json` automatically applies the linting when a file is saved.
|
||
|
||
## Main contributors | ||
| Name | E-mail | Topics | Github profiles | | ||
|------|------------|--------|-----------------| | ||
| Dr. Nathan Daelman | [[email protected]](mailto:[email protected]) | DFT, Precision | [@ndaelman-hu](https://github.com/ndaelman-hu) | | ||
| Dr. Bernadette Mohr | [[email protected]](mailto:[email protected]) | MD, FF | [@Bernadette-Mohr](https://github.com/Bernadette-Mohr) | | ||
| Dr. José M. Pizarro | [[email protected]](mailto:[email protected]) | GW, DMFT, BSE | [@JosePizarro3](https://github.com/JosePizarro3) | | ||
| Dr. Joseph F. Rudzinski (**Coordinator**) | [[email protected]](mailto:[email protected]) | General | [@JFRudzinski](https://github.com/JFRudzinski) | |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,17 +4,16 @@ NOMAD is an open source project that warmly welcomes community projects, contrib | |
|
||
You can reach us through different channels. You can send us an email directly using the main contributors list:
|
||
!!! info "Main contributors" | ||
| Name | E-mail | Topics | | ||
|------|------------|--------| | ||
| Dr. Nathan Daelman | [[email protected]](mailto:[email protected]) | DFT, parsers, normalizers | | ||
| Dr. Alvin Noe Ladines | [[email protected]](mailto:[email protected]) | Parsers, workflows | | ||
| Dr. José M. Pizarro | [[email protected]](mailto:[email protected]) | GW, DMFT, BSE, parsers, workflows, normalizers | | ||
| Dr. Joseph F. Rudzinski (**Coordinator**) | [[email protected]](mailto:[email protected]) | MD, parsers, workflows, normalizers | | ||
| Name | E-mail | Topics | Github profiles | | ||
|------|------------|--------|-----------------| | ||
| Dr. Nathan Daelman | [[email protected]](mailto:[email protected]) | DFT, Precision | [@ndaelman-hu](https://github.com/ndaelman-hu) | | ||
| Dr. Bernadette Mohr | [[email protected]](mailto:[email protected]) | MD, FF | [@Bernadette-Mohr](https://github.com/Bernadette-Mohr) | | ||
| Dr. José M. Pizarro | [[email protected]](mailto:[email protected]) | GW, DMFT, BSE | [@JosePizarro3](https://github.com/JosePizarro3) | | ||
| Dr. Joseph F. Rudzinski (**Coordinator**) | [[email protected]](mailto:[email protected]) | General | [@JFRudzinski](https://github.com/JFRudzinski) | | ||
|
||
|
||
Alternatively, you can also: | ||
|
||
- Open an issue in the [general NOMAD Github project](https://github.com/nomad-coe/nomad), or in one of the [sub-projects](https://github.com/nomad-coe/nomad/tree/develop/dependencies/parsers) related with specific parsers. Our Github profile tags are [@ndaelman-hu](https://github.com/ndaelman-hu), [@ladinesa](https://github.com/ladinesa), [@JosePizarro3](https://github.com/JosePizarro3), and [@JFRudzinski](https://github.com/JFRudzinski). | ||
- Write us in the [NOMAD MatSci forum](https://matsci.org/c/nomad/32). Our tags there are @NateD, @ladinesa, @JosePizarro, and @JFRudzinski. | ||
- Send an email to [[email protected]](mailto:support@nomad-lab.eu). Please, add in the subject "ATTN - Area C". | ||
- Open an [**issue**](https://github.com/nomad-coe/nomad-schema-plugin-simulation-data/issues) in the [Github project](https://github.com/nomad-coe/nomad-schema-plugin-simulation-data/), and tag any of us. | ||
- Join the [Discord channel](https://discord.gg/Gyzx3ukUw8) and ask us there directly. | ||
- If you are included as a contributor in the Github project, you can open new [**discussions**](https://github.com/nomad-coe/nomad-schema-plugin-simulation-data/discussions) regarding a new data schema or modelling you want to see covered. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,112 @@ | ||
# `Simulation` base section | ||
|
||
<!-- | ||
Improve these paragraphs once `Program` and `BaseSimulation` are integrated in `basesections.py` | ||
---> | ||
In NOMAD, all the simulation metadata is defined in the `Simulation` section. You can find its Python schema definition in [src/nomad_simulations/general.py](https://github.com/nomad-coe/nomad-schema-plugin-simulation-data/blob/develop/src/nomad_simulations/general.py). This section will appear under the `data` section for the [*archive*](https://nomad-lab.eu/prod/v1/staging/docs/reference/glossary.html#archive) metadata structure of each [*entry*](https://nomad-lab.eu/prod/v1/staging/docs/reference/glossary.html#entry). | ||
|
||
The `Simulation` section inherits from a _base section_ `BaseSimulation`. In NOMAD, a set of [base sections](https://nomad-lab.eu/prod/v1/staging/docs/howto/customization/base_sections.html) derived from the [Basic Formal Ontology (BFO)](https://basic-formal-ontology.org/) are defined. We used them to define `BaseSimulation` as an [`Activity`](http://purl.obolibrary.org/obo/BFO_0000015). The UML diagram is: | ||
|
||
<div class="click-zoom"> | ||
<label> | ||
<input type="checkbox"> | ||
<img src="../assets/simulation_base.png" alt="Simulation base section diagram." width="80%" title="Click to zoom in"> | ||
</label> | ||
</div> | ||
|
||
`BaseSimulation` contains the general information about the `Program` used, as well as general times of the simulation, e.g., the datetime at which it started (`datetime`) and ended (`datetime_end`). `Simulation` contains further information about the specific input and output sections ([see below](#sub-sections-in-simulation)). The detailed UML diagram of quantities and functions defined for `Simulation` is thus:
|
||
<div class="click-zoom"> | ||
<label> | ||
<input type="checkbox"> | ||
<img src="../assets/simulation.png" alt="Simulation quantities and functions UML diagram." width="50%" title="Click to zoom in"> | ||
</label> | ||
</div> | ||
|
||
??? question "Notation for the section attributes in the UML diagram" | ||
We include the information for each attribute / quantity after its name. The notation is:
|
||
<name-of-quantity>: <type-of-quantity>, <units-of-quantity> | ||
|
||
Thus, `cpu1_start: np.float64, s` means that there is a quantity named `'cpu1_start'` of type `numpy.float64` and whose units are `'s'` (seconds). | ||
We also indicate the existence of sub-sections by writing their names in bold, i.e.:
|
||
<name-of-sub-section>: <sub-section-definition> | ||
|
||
E.g., there is a sub-section under `Simulation` named `'model_method'` whose section definition can be found in the `ModelMethod` section. We will represent this sub-section containment in more complex UML diagrams in the future using the containment arrow (see below for [an example using `Program`](#program)).
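
To make the notation concrete, here is a minimal sketch in plain Python that splits a diagram label into its parts. The helper `parse_quantity_notation` and the `QuantityNotation` tuple are hypothetical names used only for illustration; they are not part of NOMAD or of this plugin.

```python
from typing import NamedTuple, Optional


class QuantityNotation(NamedTuple):
    """Hypothetical container for one label of the UML diagram."""

    name: str
    type_name: str
    unit: Optional[str]


def parse_quantity_notation(text: str) -> QuantityNotation:
    """Split `<name>: <type>, <units>` into its parts (the units are optional)."""
    name, _, rest = text.partition(':')
    type_name, _, unit = rest.partition(',')
    return QuantityNotation(name.strip(), type_name.strip(), unit.strip() or None)


# A quantity label carries a name, a type, and units.
quantity = parse_quantity_notation('cpu1_start: np.float64, s')
# A sub-section label carries a name and a section definition, but no units.
sub_section = parse_quantity_notation('model_method: ModelMethod')

print(quantity)
print(sub_section)
```

The sub-section case simply reuses the same split and leaves the unit empty, which mirrors how the two kinds of labels differ only in what follows the colon.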
|
||
We use double inheritance from `EntryData` in order to populate the `data` section in the NOMAD archive. All of the base sections discussed here are subject to the [public normalize function](normalize.md) in NOMAD. The private function `set_system_branch_depth()` is related to the [ModelSystem base section](model_system/model_system.md).
|
||
## Main sub-sections in `Simulation` {#sub-sections-in-simulation} | ||
|
||
The `Simulation` base section is composed of 4 main sub-sections: | ||
|
||
1. `Program`: contains all the program information, e.g., `name` of the program, `version`, etc. | ||
2. `ModelSystem`: contains all the system information about geometrical positions of atoms, their states, simulation cells, symmetry information, etc. | ||
3. `ModelMethod`: contains all the methodological information, and it is divided in two main aspects: the mathematical model or approximation used in the simulation (e.g., `DFT`, `GW`, `ForceFields`, etc.) and the numerical settings used to compute the properties (e.g., meshes, self-consistent parameters, basis sets settings, etc.). | ||
4. `Outputs`: contains all the output properties, as well as references to the `ModelSystem` used to obtain such properties. It might also contain information which will populate `ModelSystem` (e.g., atomic occupations, atomic moments, crystal field energies, etc.). | ||
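
The composition described in this list can be pictured with a small sketch built from plain Python dataclasses. These classes are hypothetical stand-ins and not the NOMAD sections themselves: the attribute names `model_system`, `outputs`, `label`, and `model_system_ref`, as well as the choice of lists for three of the sub-sections, are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Program:
    name: Optional[str] = None
    version: Optional[str] = None


@dataclass
class ModelSystem:
    label: Optional[str] = None  # hypothetical attribute


@dataclass
class ModelMethod:
    name: Optional[str] = None  # e.g. 'DFT', 'GW'


@dataclass
class Outputs:
    # Outputs keep a reference to the system they were computed for.
    model_system_ref: Optional[ModelSystem] = None


@dataclass
class Simulation:
    program: Optional[Program] = None
    # Assumption: these three sub-sections can hold several instances.
    model_system: List[ModelSystem] = field(default_factory=list)
    model_method: List[ModelMethod] = field(default_factory=list)
    outputs: List[Outputs] = field(default_factory=list)


system = ModelSystem(label='bulk Si')
simulation = Simulation(program=Program(name='SUPERCODE', version='7.0'))
simulation.model_system.append(system)
simulation.model_method.append(ModelMethod(name='DFT'))
simulation.outputs.append(Outputs(model_system_ref=system))
```

The point of the sketch is only the containment: `Simulation` owns the four sub-sections, and `Outputs` points back to a `ModelSystem` instead of copying it.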
|
||
!!! note "Self-consistent steps, SinglePoint entries, and more complex workflows." | ||
The minimal unit for storing data in the NOMAD archive is an [*entry*](https://nomad-lab.eu/prod/v1/staging/docs/reference/glossary.html#entry). In the context of simulation data, an entry may contain data from a calculation on an individual system configuration (e.g., a single-point DFT calculation) using **only** the above-mentioned sections of the `Simulation` section. Information from the self-consistent iterations used to converge properties for this configuration is also contained within these sections.
|
||
More complex calculations that involve multiple configurations require the definition of a *workflow* section within the archive. Depending on the situation, the information from individual workflow steps may be stored within a single or multiple entries. For example, for efficiency, the data from workflows involving a large number of configurations, e.g., molecular dynamics trajectories, are stored within a single entry. Other standard workflows store the single-point data in separate entries, e.g., a `GW` calculation is composed of a `DFT SinglePoint` entry and a `GW SinglePoint` entry. Higher-level workflows, which simply connect a series of standard or custom workflows, are typically stored as a separate entry. You can check the [NOMAD simulations workflow schema](https://github.com/nomad-coe/nomad-schema-plugin-simulation-workflow) for more information.
|
||
The following schematic is a simplified representation of the `Simulation` section (note that the arrows here are a simple way of visually defining _inputs_ and _outputs_):
|
||
<div class="click-zoom"> | ||
<label> | ||
<input type="checkbox"> | ||
<img src="../assets/simulation_composition.png" alt="Simulation composition diagram." width="90%" title="Click to zoom in"> | ||
</label> | ||
</div> | ||
|
||
### `Program` {#program} | ||
|
||
The `Program` base section contains all the information about the program / software / code used to perform the simulation. We consider it to be a [`(Continuant) Entity`](http://purl.obolibrary.org/obo/BFO_0000002) and contained within `BaseSimulation` as a sub-section. The detailed UML diagram is: | ||
|
||
<div class="click-zoom"> | ||
<label> | ||
<input type="checkbox"> | ||
<img src="../assets/program.png" alt="Program quantities and functions UML diagram." width="75%" title="Click to zoom in"> | ||
</label> | ||
</div> | ||
|
||
|
||
When [writing a parser](https://nomad-lab.eu/prod/v1/staging/docs/howto/customization/parsers.html), we recommend starting by instantiating the `Program` section and populating its quantities, in order to get acquainted with the NOMAD parsing infrastructure.
|
||
For example, imagine we have a file which we want to parse with the following information: | ||
```txt
! * * * * * * *
! Welcome to SUPERCODE, version 7.0
...
```
|
||
We can set the program `name` and parse its `version` by matching the text with regular expressions (see, e.g., the [Wikipedia page on regular expressions, also called _regex_](https://en.wikipedia.org/wiki/Regular_expression)):
|
||
```python
from nomad.parsing.file_parser import TextParser, Quantity
from nomad_simulations import Simulation, Program


class SUPERCODEParser:
    """
    Class responsible for populating the NOMAD `archive` from the files given by a
    SUPERCODE simulation.
    """

    def parse(self, filepath, archive, logger):
        # Match the version number that follows the word "version" in the mainfile
        output_parser = TextParser(
            quantities=[
                Quantity('program_version', r'version *([\d\.]+) *', repeats=False)
            ]
        )
        output_parser.mainfile = filepath

        simulation = Simulation()
        simulation.program = Program(
            name='SUPERCODE',
            version=output_parser.get('program_version'),
        )
        # assign `Simulation` as the `archive.data` section
        # (`data` is a single sub-section, not a list)
        archive.data = simulation
```
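
Before wiring the pattern into a parser, it can help to check it against the sample file with Python's standard `re` module. The sketch below does only that: it reuses the same regular expression as the `program_version` quantity and does not involve `TextParser`, so any post-processing that NOMAD applies to the matched string is not covered here. The variable names are illustrative.

```python
import re

# The example file content shown above.
sample = """! * * * * * * *
! Welcome to SUPERCODE, version 7.0
...
"""

# Same pattern as the `program_version` quantity in the parser.
version_pattern = re.compile(r'version *([\d\.]+) *')

match = version_pattern.search(sample)
# The capture group holds the digits and dots that follow "version".
program_version = match.group(1) if match else None

print(program_version)
```

If the file does not contain a matching line, `program_version` stays `None`, which is a convenient case to handle explicitly before populating `Program`.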
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# How to use the `Simulation` schema | ||
|
||
!!! warning | ||
This page is still under construction. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,17 +3,9 @@ | |
<div id="cy"></div> | ||
--> | ||
|
||
**Welcome to the NOMAD documentation for the Schema developed for Computational Materials Scientists**, where you can find information about how to use the NOMAD standard schema for your own simulations. | ||
**Welcome to the NOMAD documentation for the Schema developed for Computational Materials Scientists**, where you can find information about how to use the NOMAD schema definition to store the data output by your simulations. | ||
This project contains all the information about the main base sections and their `SubSections` and `Quantities` relevant for simulations. We propose here a general schema which could then be used as a basis to build more specific schemas. | ||
|
||
NOMAD is a free open-source data management platform for Materials Science which follows the F.A.I.R. (Findable, Accessible, Interoperable, and Reusable) principles. This documentation page is a part of the more [general NOMAD documentation](https://nomad-lab.eu/prod/v1/staging/docs/), and more specifically, a part on the usage of [NOMAD base sections](https://nomad-lab.eu/prod/v1/staging/docs/howto/customization/base_sections.html). | ||
NOMAD is a free open-source data management platform for Materials Science which follows the F.A.I.R. (Findable, Accessible, Interoperable, and Reusable) principles. This documentation page is part of the more [general NOMAD documentation](https://nomad-lab.eu/prod/v1/staging/docs/), as well as of the documentation on the usage of [NOMAD base sections](https://nomad-lab.eu/prod/v1/staging/docs/howto/customization/base_sections.html).
|
||
|
||
|
||
!!! info "Main contributors" | ||
Dr. Nathan Daelman, [[email protected]](mailto:[email protected]) | ||
|
||
Dr. Alvin Noe Ladines, [[email protected]](mailto:[email protected]) | ||
|
||
Dr. José M. Pizarro, [[email protected]](mailto:[email protected]) | ||
|
||
Dr. Joseph F. Rudzinski, [[email protected]](mailto:[email protected]) | ||
When designing the sections, we follow [SOLID principles](https://www.geeksforgeeks.org/solid-principle-in-programming-understand-with-real-life-examples/) for object-oriented programming. Throughout this documentation, we will use [UML diagrams](https://en.wikipedia.org/wiki/Class_diagram), both in a simplified and in a detailed manner, to draw the schema relationships.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# `ModelMethod` | ||
|
||
!!! warning | ||
This page is still under construction. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# `ModelSystem` | ||
|
||
!!! warning | ||
This page is still under construction. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
# The `normalize()` function | ||
|
||
Each base section defined using the NOMAD schema has a set of public functions which can be used at any moment when reading and parsing files in NOMAD. The `normalize(archive, logger)` function is a special case of such functions, which warrants an in-depth description. | ||
|
||
This function is run within the NOMAD infrastructure by the [`MetainfoNormalizer`](https://github.com/nomad-coe/nomad/blob/develop/nomad/normalizing/metainfo.py) in the following order: | ||
|
||
1. A child section's `normalize()` function is run before its parent's `normalize()` function.
2. For sibling sections, the `normalize()` functions are executed in order of increasing `normalizer_level` attribute. If `normalizer_level` is not set, or if it is the same for two different sections, the order is established by the order in which the sub-section attributes are defined in the parent section.
3. Using `super().normalize(archive, logger)` runs the `normalize()` function of the inherited section.
|
||
Let's see some examples. Imagine having the following `Section` and `SubSection` structure: | ||
|
||
```python
from nomad.datamodel.data import ArchiveSection
from nomad.metainfo import SubSection


class Section1(ArchiveSection):
    normalizer_level = 1

    def normalize(self, archive, logger):
        # some operations here
        pass


class Section2(ArchiveSection):
    normalizer_level = 0

    def normalize(self, archive, logger):
        super().normalize(archive, logger)
        # Some operations here or before `super().normalize(archive, logger)`


class ParentSection(ArchiveSection):
    sub_section_1 = SubSection(sub_section=Section1.m_def, repeats=False)

    sub_section_2 = SubSection(sub_section=Section2.m_def, repeats=True)

    def normalize(self, archive, logger):
        super().normalize(archive, logger)
        # Some operations here or before `super().normalize(archive, logger)`
```
|
||
Now, `MetainfoNormalizer` will be run on the `ParentSection`. Applying **rule 1**, the `normalize()` functions of the `ParentSection`'s children are executed first. The order of these functions is established by **rule 2** with the `normalizer_level` attribute, i.e., all the `Section2` `normalize()` functions are run first (note that `sub_section_2` is a list of sections), then `Section1.normalize()`. The order of execution will therefore be:
|
||
1. `Section2.normalize()` | ||
2. `Section1.normalize()` | ||
3. `ParentSection.normalize()` | ||
|
||
If we do not assign a value to `Section1.normalizer_level` and `Section2.normalizer_level`, `Section1.normalize()` will run before `Section2.normalize()`, due to the order of the `SubSection` attributes in `ParentSection`. The order in this case will thus be:
|
||
1. `Section1.normalize()` | ||
2. `Section2.normalize()` | ||
3. `ParentSection.normalize()` | ||
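
Both orderings can be reproduced with a small stand-alone model of rules 1 and 2. This is a sketch in plain Python and not the `MetainfoNormalizer` implementation: the `Node` class and the `normalization_order` function are hypothetical, an unset `normalizer_level` is assumed to behave like `0`, and the repeating `sub_section_2` is reduced to a single instance for brevity.

```python
from typing import List, Optional


class Node:
    """Hypothetical stand-in for a section: a name, a level, and child sections."""

    def __init__(
        self,
        name: str,
        normalizer_level: int = 0,
        children: Optional[List['Node']] = None,
    ):
        self.name = name
        self.normalizer_level = normalizer_level
        self.children = children or []  # kept in definition order


def normalization_order(node: Node) -> List[str]:
    """Return the section names in the order their `normalize()` would run."""
    order: List[str] = []
    # Rule 2: siblings are sorted by level; `sorted` is stable, so sections
    # with equal levels keep the order in which they were defined.
    for child in sorted(node.children, key=lambda c: c.normalizer_level):
        order.extend(normalization_order(child))
    # Rule 1: the parent is normalized only after all of its children.
    order.append(node.name)
    return order


with_levels = Node('ParentSection', children=[Node('Section1', 1), Node('Section2', 0)])
without_levels = Node('ParentSection', children=[Node('Section1'), Node('Section2')])

print(normalization_order(with_levels))
print(normalization_order(without_levels))
```

Relying on a stable sort is what makes the tie-breaking by definition order come for free, without a second sorting key.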
|
||
By inspecting the `normalize()` functions and applying **rule 3**, we can establish whether `ArchiveSection.normalize()` will be run or not. In `Section1.normalize()`, it will not, while in the other sections, `Section2` and `ParentSection`, it will.
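
Rule 3 is ordinary Python inheritance, so it can be illustrated without NOMAD. In the sketch below, `BaseSection` is a hypothetical stand-in for `ArchiveSection`, and the `calls` list only records which functions were executed.

```python
calls = []


class BaseSection:
    """Hypothetical stand-in for `ArchiveSection`."""

    def normalize(self, archive, logger):
        calls.append(f'BaseSection.normalize via {type(self).__name__}')


class Section1(BaseSection):
    def normalize(self, archive, logger):
        # No `super().normalize(...)` call: the inherited function is skipped.
        calls.append('Section1.normalize')


class Section2(BaseSection):
    def normalize(self, archive, logger):
        # The inherited function runs first, then the section's own operations.
        super().normalize(archive, logger)
        calls.append('Section2.normalize')


for section in (Section1(), Section2()):
    section.normalize(archive=None, logger=None)

print(calls)
```

Placing `super().normalize(archive, logger)` at the start or at the end of the method decides whether the inherited operations run before or after the section's own.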
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# `Outputs` | ||
|
||
!!! warning | ||
This page is still under construction. |