From 367efba5e71314ff3eb815209e298305ecb043e4 Mon Sep 17 00:00:00 2001 From: Axel Huebl Date: Mon, 8 Mar 2021 15:28:03 -0800 Subject: [PATCH 1/8] IterationEncoding: stepBased Add standard guidance for stepBased iteration encoding. --- STANDARD.md | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/STANDARD.md b/STANDARD.md index b997d02..8e6add8 100644 --- a/STANDARD.md +++ b/STANDARD.md @@ -48,7 +48,9 @@ for changes in keywords). Hierarchy of the Data File -------------------------- -The used hierarchical data file format must provide the capability to +For simplicity, we call the storage concept of a specific data format that implements the openPMD hierarchy "files", even if they are implemented in-memory or by other means. + +The used hierarchical data format must provide the capability to - create groups and sub-groups (in-file directories) - create multi-dimensional, homogeneous array-based data structures @@ -85,7 +87,10 @@ Each file's *root* group (path `/`) must at least contain the attributes: to create a real path from it replace all occurrences of `%T` with the integer value of the iteration, e.g., `/data/%T` becomes `/data/100` - - allowed value: fixed to `/data/%T/` for this version of the standard + - allowed values: + - see *Iterations and Time Series* below + - for `fileBased` and `groupBased`, this is fixed to `/data/%T/` + - for `stepBased` this is fixed to `/data/` - note: all the data that is formatted according to the present standard (i.e. both the meshes and the particles) is to be stored within a path of the form given by `basePath` (e.g. in @@ -195,9 +200,8 @@ standard: Iterations and Time Series -------------------------- -Iterations can be encoded in either the file name of each master-file of a -time step or in groups of the same file. (Here, an *iteration* refers -to a single simulation cycle.) 
+Iterations can be encoded in either the file name of each individual file, in groups of the same file, or in data sets & attributes (with supported data formats). +(Here, an *iteration* might refer to a single measurement or simulation cycle.) The chosen style shall not vary within a related set of iterations. @@ -212,6 +216,7 @@ Each file's *root* group (path `/`) must further define the attributes: - allowed values: - `fileBased` (multiple files) - `groupBased` (one file) + - `stepBased` (one file with internally encoding, if supported by the data format) - `iterationFormat` - type: *(string)* @@ -225,17 +230,16 @@ Each file's *root* group (path `/`) must further define the attributes: - examples: - for `fileBased`: - `filename_%T.h5` (without file system directories) - - for `groupBased`: + - for `groupBased`: (fixed value) - `/data/%T/` (must be equal to and encoded in the `basePath`) + - for `stepBased`: (fixed value) + - `slowest varying index` Required Attributes for the `basePath` -------------------------------------- -In addition to holding information about the iteration, each series of -files (`fileBased`) or series of groups (`groupBased`) should have -attributes that describe the current time and the last -time step. +In addition to holding information about the iteration, each series of files (`fileBased`), series of groups (`groupBased`) or internally encoded iterations (`stepBased`) should have attributes that describe the current time and the last time step. 
- `time` - type: *(floatX)* From c81864ac6d95a4ebb8307ae576d746bef9ad3d41 Mon Sep 17 00:00:00 2001 From: Axel Huebl Date: Mon, 8 Mar 2021 17:13:39 -0800 Subject: [PATCH 2/8] ADIOS: New Attribute, Variable & stepbased Encoding --- FORMAT_ADIOS.md | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/FORMAT_ADIOS.md b/FORMAT_ADIOS.md index 30ca3fe..ec09fd3 100644 --- a/FORMAT_ADIOS.md +++ b/FORMAT_ADIOS.md @@ -32,3 +32,35 @@ Output from `bpls -A` for a boolean attribute `pybool` stored in the location of There is no convention yet for a unique representation of ADIOS2 variables with boolean type. Thus, implementations should cast the data to and from `unsigned char` instead. + +## `stepBased` Encoding of Iterations + +In order to correlate openPMD iterations with ADIOS steps, the *root* group (path `/`) in ADIOS must contain a variable: + + - `__step__` + - type: 1-dimensional array containing N *(int)* elements, where N is the number of ADIOS steps + - description: for each ADIOS step, this variable needs to be updated with the corresponding openPMD iteration. + - note: ADIOS steps are absolute and not every ADIOS step or openPMD iteration contains an update for each declared openPMD record. + - advice to implementers: [decide on this] an openPMD iteration for different openPMD records might be spread over multiple ADIOS steps. + An iteration of an openPMD record must correspond to exactly one ADIOS step. + +## Datasets + +An openPMD **data set** is represented by an group prefix that contains an ADIOS variable `__data__`. + +**attributes** are defined further below and can also appear at the dataset's **group** prefix level. + +## Attributes + +openPMD **attributes** stored as ADIOS `Variables` at the location where they would usually be stored. 
+ +Example for a mesh record `E` with record component `x` and attributes `unitDimension` and `unitSI`: +``` + double /data/meshes/E/unitDimension 10*{7} + double /data/meshes/E/x/__data__ 10*{1000} + double /data/meshes/E/x/position 10*{1} + double /data/meshes/E/x/unitSI 10*scalar +``` + +This example uses `stepBased` iteration encoding, but other iteration encodings would work similarly with their respective `basePath` prefix. + From cee733039331c9d7f6732611a59bb353e88931f2 Mon Sep 17 00:00:00 2001 From: Axel Huebl Date: Wed, 7 Apr 2021 14:34:44 -0700 Subject: [PATCH 3/8] Snapshot naming, Remove Specs Mapping --- FORMAT_ADIOS.md | 9 +-------- STANDARD.md | 17 +++++++++++++++-- 2 files changed, 16 insertions(+), 10 deletions(-) diff --git a/FORMAT_ADIOS.md b/FORMAT_ADIOS.md index ec09fd3..c8e024c 100644 --- a/FORMAT_ADIOS.md +++ b/FORMAT_ADIOS.md @@ -35,14 +35,7 @@ Thus, implementations should cast the data to and from `unsigned char` instead. ## `stepBased` Encoding of Iterations -In order to correlate openPMD iterations with ADIOS steps, the *root* group (path `/`) in ADIOS must contain a variable: - - - `__step__` - - type: 1-dimensional array containing N *(int)* elements, where N is the number of ADIOS steps - - description: for each ADIOS step, this variable needs to be updated with the corresponding openPMD iteration. - - note: ADIOS steps are absolute and not every ADIOS step or openPMD iteration contains an update for each declared openPMD record. - - advice to implementers: [decide on this] an openPMD iteration for different openPMD records might be spread over multiple ADIOS steps. - An iteration of an openPMD record must correspond to exactly one ADIOS step. +The `iterationEncoding` mode `stepBased` must be implemented via ADIOS steps. 
## Datasets diff --git a/STANDARD.md b/STANDARD.md index 8e6add8..093c2a1 100644 --- a/STANDARD.md +++ b/STANDARD.md @@ -216,7 +216,7 @@ Each file's *root* group (path `/`) must further define the attributes: - allowed values: - `fileBased` (multiple files) - `groupBased` (one file) - - `stepBased` (one file with internally encoding, if supported by the data format) + - `stepBased` (one file with internal encoding for iterations, if supported by the data format) - `iterationFormat` - type: *(string)* @@ -227,13 +227,26 @@ Each file's *root* group (path `/`) must further define the attributes: for `fileBased` formats the iteration must be included in the file name; the format depends on the selected `iterationEncoding` method + - note: it is not required that every openPMD iteration contains an update for each declared openPMD record (see below) - examples: - for `fileBased`: - `filename_%T.h5` (without file system directories) - for `groupBased`: (fixed value) - `/data/%T/` (must be equal to and encoded in the `basePath`) - for `stepBased`: (fixed value) - - `slowest varying index` + - data-format internal convention + - *slowest varying index* of data + +### `stepBased` Encoding of Iterations + +In order to correlate openPMD iterations with an index of data-format internal updates/steps or an index in the slowest varying dimension of an array, the *root* group (path `/`) must contain an additional variable once `stepBased` is chosen for `iterationEncoding`: + + - `snapshot` + - type: 1-dimensional array containing N *(int)* elements, where N is the number of updates/steps in the data format + - description: for each update/step in a data format, this variable needs to be updated with the corresponding openPMD iteration. 
+ - note: in some data formats, updates/steps are absolute and not every update/step contains an update for each declared openPMD record + - advice to implementers: an openPMD iteration might be spread over multiple updates/steps, but not vice versa. + In such a scenario, an individual openPMD record's update/step must appear exactly once per iteration. Required Attributes for the `basePath` From 2a6938c2a47cd1e4b46aea42873c41253413cbce Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Franz=20P=C3=B6schel?= Date: Fri, 19 Aug 2022 11:43:11 +0200 Subject: [PATCH 4/8] Update stepBased -> variableBased, remove schema2021-specifics --- FORMAT_ADIOS.md | 18 +++++++++--------- STANDARD.md | 12 ++++++------ 2 files changed, 15 insertions(+), 15 deletions(-) diff --git a/FORMAT_ADIOS.md b/FORMAT_ADIOS.md index c8e024c..b025b93 100644 --- a/FORMAT_ADIOS.md +++ b/FORMAT_ADIOS.md @@ -33,27 +33,27 @@ Output from `bpls -A` for a boolean attribute `pybool` stored in the location of There is no convention yet for a unique representation of ADIOS2 variables with boolean type. Thus, implementations should cast the data to and from `unsigned char` instead. -## `stepBased` Encoding of Iterations +## `variableBased` Encoding of Iterations -The `iterationEncoding` mode `stepBased` must be implemented via ADIOS steps. +The `iterationEncoding` mode `variableBased` must be implemented via ADIOS steps. ## Datasets -An openPMD **data set** is represented by an group prefix that contains an ADIOS variable `__data__`. +An openPMD **data set** is represented by an ADIOS `Variable` at the location where it would usually be stored. **attributes** are defined further below and can also appear at the dataset's **group** prefix level. ## Attributes -openPMD **attributes** stored as ADIOS `Variables` at the location where they would usually be stored. +openPMD **attributes** stored as ADIOS `Attributes` at the location where they would usually be stored. 
Example for a mesh record `E` with record component `x` and attributes `unitDimension` and `unitSI`: ``` - double /data/meshes/E/unitDimension 10*{7} - double /data/meshes/E/x/__data__ 10*{1000} - double /data/meshes/E/x/position 10*{1} - double /data/meshes/E/x/unitSI 10*scalar + double /data/meshes/E/unitDimension attr = {0, 0, 0, 0, 0, 0, 0} + int32_t /data/meshes/E/x {1000} + double /data/meshes/E/x/position attr = 0 + double /data/meshes/E/x/unitSI attr = 1 ``` -This example uses `stepBased` iteration encoding, but other iteration encodings would work similarly with their respective `basePath` prefix. +This example uses `variableBased` iteration encoding, but other iteration encodings would work similarly with their respective `basePath` prefix. diff --git a/STANDARD.md b/STANDARD.md index 093c2a1..4e5af04 100644 --- a/STANDARD.md +++ b/STANDARD.md @@ -90,7 +90,7 @@ Each file's *root* group (path `/`) must at least contain the attributes: - allowed values: - see *Iterations and Time Series* below - for `fileBased` and `groupBased`, this is fixed to `/data/%T/` - - for `stepBased` this is fixed to `/data/` + - for `variableBased` this is fixed to `/data/` - note: all the data that is formatted according to the present standard (i.e. both the meshes and the particles) is to be stored within a path of the form given by `basePath` (e.g. 
in @@ -216,7 +216,7 @@ Each file's *root* group (path `/`) must further define the attributes: - allowed values: - `fileBased` (multiple files) - `groupBased` (one file) - - `stepBased` (one file with internal encoding for iterations, if supported by the data format) + - `variableBased` (one file with internal encoding for iterations, if supported by the data format) - `iterationFormat` - type: *(string)* @@ -233,13 +233,13 @@ Each file's *root* group (path `/`) must further define the attributes: - `filename_%T.h5` (without file system directories) - for `groupBased`: (fixed value) - `/data/%T/` (must be equal to and encoded in the `basePath`) - - for `stepBased`: (fixed value) + - for `variableBased`: (fixed value) - data-format internal convention - *slowest varying index* of data -### `stepBased` Encoding of Iterations +### `variableBased` Encoding of Iterations -In order to correlate openPMD iterations with an index of data-format internal updates/steps or an index in the slowest varying dimension of an array, the *root* group (path `/`) must contain an additional variable once `stepBased` is chosen for `iterationEncoding`: +In order to correlate openPMD iterations with an index of data-format internal updates/steps or an index in the slowest varying dimension of an array, the *root* group (path `/`) must contain an additional variable once `variableBased` is chosen for `iterationEncoding`: - `snapshot` - type: 1-dimensional array containing N *(int)* elements, where N is the number of updates/steps in the data format @@ -252,7 +252,7 @@ In order to correlate openPMD iterations with an index of data-format internal u Required Attributes for the `basePath` -------------------------------------- -In addition to holding information about the iteration, each series of files (`fileBased`), series of groups (`groupBased`) or internally encoded iterations (`stepBased`) should have attributes that describe the current time and the last time step. 
+In addition to holding information about the iteration, each series of files (`fileBased`), series of groups (`groupBased`) or internally encoded iterations (`variableBased`) should have attributes that describe the current time and the last time step. - `time` - type: *(floatX)* From 34f9cc78b7bd0c97e581a1834cba97d226e0273a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Franz=20P=C3=B6schel?= Date: Fri, 19 Aug 2022 12:00:27 +0200 Subject: [PATCH 5/8] Document updated usage of snapshot attribute --- STANDARD.md | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/STANDARD.md b/STANDARD.md index 4e5af04..7edf680 100644 --- a/STANDARD.md +++ b/STANDARD.md @@ -239,7 +239,7 @@ Each file's *root* group (path `/`) must further define the attributes: ### `variableBased` Encoding of Iterations -In order to correlate openPMD iterations with an index of data-format internal updates/steps or an index in the slowest varying dimension of an array, the *root* group (path `/`) must contain an additional variable once `variableBased` is chosen for `iterationEncoding`: +In order to correlate openPMD iterations with an index of data-format internal updates/steps or an index in the slowest varying dimension of an array, the iteration base path (default: path `/default`) must contain an additional variable once `variableBased` is chosen for `iterationEncoding`: - `snapshot` - type: 1-dimensional array containing N *(int)* elements, where N is the number of updates/steps in the data format @@ -248,6 +248,14 @@ In order to correlate openPMD iterations with an index of data-format internal u - advice to implementers: an openPMD iteration might be spread over multiple updates/steps, but not vice versa. In such a scenario, an individual openPMD record's update/step must appear exactly once per iteration. +Notes: + +* In implementations without support for IO steps, the variable-based encoding of iterations may still be used for storage of a single iteration. 
+ In that case, the `snapshot` attribute is optional and defaults to zero (0). +* In implementations with support for IO steps, the `snapshot` attribute may optionally be used in group-based encoding to associate openPMD iterations with IO steps. + In group-based encoding, there is still only one instance of this attribute globally (`/data/snapshot`). + In consequence, the attribute shall only be written if modifiable attributes are supported by the implementation. + Required Attributes for the `basePath` -------------------------------------- From a285b6901d68e4eca4018b8468ace8f2cc864219 Mon Sep 17 00:00:00 2001 From: Axel Huebl Date: Wed, 12 Oct 2022 14:05:55 -0700 Subject: [PATCH 6/8] definition updates from today --- STANDARD.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/STANDARD.md b/STANDARD.md index 7edf680..e3356f0 100644 --- a/STANDARD.md +++ b/STANDARD.md @@ -214,9 +214,9 @@ Each file's *root* group (path `/`) must further define the attributes: is an other `open/close` call necessary to access other iterations - allowed values: - - `fileBased` (multiple files) - - `groupBased` (one file) - - `variableBased` (one file with internal encoding for iterations, if supported by the data format) + - `fileBased` (multiple files; one iteration per file) + - `groupBased` (one file; iteration use groups in that file) + - `variableBased` (one file; if the data format supports to store multiple iterations in the same variables and attributes) - `iterationFormat` - type: *(string)* From 1f5aa32ad77e7c2d37ab5029b1de7cdc69e7c3cc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Franz=20P=C3=B6schel?= Date: Thu, 13 Oct 2022 14:20:20 +0200 Subject: [PATCH 7/8] Apply suggestions from code review Co-authored-by: Axel Huebl --- FORMAT_ADIOS.md | 2 +- STANDARD.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/FORMAT_ADIOS.md b/FORMAT_ADIOS.md index b025b93..b4f4586 100644 --- a/FORMAT_ADIOS.md +++ b/FORMAT_ADIOS.md @@ -50,7 +50,7 @@ 
openPMD **attributes** stored as ADIOS `Attributes` at the location where they w Example for a mesh record `E` with record component `x` and attributes `unitDimension` and `unitSI`: ``` double /data/meshes/E/unitDimension attr = {0, 0, 0, 0, 0, 0, 0} - int32_t /data/meshes/E/x {1000} + double /data/meshes/E/x {1000} double /data/meshes/E/x/position attr = 0 double /data/meshes/E/x/unitSI attr = 1 ``` diff --git a/STANDARD.md b/STANDARD.md index e3356f0..403648b 100644 --- a/STANDARD.md +++ b/STANDARD.md @@ -260,7 +260,7 @@ Notes: Required Attributes for the `basePath` -------------------------------------- -In addition to holding information about the iteration, each series of files (`fileBased`), series of groups (`groupBased`) or internally encoded iterations (`variableBased`) should have attributes that describe the current time and the last time step. +In addition to holding information about the iteration, each series of files (`fileBased`), series of groups (`groupBased`) or internally encoded iterations (`variableBased`) should have attributes that describe the current time and the last step. 
- `time` - type: *(floatX)* From d7f3d59f7ac7345351129c2c03d0895b939452c7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Franz=20P=C3=B6schel?= Date: Thu, 13 Oct 2022 17:18:01 +0200 Subject: [PATCH 8/8] Update after review Make the example more realistic, also minor fixes --- FORMAT_ADIOS.md | 8 ++++---- STANDARD.md | 8 ++++---- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/FORMAT_ADIOS.md b/FORMAT_ADIOS.md index b4f4586..db451a3 100644 --- a/FORMAT_ADIOS.md +++ b/FORMAT_ADIOS.md @@ -49,10 +49,10 @@ openPMD **attributes** stored as ADIOS `Attributes` at the location where they w Example for a mesh record `E` with record component `x` and attributes `unitDimension` and `unitSI`: ``` - double /data/meshes/E/unitDimension attr = {0, 0, 0, 0, 0, 0, 0} - double /data/meshes/E/x {1000} - double /data/meshes/E/x/position attr = 0 - double /data/meshes/E/x/unitSI attr = 1 + double /data/meshes/E/unitDimension attr = {1, 1, -3, -1, 0, 0, 0} + double /data/meshes/E/x {128, 2048, 128} + double /data/meshes/E/x/position attr = {0.5, 0.5, 0.5} + double /data/meshes/E/x/unitSI attr = 1.22627e+13 ``` This example uses `variableBased` iteration encoding, but other iteration encodings would work similarly with their respective `basePath` prefix. 
diff --git a/STANDARD.md b/STANDARD.md index 403648b..b28f8a1 100644 --- a/STANDARD.md +++ b/STANDARD.md @@ -215,7 +215,7 @@ Each file's *root* group (path `/`) must further define the attributes: iterations - allowed values: - `fileBased` (multiple files; one iteration per file) - - `groupBased` (one file; iteration use groups in that file) + - `groupBased` (one file; iterations use groups in that file) - `variableBased` (one file; if the data format supports to store multiple iterations in the same variables and attributes) - `iterationFormat` @@ -239,7 +239,7 @@ Each file's *root* group (path `/`) must further define the attributes: ### `variableBased` Encoding of Iterations -In order to correlate openPMD iterations with an index of data-format internal updates/steps or an index in the slowest varying dimension of an array, the iteration base path (default: path `/default`) must contain an additional variable once `variableBased` is chosen for `iterationEncoding`: +In order to correlate openPMD iterations with an index of data-format internal updates/steps or an index in the slowest varying dimension of an array, the iteration base path (default: path `/data`) must contain an additional variable once `variableBased` is chosen for `iterationEncoding`: - `snapshot` - type: 1-dimensional array containing N *(int)* elements, where N is the number of updates/steps in the data format @@ -250,9 +250,9 @@ In order to correlate openPMD iterations with an index of data-format internal u Notes: -* In implementations without support for IO steps, the variable-based encoding of iterations may still be used for storage of a single iteration. +* In implementations without support for storing multiple versions of datasets/attributes, the variable-based encoding of iterations may still be used for storage of a single iteration. In that case, the `snapshot` attribute is optional and defaults to zero (0). 
-* In implementations with support for IO steps, the `snapshot` attribute may optionally be used in group-based encoding to associate openPMD iterations with IO steps. +* In implementations with support for storing multiple versions of datasets/attributes, the `snapshot` attribute may optionally be used in group-based encoding to associate openPMD iterations with IO steps. In group-based encoding, there is still only one instance of this attribute globally (`/data/snapshot`). In consequence, the attribute shall only be written if modifiable attributes are supported by the implementation.
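
Reviewer note on the `snapshot` entry introduced by this series: the following is a minimal sketch, not part of the patch and not taken from any openPMD implementation. It assumes the `snapshot` values have already been read into a plain Python list, one entry per update/step, and shows how the rule "an openPMD iteration might be spread over multiple updates/steps, but not vice versa" can be read as a grouping of step indices by iteration. The function name `steps_per_iteration` is hypothetical, and treating a missing `snapshot` as a single step with index 0 is an assumption of this sketch layered on top of the documented default iteration of zero.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence


def steps_per_iteration(snapshot: Optional[Sequence[int]]) -> Dict[int, List[int]]:
    """Group update/step indices by the openPMD iteration stored in `snapshot`.

    `snapshot[i]` is taken to be the openPMD iteration written in step `i`.
    """
    if snapshot is None:
        # The patched text says `snapshot` is optional for a single stored
        # iteration and defaults to zero. Mapping that iteration to one step
        # with index 0 is an assumption made for this illustration.
        return {0: [0]}

    grouped: Dict[int, List[int]] = defaultdict(list)
    for step, iteration in enumerate(snapshot):
        # Each step names exactly one iteration, so a step can never span
        # several iterations; an iteration may collect several steps.
        grouped[int(iteration)].append(step)
    return dict(grouped)


# Hypothetical example: iteration 100 is spread over the first two steps.
mapping = steps_per_iteration([100, 100, 200, 300])
print(mapping)  # {100: [0, 1], 200: [2], 300: [3]}
```

A reader that wants all data for iteration 100 would, under these assumptions, visit steps 0 and 1 and expect each record to be updated in at most one of them.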