Writing and Reading Non-openPMD Infos #15

Open

ax3l opened this issue Jan 15, 2018 · 9 comments
Labels: api: new additions to the API

Comments

ax3l (Member) commented Jan 15, 2018

It would be necessary to allow storing non-openPMD information as well.

In the simple case, this means adding additional, user-defined attributes to records, record-components and paths.

In the more complex case, application-specific states such as RNG states, unique-number-generator states, etc. need to be stored as records outside of the openPMD-interpreted paths.

@ax3l ax3l added the api: breaking breaking API changes label Jan 15, 2018
C0nsultant (Member) commented Jan 15, 2018

The simple case is already possible, though maybe not made explicit enough yet. Every "position" in an openPMD hierarchy (which in this API's terms is an Attributable) supports a templated operation setAttribute, taking a string and a generic argument whose type has to map to a supported Datatype.

Here's an example demonstrating said feature by annotating the root group with a std::string.
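For reference, a minimal sketch of that pattern using current openPMD-api spellings (the file name, attribute key, and value are placeholders):

```cpp
#include <openPMD/openPMD.hpp>

#include <string>

int main()
{
    // the Series' root group is itself an Attributable
    openPMD::Series series("sample.h5", openPMD::Access::CREATE);

    // attach a user-defined, non-openPMD attribute to the root group;
    // the template argument is deduced and mapped to a supported Datatype
    series.setAttribute("comment", std::string("produced by example run"));

    series.flush();
    return 0;
}
```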

ax3l (Member, Author) commented Jan 15, 2018

That's already a great start, thanks! :)

The latter case is quite relevant for PIConGPU checkpoints.

@ax3l ax3l changed the title API: Non-openPMD Infos Writing and Reading Non-openPMD Infos Feb 12, 2018
@ax3l ax3l added api: new additions to the API and removed api: breaking breaking API changes labels Jun 6, 2018
ax3l (Member, Author) commented Oct 4, 2019

cc @ejcjason you also need this, right?

JunCEEE commented Oct 24, 2019

> cc @ejcjason you also need this, right?

I cannot access the link "here", but I guess the suggested solution is to change the attributes of "position" under one particle group into something we want. That's good. But what if we want to add a group, say "observable", to the iteration group (i.e., at the same level as particles)?

  • data/
    • 1/
      • observable/
      • particles/
        • Cu

It seems that the above solution doesn't work in this case, because there is no command like i.meshes[xx] or i.particles[xx] to create the observable. Maybe it could be achieved if we had something to change the group name of particles or meshes?

ax3l (Member, Author) commented Nov 1, 2019

Links updated with permanently accessible examples. A setAttribute example is now also in the manual.

Yes, what you want is a side channel to the low-level write API, to write groups and datasets outside of openPMD. That's also what I meant.

> Maybe it could be achieved if we had something to change the group name of particles or meshes?

Or would you already be covered if you could change the meshesPath and particlesPath that we pick before we write? Probably not really.
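For context, the openPMD-api already lets one override those two locations per Series; a minimal sketch (file name and path values are placeholders):

```cpp
#include <openPMD/openPMD.hpp>

int main()
{
    openPMD::Series series("custom.h5", openPMD::Access::CREATE);

    // override the default "meshes/" and "particles/" group names
    // inside each iteration, before any records are written
    series.setMeshesPath("fields/");
    series.setParticlesPath("species/");

    series.flush();
    return 0;
}
```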

ax3l added a commit to ax3l/openPMD-api that referenced this issue Jun 14, 2021
Runs into timeout for unclear reasons with this patch:
```
15/32 Test #15: MPI.8_benchmark_parallel ...............***Timeout 1500.17 sec
```
ax3l added a commit to ax3l/openPMD-api that referenced this issue Jun 23, 2021
Runs into timeout for unclear reasons with this patch:
```
15/32 Test #15: MPI.8_benchmark_parallel ...............***Timeout 1500.17 sec
```
ax3l added a commit that referenced this issue Jun 24, 2021
* HDF5: Empiric for Optimal Chunk Size

This ports a prior empirical algorithm from libSplash to determine
an optimal (large) chunk size for an HDF5 dataset based on its
datatype and global extent.

Original implementation by Felix Schmitt @f-schmitt (ZIH, TU Dresden)
in
[libSplash](https://github.com/ComputationalRadiationPhysics/libSplash).

Original source:
- https://github.com/ComputationalRadiationPhysics/libSplash/blob/v1.7.0/src/DCDataSet.cpp
- https://github.com/ComputationalRadiationPhysics/libSplash/blob/v1.7.0/src/include/splash/core/DCHelper.hpp

Co-authored-by: Felix Schmitt <[email protected]>

* Add scaffolding for JSON options in HDF5

* HDF5: Finish Chunking JSON/Env control

* HiPACE (legacy) pipeline: no chunking

The parallel, independent I/O pattern here is a corner case for what
HDF5 can support, due to non-collective declarations of data sets.
Testing shows that it does not work with chunking.

* CI: no HDF5 Chunking with Sanitizer

Runs into timeout for unclear reasons with this patch:
```
15/32 Test #15: MPI.8_benchmark_parallel ...............***Timeout 1500.17 sec
```

* Apply suggestions from code review

Co-authored-by: Franz Pöschel <[email protected]>

Co-authored-by: Felix Schmitt <[email protected]>
Co-authored-by: Franz Pöschel <[email protected]>
pgrete commented Mar 15, 2024

What is the current status of supporting reading/writing other group data, @ax3l?
I didn't find anything more recent, and we have a use case to store data that is not tied to a mesh (or particles).

franzpoeschel (Contributor) commented
There is an openPMD-standard PR at openPMD/openPMD-standard#282 and an openPMD-api PR at #1432.

The openPMD-api PR can already be used, but there are a number of things that are still under discussion. If you want to try it out, you can have a look at the Python example or the C++ tests in the diff of that PR.

pgrete commented Mar 15, 2024

Thanks for the pointer.
What's the timescale for those changes?
I see that the API PR has been open for about a year, and I'm hesitant to build our future production backend on changes that are potentially still in flux / subject to discussion.

franzpoeschel (Contributor) commented
> What's the timescale for those changes?

This is part of a delivery scheduled for this autumn, but I hope to merge the openPMD-api pull request sooner than that. It is a relatively big change and the API might still change / is subject to discussion. So, long story short: there is no stable support for this yet, but there is a timeline.

There are logically two parts to this project:

  1. Writing custom datasets into custom hierarchies. This does not affect the openPMD-standard.
  2. Using openPMD markup within these custom hierarchies. This affects the openPMD-standard.

The workaround for now would probably be to try whether you can "pretend" that your custom data is a mesh (we do something similar for checkpointing data in PIConGPU at the moment); see the sketch below.
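A minimal sketch of that workaround, storing a hypothetical scalar "observable" as if it were a mesh record (the file name, record name, and extent are made up):

```cpp
#include <openPMD/openPMD.hpp>

#include <vector>

int main()
{
    openPMD::Series series("checkpoint.h5", openPMD::Access::CREATE);
    openPMD::Iteration it = series.iterations[1];

    // disguise the custom data as a scalar mesh record named "observable"
    auto obs = it.meshes["observable"][openPMD::MeshRecordComponent::SCALAR];

    std::vector<double> payload(100, 0.0); // placeholder data
    obs.resetDataset(openPMD::Dataset(openPMD::Datatype::DOUBLE, {100}));
    obs.storeChunk(payload, {0}, {100});

    series.flush(); // payload must stay alive until this flush
    return 0;
}
```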
