-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lists of Iteration, Mesh and Particle Pathes #282
base: upcoming-2.0.0
Are you sure you want to change the base?
Lists of Iteration, Mesh and Particle Pathes #282
Conversation
Thank you for the standard update and PR, this is excellent. Let's discuss details in person, yet let me answer some ideas on the open questions above.
I think regex support (e.g., which flavor) is a bit tricky across languages. But clearly, there is a big benefit in at least supporting globbing for now, because it avoids that every new entry needs to be listed. E.g., for sub-hierarchies that are numbered in a sensible manner. My feeling is to allow only bash-like globbing for now, which is easy to replace to a regex or matching syntax in any language. |
I would be cautious here since globbing and regex are incompatible. If, say, you allow bash-like globbing initially, to be backwards compatible when regex is introduced the standard must support both. This is a bit messy. |
Yes, such a change would then be for 3.0 and major. But needs a motivating set of examples why regex parsing as a dependency is needed. I just want to avoid loosing, e.g., Fortran users or people not having the exact implementation of regex flavor we require. |
Discussed today on nesting (minutes): I would be interested in concrete examples that really need this "discovery" feature to explore it more. (Do we have some deep nested structures we need to support interplay with NeXuS?) One way to implement this with faster parsing for readers: one could be to have recursive paths from root to these elements that aid discovery: a list of paths to iteration folder(s), each of them having attributes that point to their list of mesh paths and particle paths. |
Please add a brief description (one sentence) here and link the issue this pull-request implements
This is a suggestion for making openPMD markup be usable in a user-specified group hierarchy within an Iteration.
Implements issue: #115
Description
As detailed description as possible.
In principle, custom data can be introduced to openPMD datasets today already without the openPMD standard being any wiser about it, since the standard only describes what must be there. As a result, custom groups, datasets, attributes may be used today already.
With this PR, datasets will be able to profit from openPMD markup within such custom groups, e.g. meshes and particles can be placed in user-defined places within iterations now:
What is introduced, removed or renamed and why?
The only change necessary to this end is a new definition of the
meshesPath
andparticlesPath
attributes. The new definition is defined to be backward compatible with the old definition, e.g. existing datasets using the openPMD 1.0 specification for e.g.meshesPath
will still work with openPMD 2.0.meshesPath
andparticlesPath
are now lists of regular expressions. (This is technically redundant as regular expressions are powerful enough to represent "lists" (e1|e2|e3), but it reduces reliance on regular expressions, and the concept of "listing" paths is important enough to be explicitly expressed in the standard.)This way, the
meshesPath
andparticlesPath
can list the locations where meshes and particles species may be found (either as a relative or an absolute path).What is made required, recommended, optional?
It no longer makes sense to require that a mesh exists when the
meshesPath
is defined, likewise theparticlesPath
(regular expressions might in fact denote an infinite set of locations).What concept stands behind this change?
The fact that the openPMD standard is extensible and hierarchical. This proposal makes the openPMD standard aware of extended hierarchies and allows interacting with them.
Please also add an example.
-- will follow --
Affected Components
base
Logic Changes
Which logic changes are conceptually introduced?
Writer Changes
How does this change affect data writers?
Not at all if the new features are not used.
What would a writer need to change?
It can optionally use the new features.
Does this pull request change the interpretation of existing data writers?
No.
libopenPMD
: The proposal is prototyped in Custom Hierarchies openPMD-api#1432Reader Changes
How does this change affect data readers?
Data readers can no longer expect to have a single group containing meshes (likewise particles). Especially in the openPMD-api, the semantics of
iteration[0].meshes
become unclear when in fact the dataset contains/data/0/fields
and/data/0/images
.Data readers need to traverse subdirectories if they intend to find all data.
The
meshesPath
andparticlesPath
attributes might now have values that old readers cannot interpret.What would a reader need to change? Link implementation examples!
libopenPMD
: The proposal is prototyped in Custom Hierarchies openPMD-api#1432openPMD-validator
: https://github.com/openPMD/openPMD-validator/...openPMD-viewer
: https://github.com/openPMD/openPMD-viewer/...yt
: https://github.com/yt-project/yt/...VisIt
: https://github.com/openPMD/openPMD-visit-pluginXDMF
(upcoming): https://github.com/openPMD/openPMD-tools/...Data Converter
How does this affect already existing files with previous versions of the standard?
Not at all.
Is this change possible to be expressed in a way, that an automated tool can update the file?
Link code/pull requests that implement the upgrade logic!
Not necessary.
Discussion:
meshesPath
/particlesPath
can only refer to the parent group that contains meshes/species (as it does now), implying that any group can contain only (meshes XOR particles XOR custom subgroups). With the current design, mixing them is possible.meshesPath
andparticlesPath
as it is now, but allow overriding themeshesPath
andparticlesPath
at each level of the hierarchy. Maybe allow specifying lists, so that things like/data/0/meshes
and/data/0/images
can be meshes in the same dataset.This approach has the advantage of being simpler and having increased forward compatibility, since the "new" features can be better contained in places where old readers don't see them. The
meshesPath
andparticlesPath
attributes become less monolithic, i.e. the location of meshes and particles are not described in one single place inside an attribute with rather complicated semantics.