ADIOS2 Append Mode #1007

Merged
merged 14 commits into from
May 9, 2022

Conversation

franzpoeschel
Contributor

@franzpoeschel franzpoeschel commented Jun 10, 2021

Our READ_WRITE mode is not quite adequate for use in ADIOS2: ADIOS2 allows either reading or writing, but the READ_WRITE workflow requires both to work (#996). Append mode, however, exposes an interface equivalent to that of Write mode, except that steps already on disk are not deleted; new ones are simply appended. Hence, add a new Access::APPEND that, unlike READ_WRITE, does not allow any reading, but does not overwrite anything either.

  • Simple proof-of-concept implementation
  • Test all possible edge cases:
    • Since there is no reading involved, it's on the user to use the exact same configuration of openPMD in append mode as had been used for the appended-to file.
    • This now tests the new and old ADIOS schema as well as group-, file- and variable-based encoding
    • Overwriting old iterations is still a work in progress; non-linear read patterns and truncation of files upon opening are to come in a later PR
  • Keep a workflow possible in which, in file-based iteration encoding, already existing iterations can be read and new ones can be written
    The READ_WRITE mode currently does this. I would suggest using the new Append mode to support this more properly. This will need some tweaks in the frontend though, where APPEND mode is currently treated as equivalent to WRITE mode.
    Idea: In file-based iteration encoding, coerce the backendAccess to CREATE. When in doubt, any written iteration will then be overwritten.
  • Bring this to the other backends. This should become the new default, a lightweight variant of READ_WRITE mode, unless you really need to read data.
  • Discussion point: Since reading does not work at all in this mode, I don't see any way to guarantee that an update to a dataset is written with the same overall settings (schema, usage of steps, ..). Users will need to make sure to create consistent datasets...
  • Discussion point: Checkpoint-restarting. Probably one of the main uses of this. Dump every 100 time steps, checkpoint every 1000, crash at 1500, restart from 1000. Consequence: Steps {1000,1100,…,1500} are written twice. See #949 (Mapping between ADIOS steps and openPMD iterations) for an exploration on how to deal with this, but it's not really ideal.
    Update: See ornladios/ADIOS2#2775 (Truncate option for Engine Open mode Append)
    Will still not be trivial to implement, since a preceding read access might be necessary to figure out how many steps should be dropped. Would also require specifying from which iteration to restart writing.
  • Documentation page on access modes
  • File-based appending in ADIOS2: Throw error if trying to append to already written iteration?
    Update: Given the proposed workflow with ornladios/ADIOS2#2775 (Truncate option for Engine Open mode Append), it would be most consistent to just overwrite iterations. Maybe use Access::CREATE again for the backendAccess.
  • ADIOS1 UPDATE: OperationUnsupportedInBackend
  • Merge #1080 (Add header for openPMD-specific errors) first
  • Document altered meaning of createFile task

Also, this introduces a new Error.hpp, so I no longer need to throw std::runtime_errors around.
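
To make the intended semantics concrete, here is a minimal sketch of how CREATE and APPEND differ from the writer's point of view. It is a toy model, not the openPMD-api or ADIOS2 implementation: `StepFile` and `ToyWriter` are hypothetical names, and the `Access` enum is a local stand-in for the access modes rather than the real header.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Local stand-in for the access modes discussed here (not the openPMD-api header).
enum class Access { READ_ONLY, READ_WRITE, CREATE, APPEND };

// Hypothetical toy "file": the list of iteration indices stored as steps.
using StepFile = std::vector<unsigned long>;

// Hypothetical writer that mimics the described semantics on a StepFile.
class ToyWriter
{
public:
    ToyWriter(StepFile &file, Access access) : m_file(file)
    {
        switch (access)
        {
        case Access::CREATE:
            m_file.clear(); // Write mode: steps already present are discarded
            break;
        case Access::APPEND:
            break; // Append mode: steps already present stay untouched
        case Access::READ_ONLY:
        case Access::READ_WRITE:
            throw std::invalid_argument(
                "ToyWriter only models CREATE and APPEND");
        }
    }

    // Both modes expose the same interface: add one more step at the end.
    // There is deliberately no way to read back what is already stored.
    void writeStep(unsigned long iteration)
    {
        m_file.push_back(iteration);
    }

private:
    StepFile &m_file;
};
```

Opening a `StepFile` that holds `{0, 100}` with `Access::APPEND` and writing iteration 200 leaves `{0, 100, 200}`; opening the same file with `Access::CREATE` and writing iteration 300 leaves only `{300}`.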

@franzpoeschel franzpoeschel force-pushed the topic-adios2-append branch 5 times, most recently from 92526a7 to 966b1b2 Compare June 21, 2021 13:42
@ax3l
Member

ax3l commented Jun 22, 2021

Checkpoint-restarting. Probably one of the main uses of this. Dump every 100 time steps, checkpoint every 1000, crash at 1500, restart from 1000. Consequence: Steps {1000,1100,…,1500} are written twice. See #949 for an exploration on how to deal with this, but it's not really ideal.

I think a useful mode would be one that overwrites a step exactly when it is re-opened, but leaves older (and even newer) steps as they are. That's what people most expect and know from file-based output: files stay around until you overwrite them one by one.

Since output from checkpoint-restart should again give the same result as the prior run, this is also ok (no "invalid" older files, so to say).

Alternatively, one could at a certain point (e.g. once during re-open/append) delete all steps after the restart step.
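
The checkpoint-restart scenario and the two ideas above can be played through with a small simulation. This is only a sketch under stated assumptions: all function names are hypothetical, the restarted run is assumed to write the restart step itself again (as in the {1000,1100,…,1500} example), and the truncating variant drops every step at or after the restart step.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

using Steps = std::vector<unsigned long>;

// Iterations dumped by a run covering [first, last], one dump every `period`.
Steps runOutput(unsigned long first, unsigned long last, unsigned long period)
{
    assert(period > 0);
    Steps out;
    for (unsigned long it = first; it <= last; it += period)
        out.push_back(it);
    return out;
}

// Pure appending: the restarted run adds its steps after the old ones.
Steps appendAll(Steps onDisk, Steps const &rerun)
{
    onDisk.insert(onDisk.end(), rerun.begin(), rerun.end());
    return onDisk;
}

// Truncating variant: drop every step at or after the restart step, then append.
Steps truncateThenAppend(
    Steps const &onDisk, unsigned long restartAt, Steps const &rerun)
{
    Steps kept;
    for (unsigned long it : onDisk)
        if (it < restartAt)
            kept.push_back(it);
    return appendAll(kept, rerun);
}

// File-based picture: one file per iteration, so rewriting an iteration
// replaces exactly that file and leaves all others as they are.
std::set<unsigned long> overwriteOnReopen(Steps const &onDisk, Steps const &rerun)
{
    std::set<unsigned long> files(onDisk.begin(), onDisk.end());
    files.insert(rerun.begin(), rerun.end());
    return files;
}

// Number of distinct iterations that occur more than once in the step list.
std::size_t countDuplicated(Steps const &steps)
{
    std::map<unsigned long, unsigned> seen;
    for (unsigned long it : steps)
        ++seen[it];
    std::size_t duplicated = 0;
    for (auto const &entry : seen)
        if (entry.second > 1)
            ++duplicated;
    return duplicated;
}
```

With `runOutput(0, 1500, 100)` as the crashed run and `runOutput(1000, 2000, 100)` as the restarted one, `appendAll` yields six duplicated iterations (1000 through 1500), while `truncateThenAppend` with `restartAt = 1000` yields none.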

@franzpoeschel
Contributor Author

I think a useful mode would be one that overwrites a step exactly when it is re-opened, but leaves older (and even newer) steps as they are. That's what people most expect and know from file-based output: files stay around until you overwrite them one by one.

That is possible to implement with file-based iteration encoding and APPEND mode, but it won't be possible with either group- or variable-based encoding. Adding steps to an existing file in ADIOS2 allows no reading or modifying of old steps; you're just appending to a BP file without caring what had previously been written to it.

The problem is not file-based encoding, but variable-/step-based encodings with steps enabled for output data after loading a checkpoint.

m_backendAccess == Access::APPEND )
{
// do we really want to have those as const members..?
*const_cast< Access * >( &m_backendAccess ) = Access::CREATE;
Contributor Author

I have now added some handling in the backend so that this isn't necessary any more. Could remove it.

Contributor Author

Update: Eh, let's keep it. Clean overwriting is better and in file-based encoding it's easy to achieve.
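
For illustration, the coercion discussed in this thread could also be expressed without a const_cast by computing the backend access once, up front. This is a sketch only: `backendAccessFor` and `ToyIOHandler` are hypothetical names, both enums are local stand-ins, and it assumes the iteration encoding is already known when the handler is constructed, which may not hold in the actual code base.

```cpp
#include <cassert>

// Local stand-ins, not the openPMD-api headers.
enum class Access { READ_ONLY, READ_WRITE, CREATE, APPEND };
enum class IterationEncoding { fileBased, groupBased, variableBased };

// Hypothetical helper: in file-based encoding, APPEND is handed to the
// backend as CREATE so that a rewritten iteration cleanly replaces its file.
// All other combinations are passed through unchanged.
constexpr Access backendAccessFor(Access frontend, IterationEncoding encoding)
{
    return (frontend == Access::APPEND &&
            encoding == IterationEncoding::fileBased)
        ? Access::CREATE
        : frontend;
}

// Hypothetical handler: both members can stay const because the backend
// access is derived in the constructor's initializer list.
class ToyIOHandler
{
public:
    ToyIOHandler(Access frontend, IterationEncoding encoding)
        : m_frontendAccess(frontend)
        , m_backendAccess(backendAccessFor(frontend, encoding))
    {}

    Access const m_frontendAccess;
    Access const m_backendAccess;
};
```

The frontend still sees APPEND, so it can keep treating the Series as append-only, while the backend only ever sees a mode it can implement by overwriting.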

@ax3l
Member

ax3l commented Jun 23, 2021

The problem is not file-based encoding, but variable-/step-based encodings with steps enabled for output data after loading a checkpoint.

Uff, right. I put this on the ADIOS2 agenda. I think we could try to gracefully warn and skip such outputs in user-code if they exist, but definitely have to think about a workflow.

@franzpoeschel
Contributor Author

franzpoeschel commented Jun 24, 2021

I quickly deleted the member and looked where compilation failed. It seems that there currently is no other bug of the sort. wrong PR

@ax3l
Member

ax3l commented Jun 25, 2021

Discussion update:

  • RFE with ADIOS2: Append mode, with parameter to truncate after the Nth (ADIOS) step: ornladios/ADIOS2#2775 (Truncate option for Engine Open mode Append)
  • high-level solution: using the openPMD “step” variable to list the same step again and our logic picks up the “latest” one in reads (will work, but accumulates data if steps are written multiple times)
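
The "pick the latest one" logic of the high-level solution can be sketched as follows. This assumes the reader already has, for every ADIOS step in order, the openPMD iteration stored in it; `latestStepPerIteration` is a hypothetical name and not part of openPMD-api.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Maps each iteration to the index of the last ADIOS step that contains it.
// iterationOfStep[i] is the iteration written in step i.
std::map<unsigned long, std::size_t>
latestStepPerIteration(std::vector<unsigned long> const &iterationOfStep)
{
    std::map<unsigned long, std::size_t> latest;
    for (std::size_t step = 0; step < iterationOfStep.size(); ++step)
    {
        // A later step listing the same iteration replaces the earlier entry.
        latest[iterationOfStep[step]] = step;
    }
    return latest;
}
```

The older copies of a rewritten iteration are never removed here, only ignored, which is the data accumulation mentioned in the bullet above.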

@ax3l ax3l changed the title ADIOS2 Append Mode [WIP] ADIOS2 Append Mode Jul 2, 2021
@franzpoeschel franzpoeschel force-pushed the topic-adios2-append branch 4 times, most recently from 58e137a to 3dfc917 Compare August 4, 2021 14:55
@franzpoeschel
Contributor Author

I think this is ready for review @ax3l
Note that further info can be found in the added documentation. Otherwise, I have tried to split this into a sane number of commits.

@franzpoeschel franzpoeschel force-pushed the topic-adios2-append branch 2 times, most recently from 5094fe1 to 0e60346 Compare August 5, 2021 15:44
@franzpoeschel franzpoeschel force-pushed the topic-adios2-append branch 2 times, most recently from 9a1a175 to 009e69a Compare August 18, 2021 08:13
@franzpoeschel franzpoeschel changed the title [WIP] ADIOS2 Append Mode ADIOS2 Append Mode Sep 23, 2021
@franzpoeschel franzpoeschel force-pushed the topic-adios2-append branch 3 times, most recently from a65de2e to 443e1a9 Compare September 24, 2021 13:12
@franzpoeschel
Contributor Author

I've done further cleanup now. If it runs green, ready for review @ax3l

@ax3l ax3l enabled auto-merge (squash) May 6, 2022 16:13
@ax3l ax3l disabled auto-merge May 6, 2022 16:14
@ax3l ax3l enabled auto-merge (squash) May 6, 2022 16:14
@ax3l ax3l disabled auto-merge May 6, 2022 16:14
@ax3l ax3l enabled auto-merge (squash) May 6, 2022 16:14
@ax3l ax3l disabled auto-merge May 6, 2022 16:29
@ax3l
Member

ax3l commented May 6, 2022

@franzpoeschel this needs a rebase, please. It collided with #1264

@ax3l ax3l self-requested a review May 6, 2022 23:22
franzpoeschel and others added 14 commits May 9, 2022 10:45
Also use switch statement in many frontend places for the Access enum.
With 4 variants, if statements become increasingly unreadable and it's
good to have compiler warnings if a case is forgotten. Try to avoid
default branches.
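
A minimal sketch of the switch pattern described in this commit message, assuming a local stand-in for the Access enum (`isWriting` is a hypothetical helper, not taken from the code base):

```cpp
#include <cassert>
#include <stdexcept>

// Local stand-in for the four access modes (not the openPMD-api header).
enum class Access { READ_ONLY, READ_WRITE, CREATE, APPEND };

// Hypothetical helper: true if the mode permits writing.
bool isWriting(Access access)
{
    // No default branch: if a fifth enumerator is added later, GCC and Clang
    // can warn (-Wswitch, part of -Wall) that this switch does not handle it.
    switch (access)
    {
    case Access::READ_ONLY:
        return false;
    case Access::READ_WRITE:
    case Access::CREATE:
    case Access::APPEND:
        return true;
    }
    // Only reachable for values outside the enumerators, e.g. after a bad cast.
    throw std::runtime_error("Unreachable: invalid Access value");
}
```

The trailing throw keeps the function well-defined for out-of-range values without reintroducing a default branch that would silence the warning.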

Also refactor `autoDetectPadding()` into a separate function since it is
now needed in two places.
1) Make backends aware of Append mode
2) In ADIOS1, fix use of namespaces, only use #include statements
   outside of namespaces
@ax3l ax3l merged commit f62a329 into openPMD:dev May 9, 2022
4 participants