Fix to check bytes accessed by MPIIO xfers and retry if possible #501
+126
−81
Currently the MPIIO backend assumes that every completed read or write call accessed all requested bytes, and it always returns `hints->transferSize` bytes. This allows silent partial writes that are visible only during verification, and in theory it skews reported bandwidth, since all bytes are assumed transferred when the rate is calculated.

This patch follows the retry model in the POSIX backend, looping up to MAX_RETRY times, but it retries the full request instead of only the remaining bytes. The full IO is retried for simplicity, and because it is ambiguous whether a short count means that only the first N bytes were accessed across all MPI implementations, including with strided datatypes. We've only observed zero-byte partial transfers in practice.
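For reviewers, here is a minimal sketch of the full-request retry model described above. It is not the code in this patch. The transfer is abstracted behind a hypothetical callback (`xfer_fn`) so the sketch compiles without MPI; in the real backend the byte count would presumably come from the `MPI_Status` of the read or write call (for example via `MPI_Get_count`). The `MAX_RETRY` value, the `flaky_xfer` mock, and the choice to count `MAX_RETRY` as total attempts are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative value only; not necessarily the value IOR defines. */
#define MAX_RETRY 5

/* Hypothetical transfer callback: issues one full request of
 * request_bytes and returns the number of bytes actually accessed. */
typedef long long (*xfer_fn)(void *ctx, long long request_bytes);

/* Re-issue the FULL request until it transfers completely or MAX_RETRY
 * attempts have been made. Returns the bytes accessed by the last
 * attempt, so a persistent short transfer is reported, not hidden. */
long long xfer_with_retry(xfer_fn xfer, void *ctx,
                          long long transfer_size, int *attempts_out)
{
    long long accessed = 0;
    int attempts = 0;

    assert(xfer != NULL);

    while (attempts < MAX_RETRY) {
        accessed = xfer(ctx, transfer_size);   /* always the whole request */
        attempts++;
        if (accessed == transfer_size)
            break;                             /* complete transfer */
    }

    if (attempts_out != NULL)
        *attempts_out = attempts;
    return accessed;
}

/* Mock transfer for illustration: returns zero bytes for the first
 * short_calls_left calls, then transfers the full request. */
struct flaky_ctx {
    int short_calls_left;
};

long long flaky_xfer(void *ctx, long long request_bytes)
{
    struct flaky_ctx *c = (struct flaky_ctx *)ctx;

    if (c->short_calls_left > 0) {
        c->short_calls_left--;
        return 0;                              /* zero-byte short transfer */
    }
    return request_bytes;
}
```

Returning the last attempt's byte count, rather than `transfer_size`, is what lets the caller compute bandwidth from bytes that were actually moved.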
Collective MPIIO transfers aren't retried. If any one rank had a short access, all ranks would have to retry the collective call. Supporting that would require an additional synchronous MPI call on every transfer to exchange each rank's bytes transferred with all other ranks. A short MPIIO collective call therefore returns the actual bytes transferred without retrying.
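To make the cost concrete, the decision that every rank would need to reach can be sketched as below. This is not part of the patch. The per-rank byte counts are passed in as a plain array so the sketch runs without MPI; in a real implementation each rank would only know its own count, and an extra collective on every transfer (for example `MPI_Allreduce` with `MPI_MIN`) would be needed to produce the same answer on all ranks. The function name `collective_needs_retry` is hypothetical.

```c
#include <stddef.h>

/* Returns 1 if at least one rank accessed fewer than transfer_size
 * bytes, meaning ALL ranks would have to re-enter the collective call;
 * returns 0 if every rank transferred the full request. */
int collective_needs_retry(const long long *bytes_per_rank, size_t nranks,
                           long long transfer_size)
{
    long long min_bytes = transfer_size;
    size_t r;

    /* Stand-in for a min-reduction across ranks. */
    for (r = 0; r < nranks; r++) {
        if (bytes_per_rank[r] < min_bytes)
            min_bytes = bytes_per_rank[r];
    }
    return min_bytes < transfer_size;
}
```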