To efficiently concatenate particle files, we need to know the total number of particles.
Currently, that means opening and reading every file once up front in order to allocate space in the resulting file, which can be very slow. It would be faster if we recorded the total number of particles when writing outputs.
It would be even better to record the number of particles per file in one centralized location (maybe just in file 0). That way, we could better parallelize concatenation when using an arbitrary number of concatenation processes that is totally unrelated to the number of processes used in the original simulation; see the sketch below.
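A minimal sketch of why the per-file counts help, assuming they could be read as a single array from file 0 (the values and variable names here are hypothetical, not the code's actual header layout): an exclusive prefix sum over that array gives every concatenation rank the write offset for each source file, so ranks can work on disjoint subsets of files without opening all of them first.

```cpp
#include <cstdio>
#include <cstdint>
#include <numeric>
#include <vector>

int main() {
  // Counts that would be read from a single array stored in file 0
  // (illustrative values only).
  std::vector<std::int64_t> n_particles_per_file = {1200, 980, 1105, 1342};

  // Exclusive prefix sum: destination offset of each source file in the
  // concatenated output.
  std::vector<std::int64_t> offsets(n_particles_per_file.size(), 0);
  std::exclusive_scan(n_particles_per_file.begin(), n_particles_per_file.end(),
                      offsets.begin(), std::int64_t{0});

  const std::int64_t n_total = offsets.back() + n_particles_per_file.back();
  std::printf("total particles: %lld\n", static_cast<long long>(n_total));

  // A concatenation rank assigned file i writes its particles into
  // [offsets[i], offsets[i] + n_particles_per_file[i]) independently of
  // every other rank.
  for (std::size_t i = 0; i < offsets.size(); ++i)
    std::printf("file %zu -> offset %lld\n", i,
                static_cast<long long>(offsets[i]));
  return 0;
}
```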
As far as I remember, the header of each file stores the number of local particles, so you only need to read that value from each file's header, not the entire dataset, to compute the total number of particles. That shouldn't be a significant overhead.
Your recollection is correct about what is stored. That is exactly what we currently do.
With that said, standard POSIX file operations can be extremely slow on parallel file systems. The overhead of simply opening and closing a file can be shockingly large (especially when other people are using the file system). This is most problematic when you have many thousands of files.
In my experience, the parallel file systems on the Oak Ridge machines usually aren't bad, but I have had awful experiences with the Lustre file system on Frontera.
I see. I haven't had the pleasure of using Frontera, but I always hear lovely things about it.
In that case, I agree that doing an MPI_Allreduce to get n_particles_total when writing the header of the output files is a good idea.
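A minimal sketch of that idea (variable names are hypothetical): each rank sums the local particle counts with MPI_Allreduce at output time, so every rank can write `n_particles_total` into its own file's header alongside the local count it already records.

```cpp
#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);
  int rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  // Stand-in for the number of particles owned by this rank.
  long long n_particles_local = 1000 + 7 * rank;

  // Global sum available on every rank.
  long long n_particles_total = 0;
  MPI_Allreduce(&n_particles_local, &n_particles_total, 1,
                MPI_LONG_LONG, MPI_SUM, MPI_COMM_WORLD);

  // Here each rank would write both n_particles_local and
  // n_particles_total as header fields of its own output file.
  if (rank == 0)
    std::printf("n_particles_total = %lld\n", n_particles_total);

  MPI_Finalize();
  return 0;
}
```

The cost is one small collective per output, which is negligible next to the I/O itself, and it removes the need to touch every file again during concatenation.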