To efficiently concatenate particle files, we need to know the total number of particles.
Currently, that means opening and reading every file once up front in order to allocate space in the resulting file, which can be very slow. It would be faster if we recorded the total number of particles when writing outputs.
It would be even better to record the number of particles per file in one centralized location (maybe just in file 0). That way, we could better parallelize concatenation when using an arbitrary number of concatenation processes that is totally unrelated to the number of processes used in the original simulation; see the sketch below.
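A minimal sketch of why the per-file counts help, assuming they could be read as a single array from file 0 (the values and variable names here are hypothetical, not the code's actual header layout): an exclusive prefix sum over that array gives every concatenation rank the write offset for each source file, so ranks can work on disjoint subsets of files without opening all of them first.

```cpp
#include <cstdio>
#include <cstdint>
#include <numeric>
#include <vector>

int main() {
  // Counts that would be read from a single array stored in file 0
  // (illustrative values only).
  std::vector<std::int64_t> n_particles_per_file = {1200, 980, 1105, 1342};

  // Exclusive prefix sum: destination offset of each source file in the
  // concatenated output.
  std::vector<std::int64_t> offsets(n_particles_per_file.size(), 0);
  std::exclusive_scan(n_particles_per_file.begin(), n_particles_per_file.end(),
                      offsets.begin(), std::int64_t{0});

  const std::int64_t n_total = offsets.back() + n_particles_per_file.back();
  std::printf("total particles: %lld\n", static_cast<long long>(n_total));

  // A concatenation rank assigned file i writes its particles into
  // [offsets[i], offsets[i] + n_particles_per_file[i]) independently of
  // every other rank.
  for (std::size_t i = 0; i < offsets.size(); ++i)
    std::printf("file %zu -> offset %lld\n", i,
                static_cast<long long>(offsets[i]));
  return 0;
}
```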
As far as I remember, the header of each file stores the number of local particles, so you only need to read that value from each file's header, not the entire dataset, to compute the total number of particles. That shouldn't be a significant overhead.
Your recollection is correct about what is stored. That is exactly what we currently do.
With that said, standard POSIX file operations can be extremely slow on parallel file systems. The overhead of simply opening and closing a file can be shockingly large (especially when other people are using the file system). This is most problematic when you have many thousands of files.
In my experience, the parallel file systems on the Oak Ridge machines usually aren't bad, but I have had awful experiences with the Lustre file system on Frontera.
I see. I haven't had the pleasure of using Frontera, but I always hear lovely things about it.
In that case, I agree that doing an MPI_Allreduce to get n_particles_total when writing the header of the output files is a good idea.
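A minimal sketch of that idea (variable names are hypothetical): each rank sums the local particle counts with MPI_Allreduce at output time, so every rank can write `n_particles_total` into its own file's header alongside the local count it already records.

```cpp
#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);
  int rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  // Stand-in for the number of particles owned by this rank.
  long long n_particles_local = 1000 + 7 * rank;

  // Global sum available on every rank.
  long long n_particles_total = 0;
  MPI_Allreduce(&n_particles_local, &n_particles_total, 1,
                MPI_LONG_LONG, MPI_SUM, MPI_COMM_WORLD);

  // Here each rank would write both n_particles_local and
  // n_particles_total as header fields of its own output file.
  if (rank == 0)
    std::printf("n_particles_total = %lld\n", n_particles_total);

  MPI_Finalize();
  return 0;
}
```

The cost is one small collective per output, which is negligible next to the I/O itself, and it removes the need to touch every file again during concatenation.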