Update: For clarity, the tests pass when using mpich 4.0, gcc 11.4.0.
I'm observing a failure using mpicc and running nc_test4/run_par_test.sh.
This issue occurs when running mpicc version 13.x, but does not occur on systems using mpicc version 11.x. This is most easily observed on my end using Ubuntu 22.04 vs. 24.04. I've created a couple of docker images which can be used to observe this. They can be run as follows:
$ docker run --rm -it docker.unidata.ucar.edu/h5par:2204
and
$ docker run --rm -it docker.unidata.ucar.edu/h5par:2404
You can enter the environment by appending bash to the end of either docker command.
It seems that the issue is related to the different version of mpicc, but I'm trying to sort through what exactly is going on. Any suggestions would be appreciated.
The error specifically is as follows:
153: Testing simple parallel I/O with 16 processors...
153:
153: *** Testing more advanced parallel access.
153: *** Testing parallel IO for raw-data with MPI-IO (driver)...
153: *** Testing more advanced parallel access.
153: *** Testing parallel IO for raw-data with MPI-IO (driver)...
153: *** Testing more advanced parallel access.
153: *** Testing parallel IO for raw-data with MPI-IO (driver)...Sorry! Unexpected result, /root/hdf5-1.14.3/netcdf-c/nc_test4/tst_parallel3.c, line: 284
153: Sorry! Unexpected result, /root/hdf5-1.14.3/netcdf-c/nc_test4/tst_parallel3.c, line: 91
153:
153: *** Testing more advanced parallel access.
153: *** Testing parallel IO for raw-data with MPI-IO (driver)...
153: *** Testing more advanced parallel access.
153: *** Testing parallel IO for raw-data with MPI-IO (driver)...Sorry! Unexpected result, /root/hdf5-1.14.3/netcdf-c/nc_test4/tst_parallel3.c, line: 284
153: Sorry! Unexpected result, /root/hdf5-1.14.3/netcdf-c/nc_test4/tst_parallel3.c, line: 91
153:
(...)
@edwardhartnett @jhendersonHDF, if anything leaps out at you, feel free to chime in; it might save some time as I dig through this! And if not, no worries XD. Thanks!
On Ubuntu 24.04, installing libhdf5-mpi-dev installs openmpi and related tools. This version of libhdf5 works just fine, although the nc_test4/run_par_test.sh script requires that --oversubscribe be passed to mpiexec -n 16 ./tst_parallel3. Otherwise, mpiexec complains if the machine has < 16 cores/processors/what-have-you.
Using mpich and a custom-built libhdf5, we cannot oversubscribe. However, this is not an issue, because invoking mpiexec -n 2 ./tst_parallel3 results in the same issue as if we passed 4, or 8, or 16. Running tst_parallel3 directly works, but of course it is bypassing MPI entirely.
Installing libhdf5-mpich-dev shows the same behavior as the custom-built version of libhdf5. This suggests the issue is specific to mpich rather than inherent to MPI.
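For anyone who wants to reproduce the process-count comparison, here is a rough sketch of how it could be scripted. It assumes it is run from the nc_test4 build directory where ./tst_parallel3 lives. The function name run_counts and the MPIEXEC / MPIEXEC_FLAGS variables are just illustrative names of mine, not part of run_par_test.sh.

```shell
#!/bin/sh
# Sketch: run tst_parallel3 under several process counts and report pass/fail.
# MPIEXEC and MPIEXEC_FLAGS are hypothetical override variables (not part of
# the netcdf-c test scripts); they default to plain "mpiexec" with no flags.
MPIEXEC="${MPIEXEC:-mpiexec}"
MPIEXEC_FLAGS="${MPIEXEC_FLAGS:-}"

run_counts() {
    for n in 2 4 8 16; do
        echo "== $MPIEXEC $MPIEXEC_FLAGS -n $n ./tst_parallel3 =="
        # MPIEXEC_FLAGS is intentionally unquoted so several flags can be passed.
        if $MPIEXEC $MPIEXEC_FLAGS -n "$n" ./tst_parallel3; then
            echo "n=$n: pass"
        else
            echo "n=$n: FAIL"
        fi
    done
}
```

With openmpi on a machine with fewer than 16 cores, something like `MPIEXEC_FLAGS=--oversubscribe run_counts` should be needed; with mpich, a plain `run_counts` is enough.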