Version 4.5.1-devel.
If I call nc_put_vars_double with stride=1 in parallel, and some processors have no data to write, then the H5Dwrite call will fail.
The problem is due to the
    if(nels == 0)
        return NC_NOERR; /* cannot write anything */
at line 244 of libdispatch/dvarput.c. If I remove that early return and stride == 1, the code completes correctly. If the early return is left in place, the processors with no data return before reaching H5Dwrite, and the remaining processors hang inside H5Dwrite, because HDF5 calls PMPI_Allreduce when collective I/O is in use and that call waits for every processor to participate.
There is another issue if stride != 1, but I will report that in a separate issue.
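One possible shape for the check, as a sketch only and not the actual netCDF code: skip the write for an empty selection only when access is independent, so that in collective mode every processor still reaches the write call. The names access_mode, selected_elements and may_skip_write are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the independent/collective access modes. */
enum access_mode { ACCESS_INDEPENDENT, ACCESS_COLLECTIVE };

/* Number of elements selected by count[0..rank-1].
 * The result is zero if any dimension has a zero count. */
static size_t selected_elements(const size_t *count, int rank)
{
    size_t nels = 1;
    assert(rank == 0 || count != NULL);
    for (int i = 0; i < rank; i++)
        nels *= count[i];
    return nels;
}

/* Returns 1 if the caller may return before the write, 0 otherwise.
 * Assumption: in collective mode every processor must take part in the
 * write, even with nothing to write, so an empty selection alone is not
 * a reason to return early. */
static int may_skip_write(size_t nels, enum access_mode mode)
{
    return nels == 0 && mode == ACCESS_INDEPENDENT;
}
```

With a guard like this, a processor with count = {0, 5} in collective mode would fall through to the write with an empty selection instead of returning early.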