Hi,

I thought I'd give netCDF 4.7.4 a try for compression with parallel I/O (using HDF5 1.10.7, pnetcdf 1.9.0, and netcdf-fortran-4.5.3) on a NOAA cluster. I've been using Intel 19 with mvapich2.3, which worked fine with earlier versions (4.3.something).

The problem is that it works fine on a single node, but I get various failures when trying to run a job that uses 2 or more nodes. It also fails if the I/O is not parallel (standard netCDF-4, where each process writes its data in turn). I have also compiled everything (including the cloud model code) using Intel MPI, which fails promptly with a segfault when it tries to run on 2 nodes. (Here, I am comparing 4 or 9 threads on a single node with 16 threads split across 2 nodes. If I force the 16-thread version to run on a single node, it runs fine.)

The problem seems to be reproducible with a simple write/read test adapted from ftst_parallel.F, so it does not seem to be specific to my model code. It fails with both pnetcdf and mpiio.

Any ideas what could be the issue here? I am stumped.

-- Ted