There used to be a problem with netcdf4 and openmpi (1.4.x) where netcdf4 was
assuming the behavior of mpich in setting MPI_ERR_COMM (or something else where
mpich assigned a fixed value (improperly) but openmpi did not). That got fixed,
but perhaps the problem has come back? Did you try openmpi 1.4.x?

-- Ted

On Feb 8, 2012, at 6:03 PM, Orion Poplawski wrote:

> I'm trying to build parallel-enabled netcdf 4.1.3 on Fedora 16 with hdf5
> 1.8.7 and with both mpich2 1.4.1p1 and openmpi 1.5.4. In running make check
> with the openmpi build I get:
>
> $ mpiexec -n 4 ./f90tst_parallel
> [orca.cora.nwra.com:32630] *** An error occurred in MPI_Comm_d
> [orca.cora.nwra.com:32630] *** on communicator MPI_COMM_WORLD
> [orca.cora.nwra.com:32630] *** MPI_ERR_COMM: invalid communicator
> [orca.cora.nwra.com:32630] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
> HDF5: infinite loop closing library
> D,T,AC,FD,P,FD,P,FD,P,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FDFD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,D,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,F,FD,FD,FD,FD,FD,FD,FD,FD,FD
>
> *** Testing netCDF-4 parallel I/O from Fortran 90.
> HDF5: infinite loop closing library
> D,T,AC,FD,P,FD,P,FD,P,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FDFD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,D,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,F,FD,FD,FD,FD,FD,FD,FD,FD,FD
> HDF5: infinite loop closing library
> D,T,AC,FD,P,FD,P,FD,P,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FDFD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,D,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,F,FD,FD,FD,FD,FD,FD,FD,FD,FD
> ------------------------------------------------------------------------
> mpiexec has exited due to process rank 2 with PID 32631 on
> node orca.cora.nwra.com exiting improperly. There are two reasons this could
> occur:
>
> 1. this process did not call "init" before exiting, but others in
> the job did. This can cause a job to hang indefinitely while it waits
> for all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
>
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination".
>
> This may have caused other processes in the application to be
> terminated by signals sent by mpiexec (as reported here).
> ------------------------------------------------------------------------
> [orca.cora.nwra.com:32628] 3 more processes have sent help message
> help-mpi-errors.txt / mpi_errors_are_fatal
> [orca.cora.nwra.com:32628] Set MCA parameter "orte_base_help_aggregate" to 0
> to see all help / error messages
>
> It appears to work fine with mpich2. Has anyone else come across this?
>
> Thanks,
>
>   Orion
>
> --
> Orion Poplawski
> Technical Manager                     303-415-9701 x222
> NWRA, Boulder Office                  FAX: 303-415-9702
> 3380 Mitchell Lane                    orion@xxxxxxxxxxxxx
> Boulder, CO 80301                     http://www.cora.nwra.com
>
> _______________________________________________
> netcdfgroup mailing list
> netcdfgroup@xxxxxxxxxxxxxxxx
> For list information or to unsubscribe, visit:
> http://www.unidata.ucar.edu/mailing_lists/
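To make the pitfall Ted describes concrete, below is a minimal, hypothetical C
sketch (not taken from netCDF or HDF5): it deliberately provokes MPI_ERR_COMM
and checks for it using only the symbolic constant and MPI_Error_class, never
assuming a particular numeric value or in-memory representation, since MPICH
defines MPI handles and constants as plain integers while Open MPI uses
pointers to opaque structures. The file name and build commands are
illustrative only.

    /*
     * err_comm_demo.c -- hypothetical sketch of portable MPI error checking.
     *
     * Build/run (typical MPI wrappers, assumed):
     *   mpicc err_comm_demo.c -o err_comm_demo
     *   mpiexec -n 1 ./err_comm_demo
     */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        /* By default MPI_ERRORS_ARE_FATAL aborts every rank, which is what
           the traceback above shows.  Ask for error codes to be returned
           instead.  Errors on an invalid communicator fall back to the
           handler attached to MPI_COMM_WORLD or MPI_COMM_SELF, depending on
           the MPI version, so set both. */
        MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
        MPI_Comm_set_errhandler(MPI_COMM_SELF, MPI_ERRORS_RETURN);

        /* Deliberately pass an invalid communicator to provoke MPI_ERR_COMM
           (assumes the MPI library performs argument checking). */
        int rank = -1;
        int rc = MPI_Comm_rank(MPI_COMM_NULL, &rank);

        /* Reduce the returned error code to its error class. */
        int err_class = MPI_SUCCESS;
        MPI_Error_class(rc, &err_class);

        /* Portable check: compare against the symbolic constant. */
        if (err_class == MPI_ERR_COMM)
            printf("Got MPI_ERR_COMM, as expected.\n");

        /* A non-portable check -- the kind of assumption that can work with
           one MPI implementation and break with another -- would compare rc
           against a hard-coded integer, or treat an MPI_Comm handle as if it
           were interchangeable with an int. */

        MPI_Finalize();
        return 0;
    }

Run under both mpich2 and openmpi builds, a sketch like this behaves the same
because it relies only on the symbolic names the MPI standard defines, which
is the portability property at issue in this thread.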