[netcdfgroup] parallel I/O testing with MVAPICH2

I am comparing parallel I/O performance between NetCDF-4 and direct
MPI-IO. I modified ~netcdf/nc_test4/tst_nc4perf.c to add parameter
scans for direct MPI-IO calls. The modified test works fine on the
discover platform using Intel MPI, but fails when using MVAPICH2-1.5.1p1.

Any ideas?

I'm using ...
NetCDF-4.1.1          2010-04-01
configured with
--enable-hdf4 --enable-dap --disable-shared --enable-f90
--enable-fortran --enable-netcdf-4
--with-max-default-cache-size=268435456 --disable-cxx --with-udunits
--enable-hdf4

HDF5-1.8.4-patch1 released on Tue Feb 23 11:31:09 CST 2010
configured with
--disable-shared --disable-cxx --enable-hl --enable-fortran
--disable-sharedlib-rpath --enable-parallel CFLAGS=-fPIC

Here is a trace.
>mpirun_rsh -hostfile $PBS_NODEFILE -np 1 tst_nc4perf_mv2.5760x2881x72

HDF5-DIAG: Error detected in HDF5 (1.8.4-patch1) MPI-process 0:
  #000: H5Dio.c line 266 in H5Dwrite(): can't write data
    major: Dataset
    minor: Write failed
  #001: H5Dio.c line 578 in H5D_write(): can't write data
    major: Dataset
    minor: Write failed
  #002: H5Dmpio.c line 552 in H5D_contig_collective_write(): couldn't finish shared collective MPI-IO
    major: Low-level I/O
    minor: Write failed
  #003: H5Dmpio.c line 1586 in H5D_inter_collective_io(): couldn't finish collective MPI-IO
    major: Low-level I/O
    minor: Can't get value
  #004: H5Dmpio.c line 1632 in H5D_final_collective_io(): optimized write failed
    major: Dataset
    minor: Write failed
  #005: H5Dmpio.c line 334 in H5D_mpio_select_write(): can't finish collective parallel write
    major: Low-level I/O
    minor: Write failed
  #006: H5Fio.c line 167 in H5F_block_write(): file write failed
    major: Low-level I/O
    minor: Write failed
  #007: H5FDint.c line 185 in H5FD_write(): driver write request failed
    major: Virtual File Layer
    minor: Write failed
  #008: H5FDmpio.c line 1726 in H5FD_mpio_write(): can't convert from size to size_i
    major: Internal error (too specific to document in detail)
    minor: Out of range


-- 
Dan Kokron
Global Modeling and Assimilation Office
NASA Goddard Space Flight Center
Greenbelt, MD 20771
Daniel.S.Kokron@xxxxxxxx
Phone: (301) 614-5192
Fax:   (301) 614-5304