Re: netcdf4 parallel IO

"David Stuebe" <dstuebe@xxxxxxxxxx> writes:

> Hi NETCDF folks
>
> I work on an unstructured finite volume coastal ocean model, FVCOM,
> which is parallel (using MPICH2). The read/write is a major slowdown
> for our large cases. On our cluster, we have one large storage device,
> an EMC RAID array. The network is InfiniBand - the network is much
> faster than the RAID array.
>
> For our model we need to read large initial condition data sets, and
> single frames of forcing data while running. We also need to write
> single frames of data for output (frequently), and large restart files
> (less frequently).
>
> I am considering two options for recoding the IO from the model. One
> is based around the future F90 netcdf 4 parallel interface, which would
> allow a symmetric code - every processor does the same thing. The other
> option is to use netcdf 3, let the master processor read/write the data
> and distribute it to each node - an asymmetric coding.
>
> What I need to know - are netcdf 4 parallel IO operations blocking?
>
> The problem - the order of cells and nodes in our data set does not
> allow for a simple start, count read format. A data array might have
> dimensions (time,layers,cells). As an example, in a 2 processor case
> with 8 cells, proc1 has cells (1 2 5 7) while proc2 has cells (3 4 6 8) -
> write operations would have to be in a do loop to write each cell
> individually from the processor that owns it.
>
> For a model with 300,000 cells on 30 processors, this would be 10,000
> calls to NF90_PUT_VAR on each processor. Even if the calls are
> non-blocking this seems dangerous.
>
> Any thoughts?
>
> David
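
The access pattern David describes can be illustrated with a small sketch. It is not FVCOM or netCDF code: the helper name `contiguous_runs` is hypothetical, and it only shows how a processor's owned cell indices could be grouped into (start, count) pairs, each of which would correspond to one write along the cells dimension. It assumes 1-based cell numbering, as in the example above.

```python
from typing import List, Tuple


def contiguous_runs(owned_cells: List[int]) -> List[Tuple[int, int]]:
    """Group cell indices into (start, count) runs of adjacent cells.

    Hypothetical helper for illustration only; each returned pair stands
    for one start/count style write along the cells dimension.
    """
    runs: List[Tuple[int, int]] = []
    for cell in sorted(set(owned_cells)):
        if runs and cell == runs[-1][0] + runs[-1][1]:
            # The cell directly follows the current run, so extend it.
            start, count = runs[-1]
            runs[-1] = (start, count + 1)
        else:
            # Gap in ownership: a new run (and a new write call) begins.
            runs.append((cell, 1))
    return runs


# The 2-processor, 8-cell example from the message above.
proc1 = contiguous_runs([1, 2, 5, 7])
proc2 = contiguous_runs([3, 4, 6, 8])
print(proc1)  # [(1, 2), (5, 1), (7, 1)]
print(proc2)  # [(3, 2), (6, 1), (8, 1)]
```

If no two cells owned by a processor are adjacent in the file, every run has count 1, which gives the worst case quoted above: 300,000 cells / 30 processors = 10,000 writes per processor. Any adjacency in the ownership pattern reduces that number.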

Howdy David!

Are you using unlimited dimensions for this test, and are you writing
data along them?

There was a bug in netCDF-4 which caused metadata to be written every
time a record variable was expanded along the unlimited
dimension. This would cause a slowdown of parallel I/O performance,
because blocking would occur on every write operation, as the metadata
were updated.

This is now fixed in the netCDF-4 snapshot:
http://www.unidata.ucar.edu/software/netcdf/builds/snapshot/index_4.html

Other than this bug, I believe that netCDF-4 will yield the same
performance as the underlying HDF5 API, so the comments of the HDF5
programmers are very relevant.

But before you test again, get the netCDF-4 snapshot to make sure it's
not the netCDF-4 metadata bug which was causing your problems.

Thanks!

Ed


-- 
Ed Hartnett  -- ed@xxxxxxxxxxxxxxxx
