diff --git a/TPLs_src/netcdf/src/libsrc4/nc4hdf.c b/TPLs_src/netcdf/src/libsrc4/nc4hdf.c
index 39f87df..c8696de 100644
--- a/TPLs_src/netcdf/src/libsrc4/nc4hdf.c
+++ b/TPLs_src/netcdf/src/libsrc4/nc4hdf.c
@@ -600,7 +604,7 @@ nc4_put_vara(NC_FILE_INFO_T *nc, int ncid, int varid, const size_t *startp,
     {
       if (!dim->unlimited)
         {
-          if (start[d2] >= (hssize_t)fdims[d2])
+          if (start[d2] > (hssize_t)fdims[d2])
             BAIL_QUIET(NC_EINVALCOORDS);
           if (start[d2] + count[d2] > fdims[d2])
             BAIL_QUIET(NC_EEDGE);
@@ -609,12 +613,6 @@ nc4_put_vara(NC_FILE_INFO_T *nc, int ncid, int varid, const size_t *startp,
         }
     }

-   /* A little quirk: if any of the count values are zero, then
-      return success and forget about it. */
-   for (d2 = 0; d2 < var->ndims; d2++)
-      if (count[d2] == 0)
-         goto exit;
-
    /* Now you would think that no one would be crazy enough to
       write a scalar dataspace with one of the array function
       calls, but you would be wrong. So let's check to see if the
       dataset is
--Greg

On 2/26/13 9:08 AM, Rob Latham wrote:
On Fri, Feb 22, 2013 at 01:45:44PM -0700, Dennis Heimbigner wrote:
> I recently rewrote nc_get/put_vars to no longer use varm, so it may
> be time to revisit this issue. What confuses me is that the varm code
> writes one instance of the variable on each pass (e.g. v[0], v[1],
> v[2], ...), so I am not sure how it could ever fail to write the same
> number of times on all processors. Can the original person (Rob?)
> give me more details?

I'm not the original person. Constantine Khroulev provided a nice test case last January (netcdf_parallel_2d.c). I just pulled netcdf4 from SVN (r2999) and built it with hdf5-1.8.10. Constantine's test case still hangs (though in a different place than a year ago): today it hangs with one process in a testcase-level barrier (netcdf_parallel_2d.c:134) and one process stuck in nc4_enddef_netcdf4_file trying to flush data. The test case demonstrates the problem nicely. Take a peek at it and double-check that it is correct, but you've got a nice driver to find and fix this bug.

==rob

=Dennis Heimbigner
 Unidata

Orion Poplawski wrote:
> On 01/27/2012 01:22 PM, Rob Latham wrote:
>> On Wed, Jan 25, 2012 at 10:06:59PM -0900, Constantine Khroulev wrote:
>>> Hello NetCDF developers,
>>>
>>> My apologies to list subscribers not interested in these (very)
>>> technical details.
>>
>> I'm interested! I hope you send more of these kinds of reports.
>>
>>> When the collective parallel access mode is selected, all processors
>>> in a communicator have to call H5Dread() (or H5Dwrite()) the same
>>> number of times. In nc_put_varm_*, NetCDF breaks data into
>>> contiguous segments that can be written one at a time (see
>>> NCDEFAULT_get_varm(...) in libdispatch/var.c, lines 479 and on). In
>>> some cases the number of these segments varies from one processor to
>>> the next. As a result, as soon as one of the processors in a
>>> communicator is done writing its data, the program locks up, because
>>> only a subset of the processors in the communicator are still
>>> calling H5Dwrite(). (Unless all processors happen to have the same
>>> number of "data segments" to write, that is.)
>>
>> Oh, that's definitely a bug. netcdf4 should call something like
>> MPI_Allreduce with MPI_MAX to figure out how many "rounds" of I/O
>> will be done (this is what we do inside ROMIO, for a slightly
>> different reason).
>>
>>> But here's the thing: I'm not sure this is worth fixing. The only
>>> reason to use collective I/O I can think of is better performance,
>>> and then avoiding sub-sampled and mapped reading and writing is a
>>> good idea anyway.
>>
>> Well, if varm and vars are the natural way to access the data, then
>> the library should do what it can to do that efficiently. The fix
>> appears to be straightforward. Collective I/O has real advantages on
>> some platforms: it can automatically select a subset of processors
>> or automatically construct a file access most closely suited to the
>> underlying file system.
>>
>> ==rob
>
> Was this ever fixed?