[netcdf-hdf] bug in MPI cleanup

Hi

I've got a small netcdf4 program that, when run, gives the error
"Attempting to use an MPI routine after finalizing MPICH".  This
initially had me puzzled, because the only thing I do after calling
MPI_Finalize() is 'return 0'.

Turns out, HDF5 hooks a routine into atexit(3) that cleans up MPI
structures.  Parts of this cleanup routine must not run after
MPI_Finalize.  Here's the backtrace at the point where the error
about using MPI routines after finalizing MPICH is printed:

#1  0x084bdcea in PMPI_Comm_free (comm=0x8711c20) at /home/robl/work/mpich2/src/mpi/comm/comm_free.c:73
#2  0x081b687f in H5FD_mpi_comm_info_free (comm=0x8711c20, info=0x8711c24) at ../../src/H5FDmpi.c:326
#3  0x081ba2f0 in H5FD_mpio_fapl_free (_fa=0x8711c20) at ../../src/H5FDmpio.c:870
#4  0x0819936b in H5FD_pl_close (driver_id=134217729, free_func=0x81ba113 <H5FD_mpio_fapl_free>, pl=0x8711c20) at ../../src/H5FD.c:625
#5  0x08199f4c in H5FD_fapl_close (driver_id=134217729, fapl=0x8711c20) at ../../src/H5FD.c:791
#6  0x0829154c in H5P_facc_close (fapl_id=167772177, close_data=0x0) at ../../src/H5Pfapl.c:431
#7  0x08279aca in H5P_close (_plist=0x87119c0) at ../../src/H5P.c:5370
#8  0x08202738 in H5I_clear_type (type=H5I_GENPROP_LST, force=0) at ../../src/H5I.c:604
#9  0x0826949d in H5P_term_interface () at ../../src/H5P.c:488
#10 0x080c416b in H5_term_library () at ../../src/H5.c:266
#11 0xb7d979d9 in exit () from /lib/tls/i686/cmov/libc.so.6
#12 0xb7d80ec4 in __libc_start_main () from /lib/tls/i686/cmov/libc.so.6
#13 0x08085171 in _start ()


Since my test program has already called MPI_Finalize by that point,
it's incorrect for the HDF5 library to call MPI_Comm_free afterwards.

I would advise not hooking MPI-IO cleanup into atexit(3): you already
correctly make the user call MPI_Init and MPI_Finalize.  Can hdf5
cleanup instead occur as part of nc_close?
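
To make the ordering problem concrete, here is a minimal sketch in plain C.
It deliberately uses no real MPI, HDF5, or netCDF calls: every name in it
(fake_finalize, fake_comm_free, library_cleanup, and so on) is a hypothetical
stand-in, and the "runtime" is just a flag.  It only models the two orderings
discussed above, cleanup after finalize versus cleanup at close time, and is
not meant to reflect how HDF5 is actually implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins: none of these are real MPI, HDF5, or netCDF calls. */
static int runtime_finalized = 0;  /* set once fake_finalize() has run */
static int library_open = 0;       /* library still holds a communicator */

/* Plays the role of MPI_Finalize(). */
void fake_finalize(void) { runtime_finalized = 1; }

/* Plays the role of MPI_Comm_free(): fails once the runtime is finalized. */
int fake_comm_free(void)
{
    if (runtime_finalized) {
        fprintf(stderr, "fake runtime: routine used after finalize\n");
        return -1;
    }
    return 0;
}

/* Plays the role of opening a file: the library now owns a communicator. */
void library_open_file(void) { library_open = 1; }

/* Plays the role of the library's cleanup; a no-op if nothing is held. */
int library_cleanup(void)
{
    if (!library_open)
        return 0;
    library_open = 0;
    return fake_comm_free();
}

/* What an atexit(3) hook would do: run cleanup when the process exits. */
void library_exit_hook(void) { (void)library_cleanup(); }

/* Register the hook, as a library might on first use.  Returns 0 on success. */
int library_register_exit_hook(void) { return atexit(library_exit_hook); }

/* Reset the model so each ordering can be tried in isolation. */
void reset_model(void) { runtime_finalized = 0; library_open = 0; }

/* Ordering from the backtrace: finalize first, cleanup runs afterwards. */
int cleanup_after_finalize(void)
{
    reset_model();
    library_open_file();
    fake_finalize();
    return library_cleanup();      /* too late: runtime already finalized */
}

/* Suggested ordering: cleanup happens at close time, before finalize. */
int cleanup_before_finalize(void)
{
    reset_model();
    library_open_file();
    int rc = library_cleanup();    /* as if done inside the close call */
    fake_finalize();
    int late = library_cleanup();  /* a later exit hook finds nothing to do */
    assert(late == 0);
    (void)late;
    return rc;
}
```

In this model cleanup_after_finalize() returns -1 and prints the complaint,
while cleanup_before_finalize() returns 0; an exit hook registered with
library_register_exit_hook() is harmless in the second case because nothing
is left for it to free.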

Thanks
==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
