Re: something startling I just noticed...

NOTE: The netcdf-hdf mailing list is no longer active. The list archives are made available for historical reasons.

Hi Russ,

> > >There are some advantages of sequence numbers over times:
> > >  - you don't have to worry about clock resolution and the possibility
> > >    that creation times of two objects are equal
> >     Hmm, we use the gettimeofday() routine, which returns values in
> > microseconds, so this probably would not be too much of an issue, but I
> > admit it certainly is possible.
> 
> We ran into just this problem on a skiplist implementation (for LDM
> not netCDF) that required a total ordering.  Time stamps worked most
> of the time, but if two events happened to get assigned the same
> microsecond clock tick, we lost track of one of the corresponding
> objects.  On old machines, we never saw the problem, but it bit us
> when we tried running on faster hardware.  We ended up adding what was
> essentially a sequence number to the timestamp to disambiguate
> matching microsecond clock times.
    Well, I hope that we can create objects in the file fast enough that having
only a microsecond resolution is a problem for HDF5 also... :-)

> > Hmm, I think there may be some issues with a creation sequence number also:
> >     - The "last number issued" will need to be stored in the file (unlike
> >         creation times).
> >     - Should it be local to the group, or global to the file? There are
> >         pros and cons to both:
> >             Global:
> >                 - Pro: One number to track for file
> >                 - Con: May have contention for updating this number in a
> >                     parallel environment.
> >                 - Con: Faster to roll over than a sequence number per group.
> >                 - Con: Sequence numbers in one group will have gaps, if
> >                     objects are created in other groups, which does not
> >                     imply objects were deleted in the group.
> > 
> >             Local:
> >                 - Pro: More consistent numbering within one group than a
> >                     sequence number per file.
> >                 - Con: May have contention for updating this number in a
> >                     parallel environment.
> >                 - Con: A new piece of metadata to update with every object
> >                     created in a group.
> > 
> > I guess I would tend toward a local (i.e. per group) sequence number.
> > How's that sit with people?
> 
> Good analysis of sequence number problems.  I agree with you, local
> seems to be adequate unless we chose to ignore Group semantics for the
> netCDF-4 interface and just treated the Group name as part of a global
> name for a netCDF-4 object.  In that case, local would be a problem,
> because two netCDF-4 objects that we wanted to iterate over in order
> could get the same sequence number.  Maybe this is an argument not to
> treat Groups as just part of the name.
    Yes, local sequence numbers cut both ways sometimes...  Since most (all?)
current netCDF users should be used to a 'flat' file, putting all the objects
in the root group of the file and using the creation order in that group seems
like a reasonable default.  Then, you could change the definition of the way the
creation order information is used for netCDF 4 users so that the group
structure was accounted for.
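
    To make the global vs. local tradeoff concrete, here is a rough sketch.
It is not HDF5 code; demo_file, demo_group, seq_counter and create_object are
hypothetical names, and persistence of the counters and parallel access are
ignored.

```c
#include <assert.h>
#include <stdint.h>

enum seq_scope { SEQ_GLOBAL, SEQ_LOCAL };

/* The "last number issued" state that would have to live in the file. */
typedef struct {
    uint64_t next_seq;
} seq_counter;

/* Hypothetical file with one file-wide counter. */
typedef struct {
    seq_counter file_wide;
} demo_file;

/* Hypothetical group with its own counter and a link back to its file. */
typedef struct {
    demo_file  *file;
    seq_counter local;
} demo_group;

/* "Create" an object in group g and return its creation sequence number,
 * drawn either from the file-wide counter or from the group's own. */
uint64_t create_object(demo_group *g, enum seq_scope scope)
{
    seq_counter *c = (scope == SEQ_GLOBAL) ? &g->file->file_wide : &g->local;
    assert(c->next_seq < UINT64_MAX);  /* rollover not handled in this sketch */
    return c->next_seq++;
}
```

    Creating objects alternately in groups A, B, A gives A the numbers 0 and 2
with the global counter (a gap, though nothing was deleted) and 0 and 1 with
local counters.  In the local case B's first object also gets 0, which is the
collision Russ mentions if groups are treated as just part of a global name.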
    BTW, I was looking through the netCDF 3 API for functions that take or
return an 'index' in the file and I can't find one.  Which function(s) apply
to this situation?

> For us, a different kind of local would also work: a set of sequence
> numbers for Datasets, for each Dataset's Attributes, and for shared
> dimension Scales.  But if you have other uses for time stamps or
> sequence numbers, our use shouldn't dictate the requirements, since
> anything that allows us to determine the creation order of netCDF
> variables, dimensions, and attributes would work.
    This is along lines that we've thought about for a long time: adding a
"live" index capability to HDF5 files, where every change to the file's metadata
(object creation, modification, deletion and attribute creation & deletion)
could update an index in the file in some way.  I think this is a great idea,
but I think it would be too much work at the current time. :-(

    Quincey
