Re: netCDF default fill (missing_value)

To: pack@xxxxxxxxxxxx
Subject: Re: netCDF default fill (missing_value)
From: davis
Date: Mon, 24 Jun 1991 11:56:22 -0600
> From: pack@xxxxxxxxxxxx (Daniel Packman)
> To: davis@xxxxxxxxxxxxx
> Subject: netCDF default fill (missing_value)
> Cc: bailey@xxxxxxxxxxxx, cacraig@xxxxxxxxxxxx, chucks@xxxxxxxxxxxx
> 
> Possibly a mistake on my part, but...
> 
> It seems that the value FILFLOAT and FILDFLOAT used to fill uninitialized
> floats and doubles are set to XDRFINF and XDRDINF by default.  Other
> variable types are set to "constants" such as FILBYTE = 128 and
> FILSHORT = 32768.  By setting fill values, by default, to true
> constants, programs can check the not-so-carefully produced datasets
> for missing values without there being an explicit :missing_value
> defined.
> 
> The problem is that FILFLOAT/FILDFLOAT are not the same on different
> machines.  On the vax, for example:
>       parameter(XDRDINF = 1.7014118346046923e+38)
>       parameter(XDRFINF = 2.93873588e-39)
> 
> While on suns/ibms:
>       parameter(XDRDINF = 1.797693134862315900e+308)
>       parameter(XDRFINF = 3.40282357e+38)
> 
> There are a couple of issues involved here.  One, is the internal storage
> in the netCDF file.  I would like to see missing values for floats/doubles
> stored internally as IEEE proscribed values.  On creation, the netCDF package
> would have a default :missing_value and possibly a user defined :missing_value
> that would map input data equal to this to the specific IEEE Not a number
> value used.
> 
> The second issue is how these not_a_number values are mapped on output.
> Just as in input, there would be a default (optionally user defined)
> :missing_value that the internal illegal values would be mapped to.
> 
> Using these mappings, the user program on any machine can properly set/test
> missing values.  In the current scheme, it seems that there is no such
> mapping but rather a direct copy of the value.  That is, default missing
> values cannot be tested at present unless the source machine is also known.
> 
> Another issue is how to deal with IEEE proscribed values.  On the vax
> and possibly other platforms, doing any mathematical operation on these
> values (eg: a simple compare or printing the value out) results in an
> illegal instruction and the program bombs.  Mapping these values to known
> but legal values (such as the present defaults) makes the user programs
> a bit more straightforward.
> 

To recapitulate what you are saying, the problem in choosing the
default fill values is that there are conflicting requirements:

        One wants them to be outside of or at the boundary of the
        range of the datatype so as not to be infringing on the
        user's space of possible data values.

        They need to be valid numbers on all platforms: e.g., comparisons
        and the like shouldn't cause core dumps or math exceptions.

We got around this by making them different numbers on different platforms.
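
The second requirement matters even on IEEE hardware where nothing traps.
A minimal sketch, in plain Python with no netCDF calls, of why a
not-a-number fill value cannot be found by the usual equality test. The
function names and the sample data are made up for illustration:

```python
import math

def count_missing_by_equality(values, fill):
    """Count entries equal to the fill value (the usual 'v == fill' test)."""
    return sum(1 for v in values if v == fill)

def count_missing_nan(values):
    """Count entries that are NaN, using an explicit NaN test."""
    return sum(1 for v in values if math.isnan(v))

# Hypothetical data with two "missing" entries marked by NaN.
data = [1.0, float("nan"), 2.5, float("nan")]

# NaN never compares equal to anything, itself included, so the equality
# test finds nothing; only the explicit isnan() test sees the gaps.
by_equality = count_missing_by_equality(data, float("nan"))
by_isnan = count_missing_nan(data)
```

An ordinary finite fill value can be tested with a plain comparison, which
is part of the appeal of keeping fill values legal numbers.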

On the VAX, the XDR implementation maps the VAX native extremal
floating-point values to IEEE infinity and back. I would argue that this
is the right thing to do.  However, boundary behavior like this is not
defined by the XDR spec, so we can't count on it.
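
To make the mapping described above concrete, here is a rough sketch of
the idea, not the actual XDR code. It assumes a native format whose
largest finite double is the VAX value quoted earlier in this thread; the
names NATIVE_MAX, encode_xdr_double and decode_xdr_double are
hypothetical. The ">d" format is a big-endian IEEE double, which is the
layout XDR uses for doubles on the wire:

```python
import math
import struct

# Assumed largest finite value of the native (non-IEEE) double format,
# taken from the VAX figure quoted earlier in the thread.
NATIVE_MAX = 1.7014118346046923e+38

def encode_xdr_double(value, native_max=NATIVE_MAX):
    """Encode as a big-endian IEEE double, mapping native extremes to infinity."""
    if value >= native_max:
        value = math.inf
    elif value <= -native_max:
        value = -math.inf
    return struct.pack(">d", value)

def decode_xdr_double(data, native_max=NATIVE_MAX):
    """Decode a big-endian IEEE double, mapping infinity back to the native extreme."""
    (value,) = struct.unpack(">d", data)
    if math.isinf(value):
        return math.copysign(native_max, value)
    return value

# Round trip: the native extreme travels as IEEE infinity and comes back
# as the native extreme; ordinary values pass through untouched.
wire = encode_xdr_double(NATIVE_MAX)
round_tripped = decode_xdr_double(wire)
```

A reader on an IEEE machine that does no such mapping would see infinity
rather than its own largest finite value, which is why the default fill
cannot be tested portably.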

The moral is: don't use the default fill values.

Instead, set the variable-specific attribute _FillValue to a value
that is appropriate for your data. This is new with netCDF version 1.11
and is documented on page 20 of the User's Guide:

* _FillValue
If a scalar attribute with this name is defined for a variable and is of
the same type as the variable, it will be subsequently used as the
fill value for that variable.  The purpose of this is to save the
applications programmer the work of prefilling the data and also to
eliminate the duplicate writes that result from netCDF filling in
missing data with its default fill value, only to be immediately
overwritten by the programmer's preferred value.  This value is
considered to be a special value that indicates missing data, and is
returned when reading values that were not written.  The missing value
should be outside the range specified by valid_range for a
variable.  It is not necessary to define your own _FillValue
attribute for a variable if the default fill value for the type of
the variable is adequate.  Note that if you change the value of this
attribute, the changed value only applies to subsequent writes;
previously written data are not changed.  See Section 7.9 [Missing values],
page 95, for more information.
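
As a rough illustration of the semantics described in that excerpt, the
sketch below models a prefilled variable with a plain Python list. It
does not call the netCDF library, and FILL, VALID_RANGE and the helper
names are hypothetical choices for the example:

```python
# Hypothetical choices: temperatures in degrees C, with a fill value
# deliberately outside the valid range so it cannot collide with real data.
VALID_RANGE = (-90.0, 60.0)
FILL = -9999.0

def prefill(n, fill=FILL):
    """Model a new variable of n values, every one set to the fill value."""
    return [fill] * n

def is_missing(value, fill=FILL):
    """A value is missing if it still holds the fill value."""
    return value == fill

def mean_of_present(values, fill=FILL):
    """Average only the values that were actually written."""
    present = [v for v in values if not is_missing(v, fill)]
    return sum(present) / len(present) if present else None

# Write two of five slots; the other three read back as the fill value.
var = prefill(5)
var[1] = 10.0
var[3] = 20.0
missing_mask = [is_missing(v) for v in var]
average = mean_of_present(var)
```

Because the fill value is an explicit, finite number chosen by the
writer, the same test works on any machine that reads the file.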

-glenn