> From: pack@xxxxxxxxxxxx (Daniel Packman)
> To: davis@xxxxxxxxxxxxx
> Subject: netCDF default fill (missing_value)
> Cc: bailey@xxxxxxxxxxxx, cacraig@xxxxxxxxxxxx, chucks@xxxxxxxxxxxx
>
> Possibly a mistake on my part, but...
>
> It seems that the value FILFLOAT and FILDFLOAT used to fill uninitialized
> floats and doubles are set to XDRFINF and XDRDINF by default. Other
> variable types are set to "constants" such as FILBYTE = 128 and
> FILSHORT = 32768. By setting fill values, by default, to true
> constants, programs can check the not-so-carefully produced datasets
> for missing values without there being an explicit :missing_value
> defined.
>
> The problem is that FILFLOAT/FILDFLOAT are not the same on different
> machines. On the vax, for example:
> parameter(XDRDINF = 1.7014118346046923e+38)
> parameter(XDRFINF = 2.93873588e-39)
>
> While on suns/ibms:
> parameter(XDRDINF = 1.797693134862315900e+308)
> parameter(XDRFINF = 3.40282357e+38)
>
> There are a couple of issues involved here. One, is the internal storage
> in the netCDF file. I would like to see missing values for floats/doubles
> stored internally as IEEE proscribed values. On creation, the netCDF package
> would have a default :missing_value and possibly a user defined :missing_value
> that would map input data equal to this to the specific IEEE Not a number
> value used.
>
> The second issue is how these not_a_number values are mapped on output.
> Just as in input, there would be a default (optionally user defined)
> :missing_value that the internal illegal values would be mapped to.
>
> Using these mappings, the user program on any machine can properly set/test
> missing values. In the current scheme, it seems that there is no such
> mapping but rather a direct copy of the value. That is, default missing
> values cannot be tested at present unless the source machine is also known.
>
> Another issue is how to deal with IEEE proscribed values. On the vax
> and possibly other platforms, doing any mathematical operation on these
> values (eg: a simple compare or printing the value out) results in an
> illegal instruction and the program bombs. Mapping these values to known
> but legal values (such as the present defaults) makes the user programs
> a bit more straightforward.
>

To recapitulate what you are saying, the problem in choosing the default
fill values is that there are conflicting requirements:

    One wants them to be outside of or at the boundary of the range of
    the datatype, so as not to infringe on the user's space of possible
    data values.

    They need to be valid numbers on all platforms: e.g., comparisons and
    such shouldn't cause core dumps or math exceptions.

We got around this by making them different numbers on different
platforms. On the VAX, the XDR implementation maps the VAX native extremal
floating point values to IEEE infinity and back. I would argue that this
is the right thing to do. However, boundary behaviors such as these are
not defined by the XDR spec, so we can't count on them.

The moral is: don't use the default fill values. Instead, set the
variable-specific attribute _FillValue to a value which is appropriate for
your data. This is new with netCDF version 1.11 and documented on page 20
of the User's Guide:

* _FillValue
    If a scalar attribute with this name is defined for a variable and is
    of the same type as the variable, it will be subsequently used as the
    @emph{fill value} for that variable. The purpose of this is to save
    the applications programmer the work of prefilling the data and also
    to eliminate the duplicate writes that result from netCDF filling in
    missing data with its default fill value, only to be immediately
    overwritten by the programmer's preferred value. This value is
    considered to be a special value that indicates missing data, and is
    returned when reading values that were not written. The missing value
    should be outside the range specified by @code{valid_range} for a
    variable. It is not necessary to define your own @code{_FillValue}
    attribute for a variable if the default @dfn{fill value} for the type
    of the variable is adequate. Note that if you change the value of this
    attribute, the changed value only applies to subsequent writes;
    previously written data are not changed. See Section 7.9 [Missing
    values], page 95, for more information.

-glenn
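
The difference between relying on the default fill and recording an explicit _FillValue can be sketched with a toy model. This is only an illustration, not netCDF code: plain dicts stand in for variables, the helpers make_variable and missing_indices are hypothetical, and the model assumes raw values are copied with no mapping between platforms, as the quoted message supposes. The two default fill constants are the XDRFINF values quoted above; the explicit fill of -999.0 is an arbitrary choice assumed to lie outside the data's valid range.

```python
# Toy model only: plain dicts stand in for netCDF variables.
# Nothing here is netCDF API; all helper names are hypothetical.

# Default float fill values per platform, as quoted in the message above.
DEFAULT_FLOAT_FILL = {
    "vax": 2.93873588e-39,      # XDRFINF on the VAX
    "sun_ibm": 3.40282357e+38,  # XDRFINF on Suns/IBMs
}


def make_variable(length, written, platform, fill_value=None):
    """Build a hypothetical variable with `length` slots.

    `written` maps index -> value; every other slot is prefilled.
    If `fill_value` is given, it is recorded as the `_FillValue`
    attribute. Otherwise the writer's platform default is used and
    nothing is recorded with the variable.
    """
    attrs = {}
    if fill_value is None:
        fill = DEFAULT_FLOAT_FILL[platform]
    else:
        fill = fill_value
        attrs["_FillValue"] = fill_value
    data = [fill] * length
    for index, value in written.items():
        data[index] = value
    return {"attrs": attrs, "data": data}


def missing_indices(variable, reader_platform):
    """Return the indices a reader on `reader_platform` treats as missing."""
    fill = variable["attrs"].get(
        "_FillValue", DEFAULT_FLOAT_FILL[reader_platform]
    )
    return [i for i, v in enumerate(variable["data"]) if v == fill]


written = {0: 271.5, 3: 280.25}

# Written on a VAX with the default fill, read on a Sun:
# the reader tests against the wrong constant and finds no gaps.
implicit = make_variable(5, written, platform="vax")
print(missing_indices(implicit, reader_platform="sun_ibm"))  # []

# Same data with an explicit _FillValue: any reader finds the same gaps.
explicit = make_variable(5, written, platform="vax", fill_value=-999.0)
print(missing_indices(explicit, reader_platform="sun_ibm"))  # [1, 2, 4]
```

In this model the explicit attribute travels with the variable, so the reader never needs to know which machine wrote the data.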