In the ideal world Unidata will do the packing in such a way that it is
essentially invisible to the user. I know they have wanted to do it for
many years. I don't know how fast it is actually going to happen.

In the interim I have considered a kludge, which would require an extra
set of agreed-upon conventions, and maybe a wrapper to the Unidata netCDF
API.

Imagine a 4-D array of floats T(x,y,z,t). We define a set of scale factors
and offsets

    float T-scale(z,t) and T-offset(z,t).

The values of the factors are chosen so that an array

    short T-short(x,y,z,t)

can be defined so that

    T-approx(x,y,z,t) = T-short(x,y,z,t)*T-scale(*,*,z,t) + T-offset(*,*,z,t)

T-approx is an approximation to the original array T.

It is trivial to produce a netCDF file storing T-short, T-scale, and
T-offset, and the file will be about a factor of two smaller than one
storing the original array T. If one used a byte array it would be a
factor of 4 smaller.

It is obviously not an optimal setup, because it doesn't support arbitrary
bit lengths, but it would frequently result in dramatic reductions in the
file length.

The scale and offset arrays can be determined to optimize the scalability
of the data. E.g. I have chosen here to define a set of scalings that are
relevant to data that has similar characteristics on x,y surfaces.

The main problem is we would need to establish a convention to be able to
share within the community, and maybe a set of useful wrappers so that
each programmer would not need to do conformance checking, etc.

There may be a better way to do this kind of thing, and I am open to
suggestion. I toss this out to stimulate some discussion.

Phil

On Thu, Oct 03, 2002 at 07:57:34AM -0600, John Caron wrote:
> We have thought a lot about packing in netCDF and considered various
> schemes on how to do it. Variable length compression schemes would
> probably interfere with efficient subsetting, but it seems to me that fixed
> bit size packing should be doable.
> We have this on our wish list, I mean
> to-do list, for the next round of netCDF development.
>
> Since COARDS is lat/lon only, you can't do arbitrary GRIB conversion. CF
> seems like a good candidate, however. Has anyone investigated the
> feasibility of this?
>
> gribtonc uses NUWG, and has various limitations. We are interested in
> possibly upgrading gribtonc. If anyone can make use of more flexible GRIB
> to netCDF conversion, I'd like to hear about your "use-case".
>
> I think in order to correctly group the GRIB records into 3 or 4
> dimensional netCDF variables, you will need (for the general case) some
> sort of configuration info for the converter, although I suppose a "common
> case" could be assumed. Anyone have any thoughts on that?
>
>
> Timothy Hume wrote:
>
> >Hi,
> >
> >This discussion reminded me of how GRIB packs data. Ideally, it would be
> >nice for NetCDF to be able to handle data with an arbitrary number of
> >bits. Many meteorological data can be packed into only 9 or 10 bits (often
> >less), so packing them into 16 bit short integers is "wasteful". Aside
> >from that many satellite data are "naturally" 10 bit, and increasing them
> >to 16 bits can cause the file size to increase by tens of megabytes per
> >image.
> >
> >By the way, does anyone know of software that can convert GRIB data to
> >COARDS or CF conventions? gribtonc converts GRIB to NUWG conventions?
> >
> >Tim Hume
> >
> >By the way, does anyone know of a GRIB to COARDS or CF
> >
> >On Wed, 2 Oct 2002, Mark A Ohrenschall wrote:
> >
> >>Hello,
> >>
> >>In the case of a packed variable (in which scale_factor and add_offset
> >>are used) both the COARDS and CF conventions indicate that missing_value
> >>and _FillValue should be likewise packed:
> >>
> >>COARDS: "In cases where the data variable is packed via the scale_value
> >>attribute this implies that the missing_value flag is likewise packed."
> >>CF: "The missing values of a variable with scale_factor and/or
> >>add_offset attributes (see section 8.1) are interpreted relative to the
> >>variable's external values, i.e., the values stored in the netCDF file."
> >>
> >>I'm assuming that for the sake of consistency, this means that all
> >>statistical variable attributes should be packed as well, e.g.,
> >>valid_range and actual_range, as well as mean and standard_deviation. Is
> >>this true?
> >>
> >>So for example, if I have real world data values for temperature between
> >>-1.6 and 31.4 and I'm applying a scale_factor of 0.1 then I would say
> >>the valid_range is -16, 314 and the mean is 116 (not 11.6)?
> >>
> >>Thanks,
> >>
> >>Mark
> >>
> >
>

--
Phil Rasch, Climate Modeling Section, National Center for Atmospheric Research
Mail --> P.O. Box 3000, Boulder CO 80307
Shipping --> 1850 Table Mesa Dr, Boulder, CO 80305
email: pjr@xxxxxxxx, Web: http://www.cgd.ucar.edu/cms/pjr
Phone: 303-497-1368, FAX: 303-497-1324

Date: Thu, 3 Oct 2002 08:56:41 -0700 (PDT)
From: Charlie Zender <zender@xxxxxxx>
To: netCDF Mailing Group <netcdfgroup@xxxxxxxxxxxxxxxx>
Subject: packed data in NCO

Hi,

For what it's worth, recent versions of NCO (http://nco.sf.net) support
interpreting packed data to the following extent:

All arithmetic operators (claim to) support packed data. This means
multiple packed data files can be, e.g., easily averaged.

ncap can read and write packed data, i.e., it will pack it for you.
The read functions support packing into any type, so data packed into
NC_CHAR (8 bits) should work fine.

I hope people will exercise the packing/unpacking functionality and let me
know how it works for them.

Charlie
--
Charlie Zender, zender at uci dot edu, (949) 824-2987, Department of
Earth System Science, University of California, Irvine CA 92697-3100
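
For readers following the thread, the slab-wise scheme Phil describes (one
scale factor and one offset per z,t level, applied to every point of the
x,y surface) might look roughly like the sketch below. It is only an
illustration under stated assumptions: the function names, the choice to
centre the offset on the slab's midpoint, and the decision to leave -32768
unused are all made up here and are not part of the netCDF API or of any
agreed convention.

```python
# Sketch of slab-wise packing: one scale/offset pair per (z, t) slab.
# pack_slab, unpack_slab, SHORT_MIN and SHORT_MAX are hypothetical names,
# not part of the netCDF library or of COARDS/CF.

SHORT_MIN, SHORT_MAX = -32767, 32767  # assumption: keep -32768 spare for a fill value


def pack_slab(values):
    """Pack one x,y slab of floats into 16-bit integer values.

    Returns (packed, scale, offset) such that
    value ~= packed * scale + offset for every point in the slab.
    """
    lo, hi = min(values), max(values)
    offset = (hi + lo) / 2.0
    # A constant slab has zero range; fall back to a scale of 1.0.
    scale = (hi - lo) / (SHORT_MAX - SHORT_MIN) or 1.0
    packed = []
    for v in values:
        p = int(round((v - offset) / scale))
        # Clamp to guard against floating-point overshoot at the extremes.
        packed.append(max(SHORT_MIN, min(SHORT_MAX, p)))
    return packed, scale, offset


def unpack_slab(packed, scale, offset):
    """Recover the approximate floats (Phil's T-approx) for one slab."""
    return [p * scale + offset for p in packed]


# T[(z, t)] holds the flattened x,y slab for that level and time.
T = {
    (0, 0): [271.3, 280.1, 295.7, 301.2],
    (1, 0): [210.4, 215.9, 220.0, 218.6],
}

T_short, T_scale, T_offset = {}, {}, {}
for key, slab in T.items():
    T_short[key], T_scale[key], T_offset[key] = pack_slab(slab)

for key, slab in T.items():
    approx = unpack_slab(T_short[key], T_scale[key], T_offset[key])
    worst = max(abs(a - b) for a, b in zip(approx, slab))
    # Rounding to the nearest integer bounds the error by half a step.
    assert worst <= T_scale[key] / 2 + 1e-9
```

Because each slab gets its own scale, a level with a small range of values
keeps more precision than it would under a single file-wide scale_factor,
which is the point of choosing the scalings per x,y surface. In a real file
the three dictionaries would become the variables T-short(x,y,z,t),
T-scale(z,t) and T-offset(z,t).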