[netcdfgroup] netCDF Operators NCO version 4.4.7 are ready

The netCDF Operators NCO version 4.4.7 are ready.

http://nco.sf.net (Homepage)
http://dust.ess.uci.edu/nco (Homepage "mirror")

This stabilization release also contains useful new features that
assist users with chunking policies, hyperslabbing, and creating
record variables. First, hyperslab specifications now use
Python-consistent indexing (negative indices) to indicate the last
elements of a dimension. Second, chunking behavior has been revamped.
Overall the chunking algorithms are much more robust and optimized.
See below for the user-visible changes. Of note is the new balanced
chunking map named for Russ Rew. Finally, ncap2 can create record
variables.

Work on NCO 4.4.8 is underway, still focused on stability and speed.
This includes more netCDF4 mop-up and new chunking features.

Enjoy,
Charlie

NEW FEATURES (full details always in ChangeLog):

A. Hyperslab specifications now accept -1 (negative one) to indicate
   the last element of a dimension. This is consistent with Python
   syntax, unlike NCO's previous behavior (from 20120709--20141001)
   where negative integers indicated the number of elements from the
   last (-1 was penultimate, -0 was ultimate). Now -1 is last element,
   -2 is penultimate element, and -N is the first element. NCO warns
   users who may expect the old behavior.
   ncks -d lon,-1 in.nc out.nc # last longitude
   ncks -d lon,-3,-1 in.nc out.nc # last three longitudes
   http://nco.sf.net/nco.html#hyp
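
   The difference between the two conventions can be sketched in a few
   lines of Python. This is only an illustration, not NCO's code: the
   function names are hypothetical, and it assumes NCO's default
   zero-based (C-style) indexing. Because Python cannot distinguish -0
   from 0, the old convention takes the magnitude that followed the
   minus sign.

```python
def new_index(idx: int, dim_size: int) -> int:
    """Zero-based offset under the 4.4.7 convention (hypothetical helper).

    Negative indices count back from the end, as in Python: -1 is the
    last element and -dim_size is the first.
    """
    return dim_size + idx if idx < 0 else idx


def old_index(count_from_last: int, dim_size: int) -> int:
    """Zero-based offset under the previous convention (hypothetical helper).

    The number after the minus sign was the distance from the last
    element, so "-0" was the last element and "-1" the penultimate.
    """
    return dim_size - 1 - count_from_last


if __name__ == "__main__":
    size = 10
    print(new_index(-1, size), old_index(1, size))
```

   For a dimension of size 10, "-1" used to select offset 8 and now
   selects offset 9, which is why NCO warns users who may expect the
   old behavior.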

B. ncap2 creates record dimension when dimension size is negative:
   ncap2 -s 'defdim("time",10)'  in.nc out.nc # Fixed-size dimension
   ncap2 -s 'defdim("time",-10)' in.nc out.nc # Record dimension
   Formerly ncap2 had no way to create new record dimensions.
   http://nco.sf.net/nco.html#ncap2

C. A new chunking map, --cnk_map=rew, implements a chunking strategy
   that balances common 1-D (e.g., timeseries) and 2-D (e.g.,
   geographic slices) access patterns to 3-D (time, lat, lon) data.
   Named in honor of its (and netCDF's) developer Russ Rew:
   http://www.unidata.ucar.edu/blogs/developer/en/entry/chunking_data_choosing_shapes
   For now, this map only applies to 3-D variables. We hope to include
   a version generalized to N-D variables in the next release.
   ncks --cnk_map=rew in.nc out.nc # Use Rew's balanced chunking
   http://nco.sf.net/nco.html#cnk

D. A new chunking map, --cnk_map=nc4, implements the netCDF4 default
   chunking map as implemented in the netCDF library used to build NCO.
   Formerly, NCO had no way to explicitly resort to netCDF4 defaults.
   (netCDF4 and NCO chunking defaults differ).
   This option allows users to change the chunking of a dataset to
   what it would be if it were created from scratch by ncgen without
   any specific chunking options.
   ncks --cnk_map=nc4 in.nc out.nc # Use netCDF4 default chunking
   http://nco.sf.net/nco.html#cnk

E. NCO's default chunking policy now preserves existing chunking.
   Previously NCO would use its favorite chunking policy (which
   variables to chunk) and map (how to chunk those variables) on all
   input files unless the user explicitly specified other chunking
   options. Thus a chain of processing commands had to explicitly
   specify chunking parameters in each command; otherwise the
   chunksizes could be reset to a shape not optimal for the
   anticipated access patterns. Now once a file is optimally chunked,
   no further NCO operations on that file need specify chunking
   options (since they are preserved by default).
   ncks in.nc out.nc # Defaults to --cnk_plc=xst, --cnk_map=xst
   ncks --cnk_map=xst --cnk_plc=xst in.nc out.nc # Same effect
   http://nco.sf.net/nco.html#cnk

F. The minimum size of variables to chunk may now be specified with
   --cnk_min=var_sz, where var_sz is the minimum size in bytes (not
   elements) of variables to chunk. This threshold is intended to
   restrict use of chunking to variables for which it is efficient.
   By default this minimum variable size for chunking is 8192 B.
   Formerly, NCO would chunk all arrays of any size.
   To obtain that behavior now users must specify --cnk_min=1, so
   that arrays of any size will be chunked.
   To obtain a system-dependent minimum size, set --cnk_min=0 and then
   NCO will compute the minimum size as twice the system blocksize
   (when available) and 8192 B otherwise.
   ncks in.nc out.nc # Minimum variable size to chunk is 8192 B
   ncks --cnk_min=1 in.nc out.nc # Minimum size is 1 B
   ncks --cnk_min=0 in.nc out.nc # Minimum size is twice blocksize
   http://nco.sf.net/nco.html#cnk
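
   The threshold rules above can be summarized as a small decision
   function. This is a sketch under stated assumptions, not NCO's
   implementation: the function names are hypothetical, the blocksize
   is passed in rather than queried from the system, and a variable
   whose size exactly equals the threshold is assumed to be chunked.

```python
DEFAULT_CNK_MIN = 8192  # bytes, the default stated in the announcement


def effective_cnk_min(cnk_min=None, blocksize=None):
    """Resolve the --cnk_min setting to a threshold in bytes (hypothetical helper).

    cnk_min=None models the option being absent; cnk_min=0 requests the
    system-dependent value, twice the blocksize when one is known.
    """
    if cnk_min is None:
        return DEFAULT_CNK_MIN
    if cnk_min == 0:
        return 2 * blocksize if blocksize else DEFAULT_CNK_MIN
    return cnk_min


def is_chunked(n_elements, bytes_per_element, cnk_min=None, blocksize=None):
    """True if a variable is large enough to chunk (assumes >= comparison)."""
    var_sz = n_elements * bytes_per_element  # size in bytes, not elements
    return var_sz >= effective_cnk_min(cnk_min, blocksize)


if __name__ == "__main__":
    # 1000 four-byte floats occupy 4000 B, below the default threshold
    print(is_chunked(1000, 4), is_chunked(1000, 4, cnk_min=1))
```

   Note that the comparison uses bytes: 1000 four-byte values (4000 B)
   fall below the default threshold even though the element count looks
   large.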

BUG FIXES:

A. ncrename has known problems when renaming netCDF4 coordinates.
   These problems will go away when Unidata issues a fix to the
   underlying netCDF library. Thanks to Parker Norton for reporting.
   http://nco.sf.net#bug_nc4_rename

B. Auxiliary coordinate hyperslabbing (with -X) was inadvertently
   turned off for 1-D variables. Now it works fine with 1-D vars.
   Fixed by Pedro Vicente.
   Moreover, files without a "Conventions=CF-1.X" attribute are now
   treated as if they had the attribute to exploit -X functionality.

C. Auxiliary coordinate hyperslabbing (with -X) was broken for
   variables with non-coordinate dimensions. Fixed by Pedro Vicente.
   Thanks to romzie04 for alerting us to these -X issues.

D. Fixed ncks bug when built on netCDF < 4.3.1 due to error in
   handling NULL arguments to nco_inq_format_extended() stub.
   Also fixed printing problem with strncpy().
   Only known to affect compilers with strict initialization, like clang.

KNOWN PROBLEMS DUE TO NCO:

   This section of ANNOUNCE reports and reminds users of the
   existence and severity of known, not yet fixed, problems.
   These problems occur with NCO 4.4.7 built/tested with netCDF
   4.3.3-rc2 (20141112) on top of HDF5 hdf5-1.8.13 with:

   cd ~/nco;./configure --enable-netcdf4  # Configure mechanism -or-
   cd ~/nco/bld;make dir;make allinone # Old Makefile mechanism

A. NOT YET FIXED (NCO problem)
   Correctly read arrays of NC_STRING with embedded delimiters in
   ncatted arguments

   Demonstration:
ncatted -D 5 -O -a new_string_att,att_var,c,sng,"list","of","str,ings" ~/nco/data/in_4.nc ~/foo.nc
   ncks -m -C -v att_var ~/foo.nc

   20130724: Verified problem still exists
   TODO nco1102
   Cause: NCO parsing of ncatted arguments is not sophisticated
   enough to handle arrays of NC_STRING with embedded delimiters.
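
   One way to see why embedded delimiters are hard is to follow the
   demonstration argument through the shell. The Python sketch below
   is an illustration only: it uses shlex to mimic POSIX shell quote
   removal and then splits on commas naively. The splitting step is an
   assumption for illustration, not NCO's actual parser, and the file
   names are placeholders.

```python
import shlex

# The demonstration command, with placeholder file names
command = 'ncatted -a new_string_att,att_var,c,sng,"list","of","str,ings" in.nc out.nc'

# POSIX quote removal: the program never sees the double quotes
argv = shlex.split(command)
att_arg = argv[2]

# Naive parse: four leading fields (name, variable, mode, type), then values
fields = att_arg.split(",", 4)
values = fields[4].split(",")

if __name__ == "__main__":
    print(att_arg)
    print(values)
```

   Once the quotes are gone, the comma inside "str,ings" cannot be told
   apart from the commas that separate array elements, so the intended
   three strings come out as four.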

B. NOT YET FIXED (NCO problem?)
   ncra/ncrcat (not ncks) hyperslabbing can fail on variables with
   multiple record dimensions

   Demonstration:
   ncrcat -O -d time,0 ~/nco/data/mrd.nc ~/foo.nc

   20140826: Verified problem still exists
   20140619: Problem reported by rmla
   Cause: Unsure. Maybe ncra.c loop structure not amenable to MRD?
   Workaround: Convert to fixed dimensions then hyperslab

KNOWN PROBLEMS DUE TO BASE LIBRARIES/PROTOCOLS:

A. NOT YET FIXED (netCDF4 problem)
   Renaming netCDF4 coordinate variables or dimensions "succeeds" but
   corrupts (sets to _FillValue) values in the output dataset.
   Full description here http://nco.sf.net#bug_nc4_rename

   Demonstration with netCDF <= 4.3.2:
   ncrename -O -v time,newrec ~/nco/data/in_grp.nc ~/foo.nc
   ncks --cdl -g // -v newrec -d time,0 -C ~/foo.nc

   20141007: Problem reported by Parker Norton
   20141008: Problem reported to Unidata
   20141010: Verified by Unidata.
   20141112: Verified problem still exists
   Bug tracking: https://www.unidata.ucar.edu/jira/browse/NCF-177
   Workaround: Convert to netCDF3, rename, convert back to netCDF4

B. NOT YET FIXED (netCDF4 or HDF5 problem?)
   Specifying strided hyperslab on large netCDF4 datasets leads
   to slowdown or failure with recent netCDF versions.

   Demonstration with NCO <= 4.4.5:
   time ncks -O -d time,0,,12 ~/ET_2000-01_2001-12.nc ~/foo.nc
   Demonstration with NCL:
   time ncl < ~/nco/data/ncl.ncl
   20140718: Problem reported by Parker Norton
   20140826: Verified problem still exists
   20140930: Finish NCO workaround for problem
   Cause: Slow algorithm in nc_var_gets()?
   Workaround #1: Use NCO 4.4.6 or later (avoids nc_var_gets())
   Workaround #2: Convert file to netCDF3 first, then use stride

C. NOT YET FIXED (would require DAP protocol change?)
   Unable to retrieve contents of variables whose names include a period '.'
   Periods are legal characters in netCDF variable names.
   Metadata are returned successfully, data are not.
   DAP non-transparency: Works locally, fails through DAP server.

   Demonstration:
ncks -O -C -D 3 -v var_nm.dot -p http://thredds-test.ucar.edu/thredds/dodsC/testdods in.nc # Fails to find variable

   20130724: Verified problem still exists.
   Stopped testing because inclusion of var_nm.dot broke all test scripts.
   NB: Hard to fix since DAP interprets '.' as structure delimiter in
   HTTP query string.

   Bug tracking: https://www.unidata.ucar.edu/jira/browse/NCF-47

D. NOT YET FIXED (would require DAP protocol change)
   Correctly read scalar characters over DAP.
   DAP non-transparency: Works locally, fails through DAP server.
   Problem, IMHO, is with DAP definition/protocol

   Demonstration:
ncks -O -D 1 -H -C -m --md5_dgs -v md5_a -p http://thredds-test.ucar.edu/thredds/dodsC/testdods in.nc

   20120801: Verified problem still exists
   Bug report not filed
   Cause: DAP translates scalar characters into 64-element (this
   dimension is user-configurable, but still...), NUL-terminated
   strings so MD5 agreement fails
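
   The digest mismatch is easy to reproduce in isolation. The Python
   sketch below assumes, for illustration, that the scalar character
   arrives as 64 bytes padded with NULs; the exact layout a DAP server
   returns may differ, and the variable names are hypothetical.

```python
import hashlib

local_value = b"a"                        # scalar character read locally
dap_value = local_value.ljust(64, b"\0")  # assumed 64-byte NUL-padded form

local_md5 = hashlib.md5(local_value).hexdigest()
dap_md5 = hashlib.md5(dap_value).hexdigest()

if __name__ == "__main__":
    print(local_md5 == dap_md5)
```

   The two digests differ because MD5 covers every byte, including the
   padding, even though the meaningful content is the same character.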

"Sticky" reminders:

A. Pre-built Debian Sid & Ubuntu packages:
   http://nco.sf.net#debian

B. Pre-built Fedora and CentOS RPMs:
   http://nco.sf.net#rpm

C. Pre-built Mac binaries:
   http://nco.sf.net#mac

D. Pre-built Windows (native) and Cygwin binaries:
   http://nco.sf.net#windows

E. Reminder that NCO works on most HDF4 and HDF5 datasets, e.g.,
   HDF4: AMSR MERRA MODIS ...
   HDF5: GLAS ICESat Mabel SBUV ...
   HDF-EOS5: AURA HIRDLS OMI ...


--
Charlie Zender, Earth System Sci. & Computer Sci.
University of California, Irvine 949-891-2429 )'(