Re: netCDF Utilities

Greetings,

In <9202101050.AA16358@xxxxxxxxxxxxxxxxx>, hld@xxxxxxxxxxxxxxxxx (DAVIES
Harvey) writes

>I was very interested in this draft document & would like to start some
>discussion within netcdfgroup by sharing my comments.
>
>My first reaction is that the proposals seem to assume a quite different
>way of using netcdf from that I have adopted so far in my rather limited
>experience. I use ncgen to create files with everything except the main
>data arrays. Then I use my program nc_put_var to put the data into the file.
>It would greatly simplify the design of utilities if they did not have to
>create new variables & their related attributes & coordinates.

This is an interesting concept and one that we employ in certain,
special programs.

The netCDF operators, however, are intended to be generic, stand-alone
programs.  Thus, the question that must be asked is "Where is the
decision to be made as to the structure of the output netCDF?"  Such a
decision must be based on the structure of any input netCDFs and the
nature of the processing to be done.

It seems reasonable, therefore, to centralize this decision-making
in the routine that will populate the output netCDFs and that has all
the necessary information, namely, the netCDF program itself.

I would welcome, however, any discussion on alternative paradigms.

>I assume that the structure of a file does not change, only its data. I would
>also assume that one would process only one variable at a time.

That depends on the processing.  For example, the ncbarnes(1) operator,
on which I'm currently working, analyses all observations that relate
to a given output point at the same time (in general, this will consist
of more than one value).  Though it is possible to perform the analysis
one variable at a time, in most circumstances this is less efficient
than processing all related variables together (for example,
objectively analyzing both temperature and humidity to the same
(pressure, lat, lon) grid).
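
To illustrate why processing related variables together can be cheaper,
here is a minimal Python sketch.  It is not the ncbarnes(1) code: it
assumes a single-pass, Gaussian distance weighting in the style of a
Barnes analysis, and the function names and the "kappa" smoothing
parameter are hypothetical.  The point is only that the weights depend
on geometry, so they can be computed once per output point and reused
for every variable.

```python
import math


def barnes_weights(obs_points, grid_point, kappa):
    """Gaussian weights of each observation for one output grid point."""
    weights = []
    for point in obs_points:
        # Squared distance between the observation and the grid point.
        r2 = sum((a - b) ** 2 for a, b in zip(point, grid_point))
        weights.append(math.exp(-r2 / kappa))
    return weights


def analyze_together(obs_points, obs_values, grid_point, kappa=1.0):
    """Analyze several variables at one grid point with shared weights.

    obs_values maps a variable name to a list of values, one per
    observation, in the same order as obs_points.
    """
    weights = barnes_weights(obs_points, grid_point, kappa)  # computed once
    total = sum(weights)
    return {
        name: sum(w * v for w, v in zip(weights, values)) / total
        for name, values in obs_values.items()
    }


result = analyze_together(
    obs_points=[(0.0, 0.0), (2.0, 0.0)],
    obs_values={"temperature": [280.0, 284.0], "humidity": [0.60, 0.80]},
    grid_point=(1.0, 0.0),
)
```

A one-variable-at-a-time design would call barnes_weights() once per
variable for the same output point, repeating identical work.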

There are other analyses which, for a given output point, take more
than a single input variable and/or produce more than a single output
variable.  Therefore, adopting a one-variable-at-a-time convention would
seem unnecessarily limiting.  (We're open to discussion, however.)

>I would hope
>that all utilities would:
>- do unit conversion using units attribute values of input & output files

That is our intention.  Where appropriate, all values will be made
commensurate using the udunits(3) conversion library (which,
incidentally, will allow automatic handling of "time" variables through
the specification of "units" and "origin" attributes).
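
As a rough illustration of conversion driven by "units" attributes,
the Python sketch below converts a value from an input variable's
units to an output variable's units.  It does not use the udunits(3)
interface: the lookup table and the convert() function are
hypothetical stand-ins, and only a few units are listed.

```python
# Hypothetical table: unit name -> (quantity, scale, offset), where
# value_in_base_units = value * scale + offset.
# Base units assumed here: kelvin for temperature, pascal for pressure.
TO_BASE = {
    "kelvin": ("temperature", 1.0, 0.0),
    "celsius": ("temperature", 1.0, 273.15),
    "pascal": ("pressure", 1.0, 0.0),
    "hectopascal": ("pressure", 100.0, 0.0),
}


def convert(value, from_units, to_units):
    """Convert value between two units of the same physical quantity."""
    kind_in, scale_in, offset_in = TO_BASE[from_units]
    kind_out, scale_out, offset_out = TO_BASE[to_units]
    if kind_in != kind_out:
        raise ValueError(
            "cannot convert %s to %s" % (from_units, to_units)
        )
    base = value * scale_in + offset_in       # input units -> base units
    return (base - offset_out) / scale_out    # base units -> output units


# The "units" attribute values of the input and output variables
# would be passed as from_units and to_units.
pressure_pa = convert(850.0, "hectopascal", "pascal")
temperature_k = convert(0.0, "celsius", "kelvin")
```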

>- do type conversion, allowing different types in input & output files

As far as working with heterogeneous data-types is concerned, we're
working on a prototype C++ implementation that will support polymorphic
arithmetic (using a technique called "double polymorphism").

We currently intend, however, to have the netCDF programs select the
output types, based on the input types and the processing algorithm.
Any subsequent conversion of the output data to other datatypes would
be handled by conversion operators.
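
One way such a selection could work is sketched below in Python.  The
rule itself is an assumption made for illustration, not the planned
design: take the widest numeric type among the inputs, raised to a
minimum that the algorithm requires (an averaging operator might
insist on at least "float").  The type names follow the classic netCDF
numeric types; select_output_type() is a hypothetical helper.

```python
# Classic netCDF numeric types, ordered from narrowest to widest.
RANK = ["byte", "short", "long", "float", "double"]


def select_output_type(input_types, minimum="byte"):
    """Pick the widest type among the inputs and the algorithm's minimum."""
    return max([*input_types, minimum], key=RANK.index)


# Mixed inputs: the wider type wins.
mixed = select_output_type(["short", "float"])

# Integer inputs to an algorithm that needs floating point.
averaged = select_output_type(["byte", "short"], minimum="float")
```

A separate conversion operator would then handle any request for a
different output type, as described above.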

>- utilize add_offset & scale_factor when reading & writing

Our intention is to support this where appropriate.
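
For reference, the usual convention for these attributes is that
stored values are multiplied by scale_factor and then add_offset is
added when reading, with the inverse applied when writing.  A minimal
Python sketch, assuming plain lists of numbers and hypothetical
pack()/unpack() helpers rather than any netCDF library call:

```python
def unpack(packed, scale_factor=1.0, add_offset=0.0):
    """Reading: scale the stored values first, then add the offset."""
    return [p * scale_factor + add_offset for p in packed]


def pack(values, scale_factor=1.0, add_offset=0.0):
    """Writing: remove the offset, divide by the scale, round to integer."""
    return [round((v - add_offset) / scale_factor) for v in values]


# Temperatures in kelvin stored as small integers, 0.01 K per count.
stored = pack([273.15, 283.15], scale_factor=0.01, add_offset=273.15)
restored = unpack(stored, scale_factor=0.01, add_offset=273.15)
```

Values restored this way agree with the originals only to within the
resolution set by scale_factor.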

>I am not happy with the facility for specifying arrays in either the current
>version of nc_put_var or the NCAR plans.

I'm not either.  Something along the lines of the grammar you posted
is definitely needed.

>I hope these ideas are of interest & not too 'way out'.

Your ideas are not too 'way out' and we are, indeed, interested.

Regards,
Steve Emmerson  <steve@xxxxxxxxxxxxxxxx>