[netcdfgroup] IOAPI compliance, was: netCDF tidy?

summary: my VERDI problem was indeed not with netCDF but with IOAPI,
or at least VERDI's compliance with IOAPI. Fixes and lessons-learned
below.

details:

Tom Roche Fri, Mar 2, 2012 at 10:05 PM
>> I have a [netCDF] file [with] which [both] R (up-to-date, with
>> package=ncdf4) and NCO (also up-to-date) [are happy].

To clarify, I took a source netCDF/IOAPI file, and

* removed data variables other than that in which I was interested
  (with NCO), greatly reducing the file size

* changed the data values in each layer (with R, specifically
  package=ncdf4), since they had been inadvertently summed upstream

* appended 2 layers to the datavar of interest (with NCO). Note the
  layers are not themselves spatial; i.e., they do not correspond to
  altitude (more below).

* wrote new data to the appended layers (with R)

>> However, when I try to open it as a dataset with VERDI

http://www.verdi-tool.org/
(version="1.4 2011-06-01")

>> I get [java.lang.NullPointerException]

Thanks for pointers to

http://cf-pcmdi.llnl.gov/conformance/compliance-checker/

(which I'll try when my group's data gets CF-compliant) and to
`nccopy` (which my files passed). The problem turned out to be

John Caron Fri, 02 Mar 2012 20:56:03 -0700
> Verdi uses the netcdf-java library, which knows that the file is an
> IOAPI file,

i.e., IOAPI-compliant (or, more correctly, "I/O API-compliant"), not
merely NetCDF-compliant.

> its likely that NCO / R has manipulated the file in a way that is
> not compliant to the IOAPI metadata spec.

Unfortunately there is not, AFAICS, a written specification for the
IOAPI metadata. Neither is there a compliance checker for IOAPI; at
least, the m3tool `m3stat`

http://www.baronams.com/products/ioapi/M3STAT.html

is more forgiving than VERDI. But thanks to

* John Caron's suggestion

> theres supposed to be a global attribute "VGLVLS" but its missing.

  (more below)

* repeated `diff -uwB  <( ncdump -h $A ) <( ncdump -h $B )`

I noted the following (in no particular order)

1 Removing datavars seems not to bother VERDI, provided one does not
  remove the IOAPI-specific "meta variable" TFLAG. My source.nc
  contained 29 datavars; after NCO, it contained just 2 (mine and
  TFLAG).

2 NCO becomes quite cross if one's datavars do not have an attribute
  named "_FillValue", so I fixed that.

3 I also changed the value of the global attribute NVARS (29 -> 1).

4 source.nc has a global attribute "VAR-LIST" containing a single
  string such that

* the string contains the names of the datavars, in order
* each datavar name has spaces appended to length=16 (i.e.,
  sprintf("%-16s", name))

  I changed target.nc (containing my changes) such that its VAR-LIST
  contained only the name (appropriately formatted) of my datavar of
  interest.
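
A minimal sketch of the VAR-LIST formatting described in item 4, in
Python rather than the R and NCO actually used. The function names and
the datavar name "MY_DATAVAR" are hypothetical; the only facts taken
from the observations above are the 16-character space padding and
that NVARS counts the datavars (excluding TFLAG).

```python
def build_var_list(names, width=16):
    """Concatenate names, each left-justified and space-padded to `width`."""
    for name in names:
        if len(name) > width:
            raise ValueError(f"{name!r} is longer than {width} characters")
    return "".join(name.ljust(width) for name in names)


def parse_var_list(var_list, width=16):
    """Split a VAR-LIST string back into its unpadded names."""
    return [var_list[i:i + width].rstrip()
            for i in range(0, len(var_list), width)]


# Hypothetical name standing in for the single datavar that was kept.
names = ["MY_DATAVAR"]
var_list = build_var_list(names)
nvars = len(names)  # value for the global attribute NVARS
print(repr(var_list), nvars)
```

Writing the resulting string and count into the file's global
attributes is a separate step (done with NCO or R in this thread) and
is not shown here.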

After that (i.e., removing datavars and changing global attrs), but
before adding layers, I was able to load the resulting target.nc in VERDI,
despite not having altered datavar=TFLAG, which contains one date-time
pair per datavar (other than itself). I.e., for both source and target
`ncdump -v TFLAG` produces

>         int TFLAG(TSTEP, VAR, DATE-TIME) ;
>                 TFLAG:units = "<YYYYDDD,HHMMSS>" ;
>                 TFLAG:long_name = "TFLAG           " ;
>                 TFLAG:var_desc = "Timestep-valid flags:  (1) YYYYDDD or (2) HHMMSS                                " ;
...
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0,
>   2002365, 0 ;
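
For illustration, a sketch of how TFLAG values matching the dump above
could be generated in Python. The function names are hypothetical, and
the layout (one <YYYYDDD,HHMMSS> pair per datavar per timestep) is
taken only from the dimensions and units shown above, not from any
written IOAPI specification.

```python
from datetime import datetime


def ioapi_date_time(moment):
    """Return (YYYYDDD, HHMMSS) as two ints, per the TFLAG units string."""
    return int(moment.strftime("%Y%j")), int(moment.strftime("%H%M%S"))


def build_tflag(moments, nvars):
    """Nested list shaped (TSTEP, VAR, DATE-TIME): one pair per datavar."""
    return [[list(ioapi_date_time(m)) for _ in range(nvars)]
            for m in moments]


# 2002365 is day 365 of 2002, i.e. 31 December 2002, at 00:00:00.
tflag = build_tflag([datetime(2002, 12, 31)], nvars=1)
print(tflag)  # [[[2002365, 0]]]
```

With nvars=29 this reproduces the 29 identical pairs in the dump; a
TFLAG consistent with a single remaining datavar would presumably have
nvars=1, though VERDI evidently tolerated the mismatch.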

I then had to append layers. source.nc contained 42 layers
corresponding to 42 types of crops modeled by EPIC

http://epicapex.brc.tamus.edu/

such that each layer of the datavar (dimensions=(TSTEP, LAY, ROW,
COL)) contained an emission from that crop on a particular gridcell
(ROW, COL) at a particular TSTEP. Integrating those emissions required
knowing the proportion of the gridcell covered by that crop at that
TSTEP (obtained from BELD

http://www.epa.gov/ttnchie1/emch/biogenic/

) and then doing the appropriate sum of products. For explanatory and
debugging purposes I added a layer (43) to show the sum of the BELD
proportions on each gridcell, as well as a layer (44) to show the
integration of the emissions (i.e., the total estimated emissions due
to the modeled crops). I appended the layers using NCO (which is, in
my experience, more of a cleaver), then calculated their values using
R (more of a scalpel).
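
The two extra layers can be sketched for a single gridcell and TSTEP
as follows (Python for illustration; the actual work was done in R).
The function name and the three-crop sample values are made up, and
the sketch assumes the "sum of products" is simply emission times
proportion, summed over crops, with no further normalization.

```python
def integrate_cell(emissions, proportions):
    """Return (coverage, total) for one gridcell at one timestep.

    coverage: sum of the crop proportions (layer 43 in the text)
    total:    sum over crops of emission * proportion (layer 44)
    """
    if len(emissions) != len(proportions):
        raise ValueError("need one proportion per crop layer")
    coverage = sum(proportions)
    total = sum(e * p for e, p in zip(emissions, proportions))
    return coverage, total


# Made-up values for three crops; the real file has 42 crop layers.
coverage, total = integrate_cell([2.0, 4.0, 0.0], [0.25, 0.5, 0.0])
print(coverage, total)  # 0.75 2.5
```

Applying this to every (ROW, COL) at every TSTEP would fill layers 43
and 44 of the datavar.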

However, appending layers (i.e., adding layers 43-44) to my datavar
broke its VERDI-compatibility, until I ...

5 changed the global attr=NLAYS appropriately (42 -> 44)

... which was not, alas, sufficient for VERDI. The fix was to also ...

6 change the global attr=VGLVLS "appropriately." (I suspect that this
  datum, a vector of floats, is intended to record the height of
  vertical layers, not applicable here.) I noted that, in source.nc,

* length(VGLVLS) == |layers|+1 (== 43 in source.nc)
* the first element == 1.f
* all subsequent elements 0 < e < 1

  So I appended two more elements 0 < e < 1 to VGLVLS (with R) ...

... restoring VERDI-compatibility! Note that I have not yet dealt with
TFLAG, though I probably will, just to prevent "getting bit" later on
(using these emissions as input to other IOAPI-using tools).
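
The NLAYS/VGLVLS observations in items 5 and 6 can be turned into a
small consistency check, sketched here in Python. Everything in it is
an assumption drawn from what was observed in source.nc, not from a
specification: the function names are hypothetical, and halving the
last element is just one arbitrary way to append values in (0, 1);
the values actually appended are not recorded above.

```python
def check_vglvls(vglvls, nlays):
    """Return a list of problems, empty if the observed pattern holds."""
    problems = []
    if len(vglvls) != nlays + 1:
        problems.append(
            f"len(VGLVLS)={len(vglvls)}, expected NLAYS+1={nlays + 1}")
    if vglvls and vglvls[0] != 1.0:
        problems.append("first element is not 1.0")
    if any(not (0.0 < e < 1.0) for e in vglvls[1:]):
        problems.append("a later element is outside (0, 1)")
    return problems


def extend_vglvls(vglvls, extra_layers):
    """Append one value in (0, 1) per added layer (arbitrary choice)."""
    extended = list(vglvls)
    for _ in range(extra_layers):
        extended.append(extended[-1] / 2.0)
    return extended


# Made-up stand-in for source.nc: 42 layers, hence 43 VGLVLS elements.
source_levels = [1.0] + [1.0 - i / 43 for i in range(1, 43)]
target_levels = extend_vglvls(source_levels, extra_layers=2)
print(check_vglvls(source_levels, nlays=42))  # []
print(check_vglvls(target_levels, nlays=44))  # []
print(check_vglvls(source_levels, nlays=44))  # reports the length mismatch
```

The last call mirrors the failing state described above: NLAYS changed
to 44 while VGLVLS still had 43 elements.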

I also (at some unspecified future TSTEP :-) intend to find a
repository at which to provide IOAPI-specific tools, for the benefit
of the next poor bastard that goes down this road. If you know of such
tools already available (other than

* the m3tools (above). These use the fortran API (about which the
  maintainer is adamant), and are the "officially supported" tools,
  but provide (IMHO) a tiny slice of desired functionality. (I.e., I
  don't see how I could have done what I did above only with m3tools.)
  Unfortunately, working around the API (as I did) is probably
  unacceptable to the m3tools maintainer, and I currently lack the
  fortran chops to write R which "drives" the fortran API (though, One
  Day, I hope to be that good :-)

* the python ioapiTools, which, IIUC, are downlevel

) or have suggestions regarding where to put such tools, please reply
off-thread.

HTH, and thanks again, Tom Roche <Tom_Roche@xxxxxxxxx>


