RE: [GMLJP2] NetCDF <--> GML/JPEG2000

To: John Caron; Sean Forde
Subject: RE: [GMLJP2] NetCDF <--> GML/JPEG2000
From: rlake@xxxxxxxxxxxxx
Date: Thu, 12 May 2005 09:13:54 -0700

Hi,

There are two different issues combined here.

1.    Integer vs floating values.

2.    Dimensionality of the coverage.

Integer vs floating values:

JPEG 2000 supports integer sample values only.  It is not clear that this is
a serious constraint: any observation has only finite precision, so its
values can be scaled to integers. This may be a pain, but it is common
practice in measurement systems.
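
To make the scaling argument concrete, here is a minimal sketch of packing
floating-point readings into 16-bit integers with a linear scale and offset.
The function names, the 16-bit depth, and the sample values are all
assumptions made for illustration; nothing here is taken from the GMLJP2
specification or from any JPEG 2000 library.

```python
def pack(values, bits=16):
    """Map floats onto unsigned integers in [0, 2**bits - 1]."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    # scale is the real-world size of one integer step;
    # fall back to 1.0 for a constant field to avoid dividing by zero
    scale = (hi - lo) / levels if hi > lo else 1.0
    packed = [round((v - lo) / scale) for v in values]
    return packed, scale, lo


def unpack(packed, scale, offset):
    """Recover approximate floats from the packed integers."""
    return [p * scale + offset for p in packed]


# Made-up temperature readings in kelvin (illustrative only).
temps = [271.35, 284.10, 299.95, 288.72]
packed, scale, offset = pack(temps)
restored = unpack(packed, scale, offset)
max_error = max(abs(a - b) for a, b in zip(temps, restored))
```

The round-trip error is bounded by half of one integer step (scale / 2), so
the bit depth can be chosen to match the precision of the instrument. The
scale and offset would have to travel with the data, for example in the
accompanying metadata.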

Dimensionality of the coverage:

In general a coverage can be seen as a function X = f(Y).  The dimension of
X can be any N (N = 0, 1, 2, 3, 4, ...).  With the current GMLinJP2K
specification, the dimensionality of Y is 2; that is, at each point p on a
2D surface (e.g. the surface of the earth) we can have a vector quantity
X(p) = (X1, X2, ..., XN), where each Xi will be in a different JPEG 2000
codestream.
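
As a rough illustration of the paragraph above, the sketch below takes a
small 2D grid whose cells hold a vector X(p) and splits it into one 2D
integer plane per component Xi. Each plane stands in for what would be a
separate codestream; no JPEG 2000 encoding is performed, and the grid
values and function names are hypothetical.

```python
# Hypothetical 2 x 3 grid; each cell holds X(p) = (X1, X2),
# e.g. (temperature, humidity) already scaled to integers.
coverage = [
    [(10, 50), (11, 52), (12, 54)],
    [(13, 56), (14, 58), (15, 60)],
]


def split_into_planes(grid):
    """Return one 2D plane per component Xi of the range."""
    n = len(grid[0][0])
    return [[[cell[i] for cell in row] for row in grid] for i in range(n)]


def value_at(planes, row, col):
    """Reassemble X(p) at a grid point from the separate planes."""
    return tuple(plane[row][col] for plane in planes)


planes = split_into_planes(coverage)
```

Here planes[0] holds every X1 value and planes[1] every X2 value, and
value_at(planes, 1, 2) recovers the full vector at that point. The domain
stays 2D however many components the range has.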

So for representing measurement information, the restriction of GMLJP2K is
that the points p cannot lie in a volume (note, however, that the surface on
which p lies can be embedded in 3-space).  This is not really a restriction
of JPEG 2000 itself, and I think we can extend the GML description to allow
the description of functions (as above) over multi-dimensional "volumes".

Please note that in GML and O&M (=GML) an Observation is not the values but
the act - so an observation can be coverage-valued - like when I take your
picture.

Cheers

Ron

-----Original Message-----
On Behalf Of Michael P. Gerlek
Sent: May 12, 2005 8:38 AM

John wrote:

I'm wondering if it's fair to think of JPEG as inherently 2D arrays of
integers (perhaps floats in the future)? I assume that the wavelet
compression is optimized for 2D?

If so, it may not be the right data structure for representing
observations, which may be thought of as lists of tuples, more like a
database table?


Some background that might help (or just muddle things further...):

* JPEG 2000 considers an image to be a 2D array, where each array element is
a tuple ("pixel") which can contain N samples.  In the usual case, the
samples are 3 uint8's ("red", "green", "blue").  However, the standard
allows N to be very large (2^16 or so, I'd have to check), and furthermore
the N samples need not all be of the same datatype.

* Additionally, you can have M images ("codestreams") per file.  This is how
they do movies/video, among other things; each frame is a separate
codestream.  In general, of course, the M images need not all be of the same
shape (width, height, number of samples, etc).

* The wavelet is indeed a 2-D construct.  There was some work on an
extension for 3-D, but this has been abandoned (and that project also
included the floating point support, unfortunately).

* While the wavelet is 2-D, the "compression" part is not.  Under JPEG 2000,
the main compression win comes from the arithmetic encoder -- this mechanism
operates deep down in the bit domain, and so is agnostic as to number, type,
etc of the samples.

* Early on in the encoding process, there is a colorspace transform that is
done to improve compression ratios for RGB data: the data is (losslessly)
mapped into the YCbCr domain, which essentially decorrelates the color bands
better than RGB does.  If your data is not RGB, this transform can be
skipped -- or, indeed, if you know something about the properties of your
data's bands, with one of the extensions you can supply your own
decorrelating transformation...
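
For the lossless colorspace step mentioned in the last bullet, the sketch
below shows the integer arithmetic of what I understand to be the
reversible color transform (RCT) of JPEG 2000 Part 1. The function names
are my own, and this is only the per-pixel arithmetic, not a codec; it is
meant to show why the mapping loses nothing even though it uses integer
division.

```python
def rct_forward(r, g, b):
    """Map integer RGB to a luma-like value and two color differences."""
    y = (r + 2 * g + b) // 4   # floor division
    cb = b - g
    cr = r - g
    return y, cb, cr


def rct_inverse(y, cb, cr):
    """Exactly undo rct_forward using the same floor division."""
    g = y - (cb + cr) // 4
    r = cr + g
    b = cb + g
    return r, g, b


pixel = (200, 120, 30)
transformed = rct_forward(*pixel)
recovered = rct_inverse(*transformed)
```

Since r + 2g + b = 4g + cb + cr, the floored term dropped in the forward
step can be recomputed from cb and cr alone, which is why recovered equals
pixel for any integer input. For strongly correlated bands the two
difference channels tend to stay near zero, which is what helps the later
coding stages.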

-mpg
