Re: netCDF library

Nilesh,

Since the netCDF format is essentially a simple array of fixed-width cells, every cell takes the same space whether it holds zero or not, so there is no simple way to save space by not storing zero values.
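
To make the fixed-width point concrete, here is a rough sizing sketch. The grid dimensions come from the quoted message below; the 4 bytes per value (32-bit floats) and the helper name variable_bytes are assumptions for illustration, and the number of variables in the actual files is not stated.

```python
# Grid dimensions from the quoted message: 172 x 172 x 22 cells, 24 hourly steps.
NX, NY, NZ, NT = 172, 172, 22, 24

# Assumption: each value is stored as a 32-bit (4-byte) float.
BYTES_PER_FLOAT = 4


def variable_bytes(nx, ny, nz, nt, width=BYTES_PER_FLOAT):
    """Bytes needed for one gridded variable with fixed-width cells.

    The result depends only on the cell count and the cell width,
    never on how many of the stored values happen to be zero.
    """
    return nx * ny * nz * nt * width


if __name__ == "__main__":
    size = variable_bytes(NX, NY, NZ, NT)
    print(f"one variable: {size} bytes ({size / 1e6:.1f} MB)")
    # A file of about 1 GB would then hold on the order of this many variables.
    print(f"variables in ~1 GB: about {1e9 / size:.0f}")
```

Under these assumptions each variable takes the same space regardless of its contents, which is why only a general-purpose compressor such as gzip recovers the space occupied by the zeros.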

I think you are saying that a standard scientific file format is important to you. Since you have had such good luck with gridded data in netCDF, I suggest that you stay with it. Consider these options to reduce archival file size:

1. Keep your current netCDF format, but store your files gzipped. Make uncompressing a standard part of opening the file. Many application languages let you call the shell to gunzip the file into a temporary copy and delete that copy when you are done, so you can automate this. gunzip is rather fast, as I recall. As you stated, this reduces your file size by 99%.

2. netCDF 16-bit packed format. This reduces file size by 50%, assuming your data are currently 32-bit floats. You get 16 bits for your combined precision and dynamic range.

3. netCDF 8-bit packed format. This reduces file size by 75%, under the same assumption. You get 8 bits for your combined precision and dynamic range.
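
The three options above might be sketched roughly as follows, using only the Python standard library. This is a minimal illustration, not a definitive implementation: the function names (gunzip_to_temp, packing_params, pack, unpack) are hypothetical, the choice of a signed integer range is an assumption, and no integer value is reserved for a fill value. The unpacking formula follows the usual netCDF packing convention, unpacked = packed * scale_factor + add_offset.

```python
import gzip
import os
import shutil
import tempfile


def gunzip_to_temp(gz_path):
    """Option 1: decompress gz_path into a temporary file and return its path.

    The caller opens the returned file with its usual netCDF reader and
    deletes it (os.remove) when finished.
    """
    fd, tmp_path = tempfile.mkstemp(suffix=".nc")
    with os.fdopen(fd, "wb") as dst, gzip.open(gz_path, "rb") as src:
        shutil.copyfileobj(src, dst)
    return tmp_path


def packing_params(vmin, vmax, nbits):
    """Options 2 and 3: scale_factor and add_offset for signed nbits integers.

    Maps vmin to the lowest integer, -2**(nbits-1), and vmax to the
    highest, 2**(nbits-1) - 1.
    """
    steps = 2 ** nbits - 1
    scale = (vmax - vmin) / steps if vmax > vmin else 1.0
    offset = vmin + 2 ** (nbits - 1) * scale
    return scale, offset


def pack(values, scale, offset, nbits):
    """Convert floats to packed integers, clamped to the signed range."""
    lo, hi = -(2 ** (nbits - 1)), 2 ** (nbits - 1) - 1
    return [min(hi, max(lo, int(round((v - offset) / scale)))) for v in values]


def unpack(packed, scale, offset):
    """Recover approximate floats: packed * scale_factor + add_offset."""
    return [p * scale + offset for p in packed]
```

With 16 bits the value range is divided into 65535 steps, and with 8 bits into only 255, so the rounding error per value is at most half a step (scale / 2). Whether that precision is acceptable depends on the range of your emissions data.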

It is possible to write support for a custom, non-netCDF or contorted-netCDF format to efficiently hold sparse data and exclude zeros. This would be very costly in terms of programming time and lack of compatibility. I recommend against this, and I say that as one who has done it the wrong way a few times. ;-)

--Dave Allured
CIRES Climate Diagnostics Center (CDC)
NOAA/ESRL, Physical Sciences Division (PSD)

Nilesh Lahoti wrote:
Dear Sir,

We are an air quality modeling group at Rutgers University, New Jersey. We process emissions and run simulation models to study the long-range transport of ozone and particulate matter, both for our research and for regulatory work.

The netCDF library works great for us. However, I have come across one particular issue with netCDF and would like to ask whether there is a solution to it, or something that can be done to improve its performance. When we process emissions for our three-dimensional grid of size (172 x 172 x 22) for a 24-hour period with hourly data, the file size is around 1 gigabyte (GB). Many cells have zero values, so the floating-point values for pollutants in the netCDF file are zero. When we use the gzip utility on Unix to compress these files, the file size drops to about 10 MB, which saves us 99% of disk space. So the question is: if netCDF is the most compressed scientific format, is it possible to suppress these zero values of the floating-point variables, or is there a switch that can be used to handle zero values and reduce the file size?

Looking forward to hearing from you.

from,

Nilesh Lahoti
Research Specialist
CCL, EOHSI,
Rutgers University
Email: nilesh@xxxxxxxxxxxxxxxxxxx
Phone: 732-445-1416

===============================================================================
To unsubscribe netcdfgroup, visit:
http://www.unidata.ucar.edu/mailing-list-delete-form.html
===============================================================================