Hi,

I've thrown together a preliminary web page providing links to some
documents with guidance for netCDF-4 compression and chunking:

  http://www.unidata.ucar.edu/software/netcdf/docs/compression.html

In addition to NCO's ncks, you can also use the nccopy utility that
comes with netCDF-4.1.2 or later to try out various compression and
chunking schemes. For example, to convert netCDF-3 data to netCDF-4
data compressed at deflation level 1 and using 10x20x30 chunks for
variables that use (time,lon,lat) dimensions:

  nccopy -d1 -c time/10,lon/20,lat/30 netcdf3.nc netcdf4.nc

As an example of the kind of performance differences seen in accessing
data stored contiguously versus with chunking, here's what we saw in
one benchmark, accessing all the data in a 3D float array of about 81
million values using cross-sections in different orientations:

  432 x 432 x 432 array of floats with chunk sizes of 36 x 36 x 36

  Access                        Contiguous   Chunking    Slowdown
                                (seconds)    (seconds)   or speedup
  2D x,y cross-section write       0.559        1.97     3.5 x slower
  2D x,z cross-section write      18.1          1.5       12 x faster
  2D y,z cross-section write     223            9.55      23 x faster
  2D x,y cross-section read        0.353        1.06       3 x slower
  2D x,z cross-section read        6.22         1.45     4.3 x faster
  2D y,z cross-section read       77.1          7.68      10 x faster

The moral is that with chunking, accesses that are already fast may
slow down a little, while accesses that are very slow speed up a lot.

--Russ
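[A quick way to confirm what nccopy actually wrote is ncdump's -s
option, which prints the hidden per-variable storage settings along
with the header. A minimal sketch, reusing the illustrative file names
from the example above:

  # convert, then inspect the chunking and compression that resulted
  nccopy -d1 -c time/10,lon/20,lat/30 netcdf3.nc netcdf4.nc
  # -h prints the header only; -s adds the special (virtual)
  # attributes such as _ChunkSizes, _DeflateLevel, and _Shuffle
  ncdump -hs netcdf4.nc | grep -E '_ChunkSizes|_DeflateLevel|_Shuffle'
]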
> Oops. Typing too fast: I meant
> "If they want NetCDF3 files they can use the NetCDF subset service
> or use tools like NCO that can read opendap and
> generate NetCDF3..."
>
> On Mon, May 2, 2011 at 4:57 PM, Rich Signell <rsignell@xxxxxxxx> wrote:
> > Jerry,
> >
> > You might also check into using NetCDF4 files with deflation instead
> > of .nc.gz. Your users can still download via opendap or any of the
> > other services. If they want netcdf4 files they can use the NetCDF
> > subset service or use tools like NCO that can read opendap and
> > generate NetCDF3 files. You can convert NetCDF3 to NetCDF4 with
> > level 1 deflation using NCO (ncks -4 -L 1 netcdf3.nc netcdf4.nc).
> > They should be about the same size as the .nc.gz files, and will be
> > much faster to read since you don't have to uncompress the whole file.
> >
> > -Rich
> >
> > On Mon, May 2, 2011 at 2:27 PM, jerry y pan <jerry.ypan@xxxxxxxxx> wrote:
> >> Hi John,
> >> Our TDS (4.2) serves some compressed netCDF files (*.nc.gz) and it
> >> works fine, except that the very first access to them is slow
> >> (relatively large files, about 400 MB each). Subsequent accesses are
> >> much faster, but access becomes slow again after a period of
> >> inactivity. I can see that TDS uncompresses these files to the temp
> >> data location. My question: does TDS clean up these temp files, so
> >> that they have to be decompressed again the next time, which would
> >> explain the recurring slowness? If so, is there a way to keep the
> >> cache there permanently? Or is the fast response right after the
> >> first access due to an in-memory cache? Is there any configuration
> >> I could tweak for the cache?
> >>
> >> Thanks,
> >> -Jerry Pan
>
> --
> Dr. Richard P. Signell  (508) 457-2229
> USGS, 384 Woods Hole Rd.
> Woods Hole, MA 02543-1598
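[On Jerry's question: TDS uncompresses *.nc.gz files into its CDM disk
cache, and a periodic "scour" deletes aged cache files, which would
match the slow-again-after-inactivity behavior he describes. A hedged
sketch of the threddsConfig.xml element that governs this cache; the
element names and values below are as I recall them from the TDS 4.x
documentation and should be verified against your installation:

  <!-- threddsConfig.xml (sketch; verify element names for your TDS
       version).  This cache holds files the CDM must write locally,
       e.g. uncompressed copies of *.nc.gz.  A longer scour interval
       and a larger maxSize keep decompressed copies around longer. -->
  <DiskCache>
    <alwaysUse>false</alwaysUse>
    <scour>1 hour</scour>
    <maxSize>1 Gb</maxSize>
  </DiskCache>
]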