On Tue, May 23, 2017 at 2:30 PM, Ed Hartnett <edwardjameshartnett@xxxxxxxxx> wrote:

> On a related note, many users have complained of very poor performance on
> files with a chunksize of 1 in the record dimension, when they are using
> the data in other ways than reading one lat-lon grid at a time. Naturally,
> this is understandable. To even get one value in the level, the entire
> lat-lon grid must be read.

This is the inherent problem with chunking -- a good chunking strategy
completely depends on the access pattern.

> So perhaps having all the non-1 dimensions use a chunksize of their
> fullest extent is not such a good idea.

Exactly -- for defaults, I think it's better that full-extent chunks NOT be
used. I did some experiments a while back, and wildly too-small or too-large
chunks had a big impact on performance, but performance was not that
sensitive to mid-size chunks.

So if, for example, you have a 10k x 10k lat-lon grid, you probably don't
want to use (1, 10k, 10k) chunks. Better to use (1, 1k, 1k) chunks. I'd bet
that would be almost as fast when accessing the full grid at a given time,
but much faster when accessing only a small part of the grid.

Or maybe (10, 100, 100) would be best -- much better for a time series at a
single point, and still probably not too slow for the whole grid (I found 1k
chunks not too bad on that particular machine, anyway...)

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

Chris.Barker@xxxxxxxx
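[For concreteness, here is a minimal sketch of how an explicit mid-size chunking such as the (10, 100, 100) layout discussed above could be set through the netCDF-C API. The file name, variable name, and the 10k x 10k dimension sizes are assumptions for illustration only; the point is simply that nc_def_var_chunking lets you override the library's default chunk layout at variable-definition time.]

    #include <stdio.h>
    #include <netcdf.h>

    #define ERR(e) {printf("Error: %s\n", nc_strerror(e)); return 2;}

    int main(void) {
        int ncid, time_dimid, lat_dimid, lon_dimid, varid, retval;
        int dimids[3];
        /* Mid-size chunks: the (10, 100, 100) strategy from the discussion above. */
        size_t chunks[3] = {10, 100, 100};

        if ((retval = nc_create("chunk_demo.nc", NC_NETCDF4 | NC_CLOBBER, &ncid))) ERR(retval);

        /* Unlimited record dimension plus a hypothetical 10k x 10k lat-lon grid. */
        if ((retval = nc_def_dim(ncid, "time", NC_UNLIMITED, &time_dimid))) ERR(retval);
        if ((retval = nc_def_dim(ncid, "lat", 10000, &lat_dimid))) ERR(retval);
        if ((retval = nc_def_dim(ncid, "lon", 10000, &lon_dimid))) ERR(retval);

        dimids[0] = time_dimid;
        dimids[1] = lat_dimid;
        dimids[2] = lon_dimid;

        if ((retval = nc_def_var(ncid, "temperature", NC_FLOAT, 3, dimids, &varid))) ERR(retval);

        /* Override the default chunking with an explicit choice. */
        if ((retval = nc_def_var_chunking(ncid, varid, NC_CHUNKED, chunks))) ERR(retval);

        if ((retval = nc_close(ncid))) ERR(retval);
        return 0;
    }

[With this layout, reading a single-point time series touches only 100 x 100 cells per chunk rather than the full grid, while a whole-grid read at one time step still needs only 100 x 100 = 10,000 chunks, which is why the mid-size choice is a reasonable compromise between the two access patterns.]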