Hi Russ,

On Thu, Feb 27, 2014 at 2:38 PM, Russ Rew <russ@xxxxxxxxxxxxxxxx> wrote:

> #define CHUNK_THRESHOLD (8192) /* variables with fewer bytes don't get
> chunked */
>
> The intent of the CHUNK_THRESHOLD minimum is to not create chunks
> smaller than a physical disk block, as an I/O optimization, because
> attempting to read a smaller chunk will still cause a whole disk block
> to be read.

So I take it 8k is a reasonable expectation for disk cache these days?

But this is a great tidbit -- I'm working on code to write data in the "new" UGRID standard:

https://github.com/ugrid-conventions/ugrid-conventions

And the code:

https://github.com/pyugrid/pyugrid

And I wanted to set some reasonable defaults for chunking. In this case, you tend to have a lot of large 1-d arrays, and most of the discussions I've seen are about multi-dimensional arrays. It sounds like I should set a minimum chunk size of 8k bytes then.

> However, I think for the next
> release, we should lower the default threshold to 512 bytes, and
> document the behavior.

Document -- of course, but why lower the threshold?

Though maybe the thresholds are good for defaults, but if a user asks for smaller than optimum chunk sizes, maybe that's what they should get.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@xxxxxxxx
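
For illustration only, the 8k-byte floor discussed in the message above could be applied to a large 1-d array roughly as follows. This is a minimal sketch, not code from netCDF or pyugrid: the function names `min_chunk_len` and `chunk_len_1d` are hypothetical, and the only value taken from the thread is the 8192-byte `CHUNK_THRESHOLD`. It returns the smallest chunk length that meets the floor; a real default might well choose something larger.

```python
CHUNK_THRESHOLD = 8192  # bytes; the value of the #define quoted in the message


def min_chunk_len(itemsize, threshold=CHUNK_THRESHOLD):
    """Smallest number of elements whose chunk occupies at least `threshold` bytes.

    Hypothetical helper, not part of any library.
    """
    if itemsize <= 0:
        raise ValueError("itemsize must be positive")
    # Integer ceiling division, so no floating-point rounding is involved.
    return -(-threshold // itemsize)


def chunk_len_1d(n_elements, itemsize, threshold=CHUNK_THRESHOLD):
    """Pick a chunk length for a 1-d variable.

    Returns None when the whole variable is smaller than the threshold,
    mirroring "variables with fewer bytes don't get chunked".
    """
    if n_elements * itemsize < threshold:
        return None  # leave the variable unchunked (contiguous)
    return min(n_elements, min_chunk_len(itemsize, threshold))
```

With 8-byte values, `chunk_len_1d(1_000_000, 8)` gives a chunk of 1024 elements, which is exactly 8192 bytes, while a 100-element variable is left unchunked.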