Re: [netcdfgroup] Alternate chunking specification

To: netcdfgroup@xxxxxxxxxxxxxxxx
Subject: Re: [netcdfgroup] Alternate chunking specification
From: Dave Allured - NOAA Affiliate <dave.allured@xxxxxxxx>
Date: Tue, 16 May 2017 13:30:57 -0600

Dennis,

Are you saying that the original function nc_def_var_chunking will be kept
intact, and there will be a new function that will simplify chunk setting
for some data scenarios?  You are not proposing any changes in the netcdf-4
file format?

--Dave


On Mon, May 15, 2017 at 1:29 PM, dmh@xxxxxxxx <dmh@xxxxxxxx> wrote:

> I am soliciting opinions about an alternate way to specify chunking
> for netcdf files. If you are not familiar with chunking, then
> you probably can ignore this message.
>
> Currently, one specifies per-dimension decompositions that
> together determine how the data for a variable is decomposed
> into chunks. So, e.g., if I have a variable (pardon the shorthand notation)
>   x[d1=8,d2=12]
> and I say d1 is chunked by 4 and d2 is chunked by 4, then x will be
> decomposed into 6 chunks ((8/4) * (12/4) = 2 * 3).
>
> I am proposing this alternate. Suppose we have
>     x[d1,d2,...dm]
> And we specify a position 1<=c<m
> Then the idea is that we create chunks of size
>    d(c+1) * d(c+2) *...dm
> There will be d1*d2*...dc such chunks.
> In other words, we split the set of dimensions at some point (c)
> and create the chunks based on that split.
>
> The claim is that for many situations, the leftmost dimensions
> are what we want to iterate over: e.g. time; and we then want
> to read all of the rest of the data associated with that time.
>
> So, my question is: is such a style of chunking useful?
>
> If this is not clear, let me know and I will try to clarify.
> =Dennis Heimbigner
>  Unidata
>
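
For readers comparing the two schemes, here is a minimal sketch of the
arithmetic in the quoted message. It is plain Python and does not call the
netCDF library; the function names per_dimension_chunks and
split_chunk_shape are hypothetical, and all dimensions are assumed to have
fixed, known sizes. It rests on one reading of the proposal: splitting at
position c looks equivalent to per-dimension chunking with a chunk length
of 1 for d1..dc and the full length for d(c+1)..dm.

```python
from math import prod


def per_dimension_chunks(dim_sizes, chunk_sizes):
    """Count chunks under the current scheme (one chunk length per dimension)."""
    if len(dim_sizes) != len(chunk_sizes):
        raise ValueError("need exactly one chunk length per dimension")
    # Ceiling division, so a partial chunk at the edge still counts as a chunk.
    return prod(-(-d // c) for d, c in zip(dim_sizes, chunk_sizes))


def split_chunk_shape(dim_sizes, c):
    """Chunk shape for the proposed scheme, split after dimension c (1-based).

    Assumed interpretation: the leading c dimensions get chunk length 1 and
    the remaining dimensions are kept whole.
    """
    m = len(dim_sizes)
    if not 1 <= c < m:
        raise ValueError("split position must satisfy 1 <= c < m")
    return [1] * c + list(dim_sizes[c:])


# Example from the message: x[d1=8, d2=12] chunked 4 by 4.
current_count = per_dimension_chunks([8, 12], [4, 4])

# Proposed scheme on the same variable with c = 1.
shape = split_chunk_shape([8, 12], 1)
elements_per_chunk = prod(shape)                      # d(c+1) * ... * dm
proposed_count = per_dimension_chunks([8, 12], shape)  # d1 * ... * dc
```

If that reading is right, the shape returned by split_chunk_shape is the
same kind of per-dimension chunk-size list that the existing
nc_def_var_chunking call already accepts, so the proposal would amount to a
shorthand rather than a new storage layout.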
