Re: [netcdfgroup] nccopy -c does not rechunk properly (4.3.1.1)

Hi Russ,

On Thu, Feb 27, 2014 at 2:38 PM, Russ Rew <russ@xxxxxxxxxxxxxxxx> wrote:

>   #define CHUNK_THRESHOLD (8192)   /* variables with fewer bytes don't get
> chunked */
>
> The intent of the CHUNK_THRESHOLD minimum is to not create chunks
> smaller than a physical disk block, as an I/O optimization, because
> attempting to read a smaller chunk will still cause a whole disk block
> to be read.
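
The threshold logic Russ describes can be sketched roughly like this (a simplified illustration only; the real nccopy logic lives in the netCDF C sources, and `should_chunk` is my own name, not a library function):

```python
CHUNK_THRESHOLD = 8192  # bytes; roughly one physical disk block

def should_chunk(shape, itemsize):
    """Return True if a variable of this shape and element size
    exceeds the threshold and is therefore worth chunking."""
    nbytes = itemsize
    for dim in shape:
        nbytes *= dim
    return nbytes > CHUNK_THRESHOLD

print(should_chunk((1000,), 8))  # 8000 bytes  -> False, stays contiguous
print(should_chunk((2000,), 8))  # 16000 bytes -> True, gets chunked
```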


So I take it 8k is a reasonable expectation for the physical disk block size these days?

But this is a great tidbit -- I'm working on code to write data in the
"new" UGRID standard:

https://github.com/ugrid-conventions/ugrid-conventions

And the code:
https://github.com/pyugrid/pyugrid

And I wanted to set some reasonable defaults for chunking. With UGRID you
tend to have a lot of large 1-d arrays, whereas most of the discussions
I've seen are about multi-dimensional arrays. It sounds like I should set a
minimum chunk size of 8k bytes, then.


>   However, I think for the next
> release, we should lower the default threshold to 512 bytes, and
> document the behavior.
>

Document -- of course, but why lower the threshold?

The thresholds make sense as defaults, but if a user explicitly asks for
smaller-than-optimal chunk sizes, maybe that's what they should get.

-Chris

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@xxxxxxxx