Hi,

Five years ago we started to use the then-available version of netCDF; let's call that version 0.0. Yesterday I downloaded 2.2.22. When I compare the speed of opening a file, reading a variable (with 2M entries in it), and writing the variable to a new file, or of just opening a file and reading a variable, the old version is significantly faster:
                           time    system  user    total cpu  time/iter   cpu time/iter
measurement (size = 25)    (ms)    (ms)    (ms)    (ms)       (usec)      (usec)
netcdf-0.0     warmup      18173   2220    12102   14322      726920.0    572880.0
netcdf-0.0     copy        15363   2138    11150   13288      614520.0    531520.0
netcdf-0.0     read         5447    830     4618    5448      217880.0    217920.0
netcdf-2.2.22  warmup      22418   2935    18656   21591      896720.0    863640.0
netcdf-2.2.22  copy        21037   2557    18437   20994      841480.0    839760.0
netcdf-2.2.22  read         6057    629     5716    6345      242280.0    253800.0

When I use a zipped input file (16 times smaller on disk) I see:

netcdf-2.2.22  warmup      22590   2888    18983   21871      903600.0    874840.0
netcdf-2.2.22  copy        20691   2558    18215   20773      827640.0    830920.0
netcdf-2.2.22  read         5969    544     5650    6194      238760.0    247760.0

(size = 25 means the copy or read operation was performed 25 times; look at cpu time per iteration.)
My initial conclusion, based on cpu time per iteration, is that reading becomes about 15% slower and copying about 50% slower. Because of these findings I'm reluctant to upgrade the library.
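As a quick sanity check (not part of the original benchmark), the slowdown figures can be recomputed directly from the cpu-time-per-iteration numbers reported above:

```java
public class SlowdownCheck {
    public static void main(String[] args) {
        // cpu time per iteration (usec), taken from the measurements above
        double readOld = 217920.0, readNew = 253800.0; // netcdf-0.0 vs netcdf-2.2.22
        double copyOld = 531520.0, copyNew = 839760.0;

        double readSlowdown = (readNew - readOld) / readOld * 100.0; // ~16%
        double copySlowdown = (copyNew - copyOld) / copyOld * 100.0; // ~58%

        System.out.printf("read: %.0f%% slower, copy: %.0f%% slower%n",
                readSlowdown, copySlowdown);
    }
}
```

So the exact ratios are roughly 16% for read and 58% for copy, consistent with the ~15%/~50% summary.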
Can you confirm my findings? Do you have any hints on how I could get a performance boost? We typically load 30 files into memory all at once. Each file is about 10M unzipped on disk; the example file was 16M unzipped. Once the files are loaded, the arrays in them are merged and then iterated over a couple of times.
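For what it's worth, a harness with the shape implied by the output above (wall time, system/user CPU time, and per-iteration averages over size = 25 runs) can be sketched as follows. The actual netCDF open/read/copy step is replaced by a placeholder Runnable, since the real workload code isn't shown here:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class Bench {
    // Run the workload 'size' times and print totals plus per-iteration
    // averages, mirroring the fields in the measurements above.
    static void measure(String name, int size, Runnable workload) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long wallStart = System.nanoTime();
        long cpuStart  = bean.getCurrentThreadCpuTime();   // user + system, ns
        long userStart = bean.getCurrentThreadUserTime();  // user only, ns
        for (int i = 0; i < size; i++) {
            workload.run();
        }
        long wallMs = (System.nanoTime() - wallStart) / 1_000_000;
        long cpuMs  = (bean.getCurrentThreadCpuTime() - cpuStart) / 1_000_000;
        long userMs = (bean.getCurrentThreadUserTime() - userStart) / 1_000_000;
        long sysMs  = cpuMs - userMs;
        System.out.printf("measurement for %s%n", name);
        System.out.printf(
            "size = %d  time = %d ms  system time = %d ms  user time = %d ms  total cpu time = %d ms%n",
            size, wallMs, sysMs, userMs, cpuMs);
        System.out.printf(
            "time per iteration = %.1f usec  cpu time per iteration = %.1f usec%n",
            wallMs * 1000.0 / size, cpuMs * 1000.0 / size);
    }

    public static void main(String[] args) {
        // Placeholder workload: the real benchmark would open a NetcdfFile
        // and read or copy a 2M-entry variable here.
        measure("read", 25, () -> {
            double s = 0;
            for (int i = 0; i < 1_000_000; i++) s += Math.sqrt(i);
            if (s < 0) System.out.println(s); // prevent dead-code elimination
        });
    }
}
```

With 5447 ms total wall time over 25 iterations this reproduces the reported 217880.0 usec per iteration, so the arithmetic in the output is just total / size.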
Thanks for your help,

Jeroen Kruijf
IMC Trading