Hi Jon:

The need to read the entire variable, I think, only happens when the data is compressed; then you have no choice but to uncompress the whole thing and then subset it. Can you check to see if that's what's happening in your case? I think an hdfdump will tell you if the variable is compressed.

Thanks,
John

Jon Blower wrote:
Hi all,

I have been trying to use the latest version of the Java NetCDF libraries (2.2.10) to read data from some rather large HDF5 files. I kept running out of memory, and after further investigation I found the cause. It seems that, when using Variable.read() to get data from the file, *all* the data from the variable is read into memory, no matter what subset is specified. So read("0,0,0") will read all of the variable's data into memory, then wrap it as an Array object with a logical size of one data point.

If I remember correctly, this used to be the behaviour for NetCDF files too, until the new version of the libraries. It means that reading even small subsets of data from large HDF5 files is very slow or impossible. Is it possible to read a subset of data from an HDF5 file using the NetCDF libs without loading all the data into memory?

Thanks,
Jon

--------------------------------------------------------------
Dr Jon Blower              Tel: +44 118 378 5213 (direct line)
Technical Director         Tel: +44 118 378 8741 (ESSC)
Reading e-Science Centre   Fax: +44 118 378 6413
ESSC                       Email: jdb@xxxxxxxxxxxxxxxxxxxx
University of Reading
3 Earley Gate
Reading RG6 6AL, UK
--------------------------------------------------------------
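For reference, the subset call Jon describes looks like the sketch below. The file name "bigfile.h5" and variable name "temperature" are placeholders; the calls shown (NetcdfFile.open, findVariable, and the two Variable.read forms) are the standard NetCDF-Java 2.2 API. Either form asks for a single point, but as Jon observes, in 2.2.10 the HDF5 layer may still pull the whole variable into memory before subsetting.

```java
import java.io.IOException;
import ucar.ma2.Array;
import ucar.ma2.InvalidRangeException;
import ucar.nc2.NetcdfFile;
import ucar.nc2.Variable;

public class SubsetRead {
    public static void main(String[] args) throws IOException, InvalidRangeException {
        // File and variable names are placeholders for illustration.
        NetcdfFile ncfile = NetcdfFile.open("bigfile.h5");
        try {
            Variable v = ncfile.findVariable("temperature");

            // Section-spec form: one index along each dimension.
            Array point = v.read("0,0,0");

            // Equivalent origin/shape form: start at (0,0,0), read a 1x1x1 block.
            Array same = v.read(new int[] {0, 0, 0}, new int[] {1, 1, 1});

            // The returned Array has a logical size of one element,
            // regardless of how much data was read under the hood.
            System.out.println(point.getSize());
        } finally {
            ncfile.close();
        }
    }
}
```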
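John's suggestion to check for compression can be done with HDF5's h5dump tool: the -p (properties) flag adds each dataset's storage layout and filter pipeline to the header output, so a compressed dataset shows a FILTERS block (e.g. DEFLATE). A sketch, with file and dataset names as placeholders:

```shell
# Print headers plus storage properties (chunking, filters) for the whole file;
# -H suppresses the data itself.
h5dump -p -H bigfile.h5

# Or restrict the output to a single dataset:
h5dump -p -H -d /temperature bigfile.h5
```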
netcdf-java archives