[netcdfgroup] File with large number of variables

Hi,
I have to write and read data to/from a netCDF file that has 750
variables, each using one unlimited dimension (some of these
dimensions are shared), plus 10 fixed dimensions.

I have to use netCDF-4 (because of the multiple unlimited dimensions
requirement) and the C API.

I'm doing some prototyping on my development machine (Linux, 2 GB RAM)
and have found several performance issues that I hope someone can help
me fix or understand:

(1) When I create a file and try to define 1000 variables (all int)
sharing a single unlimited dimension, the process takes all
available RAM (swap included) and fails with "Error (data:def closed)
-- HDF error" after a long while.

If I do the same but close and reopen the file after every 10 or 100
new definitions, it works fine. I can work around this by creating the
file once (with ncgen) and using a copy of it for every new file, but I
would prefer not to. Why does creating the variables take that much
memory?

(2) When writing and reading data, there's a huge performance
difference between writing/reading one record at a time and
writing/reading several records at a time (buffering). To keep the
logic of my program simple, my first approach was to write records one
at a time (the program works this way: it reads 1 record from each
variable, processes it and writes it out) and to play with the chunk
size and chunk cache, but so far that hasn't helped much.

Should I build a custom "buffering" layer, or can the chunk cache help
here? Or should I simply get more RAM :)?
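
To make the "buffering" idea concrete, here is a rough sketch of what I
have in mind: a small per-variable buffer that collects records and hands
them off in blocks. Everything in it (rec_buf, flush_fn, REC_BUF_CAP,
count_sink) is made up for illustration and is not part of the netCDF
API. I am assuming the real flush callback would wrap something like
nc_put_vara_int() for a 1-D variable; the counting sink below only stands
in for that so the sketch compiles without the library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flush callback: receives `count` records whose first record
 * has index `start` in the variable. Returns 0 on success.
 * Assumption: a real one would call
 * nc_put_vara_int(ncid, varid, &start, &count, data). */
typedef int (*flush_fn)(void *ctx, size_t start, size_t count, const int *data);

#define REC_BUF_CAP 1024 /* records held before a flush; arbitrary value */

typedef struct {
    int      data[REC_BUF_CAP];
    size_t   used;       /* records currently buffered */
    size_t   next_start; /* record index of data[0] in the variable */
    flush_fn flush;
    void    *ctx;
} rec_buf;

static void rec_buf_init(rec_buf *b, flush_fn flush, void *ctx)
{
    assert(flush != NULL);
    b->used = 0;
    b->next_start = 0;
    b->flush = flush;
    b->ctx = ctx;
}

/* Hand all buffered records to the callback in a single call. */
static int rec_buf_flush(rec_buf *b)
{
    if (b->used == 0)
        return 0;
    int rc = b->flush(b->ctx, b->next_start, b->used, b->data);
    if (rc != 0)
        return rc;
    b->next_start += b->used;
    b->used = 0;
    return 0;
}

/* Append one record; flushes first if the buffer is already full. */
static int rec_buf_put(rec_buf *b, int value)
{
    if (b->used == REC_BUF_CAP) {
        int rc = rec_buf_flush(b);
        if (rc != 0)
            return rc;
    }
    b->data[b->used++] = value;
    return 0;
}

/* Illustration-only sink: counts calls and records instead of writing. */
typedef struct {
    size_t calls;
    size_t records;
} count_sink;

static int count_sink_flush(void *ctx, size_t start, size_t count, const int *data)
{
    count_sink *s = ctx;
    (void)start;
    (void)data;
    s->calls++;
    s->records += count;
    return 0;
}
```

The program logic could stay one record at a time (call rec_buf_put() per
record, rec_buf_flush() before closing the file). With 750 variables and
1024 ints each, the buffers would take about 3 MB, so the cost is small
compared to my 2 GB.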

(3) Even when buffering, I see a performance degradation (free memory
drops quickly and processing time increases) as the number of records
per variable processed (written or read) increases.

I could really use some "expert" advice on the best way to address these issues.

Thanks in advance.

Dani


