Hi Dani,

If you are really interested in program efficiency, then the HDF5 API should be considered. The NetCDF4 API is actually a wrapper on top of the HDF5 API, providing an interface familiar to NetCDF users. NetCDF4 provides a "simpler" interface in one respect: the user doesn't have to worry about closing objects opened or created earlier in the program. And this comes at a price: the NetCDF API must keep the whole file structure in memory. That's why the NetCDF API works much slower (and takes much more memory) than the HDF5 API on files with complex structure.

I have replaced the NetCDF code with HDF5 in your example. The resulting code is shorter and it will run much faster: please try.

Regards,
Sergei

-----Original Message-----
From: netcdfgroup-bounces@xxxxxxxxxxxxxxxx [mailto:netcdfgroup-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Dani
Sent: 03 May 2010 10:41
To: Ed Hartnett
Cc: netcdfgroup@xxxxxxxxxxxxxxxx
Subject: Re: [netcdfgroup] File with large number of variables

Setting the cache to 0 has solved the problem on the definition of the file. Thanks a lot.

Unfortunately, I'm still not able to write efficiently to the file I just created. It looks like every call to nc_put_vara takes memory that is not released. I attach a code snippet to illustrate this. It is very clear when executing with num_var = 100 (makes the test faster), num_elements_var=10000 and buffer_size=1. If I increase buffer_size the problem is less obvious but it's still there (set buffer_size = 10 and increase num_elements_var=100000). It does not seem to be related to num_var this time, but to the number of times nc_put_vara is called.

Any ideas?

Thanks in advance,
Dani

On Fri, Apr 30, 2010 at 8:26 PM, Ed Hartnett <ed@xxxxxxxxxxxxxxxx> wrote:
> Dani <pressec@xxxxxxxxx> writes:
>
>> Hi,
>> I have to write and read data to/from a netcdf file that has 750
>> variables, all of them using unlimited dimensions (only one per
>> variable, some dimensions shared) and 10 fixed dimensions.
>>
>> I have used netcdf-4 (because of the multiple unlimited dimensions
>> requirement) and the C API.
>>
>> I'm doing some prototyping on my development machine (Linux 2GB RAM)
>> and found several performance issues that I hope someone can help me
>> fix/understand:
>>
>> (1) when I create a file and try to define 1000 variables (all int)
>> and a single shared unlimited dimension, the process takes all
>> available RAM (swap included) and fails with "Error (data:def closed)
>> -- HDF error" after a (long) while.
>>
>> If I do the same closing and opening the file again every 10 or 100
>> new definitions, it works fine. I can bypass this by creating the
>> file once (ncgen) and using a copy of it on every new file, but I
>> would prefer not to. Why does creating the variables take that much
>> memory?
>
> When you create a netCDF variable, HDF5 allocates a buffer for that
> variable. The default size of the buffer is 1 MB.
>
> I have reproduced your problem, but it can be solved by explicitly
> setting the buffer size for each variable to a lower value. I have
> checked in my tests in libsrc4/tst_vars3.c, but here's the part with the
> cache setting:
>
>    for (v = 0; v < NUM_VARS; v++)
>    {
>       sprintf(var_name, "var_%d", v);
>       if (nc_def_var(ncid, var_name, NC_INT, 1, &dimid, &varid)) ERR_RET;
>       if (nc_set_var_chunk_cache(ncid, varid, 0, 0, 0.75)) ERR_RET;
>    }
>
> Note the call to nc_set_var_chunk_cache(), right after the call to
> nc_def_var.
>
> When I take this line out, I get a serious slowdown around 4000
> variables. (I have more memory available than you do.)
>
> But when I add the call to nc_set_var_chunk_cache(), setting the chunk
> cache to zero, then there is no slowdown, even for 10,000 variables.
>
> Thanks,
>
> Ed
> --
> Ed Hartnett -- ed@xxxxxxxxxxxxxxxx
Attachment: testlimits.c