I have a problem: my netCDF write performance degrades as my file size increases. I have several variables with the unlimited dimension, and at each timestep of my simulation I write those variables out, presumably to the end of the netCDF file. I am not manipulating any dimension attributes or accessing the file anywhere but at the end (in theory).

At the beginning of my simulation, the netCDF writes are fast and the code runs normally. As the simulation proceeds, however, the netCDF write call takes longer and longer, eventually overwhelming the simulation and dominating the processor. It feels like the whole file is read or manipulated on each timestep, even though it shouldn't be.

I am using Konrad Hinsen's Python wrappers for netCDF on a Pentium III cluster running Red Hat Linux 7.1 (and LAM 6.5.2 for MPI, but that shouldn't matter here), with netCDF 3.5-beta6. I have already corresponded with Konrad about this; he has not seen this problem before and thinks that the Python wrapper should be consistently fast (and so do I, looking at the code). The call he uses to write the data is ncvarputg() (from the old netCDF-2 API, right?).

A simple way for me to demonstrate the problem is to write the data at a constant value of the unlimited dimension, instead of incrementing it by one each time. If I always write to, for example, time = 5 (time is the unlimited dimension), then performance is consistent and fast. If I write to frame 5000, for instance, performance is consistently slow.

Thanks for any insight,

John

--
John Galbraith                   email: jgalb@xxxxxxxx
Los Alamos National Laboratory,  home phone: (505) 662-3849
                                 work phone: (505) 665-6301
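For concreteness, the write pattern described above looks roughly like the sketch below. This is a minimal illustration assuming Konrad Hinsen's Scientific.IO.NetCDF wrapper and the Numeric package of that era; the file name, dimension sizes, and variable names are made up for the example.

    from Scientific.IO.NetCDF import NetCDFFile
    import Numeric

    # Illustrative file and variable names.
    f = NetCDFFile('output.nc', 'w')
    f.createDimension('time', None)     # unlimited (record) dimension
    f.createDimension('x', 100)
    u = f.createVariable('u', Numeric.Float, ('time', 'x'))

    data = Numeric.zeros(100, Numeric.Float)
    for step in range(10000):
        # Appending one record at the end of the unlimited dimension;
        # this is the case that slows down as the file grows.
        # Writing to a fixed record (e.g. u[5] = data) stays fast.
        u[step] = data
        f.sync()                        # flush the record to disk

    f.close()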