Dear developers,
I am seeing a memory leak in NF90_GET_VAR (hdf5-1.8.2 with netcdf-4.0.1,
and also the previous release, netcdf-4.0). I am reading multiple (say 100)
netcdf-4 files and combining the data into a single file. (Each file
contains a rectangular portion of a contiguous grid.) The data
variables are 3D plus an unlimited time dimension. It appears that
the memory increase occurs on the first read of a given variable from
a given file; reading successive times of that variable doesn't
cause any further increase, that is, for successive values of 'k' in
the loop below:
DO ivar = 1, nvariables
   ....
   DO k = 1, ntimes
      DO jj = 1, nfiles
         ...
         ! read portion of 3d data from each file:
         status = NF90_GET_VAR(ncid(jj), varid1, &
                      var3D(istart:istop,jstart:jstop,:), start=(/1,1,1,k/))
      ENDDO
      ! write big combined 3d data
      status = NF90_PUT_VAR(ncid1, varid2, var3D, (/1,1,1,k/))
   ENDDO
ENDDO
I'm guessing that the memory leak is about the size of the 3d array
that is read, but it's hard to say. After reading 43 3d variables
from 100 files, the memory usage (shown by 'top') has grown by almost
2GB, which is about the total data size of the first reads (1.8GB).
That could be a coincidence, but it is consistent with what I see on a
second, different dataset. In 'top', I see the memory jump at the first
'k' iteration. If I comment out only the NF90_GET_VAR call, the memory
usage doesn't change at all. When another program reads the data back
from the single big netcdf file, I don't think I see any undue memory
usage -- just approximately the total data size for one time level.
I have no idea whether this is a netcdf or an hdf5 issue. Since I'm
calling from Fortran, I wonder whether some temporary memory used to
convert the data between C and Fortran array ordering isn't being
released?
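In case it helps, here is roughly the stripped-down test I plan to run to
isolate the leak; the file names, variable name, and array sizes below are
placeholders for my actual data:

PROGRAM leak_test
   USE netcdf
   IMPLICIT NONE
   INTEGER, PARAMETER :: nfiles = 100, ntimes = 10
   INTEGER, PARAMETER :: nx = 100, ny = 100, nz = 50
   REAL :: var3D(nx,ny,nz)
   CHARACTER(LEN=64) :: fname
   INTEGER :: jj, k, ncid(nfiles), varid, status

   ! open all tile files up front, as in the real program
   DO jj = 1, nfiles
      WRITE(fname,'(A,I3.3,A)') 'tile_', jj, '.nc'
      status = NF90_OPEN(fname, NF90_NOWRITE, ncid(jj))
   ENDDO

   ! read the same variable at successive times from every file;
   ! the memory jump I report happens on the first read from each file
   DO k = 1, ntimes
      DO jj = 1, nfiles
         status = NF90_INQ_VARID(ncid(jj), 'temperature', varid)
         status = NF90_GET_VAR(ncid(jj), varid, var3D, &
                               start=(/1,1,1,k/), count=(/nx,ny,nz,1/))
      ENDDO
      ! (check memory usage with 'top' after each k)
   ENDDO

   DO jj = 1, nfiles
      status = NF90_CLOSE(ncid(jj))
   ENDDO
END PROGRAM leak_test

With no leak I would expect the resident size shown by 'top' to stay
roughly constant after the first pass over the files.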
Thanks for any help, and I'm willing to test fixes.
Best,
-- Ted