NOTICE: This version of the NSF Unidata web site (archive.unidata.ucar.edu) is no longer being updated.
Current content can be found at unidata.ucar.edu.
Is this writing a netcdf-3 file or a netcdf-4 file?

=Dennis Heimbigner
 Unidata

On 11/10/2016 2:01 PM, Liam Forbes wrote:
Hello! We are installing netCDF 4.4.1 with HDF5 1.8.17 on our new Intel-based
cluster. We've noticed that read performance on this cluster using ncks is
extremely slow compared to a couple of other systems. For example, parsing a
file on our Lustre 2.1-based filesystem takes less than 8 seconds on our Cray
XK6-200m. Parsing the same file on the same filesystem on our new cluster
takes 30+ seconds, with most of that time apparently spent reading the file.

Cray (hostname fish):

    fish1:lforbes$ time ncks test.nc out.nc
    real    0m4.804s
    user    0m3.180s
    sys     0m1.300s

Cluster (hostname chinook):

    n0:loforbes$ time ncks mod.nc out.nc
    real    0m32.435s
    user    0m29.240s
    sys     0m1.936s

As part of trying to figure out what's going on, I strace'd the process on
both systems. One thing that jumps out at me is that the process running on a
compute node of our new cluster executes a _lot_ more brk() calls to allocate
additional memory than on a login node of our Cray; at least 8 times as many
in one test comparison (strace output files are available). I'm not sure if
this means anything, or how I can influence this behaviour. I've tried
recompiling netCDF on our new cluster a variety of ways, stripping out
features like szip and enabling others like MMAP, but none of the changes has
affected the performance.

Based on what I've seen googling and reading through the mailing list
archives, I've also tried using `ncks --fix_rec_dmn` to generate a new version
of the input file (which is just over 650 MB) with a fixed, rather than
unlimited, time dimension. The ncdump output for both files follows.
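The brk() counts mentioned above can be pulled out of the strace logs
mechanically rather than eyeballed; a minimal shell sketch, using a tiny
inlined sample in place of the real log file (the filename is hypothetical):

```shell
# Count brk() calls in an strace log. A tiny inlined sample stands in
# for a real log; point grep at the actual strace output file instead.
printf 'brk(NULL) = 0x55e0\nread(3, ""..., 4096) = 4096\nbrk(0x55f0) = 0x55f0\n' \
    > /tmp/sample_strace.log
grep -c '^brk(' /tmp/sample_strace.log   # prints 2
```

Running `strace -c -f ncks in.nc out.nc` on each system would give per-syscall
totals directly, which makes the cross-machine comparison easier.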
    chinook01:loforbes$ ncdump -k test.nc
    netCDF-4
    chinook01:loforbes$ ncdump -k mod.nc
    netCDF-4

    chinook01:loforbes$ ncdump -s test.nc | head
    netcdf test {
    dimensions:
            time = UNLIMITED ; // (21 currently)
            nv = 2 ;
            x = 352 ;
            y = 608 ;
            nv4 = 4 ;
    variables:
            double time(time) ;
                    time:units = "seconds since 1-1-1" ;

    chinook01:loforbes$ ncdump -s mod.nc | head
    netcdf mod {
    dimensions:
            time = 21 ;
            y = 608 ;
            x = 352 ;
            nv4 = 4 ;
            nv = 2 ;
    variables:
            float basal_mass_balance_average(time, y, x) ;
                    basal_mass_balance_average:units = "kg m-2 year-1" ;

This also didn't seem to make a difference.

Unfortunately, as the cluster administrator, my netCDF knowledge is very
limited. The test file was provided by the researcher reporting this problem.
What he is experiencing is a significant application slowdown, because this
issue occurs at every time step when he reads/writes files. It more than
doubles the run time, making our new cluster unusable to him. I don't think
anything is necessarily "broken" in netCDF, but I'm not sure what further
diagnostics to attempt or whether there are other changes to the input file
that I and the researcher should try. Any help would be appreciated. Thank
you.

--
Regards,
-liam

-There are uncountably more irrational fears than rational ones. -P. Dolan
Liam Forbes   loforbes@xxxxxxxxxx   ph: 907-450-8618   fax: 907-450-8601
UAF Research Computing Systems Senior HPC Engineer   LPIC1, CISSP

_______________________________________________
NOTE: All exchanges posted to Unidata maintained email lists are recorded in
the Unidata inquiry tracking system and made publicly available through the
web. Users who post to any of the lists we maintain are reminded to remove any
personal information that they do not want to be made public.

netcdfgroup mailing list
netcdfgroup@xxxxxxxxxxxxxxxx
For list information or to unsubscribe, visit:
http://www.unidata.ucar.edu/mailing_lists/
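On the netcdf-3 vs netcdf-4 question: `ncdump -k`, as shown in the transcript,
prints the file's format kind, and only the HDF5-based kinds involve chunking
and a chunk cache, which is one place netCDF-4 read performance can differ
between systems and builds. A minimal sketch of branching on that string (the
`kind` value is hard-coded here in place of a real `ncdump -k` invocation):

```shell
# ncdump -k reports one of: classic, 64-bit offset, netCDF-4,
# or "netCDF-4 classic model". Hard-coded below in place of:
#   kind=$(ncdump -k test.nc)
kind="netCDF-4"
case "$kind" in
  netCDF-4*) echo "HDF5-based: chunking and chunk-cache settings apply" ;;
  *)         echo "classic format: no HDF5 layer involved" ;;
esac   # prints: HDF5-based: chunking and chunk-cache settings apply
```

Both files in the transcript report "netCDF-4", so they go through the HDF5
read path on both machines.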