On 05/07/2014 09:39 AM, Kent Yang wrote:
-- There should be a paper that reports a FLASH I/O benchmark comparison between parallel NetCDF from Northwestern (or parallel netcdf-3) and parallel HDF5. However, it is an unfair comparison: it used collective I/O for parallel NetCDF-3 but independent I/O for parallel HDF5. You can find more details about a fair comparison, with collective I/O for both packages, at http://www.spscicomp.org/ScicomP12/Presentations/User/Yang.pdf
netCDF keeps define mode and data mode separate. This restricts what the user can do, but it also means that once you are out of define mode, the metadata will not change.
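To make the define/data split concrete, here is a toy sketch in Python. It is not the netCDF or pnetcdf API: `ToyDataset` and its methods are hypothetical names, loosely modeled on the define-then-`nc_enddef`-then-write pattern. All it shows is that once define mode ends, writes touch only bulk data.

```python
class ToyDataset:
    """Toy stand-in for a netCDF-style file; not the real netCDF API."""

    def __init__(self):
        self.in_define_mode = True   # a new file starts in define mode
        self.shapes = {}             # metadata: variable name -> shape
        self.data = {}               # bulk data: variable name -> flat list

    def def_var(self, name, shape):
        # Metadata may only change while in define mode.
        if not self.in_define_mode:
            raise RuntimeError("not in define mode")
        self.shapes[name] = tuple(shape)

    def enddef(self):
        # Leave define mode; the metadata is fixed from here on.
        self.in_define_mode = False

    def put_var(self, name, values):
        # Data writes are only allowed in data mode and change no metadata.
        if self.in_define_mode:
            raise RuntimeError("still in define mode")
        expected = 1
        for n in self.shapes[name]:
            expected *= n
        if len(values) != expected:
            raise ValueError("size mismatch")
        self.data[name] = list(values)


ds = ToyDataset()
ds.def_var("temperature", (2, 3))
ds.enddef()
ds.put_var("temperature", [0.0] * 6)

# Trying to add a variable in data mode is rejected.
try:
    ds.def_var("pressure", (2, 3))
    redefine_allowed = True
except RuntimeError:
    redefine_allowed = False
```

The real library does let you re-enter define mode (`nc_redef`); the toy leaves that out to keep the point simple.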
HDF5's metadata bookkeeping means a write requires not only a bulk data update but also an update to a bit of metadata. That is not a huge deal if you are moving tons of data, but if you are working with many small datasets, it can be a factor.
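A back-of-the-envelope way to see why dataset size matters: treat each write as a fixed metadata cost plus a transfer time. This is a rough sketch of that arithmetic, not a model of HDF5 internals. The function name and both constants are made up for illustration and are not measurements.

```python
def metadata_fraction(payload_bytes, bandwidth_bytes_per_s, metadata_cost_s):
    """Fraction of one write's time spent on the metadata update."""
    transfer_s = payload_bytes / bandwidth_bytes_per_s
    return metadata_cost_s / (metadata_cost_s + transfer_s)


# Hypothetical numbers chosen only for illustration, not measurements.
BANDWIDTH = 1e9        # bytes per second
METADATA_COST = 1e-3   # seconds of metadata work per write

small = metadata_fraction(1e4, BANDWIDTH, METADATA_COST)   # 10 KB dataset
large = metadata_fraction(1e9, BANDWIDTH, METADATA_COST)   # 1 GB dataset
```

Under these assumed numbers the metadata update dominates the small write and is negligible for the large one, which is the effect described above.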
Be aware that this comparison is also a bit old. I don't know the current status of these two packages.
For most people and most workloads, the simple fact that either pnetcdf or HDF5 is being used is great. I think both libraries have pain points (like when parallel-netcdf tries to read or write one of several record variables) where performance can suffer.
==rob

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA