
Re: [netcdfgroup] Content-Based Checksums of a netCDF dataset

Hi Charlie,

This sounds like exactly what I had in mind.

https://github.com/willirath/netcdf-hash/blob/master/ncks_digest_demo.sh fetches an example file, converts it into several differently stored but data-equivalent files, and runs `ncks` with `--md5_write_attribute` on all of them. The results look very promising; a rough sketch of the workflow follows below.
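
Roughly, the demo does something like this (a sketch, not the script verbatim; file names are placeholders, and the flag spellings follow the NCO manual):

    # start from some existing netCDF-3 file (placeholder name)
    cp /path/to/some_dataset.nc example.nc3

    # create differently stored but data-equivalent copies
    ncks -O -4      example.nc3 example.nc4       # convert to netCDF-4
    ncks -O -4 -L 9 example.nc3 example_dfl9.nc4  # netCDF-4, deflate level 9

    # write per-variable MD5 digests into each copy as attributes
    for f in example.nc3 example.nc4 example_dfl9.nc4; do
        ncks -O --md5_write_attribute "$f" "md5_$f"
    done

    # the stored MD5 attributes should be identical across all three outputs
    for f in md5_example.nc3 md5_example.nc4 md5_example_dfl9.nc4; do
        echo "== $f"; ncks -m "$f" | grep -i md5
    done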

I think I'll be able to do everything I need with what's already possible in NCO. (That sentence must have been written many times before!)

I'm having a few issues with the order of dimensions in the output of `ncks`, though. I'll double-check and get back to you via the NCO bug tracker if necessary.

Thanks a lot!

Willi

On 08/24/2017 09:30 PM, Charlie Zender wrote:
NCO has supported MD5 digests since 2012

http://nco.sf.net/nco.html#md5

If I understand your intent, this might be
part of a suitable solution, since you can
use NCO to compute/print (and, optionally, store as attributes)
the MD5 hashes of every field in a file,
and these hashes should agree for the same
data regardless of the underlying storage
(compression level, filesystem type, etc.).
It is a foolproof way to verify data integrity.
And NCO can get the files from remote filesystems
via OPeNDAP, HSI, ESGF, or scp/ftp.

Feedback welcome.
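
For reference, the options mentioned above look roughly like this (a sketch; file names and the URL are placeholders, flag spellings per the NCO manual linked above):

    # compute MD5 digests of every variable while copying/subsetting a file
    ncks -O --md5_digest in.nc out.nc

    # or store the per-variable digests as attributes in the output file
    ncks -O --md5_write_attribute in.nc out_md5.nc

    # the input can also be remote, e.g. an OPeNDAP URL (placeholder),
    # so digests can be computed directly against the served data
    ncks -O --md5_digest https://example.org/dodsC/dataset.nc local_copy.nc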

--

Willi Rath
Theorie und Modellierung
GEOMAR
Helmholtz-Zentrum für Ozeanforschung Kiel
Duesternbrooker Weg 20, Raum 422
24105 Kiel, Germany
------------------------------------------------------------
Tel. +49-431-600-4010
wrath@xxxxxxxxx
www.geomar.de
-----------------------


