
Re: [netcdfgroup] Content-Based Checksums of a netCDF dataset

You can turn on HDF5 checksums with nc_def_var_fletcher32(). See:
https://www.unidata.ucar.edu/software/netcdf/netcdf-4/newdocs/netcdf-c/nc_005fdef_005fvar_005ffletcher32.html

Is this what you want?

Thanks,
Ed Hartnett

On Thu, Aug 24, 2017 at 12:04 PM, dmh@xxxxxxxx <dmh@xxxxxxxx> wrote:

> A small note. Since the goal is equality testing rather than security,
> you should be able to get by with CRC32 or CRC64 checksums.
> SHA256 is overkill.
> =Dennis Heimbigner
>  Unidata
>
>
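To illustrate the point above about equality testing: the sketch below compares a CRC32 checksum with a SHA-256 digest using only the Python standard library. The helper names and the sample byte strings are made up for this illustration and do not come from the thread. A 32-bit CRC is much shorter than a SHA-256 digest and can collide by accident, which is usually acceptable when the inputs are not adversarial.

```python
import hashlib
import zlib


def crc32_hex(payload: bytes) -> str:
    """Return the CRC32 of payload as 8 lowercase hex digits."""
    return format(zlib.crc32(payload) & 0xFFFFFFFF, "08x")


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 digest of payload as 64 lowercase hex digits."""
    return hashlib.sha256(payload).hexdigest()


# Hypothetical sample payloads: two identical, one differing in the last digit.
first = b"temperature: 271.3 272.1 273.0"
second = b"temperature: 271.3 272.1 273.0"
changed = b"temperature: 271.3 272.1 273.1"

# Identical content gives identical checksums with either function.
same_crc = crc32_hex(first) == crc32_hex(second)
same_sha = sha256_hex(first) == sha256_hex(second)

# The changed payload differs from the first by a single bit,
# which CRC32 always detects.
differs_crc = crc32_hex(first) != crc32_hex(changed)
differs_sha = sha256_hex(first) != sha256_hex(changed)
```

Either function works for an "are these two payloads equal?" check. The practical difference is digest length (8 versus 64 hex digits) and resistance to deliberate tampering, which only the cryptographic hash provides.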
> On 8/24/2017 12:00 PM, Willi Rath wrote:
>
>> Hi all,
>>
>> I'd like to find a way to verify the contents of a given netCDF dataset
>> across different representations on disk.  (Think of the data set being
>> defined by its CDL code and different representations on disk being
>> realised by different choices of format, deflation, chunking, etc. but with
>> identical CDL.)
>>
>> There are tools that compare the contents of two netCDF files: cdo's diff
>> or nccmp. These tools do, however, rely on both files being present on the
>> same file system and at the same time.  A hash-based approach calculating
>> checksums from the contents rather than the binary representation of the
>> data set would be a nice solution to the problem.
>>
>> I've tried and collected all attempts made at verification of netCDF
>> files in: https://github.com/willirath/netcdf-hash (The most successful
>> of which circled around the possibility of including the functionality in
>> `ncks` and led to a pair of tools for calculation and verification of MD5
>> checksums of netCDF files that are stored within the files.)
>>
>> There also is a demo outlining an approach digesting different
>> representations of the same netCDF data set into a sha256 hash and storing
>> the hex-value of this hash in global attributes in the respective files.
>>
>> I'd be very happy about any pointers to additional ideas (or perhaps
>> existing tools) solving the problem of netCDF-content verification, about
>> suggestions, remarks, etc.
>>
>> Cheers
>> Willi
>>
>>
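The idea of hashing contents rather than the on-disk bytes can be sketched as follows. This is not the method used in the netcdf-hash repository or in ncks; it is a minimal illustration under stated assumptions. The dataset is modelled as plain Python dictionaries (a hypothetical stand-in for values already decoded by a netCDF library), so format, deflation and chunking no longer play a role. The function name, the serialization layout and the choice to sort by name are all assumptions made for this example.

```python
import hashlib
import struct


def content_digest(variables, attributes):
    """Hash decoded dataset contents in a canonical order.

    variables:  name -> (dimension names, struct type code, flat value list)
    attributes: name -> value (global attributes)
    Both layouts are hypothetical stand-ins for decoded netCDF data.
    """
    h = hashlib.sha256()
    # Sort by name so the digest does not depend on insertion order.
    for key in sorted(attributes):
        h.update(f"attr:{key}={attributes[key]!r};".encode("utf-8"))
    for name in sorted(variables):
        dims, code, values = variables[name]
        header = f"var:{name}({','.join(dims)}):{code}:{len(values)};"
        h.update(header.encode("utf-8"))
        # Fixed little-endian packing, independent of host byte order
        # and of how a file happened to store the values.
        h.update(struct.pack(f"<{len(values)}{code}", *values))
    return h.hexdigest()


# Two hypothetical "representations" of the same content, built in a
# different order, plus one with a changed value.
attrs = {"title": "demo"}
rep_a = {
    "time": (("time",), "i", [0, 1, 2]),
    "temp": (("time",), "d", [271.3, 272.1, 273.0]),
}
rep_b = dict(reversed(list(rep_a.items())))
rep_c = dict(rep_a, temp=(("time",), "d", [271.3, 272.1, 273.1]))

digest_a = content_digest(rep_a, attrs)
digest_b = content_digest(rep_b, attrs)
digest_c = content_digest(rep_c, attrs)
```

Here digest_a and digest_b agree because only the ordering differs, while digest_c reflects the changed value. A real tool would also have to settle how to treat per-variable attributes, fill values, groups and floating-point NaN payloads, none of which this sketch addresses.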