Tim Hume wrote:

> I like your idea. I threw together a quick pdksh script to implement
> something like you suggest. It assumes you have ncdump and the NetCDF
> operators (in particular the ncatted program). Basically, I ncdump the
> file, and calculate the MD5 sum. I then create a global attribute called
> md5sum. To check the file, I ncdump it again, being careful not to
> include the line containing the md5sum global attribute. If you look at
> the attached script you'll get the idea.
>
> The script seems to work OK on my Linux box, but I guess it is slow and
> inefficient, especially on large NetCDF files. Perhaps someone has a
> better solution, or might refine the script a bit?

I think there are some good reasons to keep hashes such as MD5 or SHA-1
external to the files they are intended to check, rather than embedded in
the files:

 - If the digest is embedded, then something that corrupts the file
   might also corrupt the digest.

 - It's awkward to check an embedded hash, because the hash has to be
   stripped out before the hash of the rest of the file can be recomputed.

 - Updating an embedded hash whenever the file is updated is
   unacceptably inefficient.

 - It's easier to protect an externally stored hash from modification
   or corruption than a large file; for example, the hash could be
   stored on write-once media.

However, if you want the convenience of a single file that contains its
own hash, I suggest just appending the hash to the end of the file. Such a
file will behave exactly like the original netCDF file with respect to
the netCDF interface, since nothing in the netCDF interface lets you
determine the size of the file or lets you read beyond the last data
written through the interface. If you try to read an array or record
past the end of the netCDF data, you get the error "Index exceeds
dimension bound".

If you want to verify that appending to a netCDF file won't damage it,
just append some text to the end of a netCDF file and run ncdump on it.
You should get the same output as for the original file and no error
messages.

The hash could easily be split off the end of the resulting file and
compared with the hash of the truncated file to verify that the file had
not been damaged.

--Russ
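
The append-and-split-off approach described above could be sketched roughly as follows in Python. This is only an illustration, not the script from the thread: the function names (`file_digest`, `append_digest`, `verify_appended_digest`) are hypothetical, and storing the hash as a fixed-length ASCII hex string at the very end of the file is an assumption made here so that the trailer is easy to locate.

```python
import hashlib
import os

# Assumption: MD5, as discussed in the thread; any hashlib algorithm name works.
DIGEST_NAME = "md5"
# Length of the hex digest trailer in bytes (32 for MD5).
DIGEST_HEX_LEN = hashlib.new(DIGEST_NAME).digest_size * 2


def file_digest(path, length=None, chunk_size=1 << 20):
    """Return the hex digest of the first `length` bytes of `path` (all if None)."""
    h = hashlib.new(DIGEST_NAME)
    remaining = os.path.getsize(path) if length is None else length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                break
            h.update(chunk)
            remaining -= len(chunk)
    return h.hexdigest()


def append_digest(path):
    """Append the hex digest of the file's current contents to the file."""
    digest = file_digest(path)
    with open(path, "ab") as f:
        f.write(digest.encode("ascii"))
    return digest


def verify_appended_digest(path):
    """Split the trailing digest off and compare it with the digest of the rest."""
    payload_len = os.path.getsize(path) - DIGEST_HEX_LEN
    if payload_len < 0:
        return False  # file is too short to carry a trailer at all
    with open(path, "rb") as f:
        f.seek(payload_len)
        stored = f.read(DIGEST_HEX_LEN).decode("ascii", errors="replace")
    return stored == file_digest(path, length=payload_len)
```

Calling `append_digest("data.nc")` once and `verify_appended_digest("data.nc")` later would implement the check, where `data.nc` is a placeholder file name. The file is read in chunks so that large files are not loaded into memory at once. Any later change to the file invalidates the trailer, so the digest would need to be stripped and appended again after each update.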