I use the NetCDF format to store molecular dynamics trajectories generated by OpenMM with the AMBER force field. Recently, one of the servers running the simulations had an unknown problem, and as a result none of the NetCDF files (each ~8 GB) generated on this server can be read by any of the netCDF utilities. The trajectories are nearly 750 ns long, and a typical run takes ~2 months. I am looking for help or advice on recovering as much data as possible from the corrupted files.

I am providing the required information based on a couple of threads on this mailing list (one very old <https://www.unidata.ucar.edu/support/help/MailArchives/netcdf/msg00201.html> and one more recent <https://www.unidata.ucar.edu/support/help/MailArchives/netcdf/msg14595.html>).

- *ncinfo gives the following error message:*

(openmm) >> ncinfo prod.nc
Traceback (most recent call last):
  File "/opt/anaconda3/envs/openmm/bin/ncinfo", line 11, in <module>
    sys.exit(ncinfo())
  File "/opt/anaconda3/envs/openmm/lib/python3.9/site-packages/netCDF4/utils.py", line 550, in ncinfo
    f = Dataset(filename)
  File "src/netCDF4/_netCDF4.pyx", line 2307, in netCDF4._netCDF4.Dataset.__init__
  File "src/netCDF4/_netCDF4.pyx", line 1925, in netCDF4._netCDF4._ensure_nc_success
OSError: [Errno -51] NetCDF: Unknown file format: b'prod.nc'

- *An octal dump shows that the file is of type netCDF-3 and contains data (as noted here <https://www.unidata.ucar.edu/support/help/MailArchives/netcdf/msg14595.html>):*

(openmm) >> od -c prod.nc | head -n 30
0000000 C D F 002 \0 \0 x I \0 \0 \0 \n \0 \0 \0 006
0000020 \0 \0 \0 005 f r a m e \0 \0 \0 \0 \0 \0 \0
0000040 \0 \0 \0 \a s p a t i a l \0 \0 \0 \0 003
0000060 \0 \0 \0 004 a t o m \0 \0 O 271 \0 \0 \0 \f
0000100 c e l l _ s p a t i a l \0 \0 \0 003
0000120 \0 \0 \0 \f c e l l _ a n g u l a r
0000140 \0 \0 \0 003 \0 \0 \0 005 l a b e l \0 \0 \0
0000160 \0 \0 \0 005 \0 \0 \0 \f \0 \0 \0 006 \0 \0 \0 005
0000200 t i t l e \0 \0 \0 \0 \0 \0 002 \0 \0 \0 4
0000220 C R E A T E D a t 2 0 2 2 -
0000240 0 2 - 2 3 1 6 : 3 1 : 2 0 . 8
0000260 0 6 5 9 3 o n u s a m - a m
0000300 b e r 1 \0 \0 \0 \v a p p l i c a t
0000320 i o n \0 \0 \0 \0 002 \0 \0 \0 005 O m n i
0000340 a \0 \0 \0 \0 \0 \0 \a p r o g r a m \0
0000360 \0 \0 \0 002 \0 \0 \0 006 M D T r a j \0 \0
0000400 \0 \0 \0 016 p r o g r a m V e r s i
0000420 o n \0 \0 \0 \0 \0 002 \0 \0 \0 005 1 . 9 .
0000440 5 \0 \0 \0 \0 \0 \0 \v C o n v e n t i
0000460 o n s \0 \0 \0 \0 002 \0 \0 \0 005 A M B E
0000500 R \0 \0 \0 \0 \0 \0 021 C o n v e n t i
0000520 o n V e r s i o n \0 \0 \0 \0 \0 \0 002
0000540 \0 \0 \0 003 1 . 0 \0 \0 \0 \0 \v \0 \0 \0 \a
0000560 \0 \0 \0 \f c e l l _ a n g u l a r
0000600 \0 \0 \0 002 \0 \0 \0 003 \0 \0 \0 005 \0 \0 \0 \0
0000620 \0 \0 \0 \0 \0 \0 \0 002 \0 \0 \0 020 \0 \0 \0 \0
0000640 \0 \0 003 < \0 \0 \0 \f c e l l _ s p a
0000660 t i a l \0 \0 \0 001 \0 \0 \0 003 \0 \0 \0 \0
0000700 \0 \0 \0 \0 \0 \0 \0 002 \0 \0 \0 004 \0 \0 \0 \0
0000720 \0 \0 003 L \0 \0 \0 \a s p a t i a l \0

- *I know the structure of my NetCDF file from an identical file that is readable:*

(openmm) >> ncdump -h prod.nc
netcdf prod {
dimensions:
    frame = UNLIMITED ; // (578 currently)
    spatial = 3 ;
    atom = 20504 ;
variables:
    char spatial(spatial) ;
    float time(frame) ;
        time:units = "picosecond" ;
    float coordinates(frame, atom, spatial) ;
        coordinates:units = "angstrom" ;

// global attributes:
        :Conventions = "AMBER" ;
        :ConventionVersion = "1.0" ;
        :application = "AmberTools" ;
        :program = "ParmEd" ;
        :programVersion = "3.4.1" ;
        :title = "ParmEd-created trajectory" ;
}

Given this information, could you please suggest ways to retrieve my data? Any help in this regard will be greatly appreciated.

Best,
Ram
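
The first bytes of the octal dump above can be decoded by hand to check whether the start of the header survived. What follows is only a minimal Python sketch, assuming the netCDF classic / 64-bit-offset header layout (a `CDF` magic plus a version byte, a 4-byte big-endian record count, then the dimension list). The function name `parse_header_prefix` and the `SAMPLE` bytes are illustrative: `SAMPLE` was transcribed by hand from the dump lines at offsets 0000000 through 0000160, so any transcription slip would change the result. The sketch does not read attributes, variables, or data.

```python
import struct

NC_DIMENSION = 0x0A  # tag that introduces the dimension list in the classic format


def parse_header_prefix(buf):
    """Decode magic, record count and dimension list from the start of a header."""
    if buf[:3] != b"CDF":
        raise ValueError("not a classic-format netCDF header")
    version = buf[3]  # 1 = classic, 2 = 64-bit offset
    (numrecs,) = struct.unpack_from(">I", buf, 4)
    tag, ndims = struct.unpack_from(">II", buf, 8)
    dims = []
    pos = 16
    if tag == NC_DIMENSION:
        for _ in range(ndims):
            (namelen,) = struct.unpack_from(">I", buf, pos)
            pos += 4
            name = buf[pos:pos + namelen].decode("ascii")
            pos += (namelen + 3) & ~3  # names are zero-padded to a 4-byte boundary
            (length,) = struct.unpack_from(">I", buf, pos)
            pos += 4
            dims.append((name, length))  # length 0 marks the record dimension
    return version, numrecs, dims


# Hand-transcribed from the od output (assumption: no transcription errors).
SAMPLE = (
    b"CDF\x02" b"\x00\x00xI" b"\x00\x00\x00\x0a" b"\x00\x00\x00\x06"
    b"\x00\x00\x00\x05frame\x00\x00\x00" b"\x00\x00\x00\x00"
    b"\x00\x00\x00\x07spatial\x00" b"\x00\x00\x00\x03"
    b"\x00\x00\x00\x04atom" b"\x00\x00O\xb9"
    b"\x00\x00\x00\x0ccell_spatial" b"\x00\x00\x00\x03"
    b"\x00\x00\x00\x0ccell_angular" b"\x00\x00\x00\x03"
    b"\x00\x00\x00\x05label\x00\x00\x00" b"\x00\x00\x00\x05"
)

if __name__ == "__main__":
    version, numrecs, dims = parse_header_prefix(SAMPLE)
    print("version byte:", version)
    print("record count:", numrecs)
    for name, length in dims:
        print(f"  {name} = {length}")
```

To run the same check on a real file, the first few hundred bytes could be read with `open("prod.nc", "rb").read(512)` and passed to the function. The record count and dimension lengths it prints can then be compared against the `ncdump -h` output of a readable file.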