[Jeremy Beal wrote that he has large quantities of data that are both spatially and temporally irregular/sparse, that he needs to store and retrieve them efficiently in a platform-independent manner, and that he wonders how best to do this.]

One additional question comes to mind immediately: Do you want fast selective random access? Be sure of your answer: in many cases it can make an astounding difference to the style of work you do. Fast selective random access makes an enormous difference for analysis and visualization. (I've also seen too many met and met-related models built around sequential files that have become vast conspiracies to manipulate a complex shared state centered around the positions of a multiplicity of sequential file pointers.)

If you don't need/want fast selective random access, then the XDR'ed binary file is an acceptable solution. Otherwise, for sparse data you need files with built-in indexing. HDF VSets are a partial solution to this, provided you don't have very many time steps: they have a doubly-linked list of index blocks interspersed with data blocks. Be aware, though, that the overhead of sequential access to those index blocks can kill you if you do have lots of time steps. If you have a year's worth of hourly met observations stored this way and you want to look at the 0Z Dec 1 observations, be prepared to sit for five or ten minutes while your disk drive grinds through the 8000 or so index blocks for Jan 1-Dec 1 before it can even begin to think about data.

Something else worth checking is PDB, which is part of Livermore's Portable Application Code Toolkit (PACT); see

    http://www.llnl.gov/def_sci/pact/hact_homepage.html

It seems to be a lower-level interface than netCDF, but does have support for building efficient index structures.

fwiw

xcc@xxxxxxxxxxxxxxxx

Carlie J. Coats, Jr.                        coats@xxxxxxxx
MCNC Environmental Programs                 phone: (919)248-9241
North Carolina Supercomputing Center        fax:   (919)248-9245
3021 Cornwallis Road    P. O. Box 12889
Research Triangle Park, N. C.  27709-2889   USA
"My opinions are my own, and I've got *lots* of them!"
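
The index-block overhead described in the message can be illustrated with a toy model. What follows is only a minimal sketch in Python, not HDF's actual VSet layout or API: the names `hourly_step`, `build_chain`, `chained_lookup` and `direct_lookup` are hypothetical, one index block per hourly time step is an assumption, and 1995 is used simply as an arbitrary non-leap year. It counts how many index blocks must be read to reach the 0Z Dec 1 step when the blocks can only be reached by following links from the start of the file, compared with a single lookup in a direct index.

```python
from datetime import date

HOURS_PER_DAY = 24


def hourly_step(year, month, day, hour=0):
    """Zero-based index of an hourly time step, counted from Jan 1, 0Z."""
    days = (date(year, month, day) - date(year, 1, 1)).days
    return days * HOURS_PER_DAY + hour


def build_chain(n_steps):
    """Toy file: index block i only records where block i + 1 lives.

    Assumption: one index block per time step (hypothetical layout).
    """
    return [
        {"step": i, "next": i + 1 if i + 1 < n_steps else None}
        for i in range(n_steps)
    ]


def chained_lookup(chain, target):
    """Follow the links from the first block; return (block, blocks_read)."""
    reads = 0
    pos = 0
    while pos is not None:
        block = chain[pos]
        reads += 1  # each visit stands in for one disk read
        if block["step"] == target:
            return block, reads
        pos = block["next"]
    raise KeyError(target)


def direct_lookup(table, target):
    """Built-in index: the step number maps straight to its block."""
    return table[target], 1


if __name__ == "__main__":
    n_steps = 365 * HOURS_PER_DAY         # one non-leap year of hourly data
    target = hourly_step(1995, 12, 1)     # 0Z Dec 1
    chain = build_chain(n_steps)
    table = {blk["step"]: blk for blk in chain}

    _, chained_reads = chained_lookup(chain, target)
    _, direct_reads = direct_lookup(table, target)
    print("target step:", target)
    print("index blocks read via chain:", chained_reads)
    print("index blocks read via direct index:", direct_reads)
```

Under these assumptions the Dec 1 step is number 334 * 24 = 8016, so the chained walk reads 8017 index blocks (in line with the "8000 or so" quoted in the message) while the direct index reads one. The sketch says nothing about actual disk timings; it only shows why the cost grows with the number of time steps in one case and not in the other.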