Hi Thanh,

> I want to extract a particular variable from many NetCDF files
> automatically (the files are identical in structure but cover different
> times, for example 00, 06, 12, 18, and 24 hours). For example, I want
> to get a temperature variable from about 1000 NetCDF files. Can you
> tell me how to extract the temperature variable in the shortest time?

It depends on the format of the 1000 input files, the complexity of each
file (in terms of the number of variables and attributes), and whether
this is something you want to do just once or the same 1000 files will be
accessed many times for similar tasks.

If the files are not particularly complex and you just want to extract
all the temperature data once, I think the fastest way is to loop through
the files, opening them one at a time and reading the desired data. The
time this takes may be dominated by the time to open each file, which for
either netCDF-3 or netCDF-4 involves reading all the metadata into memory
when the file is opened. If there are lots of variables and attributes
that have nothing to do with the temperature data you want, this may take
a while. You can either write a program to do this or use one of the
packages designed for such tasks, such as NCO, NCL, or CDO. For
descriptions and links, see the list of software for manipulating or
displaying netCDF data:

  http://www.unidata.ucar.edu/netcdf/software.html

An alternative may be advisable if there are a lot of other variables
with a lot of attributes in each file and you want to support similar
data queries efficiently for future users. In that case you might want to
first convert netCDF classic or 64-bit offset files into netCDF-4 classic
model files, something that can be done easily with the nccopy utility.
This will take a lot of time once, but after that you can take advantage
of a feature of HDF5 access. Reading the desired data with HDF5 may not
be any faster, but opening each file will be significantly faster,
because HDF5 reads metadata only when it's needed, so it won't spend any
time building a schema for all the variables and their attributes on
open. However, you would actually have to use the HDF5 library to get
this efficiency, as the netCDF-4 library still reads all of a file's
metadata on open.

A third alternative, if the archive of files will be accessed a great
many times with queries you can anticipate will each need to open lots of
files, is to reorganize the data to match the pattern of anticipated
queries. For example, if the data is stored spatially, but will be
accessed as time series at a point, you may want to provide files
organized with the time axis varying most rapidly. This can also be
accomplished with the nccopy utility in netCDF-4.2. Other approaches
include a recent innovative use of Hilbert space-filling curves and the
Hadoop file system by Tanu Malik and colleagues:

  http://hpdgis.cigi.uiuc.edu/node/14
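
To make these alternatives concrete, here's a rough, untested sketch of
the first one (looping and extracting), assuming Python with the netCDF4
module; the variable name "temperature", the file glob, and the
concatenation axis are placeholders you'd adapt to your files:

    import glob
    import numpy as np
    from netCDF4 import Dataset

    arrays = []
    for path in sorted(glob.glob("data/*.nc")):
        # Opening a netCDF-3 or netCDF-4 file reads all of its metadata,
        # which is where most of the time may go for complex files.
        with Dataset(path, "r") as nc:
            arrays.append(nc.variables["temperature"][:])

    # Stack the per-file slices along the time axis.
    temperature = np.concatenate(arrays, axis=0)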
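
For the second alternative, here's a similar sketch of the one-time
conversion with nccopy followed by fast opens through the HDF5 library
(the h5py Python module here). I believe "-k 4" selects the netCDF-4
classic model format, but check nccopy's usage message for the format
codes in your version:

    import subprocess
    import h5py

    # One-time conversion: classic or 64-bit offset file to netCDF-4
    # classic model ("-k 4"), which is stored in HDF5 format.
    subprocess.run(["nccopy", "-k", "4", "in.nc", "in_nc4.nc"], check=True)

    # HDF5 reads metadata lazily, so the open is cheap even when the file
    # holds many unrelated variables and attributes.
    with h5py.File("in_nc4.nc", "r") as f:
        temperature = f["temperature"][:]  # touches only this variable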
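
And for the third alternative, a sketch of rechunking with nccopy's "-c"
option so that a time series at one point is stored together on disk; the
dimension names and chunk lengths here are made up, so substitute the
ones from your files:

    import subprocess

    # Long, thin chunks along time: a whole series at one grid point
    # lands in a few chunks instead of one value per spatial slab.
    subprocess.run(
        ["nccopy", "-k", "4", "-c", "time/1000,lat/1,lon/1",
         "spatial.nc", "timeseries.nc"],
        check=True,
    )

--Russ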