NOTICE: This version of the NSF Unidata web site (archive.unidata.ucar.edu) is no longer being updated.
Current content can be found at unidata.ucar.edu.
To learn about what's going on, see About the Archive Site.
G'day all [Hi all netCDF users],

I have a few comments on this topic. (I have included a copy of Lindsay
Pender's email that did not get distributed to the netcdfgroup.) If you wish
to reply, please reply to the netcdfgroup so we can all be informed of this
discussion. I think there may be many interested parties in this group.

Cheers,
Phil Morgan

Additional comments at end by Lindsay Pender.

*******************************************************

LINDSAY PENDER WRITES:
======================

>From pender@xxxxxxxxxxx Mon Sep 14 21:14:10 1992
>Subject: NetCDF for underway oceanographic data storage
>
>I read with interest your mail message expressing your intention to use
>netCDF for underway data storage. I have also considered this approach for
>the same reasons you give, but came up with some conceptual difficulties
>when I was looking at ways to implement it. It may be that your data is
>different, but in our case we have data coming from many different sources,
>each with a different sampling rate. Some of our instruments are sampled at
>2.5 kHz, while others are as slow as once a minute. For an underway data
>storage system using netCDF, how do you store such data with only one
>'unlimited' dimension? What I have considered doing is to collect data from
>the various instruments into fixed-length blocks, and then after some
>suitable time write all of the data to a netCDF file with the now-known
>dimensions. Using this scheme, I would have to carry an extra variable for
>each instrument - the time stamp for each block.
>
>Any comments?
>
>Regards,
>
>Lindsay Pender

TIM HOLT WRITES:
================

>> OSU currently can manage its data by logging 1-minute averages for
>> all instruments. No one yet has asked for finer resolution from our
>> common-use equipment.
>> CTD, ADCP, and other such higher-resolution systems are managed and
>> logged by their own software and are currently independent of the new
>> netCDF system. Soon, though, I will need to merge in some finer-resolution
>> data (5-second GPS and ADCP). Here is my scheme, and I'm very curious
>> what kinds of alternatives others can suggest.
>>
>> I'll see if I can describe my idea with a CDL file. It may not be the
>> best way, but I guess it will work...
>>
>> <<< BEGIN multi_res.cdl >>>
>>
>> netcdf multires {
>>
>> dimensions:
>>     min_max_mean = 3;    // store 3 numbers: min, max, mean
>>     ten_hz = 600;        // number of 10.0 Hz samples in 1 minute
>>     five_hz = 300;       // number of 5.0 Hz samples in 1 minute
>>     twopoint5_hz = 150;  // number of 2.5 Hz samples in 1 minute
>>     one_hz = 60;         // number of 1.0 Hz samples in 1 minute
>>     five_second = 12;    // number of 0.2 Hz (5-second) samples in 1 minute
>>     time = unlimited;    // the "time" dimension
>>
>> variables:
>>     long  time(time);        // seconds since some fixed point in time
>>     float gps_lat(time);     // GPS latitude in sample period
>>     float gps_lon(time);     // GPS longitude in sample period
>>     short n_sats(time);      // number of satellites used in fix
>>     float raw_gps_lat(time, five_second);  // raw GPS latitude
>>     float raw_gps_lon(time, five_second);  // raw GPS longitude
>>     float sea_temp(time, min_max_mean);    // sea surface temperature
>>     float towed_ctd_temp(time, ten_hz);    // raw CTD temperature
>>     float towed_ctd_cond(time, ten_hz);    // raw CTD conductivity
>> }
>>
>> <<< END multi_res.cdl >>>
>>
>> The idea is to pick the least common denominator (1-minute data) and
>> pack anything at a finer resolution into a new dimension.
>>
>> I did try this scheme for a towed-vehicle logging/display system, but I
>> found the netCDF time overhead (on a PC) was too high for me to log raw
>> 24 Hz CTD data in real time. Too many variables to log -- more than in
>> the simple example above.
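Tim's packing idea can be sketched in plain Python (no netCDF library; the
function names here are illustrative, not from the original): pick the least
common denominator (1 minute) and reshape each finer-rate stream into
fixed-width rows, one row per record along the unlimited time dimension.

```python
def pack_minutes(samples, rate_hz):
    """Group a flat sample stream into 1-minute rows of width rate_hz * 60
    (e.g. 600 samples per row at 10 Hz, matching the ten_hz dimension)."""
    width = int(rate_hz * 60)
    if len(samples) % width:
        raise ValueError("stream is not a whole number of minutes")
    return [samples[i:i + width] for i in range(0, len(samples), width)]

def min_max_mean(samples, rate_hz):
    """Reduce each 1-minute row to the 3-element (min, max, mean) form,
    as in the min_max_mean dimension of the CDL above."""
    return [(min(row), max(row), sum(row) / len(row))
            for row in pack_minutes(samples, rate_hz)]
```

Each row of `pack_minutes` would become one record written along `time`; the
reduced form trades resolution for a fixed, small per-minute footprint.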
>> I still used the same idea, but went to a simpler ASCII file for quick
>> I/O.
>>
>> Comments???
>>
>> Tim Holt, OSU Oceanography
>> holtt@xxxxxxxxxxxx

Reading in data and saving to a file in real time will always be limited by
the sampling rate and the number of samples monitored. Saving directly to
netCDF format adds extra processing overhead. For fast sampling and/or a
large number of samples, it is best to save the "continuously" sampled data
records from an instrument directly to a file (ASCII, or binary [fastest]).
For example, read data into a record in Fortran (yes, I know it's an
extension) or a structure in C, etc., and write out the whole record in
binary format. This file of records acts as a buffer from which you can run
a program to convert the file of records into netCDF format. A picture of
this for 2 instruments follows (there could be N instruments, each with a
different number of component data elements).

+---------+
| instr#1 | -->> (Read&Save) -->> instr#1 file --> (convert) ---> netCDF
+---------+                       of records

+---------+
| instr#2 | -->> (Read&Save) -->> instr#2 file --> (convert) ---> netCDF
+---------+                       of records

If data is acquired at a relatively slow rate then you may well have plenty
of time to write directly to a netCDF format file. The netCDF output could
be a separate file for each instrument, or all data merged into a single
file.

Separate instrument log files (data at the same sampling rate in each file)
===========================================================================

If all data from one instrument is on the same time base then this is easy.
The time dimension can be set to "unlimited". Each instrument log file will
have its own time variable appropriate for the sampling rate. Comparisons
between different instruments will need to account for the different time
base in each file. This should be no problem, but we do have several
(instrument) files.
Other specialised instruments with their own data acquisition and data
storage formats (e.g. ADCP, CTD) could have their data converted to netCDF
files after acquisition is complete. If there are several data components at
different sampling rates, then data at the same sampling rate could be
grouped together in a file. Thus each file will have its own time base.

ONE "MERGED" FILE (data components with different sampling rates)
=================================================================

If data components are sampled at different rates but using the same clock,
then there will be a common denominator (common time base) and the method
suggested by Tim Holt IS EXCELLENT. Lindsay's concern about different
sampling rates can be accommodated by Tim's method as long as there is a
common clock from which the sampling rates are referenced.

lindsay>> Some of our instruments are sampled at 2.5 kHz, while others are
lindsay>> as slow as once a minute. For an underway data storage system
lindsay>> using netCDF how do you store such data with only one
lindsay>> 'unlimited' dimension?

The above case should encompass most common data acquisition situations.
However, if high-speed acquisition is sampled using different clocks, then
the streams do not have an exact common time base.

Lindsay Pender has 2 solutions
==============================

1. The easiest solution may be to record the time base for each data
component sampled at a different rate and from a different clock.

lindsay>> What I have considered doing, is to collect data from the various
lindsay>> instruments into fixed length blocks, and then after some suitable
lindsay>> time writing all of the data to a netCDF file with the now known
lindsay>> dimensions. Using this scheme, I would have to carry an extra
lindsay>> variable for each instrument - the time stamp for each block.

I believe that Lindsay is suggesting something like this ... (rough CDL)

dimensions:
    xsample_no = 300   // say, no. of samples in blocks of x
    ysample_no = 40    // say, no. of samples in blocks of y
    // These are the user-defined no. of blocks to read from each
    // instrument before writing out to a netCDF file
    indexx = 1000      // say
    indexy = 500       // say

variables:
    // Instrument #1 data
    float signalx(indexx, xsample_no, other dims)
    long  timex(indexx)    // time stamps for each block

    // Instrument #2 data
    float signaly(indexy, ysample_no, other dims)
    long  timey(indexy)    // time stamps for each block

This will require the acquisition program to count the number of samples and
write out a netCDF file at appropriate times. Application programs will need
to use the individual time stamps for each block of data from each
instrument. If data acquisition is fast and processor time is limited, it
may be necessary to write all data to a binary file and later convert it to
a netCDF file.

2. PADDING (info directly from Lindsay)

When sampling rates do not share a common clock, one could still use Tim
Holt's scheme by rounding up the block length for each (common) time
interval so that every sample from each instrument is guaranteed to fit
within the block. Note that the number of samples in consecutive blocks may
then differ, depending upon the relative timing of the block and instrument
sampling. This can be handled by using _FillValue for the unused samples in
each block.

============ end of file ============

=============================================================================
Phil Morgan          mail:  CSIRO Oceanography
    _--_|\                  GPO Box 1538,
   /      \                 Hobart Tas 7008, AUSTRALIA
   \_.--._/          email: morgan@xxxxxxxxxxxxxxxxxxx
      -----          phone: (002) 206236   +61 02 206236
      \   /          fax:   (002) 240530   +61 02 240530
       \*/
=============================================================================
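The padding scheme can be sketched in plain Python (the fill value and
function names here are assumed for illustration): round the block length up
past the worst case, keep one time stamp per block as in the
timex(indexx)/signalx(indexx, ...) layout, and mark the unused tail of each
block with a fill value, playing the role of netCDF's _FillValue.

```python
FILL_VALUE = -9999.0  # stand-in for the variable's _FillValue attribute

def pad_block(samples, block_len):
    """Pad one block of samples out to the rounded-up fixed length."""
    if len(samples) > block_len:
        raise ValueError("block_len must be rounded up past the worst case")
    return list(samples) + [FILL_VALUE] * (block_len - len(samples))

def make_blocks(blocks_with_times, block_len):
    """Turn a list of (time_stamp, samples) pairs into parallel arrays:
    one time stamp per block, and fixed-width padded blocks of data."""
    times = [t for t, _ in blocks_with_times]
    data = [pad_block(s, block_len) for _, s in blocks_with_times]
    return times, data
```

Readers then skip any sample equal to the fill value, so blocks holding
different numbers of real samples still share one fixed block dimension.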