Hi Ted,

Thanks - I guess I was under the impression that netCDF-4 should always be preferred over netCDF-3 (i.e. that it is strictly better). One test I did run was to remove all unlimited dimensions and fix them to a size of 900, and the file size then became basically what one expects (350 KB vs 5 MB), so it really is the unlimited dimensions that are causing the large file size. I've pasted the header of the netCDF file below in case anyone is interested. I would really like to keep the unlimited-dimension option available for data logging. I use quite a few 2-D variables and also two unlimited dimensions (fast and slow), where the fast has 100 samples for each slow. Once fully implemented I expect to be dumping about 2 MB/s to the netCDF file. Any advice much appreciated.

Ross

dimensions:
        slow_reg = UNLIMITED ; // (900 currently)
        fast_reg = UNLIMITED ; // (0 currently)

group: array {

  group: frame {
    variables:
        uint status(slow_reg) ;
        ubyte received(slow_reg) ;
        uint nsnap(slow_reg) ;
        uint record(slow_reg) ;
        double utc(slow_reg) ;
        uint features(slow_reg) ;
        uint markSeq(slow_reg) ;
    } // group frame

  group: pt415 {
    variables:
        uint status(slow_reg) ;
        uint record(slow_reg) ;
        ... (quite a few more in here)
        float error_code(slow_reg) ;
    } // group pt415

  group: sim900 {
    dimensions:
        dvm_volts_dim2 = 4 ;
        dvm_gnd_dim2 = 4 ;
        dvm_ref_dim2 = 4 ;
        therm_volts_dim2 = 4 ;
        therm_temperature_dim2 = 4 ;
    variables:
        uint status(slow_reg) ;
        uint record(slow_reg) ;
        double utc(slow_reg) ;
        float main_volt_monitor(slow_reg) ;
        float main_current_monitor(slow_reg) ;
        float main_power_monitor(slow_reg) ;
        float main_undervoltage(slow_reg) ;
        uint main_tick(slow_reg) ;
        float dvm_volts(slow_reg, dvm_volts_dim2) ;
        float dvm_gnd(slow_reg, dvm_gnd_dim2) ;
        float dvm_ref(slow_reg, dvm_ref_dim2) ;
        float therm_volts(slow_reg, therm_volts_dim2) ;
        float therm_temperature(slow_reg, therm_temperature_dim2) ;
        ...
        ... (few more in here)
        float bridge_output_value(slow_reg) ;
    } // group sim900

  } // group array

group: antenna0 {

  group: frame {
    variables:
        uint status(slow_reg) ;
        ubyte received(slow_reg) ;
        uint nsnap(slow_reg) ;
        uint record(slow_reg) ;
        double utc(slow_reg) ;
        uint features(slow_reg) ;
        uint markSeq(slow_reg) ;
    } // group frame

  group: acu {
    variables:
        uint status(slow_reg) ;
        uint new_mode(slow_reg) ;
        ...
        uint px_checksum_error_count(slow_reg) ;
        uint px_resyncing(slow_reg) ;
    } // group acu

  group: gpsTime {
    variables:
        uint status(slow_reg) ;
        ...
        uint serialNumber(slow_reg) ;
    } // group gpsTime

  } // group antenna0

group: receiver {

  group: frame {
    variables:
        uint status(slow_reg) ;
        ubyte received(slow_reg) ;
        uint nsnap(slow_reg) ;
        uint record(slow_reg) ;
        double utc(slow_reg) ;
        uint features(slow_reg) ;
        uint markSeq(slow_reg) ;
    } // group frame

  group: bolometers {
    variables:
        uint status(slow_reg) ;
    } // group bolometers

  } // group receiver
}

On Mon, Jan 9, 2012 at 4:56 PM, Ted Mansell <ted.mansell@xxxxxxxx> wrote:
> I don't think you can chunk an unlimited dimension by more than 1. What are
> the variable dimensions? Your formula makes it sound like they are 1-D and
> only sized by the unlimited dimension. If that is the case, compression
> won't help. You might be better off with a netCDF-3 file?
>
> -- Ted
>
> On Jan 9, 2012, at 8:15 AM, Ross Williamson wrote:
>
>> I'm trying to get my head around the file size of my netCDF-4 file.
>> Some background:
>>
>> 1) I'm using the netcdf_c++4 API.
>> 2) I have an unlimited dimension which I write data to about every second.
>> 3) There are a set of nested groups.
>> 4) I'm using compression on each variable.
>> 5) I'm using the default chunk size, which I think is 1 for the
>> unlimited dimension and sizeof(type) for other dimensions.
>> 6) I take data for 900 samples. There are about 100 variables, so I
>> would expect (given 4-byte values) a file size of 900 x 100 x 4 = 360 KB.
>> Now I fully expect some level of overhead, but my file sizes are 5 MB, which
>> is incredibly large.
>>
>> Compression doesn't make much difference (5 MB vs 5.3 MB). I'm
>> assuming the thing that is screwing me over here is that I haven't got
>> my chunking set right. The issue is that I'm rather confused: it
>> appears that you set the chunk size for each variable rather than for the
>> whole file, which doesn't make sense to me. Would I just multiply
>> each chunk size by, say, 100, so have 100 for the unlimited dimension and
>> sizeof(type)*100 for other dimensions?
>>
>> I'd really like to fix this, as netCDF-4 seems ideal for my project, but
>> I can't deal with a size overhead of an order of magnitude.
>>
>> I can attach the header of the netCDF file if it helps.
>>
>> Ross
>>
>> --
>> Ross Williamson
>> Associate Research Scientist
>> Columbia Astrophysics Laboratory
>> 212-851-9379 (office)
>> 212-854-4653 (Lab)
>> 312-504-3051 (Cell)
>>
>> _______________________________________________
>> netcdfgroup mailing list
>> netcdfgroup@xxxxxxxxxxxxxxxx
>> For list information or to unsubscribe, visit:
>> http://www.unidata.ucar.edu/mailing_lists/
>

--
Ross Williamson
Associate Research Scientist
Columbia Astrophysics Laboratory
212-851-9379 (office)
212-854-4653 (Lab)
312-504-3051 (Cell)