On Thu, Jul 7, 2016 at 6:27 PM, Roy Mendelssohn - NOAA Federal <roy.mendelssohn@xxxxxxxx> wrote:

> Thank you very much Jeff. I think I'm too far to be able to explain
> myself. Perhaps this is the wrong list for this question, but I sent it
> in the hope that there is someone who has a deep understanding of netcdf
> data and uses R. Let me tell the story more simply. Assume that you read
> a numeric vector of data from a netcdf file:
>
> data <- c(9.1999979, 8.7999979, 7.9999979, 3.0999980, 6.1000018,
>           10.1000017, 10.4000017, 9.2000017)
>
> You know that the values above are a model output, and you also know
> that, physically, the first and last values must be equal -- but somehow
> they are not.

Classic floating point precision issues -- nothing to do with netcdf or R,
really. I think your data provider should have rounded before writing the
file, but what can you do?

> And now, you want to use a "periodic" spline for the values above:
>
> spline(1:8, data, method = "periodic")
>
> Voila! The spline method throws a warning message: "spline: first and
> last y values differ - using y[1] for both".

Actually, it seems that, warning aside, the spline function is doing the
right thing :-) -- though ideally it would let the user specify a precision
with which to check for "equality" -- you almost never want to check
equality of floating point values directly.

> Then I went on digging and discovered 2 attributes in the netcdf file:
> "precision = 2" and "least_significant_digit = 1". And I also found their
> definitions at [1].

Interesting -- something like that really should be in CF ....

> precision -- number of places to right of decimal point that are
> significant, based on packing used. Type is short.

Yeach! Using "right of the decimal point" rather than some number of
significant figures is pretty limiting (what if you have large-magnitude
numbers?).

> least_significant_digit -- power of ten of the smallest decimal place in
> unpacked data that is a reliable value. Type is short.

This sure sounds like the same thing -- with the same limitations, unless
it can be negative, in which case you could be specifying large-magnitude
numbers.

According to:

https://www.unidata.ucar.edu/software/netcdf/docs/BestPractices.html

packing involves storing values as integers in fixed point:

  unpacked_data_value = packed_data_value * scale_factor + add_offset

(the "+ >" in the original looks like a typo or quoting artifact). Anyway,
this scheme allows a value of any magnitude to be stored, but the NCEP
definitions seem to only support order-1 values. This is really a question
for NCEP. I can't find the reference in the docs at that link (no, I didn't
dig deep), so maybe there is more there, but from what the OP posted: note
that this is about how the data were packed, rather than how accurate the
data were/are in the first place. Which is odd, because if you pack and
unpack the data in the same way, then you should get the same values back,
which was not the case here. That indicates to me that the difference is in
the actual data, not a result of the packing method. So precision and
least_significant_digit are actually irrelevant to the OP's issue :-)

Nonetheless, I understand the confusion -- these seem to be the same thing,
and they are not consistent: precision = 2 seems to mean that you can trust
the first two digits after the decimal point -- i.e. down to the hundredths
place -- so you'd want to round to 2 decimal places (round(x, 2) in Python,
and the same in R). But least_significant_digit = 1 seems to mean that the
least significant digit is the tenths place -- one digit after the decimal
point -- in which case you would round to one digit: round(x, 1).

However, in the OP's case, the first and last values are the same to 5
digits after the decimal point, so going with round(x, 2) or 3 or 4 or 5
would all work. Note that I see a lot of the digits "999979" in there,
which looks like a binary representation issue (for maybe .9?), which makes
this data look "good" to 5 digits to me -- where "good" means re-creating
the packed data values, not the accuracy of the data in the first place.
NCEP should enhance those docs :-) I'd add an example -- worth a thousand
words!
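Something along these lines, maybe (in R, since that's what the OP is
using -- just a sketch, and the 1e-5 tolerance is an arbitrary choice for
illustration):

  # the OP's values, as read from the netcdf file
  data <- c(9.1999979, 8.7999979, 7.9999979, 3.0999980,
            6.1000018, 10.1000017, 10.4000017, 9.2000017)

  # never compare floats for exact equality -- use a tolerance
  isTRUE(all.equal(data[1], data[length(data)], tolerance = 1e-5))  # TRUE

  # round to the advertised precision ("precision = 2" -> 2 decimal
  # places); this makes the first and last values bit-identical ...
  rounded <- round(data, 2)
  rounded[1] == rounded[length(rounded)]                            # TRUE

  # ... so the periodic spline no longer complains
  s <- spline(1:8, rounded, method = "periodic")

Whether 1, 2, or 5 decimal places is the "right" amount of rounding is, of
course, exactly the question to put to NCEP.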
> Please, do not condemn me, English is not my main language :). At this
> point, as a scientist, what would you do according to the explanations
> above? I think I didn't exactly understand the difference between
> precision and least_significant_digit. One says "significant" and the
> latter says "reliable". Should I round the numbers to 2 decimal places or
> to 1 decimal place after the decimal point?

If the packing and unpacking are done the same way (which they pretty much
have to be), then you'll get the exact same floating point values out if
the inputs were the same -- so the difference between the first and last
values was in the original data, and is not an artifact of packing (see the
little round-trip sketch at the end of this message). I suspect that,
despite the wording in the docs, "precision" in this case refers to the
precision of the original data, not to a limitation of the packing scheme.
Is that data even packed in the original file?

So I'd probably round to 2 digits after the decimal point. Even better, get
some clarification from NCEP.

The data come out that way because of the way R encodes floating point
numbers; R does the same thing as every other system .... and netcdf
itself, if those data are stored as floats to begin with.

> But as the user later wrote:
>
> For instance, if you check the header information of the omega.2015.nc
> file, it says:
>
> $ ncdump -h omega.2015.nc
> ...
> omega:precision = 3s ;
> omega:least_significant_digit = 3s ;
>
> and if you check the output of rhum.2015.nc:
>
> $ ncdump -h rhum.2015.nc
> ...
> rhum:precision = 2s ;
> rhum:least_significant_digit = 0s ;

This is starting to look to me like, despite the definitions, they are
trying to capture significant figures here. i.e.: precision means the
number of sig figs, and least_significant_digit is telling you the
magnitude of the numbers. You really need to ask someone who is involved
in generating these files!

-CHB

> If you have a good answer, please reply all so that the original poster
> can see the response.

I don't seem to have the OP's email in this thread...

-CHB
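PS: a tiny illustration of the pack/unpack round trip mentioned above (the
scale_factor and add_offset values here are made up, purely for the sake of
the example):

  # made-up packing parameters, just for illustration
  scale_factor <- 0.01
  add_offset   <- 0.0

  pack   <- function(x) as.integer(round((x - add_offset) / scale_factor))
  unpack <- function(p) p * scale_factor + add_offset

  x <- c(9.2, 9.2)        # suppose the model really did produce equal values
  p <- pack(x)
  p[1] == p[2]                   # identical inputs pack to identical ints: TRUE
  unpack(p)[1] == unpack(p)[2]   # ... and unpack to identical floats: TRUE

  # so if the unpacked first and last values differ, the difference was
  # already in the data before packing -- it is not an artifact of the
  # packing scheme itself.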
--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115        (206) 526-6317   main reception

Chris.Barker@xxxxxxxx