On 08/19/2015 03:55 PM, Gerry Creager - NOAA Affiliate wrote:
> I'll open a case to determine if Cray's MPI-IO library has this problem.
OK. Might not be any need to do so: David Knaak told me (via off-list correspondence) that it was fixed in Cray MPI-IO much the same way I fixed it in ROMIO.
==rob
> gerry
>
> On Wed, Aug 19, 2015 at 7:47 PM, Rob Latham <robl@xxxxxxxxxxx> wrote:
>
> > On 08/18/2015 02:31 PM, Ward Fisher wrote:
> >
> > > Hello all,
> > >
> > > I just wanted to jump in and comment that this issue, recently reported to us by David Knaak at Cray, is now handled in the netCDF-C development branch on GitHub. This fix will be in the upcoming release candidate and eventual final release of netCDF-C 4.4.0.
> > >
> > > Regarding the question of short reads providing more warning; netcdf specifically was already checking for short reads when ‘paging in’ data from a file, but was assuming an error when one would occur (due to a non-zero |errno| value). The fix shouldn’t incur any performance penalty.
> > >
> > > The new thing I learned about “short reads” is that it is possible for this to occur /without/ being the result of an error, but rather the result of an interrupt.
> >
> > I found these short reads would happen in ROMIO when trying to read 2 GiB of data in one shot. Linux would give me back (2GiB-4k) worth of data. Today, most MPI-IO libraries should detect and retry this case. Cray's MPI-IO library is closed source, so i don't know what they do.
> >
> > > In general, since they are technically allowed I think developers are going to have to accommodate the possibility of short reads in their software, one way or another. Developers should already be checking the return value of |read()|, and when short, the fix is essentially:
> > >
> > > 1. Check to see if errno is |EINTR|
> > > 2. If so, perform some calculations and resume the read.
> >
> > While that's strictly correct, I worry about short reads that for whatever reason don't set EINTR. So I would check how much data was read. If it is less than requested, continue the read to fetch the missing data. If that continued read returns 0, then you are EOF and you are done.
> > ==rob
> >
> > --
> > Rob Latham
> > Mathematics and Computer Science Division
> > Argonne National Lab, IL USA
> >
> > _______________________________________________
> > netcdfgroup mailing list
> > netcdfgroup@xxxxxxxxxxxxxxxx
> > For list information or to unsubscribe, visit:
> > http://www.unidata.ucar.edu/mailing_lists/
>
> --
> Gerry Creager
> NSSL/CIMMS
> 405.325.6371
> ++++++++++++++++++++++
> “Big whorls have little whorls,
> That feed on their velocity;
> And little whorls have lesser whorls,
> And so on to viscosity.”
> Lewis Fry Richardson (1881-1953)
-- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA
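
For readers of the archive, the retry strategy discussed in the quoted thread (check how much data came back, continue the read for the missing part, and stop when a continued read returns 0) might look roughly like the C sketch below. It is only an illustration under stated assumptions: `read_full` is a hypothetical helper name, and this is not the actual fix applied in netCDF-C, ROMIO, or Cray MPI-IO. It also retries when `read()` fails with `EINTR`, which covers the two-step recipe from the thread.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* read_full is a hypothetical helper, not part of netCDF-C or ROMIO.
 * It keeps calling read() until `count` bytes have arrived, EOF is
 * reached, or a real error occurs.
 * Returns the number of bytes read (less than `count` only at EOF),
 * or -1 on error with errno set by read(). */
ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t n = read(fd, p + total, count - total);
        if (n > 0) {
            total += (size_t)n;   /* full or short read: keep what arrived */
        } else if (n == 0) {
            break;                /* EOF: nothing left to fetch */
        } else if (errno == EINTR) {
            continue;             /* interrupted before any data: retry */
        } else {
            return -1;            /* genuine error */
        }
    }
    return (ssize_t)total;
}
```

A caller would compare the return value with the requested size: a smaller non-negative value means the file ended early, and -1 means a real I/O error. Because the loop keys off the byte count rather than `errno`, it also handles short reads that never report `EINTR`, which is the case raised in the thread.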