Hi Bill,
Well, how many files do you have? On my Linux CentOS 5.2 box I can have
more than 300,000 files open at once (cat /proc/sys/fs/file-max).
BTW, I pulled the open/close calls out of the inner loop in your code 
and found that both the C and Java versions ran about four times as fast.
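
For illustration only (this is not the attached code): a minimal sketch of the two loop shapes being compared. It assumes a plain java.io.RandomAccessFile as a stand-in for the netCDF file object, and the class and method names are hypothetical. The first method opens and closes the file on every pass; the second hoists the open/close out of the loop.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in: RandomAccessFile plays the role of the netCDF file.
final class OpenCloseTiming {

    /** Opens, reads and closes the file on every iteration. */
    static long readReopening(Path file, int reps, byte[] buf) throws IOException {
        long total = 0;
        for (int i = 0; i < reps; i++) {
            try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
                raf.seek(0);
                raf.readFully(buf);      // fills buf completely or throws
                total += buf.length;
            }
        }
        return total;
    }

    /** Opens once, reads on every iteration, closes once. */
    static long readKeepingOpen(Path file, int reps, byte[] buf) throws IOException {
        long total = 0;
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            for (int i = 0; i < reps; i++) {
                raf.seek(0);
                raf.readFully(buf);
                total += buf.length;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("stand-in", ".bin");
        try {
            Files.write(file, new byte[4096]);
            byte[] buf = new byte[256];
            int reps = 10000;

            long t0 = System.nanoTime();
            readReopening(file, reps, buf);
            long t1 = System.nanoTime();
            readKeepingOpen(file, reps, buf);
            long t2 = System.nanoTime();

            System.out.printf("reopen each time: %.1f ms%n", (t1 - t0) / 1e6);
            System.out.printf("keep open:        %.1f ms%n", (t2 - t1) / 1e6);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The absolute numbers from a stand-in like this say nothing about netCDF itself; the point is only the structure of the two loops.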
- Joe
Bill Moninger wrote:
Hi Joe,
good suggestions all, but they're probably not so useful in my 
case. We get enough requests per day that they load down our web 
server if the app that generates the soundings takes too much cpu 
time--and a doubling of the cpu time for each access does indeed bring 
web response to a crawl.
Keeping the file(s) open isn't a desirable option either, because 
users are accessing many different forecasts at many different hours, 
and each is stored in a different file.  While there are certainly 
caching options we could use, the bookkeeping probably wouldn't be 
worth the effort--particularly since we have a working system (with C) 
now.
But I hope you and others will continue providing ideas. Our sounding 
generation software has to read netCDF, grib, and grib2, and at the 
moment we have different code for each. netcdf-java would allow 
unified code that would be much easier to maintain.
-Bill
On 5/7/2009 1:04 PM, Joe Sirott wrote:
  Hi Bill,
One question I have: does it really matter if your Java Web 
application is 2 or 3x slower than your C application? You mentioned 
that your current application takes significantly less than one 
second to produce a plot; even if your Java Web app takes, say, one 
second to produce a plot that still would allow for ~100,000 plot 
requests per day. And you could easily increase this capability by 
implementing a caching scheme.
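
As a rough sketch of what such a caching scheme could look like (an assumption, not a description of any existing code): a small least-recently-used cache of rendered plots, built on java.util.LinkedHashMap. The class name, the key type and the renderer function are all hypothetical. The helper at the bottom just restates the arithmetic: a day has 86,400 seconds, so one plot per second, served one at a time, is on the order of 100,000 plots per day.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical LRU cache of rendered plots, keyed by whatever identifies a request.
final class PlotCache<K, V> {
    private final int capacity;
    private final Function<K, V> renderer;   // the expensive step, e.g. read + plot
    private final Map<K, V> entries;
    private int misses;

    PlotCache(int capacity, Function<K, V> renderer) {
        this.capacity = capacity;
        this.renderer = renderer;
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > PlotCache.this.capacity;
            }
        };
    }

    /** Returns the cached value, rendering it only on a miss. */
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            // Rendering under the lock is simple but serializes misses.
            value = renderer.apply(key);
            misses++;
            entries.put(key, value);
        }
        return value;
    }

    synchronized int misses() {
        return misses;
    }

    /** Serial throughput: seconds in a day divided by seconds per plot. */
    static long plotsPerDay(double secondsPerPlot) {
        return (long) (86_400 / secondsPerPlot);
    }
}
```

Whether a cache like this pays off depends on how often the same forecast, hour and location are requested again before they are evicted.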
If you do need to speed up your application, I found when I profiled 
the Java netCDF library a couple of years ago that it can be more 
expensive to open a netCDF file than to read small amounts of data 
(like 2D slices from a 4D variable) from the file. So one strategy 
(at least in a Web application environment) is to keep the file open 
so repeated reads of the file don't incur the overhead of reopening 
the file. There are some issues with this -- the library isn't thread 
safe, so you don't want to share the file object across threads, and 
you might run into a problem with too many open files if you have a 
lot of files, but there are strategies to work around this.
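
One such strategy, sketched under stated assumptions: give each thread its own small cache of open files, and close the least recently used one when a limit is reached. java.io.RandomAccessFile stands in for the netCDF file object here, and the class name and the limit of 64 are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache of open file handles. Not thread safe on purpose:
// each thread gets its own instance, so handles are never shared across threads.
final class OpenFileCache {
    private final int maxOpen;
    private final LinkedHashMap<String, RandomAccessFile> open;

    OpenFileCache(int maxOpen) {
        this.maxOpen = maxOpen;
        // accessOrder = true: iteration order runs from least to most recently used.
        this.open = new LinkedHashMap<String, RandomAccessFile>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, RandomAccessFile> eldest) {
                if (size() <= OpenFileCache.this.maxOpen) {
                    return false;
                }
                try {
                    eldest.getValue().close();   // release the descriptor before eviction
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                return true;
            }
        };
    }

    /** Returns an open handle for the path, opening the file only on a miss. */
    RandomAccessFile get(String path) throws IOException {
        RandomAccessFile file = open.get(path);
        if (file == null) {
            file = new RandomAccessFile(path, "r");
            open.put(path, file);
        }
        return file;
    }

    int openCount() {
        return open.size();
    }

    /** Closes every cached handle, e.g. when a worker thread shuts down. */
    void closeAll() throws IOException {
        for (RandomAccessFile file : open.values()) {
            file.close();
        }
        open.clear();
    }

    // One cache per thread; the limit of 64 is an arbitrary illustrative value.
    static final ThreadLocal<OpenFileCache> PER_THREAD =
            ThreadLocal.withInitial(() -> new OpenFileCache(64));
}
```

With N worker threads the process can hold up to N times the per-thread limit in open files, so the limit has to be chosen with the system-wide maximum in mind.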
- Joe
Bill Moninger wrote:
Hello John, Jon, and Bob,
Thanks for your useful questions and comments.
I was testing the timing from the command line, and I agree that 
java startup time might have been a big issue.
So I took the lead from the modified program that John sent back, 
which did a loop of opening the netcdf file, pulling a 
hyperslab (a hyperline really) out of the file, then closing it.
I amended both the java and C programs (attached as a tar file) to 
take the number of times through the loop as the sole argument and 
got the following results when reading the netcdf file available at
http://ruc.noaa.gov/ruc_native_40.nc (53 M in size):
%> sounding.x 10000
C: elapsed time for 10000 reads is 16.630000 seconds
(varied between 13.9 and 16.6 secs)
%> java -server Tester2 10000
java: elapsed time for 10000 reads is 44.466998 seconds
(varied between 20.1 sec and 44.5 sec)
So, it looks like something other than the startup cost is causing 
java to be slower than C by about 1.2 to 2.5x. But the java times 
appear to be a lot more variable than the C times.
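
For reference, the two quoted totals can be turned into a per-read cost with a few lines of arithmetic. This is only a restatement of the numbers above; the class and method names are hypothetical.

```java
// Converts the reported totals (seconds for 10000 reads) into milliseconds per read.
final class ReadTimings {

    static double millisPerRead(double totalSeconds, int reads) {
        return totalSeconds * 1000.0 / reads;
    }

    public static void main(String[] args) {
        double cMs = millisPerRead(16.630000, 10000);     // C run quoted above
        double javaMs = millisPerRead(44.466998, 10000);  // java run quoted above
        System.out.printf("C:    %.3f ms per open/read/close%n", cMs);
        System.out.printf("java: %.3f ms per open/read/close%n", javaMs);
    }
}
```

That works out to roughly 1.7 ms per cycle for C and 4.4 ms for java in these two particular runs.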
Perhaps I am using the libraries non-optimally; if so, I'll be very 
grateful for any suggestions.
-Bill
On 5/6/2009 4:21 PM, John Caron wrote:
Hi Bill:
I made a few mods to your program (attached)
1) removed the print statements, which are notoriously slow.
2) did the whole open/read/close loop 100 times
3) added timing, and got:
that took 1248.659775 millisecs
which is about 13 msecs per call. When I get a chance I will try to 
compare to the C code.
None of this is all that definitive; it's very hard to get accurate 
timings on small programs. For one thing, java compilation happens 
at runtime, and it's somewhat nondeterministic, so running a program 
once will very likely look very bad. If you are doing a CGI type 
server, where the java application starts up for each request, that 
will be very slow.
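
A minimal sketch of one common way to reduce that effect when timing, assuming nothing beyond the standard library: run the task a number of times unmeasured so the runtime compiler has a chance to kick in, then time a second batch. The class name and the stand-in workload are hypothetical, and this is not a substitute for a proper benchmarking tool.

```java
// Hypothetical timing helper: warm up first, then measure.
final class WarmTimer {

    /** Returns the mean milliseconds per call over the measured runs only. */
    static double meanMillis(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run();                       // unmeasured: lets the JIT compile hot code
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1e6 / measuredRuns;
    }

    public static void main(String[] args) {
        double[] data = new double[100_000];
        double[] sink = new double[1];        // keeps the result observable
        Runnable workload = () -> {           // stand-in for an open/read/close cycle
            double sum = 0;
            for (double d : data) {
                sum += Math.sqrt(d + 1.0);
            }
            sink[0] = sum;
        };
        System.out.printf("cold: %.3f ms per call%n", meanMillis(workload, 0, 1));
        System.out.printf("warm: %.3f ms per call%n", meanMillis(workload, 200, 100));
    }
}
```

In a long-running server the warm figure is the relevant one; in a CGI-style setup every request pays something closer to the cold figure plus JVM startup.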
I can pretty much promise you that java performance is within a 
factor of 2 of C code, and more likely within 20% of C code, in a 
long-running server environment. There are certain things it can do 
faster, like memory allocation and multithreading.
Anyway, I could look at your actual production code to see if there 
are some ways to help speed it up. It is possible that for various 
reasons, Java will be "several times slower" than C code, so you'll 
have to decide if the increase in productivity is worth it.
Bill Moninger wrote:
 