NOTICE: This version of the NSF Unidata web site (archive.unidata.ucar.edu) is no longer being updated.
Current content can be found at unidata.ucar.edu.
To learn about what's going on, see About the Archive Site.
A while back I polled the users of Splus, HDF and netCDF for programs that
interface Splus to a common data format. I'd like to thank everyone for their
timely and helpful responses. Below, I have compiled the comments from the
various newsgroups. Thanks again,

Andrea Wood
Pacific Northwest Laboratory

-------------------------------------------------------------------
***From phil@xxxxxxxxxxx Wed Mar 16 14:05:18 1994***

Why don't you take a look (anonymous ftp) at the code in
leghorn.geog.ubc.ca:/pub/splus? We'd be interested in hearing from anyone
else who may contact you on this, as well. Our interface is currently under
development, and it crashes occasionally due to a bug (we believe) in Splus,
but we use it daily to get random access to large netCDF arrays. (Note that
CDFchange isn't working yet.) I've appended the file CDF_SPLUS.doc, and the
letter we have written to someone at Statsci about the bug.

<phil@coot: pub/splus> cat CDF_SPLUS.doc
# $Id: CDF_SPLUS.doc,v 1.2 1993/09/09 06:27:49 mintha Exp mintha $
#
# Documentation for the CDF->Splus->CDF interface
#
# $Log: CDF_SPLUS.doc,v $
# Revision 1.2  1993/09/09 06:27:49  mintha
# Prelim SPLUS->CDF docs & clean up
#
# Revision 1.1  1993/08/13 22:50:06  mintha
# Initial revision

CDF->Splus: To read files into Splus.

User functions:

CDFgetinfo(filename, close=T)
    Reads in all the information about the given CDF file and returns a
    structure with all the global attributes and a component for each
    variable, with all its dimensions and attributes. If 'close' is FALSE
    then the file is not closed (used internally).

CDFget(filename, varname, start=NULL, count=NULL)
    Reads in the data for the given variable (a variable id can be
    substituted for the name). The dimnames are attached and any attributes
    are added as Splus attributes. If 'start' is NULL (the default) the
    origin (1,1,...) is used. If 'count' is NULL then the rest of the
    variable is used.
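The start/count convention just described for CDFget can be illustrated with
a small Python sketch. This is a toy re-derivation, not part of Phil's
package: 'data' is a flat row-major list, indices are 1-based as in S, and
the -1 count rule from the CDF_SPLUS.doc text is handled as well.

```python
def hyperslab(data, shape, start=None, count=None):
    """Toy model of the CDFget start/count convention.
    start=None -> origin (1,1,...); count=None -> rest of the variable;
    a count entry of -1 -> remaining extent of that dimension."""
    rank = len(shape)
    if start is None:
        start = [1] * rank                          # 1-based origin, as in S
    if count is None:
        count = [shape[i] - start[i] + 1 for i in range(rank)]
    count = [shape[i] - start[i] + 1 if c == -1 else c
             for i, c in enumerate(count)]

    def strides(shape):
        # Row-major strides for a flat list.
        s, acc = [], 1
        for extent in reversed(shape):
            s.insert(0, acc)
            acc *= extent
        return s

    st = strides(shape)
    out = []

    def walk(dim, offset):
        # Recursively visit every index inside the hyperslab.
        if dim == rank:
            out.append(data[offset])
            return
        for i in range(count[dim]):
            walk(dim + 1, offset + (start[dim] - 1 + i) * st[dim])

    walk(0, 0)
    return out, count

# A 2x3 variable stored row-major as [0..5]:
data = list(range(6))
print(hyperslab(data, [2, 3], start=[1, 2], count=[2, -1]))
# -> ([1, 2, 4, 5], [2, 2])
```

With no start/count the whole variable comes back, matching the NULL
defaults in the documentation above.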
    If any dimension in 'count' is -1 then it is replaced by the remaining
    extent of that dimension.

Internal functions:

CDFgetdata(info, var, start=NULL, count=NULL)
    Does the actual work of reading in the data and attaching the dimnames
    and attributes. 'info' is the structure returned by CDFgetinfo, 'var'
    is the variable name or ID. 'start' and 'count' are as above.

CDFginfo(cdfp)
    Reads in all the global information about a CDF file, including
    dimension names, values, and labels, as well as global attributes.
    'cdfp' is the structure returned from the cdf_open call.

CDFvinfo(ginfo, var)
    Reads in all the information for the given variable ID ('var'),
    including dimensions, labels, and attributes. 'ginfo' is the structure
    returned by CDFginfo.

CDFgetinfo returns the following structure (note this is a fictitious
example):

$global:
  $ncid      23
  $numvars   4
  $dims      (5, 10, 20)
  $dnames    ("Time", "Droplets", "Ozone")
  $dimlabs   (NULL, "rx ry rz", "r1 r2 r3 r4 dydt dzdt")
  $unlimdim  4
  $history   "Processed by ..."
  $doc       "Ozone samples"
  $comment   "Created by Jim"
$variable1
  $name      "variable1"
  $id        5
  $type      5
  $dimids    (1, 2)
  $dims      (20, 10)
  $labels    ("r1 r2 r3 r4 dydt", "rx ry rz")
  $title     "Just a variable"
$variable2
  $name      "variable2"
  $id        1
  $type      6
  $dimids    (2, 3)
  $dims      (5, 10)
  $labels    (NULL, "rx ry rz")
  $title     "Bond, James Bond"
  $FillValue -999

CDFget returns something like this:

        rx  ry  rz
  dy     2   3   4
  dz     5  11   9
  dydz  13  12  11
  attr(,"title"):
  [1] "Hello"
  attr(,"file"):
  [1] "test2.nc"

SPLUS <- CDF: To write files out of Splus.

CDFappend(filename, varname, data, coord=NULL)
    Adds data to a variable in a netCDF file. The variable must already
    exist and one of its dimensions must be unlimited. If 'coord' is not
    NULL the data is appended to the coordinate variable which has the same
    name as the unlimited dimension (see the documentation on ncdump).

CDFchange(filename, varname, data, start=NULL, count=NULL)
    Changes an existing netCDF file. If the named variable does not exist
    then it is added to the file.
    If it does exist then the given data replaces that in the file. When
    adding a new variable the 'dnames' attribute must be present. In both
    cases the dimensions must match. A hyperslab can be specified with
    'start' and 'count' as above.

CDFput(filename, varname, data, clobber=F, unlim=NULL)
    Creates a new netCDF file named 'filename'. One variable 'varname' is
    created and filled with 'data'. 'data' must have the 'dnames'
    attribute. The named dimensions are created and given the sizes of the
    array unless an attribute 'dims' is defined. If the variable has
    dimnames these are added as labels for the dimensions; this can also be
    overridden with the 'labels' attribute. Unless 'clobber' is (T)rue this
    function will not overwrite an existing file. If 'unlim' is not NULL
    the given dimension is made unlimited.

Cheers, Phil
---------------------------------------------------------------------
***From tam@xxxxxxxxxxx Thu Mar 17 12:49:20 1994***

A couple of years ago I was asked to interface Splus and HDF so that users
could retrieve and put data into HDF files much like they did in Fortran.
The code is still here, although I don't know whether it is still being
used. It should still work. The package consists of:

C code
    Glue code that was called by Splus and in turn calls HDF routines after
    fiddling with the parameters so they'll work right. This code was
    compiled into a module (S2hdf.o) that was dynamically loaded into Splus
    via the dyn.load() command.

Splus code
    A set of Splus routines that made working with HDF more palatable
    (instead of trying to do the .C() call yourself and having to mangle
    the output into the usual Splus convention).

Here's the README file (which, upon rereading it myself, is not really
friendly and looks half-baked). I hope you can get the gist of the
operations that this package offers.
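The dyn.load()/.C() pattern Tam describes, compiling C glue into a module
and loading it into the interpreter at run time, has a rough analogue in
Python's ctypes. This sketch loads the system math library and calls sqrt()
from it (the original S2hdf.o glue is not available here, so libm stands in
for it):

```python
import ctypes
import ctypes.util

# Locate and load a shared library at run time, as dyn.load("S2hdf.o")
# did in Splus. find_library may return None on some platforms; on
# Linux, CDLL(None) falls back to symbols already linked into the process.
libm_path = ctypes.util.find_library("m")       # e.g. "libm.so.6" on Linux
lib = ctypes.CDLL(libm_path) if libm_path else ctypes.CDLL(None)

# As with the .C() interface, argument and result types must be declared
# explicitly; nothing is inferred across the language boundary.
lib.sqrt.restype = ctypes.c_double
lib.sqrt.argtypes = [ctypes.c_double]

print(lib.sqrt(9.0))                            # 3.0
```

The "Splus code" layer of Tam's package corresponds to the few lines here
that declare types and tidy the call: a thin wrapper that spares users the
raw foreign-function interface.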
--------------------------------------------------------------------------------
Notes on the interface to use the HDF library for input-output of
floating-point arrays in the Splus data analysis package.

noctiluca:/usr/local/src/hdf/splus/README   laplante@xxxxxxxxxxx   90may09

HDFload() makes the HDF library routines available by dynamically loading
/usr/local/lib/S2hdf.o. NB: the UBC fixes to the HDF library are necessary
for correct operation in cases where rank > 2.

HDFget() reads an entire HDF Scientific Data Set into an S "object"
(array). By specifying some of the parameters 'start' and 'end' (or
'axis') the user can get only a part of the SDS. For example
'start=c(1,1,1), end=c(7,5,1)' would get a 2-dimensional array. The form
'axis=2' would get a vector along the second dimension from the specified
start. The resulting array is collapsed to a lower rank if some dimensions
are 1. All parameters have reasonable defaults:

    file=HDFcurrentfile  (the last file loaded by this user)
    ref=1                (select one of several SDS in file)
    start=c(1,1,...)     (start at beginning)
    end=c(...)           (end at end)

HDFadddata() appends the specified S array as an SDS to the named HDF file.

Error messages should be self-explanatory ;-) .

The functions normally reside in /usr/local/splus/s/.Functions/HDF* .

=====================================================================
1) EXAMPLE OF USE:

File "753.hdf" is a 7*5*3 test matrix containing 111.0 112.0 ... 752.0 753.0

% Splus
S-PLUS : Copyright (c) 1988, 1989, 1990 Statistical Sciences, Inc.
S : Copyright AT&T.
Version 2.3 Release 1 for Sun4 : 1990
> HDFload()
S - HDF interface library loaded
> ls()
[1] "HDFadddata"     "HDFcurrentfile" "HDFget"         "HDFgetarray"
[5] "HDFgetdata"     "HDFgetdims"     "HDFgetslice"    "HDFload"
[9] "last.dump"
> HDFget(file="753.hdf",ref=1)
HDFget: start= 1 1 1
        slicedims= 7 5 3
, , 1
     [,1] [,2] [,3] [,4] [,5]
[1,]  111  121  131  141  151
[2,]  211  221  231  241  251
[3,]  311  321  331  341  351
[4,]  411  421  431  441  451
[5,]  511  521  531  541  551
[6,]  611  621  631  641  651
[7,]  711  721  731  741  751
, , 2
     [,1] [,2] [,3] [,4] [,5]
[1,]  112  122  132  142  152
[2,]  212  222  232  242  252
[3,]  312  322  332  342  352
[4,]  412  422  432  442  452
[5,]  512  522  532  542  552
[6,]  612  622  632  642  652
[7,]  712  722  732  742  752
, , 3
     [,1] [,2] [,3] [,4] [,5]
[1,]  113  123  133  143  153
[2,]  213  223  233  243  253
[3,]  313  323  333  343  353
[4,]  413  423  433  443  453
[5,]  513  523  533  543  553
[6,]  613  623  633  643  653
[7,]  713  723  733  743  753
> HDFget(end=c(7,5,1))
HDFget: using file 753.hdf
HDFget: start= 1 1 1
        slicedims= 7 5 1
     [,1] [,2] [,3] [,4] [,5]
[1,]  111  121  131  141  151
[2,]  211  221  231  241  251
[3,]  311  321  331  341  351
[4,]  411  421  431  441  451
[5,]  511  521  531  541  551
[6,]  611  621  631  641  651
[7,]  711  721  731  741  751
> HDFget(start=c(3,4,1),axis=3)
HDFget: using file 753.hdf
HDFget: start= 3 4 1
        slicedims= 1 1 3
[,1] [,2] [,3]
 341  342  343

=====================================================================
2) FUNCTION LISTINGS

> HDFload
function()
{
        dyn.load("S2hdf.o")
        cat("S - HDF interface library loaded\n")
        invisible()
}

> HDFgetdims
function(file = HDFcurrentfile, maxrank = 10)
{
        if(missing(file))
                cat("Using file: ", file, "\n")
        else HDFcurrentfile <<- file
        z <- .C("HDFgetdims",
                file,
                rank = integer(1),
                shape = integer(maxrank),
                as.integer(maxrank))
        z$shape[1:z$rank]
}

> HDFget
function(file = HDFcurrentfile, ref = 1, start = NULL, end = NULL,
        axis = NULL)
{
        # HDFget(file,ref,start,end,axis)
        # file  (default last name, saved in global string HDFcurrentfile)
        # ref   (serial number of data sets in the file)
        # start (vector: coordinates of starting point)
        # end   (vector: slicedims = end - start + 1)
        #  -or-
        # axis  (dimension number for extracting a vector from array)
        if(missing(file))
                cat("HDFget: using file ", file, "\n")
        else HDFcurrentfile <<- file
        .C("DFSDrestart")       # rewind
        for(i in 1:ref)
                filedims <- HDFgetdims(file)    # step through data sets
        filerank <- length(filedims)
        if(missing(start))
                start <- rep(1, filerank)
        else if(filerank != length(start)) {
                cat("HDFget dimension mismatch: start=", start, "but",
                        file, "has", filedims, "\n")
                return(NULL)
        }
        if(missing(end)) {
                if(missing(axis)) {
                        end <- filedims
                }
                else {
                        end <- start    # End = start except along axis.
                        end[axis] <- filedims[axis]
                }
        }
        else if(filerank != length(end)) {
                cat("HDFget: end=", end, "but", file, "has", filedims, "\n")
                return(NULL)
        }
        slicedims <- end - start + 1
        cat("HDFget: start=", start, "\n")
        cat("\tslicedims=", slicedims, "\n")
        z <- .C("HDFgetslice",
                file,
                as.integer(start),
                as.integer(slicedims),
                dat = single(prod(slicedims)),
                as.integer(slicedims))
        array(z$dat, slicedims[slicedims != 1])
}

> HDFadddata
function(file, data, append = F)
{
        if(missing(file))
                cat("File must be supplied\n")
        else if(missing(data))
                cat("Missing data parameter\n")
        else {
                if(append)
                        cat("Appending data to ", file, " ...\n")
                else cat("Replacing data in ", file, " ...\n")
                d <- dim(data)
                .C("HDFadddata",
                        file,
                        as.integer(length(d)),
                        as.integer(d),
                        as.single(data),
                        as.integer(append))
        }
        invisible()
}
--------------------------------------------------------------------------------

Still interested?
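The shape arithmetic in the HDFget listing above (slicedims = end - start
+ 1, the end-equals-start-except-along-axis rule, and the collapse of
size-1 dimensions via slicedims[slicedims != 1]) can be re-derived in a
short Python sketch. This is a hypothetical helper for illustration, not
part of the package; 'axis' is 1-based as in S:

```python
def slice_shape(filedims, start=None, end=None, axis=None):
    """Reproduce HDFget's shape bookkeeping.
    Returns (slicedims, collapsed): the raw extent of the requested
    slice, and that extent with size-1 dimensions dropped, as HDFget
    does when building the result array."""
    rank = len(filedims)
    if start is None:
        start = [1] * rank                      # start at the origin
    if end is None:
        if axis is None:
            end = list(filedims)                # whole data set
        else:
            end = list(start)                   # end = start except along axis
            end[axis - 1] = filedims[axis - 1]  # 'axis' is 1-based, as in S
    slicedims = [e - s + 1 for s, e in zip(start, end)]
    collapsed = [d for d in slicedims if d != 1]
    return slicedims, collapsed

# Mirror the README transcript on the 7*5*3 file "753.hdf":
print(slice_shape((7, 5, 3), start=(3, 4, 1), axis=3))  # ([1, 1, 3], [3])
print(slice_shape((7, 5, 3), end=(7, 5, 1)))            # ([7, 5, 1], [7, 5])
```

Both results match the slicedims lines printed in the example session: the
axis=3 call yields a length-3 vector, and the end=c(7,5,1) call yields a
7x5 matrix.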
Joseph Tam <tam@xxxxxxxxxxx>
Department of Oceanography
University of British Columbia
Vancouver, BC, Canada  V6T 1Z4
Phone: (604) 822-3911
Fax:   (604) 822-6091
---------------------------------------------------------------
***From russ@xxxxxxxxxxxxxxxx Wed Mar 16 12:36:02 1994***

I got an evaluation copy of Splus and looked into the feasibility of
building a netCDF interface. I also evaluated the possibility of using
netCDF for the external representation of Splus datasets. I liked the
powerful expression language, the fully worked-out semantics for "NA"
missing values, the C and Fortran interfaces, and much else about Splus.

I decided netCDF was not powerful enough to represent some kinds of Splus
recursive data structures, e.g. lists of lists or nested lists with named
elements. This meant it would be impractical to try to export arbitrary
Splus data structures into netCDF files. However, an interface that
allowed the import of netCDF data into Splus seemed quite practical.
That's about as far as I got during the evaluation.
__________________________________________________________________________
Russ Rew                               UCAR Unidata Program
russ@xxxxxxxxxxxxxxxx                  P.O. Box 3000
(303)497-8645                          Boulder, Colorado 80307-3000
---------------------------------------------------------------
***From maa017@xxxxxxxxxxxxxxxxx Wed Mar 16 13:21:50 1994***

I wrote some code to send HDF SDS data to the DTM outport so that I could
use XDataslice to visualise 3-dimensional arrays. It works quite nicely:

    Splus> arr_array(runif(50*50*50),dim=c(50,50,50))
    Splus> sendarraydtm(arr)

and bingo, up it comes in Data Slice for animation, slicing, whatever. My
needs didn't go past SDS data, so I haven't coded raster images, palettes
or anything else! My one worry is that the HDF/DTM library uses malloc and
so eventually gobbles up all the memory.
Anyway, if you want to see my S/C code for it, I'll mail it to you!

Barry Rowlingson
Research Associate, Maths/Stats Dept
Lancaster University
----------------------------------------------------------------
***From maclean@xxxxxxxxxxxxxxxxxx Wed Mar 16 12:51:35 1994***

We have a function, fun.readnetcdf(), which will read a scalar or
multidimensioned netCDF variable into S. This function calls C functions
which in turn make the netCDF calls. Right now it must read an entire
variable - you cannot give the indices of the netCDF "hyperslab" that you
want to read. However, it should give you the basic functionality and
would be easy to upgrade for your needs. Let me know if you want it and I
can send the C and S code to you.

Gordon Maclean
Surface and Sounding Systems Facility
Atmospheric Tech. Div.
National Center for Atmospheric Research
P.O. Box 3000, Boulder CO 80307-3000
email: maclean@xxxxxxxxxxxxx
phone: (303) 497-8794   FAX: (303) 497-8770
-----------------------------------------------------------------
***From Jaason.Haapakoski@xxxxxx Wed Mar 16 22:47:44 1994***

I have built a read-only interface from Splus to netCDF files. If that's
enough, you can get the C routines, Splus functions and the help pages on
request.

The interface uses generic Splus database functions, so you can attach to
a netCDF file and then read the variables in the file just by referring to
them the same way you refer to ordinary Splus objects in a directory. Any
netCDF attributes are read and assigned to the resulting object. NetCDF
dimensions are translated to the dim attribute of Splus. Missing values
are translated to NAs.

Jaason Haapakoski

I forgot to mention in my previous message that our interface supports
only variables of fixed dimensions (it uses the function ncvarget), not
records (it does not use the function ncrecget). We have had no use for
record-type netCDF variables yet.
I think it would be fairly simple to add the missing routines, but I have
no time for that now. By contrast, adding a write interface from Splus to
netCDF is not as simple, because there are more types of data structures
in Splus than in netCDF. I think it would be wise to restrict oneself to
Splus arrays, for example (with a data frame as a special case).

Jaason Haapakoski
ATBC Cancer Prevention Study
The National Institute of Public Health in Finland
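The attach-style access Haapakoski describes, where variables in an
attached netCDF file resolve by name like ordinary Splus objects, can be
mimicked in Python with attribute lookup. This is a toy sketch: a plain
dict stands in for the real netCDF file, and a fill value plays the role
of the missing-value code that his interface translates to NA.

```python
class NetCDFAttach:
    """Toy model of attaching a netCDF file as a database: variables
    resolve by name, and the fill value is translated to None, as the
    interface maps netCDF missing values to Splus NA."""

    def __init__(self, variables, fill=-999):
        # 'variables' maps variable names to flat lists of values.
        self._variables = variables
        self._fill = fill

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so variable
        # names behave like ordinary attributes of the object.
        try:
            values = self._variables[name]
        except KeyError:
            raise AttributeError(name)
        return [None if v == self._fill else v for v in values]

nc = NetCDFAttach({"ozone": [1.5, -999, 2.0]})
print(nc.ozone)     # [1.5, None, 2.0]
```

Reading nc.ozone here corresponds to simply naming a variable after
attaching the file in Splus; no explicit read call is needed.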