Hi netCDF team

This email is rather long, so please bear with me ...

The short read and main question is:

How to find the full dimension names (paths with groups) for all dimensions that a variable has?

Example (note: incomplete CDL syntax):

  group: g16 {
    dimensions:
      lon1=4; // dimension that has a coordinate variable down in scope at /g16/g16g1/lon1(lon1)
    group: g16g1 {
      variables:
        float lon1(lon1); // coordinate variable /g16/g16g1/lon1 that has dimension (/g16/lon1) in scope
        float lon1_var(lon1); // variable /g16/g16g1/lon1_var that has dimension (/g16/lon1) in scope *and* coordinate (/g16/g16g1/lon1) in scope
      data:
        lon1=0.,1.,2.,3.;
        lon1_var=0.,1.,2.,3.;

Note that coordinate variables can share dimensions; here's a case of a "parallel" group /g16/g16g2/ of /g16/g16g1/ where variables have their own local coordinate variable that shares the ancestor dimension (/g16/lon1):

    group: g16g2 {
      variables:
        // coordinate variable (/g16/lon1)
        float lon1(lon1);
        float lon1_var(lon1);

It is possible to construct other cases: variables with n dimensions, each one defined in a different group (and each one of these dimensions can have coordinate variables in *other* different groups).

More broadly, I am trying to construct a model for ncks of a netCDF4 file that includes:

1) A list of all "objects" in the file. I call an "object" what I call an object in HDF5: either a group or a variable (a variable is commonly called a "dataset" in HDF5).

2) netCDF4 has dimensions. HDF5 does not. (Let's ignore HDF5 dimension scales for now, to keep this simple... Coincidentally, netCDF4 *happens* to use HDF5 dimension scales in its inner model, but my understanding is that it did not have to be that way... I think. Imagine, for example, that HDF5 dimension scales did not exist... It would be perfectly possible for netCDF4 to use HDF5 as the underlying format... HDF5 dimension scales are not part of the HDF5 format, they are just an abstraction layer built on top of HDF5 with a so-called "High Level" API... At the time, the requirement was for HDF5 to have the equivalent of HDF(4) "coordinate variables" that could be shared between HDF5 datasets.)

An excellent article about dimension scales:
http://www.unidata.ucar.edu/blogs/developer/en/entry/netcdf4_shared_dimensions

Let's call these netCDF4 dimensions "unique dimensions". These are defined in groups.

3) This model stores *full names* of things: full names for groups, variables and unique dimensions. Also, full names for coordinate variables.

4) Coordinate variables. From the netCDF manual:

"It is legal for a variable to have the same name as a dimension. Such variables have no special meaning to the netCDF library. However there is a convention that such variables should be treated in a special way by software using this library. A variable with the same name as a dimension is called a coordinate variable."

Dimensions and coordinate variables are used by variables. So, variables must know where their dimensions and coordinate variables (if any exist for that variable) are.

Example of an output that prints either a dimension or a coordinate variable for any variable:

  /g16/g16g1/lon1 ---> coordinate variable
  lon1[0]=0
  lon1[1]=1
  lon1[2]=2
  lon1[3]=3

  /g16/g16g1/lon1_var ---> variable with coordinate variable
  lon1[0]=0 lon1_var[0]=0
  lon1[1]=1 lon1_var[1]=1
  lon1[2]=2 lon1_var[2]=2
  lon1[3]=3 lon1_var[3]=3

The API function that returns a dimension name for a variable is, from the netCDF C manual
http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c/nc_005finq_005fdim-Family.html#nc_005finq_005fdim-Family

  int nc_inq_dimname (int ncid, int dimid, char *name);

  ncid   NetCDF ID, from a previous call to nc_open or nc_create.
  dimid  Dimension ID, from a previous call to nc_inq_dimid or nc_def_dim.
  name   Returned dimension name.

Note: here "ncid" is actually a "location" ID (either a group or the main netCDF file ID), so I think you should change this in the documentation.

The "dimid" parameter is an ID of a dimension.
This is obtained with the API function

  int nc_inq_dimid (int ncid, const char *name, int *dimidp);

  ncid    NetCDF ID, from a previous call to nc_open or nc_create.
  name    Dimension name.
  dimidp  Pointer to location for the returned dimension ID.

From the manual:

"When searching for a dimension, the specified group is searched, and then its parent group, and then its grandparent group, etc., up to the root group."

OK, great, the dimension ID "dimidp" can be in an ancestor group, but how do I know where?

My understanding is that netCDF4 group IDs are "unique"; dimension IDs are not, they can have duplicated values in several groups. In the above call to nc_inq_dimid, dimension IDs in ancestor groups are returned, but duplicates may happen.

I think storing IDs, even unique group IDs, in the model above is a recipe for disaster. I see IDs as the equivalent of the paper ticket number I am given when I take the train and want to keep my luggage at a station for a while. When I get my bags back, I dispose of the ticket. That ticket is helpful only for the person who has to identify my bags.

As a developer, for debugging purposes, or even as a netCDF4 user, it is also much easier to identify something by name than by ID.

Possible ways to solve this (to get the full dimension name for a variable):

1) Iterate over ancestor groups, get all variables for each group, get the variables' dimension IDs, and compare with the group dimension IDs?

2) Iterate over ancestor groups, try to construct a possible full dimension name, and match?

Below is a code sample that tries to solve this using option 2) above. But as a netCDF API user, I don't think that I should have to do this, mainly because it could just be wrong (it might not cover all cases, for example).

What I think is needed here is a new API function that returns the *full* dimension names for all dimensions used by a variable, instead of an ID and relative name only.
It should also say whether that dimension "name" is a coordinate variable or just a dimension.

Would it be possible for the netCDF group to supply this function?

There is a similar function for groups:
http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c/nc_005finq_005fgrpname_005ffull.html#nc_005finq_005fgrpname_005ffull

  int nc_inq_grpname_full(int ncid, size_t *lenp, char *full_name);

  ncid       The group id for this operation.
  full_name  Pointer to allocated space of correct length.

That returns the *full name* of the group from the group ID; this is one of the most helpful functions for constructing this "full path" model.

Thanks for your help

Pedro

------
Pedro Vicente, Earth System Science
University of California, Irvine
http://www.ess.uci.edu/

PS: here's the code that tries to find full dimension names

  /* Loop *object* traversal table */
  for(unsigned uidx=0;uidx<trv_tbl->nbr;uidx++){
    if(trv_tbl->lst[uidx].nco_typ == nco_obj_typ_var){
      trv_sct trv=trv_tbl->lst[uidx];

      /* Obtain group ID using full group name */
      (void)nco_inq_grp_full_ncid(nc_id,trv.grp_nm_fll,&grp_id);

      /* Obtain variable ID using group ID */
      (void)nco_inq_varid(grp_id,trv.nm,&var_id);

      /* Get number of dimensions for variable */
      (void)nco_inq_varndims(grp_id,var_id,&nbr_dmn_var);

      /* Get dimension IDs for variable */
      (void)nco_inq_vardimid(grp_id,var_id,dmn_id_var);

      /* Obtain dimension IDs for group. NB: go to parents */
      (void)nco_inq_dimids(grp_id,&nbr_dmn_grp,dmn_id_grp,flg_prn);

      /* Loop over dimensions of variable */
      for(int dmn_idx_var=0;dmn_idx_var<nbr_dmn_var;dmn_idx_var++){

        /* Get dimension name */
        (void)nco_inq_dimname(grp_id,dmn_id_var[dmn_idx_var],dmn_nm_var);

        /* Now the exciting part; we have to locate where "dmn_nm_var" is located
           1) Dimensions are defined in *groups*: find group where variable resides
           2) Most common case is for the dimension to be defined in the same group where variable is
           3) If not, we have to traverse the group back until the dimension name is found

           From: "Dennis Heimbigner" <dmh@xxxxxxxxxxxxxxxx>
           Subject: Re: [netcdfgroup] defining dimensions in groups
           1. The inner dimension is used. The rule is to look up the group tree
              from innermost to root and choose the first one that is found
              with a matching name.
           2. The fact that it is a dimension for a coordinate variable is not relevant
              for the choice.
           However, note that this rule is only used by ncgen when disambiguating
           a reference in the CDL.
           The issue does not come up in the netcdf API because you have to
           specifically supply the dimension id when defining the dimension
           for a variable.

           4) Use case example: /g5/g5g1/rz variable and rz(rlev), where dimension "rlev" resides in /g5/rlev
        */

        /* Loop over dimensions of group *and* parents */
        for(int dmn_idx_grp=0;dmn_idx_grp<nbr_dmn_grp;dmn_idx_grp++){

          /* Get dimension name for group */
          (void)nco_inq_dimname(grp_id,dmn_id_grp[dmn_idx_grp],dmn_nm_grp);

          /* Does dimension name for *variable* match dimension name for *group* ? */
          if(strcmp(dmn_nm_var,dmn_nm_grp) == 0){
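
For illustration only, the "innermost to root" name lookup described in the comment above can be sketched without the netCDF library at all. The group_t struct and the functions group_defines_dim and dim_full_name are hypothetical names made up for this sketch; they are not part of netCDF or NCO. With the real library, the walk up the tree would presumably use calls such as nc_inq_grp_parent, nc_inq_dimids (without including parents) and nc_inq_grpname_full. Note that this matches by name only, which is exactly the approach the post worries might not cover all cases.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DIMS 8

/* Hypothetical in-memory stand-in for a netCDF-4 group (not a library type). */
typedef struct group {
    const char *full_name;       /* e.g. "/g16/g16g1"; "/" for the root group */
    const struct group *parent;  /* NULL for the root group */
    const char *dims[MAX_DIMS];  /* names of dimensions defined in this group */
    int ndims;                   /* number of valid entries in dims */
} group_t;

/* Return 1 if `name` is a dimension defined directly in `grp`, else 0. */
static int group_defines_dim(const group_t *grp, const char *name)
{
    for (int i = 0; i < grp->ndims; i++)
        if (strcmp(grp->dims[i], name) == 0)
            return 1;
    return 0;
}

/* Search `start`, then its parent, grandparent, ... up to the root, and
   write the full path of the first dimension called `name` into `out`.
   Returns 0 on success, -1 if no group defines it or `out` is too small. */
static int dim_full_name(const group_t *start, const char *name,
                         char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);
    for (const group_t *g = start; g != NULL; g = g->parent) {
        if (!group_defines_dim(g, name))
            continue;
        /* The root is named "/", so no extra separator is needed there. */
        const char *sep = (strcmp(g->full_name, "/") == 0) ? "" : "/";
        int n = snprintf(out, outlen, "%s%s%s", g->full_name, sep, name);
        return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
    }
    return -1;
}
```

For the example file at the top of this post, looking up "lon1" starting from /g16/g16g1 finds nothing in that group, then finds the dimension in /g16, so the result is /g16/lon1. The coordinate variable itself would still have to be located separately.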