I think keeping hdf and netcdf mostly separate is the correct solution.
A couple of points.
1. A group can be a dictionary with the following keys:
   dimensions, types, attributes (group level), variables, and data.
2. Ordering matters in netcdf, so each of the group pieces
   (dimensions, etc.) needs to be a list.
3. Variables have a number of unordered parts that are best
   represented as a dictionary containing: name, type, attributes.
4. A set of attributes could be represented as a dictionary with the
   attribute names serving as keys. But remember that each attribute
   has a number of parts: type, name, and a list of values.
5. In netcdf, there are several kinds of user-defined types:
   1. enumerations: an enumeration consists of a name, a basetype
      (an integer type) and a set of enumeration constants. Each such
      constant consists of a name and a value.
   2. compound type (a structure in C terms): consisting of a name
      and an ORDERED list of fields. Each field is a variable (see above).
   3. vlen type: a variable-length set of instances of some arbitrary
      base type.

=Dennis Heimbigner
 Unidata

On 10/20/2016 5:50 PM, Pedro Vicente wrote:
>>my thought was to make a netcdfJSON, then add features to make an hdfJSON. (and netcdfJSON would look a lot like CDL)
>>So a netcdfJSON file would be a valid hdfJSON file, but not the other way around.

On second thought, this design has the problem that netCDF has things that HDF5 does not (named dimensions), and HDF5 has things that netCDF does not, so it's a bit of a catch-22; so maybe just keep them separate.

My design method is usually a bit of specification, then a bit of code; then, when something new comes up that was not planned, go back to step 1 and rewrite the spec, and sometimes rewrite the code.

-Pedro

----- Original Message -----
From: Pedro Vicente <pedro.vicente@xxxxxxxxxxxxxxxxxx>
To: Chris Barker <chris.barker@xxxxxxxx>
Cc: HDF Users Discussion List <hdf-forum@xxxxxxxxxxxxxxxxxx>; netCDF Mail List <netcdfgroup@xxxxxxxxxxxxxxxx>
Sent: Thursday, October 20, 2016 7:33 PM
Subject: Re: [netcdfgroup] How to dump netCDF to JSON?

>>my thought was to make a netcdfJSON, then add features to make an hdfJSON. (and netcdfJSON would look a lot like CDL)
>>So a netcdfJSON file would be a valid hdfJSON file, but not the other way around.

yes, sounds like a good plan

I'll send you an email when I have things ready, thanks

-Pedro

----- Original Message -----
From: Chris Barker <chris.barker@xxxxxxxx>
To: Pedro Vicente <pedro.vicente@xxxxxxxxxxxxxxxxxx>
Cc: John Readey <jreadey@xxxxxxxxxxxx>; netCDF Mail List <netcdfgroup@xxxxxxxxxxxxxxxx>; HDF Users Discussion List <hdf-forum@xxxxxxxxxxxxxxxxxx>
Sent: Thursday, October 20, 2016 6:17 PM
Subject: Re: [netcdfgroup] How to dump netCDF to JSON?

On Thu, Oct 20, 2016 at 3:00 PM, Pedro Vicente <pedro.vicente@xxxxxxxxxxxxxxxxxx> wrote:

>>> This is making me think that we may want a spec for netcdf-json that would be a subset of the hdf-json spec.
that is one option; the other option is to make a JSON form of netCDF CDL, completely unaware of HDF5 (just like the netCDF API is)

http://www.unidata.ucar.edu/software/netcdf/workshops/2011/utilities/CDL.html

yup. Are they mutually exclusive approaches?

my thought was to make a netcdfJSON, then add features to make an hdfJSON. (and netcdfJSON would look a lot like CDL)

So a netcdfJSON file would be a valid hdfJSON file, but not the other way around. Like a netcdf4 file is a valid hdf5 file now.

-CHB

with the "data" part being optional, which was one of the goals of my design, to transmit just metadata over the web, for a quick remote inspection

-Pedro

----- Original Message -----
From: Chris Barker <chris.barker@xxxxxxxx>
To: John Readey <jreadey@xxxxxxxxxxxx>
Cc: Pedro Vicente <pedro.vicente@xxxxxxxxxxxxxxxxxx>; netCDF Mail List <netcdfgroup@xxxxxxxxxxxxxxxx>; HDF Users Discussion List <hdf-forum@xxxxxxxxxxxxxxxxxx>
Sent: Thursday, October 20, 2016 4:48 PM
Subject: Re: [netcdfgroup] How to dump netCDF to JSON?

On Thu, Oct 20, 2016 at 12:02 PM, John Readey <jreadey@xxxxxxxxxxxx> wrote:

So we came up with a scheme of Group, Dataset, and Datatype collections with a UUID to identify each object. That way if you have a reference to a specific UUID, you can always access the object regardless of what shenanigans may be happening with the links in the file.

It’s true that this makes path lookups a bit more cumbersome, but it’s a more general way of specifying a directed graph (the HDF5 link structure) on a tree (the JSON hierarchy).

Hmm -- interesting. I hadn't realized that HDF was this flexible. For my part, I've only really used netcdf.

This is making me think that we may want a spec for netcdf-json that would be a subset of the hdf-json spec.
That way they can be as compatible as possible without "cluttering up" the netcdf spec too much.

-CHB

John

From: Pedro Vicente <pedro.vicente@xxxxxxxxxxxxxxxxxx>
Date: Tuesday, October 18, 2016 at 9:37 PM
To: John Readey <jreadey@xxxxxxxxxxxx>, Chris Barker <chris.barker@xxxxxxxx>
Cc: netCDF Mail List <netcdfgroup@xxxxxxxxxxxxxxxx>, HDF Users Discussion List <hdf-forum@xxxxxxxxxxxxxxxxxx>
Subject: Re: [netcdfgroup] How to dump netCDF to JSON?

@John

>> 1. Complete fidelity to all HDF5 features
>> 2. Support graphs that are not acyclic.

ok, understood.

In my case I needed a simple schema for a particular set of files.

But why didn't you start with the official HDF5 DDL

https://support.hdfgroup.org/HDF5/doc/ddl.html

and try to adapt to JSON?

Same thing for netCDF, there is already an official CDL, so any JSON spec should be "identical".

@Chris

{
"dset1" : ["dataset", "STAR_INT32", 2, [3, 4], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]]
}

>> * Do you need "rank"?

sometimes a bit of redundancy is useful, to make it visually clear

>> BTW, is a "dataset" in HDF the same thing as a "variable" in netcdf?)

yes

>>It would be really great to have this become an "official" spec -- if you want to get it there, you're probably going to need to develop it more out in the open with a wider community. These lists are the way to get that started, but I suggest
>>1) put it up somewhere that people can collaborate on it, make suggestions, capture the discussion, etc. gitHub is one really nice way to do that.
>>See, for example the UGRID spec project:

ok, anyone interested send me an off list email

-Pedro

----- Original Message -----
From: John Readey <jreadey@xxxxxxxxxxxx>
To: Chris Barker <chris.barker@xxxxxxxx>; Pedro Vicente <pedro.vicente@xxxxxxxxxxxxxxxxxx>
Cc: netCDF Mail List <netcdfgroup@xxxxxxxxxxxxxxxx>; Charlie Zender <zender@xxxxxxx>; HDF Users Discussion List <hdf-forum@xxxxxxxxxxxxxxxxxx>; David Pearah <David.Pearah@xxxxxxxxxxxx>
Sent: Tuesday, October 18, 2016 11:15 PM
Subject: Re: [netcdfgroup] How to dump netCDF to JSON?

Hey,

The hdf5-json code is here: https://github.com/HDFGroup/hdf5-json and docs are here: http://hdf5-json.readthedocs.io/en/latest/.

The package is both a library of HDF5 <-> JSON conversion functions and some simple scripts for converting HDF5 to JSON and vice-versa. E.g.

$ python h5tojson.py -D <hdf5-file>

outputs JSON minus the dataset data values.

While it may not be the most elegant JSON schema, it’s designed with the following goals in mind:

1. Complete fidelity to all HDF5 features (i.e. the goal is that you should be able to take any HDF5 file, convert it to JSON, convert back to HDF5 and wind up with a file that is semantically equivalent to what you started with).
2. Support graphs that are not acyclic. I.e. a group structure where <root> links to A and B, and A and B both link to C.
   The output should only produce one representation of C.

Since NetCDF doesn’t use all these features, it’s certainly possible to come up with something simpler for just netCDF files.

Suggestions, feedback, and pull requests are welcome!

Cheers,
John

From: Chris Barker <chris.barker@xxxxxxxx>
Date: Friday, October 14, 2016 at 12:32 PM
To: Pedro Vicente <pedro.vicente@xxxxxxxxxxxxxxxxxx>
Cc: netCDF Mail List <netcdfgroup@xxxxxxxxxxxxxxxx>, Charlie Zender <zender@xxxxxxx>, John Readey <jreadey@xxxxxxxxxxxx>, HDF Users Discussion List <hdf-forum@xxxxxxxxxxxxxxxxxx>, David Pearah <David.Pearah@xxxxxxxxxxxx>
Subject: Re: [netcdfgroup] How to dump netCDF to JSON?

Pedro,

When I first started reading this thread, I thought "there should be a spec for how to represent netcdf in JSON"

and then I read:

1) The specification to convert netCDF/HDF5 to "a" JSON format (note the "a" here)

Awesome -- that's exactly what we need -- as you say there is not one way to represent netcdf data in JSON, and probably far more than one "obvious" way.

Without looking at your spec yet, I do think it should probably look as much like CDL as possible -- we are all familiar with that.

(why Python? HDF5 developer tools should be all about writing in C/C++)

Because Python is an excellent language with which to "drive" C/C++ libraries like HDF5 and netcdf4. If I were to do this, I'd sure use Python.
Even if you want to get to a C++ implementation eventually, you'd probably benefit from prototyping and working out the kinks with a Python version first.

But whoever is writing the code....

The specification is here http://www.space-research.org/

Just took a quick look -- nice start.

I've only used HDF through the netcdf4 spec, so there may be richness needed that I'm missing, but my first thought is to make greater use of "objects" in JSON (key-value structures, hash tables, dicts in python), rather than array position for heterogeneous structures. For instance, you have:

a dataset
{
"dset1" : ["dataset", "STAR_INT32", 2, [3, 4], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]]
}

I would perhaps do that as something like:

{
...
"dset1":{"object_type": "dataset",
         "dtype": "INT32",
         "rank": 2,
         "dimensions": [3,4],
         "data": [[1,2,3,4],
                  [5,6,7,8],
                  [9,10,11,12]]
        }
...
}

NOTES:

* I used nested arrays, rather than flattening the 2-d array -- this maps nicely to things like numpy arrays, for example -- not sure about the C++ world. (you can flatten and un-flatten numpy arrays easily, too, but this seems like a better mapping to the structure) And HDF is storing this all in chunks and who knows what -- so it's not a direct mapping to the memory layout anyway.

* Do you need "rank"? -- can't you check the length of the dimensions array?

* Do you need "object_type" -- will it always be a dataset?
Or you could have something like:

{
...
"datasets": {"dset1": {the actual dataset object},
             "dset2": {another dataset object},
             ....
            }

Then you don't need object_type or a name

(BTW, is a "dataset" in HDF the same thing as a "variable" in netcdf?)

I would like to make this some kind of "official" netCDF/HDF5 JSON format for the community, so I encourage anyone to read the specification

If you see any flaw in the design or anything in the design that you would like to have changed please let me know now

done :-)

It would be really great to have this become an "official" spec -- if you want to get it there, you're probably going to need to develop it more out in the open with a wider community. These lists are the way to get that started, but I suggest:

1) put it up somewhere that people can collaborate on it, make suggestions, capture the discussion, etc. gitHub is one really nice way to do that. See, for example the UGRID spec project:

https://github.com/ugrid-conventions/ugrid-conventions

(NOTE that that one got put on gitHub after there was a pretty complete draft spec, so there isn't THAT much discussion captured. But also note that that is too bad -- there is no good record of the decision process that led to the spec)

At the moment it only (intentionally) uses common generic features of both netCDF and HDF5, which are the numeric atomic types and strings.

Good plan.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@xxxxxxxx

_______________________________________________
NOTE: All exchanges posted to Unidata maintained email lists are
recorded in the Unidata inquiry tracking system and made publicly
available through the web. Users who post to any of the lists we
maintain are reminded to remove any personal information that they
do not want to be made public.

netcdfgroup mailing list
netcdfgroup@xxxxxxxxxxxxxxxx
For list information or to unsubscribe, visit: http://www.unidata.ucar.edu/mailing_lists/
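
As a rough illustration of the group layout described in the reply at the top of this thread (one dictionary per group, ordered lists for dimensions, types, and variables, and attributes keyed by name), here is a minimal Python sketch. The key names "kind", "size", "constants", and "fields", the per-variable "dimensions" list, the sample names (time, station, cloud_t, obs_t, profile_t), and the helper metadata_only are all hypothetical choices made for this example; none of this is an agreed netCDF-JSON specification.

```python
import json


def metadata_only(group):
    """Return a shallow copy of a group dict without its "data" entry."""
    return {key: value for key, value in group.items() if key != "data"}


# Hypothetical group: the top-level keys follow the message, the rest is assumed.
group = {
    # Ordering matters, so dimensions, types and variables are lists.
    "dimensions": [
        {"name": "time", "size": 3},
        {"name": "station", "size": 4},
    ],
    "types": [
        # Enumeration: name, integer basetype, constants (name + value).
        {"kind": "enum", "name": "cloud_t", "basetype": "ubyte",
         "constants": [{"name": "clear", "value": 0},
                       {"name": "overcast", "value": 1}]},
        # Compound: name plus an ORDERED list of fields (each a variable).
        {"kind": "compound", "name": "obs_t",
         "fields": [{"name": "temp", "type": "float", "attributes": {}},
                    {"name": "sky", "type": "cloud_t", "attributes": {}}]},
        # Vlen: variable-length sequence of some base type.
        {"kind": "vlen", "name": "profile_t", "basetype": "float"},
    ],
    # Attributes keyed by name; each carries a type and a list of values.
    "attributes": {"title": {"type": "string", "values": ["demo"]}},
    "variables": [
        {"name": "obs", "type": "obs_t",
         "dimensions": ["time", "station"],  # assumed key, not in the message
         "attributes": {"units": {"type": "string", "values": ["K"]}}},
    ],
    "data": {"obs": []},  # left empty here; the data part is optional
}

# Serialize only the metadata, then read it back to confirm list order survives.
header = json.loads(json.dumps(metadata_only(group)))
print([dim["name"] for dim in header["dimensions"]])
```

JSON arrays keep their order, while the members of a JSON object are not guaranteed to, which is why the ordered pieces are lists and only the attribute sets are objects. Dropping "data" before serializing matches the goal mentioned in the thread of sending just the metadata for quick remote inspection.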
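
The two dataset layouts compared in the thread (positional array versus keyed object with nested data) can be related by a small conversion. The following Python sketch is only an illustration: the function names nest and to_keyed are hypothetical, the output keys mirror the keyed example in the thread, and "rank" is dropped on the assumption that it can be recovered from the length of "dimensions".

```python
from math import prod


def nest(flat, shape):
    """Reshape a flat list into nested lists (row-major) for the given shape."""
    if prod(shape) != len(flat):
        raise ValueError("shape does not match the number of values")
    if len(shape) <= 1:
        return list(flat)
    step = prod(shape[1:])  # number of values in each sub-array
    return [nest(flat[i * step:(i + 1) * step], shape[1:])
            for i in range(shape[0])]


def to_keyed(positional):
    """Convert {"name": [kind, dtype, rank, shape, flat_data]} to keyed objects."""
    keyed = {}
    for name, (kind, dtype, rank, shape, flat) in positional.items():
        if rank != len(shape):
            raise ValueError("rank disagrees with the dimensions array")
        keyed[name] = {
            "object_type": kind,
            "dtype": dtype,
            "dimensions": list(shape),
            "data": nest(flat, shape),
        }
    return keyed


positional = {
    "dset1": ["dataset", "STAR_INT32", 2, [3, 4],
              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]]
}
keyed = to_keyed(positional)
print(keyed["dset1"]["data"])
```

Keeping "rank" as a redundant field, as suggested in one of the replies, would be an equally reasonable choice; the check inside to_keyed shows the kind of consistency test that redundancy makes possible.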