Re: orthogonality (was Re: New attempt)

I think the word dataset is causing trouble. There are at least three potential meanings for this word in the context of THREDDS:

1) an entity that is considered as a unit by human beings

2) an entity that can be operated on as a unit by the THREDDS API

3) an entity that can be operated on as a unit by a data access protocol

Right now, only the entities described by "access" tags meet all of 1, 2, and 3.

The tags "dataset" and "collection" both describe entities that meet only 1 and 2. Thus I agree with Benno that there is not a very meaningful distinction between them (and I reconsider my listing of them as orthogonal concepts in my previous message).

I wonder if it would be a good idea to merge these concepts and use a less loaded word, say "entry", to refer to an entity that has meaning to THREDDS and to end users, but not to a data access protocol, i.e.

<catalog>
<service name="X"/>
<service name="Y"/>
...

<entry name="my_dataset">

   <metadata name="global-metadata" url="..."/>
   <access name="global-X-access"/>

   <entry name="monthly-data">
     <metadata name="monthly-metadata" url="..."/>
     <access name="X-with-COARDS" serviceType="X" url="..."/>
     <access name="X-with-no-COARDS" serviceType="X" url="..."/>
     <access name="X-flattened-to-2D" serviceType="X" url="http://..."/>
     <access name="Y" serviceType="Y" url="..."/>
     ....
   </entry>

</entry>

</catalog>
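
To make the nesting concrete, here is a minimal sketch of how a client might walk a catalog written in the proposed "entry" notation and list every access point together with the entry it belongs to. It is only an illustration under stated assumptions: the example.org URLs are placeholders, the serviceType and url attributes on "global-X-access" are filled in for the sake of the example, and the helper name list_access is hypothetical, not part of any THREDDS API.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog in the proposed "entry" notation.
# All URLs are placeholders; the attributes on global-X-access are assumed.
CATALOG = """
<catalog>
  <service name="X"/>
  <service name="Y"/>
  <entry name="my_dataset">
    <metadata name="global-metadata" url="http://example.org/meta/global"/>
    <access name="global-X-access" serviceType="X" url="http://example.org/x/all"/>
    <entry name="monthly-data">
      <metadata name="monthly-metadata" url="http://example.org/meta/monthly"/>
      <access name="X-with-COARDS" serviceType="X" url="http://example.org/x/coards"/>
      <access name="Y" serviceType="Y" url="http://example.org/y/monthly"/>
    </entry>
  </entry>
</catalog>
"""


def list_access(node, path=()):
    """Yield (entry path, access name, service type) for each access element.

    Entries may nest to any depth, so the walk is recursive.
    """
    for entry in node.findall("entry"):
        entry_path = path + (entry.get("name"),)
        # Access elements attached directly to this entry.
        for access in entry.findall("access"):
            yield "/".join(entry_path), access.get("name"), access.get("serviceType")
        # Descend into child entries.
        yield from list_access(entry, entry_path)


root = ET.fromstring(CATALOG)
access_points = list(list_access(root))
for entry_path, name, service in access_points:
    print(f"{entry_path}: {name} ({service})")
```

The point of the sketch is that an entry at any level can carry its own access elements, so a service that applies to the whole grouping and one that applies only to a nested entry can both be expressed without separate "dataset" and "collection" tags.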

- Joe


Daniel Holloway wrote:

Benno Blumenthal wrote:


John Caron wrote:


Much harder question is the distinction between a dataset and a collection, since a dataset is a collection of data. I have conceptualized it as follows: a dataset is something that can be selected, and then it is processed in a protocol-dependent way. A collection is a protocol-independent mechanism for grouping datasets.

I think this is what is getting us into trouble. The concept of a dataset should be independent of the services available for it: a dataset served from two different servers could very well have different services/protocols available, depending on the server (the aggregation server converts collections to datasets, for example). Yet from the THREDDS/educational point of view, it is the same object.




I agree with this as well. I've been trying to reconcile how a catalog might look for a particular multifile 'dataset' which has both WMS and DODS access available for it. For WMS (for multifile) datasets the access point would be at the collection level, while for 'non-aggregated' datasets the DODS access would be lower than the collection level, at the THREDDS dataset level. It seems that the concept of a dataset resides more at the collection level; maybe the service access binding is too tightly coupled to the dataset concept in the current draft.

Dan


Benno


--
Dr. M. Benno Blumenthal          benno@xxxxxxxxxxxxxxxx
International Research Institute for climate prediction
Lamont-Doherty Earth Observatory of Columbia University
Palisades NY 10964-8000                  (845) 680-4450








--
Joe Wielgosz
joew@xxxxxxxxxxxxx / (707)826-2631
---------------------------------------------------
Center for Ocean-Land-Atmosphere Studies (COLA)
Institute for Global Environment and Society (IGES)
http://www.iges.org

