Workshop participants were asked to provide a one-paragraph description of affiliation, and of related THREDDS activities, prior to the workshop. These were valuable to all of the participants because of the diversity of the THREDDS partners. During the course of the workshop, time was allocated for the partners to present thoughts on how their system could be integrated into THREDDS. This consisted of five-minute presentations by each participant.
Participants at the workshop include:
Data providers have collections of datasets and are willing to make them available online. Clients are software that accesses the data. Discovery centers provide browse and search services for multiple data collections. Third-party providers create logical dataset collections and additional metadata.
Types of data: archived (static catalogs); real-time (catalogs updated by polling or notification); or dynamically generated on request.
THREDDS' present technology focus is on acquiring real data (not just pictures of data), creating a framework for loosely coupled systems, developing "human in the loop" automation tools, and metadata standards. Future development will include making choices about communications mechanisms. Phase One development, which is drawing to a close, has focused on data catalog creation. Developers want feedback from providers using the tools presently in place.
Granularity issues at the catalog level affect the number and size of catalogs and how they are included in Discovery Centers. Catalog updating frequency has also been an area of concern.
Phase Two will focus on catalog servers and augmented metadata for discovery centers. Phase Three's focus will be data semantics, tools that allow data classification, and creating a collaborative "knowledge building environment" (KBE).
Some of the issues facing developers include:
The catalogs are hierarchical collections of datasets requiring minimal metadata to keep barriers to entry low.
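A hierarchical catalog with minimal metadata can be pictured as a tree in which every node needs only a name, and leaf datasets also carry an access address. The Python sketch below is illustrative only: the class name `CatalogNode`, its fields, and the `example.invalid` URLs are assumptions made for this example, not part of any THREDDS specification.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class CatalogNode:
    """A node in a hierarchical catalog: a collection or a leaf dataset.

    Hypothetical structure for illustration; not a THREDDS data model.
    """
    name: str                       # the only metadata every node must supply
    url: Optional[str] = None       # assumed field, set on leaf datasets only
    children: List["CatalogNode"] = field(default_factory=list)

    def datasets(self) -> Iterator["CatalogNode"]:
        """Yield every leaf dataset below this node, depth first."""
        if not self.children:
            if self.url is not None:
                yield self
            return
        for child in self.children:
            yield from child.datasets()


# A small two-level catalog with placeholder addresses.
catalog = CatalogNode("Example collection", children=[
    CatalogNode("Model output", children=[
        CatalogNode("run-00", url="http://example.invalid/model/run-00"),
    ]),
    CatalogNode("Radar", url="http://example.invalid/radar"),
])

names = [d.name for d in catalog.datasets()]
print(names)
```

Keeping `name` as the only required field reflects the low barrier to entry described above; richer metadata could be attached later without changing the tree shape.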
There is no THREDDS data object model. THREDDS focuses on metadata. Other long-term technical goals are to use existing and emerging standards for efficient handling of large datasets, keeping things as simple and clean as possible. THREDDS client software is in Java and may eventually be ported to C.
Goal: to automate catalog generation as much as possible
Because catalog generation is tedious when more than a handful of datasets are involved, a THREDDS goal is to automate the generation as much as possible.
A first-generation catalog generator creating a Unidata model data catalog is currently running on the UCAR computer "motherlode." While functional, it is difficult to maintain. A Java application is currently being developed that scans local directories and can generate THREDDS catalogs or an aggregation server config file. It can also create catalogs from GrADS servers. Its current weaknesses are that it requires human setup, can scan only local files, and does not "know" anything about the data.
For the short term, plans are to expand the directives language, do some cleanup, and improve the handling of GDS 1.2 XML catalogs. Long-term plans include building a DODS server crawler, building a user interface (to build XML input files and create additional metadata), and determining how XML schemas will impact catalog generation efforts.
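The directory-scanning approach can be sketched in a few lines. The example below is written in Python for brevity, not in Java, and it is not the generator described above: the function name `scan_to_catalog`, the element names `catalog`, `collection`, and `dataset`, and the base URL are all assumptions chosen for illustration and do not follow the actual THREDDS catalog schema.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def scan_to_catalog(root_dir: str, base_url: str) -> ET.Element:
    """Mirror a local directory tree as nested XML elements.

    Directories become <collection> elements and files become <dataset>
    elements. Element and attribute names are hypothetical.
    """
    def build(path: str, rel: str) -> ET.Element:
        node = ET.Element("collection", name=os.path.basename(path) or path)
        for entry in sorted(os.listdir(path)):
            full = os.path.join(path, entry)
            child_rel = f"{rel}/{entry}" if rel else entry
            if os.path.isdir(full):
                node.append(build(full, child_rel))
            else:
                # Only the file name is recorded: like the tool described
                # above, this scanner knows nothing about the data itself.
                ET.SubElement(node, "dataset", name=entry,
                              url=f"{base_url}/{child_rel}")
        return node

    catalog = ET.Element("catalog")
    catalog.append(build(os.path.abspath(root_dir), ""))
    return catalog


# Demonstration on a throwaway directory with two empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "radar"))
    open(os.path.join(tmp, "radar", "a.nc"), "w").close()
    open(os.path.join(tmp, "readme.txt"), "w").close()
    xml_text = ET.tostring(
        scan_to_catalog(tmp, "http://example.invalid/data"),
        encoding="unicode",
    )

print(xml_text)
```

The sketch shares the weaknesses listed above: someone has to choose the root directory and base URL by hand, and only local files are scanned.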
The problem with creating metadata for a real-time dataset is that the dataset
changes so rapidly that the metadata quickly becomes inaccurate. To solve
this problem, the Dynamic Catalog Generator is invoked on command to generate
metadata by scanning directory structures in real time to create catalogs. One
real-time dataset, the NEXRAD radar feed, generates 2.8 million products/week
or about 5 products/sec. The Radar feed was used as a prototype to demonstrate
that the Dynamic Catalog Generator could handle this kind of problem.
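The quoted product rate can be checked with a short calculation. The only input is the 2.8 million products/week figure given above; the variable names are chosen here for illustration.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60   # 604,800 seconds

products_per_week = 2_800_000          # NEXRAD feed figure quoted in the text
products_per_second = products_per_week / SECONDS_PER_WEEK

print(f"{products_per_second:.2f} products/sec")   # prints "4.63 products/sec"
```

The result, about 4.6 products per second, is consistent with the rounded figure of about 5 products/sec.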
Other high-volume real-time datasets are being considered as candidates for
the Dynamic Catalog Generator. These datasets may present different problems
from the radar dataset: for example, METAR datasets have reports embedded
in bulletins, and FSL's MADIS has reports in NetCDF files.
Random breakout groups formed to review topics and issues.
THREDDS Collaboration Tools - Chris Klaus
Following a brief demo of NSDL's WIKI site, the group agreed to try it out for collaboration purposes.
ACTION: Chris will follow up with the group with instructions for accessing the software.
All participants had the opportunity to recommend steps for better integrating their projects with THREDDS and to suggest tasks for Unidata-THREDDS to pursue. These next steps are included in Participant Input and Discussion.