-------- Original Message --------
Subject: Fwd: [uaf_tech] Re: time start / end
Date: Fri, 11 Jun 2010 09:58:50 -0700
From: Bob Simons <Bob.Simons@xxxxxxxx>
Organization: NOAA/ERD
To: John Caron <caron@xxxxxxxxxxxxxxxx>

In case you aren't on the uaf_tech mail list, this is my pitch for adding a subscription service to THREDDS. (I understand you are very busy and this is likely low priority.)

P.S. I also wonder if there is another solution. Why does THREDDS need metadata from the catalog.xml files to determine a dataset's start and end time? Doesn't it already have this information from when it does the dataset's aggregation? For that matter, it seems like a lot of the metadata we put in the catalog.xml files could be gathered by THREDDS from the dataset's metadata and data. Could THREDDS be modified to use the dataset's metadata and data so we wouldn't have to duplicate the information in the catalog.xml files? Thank you.

-------- Original Message --------
Subject: [uaf_tech] Re: time start / end
Date: Thu, 10 Jun 2010 10:42:03 -0700
From: Bob Simons <Bob.Simons@xxxxxxxx>
Organization: NOAA/ERD
To: _OAR PMEL UAF Tech List <uaf_tech@xxxxxxxx>

My $0.02:

The Ideal - The ideal situation is to have Start and End be specific dates and times, e.g.,
    Start: 2010-06-03 12:00:00Z
    End: 2010-06-10 12:00:00Z
and to have them always be perfectly up-to-date.

Statement of Intent - Something like
    Start: present - 7 days
    End: present
is pretty good as is, because it is a statement of intent.

Not Really Right - If that gets translated to instantaneous values, e.g.,
    Start: 2010-06-03 12:04:57Z
    End: 2010-06-10 12:04:57Z
then it is less desirable. It implies accuracy and precision, but isn't correct (e.g., perhaps the dataset is actually updated just once every morning).

Polling - Having a downstream server (e.g., RAMADDA or ERDDAP) frequently check with the TDS to find out the actual Start and End times isn't ideal. The extreme cases are:
* The downstream server polls infrequently, and so is usually far out-of-date.
* The downstream server polls frequently, and so is closer, but never perfectly up-to-date.
The problem with polling is that if lots of downstream servers poll hundreds of datasets frequently, it becomes a burden on the TDS. So polling is never an ideal solution. (Note that one implementation of polling is RSS.)

Subscriptions - It would be great if the TDS had a subscription service, so that the TDS would automatically send an email or ping a client-specified URL whenever a dataset changed. This is *much* more efficient than polling, and the downstream servers are notified within seconds of a dataset changing. With subscriptions, a downstream server could display accurate (up-to-date) and precise Start and End times. And people would find other uses for a general-purpose subscription system.

As an example of how subscriptions are useful, ERDDAP has:
* A subscription system (http://coastwatch.pfeg.noaa.gov/erddap/information.html#subscriptions and, more specifically, http://coastwatch.pfeg.noaa.gov/erddap/subscriptions/add.html)
* A flag system (http://coastwatch.pfeg.noaa.gov/erddap/download/setup.html#flag)

If one ERDDAP is pointing to a dataset at a remote ERDDAP, it subscribes to the remote ERDDAP's dataset (humans have to confirm the subscriptions). Whenever the remote dataset changes, the remote ERDDAP contacts a special URL on the first ERDDAP to set a flag, which indicates that a specific dataset should be reloaded/checked because it has changed. As soon as possible, the first ERDDAP reloads the dataset. So the two ERDDAPs stay in sync, usually within a few seconds.
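To make the flag mechanism concrete, here is a minimal Python sketch of the change-notification ping described above. The setDatasetFlag.txt path and the flagKey parameter are modeled on ERDDAP's flag system (see the setup link above), but the exact URL, host, dataset ID, and key here are illustrative assumptions, not a definitive API.

    # Hypothetical notifier: the data server calls this once, right after a
    # dataset changes, instead of subscribers polling on a schedule.
    import urllib.request

    def notify_subscriber(base_url, dataset_id, flag_key):
        # Set the remote flag so the subscriber reloads this one dataset ASAP.
        url = (base_url + "/setDatasetFlag.txt"
               + "?datasetID=" + dataset_id + "&flagKey=" + flag_key)
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status == 200

    # Example call (hypothetical host, dataset ID, and key):
    notify_subscriber("http://downstream.example.gov/erddap", "ncomRegion1", "secretKey1234")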
It would be great if the TDS could offer a similar subscription system, so that other TDS installations, ERDDAP, RAMADDA, and other clients could be notified immediately whenever a specific TDS dataset changes.

On 6/10/2010 9:03 AM, Kevin O'Brien wrote:
Below is a bounced message from John Caron...

------------------------------------------------------------------------
Subject: BOUNCE uaf_tech@xxxxxxxx: Non-member submission from [John Caron <caron@xxxxxxxxxxxxxxxx>]
From: uaf_tech-owner@xxxxxxxx
Date: Thu, 10 Jun 2010 05:13:15 -0700
To: uaf_tech-approval@xxxxxxxx

Date: Thu, 10 Jun 2010 06:13:06 -0600
From: John Caron <caron@xxxxxxxxxxxxxxxx>
Subject: Re: [uaf_tech] Next UAF telcon: June 10th, 12:30pm EDT
In-reply-to: <AANLkTimmSgUqJCXhWj91GL1-7Dkdj1_aaI17xYNWbQQN@xxxxxxxxxxxxxx>
To: Rich Signell <rsignell@xxxxxxxx>
Cc: Ted Habermann <ted.habermann@xxxxxxxx>, Steve Hankin <Steven.C.Hankin@xxxxxxxx>, David Neufeld <David.Neufeld@xxxxxxxx>, _OAR PMEL UAF Tech List <uaf_tech@xxxxxxxx>, Ethan Davis <edavis@xxxxxxxxxxxxxxxx>, support-thredds@xxxxxxxxxxxxxxxx

Hi Rich, et al.:

I agree that modifying NcML in the TDS when files arrive is not a viable solution. You need to use a scan element for this, although we are replacing <scan> elements with <collection> elements (in FMRC right now; this will be extended to other aggregations in 4.3).

1) Specifying the time range in the catalog for this case is possible. Here's how we do it on motherlode:

    <timeCoverage>
      <end>present</end>
      <duration>7 days</duration>
    </timeCoverage>

This means that the starting time is "present" minus 7 days. The TDS generates the actual ISO dates in the catalog, e.g., at this moment:

    TimeCoverage:
      Start: 2010-06-03 12:04:57Z
      End: 2010-06-10 12:04:57Z
      Duration: 7 days

A bit more detail at:
http://www.unidata.ucar.edu/projects/THREDDS/tech/catalog/v1.0.2/InvCatalogSpec.html#timeCoverageType

2) One can also generate time ranges from the filename; see "Adding timeCoverage" in
http://www.unidata.ucar.edu/projects/THREDDS/tech/tds4.2/reference/DatasetScan.html
This is used when you have files with the starting time embedded in the filename and a known duration.

3) We are moving towards automatic generation of the time coverage. As Rich mentioned, we do that now in the FMRC, and we will try to extend it to other aggregations where the time coordinate can be extracted.

Not sure if I covered all the issues.

John
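For readers following along, here is a small Python sketch (not TDS code) of the resolution John describes: a relative timeCoverage of end="present" with a 7-day duration becomes a concrete Start/End pair at the moment the catalog is generated. This is also why Bob's "Not Really Right" caveat applies: the resolved instants are only a snapshot.

    # Sketch of resolving end="present", duration="7 days" into concrete times.
    from datetime import datetime, timedelta, timezone

    def resolve_time_coverage(duration_days):
        # "present" is evaluated at catalog-generation time.
        end = datetime.now(timezone.utc)
        start = end - timedelta(days=duration_days)
        return start, end

    start, end = resolve_time_coverage(7)
    print(start.strftime("%Y-%m-%d %H:%M:%SZ"))  # e.g., Start: 2010-06-03 12:04:57Z
    print(end.strftime("%Y-%m-%d %H:%M:%SZ"))    # e.g., End:   2010-06-10 12:04:57Z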
Rich Signell wrote:

Guys,
Sorry to send this twice, but I wanted to cc John Caron and Ethan Davis to allow them to comment.
-Rich

On Wed, Jun 9, 2010 at 6:02 PM, Rich Signell <rsignell@xxxxxxxx> wrote:

Ted,
With time aggregations, the virtual dataset is served dynamically via THREDDS as new data arrives, without modifying the underlying catalog that specifies the aggregation. We don't want to be modifying NcML in the catalog every time a file arrives. So it seems we have two choices:
(1) Have the crawler actually read the last time value; since the data is CF-compliant, this is easy (there is a NetCDF-Java function for this). I think both ncISO and RAMADDA already do this.
(2) Ask Unidata to modify the TDS so that it automatically generates the stop time as THREDDS metadata. It already does this for FMRC aggregations. On the plus side, this ensures that we get the right time without reading the time values. The disadvantage is that it would only work for TDS-served data.
-Rich

On Wed, Jun 9, 2010 at 5:42 PM, Ted Habermann <ted.habermann@xxxxxxxx> wrote:

Rich et al.,
Seems to me our first choice should be to use an existing standard for describing time periods. In my experience the most commonly used is ISO 8601. Describing time periods of known duration is straightforward if we know the starting point. For example, a period with a duration of 7 days starting today would be: 20100609/P7D.

There are probably a couple of ways to express this explicitly in NcML:

    <attribute name="time_coverage_start" value="2010-06-09"/>
    <attribute name="time_coverage_duration" value="P7D"/>

or it may make sense to just calculate the end time and write it into the file:

    <attribute name="time_coverage_start" value="2010-06-09"/>
    <attribute name="time_coverage_end" value="2010-06-16"/>

If we are dealing with collection-level NcML (?), one could say:

    <attribute name="time_coverage_start" value="present"/>
    <attribute name="time_coverage_duration" value="P7D"/>

I'm not sure offhand how this would get translated to ISO. Maybe:

    <gmd:temporalElement>
      <gmd:EX_TemporalExtent>
        <gmd:extent>
          <gml:TimePeriod gml:id="t3">
            <gml:beginPosition indeterminatePosition="now"/>
            <gml:endPosition>P7D</gml:endPosition>
          </gml:TimePeriod>
        </gmd:extent>
      </gmd:EX_TemporalExtent>
    </gmd:temporalElement>

Ted
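As a companion to Ted's first NcML pair, here is a small Python sketch of turning time_coverage_start plus time_coverage_duration into an explicit time_coverage_end, matching his 2010-06-09 + P7D = 2010-06-16 example. It handles only simple day-valued durations like "P7D", not the full ISO 8601 duration grammar, and the function name is invented for illustration.

    # Resolve Ted's (start, duration) attribute pair into an explicit end date.
    # Only simple PnD day durations are handled here, not full ISO 8601.
    import re
    from datetime import date, timedelta

    def end_from_duration(start_iso, duration_iso):
        match = re.fullmatch(r"P(\d+)D", duration_iso)
        if match is None:
            raise ValueError("unsupported duration: " + duration_iso)
        start = date.fromisoformat(start_iso)
        return (start + timedelta(days=int(match.group(1)))).isoformat()

    print(end_from_duration("2010-06-09", "P7D"))  # -> 2010-06-16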
On 6/9/2010 12:34 PM, Steve Hankin wrote:

David Neufeld wrote:
Hi Rich, Steve,
I think if we move toward a model where metadata is handled as a service, as opposed to a static file, this problem starts to go away.

[Steve:] Agree in principle. I have argued this same point of view with Ted -- that we should not insist that metadata be inserted into files if that metadata is derivable from information already contained in the file. Ideas for implementing this approach? The most appealing to me is that the TDS itself would generate data discovery metadata such as

    time_coverage_start = "present minus 30 days"; // a running archive
    time_coverage_end = "present plus 10 days";    // a forecast

based upon coordinates and use metadata found inside the dataset, and perhaps some new NcML directives that govern the "metadata service". But the questions remain: who would do this work, and when? And what should UAF do in the interim (i.e., now)?
- Steve

[David:] So, for example, if we generate metadata dynamically and it contains the standard static attributes alongside dynamically retrieved values for geographic and temporal bounds, then we're in good shape at the catalog level. There is still the issue of how often to harvest the metadata into other clearinghouses like RAMADDA or Geonetwork, but that can be left for the portal provider to determine.
Dave

On 6/9/2010 10:39 AM, Steve Hankin wrote:

Rich Signell wrote:
UAF Folks,
I can't make the 12:30 ET / 9:30 PT meeting tomorrow, but here are my two issues:

[Steve:] Hi Rich, Sorry you cannot make it. With that in mind, I have started the conversation here by email ...

[Rich:] 1) How to handle temporal metadata for time-aggregated datasets that are changing every day (or perhaps every 15 minutes for the HF Radar measurements). I got bit by this when I did a temporal/geospatial search in RAMADDA for UAF data in the Gulf of Mexico during the last week and turned up no datasets. It should have turned up the NCOM Region 1 model data, the HF radar data, and the USGS COAWST model results. I'm pretty sure the problem is that RAMADDA harvested the data from the clean catalog more than a week ago, so the "stop dates" in the metadata database are older than one week ago. How should this best be fixed?

[Steve:] Might this be best addressed by using the Unidata Data Discovery Attribute recommendations (http://www.unidata.ucar.edu/software/netcdf-java/formats/DataDiscoveryAttConvention.html)? They offer the global attribute:

    time_coverage_end = "present"

Arguably, within UAF we should insert such global attributes into the relevant datasets and also work to communicate the need back to the data providers to do so on their own THREDDS servers. An alternative to consider is putting this information into the THREDDS metadata instead of into the NcML of the dataset. btw: A seeming omission in the Unidata recommendations is any way to represent "3 months ago" as the start time. A start time of this style is pretty common in operational outputs.

[Rich:] 2) How to represent FMRC data. If we scan a catalog with a Forecast Model Run Collection, we currently get hundreds of datasets, because the FMRC automatically produces datasets for the daily forecasts as well as the "Best Time Series" dataset that most people are interested in. In the latest version of the THREDDS Data Server (4.2 beta), the provider can specify that they only want the best time series dataset to be exposed. This will help significantly, but it will take a while to get everybody with FMRCs retrofitted. I will bring this up on the Model Data Interoperability Google Group.

[Steve:] Might be best to hold off on this topic until you are on the phone, since you are our resident expert. No?
- Steve
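Tying Rich's choice (1) to Steve's point that this metadata is derivable from the file itself, here is a hedged Python sketch of reading the last value of a CF time coordinate, using the netCDF4 Python library rather than the NetCDF-Java routine Rich mentions. The variable name "time" and the OPeNDAP URL are illustrative assumptions; a real crawler would locate the time coordinate via its CF attributes.

    # Read the last value of a CF "time" coordinate to get the actual stop time.
    import netCDF4

    def last_time_value(dataset_url):
        with netCDF4.Dataset(dataset_url) as nc:
            time_var = nc.variables["time"]  # assumed coordinate name
            # num2date applies the CF units string, e.g. "days since 1970-01-01".
            return netCDF4.num2date(time_var[-1], time_var.units)

    # Hypothetical OPeNDAP endpoint served by a TDS:
    print(last_time_value("http://server.example.gov/thredds/dodsC/some/aggregation"))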
--
==== Ted Habermann ===========================
Enterprise Data Systems Group Leader
NOAA, National Geophysical Data Center
V: 303.497.6472  F: 303.497.6513
"I entreat you, I implore you, I exhort you, I challenge you: To speak with conviction. To say what you believe in a manner that bespeaks the determination with which you believe it. Because contrary to the wisdom of the bumper sticker, it is not enough these days to simply QUESTION AUTHORITY. You have to speak with it, too." Taylor Mali, www.taylormali.com
==== Ted.Habermann@xxxxxxxx ==================

--
Dr. Richard P. Signell   (508) 457-2229
USGS, 384 Woods Hole Rd.
Woods Hole, MA 02543-1598

--
Kevin O'Brien
UW/JISAO Research Scientist
NOAA/PMEL/TMAP
206-526-6751
http://www.pmel.noaa.gov
"The contents of this message are mine personally and do not necessarily reflect any position of the Government or the National Oceanic and Atmospheric Administration."

--
Sincerely,

Bob Simons
IT Specialist
Environmental Research Division
NOAA Southwest Fisheries Science Center
1352 Lighthouse Ave
Pacific Grove, CA 93950-2079
Phone: (831) 658-3205
Fax: (831) 648-8440
Email: bob.simons@xxxxxxxx

The contents of this message are mine personally and do not necessarily reflect any position of the Government or the National Oceanic and Atmospheric Administration.

<>< <>< <>< <>< <>< <>< <>< <>< <><