Hi all,

It has been a while since I worked on the CrawlableDataset code. There is
some documentation, but it's a bit rough and was never really announced or
linked in to the rest of the documentation.

A quick summary: Behind the scenes of the datasetScan element, the
thredds.cataloggen.CatalogBuilder interface is used to build catalogs. It
uses the thredds.crawlabledataset.CrawlableDataset interface to scan for
datasets. The default CrawlableDataset implementation is
thredds.crawlabledataset.CrawlableDatasetFile. The only other
implementation in the TDS distribution is
thredds.crawlabledataset.CrawlableDatasetDods.

Simon, I'm curious about the changes you've made as well. Are they in
CrawlableDatasetDods? Since the old OPeNDAP servers don't have a standard
directory interface, the code ends up scraping HTML, and that gets ugly and
is hard to make general. So I wouldn't be surprised if that code needs
tweaking depending on the OPeNDAP server it is crawling.

Anyway, here are some of the docs we have:

http://www.unidata.ucar.edu/projects/THREDDS/tech/cataloggen/devel/architecture.html
http://www.unidata.ucar.edu/projects/THREDDS/tech/cataloggen/devel/userImplementation.html
http://www.unidata.ucar.edu/projects/THREDDS/tech/cataloggen/devel/datasetScanElement.html

Javadoc for thredds.cataloggen and thredds.crawlabledataset is also
available in our complete javadoc:

http://www.unidata.ucar.edu/software/netcdf-java/v4.0/javadocAll/index.html

Ethan

Rich Signell wrote:
> Simon,
>
> This sounds extremely useful and I'd love to give it a try.
>
> Can you please tell us what the "trivial" changes are to NetCDF-Java?
>
> And do you have a real-life example of the catalog below that works
> with publicly available OpenDAP data?
>
> Thanks,
> Rich
>
> On Wed, Apr 8, 2009 at 8:28 PM, <Simon.Pigot@xxxxxxxx> wrote:
>> Hi Pauline,
>>
>> The following works ok for us (as an example - non-essential details
>> removed):
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>> <catalog name="YOUR SITE OPeNDAP Catalog"
>>     xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
>>     xmlns:xlink="http://www.w3.org/1999/xlink">
>>
>>   <service name="yoursiteopendap" serviceType="OpenDAP"
>>       base="http://www.yoursite.com/dods/nph-dods/dods-data/"/>
>>   <datasetScan name="climatology-netcdf" path="climatology-netcdf"
>>       location="http://www.yoursite.com/dods/nph-dods/dods-data/climatology-netcdf">
>>     <serviceName>yoursiteopendap</serviceName>
>>     <crawlableDatasetImpl
>>         className="thredds.crawlabledataset.CrawlableDatasetDods" />
>>   </datasetScan>
>>   <datasetScan name="bluelink" path="bluelink"
>>       location="http://www.yoursite.com/dods/nph-dods/dods-data/bluelink">
>>     <serviceName>yoursiteopendap</serviceName>
>>     <crawlableDatasetImpl
>>         className="thredds.crawlabledataset.CrawlableDatasetDods" />
>>   </datasetScan>
>> </catalog>
>>
>> I'm not sure if it's all documented somewhere - I worked it out the slow way
>> by poking around in the netcdf java code and hunting through the archives of
>> the thredds mailing list. There are also some trivial changes you need to
>> make to the code (in netcdf-java) to filter out some unwanted artifacts
>> created when the scan picks through the html from the OpenDAP server -
>> otherwise you end up with some strange, non-functional things in your
>> catalog. Maybe there is a better way to do the above?
>>
>> By way of introduction, we want this sort of catalog to work as part of a
>> thredds metadata harvester I'm adding to GeoNetwork which produces ISO19115
>> metadata records and ISO19119 records for thredds services.
>> It's nearly at the stage where it is working reliably but there are a few
>> more issues I need to solve and I'm still learning about Thredds :-)
>>
>> Cheers and I hope this helps,
>> Simon
>>
>> ________________________________________
>> From: thredds-bounces@xxxxxxxxxxxxxxxx [thredds-bounces@xxxxxxxxxxxxxxxx] On
>> Behalf Of Pauline Mak [Pauline.Mak@xxxxxxxxxxx]
>> Sent: Thursday, 9 April 2009 8:56 AM
>> To: thredds@xxxxxxxxxxxxxxxx
>> Subject: [thredds] Running THREDDS on top of old OPeNDAP servers
>>
>> Hi all,
>>
>> I'm figuring out ways to serve data using THREDDS on top of old OPeNDAP
>> servers. I'm aware that you can configure datasets based on a URL, but
>> that's for a single file... (correct me if I'm wrong!) However, are
>> there ways to apply this to a directory? Sort of like a datasetScan +
>> filters for a directory URL? When poking through the THREDDS catalog
>> XSD, I found a crawlableDatasetImpl element. Is that the sort of thing
>> I need to look at?
>>
>> Thanks,
>>
>> -Pauline.
>>
>> --
>> Pauline Mak
>>
>> ARCS Data Services
>> Ph: (03) 6226 7518
>> Email: pauline.mak@xxxxxxxxxxx
>> Jabber: pauline.mak@xxxxxxxxxxx
>> http://www.arcs.org.au/
>>
>> TPAC
>> Email: pauline.mak@xxxxxxxxxxx
>> http://www.tpac.org.au/
>>
>> _______________________________________________
>> thredds mailing list
>> thredds@xxxxxxxxxxxxxxxx
>> For list information or to unsubscribe, visit:
>> http://www.unidata.ucar.edu/mailing_lists/
>

--
Ethan R. Davis                                Telephone: (303) 497-8155
Software Engineer                             Fax:       (303) 497-8690
UCAR Unidata Program Center                   E-mail:    edavis@xxxxxxxx
P.O. Box 3000
Boulder, CO  80307-3000                       http://www.unidata.ucar.edu/
---------------------------------------------------------------------------
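
For anyone following the thread: the thread mentions that scraping an old
OPeNDAP server's HTML listing picks up unwanted links, but it does not say
what the actual changes to netcdf-java were. The sketch below is therefore
only an illustration of the general idea, not the code in
CrawlableDatasetDods. The class ListingFilter, its method names, and the
rules for what counts as an artifact are all hypothetical, and the rules
would likely need adjusting per server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of netcdf-java or the TDS.
final class ListingFilter {

    // Finds double-quoted href values in an HTML directory listing.
    private static final Pattern HREF =
            Pattern.compile("href=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Assumed rules: keep only relative links that point below the listed directory.
    static boolean isDataEntry(String href) {
        if (href.isEmpty()) {
            return false;
        }
        if (href.startsWith("?") || href.startsWith("#")) {
            return false; // column-sort links and page anchors
        }
        if (href.startsWith("/") || href.startsWith("..")) {
            return false; // parent-directory and server-root links
        }
        if (href.contains("://") || href.startsWith("mailto:")) {
            return false; // links that leave the server or open a mail client
        }
        return true;
    }

    // Returns the hrefs that pass isDataEntry, in page order, without duplicates.
    static List<String> extractEntries(String html) {
        List<String> entries = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String href = m.group(1);
            if (isDataEntry(href) && !entries.contains(href)) {
                entries.add(href);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Made-up listing fragment, not output from a real server.
        String html = "<a href=\"?C=N;O=D\">Name</a>"
                + "<a href=\"/dods/\">Parent Directory</a>"
                + "<a href=\"sst.nc.html\">sst.nc</a>"
                + "<a href=\"monthly/\">monthly/</a>";
        System.out.println(extractEntries(html));
    }
}
```

In a real CrawlableDataset implementation, a check like isDataEntry would
sit wherever the child datasets are listed, so that only the surviving
entries end up in the generated catalog.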