Re: [thredds] How to download bulk datasets?

I too wish there were a simple link on the server side that would let our 
users download all files of a collection with wget. 

I think every user may at some point want to do a bulk file download from a 
thredds server, and I think the provider should carry the burden of offering 
a simple link for bulk download of all granules. 

We can do some configuration or add a servlet for this, like Heiko has done, 
although I think it would be a nice-to-have feature directly in the TDS 
software. It seems to me that this could be implemented as a dynamic URL at 
the collection level that returns a list of HTTP download URLs ('fileServer') 
of the files, for the simplest case.
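
As a rough illustration of the simplest case described above, the following 
Python sketch turns a catalog.xml document into a list of 'fileServer' 
download URLs. It is only a sketch under assumptions: the function name 
file_urls, the SAMPLE catalog and the example.org host are made up for 
illustration, and it assumes the catalog uses the InvCatalog v1.0 namespace 
and declares a service of type HTTPServer. Nested catalogRef entries are not 
followed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Namespace of THREDDS InvCatalog 1.0 documents (assumed for this sketch).
NS = "{http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0}"

# Hypothetical, hand-written catalog used only to exercise the function.
SAMPLE = """\
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <service name="all" serviceType="Compound" base="">
    <service name="odap" serviceType="OPENDAP" base="/thredds/dodsC/"/>
    <service name="http" serviceType="HTTPServer" base="/thredds/fileServer/"/>
  </service>
  <dataset name="data" ID="data">
    <dataset name="a.nc" urlPath="data/a.nc"/>
    <dataset name="b.nc" urlPath="data/b.nc"/>
  </dataset>
</catalog>
"""


def file_urls(catalog_xml, catalog_url):
    """Return HTTP download URLs for every dataset that has a urlPath."""
    root = ET.fromstring(catalog_xml)

    # Find the base path of the first HTTPServer service, if any.
    base = None
    for service in root.iter(NS + "service"):
        if service.get("serviceType", "").lower() == "httpserver":
            base = service.get("base", "")
            break
    if base is None:
        return []

    urls = []
    for dataset in root.iter(NS + "dataset"):
        path = dataset.get("urlPath")
        if path:  # container datasets carry no urlPath and are skipped
            relative = base.rstrip("/") + "/" + path.lstrip("/")
            # Resolve against the catalog URL to pick up scheme and host.
            urls.append(urljoin(catalog_url, relative))
    return urls


if __name__ == "__main__":
    for url in file_urls(SAMPLE, "http://example.org/thredds/catalog/data/catalog.xml"):
        print(url)
```

The resulting list could be written one URL per line to a text file and 
passed to wget, for example with "wget -nc -i urls.txt".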

Comments?

Thanks,
-Jerry

> -----Original Message-----
> From: thredds-bounces@xxxxxxxxxxxxxxxx 
> [mailto:thredds-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Heiko Klein
> Sent: Monday, May 10, 2010 5:18 AM
> To: John Caron
> Cc: thredds@xxxxxxxxxxxxxxxx
> Subject: Re: [thredds] How to download bulk datasets?
> 
> Hi John,
> 
> I played a bit more with the catalog.xml. This works well 
> with wget. I have now managed to download all the netcdf-files 
> from a directory:
> 
> wget -nc -r -l2 -A.nc -I /thredds/fileServer/,/thredds/catalog/ \
>   'http://dev-vm188/thredds/catalog/osisaf/met.no/ice/'
> 
> I use the existing datasetScan catalog.xml file here, and 
> fetch all nc-files up to two links away. Besides the nc-file, 
> I also get the catalog-file of the nc-file (e.g.
> http://dev-vm188/thredds/catalog/osisaf/met.no/ice/catalog.html?dataset=met.no/ice/ice_conc_nh_200911261200_CF.nc).
> 
> A catalog-file in the fileServer would be safer, since the 
> two levels (parent and child) might include other information, 
> but at least I can already offer our users something.
> 
> 
> Best regards,
> 
> Heiko
> 
> On 2010-05-06 21:31, John Caron wrote:
> > Hi Heiko:
> > 
> > We use catalog.xml exactly because there's no standard html index format.
> > A simple java GUI app could make this easy to do, but I'm not clear if 
> > that would help your case.
> > 
> > John
> > 
> > On 5/6/2010 3:16 AM, Heiko Klein wrote:
> >> Hi John,
> >>
> >> I don't think there is a standard format for directory index / listings.
> >> Looking at the different implementations (Tomcat (DefaultServlet, 
> >> listings = true), Jetty (dirAllowed = true), Apache (mod_dir,
> >> DirectoryIndex)), the common pattern is that they all have links to 
> >> all (non-hidden) files in the directory, and not much more (possibly 
> >> a parent directory and some gifs/pngs that differ between file and 
> >> directory).
> >> Thredds listings of 'datasetScan' look very similar to the tomcat 
> >> listings, except that they link to the dataset-overview page, and not 
> >> to the fileServer page.
> >>
> >> RAMADDA looks like a solution for a completely different type of 
> >> user, except for the embedded ftp server.
> >>
> >> Best regards,
> >>
> >> Heiko
> >>
> >>
> >> On 2010-05-05 01:28, John Caron wrote:
> >>   
> >>> Hi Heiko:
> >>>
> >>> TDS specializes in the logical subsetting of datasets, so we haven't 
> >>> thought much about file downloading.
> >>>
> >>> The index is provided by THREDDS catalogs, e.g.
> >>>
> >>> view-source:http://thredds.met.no/thredds/catalog/data/met.no/ice-drift/catalog.xml
> >>>
> >>>
> >>>
> >>> If it were me, I would write a nice little client app to make it easy 
> >>> to select files and download. Perhaps we will throw one together.
> >>>
> >>> If there is some standard format for "index.html" that works with 
> >>> wget and other clients, perhaps we can provide that.
> >>>
> >>> Otherwise, RAMADDA is another good solution.
> >>>
> >>> John
> >>>
> >>> On 5/3/2010 3:47 AM, Heiko Klein wrote:
> >>>     
> >>>> Hi,
> >>>>
> >>>> we are moving more and more from our ftp-solutions to thredds with 
> >>>> http and opendap enabled.
> >>>>
> >>>> Some users complain about this solution, since it is no longer 
> >>>> possible to download bulk datasets, that is, all files in one 
> >>>> directory. Our ftp-server supported 'ls' and several ftp-clients 
> >>>> have support for that so e.g.
> >>>> ftp ftp.my.server
> >>>> $ cd directory
> >>>> $ mget *.nc
> >>>> worked well.
> >>>>
> >>>> There are some http-downloaders that support mirroring of a 
> >>>> directory, which would be comparable, but this requires a proper 
> >>>> directory-listing for the http-download.
> >>>>
> >>>> An example:
> >>>> http://thredds.met.no/thredds/catalog/data/met.no/ice-drift/
> >>>> contains daily files of several years. Two clicks further,
> >>>> http://thredds.met.no/thredds/fileServer/data/met.no/ice-drift/ice-drift_ice_drift_nh_polstere-625_multi-oi_200912311200-201001021200.nc
> >>>> is one of those files.
> >>>>
> >>>> wget -r -l1 --no-parent -A.nc \
> >>>>   'http://thredds.met.no/thredds/fileServer/data/met.no/ice-drift/'
> >>>> was my best try to get all netcdf-files in the ice-drift catalog.
> >>>> Unfortunately, this requires an ice-drift/index.html (or
> >>>> directory-listing), which doesn't exist.
> >>>>
> >>>>
> >>>> Does anybody know of a solution to download several (hundred) 
> >>>> files from a thredds-server in a simple way?
> >>>> I even thought about aggregation, but as far as I can see, this 
> >>>> doesn't work with the http-downloader, but requires an opendap 
> >>>> client (e.g. nco), which might be too complicated, and might lead 
> >>>> to errors if products change over the years (better resolution, 
> >>>> updated metadata...)
> >>>>
> >>>> Best regards,
> >>>>
> >>>> Heiko
> >>>>
> >>>> _______________________________________________
> >>>> thredds mailing list
> >>>> thredds@xxxxxxxxxxxxxxxx
> >>>> For list information or to unsubscribe,  visit:
> >>>> http://www.unidata.ucar.edu/mailing_lists/
> >>>>
> >>>>        

