Hi Hein:

The first time that TDS 4.3 reads a GRIB file, it has to create indices (*.gbx9 and *.ncx). That will be very slow the first time, then much faster after that. Check to see if that's what's happening. Wait until all the index files are created, and then see how fast things are.
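One way to check whether indexing is still in progress is to look for GRIB files that do not yet have an index file beside them. The following is only a rough sketch: it assumes that each per-file index is written next to the data file as `<filename>.gbx9` (index files can live elsewhere depending on server configuration), and it reuses the file regexp from the scan log quoted below in this thread. The function name `find_unindexed` is made up for this illustration.

```python
import os
import re

# File-name filter taken from the regexp in the featureCollectionScan.log entry.
GRIB_PATTERN = re.compile(r"gefs\..*\.f.*\.grib2$")


def find_unindexed(root, pattern=GRIB_PATTERN, index_suffix=".gbx9"):
    """Return GRIB files under root that have no sibling index file.

    Assumption: the index for data file X is stored as X + index_suffix
    in the same directory. Adjust if your server writes indices elsewhere.
    """
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        present = set(filenames)
        for name in filenames:
            # Only consider data files that match the collection's file filter.
            if pattern.search(name) and (name + index_suffix) not in present:
                missing.append(os.path.join(dirpath, name))
    return sorted(missing)
```

Calling `find_unindexed("/output/operational/atmosphere/ncep/gefs/1.0deg")` repeatedly while the server is working should return a shrinking list if index creation is what is taking the time. It says nothing about the collection-level *.ncx files.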
John

On 8/29/2013 6:52 AM, Hein Zelle wrote:
Dear all,

coming back to my question on NCEP grib aggregation a while back:

John Caron wrote:

  What you are seeing is the limitations of aggregations. In this case, there are 3 different time coordinates in the collection, but NcML aggregation can only aggregate on one of them.

  You want to use feature collections instead. Replace your entire <dataset> element with something like:

    <featureCollection name="myCollectionName" featureType="GRIB" path="grib/NCEP/GFS/etc">
      <collection spec="/pub/data/nccf/com/gfs/prod/gfs.2013060600/gfs\.t00z\.pgrb2f..$"
                  dateFormatMark="#prod/gfs.#yyyyMMddHH" />
    </featureCollection>

I've finally implemented the GRIB featurecollection, and it seems to be working. I can access my grib files (although indexing of the full ncep ensemble takes a while!) and the data comes out OK. (timeseries for a single location, for each ensemble member).

I'm experiencing a problem though: data extraction is extremely slow. I'm comparing to my old situation where I listed all ensemble member grib files in a single ncml file. A data extraction for one location (all members, for a single 15 day forecast) took 5 minutes in thredds 4.1 with this system.

In the new situation (thredds 4.3), I use a grib feature collection sorted by directory (forecast cycle). After half an hour, the extraction is still not done. In the featureCollectionScan.log I can see that thredds keeps scanning all folders:

  [2013-08-29T12:40:28.786+0000] INFO thredds.inventory.MFileCollectionManager: 2013082618 : was scanned MCollection{name='2013082618', dirName='/output/operational/atmosphere/ncep/gefs/1.0deg/2013082618', wantSubdirs=true, ff=WildcardMatchOnPath{wildcard=null regexp=gefs\..*\.f.*\.grib2$}}

It does this for ALL forecast cycle folders (I have about 20), even though I am accessing only the 2013082900 directory.

Could anyone give me tips on how to prevent thredds from continuously re-scanning the whole directory structure with grib files?
Current setup:

  <featureCollection name="gefs_col" featureType="GRIB" path="ncep/gefs/1.0deg">
    <!-- be specific here with the file selector, other grib2 files may be hanging around in the tree -->
    <collection spec="/output/operational/atmosphere/ncep/gefs/1.0deg/**/gefs\..*\.f.*\.grib2$"
                dateFormatMark="#0deg/#yyyyMMddHH"
                timePartition="directory"
                name="gefs_col_unique" />
    <update startup="true" trigger="allow"/>
  </featureCollection>

This organizes the data the way I want: I get a single url per cycle:

  .../thredds/dodsC/ncep/gefs/1.0deg/2013082900/best
  .../2013082800/best
  .../2013082700/best

The data comes out the way I want, but as mentioned above it's _extremely_ slow, likely due to re-scanning of the disk structure. I don't really need automatic updating, a manual trigger when a new forecast is downloaded would be ok too. I would prefer thredds to scan and index the grib files only once upon a manual trigger.

Any hints on how to improve this?

Kind regards,
Hein Zelle

Today's Topics:

  1. Re: aggregating GFS data, problem with accumulated (Hein Zelle)
  2. Re: aggregating GFS data, problem with accumulated (John Caron)

----------------------------------------------------------------------

Message: 1
Date: Thu, 6 Jun 2013 11:30:36 +0200
From: Hein Zelle <hein.zelle@xxxxxxxxxxxxx>
To: thredds@xxxxxxxxxxxxxxxx
Subject: Re: [thredds] aggregating GFS data, problem with accumulated
Message-ID: <20130606093035.GA17727@xxxxxxxxxxxxxxxxxxxx>
Content-Type: text/plain; charset="us-ascii"

Dear John,

attached to this email is a complete ncml file that we place next to the data files. The data files themselves are too big to upload, but they are standard gfs grib2 files, you can find them at

  ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gfs.2013060600

(that's for this morning, modify the date as needed)

The files are the grib2 files at 0.5 degree, e.g. gfs.t00z.pgrb2bf30 (50 mb each). The previous snippet of ncml I sent should also work, you'll have to modify the paths to the correct folder of course.

A variable to check is for example

  Total_precipitation_surface_3_Hour_Accumulation

These should have multiple time steps, but I get only 1 time step (the first, for the +03 forecast). The +00 analysis doesn't contain the precipitation fields. Any variable with an accumulation or averaging interval exhibits the problem.

Kind regards,
Hein Zelle
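For the gefs feature collection setup quoted earlier in this thread, the file regexp and the dateFormatMark can be sanity-checked outside the server. The sketch below is only an approximation of how the "#0deg/#yyyyMMddHH" mark appears to be read (find the text between the # marks in the path, then parse the date that follows it); it is not the TDS implementation. The helper names and the example file names used with it are hypothetical.

```python
import re
from datetime import datetime

# File filter from the <collection spec=...> element in the thread.
FILE_REGEX = re.compile(r"gefs\..*\.f.*\.grib2$")


def extract_run_date(path, marker="0deg/", fmt="%Y%m%d%H", width=10):
    """Approximate dateFormatMark="#0deg/#yyyyMMddHH" (assumed semantics).

    Locate marker in the path and parse the next `width` characters
    as a yyyyMMddHH timestamp. Returns None if that is not possible.
    """
    start = path.find(marker)
    if start < 0:
        return None
    begin = start + len(marker)
    stamp = path[begin:begin + width]
    # Require exactly `width` digits so short or partial stamps are rejected.
    if len(stamp) != width or not stamp.isdigit():
        return None
    try:
        return datetime.strptime(stamp, fmt)
    except ValueError:
        return None


def group_by_cycle(paths):
    """Group matching GRIB paths by forecast cycle, one group per run date."""
    groups = {}
    for path in paths:
        if not FILE_REGEX.search(path):
            continue  # other grib2 files in the tree are ignored
        run = extract_run_date(path)
        if run is not None:
            groups.setdefault(run, []).append(path)
    return groups
```

Feeding it a listing of the data tree should yield one group per forecast-cycle directory, which is a quick way to confirm that the spec selects only the intended files before triggering a rescan.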