Re: [Fwd: [Fwd: Forecast Model Run Collection Aggregation prototype available]]

  • To: John Caron <caron@xxxxxxxxxxxxxxxx>
  • Subject: Re: [Fwd: [Fwd: Forecast Model Run Collection Aggregation prototype available]]
  • From: "dan.swank" <Dan.Swank@xxxxxxxx>
  • Date: Wed, 16 Aug 2006 17:21:12 -0400
John Caron wrote the following on 8/16/2006 3:39 PM:
> Hi Dan:
> 
> dan.swank wrote:
> 
>> This will be a challenge for sure.
>> The NARR, for example, will be an aggregation of ~75000 grib files.
>> Stored in a basic ./YYYYMM/YYYYMMDD tree.  The recursive datasetScan
>> tag added recently helps a ton with this.  Some of our datasets have
>> forecast hours, some don't.  Doing n forecast hour aggregation across
>> the 00hr will help tremendously with all of them, however.
>> While it works wonderfully for NetCDF, I cannot see the NcML agg.
>> working with this set of data, mainly due to the changing
>> reference times.
>>  
>>
> I think the FMRC will probably solve it. However, a 75,000 file
> aggregation will be a challenge. I'm actually pretty sure we can solve it
> (with enough server memory!) but it does worry me that with a single
> dods call, someone could make a request that requires opening 75,000
> files to satisfy. OTOH, if that's the service you want to provide, it
> sure is a lot better doing it on the server!!! Any thoughts?

Throttles... If the dev team could create an element to specify
the maximum size of a request in either bytes returned or
number of files accessed, that would be great.
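
To illustrate the kind of throttle I have in mind, here is a minimal
sketch in Python. It is purely hypothetical: the names RequestLimits and
check_request and the limit values are made up for illustration, and
nothing here is an existing THREDDS element or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestLimits:
    """Hypothetical per-dataset limits; not an existing THREDDS setting."""
    max_bytes: int   # largest response the server is willing to return
    max_files: int   # most files the server is willing to open per request


def check_request(estimated_bytes: int, files_needed: int,
                  limits: RequestLimits) -> None:
    """Raise ValueError if a request would exceed either limit."""
    if files_needed > limits.max_files:
        raise ValueError(
            f"request needs {files_needed} files, limit is {limits.max_files}")
    if estimated_bytes > limits.max_bytes:
        raise ValueError(
            f"request returns ~{estimated_bytes} bytes, limit is {limits.max_bytes}")


# Example limits (arbitrary values chosen only for the illustration).
limits = RequestLimits(max_bytes=500_000_000, max_files=1_000)
check_request(estimated_bytes=10_000_000, files_needed=250, limits=limits)  # passes
```

A request touching all ~75,000 NARR files would be rejected by a file
limit like this before any file is opened.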
> 
> Looking at the NARR data:
>  - it looks like you have them divided by day, then all for the same month.
>  - it looks like all the time coordinates are either 0 or 3 hour offsets
> from run time.

The NARR is a reanalysis; it contains variables defined either at the
instantaneous initial time or as a 0 to 3 hour average, total, or
other operation.

>  - what's the difference between narr_a and narr_b? Should they be
> combined or kept separate?

The differences are explained here:
http://nomads.ncdc.noaa.gov/data.php?name=narrdiffs

>  - I assume new files are added now and then?  How often? Ever deleted?

New NARR comes in from NCEP on an irregular basis.  Typically,
this is once a month or less often.  This archive is set to
grow indefinitely; the files are never deleted.
> 
>> According to NCEP, our NAM & GFS will soon be forced into GRIB2.
>> But NCDC-NOMADS NWP is currently entirely a GRIB-1 archive.
>> Only recently home-grown NCDC datasets are created in NetCDF.
>>
>> For NAM & GFS, we have about 6 months online, which comes out to
>> about 700 files when stripped to a single forecast time
>> (say 00hr) aggregation.  But there are 61 forecast times for GFS, and 21
>> for NAM.
>>  
>>
> Do you store each hour separately, or are all the forecast hours for a
> run in the same file?

We store them as one file per forecast hour; each file contains all
parameters and vertical levels for that forecast hour.
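
For a sense of scale, here is a back-of-the-envelope sketch in Python
using the rough numbers quoted above (about 700 files per forecast
time, 61 forecast times for GFS and 21 for NAM). Reading the "about 700"
figure as applying to each model separately is an assumption, so treat
the totals as order-of-magnitude only.

```python
# Rough numbers from this thread. Applying ~700 files per forecast time
# to each model separately is an assumption made for this estimate.
FILES_PER_FORECAST_TIME = 700
FORECAST_TIMES = {"GFS": 61, "NAM": 21}


def files_online(model: str) -> int:
    """Approximate files online for a model, at one file per forecast hour."""
    return FILES_PER_FORECAST_TIME * FORECAST_TIMES[model]


for model in FORECAST_TIMES:
    print(model, files_online(model))
```

Under those assumptions that works out to roughly 42,700 files for GFS
and 14,700 for NAM across all forecast hours.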


-Dan

