Re: [netcdf-java] [thredds] GRIB variable name changes in 4.3

To: don.murray@xxxxxxxx
Subject: Re: [netcdf-java] [thredds] GRIB variable name changes in 4.3
From: Glenn Rutledge <glenn.rutledge@xxxxxxxx>
Date: Wed, 29 Feb 2012 10:40:22 -0500

Touché, Don (or should I say Dave Bowman).
This is a difficult issue for me to decide upon b/c in a sense, we are all
right.  Can we parse the long name into a short name for client
displays, etc.?
John, care to chime in?
Glenn


On Wed, Feb 29, 2012 at 10:14 AM, Don Murray <don.murray@xxxxxxxx> wrote:

> Hi Glenn-
>
> Thanks for the response.  What I hear you saying is that the underlying
> infrastructure that John is creating (i.e. the GribFeatureCollection) and
> the fixes to what's broken in the identification of the data (e.g. the
> break out of the variables on different accumulation times) will help you
> provide consistent results.  I agree that these changes are necessary.
>
> However, I think the same thing can be achieved with the human readable
> variable names.  There is no guarantee that the VAR_* names won't change in
> the future.  As John discussed with me last week, if he finds a new PDS
> variable that he thinks is important, it could be added to the variable
> name and then we go through the pain again.  That's no different than
> changing the human readable names.  The lookup for creating consistent
> human readable names is already there to create the long name.
>
> Even with the human readable names, there will be pain for tool developers
> that access the data, because some names will change.  It will require
> changes to the IDV, but at least they will be manageable. The permalinks in
> the Godiva WMS viewer that is part of the TDS will break because they use
> the variable name to get the data.
>
> I think the human readable names serve the end users better than the VAR_*
> names.  For example, if I go to NOMADS now and go to a GRIB2 file and
> choose the OPeNDAP view, I get a list of variables that I can choose. Ex:
>
> http://nomads.ncdc.noaa.gov/thredds/dodsC/gfs4/201202/20120229/gfs_4_20120229_0000_180.grb2.html
>
> The variables that are selectable are in bold letters and easy to read.  I
> can quickly scroll through the page to find the variable I'm interested in.
> While the long_name is listed in lesser print, it doesn't stand out like
> the variable name does.  In the new scheme, what will stand out on the page
> is lots of VAR_* names which all look similar. You could argue that no one
> uses this OPeNDAP interface, but I know that there are some who do.
>
> Or, if I go to the NetcdfSubsetService for a grib file on motherlode:
>
> http://motherlode.ucar.edu/thredds/ncss/grid/fmrc/NCEP/GFS/Global_onedeg/files/GFS_Global_onedeg_20120229_0600.grib2/dataset.html
>
> I see human readable names. In the end, I don't see that the VAR_ names
> serve the end user.
>
> As someone on the IDV users list said, "Hal, who do you serve: machines or
> humans?" ;-)
>
> Don
>
>
>
> On 2/29/12 7:16 AM, Glenn Rutledge wrote:
>
>> Hi Don,
>> That is a very good question and I left that out in my response.
>>
>> Long term access for users in archives means we constantly have to work
>> to fully document, understand, and track down any data provenance issues,
>> and to verify (to a lesser degree) the data: what it says it is, it
>> actually is.  It's just a form of quality assurance for users.  Data
>> providers - especially 'real time' ones don't necessarily concern
>> themselves with these issues. They make a product- and move on.  I'll
>> bet you are fully aware that the WOC/Gateway does not even provide a
>> complete DTG in the file name for many NWP products!  I used to work w/
>> John Stackpole (great guy)- the original developer of GRIB. He made GRIB
>> as a compact communications protocol, not, as I'll bet you are also
>> aware, for archives.
>>
>> NOMADS has about 1+ petabyte to manage for users- we serviced a growing
>> 550TB last year and we need to scale.  By aggregating the data most used
>> by users (common state variables, most popular, etc.) we can stream
>> files/records in a way that serves the 50K+ users and ~300 million
>> downloads per year on NOMADS much better, using methods such as
>> pre-staging/caching the most requested data on disk from tape, etc.
>>
>> What John is attempting to do will facilitate the access for multiple
>> users, requesting multiple files using aggregations and other streaming
>> caching (I don't quite understand the details there). Right now, we can't even
>> ascertain with any degree of confidence what is what- in order to even
>> be able to aggregate- let alone feel comfortable about the accuracy of
>> the data we are serving to users.
>>
>> It does not really help users find data- per se.  It will help users
>> have more confidence that an aggregated monthly mean product from CFSR is
>> a mean for each cycle (0, 6, 12, ..) for individual days of the month (the
>> diurnals)- rather than a typical monthly mean avg'ed over the entire day.
>>
>> Hope that makes sense.  I'm not sure what other impacts this will have
>> for us here - LAS? our TDS to ESGF capabilities?  It's kinda scary, but
>> John's radical change looks to solve a major archive problem, I do know
>> that.  We will run 4.2 and 4.3 in parallel for some time, I will tell
>> you that.
>>
>> Best regards, Glenn
>>
>> On Tue, Feb 28, 2012 at 2:19 PM, Don Murray <don.murray@xxxxxxxx> wrote:
>>
>>    Hi Glenn-
>>
>>
>>    On 2/28/12 11:43 AM, Glenn Rutledge wrote:
>>
>>        John and Community-
>>        While I do not represent the NCDC Archive, for the NCDC NOMADS systems
>>        and our users, I must agree that the changes John is proposing will
>>        facilitate the long term use of grib data.  While painful to (existing)
>>        client (software | decoders), the proposed change will give our users
>>        a more scalable way to better find and use our data.  I'll suggest
>>        that if this is adopted, NOMADS servers could provide both 4.2 and
>>        4.3 versions to give software developers time to adapt on the
>>        client side.
>>
>>
>>    Could you elaborate on how you see that the new variable names will
>>    allow the users to better find and use your data versus the human
>>    readable names?  For example, if I want to get the 500 hPa heights
>>    from a model in your archive, how will the new names facilitate that?
>>
>>    Don
>>
>>    --
>>    Don Murray
>>    NOAA/ESRL/PSD and CIRES
>>    303-497-3596
>>    http://www.esrl.noaa.gov/psd/people/don.murray/
>>
>>
>> --
>> Glenn K. Rutledge
>> Meteorologist/Physical Scientist
>> NOMADS Team Leader
>> National Climatic Data Center
>> Asheville, NC 28801
>> (828) 271-4097
>> nomads.ncdc.noaa.gov
>>
>>
> --
> Don Murray
> NOAA/ESRL/PSD and CIRES
> 303-497-3596
> http://www.esrl.noaa.gov/psd/people/don.murray/
>



-- 
Glenn K. Rutledge
Meteorologist/Physical Scientist
NOMADS Team Leader
National Climatic Data Center
Asheville, NC 28801
(828) 271-4097
nomads.ncdc.noaa.gov
