NOTICE: This version of the NSF Unidata web site (archive.unidata.ucar.edu) is no longer being updated.
Current content can be found at unidata.ucar.edu.
To learn about what's going on, see About the Archive Site.
NOTE: The decoders mailing list is no longer active. The list archives are made available for historical reasons.
David,

This is discouraging; I spent hours looking through raw bulletins to "try" to make the decoder correct. I might try looking at the station ID before processing, though I don't know whether that will help. My philosophy is that it's better to disregard bulletins/reports than to enter "bad" data into a file. That said, your example bulletin should be discarded. Ugh. I will let you know about my new ideas.

Robb...

On Thu, 4 Mar 2004, David Larson wrote:
I do see non-US bulletins that are split "badly" according to this code change ...

628
SANK31 MNMG 041706
METAR
MNPC 041700Z 08012KT 7000 BKN016 29/26 Q1015
MNRS 041700Z 06010KT 9999 FEW022 BKN250 30/23 Q1012
MNJG 041700Z 36004KT 7000 VCRA BKN016 22/17 Q1015
MNJU 041700Z 10006KT 9999 SCT025 32/19 Q1013
MNCH 041700Z 02010KT 9999 SCT030 33/21 Q1012
MNMG 041700Z 07016KT 9999 SCT025 32/20 Q1011 A2988
MNBL 041700Z 10008KT 9999 SCT019 SCT070 29/25 Q1014

Not only is each line a new report, but what makes it worse is that the *last* entry *is* separated by an equal! Yuck. Perhaps this is just a junk bulletin? I'm surprised that it could even go out this way. Does anyone you know even make an attempt to use the perl metar decoder for non-US stations? I've tried long enough to estimate the work as a *lot*.

Dave

David Larson wrote:

> I've looked into this problem, which I didn't know existed.
>
> Your code is now:
>
>     # Separate bulletins into reports
>     if( /=\n/ ) {
>         s#=\s+\n#=\n#g ;
>         @reports = split( /=\n/ ) ;
>     } else {
>         #@reports = split ( /\n/ ) ;
>         s#\n# #g ;
>         next if( /\d{4,6}Z.*\d{4,6}Z/ ) ;
>         $reports[ 0 ] = $_ ;
>     }
>
> But based on your assumption that these bulletins will not, and
> cannot, contain multiple reports (which seems reasonable), there
> really only needs to be one split, right? Because if there is no
> equal sign, the entire line will be placed into the first report.
> This seems to be a slight simplification:
>
>     # Separate bulletins into reports
>     if( /=\n/ ) {
>         s#=\s+\n#=\n#g ;
>     } else {
>         s#\n# #g ;
>     }
>     @reports = split( /=\n/ ) ;
>     ... snip ... the next line is placed down many lines
>     next if( /\d{4,6}Z.*\d{4,6}Z/ ) ;
>
> Also, it is an error to have multiple time specifications in any
> report, right? So that can be generalized as well, as I have done
> above.
>
> You asked for my comments, and well, there you have them! :-) I
> might take a closer look at the rest of the changes as well, but that
> will be delayed a bit.
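[Archive note: the splitting strategy being discussed above can be exercised outside the decoder. The following is a standalone Python re-implementation of the same regex logic, purely for illustration; the function name and sample bulletins are not from the original thread, and Robb's Perl substitutions map one-to-one onto `re.sub`/`re.search`.]

```python
import re

def split_reports(bulletin):
    """Split a METAR bulletin into individual reports.

    Mirrors the decoder logic quoted above: if the bulletin uses '='
    terminators, normalize trailing whitespace and split on them;
    otherwise assume the whole bulletin is a single report, discarding
    it if it appears to contain more than one observation time.
    """
    if re.search(r"=\n", bulletin):
        bulletin = re.sub(r"=\s+\n", "=\n", bulletin)
        return [r for r in bulletin.split("=\n") if r.strip()]
    flat = bulletin.replace("\n", " ")
    # Two ddhhmmZ groups in one "report" means reports were glued
    # together without '=' separators -- discard rather than decode badly.
    if re.search(r"\d{4,6}Z.*\d{4,6}Z", flat):
        return []
    return [flat]
```

A bulletin with proper `=` terminators splits cleanly into one report per separator, while a bulletin like the MNMG example, with several observation times and no separators between them, is discarded rather than decoded wrongly.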
>
> I sure appreciate your quick responses to all correspondence.
>
> Dave
>
> Robb Kambic wrote:
>
>> David,
>>
>> Yes, I know about the problem. The problem exists in bulletins that
>> don't use the = sign to separate reports. The solution is to assume
>> that bulletins that don't use = only have one report. I scanned many
>> raw reports and this seems to be true, so I changed the code to:
>>
>> < @reports = split ( /\n/ ) ;
>> ---
>>
>>> #@reports = split ( /\n/ ) ;
>>> s#\n# #g ;
>>> next if( /\d{4,6}Z.*\d{4,6}Z/ ) ;
>>> $reports[ 0 ] = $_;
>>>
>>
>> The new code is attached. I'm also working on a newer version of the
>> decoder; it's in the ftp decoders directory, i.e.
>>
>> metar2nc.new and metar.cdl.new
>>
>> The pqact.conf entry needs to change \2:yy to \2:yyyy because it now
>> uses the century too. The cdl is different: it merges vars that have
>> different units into one, e.g. wind knots, mph, and m/s are all
>> stored using m/s. Also, it stores all reports per station into one
>> record. Take a look; I would appreciate any comments before it's
>> released.
>>
>> Robb...
>>
>> On Tue, 2 Mar 2004, David Larson wrote:
>>
>>> Robb,
>>>
>>> I've been chasing down a problem that seems to cause perfectly good
>>> reports to be discarded by the perl metar decoder. There is a
>>> comment in the 2.4.4 decoder that reads "reports appended together
>>> wrongly"; the code in this area takes the first line as the report
>>> to process and discards the next line.
>>>
>>> To walk through this, I'll refer to the following report:
>>>
>>> 132
>>> SAUS80 KWBC 021800 RRD
>>> METAR
>>> K4BL 021745Z 12005KT 3SM BR OVC008 01/M01 RMK SLP143 NOSPECI 60011
>>> 8/2// T00061006 10011 21017 51007
>>>
>>> The decoder attempts to classify the report type ($rep_type on line
>>> 257 of metar2nc); in doing so, it classifies this report as a
>>> "SPECI" ... which isn't what you'd expect from visual inspection of
>>> the report.
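[Archive note: to make the pqact.conf remark above concrete, here is a hypothetical before/after entry. The feed pattern, file paths, and feedtype are placeholders and not from the original message; only the `\2:yy` to `\2:yyyy` date-substitution change is what Robb describes.]

```
# old entry: two-digit year/month handed to metar2nc as yymm
WMO	^SA.* (..)(..)(..)
	PIPE	decoders/metar2nc etc/metar.cdl data (\2:yy)(\2:mm)
# new decoder uses the century, so widen the year field
WMO	^SA.* (..)(..)(..)
	PIPE	decoders/metar2nc.new etc/metar.cdl.new data (\2:yyyy)(\2:mm)
```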
>>> However, perl is doing the right thing given that it is asked to
>>> match on #(METAR|SPECI) \d{4,6}Z?\n#, which exists in the remarks
>>> of the report.
>>>
>>> The solution is probably to bind the text to the start of the line
>>> with a caret. Seems to work pretty well so far.
>>>
>>> I've changed the lines (257-263) in metar2nc-v2.4.4 from:
>>>
>>> if( s#(METAR|SPECI) \d{4,6}Z?\n## ) {
>>>     $rep_type = $1 ;
>>> } elsif( s#(METAR|SPECI)\s*\n## ) {
>>>     $rep_type = $1 ;
>>> } else {
>>>     $rep_type = "METAR" ;
>>> }
>>>
>>> To:
>>>
>>> if( s#^(METAR|SPECI) \d{4,6}Z?\n## ) {
>>>     $rep_type = $1 ;
>>> } elsif( s#^(METAR|SPECI)\s*\n## ) {
>>>     $rep_type = $1 ;
>>> } else {
>>>     $rep_type = "METAR" ;
>>> }
>>>
>>> I simply added the caret (^) to bind the pattern to the start of
>>> the report.
>>>
>>> Let me know what you think.
>>> Dave
>>>
>>
>> ------------------------------------------------------------------------
>>
>> #! /usr/local/bin/perl
>> #
>> # usage: metar2nc cdlfile [datatdir] [yymm] < ncfile
>> #
>> #chdir( "/home/rkambic/code/decoders/src/metar" ) ;
>>
>> use NetCDF ;
>> use Time::Local ;
>>
>> # process command line switches
>> while ($_ = $ARGV[0], /^-/) {
>>     shift;
>>     last if /^--$/;
>>     /^(-v)/ && $verbose++;
>> }
>> # process input parameters
>> if( $#ARGV == 0 ) {
>>     $cdlfile = $ARGV[ 0 ] ;
>> } elsif( $#ARGV == 1 ) {
>>     $cdlfile = $ARGV[ 0 ] ;
>>     if( $ARGV[ 1 ] =~ /^\d/ ) {
>>         $yymm = $ARGV[ 1 ] ;
>>     } else {
>>         $datadir = $ARGV[ 1 ] ;
>>     }
>> } elsif( $#ARGV == 2 ) {
>>     $cdlfile = $ARGV[ 0 ] ;
>>     $datadir = $ARGV[ 1 ] ;
>>     $yymm = $ARGV[ 2 ] ;
>> } else {
>>     die "usage: metar2nc cdlfile [datatdir] [yymm] < ncfile $!\n" ;
>> }
>> print "Missing cdlfile file $cdlfile: $!\n" unless -e $cdlfile ;
>>
>> if( -e "util/ncgen" ) {
>>     $ncgen = "util/ncgen" ;
>> } elsif( -e "/usr/local/ldm/util/ncgen" ) {
>>     $ncgen = "/usr/local/ldm/util/ncgen" ;
>> } elsif( -e "/upc/netcdf/bin/ncgen" ) {
>>     $ncgen = "/upc/netcdf/bin/ncgen" ;
>> } elsif( -e "./ncgen" ) {
>>     $ncgen = "./ncgen" ;
>> } else {
>>     open( NCGEN, "which ncgen |" ) ;
>>     $ncgen = <NCGEN> ;
>>     close( NCGEN ) ;
>>
>>     if( $ncgen =~ /no ncgen/ ) {
>>         die "Can't find NetCDF utility 'ncgen' in PATH, util/ncgen, /usr/local/ldm/util/ncgen, /upc/netcdf/bin/ncgen, or ./ncgen : $!\n" ;
>>     } else {
>>         $ncgen = "ncgen" ;
>>     }
>> }
>> # the data and the metadata directories
>> $datadir = "." if( ! $datadir ) ;
>> $metadir = $datadir . "/../metadata/surface/metar" ;
>> # redirect STDOUT and STDERR
>> open( STDOUT, ">$datadir/metarLog.$$.log" ) ||
>>     die "could not open $datadir/metarLog.$$.log: $!\n" ;
>> open( STDERR, ">&STDOUT" ) ||
>>     die "could not dup stdout: $!\n" ;
>> select( STDERR ) ; $| = 1 ;
>> select( STDOUT ) ; $| = 1 ;
>>
>> die "Missing cdlfile file $cdlfile: $!\n" unless -e $cdlfile ;
>>
>> # year and month
>> if( ! $yymm ) {
>>     $theyear = (gmtime())[ 5 ] ;
>>     $theyear = ( $theyear < 100 ? $theyear : $theyear - 100 ) ;
>>     $theyear = sprintf( "%02d", $theyear ) ;
>>     $themonth = (gmtime())[ 4 ] ;
>>     $themonth++ ;
>>     $yymm = $theyear . sprintf( "%02d", $themonth ) ;
>> } else {
>>     $theyear = substr( $yymm, 0, 2 ) ;
>>     $themonth = substr( $yymm, 2 ) ;
>> }
>> # file used for bad metars or prevention of overwrites to ncfiles
>> open( OPN, ">>$datadir/rawmetars.$$.nc" ) ||
>>     die "could not open $datadir/rawmetars.$$.nc: $!\n" ;
>> # set error handling to verbose only
>> $result = NetCDF::opts( VERBOSE ) ;
>>
>> # set interrupt handler
>> $SIG{ 'INT' } = 'atexit' ;
>> $SIG{ 'KILL' } = 'atexit' ;
>> $SIG{ 'TERM' } = 'atexit' ;
>> $SIG{ 'QUIT' } = 'atexit' ;
>>
>> # set defaults
>> $F = -99999 ;
>> $A = \$F ;
>> $S1 = "\0" ;
>> $AS1 = \$S1 ;
>> $S2 = "\0\0" ;
>> $AS2 = \$S2 ;
>> $S3 = "\0\0\0" ;
>> $AS3 = \$S3 ;
>> $S4 = "\0\0\0\0" ;
>> $AS4 = \$S4 ;
>> $S8 = "\0" x 8 ;
>> $AS8 = \$S8 ;
>> $S10 = "\0" x 10 ;
>> $AS10 = \$S10 ;
>> $S15 = "\0" x 15 ;
>> $AS15 = \$S15 ;
>> $S32 = "\0" x 32 ;
>> $AS32 = \$S32 ;
>> $S128 = "\0" x 128 ;
>> $AS128 = \$S128 ;
>>
>> %CDL = (
>> "rep_type", 0, "stn_name", 1, "wmo_id", 2, "lat", 3, "lon", 4, "elev", 5,
>> "ob_hour", 6, "ob_min", 7, "ob_day", 8, "time_obs", 9,
>> "time_nominal", 10, "AUTO", 11, "UNITS", 12, "DIR", 13, "SPD", 14,
>> "GUST", 15, "VRB", 16, "DIRmin", 17, "DIRmax", 18, "prevail_VIS_SM", 19,
>> "prevail_VIS_KM", 20, "plus_VIS_SM", 21, "plus_VIS_KM", 22,
>> "prevail_VIS_M", 23, "VIS_dir", 24, "CAVOK", 25, "RVRNO", 26,
>> "RV_designator", 27, "RV_above_max", 28, "RV_below_min", 29,
>> "RV_vrbl", 30, "RV_min", 31, "RV_max", 32, "RV_visRange", 33, "WX", 34,
>> "vert_VIS", 35, "cloud_type", 36, "cloud_hgt", 37,
>> "cloud_meters", 38, "cloud_phenom", 39, "T", 40, "TD", 41,
>> "hectoPasc_ALTIM", 42, "inches_ALTIM", 43, "NOSIG", 44,
>> "TornadicType", 45, "TornadicLOC", 46, "TornadicDIR", 47,
>> "BTornadic_hh", 48, "BTornadic_mm", 49,
>> "ETornadic_hh", 50, "ETornadic_mm", 51, "AUTOindicator", 52,
>> "PKWND_dir", 53, "PKWND_spd", 54, "PKWND_hh", 55, "PKWND_mm", 56,
>> "WshfTime_hh", 57, "WshfTime_mm", 58, "Wshft_FROPA", 59, "VIS_TWR", 60,
>> "VIS_SFC", 61, "VISmin", 62, "VISmax", 63, "VIS_2ndSite", 64,
>> "VIS_2ndSite_LOC", 65, "LTG_OCNL", 66, "LTG_FRQ", 67, "LTG_CNS", 68,
>> "LTG_CG", 69, "LTG_IC", 70, "LTG_CC", 71, "LTG_CA", 72, "LTG_DSNT", 73,
>> "LTG_AP", 74, "LTG_VcyStn", 75, "LTG_DIR", 76, "Recent_WX", 77,
>> "Recent_WX_Bhh", 78, "Recent_WX_Bmm", 79, "Recent_WX_Ehh", 80,
>> "Recent_WX_Emm", 81, "Ceiling_min", 82, "Ceiling_max", 83,
>> "CIG_2ndSite_meters", 84, "CIG_2ndSite_LOC", 85, "PRESFR", 86,
>> "PRESRR", 87, "SLPNO", 88, "SLP", 89, "SectorVIS_DIR", 90,
>> "SectorVIS", 91, "GR", 92, "GRsize", 93, "VIRGA", 94, "VIRGAdir", 95,
>> "SfcObscuration", 96, "OctsSkyObscured", 97, "CIGNO", 98,
>> "Ceiling_est", 99, "Ceiling", 100, "VrbSkyBelow", 101,
>> "VrbSkyLayerHgt", 102, "VrbSkyAbove", 103, "Sign_cloud", 104,
>> "Sign_dist", 105, "Sign_dir", 106, "ObscurAloft", 107,
>> "ObscurAloftSkyCond", 108, "ObscurAloftHgt", 109, "ACFTMSHP", 110,
>> "NOSPECI", 111, "FIRST", 112, "LAST", 113, "Cloud_low", 114,
>> "Cloud_medium", 115, "Cloud_high", 116, "SNINCR", 117,
>> "SNINCR_TotalDepth", 118, "SN_depth", 119, "SN_waterequiv", 120,
>> "SunSensorOut", 121, "SunShineDur", 122, "PRECIP_hourly", 123,
>> "PRECIP_amt", 124, "PRECIP_24_amt", 125, "T_tenths", 126,
>> "TD_tenths", 127, "Tmax", 128, "Tmin", 129, "Tmax24", 130,
>> "Tmin24", 131, "char_Ptend", 132, "Ptend", 133, "PWINO", 134,
>> "FZRANO", 135, "TSNO", 136, "PNO", 137, "maintIndicator", 138,
>> "PlainText", 139, "report", 140, "remarks", 141 ) ;
>>
>> # default netCDF record structure, contains all vars for the METAR reports
>> @defaultrec = ( $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $AS3, $A, $A,
>> $A, $A, $A, $A, $A, $A, $A, $A, $A, $AS2, $A, $A, [( $S3, $S3, $S3, $S3 )],
>> [( $F, $F, $F, $F )], [( $F, $F, $F, $F )], [( $F, $F, $F, $F )],
>> [( $F, $F, $F, $F )], [( $F, $F, $F, $F )], [( $F, $F, $F, $F )], $AS32, $A,
>> [( $S4, $S4, $S4, $S4, $S4, $S4 )], [( $F, $F, $F, $F, $F, $F )],
>> [( $F, $F, $F, $F, $F, $F )], [( $S4, $S4, $S4, $S4, $S4, $S4 )],
>> $A, $A, $A, $A, $A, $AS15, $AS10, $AS2, $A, $A, $A, $A, $AS4, $A, $A,
>> $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $AS10, $A, $A, $A, $A, $A,
>> $A, $A, $A, $A, $A, $AS2, [( $S8, $S8, $S8 )], [( $F, $F, $F )],
>> [( $F, $F, $F )], [( $F, $F, $F )], [( $F, $F, $F )], $A, $A, $A, $A,
>> $A, $A, $A, $A, $AS2, $A, $A, $A, $A, $AS2, $AS8, $A, $A, $A, $A,
>> $AS3, $A, $AS3, $AS10, $AS10, $AS10, $AS8, $AS3, $A, $A, $A, $A, $A,
>> $AS1, $AS1, $AS1, $A, $A, $A, $A, $A, $A, $A, $A, $A, $A,
>> $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $AS128, $AS128,
>> $AS128 ) ;
>>
>> # two fold purpose array, if entry > 0, then var is requested and it's value
>> # is the position in the record, except first entry
>> @W = ( 0 ) x ( $#defaultrec +1 ) ;
>> $W[ 0 ] = -1 ;
>>
>> # open cdl and create record structure according to variables
>> open( CDL, "$cdlfile" ) || die "could not open $cdlfile: $!\n" ;
>> $i = 0 ;
>> while( <CDL> ) {
>>     if( s#^\s*(char|int|long|double|float) (\w{1,25})## ) {
>>         ( $number ) = $CDL{ $2 } ;
>>         push( @rec, $defaultrec[ $number ] ) ;
>>         $W[ $number ] = $i++ ;
>>     }
>> }
>> close CDL ;
>> undef( @defaultrec ) ;
>> undef( %CDL ) ;
>>
>> # read in station data
>> if( -e "etc/sfmetar_sa.tbl" ) {
>>     $sfile = "etc/sfmetar_sa.tbl" ;
>> } elsif( -e "./sfmetar_sa.tbl" ) {
>>     $sfile = "./sfmetar_sa.tbl" ;
>> } else {
>>     die "Can't find sfmetar_sa.tbl station file.: $!\n" ;
>> }
>> open( STATION, "$sfile" ) || die "could not open $sfile: $!\n" ;
>>
>> while( <STATION> ) {
>>     s#^(\w{3,6})?\s+(\d{4,5}).{40}## ;
>>     $id = $1 ;
>>     $wmo_id = $2 ;
>>     $wmo_id = "0" . $wmo_id if( length( $wmo_id ) == 4 ) ;
>>     ( $lat, $lon, $elev ) = split ;
>>     $lat = sprintf( "%7.2f", $lat / 100 ) ;
>>     $lon = sprintf( "%7.2f", $lon / 100) ;
>>
>>     # set these vars ( $wmo_id, $lat, $lon, $elev )
>>     $STATIONS{ "$id" } = "$wmo_id $lat $lon $elev" ;
>> }
>> close STATION ;
>>
>> # read in list of already processed reports if it exists
>> # open metar.lst, list of reports processed in the last 4 hours.
>> if( -e "$datadir/metar.lst" ) {
>>     open( LST, "$datadir/metar.lst" ) ||
>>         die "could not open $datadir/metar.lst: $!\n" ;
>>     while( <LST> ) {
>>         ( $stn, $rtptime, $hr ) = split ;
>>         $reportslist{ "$stn $rtptime" } = $hr ;
>>     }
>>     close LST ;
>>     #unlink( "$datadir/metar.lst" ) ;
>> }
>> # Now begin parsing file and decoding observations breaking on cntrl C
>> $/ = "\cC" ;
>>
>> # set select processing here from STDIN
>> START:
>> while( 1 ) {
>>     open( STDIN, '-' ) ;
>>     vec($rin,fileno(STDIN),1) = 1;
>>     $timeout = 1200 ; # 20 minutes
>>     $nfound = select( $rout = $rin, undef, undef, $timeout );
>>     # timed out
>>     if( ! $nfound ) {
>>         print "Shut down, time out 20 minutes\n" ;
>>         &atexit() ;
>>     }
>>     &atexit( "eof" ) if( eof( STDIN ) ) ;
>>
>>     # Process each line of metar bulletins, header first
>>     $_ = <STDIN> ;
>>     #next unless /METAR|SPECI/ ;
>>     s#\cC## ;
>>     s#\cM##g ;
>>     s#\cA\n## ;
>>     s#\c^##g ;
>>     s#\d\d\d \n## ;
>>     s#\w{4}\d{1,2} \w{4} (\d{2})(\d{2})(\d{2})?.*\n## ;
>>     $tday = $1 ;
>>     $thour = $2 ;
>>     $thour = "23" if( $thour eq "24" ) ;
>>     $tmin = $3 ;
>>     $tmin = "00" unless( $tmin ) ;
>>     next unless ( $tday && defined( $thour ) ) ;
>>     $time_trans = thetime( "trans" ) ;
>>     if( s#(METAR|SPECI) \d{4,6}Z?\n## ) {
>>         $rep_type = $1 ;
>>     } elsif( s#(METAR|SPECI)\s*\n## ) {
>>         $rep_type = $1 ;
>>     } else {
>>         $rep_type = "METAR" ;
>>     }
>>     # Separate bulletins into reports
>>     if( /=\n/ ) {
>>         s#=\s+\n#=\n#g ;
>>         @reports = split( /=\n/ ) ;
>>     } else {
>>         #@reports = split ( /\n/ ) ;
>>         s#\n# #g ;
>>         next if( /\d{4,6}Z.*\d{4,6}Z/ ) ;
>>         $reports[ 0 ] = $_ ;
>>     }
>>
>
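[Archive note: to see why the caret matters in the `$rep_type` classification above, here is a minimal Python rendering of that step. The sample report is the K4BL one from earlier in the thread; the helper function name is not from the decoder. Without the anchor, the pattern finds `SPECI 60011` inside the `NOSPECI` remark; anchored, it falls through to the `METAR` default.]

```python
import re

# Report text as the decoder sees it: header lines already stripped,
# remarks section still attached ("NOSPECI 60011" ends a line).
report = ("K4BL 021745Z 12005KT 3SM BR OVC008 01/M01 RMK SLP143 NOSPECI 60011\n"
          "8/2// T00061006 10011 21017 51007\n")

def classify(text, anchored):
    """Mirror metar2nc's $rep_type logic (lines 257-263)."""
    prefix = "^" if anchored else ""
    for pattern in (prefix + r"(METAR|SPECI) \d{4,6}Z?\n",
                    prefix + r"(METAR|SPECI)\s*\n"):
        m = re.search(pattern, text)
        if m:
            return m.group(1)  # type line found (and would be stripped)
    return "METAR"

classify(report, anchored=False)  # matches inside "NOSPECI 60011" -> "SPECI"
classify(report, anchored=True)   # no match at start of report -> "METAR"
```

Without `re.M`, Python's `^` matches only at the start of the string, the same behavior as Perl's default, so when a bulletin does carry a leading `METAR` or `SPECI` type line the anchored pattern still strips it correctly.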
==============================================================================
Robb Kambic                                Unidata Program Center
Software Engineer III                      Univ. Corp for Atmospheric Research
rkambic@xxxxxxxxxxxxxxxx                   WWW: http://www.unidata.ucar.edu/
==============================================================================