Dave,

> > Backreference \n always refers to the subexpression enclosed by the
> > n-th unescaped left parenthesis.
>
> OK.  This is what I read in some documentation that I eventually dug up
> on the Web after I sent my support request.

> > As I recall, the first field in a WMO header has six characters: four
> > letters followed by two digits.  The above ERE, however, would match,
> > for example, "SIVA ", "SIWA ", "SHV ", "SHXX ", and "SSA " -- which
> > don't fit the pattern of the first field of a WMO header.
>
> That problem occurred to me late last night, so I stuck wild card
> characters in wherever I needed to get the first field up to 6
> characters.  The new version looks like:
>
> WMO (^S[IMN]V[^GINS]..)|(^S[IMN]W[^KZ]..)|(^S(HV...|HXX|S[^X]...))|
> (^SX(VD..|V.50|US(2[03]|08|40|82|86)))|(^Y[HO]XX84) .... ([0-3][0-9])
> ([0-2][0-9])..
> FILE -close data/surface/(\9:yy)(\9:mm)\9\(10)_boy.wmo
>
> I've chosen \9 and \(10) to try to match the day and hour information
> in the pattern, based on the number of unescaped left parentheses
> preceding those fields, of which I count 8.
>
> The files that get saved as a result are named, literally, "(:yy)
> (:mm)_boy.wmo", so the choices of \9 and \(10) don't appear to match
> anything.  This suggests to me that there might be (for this purpose)
> effectively fewer than 8 parenthetical expressions preceding the day
> and hour fields, unless there's another error in there somewhere.

The "|" operator has the lowest precedence, so many of the subpatterns
between "|" operators can lose their outermost parentheses.

The subpattern "^S(HV...|HXX|S[^X]...)" will match the four-character
string "SHXX" as well as many six-character strings, which is probably
not what you want.

The subpattern before the subpattern " .... ([0-3][0-9])([0-2][0-9]).."
should be enclosed in parentheses because it's the one that's trying to
match the first six characters with a sequence of alternatives.
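To make the precedence point concrete, here is a minimal Python sketch. It rests on several assumptions: that Python's re module behaves like a POSIX ERE matcher for this particular pattern (same group numbering, same low precedence for "|"), that the line break between the day and hour groups in the quoted entry is only mail wrapping, and that the two product identifiers below, which are made up for illustration, are representative. It is not LDM code and does not reproduce how pqact(1) itself evaluates the pattern.

```python
import re

# The ERE from the quoted pqact entry, joined onto one line.
# Assumption: Python's `re` stands in for a POSIX ERE matcher here.
DAVE_ERE = (
    r"(^S[IMN]V[^GINS]..)|(^S[IMN]W[^KZ]..)|(^S(HV...|HXX|S[^X]...))|"
    r"(^SX(VD..|V.50|US(2[03]|08|40|82|86)))|(^Y[HO]XX84)"
    r" .... ([0-3][0-9])([0-2][0-9]).."
)

# Hypothetical product identifiers, invented for this sketch.
S_HEADER = "SMVD01 KWBC 121800"
Y_HEADER = "YHXX84 KWBC 121800"


def day_hour(pattern, header):
    """Return (group 9, group 10) for `header`, or None if nothing matches."""
    m = re.search(pattern, header)
    return None if m is None else (m.group(9), m.group(10))


# Counting unescaped left parentheses: the pattern has ten groups in total,
# so the day and hour really are groups 9 and 10.
print(re.compile(DAVE_ERE).groups)

# "|" binds more loosely than concatenation, so the " .... (dd)(hh).." tail
# belongs only to the last alternative. An "S" header matches through the
# first alternative and groups 9 and 10 never take part in the match.
print(day_hour(DAVE_ERE, S_HEADER))

# A "Y" header goes through the last alternative, so both groups are set.
print(day_hour(DAVE_ERE, Y_HEADER))
```

Under these assumptions the group count is right but the "S" header leaves groups 9 and 10 unset, which is consistent with file names in which the day and hour substitutions come out empty.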
As the ERE stands now, only product-identifiers that start with a "Y"
will match on the date and hour.  I'm not exactly sure what you're
trying to match, but the following EREs might help:

    (^S[IMN]V[^GINS]..|^S[IMN]W[^KZ]..|^S(HV...|HXX..|S[^X]...)|^SX(VD..|V.50|US(2[03]|08|40|82|86))|^Y[HO]XX84) .... ([0-3][0-9])([0-2][0-9])..

    ^(S(([IMN](V[^GINS]|W[^KZ]))..|(HV.|HXX|S[^X].)..|X(VD..|V.50|US(2[03]|08|40|82|86)))|Y[HO]XX84) .... ([0-3][0-9])([0-2][0-9])..

Backreferences for the day and hour would be, respectively, \5 and \6
for the first ERE and \8 and \9 for the second.

> > To simplify things, you can always break up a complicated ERE into
> > multiple pqact(1) entries, each one handling a subset of the
> > complicated ERE.
>
> I'll try this.  Would there be any complications arising from using
> "FILE -close" on each entry?

If your computer is fast enough to handle the rate at which files are
opened and closed, then there shouldn't be any complications.  The LDM
log file will tell you if the pqact(1) process is falling behind.

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: RJS-786355
Department: Support LDM
Priority: Normal
Status: Closed