piet@cs.ruu.nl (Piet van Oostrum) (01/15/90)
It seems that matching word boundaries with \b in regexps doesn't work
properly. I get random results. The following script should illustrate the
problem. Feed it with lines of the form:
	vote yes	(or: vote no)
Or am I doing something wrong???
----------------------------------------------------------------
#! /usr/bin/perl

$yes = 0;
$no = 0;
$vote = 0;

while (<STDIN>) {
    chop;
    print "=$_=\n";
    if (/vote/i || /comp.text.tex/i) {
        $vote ++;
        $yes ++ if /\byes\b/i;
        $no ++ if /\bno\b/i;
    }
    print "vote = $vote, yes = $yes, no = $no\n";
}
----------------------------------------------------------------
Piet* van Oostrum, Dept of Computer Science, Utrecht University,
Padualaan 14, P.O. Box 80.089, 3508 TB Utrecht, The Netherlands.
Telephone: +31-30-531806   Uucp:     uunet!mcsun!hp4nl!ruuinf!piet
Telefax:   +31-30-513791   Internet: piet@cs.ruu.nl   (*`Pete')
lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (01/16/90)
In article <2295@ruuinf.cs.ruu.nl> piet@cs.ruu.nl (Piet van Oostrum) writes:
: It seems that matching word boundaries with \b in regexps doesn't work
: properly.
: I get random results. The following script should illustrate the problem.
: Feed it with lines of the form: vote yes (or: vote no)
: Or am I doing something wrong???
It's a bug. I fixed it a week or two ago here, and it will be fixed in
patch 9. There was a bad interaction between the code that handles
case insensitivity and the code that checks for \b-ness at the beginning
of a string. I tried your test program and it works under the new version.
Soon.
Larry
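
[Editorial sketch, not part of Larry's post.] The failure mode described above, a case-insensitive match combined with a \b check at the very beginning of the string, can be probed with a small self-checking script. This is a minimal sketch in present-day Perl (it assumes `use strict`, lexical `my` variables, and array references, none of which existed in the Perl of this thread); the test strings and the `@cases` table are hypothetical and are not taken from the original report.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical test inputs. The first two place the word at the very
# start of the string, which is where the reported interaction occurred.
my @cases = (
    [ 'yes',        1 ],   # word is the whole string
    [ 'Yes please', 1 ],   # word at start, mixed case
    [ 'vote YES',   1 ],   # word at end, upper case
    [ 'yesterday',  0 ],   # no boundary after "yes"
    [ 'eyes',       0 ],   # no boundary before "yes"
);

my $failures = 0;
for my $case (@cases) {
    my ($text, $expected) = @$case;
    # Same pattern as the original script: \b on both sides, /i flag set.
    my $got = ($text =~ /\byes\b/i) ? 1 : 0;
    $failures++ if $got != $expected;
    printf "%-12s expected=%d got=%d\n", $text, $expected, $got;
}
print "failures: $failures\n";
```

On an interpreter where \b and /i cooperate correctly, every line should show matching expected and got values; a nonzero failure count would point at the kind of interaction described in the post.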
jv@mh.nl (Johan Vromans) (01/16/90)
In article <2295@ruuinf.cs.ruu.nl> piet@cs.ruu.nl (Piet van Oostrum) writes:
> It seems that matching word boundaries with \b in regexps doesn't work
> properly.
[example deleted]

Knowing Piet* is running perl on HP-UX, I tried a little, and found
out that:
 - on VAX/Ultrix it behaves like expected
 - on HP-UX it fails.
 - it runs fine on HP-UX if the 'ignore case' spec is removed from the
   matches:
	$yes++ if /\byes\b/;
So I think it has something to do with HP's NLS system (a wild
guess, but - who knows?)

Johan
--
Johan Vromans                                  jv@mh.nl via internet backbones
Multihouse Automatisering bv                   uucp: ..!{uunet,hp4nl}!mh.nl!jv
Doesburgweg 7, 2803 PL Gouda, The Netherlands  phone/fax: +31 1820 62944/62500
------------------------ "Arms are made for hugging" -------------------------
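
[Editorial sketch, not part of Johan's post.] The experiment described above, running the same tally with and without the 'ignore case' spec, can be written as a side-by-side comparison. This is a rough sketch in present-day Perl: the `tally` helper, the sample lines, and the use of `qr//` (which did not exist when this thread was written) are all assumptions made for illustration. The input is deliberately all lowercase, so the two variants should agree on a correct interpreter.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count yes/no votes using the supplied patterns.
sub tally {
    my ($yes_re, $no_re, @lines) = @_;
    my ($yes, $no) = (0, 0);
    for my $line (@lines) {
        next unless $line =~ /vote/i;    # same gate as the original script
        $yes++ if $line =~ $yes_re;
        $no++  if $line =~ $no_re;
    }
    return ($yes, $no);
}

# Hypothetical all-lowercase input, so /i should make no difference.
my @lines = ('vote yes', 'vote no', 'vote yes', 'vote nothing');

my @sensitive   = tally(qr/\byes\b/,  qr/\bno\b/,  @lines);
my @insensitive = tally(qr/\byes\b/i, qr/\bno\b/i, @lines);

print "case-sensitive:   yes=$sensitive[0] no=$sensitive[1]\n";
print "case-insensitive: yes=$insensitive[0] no=$insensitive[1]\n";
print "results ", ("@sensitive" eq "@insensitive" ? "agree" : "differ"), "\n";
```

If the two tallies differ on lowercase input, the /i handling is implicated rather than the \b logic alone, which is the distinction the HP-UX experiment was drawing.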
jand@kuling.UUCP (Jan Djärv) (01/18/90)
In article <JV.90Jan15220501@mhres.mh.nl> jv@mh.nl (Johan Vromans) writes:
:In article <2295@ruuinf.cs.ruu.nl> piet@cs.ruu.nl (Piet van Oostrum) writes:
:> It seems that matching word boundaries with \b in regexps doesn't work
:> properly.
:[example deleted]
:
:Knowing Piet* is running perl on HP-UX, I tried a little, and found
:out that:
: - on VAX/Ultrix it behaves like expected
: - on HP-UX it fails.
: - it runs fine on HP-UX if the 'ignore case' spec is removed from the
:   matches:
:	$yes++ if /\byes\b/;
: So I think it has something to do with HP's NLS system (a wild
: guess, but - who knows?)
:
:Johan

Well, I tried the example on our HP 835 running HP-UX 7.0 and it worked
just fine. (N.B. This is not to be taken as a defence of HP-UX. 7.0 was
installed last Friday, and the bugs keep coming in :-( )

Probably there is something more subtle to the error. I agree that Johan's
guess is a wild one; did you forget to put a :-) somewhere, Johan?

	Jan D.