piet@cs.ruu.nl (Piet van Oostrum) (01/15/90)
It seems that matching word boundaries with \b in regexps doesn't work
properly.
I get random results. The following script should illustrate the problem.
Feed it with lines of the form: vote yes (or: vote no)
Or am I doing something wrong???
----------------------------------------------------------------
#! /usr/bin/perl
$yes = 0;
$no = 0;
$vote = 0;
while (<STDIN>) {
    chop;
    print "=$_=\n";
    if (/vote/i || /comp.text.tex/i) {
        $vote ++;
        $yes ++ if /\byes\b/i;
        $no ++ if /\bno\b/i;
    }
    print "vote = $vote, yes = $yes, no = $no\n";
}
----------------------------------------------------------------
Piet* van Oostrum, Dept of Computer Science, Utrecht University,
Padualaan 14, P.O. Box 80.089, 3508 TB Utrecht, The Netherlands.
Telephone: +31-30-531806 Uucp: uunet!mcsun!hp4nl!ruuinf!piet
Telefax: +31-30-513791 Internet: piet@cs.ruu.nl (*`Pete')
lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (01/16/90)
In article <2295@ruuinf.cs.ruu.nl> piet@cs.ruu.nl (Piet van Oostrum) writes:
: It seems that matching word boundaries with \b in regexps doesn't work
: properly.
: I get random results. The following script should illustrate the problem.
: Feed it with lines of the form: vote yes (or: vote no)
: Or am I doing something wrong???
It's a bug. I fixed it a week or two ago here, and it will be fixed in
patch 9. There was a bad interaction between the code that handles
case insensitivity and the code that checks for \b-ness at the beginning
of a string. I tried your test program and it works under the new version.
Soon.
Larry
jv@mh.nl (Johan Vromans) (01/16/90)
In article <2295@ruuinf.cs.ruu.nl> piet@cs.ruu.nl (Piet van Oostrum) writes:
> It seems that matching word boundaries with \b in regexps doesn't work
> properly.
[example deleted]
Knowing Piet* is running perl on HP-UX, I tried a little, and found
out that:
- on VAX/Ultrix it behaves like expected
- on HP-UX it fails.
- it runs fine on HP-UX if the 'ignore case' spec is removed from the
  matches:
      $yes++ if /\byes\b/;
So I think it has something to do with HP's NLS system (a wild
guess, but - who knows?)
Johan
--
Johan Vromans jv@mh.nl via internet backbones
Multihouse Automatisering bv uucp: ..!{uunet,hp4nl}!mh.nl!jv
Doesburgweg 7, 2803 PL Gouda, The Netherlands phone/fax: +31 1820 62944/62500
------------------------ "Arms are made for hugging" -------------------------
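The observation that the failure disappears when the 'ignore case' flag is dropped can be checked with a small side-by-side comparison. The following is only a sketch, written for a present-day Perl 5 rather than the perl of this thread; the helper name count_yes and the sample input lines are hypothetical and not taken from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count the lines in which "yes" appears as a whole
# word, either ignoring case (/i) or matching case exactly.
sub count_yes {
    my ($ignore_case, @lines) = @_;
    my $n = 0;
    for my $line (@lines) {
        if ($ignore_case) {
            $n++ if $line =~ /\byes\b/i;
        } else {
            $n++ if $line =~ /\byes\b/;
        }
    }
    return $n;
}

# Sample input (an assumption, modelled on the "vote yes" / "vote no" lines).
# "yes" alone puts the word boundary at the very beginning of the string,
# and "yesterday" must never count because no boundary follows "yes".
my @lines = ("vote yes", "yes", "vote YES", "vote yesterday", "vote no");

printf "with /i: %d, without /i: %d\n",
    count_yes(1, @lines), count_yes(0, @lines);
```

On a perl where \b and case-insensitive matching cooperate, this should report 3 with /i and 2 without; any other pair of counts would point at the kind of interaction described in this thread.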
jand@kuling.UUCP (Jan Djärv) (01/18/90)
In article <JV.90Jan15220501@mhres.mh.nl> jv@mh.nl (Johan Vromans) writes:
:In article <2295@ruuinf.cs.ruu.nl> piet@cs.ruu.nl (Piet van Oostrum) writes:
:> It seems that matching word boundaries with \b in regexps doesn't work
:> properly.
:[example deleted]
:
:Knowing Piet* is running perl on HP-UX, I tried a little, and found
:out that:
: - on VAX/Ultrix it behaves like expected
: - on HP-UX it fails.
: - it runs fine on HP-UX if the 'ignore case' spec is removed from the
:   matches:
:     $yes++ if /\byes\b/;
: So I think it has something to do with HP's NLS system (a wild
: guess, but - who knows?)
:
:Johan
Well, I tried the example on our HP 835 running HP-UX 7.0 and it worked
just fine. (N.B. This is not to be taken as a defence of HP-UX. 7.0 was
installed last Friday, and the bugs keep coming in :-( )
Probably there is something more subtle to the error. I agree that
Johan's guess is a wild one; did you forget to put a :-) somewhere, Johan?
Jan D.