[comp.lang.perl] Puzzle solved, with patch.

flee@cs.psu.edu (Felix Lee) (04/03/91)

>Why are the two apparently similar pattern matches doing different

Here's a smaller example that illustrates what's happening:
	$x = 'abc';
	/abc(z|$)/;
	/$x(z|$)/;
Perl complains "/abc(z|/: unmatched () in regexp at line 3".

The difference is the second pattern is a runtime pattern.

do_match() very carefully sets nointrp to "|)" so that "$)" doesn't
get interpreted as a variable in intrpcompile().  But intrpcompile()
doesn't protect it with a "$b" cookie, so interp() very carelessly
removes the "$)", leaving the string "abc(z|".

Below, a patch.  Apply with "patch -p0".

The diff for str.c is against perl 3.044, but can be safely applied to
perl 4.0beta (hunk #1 offset 7 lines, hunk #2 fuzz 2 offset 8 lines).

The new file t/op.regexp1 is a test that exercises this bug/fix.

No warranty, expressed or implied, etc.
--
Felix Lee	flee@cs.psu.edu

*** str.c.old	Sun Feb 10 22:38:51 1991
--- str.c	Wed Apr  3 01:42:21 1991
***************
*** 981,988 ****
  	    ++s;
  	    t = s;
  	}
! 	else if ((*s == '@' || (*s == '$' && !index(nointrp,s[1]))) &&
! 	  s+1 < send) {
  	    str_ncat(str,t,s-t);
  	    t = s;
  	    if (*s == '$' && s[1] == '#' && (isalpha(s[2]) || s[2] == '_'))
--- 981,994 ----
  	    ++s;
  	    t = s;
  	}
! 	else if (*s == '$' && s+1 < send && *nointrp && index(nointrp,s[1])) {
! 	    str_ncat(str, t, s - t);
! 	    str_ncat(str, "$b", 2);
! 	    str_ncat(str, s, 2);
! 	    s += 2;
! 	    t = s;
! 	}
! 	else if ((*s == '@' || *s == '$') && s+1 < send) {
  	    str_ncat(str,t,s-t);
  	    t = s;
  	    if (*s == '$' && s[1] == '#' && (isalpha(s[2]) || s[2] == '_'))
***************
*** 1205,1210 ****
--- 1211,1219 ----
  	if (*s == '$' && s+1 < send) {
  	    str_ncat(str,t,s-t);
  	    switch(*++s) {
+ 	    default:
+ 		fatal("panic: unknown interp cookie\n");
+ 		break;
  	    case 'a':
  		str_scat(str,*++elem);
  		break;
*** /dev/null	Wed Apr  3 01:43:08 1991
--- t/op.regexp1	Wed Apr  3 01:22:00 1991
***************
*** 0 ****
--- 1,17 ----
+ #!./perl
+ 
+ # $Header$
+ 
+ # Buggy perl deletes $| and $) sequences in run-time regexps.
+ 
+ print "1..3\n";
+ 
+ $xyz = 'xyz';
+ 
+ if ("abc" =~ /^abc$|$xyz/) {print "ok 1\n";} else {print "not ok 1\n";}
+ 
+ # buggy perl says "unmatched ()"
+ eval '"abc" =~ /a(bc$)|$xyz/; $result = "$&:$1"';
+ if ($@ eq "") {print "ok 2\n";} else {print "not ok 2\n";}
+ 
+ if ($result eq "abc:bc") {print "ok 3\n";} else {print "not ok 3\n";}