[comp.lang.perl] bug in anchored, multiline pattern match

phillips@cs.ubc.ca (George Phillips) (06/10/90)

I think this bug has already been exposed, but what the hey.  When
processing an anchored, multiline pattern match, perl gets a little
confused if there's a blank line in the search string.  The following
fragment finds a bug instead of a match:

$* = 1;
$block = <<\EOF;
abcdefg

blank line kills the match!
EOF
print $block =~ /^b/ . "\n";

The loop that scans for newlines gets a little excited and increments
the pointer twice when it hits a newline.  Satisfyingly easy to fix:

*** regexec.c	Sun Jun 10 03:15:37 1990
--- regexec.c	Sun Jun 10 03:20:53 1990
***************
*** 208,216 ****
  			/* for multiline we only have to try after newlines */
  			if (s > string)
  			    s--;
! 			for (; s < strend; s++) {
! 			    if (*s == '\n') {
! 				if (++s < strend && regtry(prog, s))
  				    goto got_it;
  			    }
  			}
--- 208,216 ----
  			/* for multiline we only have to try after newlines */
  			if (s > string)
  			    s--;
! 			while (s < strend) {
! 			    if (*s++ == '\n') {
! 				if (s < strend && regtry(prog, s))
  				    goto got_it;
  			    }
  			}

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (06/12/90)

In article <8191@ubc-cs.UUCP> phillips@cs.ubc.ca (George Phillips) writes:
: I think this bug has already been exposed, but what the hey.  When
: processing an anchored, multiline pattern match, perl gets a little
: confused if there's a blank line in the search string.

Actually, this is the first I heard of this one, that I can recall.  Thanks.
It'll be fixed in the next patch.

While we're on the subject of the next patch, I've got things arranged now
that you can link in C libraries to make special versions of Perl.  You do
this by writing a glue routine that translates Perl subroutine calls
to C subroutine calls.  You can easily link in multiple independent
glue routines.  On some machines it might be possible for one of the glue
routines to be a fasl'er, though I haven't written one yet.  I do have
a glue file for BSD curses already.  X windows, anyone?

It's easy to write a glue routine because it's preprocessed from definitions
like:

    CASE int flushok
    I       WINDOW*         win
    I       bool            boolf
    END

In fact, I've even got a script that scans manual pages and spits out
definitions like that above based on the synopsis, presuming it is
reasonably well formed.

Included below is a VERY rudimentary pager using the curses version of
perl.  The help text fibs--it's just ripped out of less to get more than
one page of help text, since I wanted to test recursive calls of the
pager.  Of note is that the pager can call itself recursively just by saying

    local(*lines) = *helplines;
    local(line);
    &pagearray;

Note also the use of an eval to trap any fatal errors and call &endwin
to clean up.

What you are seeing here is the beginnings of a prototype for a new version
of rn.

Larry

#!./curseperl

eval <<'EndOfMain';   $evaloffset = 3;	# line number of this line

    $| = 1;		# command buffering on stdout
    &initterm;
    &inithelp;
    &slurpfile && &pagearray;

EndOfMain

&endwin;

if ($@) {
    print "";		# force flush of stdout
    $@ =~ s/\(eval\)/$0/ && $@ =~ s/line (\d+)/'line ' . ($1 + $evaloffset)/e;
    die $@;
}

exit;

################################################################################

sub initterm {

    &initscr; &cbreak; &noecho; &scrollok($stdscr, 1);
    &defbell unless defined &bell;

    $lines = $LINES; $lines1 = $lines - 1; $lines2 = $lines - 2;
    $cols = $COLS;   $cols1  = $cols  - 1; $cols2  = $cols  - 2;;

    $dl = &getcap('dl');
    $al = &getcap('al');
    $ho = &getcap('ho');
    $ce = &getcap('ce');
}

sub slurpfile {
    while (<>) {
	s/^(\t+)/'        ' x length($1)/e;
	&expand($_) if /\t/;
	if (length($_) < $cols) {
	    push(@lines, $_);
	}
	else {
	    while ($_ && $_ ne "\n") {
		push(@lines, substr($_,0,$cols));
		substr($_,0,$cols) = '';
	    }
	}
    }
    1;
}

sub drawscreen {
    &move(0,0);
    for ($line .. $line + $lines2) {
	&addstr($lines[$_]);
    }
    &clrtobot;
    &percent;
    &refresh;
}

sub expand {
    while (($off = index($_[0],"\t")) >= 0) {
	substr($_[0], $off, 1) = ' ' x (8 - $off % 8);
    }
}

sub pagearray {
    $line = 0;

    $| = 1;

    for (&drawscreen;;&drawscreen) {

	$ch = &getch;
	$ch = "j" if $ch eq "\n";

	if ($ch eq ' ') {
	    last if $percent >= 100;
	    &move(0,0);
	    $line += $lines1;
	}
	elsif ($ch eq 'b') {
	    $line -= $lines1;
	    &move(0,0);
	    $line = 0 if $line < 0;
	}
	elsif ($ch eq "j") {
	    $line += 1;
	    if ($dl) {
		print $ho, $dl;
		&mvcur(0,0,$lines2,0);
		print $ce,$lines[$line+$lines2],$ce;
		&wmove($curscr,0,0);
		&wdeleteln($curscr);
		&wmove($curscr,$lines2,0);
		&waddstr($curscr,$lines[$line+$lines2]);
	    }
	    &wmove($stdscr,0,0);
	    &wdeleteln($stdscr);
	    &wmove($stdscr,$lines2,0);
	    &waddstr($stdscr,$lines[$line+$lines2]);
	    &percent;
	    &refresh;
	    redo;
	}
	elsif ($ch eq "k") {
	    next if $line <= 0;
	    $line -= 1;
	    if ($al) {
		print $ho, $al, $ce, $lines[$line];
		&wmove($curscr,0,0);
		&winsertln($curscr);
		&waddstr($curscr,$lines[$line]);
	    }
	    &wmove($stdscr,0,0);
	    &winsertln($stdscr);
	    &waddstr($stdscr,$lines[$line]);
	    &percent;
	    &refresh;
	    redo;
	}
	elsif ($ch eq "\f") {
	    &clear;
	}
	elsif ($ch eq "q") {
	    last;
	}
	elsif ($ch eq "h") {
	    &clear;
	    &help;
	    &clear;
	}
	else {
	    &bell;
	}
    }
}

sub defbell {
    eval q#
	sub bell {
	    print "\007";
	}
    #;
}

sub help {
    local(*lines) = *helplines;
    local($line);
    &pagearray;
}

# This help message is borrowed from the "less" program.

sub inithelp {
    @helplines = split(/\n/,<<'EOT');

      Commands marked with * may be preceeded by a number, N.

  h              Display this help.
  q              Exit.

  f, SPACE    *  Forward  N lines, default one screen.
  b           *  Backward N lines, default one screen.
  e, j, CR    *  Forward  N lines, default 1 line.
  y, k        *  Backward N lines, default 1 line.
  d           *  Forward  N lines, default 10 or last N to d or u command.
  u           *  Backward N lines, default 10 or last N to d or u command.
  r              Repaint screen.
  R              Repaint screen, discarding buffered input.

  /pattern    *  Search forward for N-th line containing the pattern.
  ?pattern    *  Search backward for N-th line containing the pattern.
  n           *  Repeat previous search (for N-th occurence).

  g           *  Go to line N, default 1.
  G           *  Like g, but default is last line in file.
  p, %        *  Position to N percent into the file.
  m<letter>      Mark the current position with <letter>.
  '<letter>      Return to a previously marked position.
  ''             Return to previous position.

  E [file]       Examine a new file.
  N           *  Examine the next file (from the command line).
  P           *  Examine the previous file (from the command line).
  =              Print current file name.
  V              Print version number of "less".

  -<flag>        Toggle a command line flag.
  +cmd           Execute the less cmd each time a new file is examined.

  !command       Passes the command to a shell to be executed.
  v              Edit the current file with $EDITOR.
EOT
    for (@helplines) {
	s/$/\n/;
    }
}

sub percent {
    &standout;
      $percent = int(($line + $lines1) * 100 / @lines);
      &move($lines1,0);
      &addstr("($percent%)");
    &standend;
    &clrtoeol;
}