[gnu.emacs.bug] re-search-backward on ".*"

stampe@UHCCUX.UHCC.HAWAII.EDU (David Stampe) (12/13/89)

In GNU Emacs 18.51.3 of Mon Jun  6 1988 on uhccux (berkeley-unix)

(re-search-backward ".*") with the cursor after a line of chars
doesn't move at all, i.e. it matches the MINIMAL instead of maximal
match.  (It isn't merely going to the end of the match: "..*" will
move the cursor back one character.)  The *, not the ., is the bug;
there is a similar minimal move for " *", for example.

Similarly, typing ESC C-s .* C-r C-r C-r C-r doesn't move the cursor.

In contrast, (re-search-forward ".*") with the cursor before a line of
chars moves over the entire line.  And so does ESC C-s .*.

This is so bad a bug I can't understand why I haven't seen it before.
Am I dreaming?

kjones@talos.uu.net (Kyle Jones) (12/14/89)

David Stampe writes:
 > In GNU Emacs 18.51.3 of Mon Jun  6 1988 on uhccux (berkeley-unix)
 > 
 > (re-search-backward ".*") with the cursor after a line of chars
 > doesn't move at all, i.e. it matches the MINIMAL instead of maximal
 > match. [...]  This is so bad a bug I can't understand why I haven't
 > seen it before.  Am I dreaming?

You're not dreaming; I had similar difficulties when I was trying to get
VM's digest parser to skip backward over consecutive message delimiters.
I was using the `+' operator.  I eventually had to use a while loop
because I couldn't persuade re-search-backward to match more than one
delimiter per call.
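
A rough Python sketch of that workaround, for illustration only. It models a backward search as "step back one position, try a forward match", with the match not allowed to run past the starting point (both are assumptions about the behaviour, not the Emacs source). The delimiter "---\n" and the function names are hypothetical; VM's real delimiter is not shown in this thread.

```python
import re

def search_backward_once(regex, text, point):
    """Model one backward search: step back from point, try a forward match
    at each position, and never let the match extend past point."""
    for start in range(point, -1, -1):
        match = regex.match(text, start, point)
        if match is not None:
            return match
    return None

def skip_delimiters_backward(text, point, delimiter=r"---\n"):
    """Skip every consecutive delimiter that ends at point, one per search."""
    regex = re.compile(delimiter)
    while True:
        match = search_backward_once(regex, text, point)
        # Stop when nothing matches, when the match is not adjacent to point,
        # or when the match is empty (which would otherwise loop forever).
        if match is None or match.end() != point or match.start() == point:
            return point
        point = match.start()

digest = "body\n---\n---\n---\n"
# A single modelled search with `+' still covers only the last delimiter.
one_call = search_backward_once(re.compile(r"(?:---\n)+"), digest, len(digest))
```

In this model the single call with `+' stops at the start of the last delimiter, while the loop walks back over all three and leaves point just after "body\n".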

It's almost as if re-search-backward works by moving backward a
character at a time and calling re-search-forward with the appropriate
arguments, although I'm sure that's not quite the way it's done.
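
The model guessed at above can be written down in a few lines of Python. This is a sketch of the hypothesis only, not of the Emacs C code; the name `re_search_backward_model` and the sample text are hypothetical, and bounding the match at the starting point is an assumption.

```python
import re

def re_search_backward_model(pattern, text, point):
    """Return the new point under the 'step back, then match forward' model.

    Starting at point and moving back one position at a time, try a forward
    match that may not extend past the original point. The first position
    where the match succeeds becomes the new point; None means no match.
    """
    regex = re.compile(pattern)
    for start in range(point, -1, -1):
        if regex.match(text, start, point) is not None:
            return start
    return None

line = "hello world"
end = len(line)

stays_put = re_search_backward_model(".*", line, end)     # empty match at point
back_one = re_search_backward_model("..*", line, end)     # needs one character
spaces = re_search_backward_model(" *", line, end)        # empty match again
forward_end = re.compile(".*").match(line, 0).end()       # forward search is greedy
```

Under this model ".*" succeeds at once with an empty match at point, so the cursor does not move, while "..*" has to step back one character before it can match. The forward match, by contrast, consumes the whole line. That is the same pattern of results as in the original report.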

peck@SUN.COM (Jeff Peck) (12/15/89)

>It's almost as if re-search-backward works by moving backward a
>character at a time and calling re-search-forward with the appropriate
>arguments, although I'm sure that's not quite the way it's done.

I looked at this issue of "reverse-regexp-search" quite a bit.
My conclusion was that in addition to the limited functionality
provided by the current re-search-backward (which really does
move back point by point and then look forward), what you need
is a reverse-regexp-search.
  Our Common-Lisp Emacs provides such a feature.  The big difference is that
you must write your regular expression as seen from the BACK!  Once over that
hurdle, you just teach the regexp processor to index the character buffer in
reverse, and Voila!  you can search backward for the longest section
matching the regexp...
 [actually, instead of teaching the regexp engine to go backwards,
  a temporary buffer is created which contains the buffer segment 
  to be searched, and that temporary buffer is reversed.]
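
The reversed-buffer idea can be sketched in Python along these lines. This is an illustration under stated assumptions, not the Common-Lisp Emacs code: the function name `reverse_regexp_search` is hypothetical, and the caller is assumed to supply the regexp already written as seen from the back.

```python
import re

def reverse_regexp_search(reversed_pattern, text, point):
    """Search backward from point using a reversed copy of the text.

    reversed_pattern is the regexp as seen from the back (e.g. "ba+" to find
    "a+b"). Returns (start, end) in the original text, or None if no match.
    """
    # Temporary reversed copy of the segment before point.
    reversed_segment = text[:point][::-1]
    match = re.compile(reversed_pattern).search(reversed_segment)
    if match is None:
        return None
    # Offsets in the reversed copy count backward from point.
    return (point - match.end(), point - match.start())

at_point = reverse_regexp_search("ba+", "x = aaab", 8)
before_point = reverse_regexp_search("ba+", "aaab tail", 9)
whole_line = reverse_regexp_search(".*", "hello world", 11)
```

Because the ordinary forward engine runs over the reversed copy, its greedy operators extend the match as far back as possible in the original text, so ".*" here covers the whole line rather than matching empty at point.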

You can probably create this functionality from the primitives already
available to elisp (might want to use C code to create the tmp buffer
with chars reversed).  Maybe someday the C regexp engine will be extended
to actually go backwards.

Maybe somebody could get really creative and actually parse a regexp
and reverse it automatically (this is easier when the regexp is in
functional notation, rather than in string format...)