[comp.emacs] Function needed: is point within regexp?

shapiro@blueberry.inria.fr (Marc Shapiro) (07/07/88)

I am hacking a new, much-improved bibtex-mode for GNU Emacs.  For
this, I need a function which checks if point is *within* a certain
regular expression.

For instance, if the regexp is "^.*$" the function should return t,
and (match-beginning 0) and (match-end 0) should return the location
of the beginning and end of the current line, respectively.  If the
regexp is "ab[aA]ba" and point is on the "c" of "zzabAbac" then the
function should return an error.  If point is on the "A", then it
should return t, and (match-beginning 0) and (match-end 0) should
return the locations of the first "a" and of "c" respectively.

Neither looking-at, nor re-search-forward, nor re-search-backward fills
the bill.

The function bibtex-enclosing-regexp enclosed below does the trick,
but in a very stupid way.  It moves backwards by an arbitrary number
of characters, and then repeatedly calls re-search-forward until a
match is found, and its boundaries enclose the original point.

Surely there is a better way to do this!  Any ideas?


						Marc Shapiro

INRIA, B.P. 105, 78153 Le Chesnay Cedex, France.  Tel.: +33 (1) 39-63-53-25
e-mail: shapiro@inria.inria.fr or: ...!mcvax!inria!shapiro


---- cut here --------------------------

(defvar bibtex-search-limit 1000
  "Search for the beginning and end of the current bibtex entry is limited to so many chars from point.")

(defun bibtex-enclosing-regexp (regexp)
  "Search for REGEXP enclosing point.
Point does not move; use match-beginning and match-end to learn the
extent of the regexp.

[Doesn't something like this exist already?]"

  (interactive "sRegexp: ")
  (let ((where (point))
	(right (min (point-max)
		  (+ (point) bibtex-search-limit)))
	(left (max (point-min)
		  (- (point) bibtex-search-limit))))
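    (save-excursion
      (goto-char left)
      ;; NOERROR is nil, so a failed search signals search-failed by itself.
      (re-search-forward regexp right nil 1)
      (if (> (match-beginning 0) where)
          (signal 'search-failed (list regexp)))
      ;; Skip matches that end at or before the original point.
      (while (<= (match-end 0) where)
        (re-search-forward regexp right nil 1)
        (if (> (match-beginning 0) where)
            (signal 'search-failed (list regexp))))
      ;; The last match encloses point; return t as described above.
      t)))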

--- the end ----------------------------

PS.  bibtex-enclosing-regexp could be made better by adding optional
arguments, for "left" and "right" limits, and by allowing it to return
nil if the regexp is not found, instead of signalling.  However, as
is, it works.


shapiro@inria.inria.fr (Marc Shapiro) (07/07/88)

I am hacking a new, much-improved bibtex-mode for GNU Emacs.  For
this, I need a function which checks if point is *within* a certain
regular expression.

For instance, if the regexp is "^.*$" the function should return t,
and (match-beginning 0) and (match-end 0) should return the location
of the beginning and end of the current line, respectively.  If the
regexp is "ab[aA]bc" and point is on the "d" of "zzabAbcd" then the
function should return an error.  If point is on the "A", then it
should return t, and (match-beginning 0) and (match-end 0) should
return the locations of the "a" and "d" respectively.

Neither looking-at, nor re-search-forward, nor re-search-backward fills
the bill.

The function bibtex-enclosing-regexp enclosed below does the trick,
but in a very stupid way.  It moves backwards by an arbitrary number
of characters, and then repeatedly calls re-search-forward until a
match is found, and its boundaries enclose the original point.

Surely there is a better way to do this!  Any ideas?


						Marc Shapiro

INRIA, B.P. 105, 78153 Le Chesnay Cedex, France.  Tel.: +33 (1) 39-63-53-25
e-mail: shapiro@inria.inria.fr or: ...!mcvax!inria!shapiro


---- cut here --------------------------

(defvar bibtex-search-limit 1000
  "Search for the beginning and end of the current bibtex entry is limited to so many chars from point.")

(defun bibtex-enclosing-regexp (regexp)
  "Search for REGEXP enclosing point.
Point does not move; use match-beginning and match-end to learn the
extent of the regexp.

[Doesn't something like this exist already?]"

  (interactive "sRegexp: ")
  (let ((where (point))
	(right (min (point-max)
		  (+ (point) bibtex-search-limit)))
	(left (max (point-min)
		  (- (point) bibtex-search-limit))))
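    (save-excursion
      (goto-char left)
      ;; NOERROR is nil, so a failed search signals search-failed by itself.
      (re-search-forward regexp right nil 1)
      (if (> (match-beginning 0) where)
          (signal 'search-failed (list regexp)))
      ;; Skip matches that end at or before the original point.
      (while (<= (match-end 0) where)
        (re-search-forward regexp right nil 1)
        (if (> (match-beginning 0) where)
            (signal 'search-failed (list regexp))))
      ;; The last match encloses point; return t as described above.
      t)))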

--- the end ----------------------------

PS.  bibtex-enclosing-regexp could be made better by adding optional
arguments, for "left" and "right" limits, and by allowing it to return
nil if the regexp is not found, instead of signalling.  However, as
is, it works.

maa@sei.cmu.edu (Mark Ardis) (07/08/88)

The posted solution will not always work.  Consider the regular
expression "a.b" and sample string "aabb".  If point is initially
located between the two "b"s, then the function posted will return an
error.  The bug is that point is advanced to the *end* of each match
after trying to match, rather than advancing one character.
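
A minimal sketch of the "advance one character" idea, not a tested
replacement for the posted function.  The name my-enclosing-regexp and
the optional LIMIT argument are made up for illustration.  It tries
looking-at at every start position from LIMIT characters before point
up to point, keeps the same enclosure test as the posted code (the
match must end after point), and returns nil instead of signalling.
It only considers the one match that looking-at reports at each start
position.

```elisp
;; Hypothetical name; not part of the posted bibtex-mode code.
(defun my-enclosing-regexp (regexp &optional limit)
  "Return t if a match of REGEXP encloses point, else nil.
Every start position from LIMIT chars before point (default 1000) up
to point is tried, so an overlapping candidate such as \"abb\" inside
\"aabb\" is not skipped.  Point does not move.  On success the match
data describes the enclosing match."
  (let* ((where (point))
         (pos (max (point-min) (- where (or limit 1000))))
         (found nil))
    (save-excursion
      (while (and (not found) (<= pos where))
        (goto-char pos)
        (if (and (looking-at regexp)
                 (> (match-end 0) where))
            (setq found t)
          ;; Advance by one character, not to the end of the match.
          (setq pos (1+ pos)))))
    found))
```

With "aabb" in the buffer and point between the two "b"s, the start
position of the first "a" yields "aab", which ends at point and is
rejected; the next start position yields "abb", which encloses point.
The price is one looking-at call per character scanned.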

Mark A. Ardis
Software Engineering Institute
Carnegie-Mellon University
Pittsburgh, PA 15213
(412) 268-7636
maa@sei.cmu.edu

Ram-Ashwin@cs.yale.edu (Ashwin Ram) (07/08/88)

In article <8807070659.AA07832@blueberry.inria.fr>, shapiro@inria (Marc Shapiro) writes:
> I am hacking a new, much-improved bibtex-mode for GNU Emacs.  For
> this, I need a function which checks if point is *within* a certain
> regular expression.
> 
> The function bibtex-enclosing-regexp enclosed below does the trick,
> but in a very stupid way.  It moves backwards by an arbitrary number
> of characters, and then repeatedly calls re-search-forward until a
> match is found, and its boundaries enclose the original point.
> 
> Surely there is a better way to do this!  Any ideas?

You can do better than this by --

- searching backward for a character that matches the first "thing" in the
  regexp (since any match must start at such a character)
- checking if you're looking-at the regexp, and, if so, if the boundaries of the
  match enclose the original point
- and repeating this until you're done.

The first step is a little tricky since the "first thing in the regexp" could be
a wild card or a disjunction, so you'll have to write code that will pick this
out by pseudo-parsing the regexp string.  If you don't want to write the general
function, it should be pretty easy to do this for the bibtex case.
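
The steps above might be sketched as follows.  This is only an
illustration under stated assumptions: the function name and the
FIRST-REGEXP and LIMIT arguments are invented here, and FIRST-REGEXP is
supplied by hand (no parsing of the regexp is attempted) and is assumed
to match exactly one character, e.g. "a" or "[@{]".

```elisp
;; Hypothetical helper; FIRST-REGEXP and LIMIT are assumptions of this sketch.
(defun my-enclosing-regexp-backward (regexp first-regexp &optional limit)
  "Return t if a match of REGEXP starting at or before point encloses point.
FIRST-REGEXP must match exactly one character: whatever a match of
REGEXP can begin with.  Candidate starts are tried nearest first,
going back at most LIMIT chars (default 1000).  Point does not move.
On success the match data describes the enclosing match."
  (let* ((where (point))
         (left (max (point-min) (- where (or limit 1000))))
         (found nil))
    (save-excursion
      ;; A backward search only finds matches ending at or before point,
      ;; so step over one char to let a candidate start exactly at WHERE.
      (or (eobp) (forward-char 1))
      ;; Each successful search leaves point at the candidate start,
      ;; strictly before the previous one, so the loop terminates.
      (while (and (not found)
                  (re-search-backward first-regexp left t))
        (if (and (looking-at regexp)
                 (> (match-end 0) where))
            (setq found t))))
    found))
```

For the bibtex case one might call it as
(my-enclosing-regexp-backward some-entry-regexp "@"), where
some-entry-regexp stands for whatever regexp describes an entry.
Because candidates are tried nearest first, it reports the enclosing
match that starts closest to point when several overlap.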

-- Ashwin.