RSS%CALSTATE.bitnet@vm.usc.edu (Richard S. Smith) (05/01/91)
I get the feeling there's no good answer to this question, but I am asking it anyway...

Is there a SIMPLE, NON-PAINFUL way to set up a regular expression so that it will match a given string only when it occurs as a word, i.e., delimited by non-alphanumeric characters or by line boundaries? In other words, I am looking for a simple, generalized way to find "foo" when it occurs as "foo bar" or "foo-bar" or "foo: bar" but NOT as "foobar".

I am hoping there is a simpler answer than:

    "[^A-Za-z0-9]foo[^A-Za-z0-9]"

Thanks to anyone who can help.

Richard Smith - RSS@CALSTATE.BITNET
jik@athena.mit.edu (Jonathan I. Kamens) (05/01/91)
In article <26716@adm.brl.mil>, RSS%CALSTATE.bitnet@vm.usc.edu (Richard S. Smith) writes:
|> Is there a SIMPLE, NON-PAINFUL way to set up a regular expression so
|> that it will match a given string only when it occurs as a word, i.e.,
|> delimited by non-alphanumeric characters or by line boundaries?

It's difficult to answer this question unless you say what utility you intend to use the regular expression with.

For example, with emacs (and possibly with ex and vi, I'm not sure), you can use "\<" and "\>" to delimit a word in a regular expression. With "grep", you can use the "-w" argument to tell it to look for words only. With "perl", you can use "\b" to signify a word boundary.

-- 
Jonathan Kamens                 USnail:
MIT Project Athena              11 Ashford Terrace
jik@Athena.MIT.EDU              Allston, MA 02134
Office: 617-253-8085            Home: 617-782-0710
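
To make the "\b" suggestion concrete, here is a minimal sketch in Python, whose "re" module provides the same "\b" word-boundary assertion as perl. The names WORD_FOO, ALNUM_FOO, and matches are made up for illustration. Note one assumption worth checking against the original question: "\b" treats the underscore as a word character, so "foo_bar" does not match. The second pattern uses lookarounds to get the strictly "non-alphanumeric or line boundary" behaviour asked for.

```python
import re

# \b asserts a word boundary; "word" characters are letters, digits, and "_".
WORD_FOO = re.compile(r"\bfoo\b")

# Stricter variant (an assumption about what the question wants): "foo" must
# not be preceded or followed by an ASCII letter or digit. The lookarounds
# consume no characters, so they also succeed at the start or end of a line.
ALNUM_FOO = re.compile(r"(?<![A-Za-z0-9])foo(?![A-Za-z0-9])")


def matches(regex, text):
    """Return True if the compiled regex matches anywhere in text."""
    return regex.search(text) is not None


samples = ["foo bar", "foo-bar", "foo: bar", "foobar", "foo", "foo_bar"]
for text in samples:
    print(text, "| word boundary:", matches(WORD_FOO, text),
          "| alphanumeric only:", matches(ALNUM_FOO, text))
```

By contrast, the pattern from the question, "[^A-Za-z0-9]foo[^A-Za-z0-9]", needs an actual character on each side, so it misses "foo" at the very start or end of a line.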
guy@auspex.auspex.com (Guy Harris) (05/02/91)
> With "grep", you can use the "-w" argument to tell it to look for words only.
Well, with some flavors of "grep", anyway. I think Berkeley introduced
the "-w" flag; "-w" is shorthand for "stick a \< and a \> around the
pattern", and the BSD "grep" supports the "\<" and "\>" items as well.
S5's standard regular expression package doesn't support "\<" and "\>"
prior to S5R4, although we added them in SunOS 4.0; "ed", "grep", and
various other programs use them. S5R4's standard regular expression
package does, I think, support them.
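
As a rough illustration of the "-w is shorthand for wrapping the pattern" idea, here is a sketch in Python. Python's "re" module has no "\<" and "\>", so this assumes "\b" on both sides as a stand-in; the two agree when the pattern itself begins and ends with a word character. The function name grep_w is hypothetical, and this is not a claim about how any particular grep implements the flag.

```python
import re


def grep_w(pattern, lines):
    """Return the lines in which pattern matches as a whole word.

    The pattern is wrapped in word-boundary assertions, with a
    non-capturing group so that alternations such as "foo|bar"
    are bounded as a unit.
    """
    word_re = re.compile(r"\b(?:" + pattern + r")\b")
    return [line for line in lines if word_re.search(line)]


lines = ["foo bar", "foobar", "a foo-bar", "barfoo"]
print(grep_w("foo", lines))  # prints ['foo bar', 'a foo-bar']
```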