[comp.unix.questions] Word-oriented GREP

nelson@berlioz.nsc.com (Taed Nelson) (04/15/91)

When I use the command "grep V\[0-9\]\[0-9\]\[0-9\] fred.c" it returns
	#define VERSION "V002"
  or somesuch.  What I would really like is just the string of characters
  which matched:
	V002

I thought about it for a while, and I couldn't come up with anything;  even
  AWK seems to offer no nice way of doing it, but this seems like something
  that is at least somewhat common...

Does anyone have any suggestions, preferably limiting the solution to SH or
  CSH, and by not using uncommon tools such as PERL?

Thanks a lot!

pfalstad@phoenix.Princeton.EDU (Paul Falstad) (04/15/91)

nelson@berlioz.nsc.com (Taed Nelson) wrote:
>When I use the command "grep V\[0-9\]\[0-9\]\[0-9\] fred.c" it returns
>  #define VERSION "V002"
>  or somesuch.  What I would really like is just the string of characters
>  which matched:
>  V002
>I thought about it for a while, and I couldn't come up with anything;  even
>  AWK seems to offer no nice way of doing it, but this seems like something
>  that is at least somewhat common...

It would seem so, but this is the simplest thing I could come up with:

sed -n 's/.*V\([0-9][0-9][0-9]\).*/V\1/p' fred.c

or (better, I think):

grep 'V[0-9][0-9][0-9]' fred.c | sed 's/.*V\([0-9][0-9][0-9]\).*/V\1/'

Doesn't work for multiple occurrences of Vxxx though.
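
One possible way around that limitation, staying with sed and grep, is to
put every match on a line of its own first and then keep only those lines.
This is a rough sketch rather than a tested recipe for every system: the
function name print_versions is made up for illustration, and the
backslash-newline in the replacement assumes a POSIX-style sed.

```shell
# print_versions is a hypothetical helper name, not part of any standard tool.
# It reads the named files, or standard input if no file is given.
print_versions() {
    # Step 1: surround each Vddd match with newlines.  A backslash followed
    #         by a real newline is how POSIX sed writes a newline in the
    #         replacement text.
    # Step 2: keep only the lines that consist of exactly one match.
    sed 's/V[0-9][0-9][0-9]/\
&\
/g' "$@" | grep '^V[0-9][0-9][0-9]$'
}
```

Called as "print_versions fred.c", this should print each Vxxx on a separate
line, including several from the same input line.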

I seem to remember something out of the perl man page which makes this
really simple.  Someone will post it perhaps.

--
              Paul Falstad  pfalstad@phoenix.princeton.edu
         And on the roads, too, vicious gangs of KEEP LEFT signs!
     If Princeton knew my opinions, they'd have expelled me long ago.

fitz@mml0.meche.rpi.edu (Brian Fitzgerald) (04/15/91)

Taed Nelson writes:
>
>When I use the command "grep V\[0-9\]\[0-9\]\[0-9\] fred.c" it returns
>	#define VERSION "V002"
>  or somesuch.  What I would really like is just the string of characters
>  which matched:
>	V002

Try:

sed -n -e 's/.*\(V[0-9][0-9][0-9]\).*/\1/p'

(use \[ and \] in csh)

This just gets one occurrence per line, though (the last one, because the
leading .* is greedy).  Wizards know
how to print more than one occurrence with shell commands, even if they
are not separated by white space. (i.e. V001,V002,...)
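
One non-wizard way to do this with ordinary shell tools is a small awk
loop, sketched here under the assumption that the local awk is a "new awk"
(nawk or any POSIX awk) that provides match(), RSTART and RLENGTH; very old
awks do not.  The function name extract_all is made up for illustration.

```shell
# extract_all is a hypothetical helper name.
# It reads the named files, or standard input if no file is given.
extract_all() {
    awk '{
        line = $0
        # match() sets RSTART and RLENGTH for the leftmost match in line.
        while (match(line, /V[0-9][0-9][0-9]/)) {
            print substr(line, RSTART, RLENGTH)
            # Continue searching in the text after this match.
            line = substr(line, RSTART + RLENGTH)
        }
    }' "$@"
}
```

Called as "extract_all fred.c", this should print every match on its own
line, whether or not the matches are separated by white space.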

Since I'm not a wizard, I tried lex:

%%
V[0-9][0-9][0-9]	printf("%s\n",yytext);
.	|
\n	;

Save as foo.l and type "make LDLIBS=-ll foo" (or "lex -t foo.l >
foo.c ; cc -o foo foo.c -ll")

Brian

merlyn@iwarp.intel.com (Randal L. Schwartz) (04/15/91)

In article <1991Apr15.014626.28903@berlioz.nsc.com>, nelson@berlioz (Taed Nelson) writes:
| When I use the command "grep V\[0-9\]\[0-9\]\[0-9\] fred.c" it returns
| 	#define VERSION "V002"
|   or somesuch.  What I would really like is just the string of characters
|   which matched:
| 	V002
| I thought about it for a while, and I couldn't come up with anything;  even
|   AWK seems to offer no nice way of doing it, but this seems like something
|   that is at least somewhat common...
| Does anyone have any suggestions, preferably limiting the solution to SH or
|   CSH, and by not using uncommon tools such as PERL?

It's such a natural task for Perl, as in:

perl -ne 'print "$&\n" if /V\d\d\d/' fred.c

If you have multiple occurrences, you can do the slightly more arcane:

perl -ne 's/V\d\d\d/print "$&\n"/eg' fred.c

By the way, if you find Perl to be "uncommon", I suggest you make
yourself more aware of it.  It's everywhere these days.  For example,
there exists at least one computer manufacturer that has it installed
at sysgen time, and it's part of the GNU utils tape.  And because it's
free, and highly (even overly :-) portable, you can put it up yourself
if necessary.

print "Just another Perl hacker,"
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Intel: putting the 'backward' in 'backward compatible'..."====/