[comp.lang.misc] Using variable for awk regexs

lfk@E40-008-6.athena.mit.edu (Lee F Kolakowski) (06/10/90)

I have a small problem in awk (gawk specifically), and I hope this is
the right place for this posting (I couldn't find a comp.lang.awk,
maybe there should be one :-) ).

I have a datafile that looks like this:

id	regex
PS0TEST	^MAQQWSLQRLA
PS1TEST	^M
PS2TEST	MAQQWSLQRLA$
PS00001	N[^P][ST][^P]
PS00002	SG[A-Z]G

I read this info into two arrays (id[] and regex[]).
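
A minimal sketch of that loading step, assuming the file is
tab-separated with a single header line. The temporary file and the
shell variable "loaded" are made up for illustration; id[], regex[]
and regex_num are the names used in this post.

```shell
# Hypothetical sample data; the real file name is not given in the post.
datafile=$(mktemp)
printf 'id\tregex\nPS0TEST\t^MAQQWSLQRLA\nPS1TEST\t^M\nPS00002\tSG[A-Z]G\n' > "$datafile"

loaded=$(awk -F'\t' '
    NR == 1 { next }            # skip the "id regex" header line
    {
        regex_num++
        id[regex_num]    = $1
        regex[regex_num] = $2   # stored as plain pattern text
    }
    END {
        # Print the arrays back out to confirm what was loaded.
        for (i = 1; i <= regex_num; i++)
            printf("%s -> %s\n", id[i], regex[i])
    }
' "$datafile")
echo "$loaded"
rm -f "$datafile"
```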

All is fine up to this point.

Then I want to loop through all the regexes and match each one against
a string:

    if ($0 ~ /^M/)
      printf("matches pattern '%s'\n", "test")

    for (i = 1; i <= regex_num; i++) {
      if ($0 ~ regex[i]) {
        printf("matches pattern '%s'\n", id[i])
      }
    }

So the first case works as it should (the pattern /^M/),
but the loop does not match anything. This appears to be
because the variable regex[i] is not being expanded
before the comparison, so every time $0 is being
compared to the string regex[i].

Is there a workaround for this? I can't seem to find any
mention of this sort of thing in "The AWK Programming Language".



--

Frank Kolakowski 

======================================================================
|lfk@hx.lcs.mit.edu                     ||      Lee F. Kolakowski    |
|kolakowski@tropicana.mit.edu           ||	M.I.T.		     |
|lfk@mbio.med.upenn.edu		        ||	Dept of Chemistry    |
|AT&T:  1-617-253-1866                  ||	Room 18-506	     |
|#include <litigate.h>         		||	77 Massachusetts Ave.|
|                                    	||	Cambridge, MA 02139  |
|--------------------------------------------------------------------|
|		           One-Liner Here!                           |
======================================================================

lfk@e40-008-5.athena.mit.edu (Lee F Kolakowski) (06/11/90)

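I fixed my problem with the code. I was defining the variables with
forward slashes around them. That is apparently a no-no: when a string
is used as the pattern, the slashes are matched as literal characters
rather than treated as delimiters.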
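
A small sketch of the difference, assuming a single test line as
input. The "SLASHED" entry and the shell variable "result" are
hypothetical, added only to show the mistake; the other two patterns
come from the data file in the first post.

```shell
result=$(printf 'MAQQWSLQRLA\n' | awk '
    BEGIN {
        # Pattern text only, as read from the data file (no slashes).
        id[1] = "PS1TEST"; regex[1] = "^M"
        id[2] = "PS2TEST"; regex[2] = "MAQQWSLQRLA$"
        # Hypothetical entry showing the mistake: the slashes are taken literally.
        id[3] = "SLASHED"; regex[3] = "/^M/"
        regex_num = 3
    }
    {
        # A string on the right of ~ is used as a regex at run time.
        for (i = 1; i <= regex_num; i++)
            if ($0 ~ regex[i])
                printf("matches pattern %s\n", id[i])
    }
')
echo "$result"
```

The slashed entry cannot match here because the input line contains
no "/" character, while the two plain strings behave like the literal
/^M/ test.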

One remaining problem is the size of the regex data (340 patterns).
Gawk runs out of memory and quits with a segmentation fault, but works
fine with the file split into two parts.

Thanks to bwk@research.att.com and J. Kingdom@prep.ai.mit.edu

--

Frank Kolakowski 
