[comp.lang.perl] Parens in a reg exp

Mike.McManus@FtCollins.NCR.com (Mike McManus) (09/12/90)

I'm reading tokens in from a file and doing compares on the token, sort of
like:

@toklist = ("VDD", "GND", "IN", "OUT", "INOUT");

while(<>) {
    if( do IS_TERM()) {
    ...
    }
}

sub IS_TERM() {
    $token = (split)[1];
    if( join( " ", @toklist) !~ /$token/) {return(0);}
    foreach $i (@toklist) {
        if( $i eq $token) {
            .... 
            return(1);
        }
    }
    return(0);
}

# Warning!  Not the REAL perl code, but a reasonable facsimile thereof...

The problem I run into is when $token is something containing a metacharacter.
Specifically, things die when $token = "IF(".  Anybody got any ideas about a
work around?  I tried some things, but am a green enuff novice that I didn't
have much luck.

NOTE:  I use the 'if(join...) return 0' as a quick and dirty check, before
launching into a 'foreach $i (@toklist)', so as to save time.  Will it?  If not,
I can forget about it.  This would solve my problem, too.

Thanks!
--
Disclaimer: All spelling and/or grammar in this document are guaranteed to be
            correct; any exseptions is the is wurk uv intter-net deemuns,.

Mike McManus                        Mike.McManus@FtCollins.NCR.COM, or
NCR Microelectronics                ncr-mpd!mikemc@ncr-sd.sandiego.ncr.com, or
2001 Danfield Ct.                   uunet!ncrlnk!ncr-mpd!garage!mikemc
Ft. Collins,  Colorado              
(303) 223-5100   Ext. 378
                                    

Mike.McManus@FtCollins.NCR.com (Mike McManus) (09/12/90)

In article <MIKE.MCMANUS.90Sep12091652@mustang.FtCollins.NCR.com> Mike.McManus@FtCollins.NCR.com (Mike McManus) writes:
> I'm reading tokens in from a file and doing compares on the token, sort of
> like:
...
> The problem I run into is when $token is something containing a metacharacter.
> Specifically, things die when $token = "IF(".  Anybody got any ideas about a
> work around?  I tried some things, but am a green enuff novice that I didn't
> have much luck.

I am having similar problems when comparing tokens that contain other meta
characters (such as brackets).  Right now, I am using a rather kludgy work
around, a la:

$a = "row[4].gndbus";
$b = "row[4]";

$b =~ s/([\.\[\]\(\)])/\\$1/g;
if( $a =~ /$b/) {
...
}

This seems to work, but it (1) changes $b, and (2) is fairly clumsy.  I'm
looking for better solutions!  Something elegant would be nice, such as:

if( $a =~ /$b/l) {
...
}

where the "l" operator would denote that $b should be taken as a literal
(assume meta characters are literal instead, equiv. to \x).  Isn't this kind of
what the "e" operator does for substitute?  Perl didn't like it when I tried
it here!  So throw me some ideas, I'm all ears!
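For readers on a later Perl: a minimal sketch of the "take $b literally" idea, assuming Perl 5, where the \Q...\E escape and the quotemeta function are available. The variables are renamed to $name and $prefix (my names, not the poster's) because $a and $b are reserved for sort. The index() check at the end is an alternative that avoids the regex engine entirely; it is shown only for comparison.

```perl
use strict;
use warnings;

my $name   = "row[4].gndbus";
my $prefix = "row[4]";

# \Q...\E quotes metacharacters for this one match; $prefix itself is untouched.
my $literal_hit = ($name =~ /\Q$prefix\E/) ? 1 : 0;

# Unquoted, [4] is a character class, so the pattern looks for "row4".
my $raw_hit = ($name =~ /$prefix/) ? 1 : 0;

# quotemeta returns an escaped copy and leaves its argument alone.
my $escaped = quotemeta($prefix);

# index() does a plain substring search with no pattern at all.
my $index_hit = (index($name, $prefix) >= 0) ? 1 : 0;

print "literal=$literal_hit raw=$raw_hit escaped=$escaped index=$index_hit\n";
```

Both the \Q form and index() leave the original string unmodified, which addresses complaint (1) above.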

merlyn@iwarp.intel.com (Randal Schwartz) (09/13/90)

In article <MIKE.MCMANUS.90Sep12091652@mustang.FtCollins.NCR.com>, Mike.McManus@FtCollins (Mike McManus) writes:
| 
| I'm reading tokens in from a file and doing compares on the token, sort of
| like:
| 
| @toklist = ("VDD", "GND", "IN", "OUT", "INOUT");
| 
| while(<>) {
|     if( do IS_TERM()) {
|     ...
|     }
| }
| 
| sub IS_TERM() {
|     $token = (split)[1];
|     if( join( " ", @toklist) !~ /$token/) {return(0);}
|     foreach $i (@toklist) {
|         if( $i eq $token) {
|             .... 
|             return(1);
|         }
|     }
|     return(0);
| }

Shoot.  You're doin' it the hard way.  Give the task to Perl in the
form of a regex that matches *everything* you wanna look for in one
fell swoop...


################################################## snip here
@toklist = ("VDD", "GND", "IN", "OUT", "INOUT");

grep(s/\W/\\$&/g,@toklist); # de magic-ize toklist: backslash each non-word char

eval 'sub is_token { $_[0] =~ /(' . join('|',@toklist) . ')/; }';
# add ^ and $ here if you want anchored matches (I think you do)

#testing one two three...
for ('aaa','bbb','VDD','ccc','ddd') {
	print
		"$_ ",
		&is_token($_) ? "IS" : "isn't",
		" a token\n";
}
################################################## snip

There.  OK?
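A sketch of the anchored variant that the comment in the snippet hints at, assuming Perl 5 (for qr// and quotemeta). Without ^ and $, a word such as "XVDDX" counts as a token because it merely contains one. The "IF(" entry is added to the list here purely for illustration, to show that escaped tokens survive the join.

```perl
use strict;
use warnings;

# "IF(" is not in the original list; it is added to exercise the escaping.
my @toklist = ("VDD", "GND", "IN", "OUT", "INOUT", "IF(");

# Escape each token, then build one anchored alternation.
my $alternation = join('|', map { quotemeta } @toklist);
my $token_re    = qr/^(?:$alternation)$/;

# Returns 1 when the whole argument is one of the tokens, 0 otherwise.
sub is_token { return $_[0] =~ $token_re ? 1 : 0 }

for my $word ('aaa', 'VDD', 'INOUT', 'XVDDX', 'IF(') {
    printf "%s %s a token\n", $word, is_token($word) ? "IS" : "isn't";
}
```

Compiling the pattern once with qr// plays the same role as the eval in the original: the alternation is built a single time rather than on every call.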

print "Just another token Perl hacker," # :-)
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Welcome to Portland, Oregon, home of the California Raisins!"=/

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (09/13/90)

In article <MIKE.MCMANUS.90Sep12091652@mustang.FtCollins.NCR.com> Mike.McManus@FtCollins.NCR.com (Mike McManus) writes:
: 
: I'm reading tokens in from a file and doing compares on the token, sort of
: like:
: 
: @toklist = ("VDD", "GND", "IN", "OUT", "INOUT");
: 
: while(<>) {
:     if( do IS_TERM()) {
:     ...
:     }
: }
: 
: sub IS_TERM() {
:     $token = (split)[1];
:     if( join( " ", @toklist) !~ /$token/) {return(0);}
:     foreach $i (@toklist) {
:         if( $i eq $token) {
:             .... 
:             return(1);
:         }
:     }
:     return(0);
: }
: 
: # Warning!  Not the REAL perl code, but a reasonable facsimile thereof...
: 
: The problem I run into is when $token is something containing a metacharacter.
: Specifically, things die when $token = "IF(".  Anybody got any ideas about a
: work around?  I tried some things, but am a green enuff novice that I didn't
: have much luck.
: 
: NOTE:  I use the 'if(join...) return 0' as a quick and dirty check, before
: launching into a 'foreach $i (@toklist)', so as to save time.  Will it?  If not,
: I can forget about it.  This would solve my problem, too.

To solve your immediate problem, quote metacharacters by saying

	$token =~ s/(\W)/\\$1/g;

However, any time you're doing a linear search in Perl, you're probably
doing it wrong.  Learn to think in terms of associative arrays.

What you want is something like this:

%isterm = ("VDD", 1, "GND", 1, "IN", 1, "OUT", 1, "INOUT", 1);

while(<>) {
    ($junk, $token) = split;
    if( $isterm{$token} ) {
    ...
    }
}
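A self-contained sketch of the same lookup in Perl 5 syntax, for anyone who wants to run it as-is. The sample lines and the "IF(" entry are made up for illustration; they stand in for whatever the real input file holds. Because the hash lookup is an exact string comparison, a token like "IF(" needs no escaping at all.

```perl
use strict;
use warnings;

# Build the lookup table from a plain list ("IF(" is a hypothetical extra entry).
my %isterm = map { $_ => 1 } ("VDD", "GND", "IN", "OUT", "INOUT", "IF(");

# Made-up sample lines, standing in for the file read by while(<>).
my @lines = ("net1 VDD 5", "net2 IF( 0", "net3 FOO 1", "short");

my @found;
for my $line (@lines) {
    my (undef, $token) = split ' ', $line;   # second whitespace-separated field
    next unless defined $token;              # line had fewer than two fields
    push @found, $token if $isterm{$token};  # exact-match lookup, no regex involved
}

print "terminals: @found\n";
```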

Larry