[comp.lang.perl] pattern1 AND pattern2?

dennis@nosc.mil (Dennis Cottel) (04/26/91)

I know how to get a successful match on either of two patterns:
"/pattern1|pattern2/".  How can I say that both have to match?  I've
read most of the book and skimmed the rest, but I can't see it.  I can't
believe it isn't possible with Perl, though!

   Dennis Cottel, dennis@NOSC.MIL, (619) 553-1645  
   Naval Ocean Systems Center, San Diego, CA  92152

rbj@uunet.UU.NET (Root Boy Jim) (04/26/91)

In article <dennis.672606193@woodstock> dennis@nosc.mil (Dennis Cottel) writes:
?I know how to get a successful match on either of two patterns:
?"/pattern1|pattern2/".  How can I say that both have to match?  I've
?read most of the book and skimmed the rest, but I can't see it.  I can't
?believe it isn't possible with Perl, though!

/pat1/ && /pat2/ && do { &whatever(you,like); };
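
A minimal sketch of that idiom, run over a few made-up lines. The sample
strings and the patterns /perl/ and /regexp/ are illustrative assumptions
standing in for pat1 and pat2. Each pattern is tested against $_
independently, so neither order nor overlap matters:

```perl
use strict;
use warnings;

# Hypothetical sample data; /perl/ and /regexp/ stand in for pat1 and pat2.
my @lines = ("perl regexps", "perl only", "regexps only");

# Keep only the lines on which BOTH patterns match somewhere.
my @both = grep { /perl/ && /regexp/ } @lines;

print "$_\n" for @both;
```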
-- 
		[rbj@uunet 1] stty sane
		unknown mode: sane

raymond@math.berkeley.edu (Raymond Chen) (04/26/91)

In article <130399@uunet.UU.NET>, rbj@uunet (Root Boy Jim) writes:
>In article <dennis.672606193@woodstock> dennis@nosc.mil (Dennis Cottel) writes:
>?I know how to get a successful match on either of two patterns:
>?"/pattern1|pattern2/".  How can I say that both have to match?  
>/pat1/ && /pat2/ && do { &whatever(you,like); };

Although this works, it doesn't set $& properly.  Upon further
thought, I believe Mr. Cottel was looking for a pattern that produces
$& such that

        $& =~ /^pat1$/
and     $& =~ /^pat2$/

Assume for the moment that a hypothetical regexp operator `&' serves
the purpose (it is not real Perl syntax). Then

        "Larry is not just another perl hacker" =~ / \S* \S* &.*r.*p.*/

would set $& = " another perl ".  The pattern matches "two consecutive
space-delimited words such that the letter `r' appears before the
letter `p' somewhere in the matched text".
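
The semantics described above can be sketched by brute force, assuming
nothing cleverer is available: try every substring, leftmost start first
and longest first, and return the first one that both patterns match in
full. The helper name match_both is hypothetical, and this is a quadratic
illustration of the definition, not an efficient implementation:

```perl
use strict;
use warnings;

# Hypothetical helper: return the leftmost (then longest) substring of $str
# that matches BOTH patterns from its first to its last character,
# or an empty list / undef if no such substring exists.
sub match_both {
    my ($str, $pat1, $pat2) = @_;
    my $len = length $str;
    for my $start (0 .. $len) {
        for (my $n = $len - $start; $n >= 0; $n--) {
            my $piece = substr($str, $start, $n);
            return $piece
                if $piece =~ /\A(?:$pat1)\z/ && $piece =~ /\A(?:$pat2)\z/;
        }
    }
    return;
}

my $hit = match_both("Larry is not just another perl hacker",
                     ' \S* \S* ', '.*r.*p.*');
print defined $hit ? "[$hit]\n" : "no match\n";
```

On the sentence above this prints "[ another perl ]", the value proposed
for $&.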

Now I know this could be done conventionally as

	/ \S*r\S*p\S* \S* | \S*r\S* \S*p\S* | \S* \S*r\S*p\S* /

but you can see how it gets more complicated as the combinations
get nastier.  For example, suppose the pattern were

    / \S* \S* &[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*/

which matches two consecutive words which together use each vowel
exactly once, in order.  For example, it would pull out the third and
fourth words from

    "Do not name pious mermaids Fred."

Personally, I'm stumped.
--
eval "format=\nJust another perl hacker,\n.";write;
# Note that version 3.041 segfaults; insert^\n here to remedy.
# I don't have access to 4.x to see if it faults there.

ckd@cs.bu.edu (Christopher Davis) (04/26/91)

 Dennis> == Dennis Cottel <dennis@nosc.mil> 

 Dennis> I know how to get a successful match on either of two patterns:
 Dennis> "/pattern1|pattern2/".  How can I say that both have to match?
 Dennis> I've read most of the book and skimmed the rest, but I can't
 Dennis> see it.  I can't believe it isn't possible with Perl, though!

You didn't say whether they had to match in a particular order, or
whether they could overlap, or anything like that...

If they have to be pattern1 then pattern2, without overlap, try

		/pattern1.*pattern2/

If you don't care about order or overlapping, use

		/pattern1/ && /pattern2/

If you care about order, but they can overlap, hand-hack a regexp that
also covers the possible overlaps.  If you had 'foobar' and 'bartender',
in that order, you could use

		/foo(bar)+tender/ || /foobar.*bartender/
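
A small sketch of how that pair of patterns behaves on a few made-up
strings. The sub name ordered_overlap_ok and the test strings are
assumptions for illustration only:

```perl
use strict;
use warnings;

# Hypothetical wrapper around the two patterns above: true when 'foobar'
# is followed by 'bartender', whether or not the two share their 'bar'.
sub ordered_overlap_ok {
    my ($s) = @_;
    return ($s =~ /foo(bar)+tender/ || $s =~ /foobar.*bartender/) ? 1 : 0;
}

for my $s ("foobartender",             # shared 'bar'
           "foobar meets bartender",   # separate occurrences, right order
           "bartender then foobar") {  # wrong order
    printf "%-24s %s\n", $s, ordered_overlap_ok($s) ? "match" : "no match";
}
```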

If you don't care about order, but they can't overlap, try matching the
second one against $` and $'.

		/pattern1/ && ($` =~ /pattern2/ || $' =~ /pattern2/)
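
A sketch of the same test wrapped in a sub (the name both_disjoint and the
sample strings are hypothetical). One caveat worth noting: only the
leftmost match of the first pattern is considered, so a pair of
non-overlapping matches can be missed when a later occurrence of pattern1
is the one that leaves room for pattern2:

```perl
use strict;
use warnings;

# Hypothetical helper: true if $re1 matches and $re2 matches entirely
# before or entirely after the (leftmost) match of $re1.
sub both_disjoint {
    my ($s, $re1, $re2) = @_;
    return 0 unless $s =~ $re1;
    my ($before, $after) = ($`, $');   # save them before the next match
    return ($before =~ $re2 || $after =~ $re2) ? 1 : 0;
}

print both_disjoint("foo then bar", qr/foo/, qr/bar/), "\n";   # disjoint
print both_disjoint("foobar", qr/foobar/, qr/bar/), "\n";      # overlap only

# The caveat: "aX" (offsets 0-1) and "a" (offset 2) are disjoint, yet the
# answer depends on which pattern is tried first.
print both_disjoint("aXa", qr/aX/, qr/a/), "\n";
print both_disjoint("aXa", qr/a/, qr/aX/), "\n";
```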

Now, I know Larry, Randal, and Tom will come back with better ways to do
all this, but I figured I should at least *answer* a question for
once...
--
   [ Christopher Davis - <ckd@cs.bu.edu> - <..!bu.edu!cs.bu.edu!ckd> ]
    A message destined for delivery in *your* domain is fair game for
  anything you may want to do, up to and including translating the entire
 message, header and all, into Swahili. -- chip@tct.uucp (Chip Salzenberg)

flee@cs.psu.edu (Felix Lee) (04/27/91)

Raymond Chen poses:
>    / \S* \S* &[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*/
>which matches two consecutive words which together use all the vowels
>in order.

Well, you could write this as one long regexp, but the technique isn't
easy to generalize.  The way to do this in Perl is to write a loop:

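$x = "Do not name pious mermaids Fred.";
# Slide over every pair of consecutive space-delimited words.
while ($x ne '' && $x =~ / \S* \S* /) {
	$m = $&;			# candidate: " word word "
	$x = substr($& . $', 1);	# resume one character past the match start
	# Keep the candidate only if it uses a, e, i, o, u once each, in order.
	if ($m =~ /^[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*$/) {
		print "$&\n";
	}
}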

Note that there are at least four different interpretations of pattern
conjunction.  This loop expresses only the simplest.
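
For comparison, a sketch of the same sliding-window idea using a lookahead
with a capture under /g, which visits overlapping candidates without
rebuilding the string. This assumes a Perl with (?=...) lookahead, which
postdates the Perl discussed in this thread; the variable names are
illustrative:

```perl
use strict;
use warnings;

my $x = "Do not name pious mermaids Fred.";

# Second pattern, anchored so it must cover the whole candidate window.
my $vowels =
    qr/\A[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*\z/;

my @hits;
# The lookahead consumes nothing, so /g advances one character at a time
# and $1 captures each " word word " window, including overlapping ones.
while ($x =~ /(?=( \S* \S* ))/g) {
    my $window = $1;
    push @hits, $window if $window =~ $vowels;
}

print "[$_]\n" for @hits;
```

As with the loop above, this handles only the interpretation in which a
single substring must match both patterns in full.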
--
Felix Lee	flee@cs.psu.edu