[comp.lang.perl] Perl-Users Digest #651

marc@ATHENA.MIT.EDU (Marc Horowitz) (03/10/91)

|> From: tchrist@convex.COM (Tom Christiansen)
|> 
|> From the keyboard of emv@ox.com (Ed Vielmetti):
|> :Speaking of which, what are the limits on the number of parentheses that
|> :a regular expression can have?  I got an unpleasant message
|> :
|> :/^(scan)\s+((-(\w*))|)\s*\+(\w*)((s+((||>)(.*)))|)/: too many () in regexp at /u:1/emv/bin/smoke line 167, <> line 1.
|> 
|> Nine.  To accept more, Larry would have to change the code that
|> recognizes \1 .. \9 and $1 .. $9.
|> 
|> --tom

Ok, that explains it, but it doesn't excuse it.  Sometimes I want to
match an expression complex enough (like Ed's example) that I need
more than nine () to group everything.  In this case, perl should
still work, and I would say that the regexp in an array context should
contain as many elements as paren pairs, but that $1 .. $9 should
contain the first nine () matches.

Comments?

		Marc

mmuegel@camdev.comm.mot.com (Michael S. Muegel) (03/11/91)

In article <1991Mar9.230553.22037@uvaarpa.Virginia.EDU> Marc Horowitz <marc@MIT.EDU> writes:
>Ok, that explains it, but it doesn't excuse it.  Sometimes I want to
>match an expression complex enough (like Ed's example) that I need
>more than nine () to group everything.  In this case, perl should
>still work, and I would say that the regexp in an array context should
>contain as many elements as paren pairs, but that $1 .. $9 should
>contain the first nine () matches.

Funny how all this sprang up right now... I just got done writing a routine
that generates regular expressions to match (possibly) abbreviated keywords.
I was not even aware of this problem, so I never thought about it when
writing this routine. This LIMIT will now break it with large keywords.
Ho, hum...
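
To see how a generated pattern can run into the limit, here is a minimal
sketch of one way such a generator might work. It is not Mike's routine,
which the thread does not show; the helper name abbrev_pattern and the
keyword 'delete' are hypothetical. It nests one optional group per character
after the required prefix, so a keyword with ten or more optional characters
needs more than nine () pairs.

```perl
use strict;
use warnings;

# Hypothetical helper: build a pattern matching any prefix of $keyword
# that is at least $min characters long, using nested optional groups.
sub abbrev_pattern {
    my ($keyword, $min) = @_;
    my $head = quotemeta substr($keyword, 0, $min);
    my @tail = split //, substr($keyword, $min);
    my $pattern = '';
    # Wrap from the innermost character outwards, e.g. (l(e(t(e)?)?)?)?
    for my $char (reverse @tail) {
        $pattern = '(' . quotemeta($char) . $pattern . ')?';
    }
    return "^$head$pattern\$";
}

my $pattern = abbrev_pattern('delete', 2);
my $groups  = () = $pattern =~ /\(/g;    # count the opening parentheses
print "$pattern uses $groups groups\n";
print "matches 'del'\n" if 'del' =~ /$pattern/;
```

Each optional character costs one group, so the group count grows with the
length of the keyword.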

Like Marc said, since you can do something like this:

   @Matches = ($String =~ /some_expr_with_()s/);

and then get at the sub-expressions through @Matches I am surprised there is 
even a limit. Larry, how many free cellular phones would it take to get 
this changed :-).
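
A minimal sketch of that idiom, assuming a modern Perl 5 in which the
nine-group limit discussed here no longer applies. The sample string and the
eleven-group pattern are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input and a pattern with eleven capture groups.
my $string  = "a b c d e f g h i j k";
my @matches = ($string =~ /^(\w) (\w) (\w) (\w) (\w) (\w) (\w) (\w) (\w) (\w) (\w)$/);

# In list context the match returns every captured group, in order,
# so groups past the ninth are reachable by index.
print scalar(@matches), " groups captured\n";
print "tenth group: $matches[9]\n";
```

Only the programmer who asks for the array pays for the copy; a match in
scalar context still just returns true or false.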

-Mike

-- 
+-----------------------------------------------------------------------------+
| Mike Muegel                              | Internet: mmuegel@mot.com        |
| Software Tools Group                     | UUCP:     uunet!motcid!muegel    |
| Fort Worth Research & Development Center | Voice:    (817) 232-6129         |
| Cellular Infrastructure Group            | Fax:      (817) 232-6081         |
| Radio Telephone and Systems Group        | Mail:     5555 North Beach St.   |
| Motorola, Inc.                           |           Fort Worth,  TX 76137  |
+-----------------------------------------------------------------------------+

jb3o+@andrew.cmu.edu (Jon Allen Boone) (03/11/91)

> Excerpts from netnews.comp.lang.perl: 9-Mar-91 Re: Perl-Users Digest
> #651 Marc Horowitz@ATHENA.MIT (884)



> Ok, that explains it, but it doesn't excuse it.  Sometimes I want to
> match an expression complex enough (like Ed's example) that I need
> more than nine () to group everything.  In this case, perl should
> still work, and I would say that the regexp in an array context should
> contain as many elements as paren pairs, but that $1 .. $9 should
> contain the first nine () matches.

> Comments?

> 		Marc


we need another (what, ANOTHER?) "special variable" to hold the contents
of a regexp pattern - $1 .. $9 would, like Marc said, hold the first 9
()'s, but you could shift and unshift the variable to make $1 .. $9
change.

how about @PAT ? 

	@PAT  Contains the last pattern matched - $1 .. $9 contain the first
		nine ()'s in @PAT, $+ contains the last bracket matched by
		@PAT.

- jon -

tchrist@convex.COM (Tom Christiansen) (03/11/91)

From the keyboard of jb3o+@andrew.cmu.edu (Jon Allen Boone):
:we need another (what, ANOTHER?) "special variable" to hold the contents
:of a regexp pattern - $1 .. $9 would, like Marc said, hold the first 9
:()'s, but you could shift and unshift the variable to make $1 .. $9
:change.
:
:how about @PAT ? 

I think that would be too much copying.  Let the programmer assign
the pattern match to an array if they want, but don't force all those
extra copies on everyone.

--tom

lwall@jpl-devvax.jpl.nasa.gov (Larry Wall) (03/12/91)

In article <1991Mar9.230553.22037@uvaarpa.Virginia.EDU> Marc Horowitz <marc@MIT.EDU> writes:
: |> From: tchrist@convex.COM (Tom Christiansen)
: |> 
: |> From the keyboard of emv@ox.com (Ed Vielmetti):
: |> :Speaking of which, what are the limits on the number of parentheses that
: |> :a regular expression can have?  I got an unpleasant message
: |> :
: |> :/^(scan)\s+((-(\w*))|)\s*\+(\w*)((s+((||>)(.*)))|)/: too many () in regexp at /u:1/emv/bin/smoke line 167, <> line 1.
: |> 
: |> Nine.  To accept more, Larry would have to change the code that
: |> recognizes \1 .. \9 and $1 .. $9.
: |> 
: |> --tom
: 
: Ok, that explains it, but it doesn't excuse it.  Sometimes I want to
: match an expression complex enough (like Ed's example) that I need
: more than nine () to group everything.  In this case, perl should
: still work, and I would say that the regexp in an array context should
: contain as many elements as paren pairs, but that $1 .. $9 should
: contain the first nine () matches.

There's actually no syntactic reason we can't have $10 and higher, but
\10 already means a backspace.  Implementation-wise, Henry's regexp package
has a hardwired limit of 9, and fixing that will entail changing the
way the pattern-matching opcodes are set up, and this is non-trivial,
I think.  I kludged around this when adding {n,m}, and maybe I'll just
kludge it again.  Sigh.  I agree that the limit shouldn't be there.
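
The \10 ambiguity can be sketched as follows. This assumes a later Perl
(5.10 or newer) than the one under discussion: there, \10 inside a pattern is
read as an octal escape unless at least ten groups have been opened before
it, and \g{10} is an unambiguous way to write the backreference. The test
strings are made up for illustration.

```perl
use strict;
use warnings;

my $bs = "\010";    # the backspace character, octal 010

# With fewer than ten groups in the pattern, \10 is the octal escape.
my $octal_ok = ("x${bs}y" =~ /x\10y/) ? 1 : 0;

# With ten groups, \g{10} refers to the tenth one without ambiguity,
# and $10 holds its text after a successful match.
my $tenth = '';
if ("abcdefghijj" =~ /^(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)\g{10}$/) {
    $tenth = $10;
}

print "octal escape matched: $octal_ok\n";
print "tenth group: $tenth\n";
```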

Larry

lwall@jpl-devvax.jpl.nasa.gov (Larry Wall) (03/12/91)

In article <397@camdev.comm.mot.com?> mmuegel@camdev.comm.mot.com (Michael S. Muegel) writes:
: Larry, how many free cellular phones would it take to get 
: this changed :-).

I'll pay for the phone if you'll pay the phone bill.  :-)

Larry