worley@compass.com (Dale Worley) (07/14/90)
The program

    /\d{5,5}/;

gets the error message

    Can't do {n,0} at /compass/c/worley/time/aa line 1.

As far as I can tell from the manual, this is legal.

Dale Worley             Compass, Inc.           worley@compass.com
--
The living dead don't NEED to solve word problems.
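A minimal sketch of what the manual's wording implies, assuming a current Perl 5 rather than the 1990 patchlevel that produced the error above: a `{5,5}` range is accepted and behaves like the shorter `{5}`. The sub names and sample strings are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Assumption: a current Perl 5, where {n,n} compiles without complaint.
# Both subs return 1 if the argument contains a run of five digits.
sub has_five_digits_range { return $_[0] =~ /\d{5,5}/ ? 1 : 0 }
sub has_five_digits_exact { return $_[0] =~ /\d{5}/   ? 1 : 0 }

for my $sample ("zip 02139", "1234") {
    printf "%-10s range=%d exact=%d\n", $sample,
        has_five_digits_range($sample), has_five_digits_exact($sample);
}
```

On an interpreter that rejects `{5,5}`, writing `{5}` expresses the same pattern.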
worley@compass.com (Dale Worley) (08/09/90)
I tried to write a program with the following regexp:

    /(^\s*$)|(^---)/

That is, match any line containing only whitespace, or beginning with
'---'.  (Are ^ and $ allowed other than at the beginning or end of the
regexp?)  Perl gives the strange error message:

    /(^\s*|(^---)/: unmatched () in regexp at ss line 3.

Where did the missing ')' go?

Actually, it was probably assumed to be part of a '$)' variable.  (Can
one use '$/' as a variable in a regexp?)

What is going on here?  What *should* be going on here?

Dale Worley             Compass, Inc.           worley@compass.com
--
LA, truth to tell, is not much different from a pretty girl with the clap.
lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (08/11/90)
In article <1990Aug9.155120.2703@uvaarpa.Virginia.EDU> worley@compass.com writes:
: I tried to write a program with the following regexp:
:
: /(^\s*$)|(^---)/
:
: That is, match any line containing only whitespace, or beginning with
: '---'. (Are ^ and $ allowed other than at the beginning or end of the
: regexp?) Perl gives the strange error message:
:
: /(^\s*|(^---)/: unmatched () in regexp at ss line 3.
:
: Where did the missing ')' go?
:
: Actually, it was probably assumed to be part of a '$)' variable. (Can
: one use '$/' as a variable in a regexp?)
:
: What is going on here? What *should* be going on here?
It was misinterpreting $) in patterns as a variable. At patchlevel 27
it's interpreted correctly as an end of line check and a terminating paren.
Which means you can't interpolate $) into a pattern directly.
$/ has never been a problem.
By the way, it's more efficient to factor out the ^ to the front:
/^(\s*$|---)/
The reason for this is that it then knows it doesn't have to start looking
at every single position of the input string. I suppose I should make it
do this optimization itself...
It's probably also faster to put the literal string before the *:
/^(---|\s*$)/
This will be less of a problem after patchlevel 27, but it still helps
some, unless almost all your strings are blank.
Larry
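The three spellings discussed above should accept exactly the same lines; only the amount of work the matcher does differs. A small sketch to check that, assuming a current Perl 5 (where `$)` and `$|` are not interpolated in patterns). The hash keys, the `classify` helper, and the sample lines are hypothetical names introduced for illustration.

```perl
use strict;
use warnings;

# Three spellings of "blank line, or line beginning with ---".
my %patterns = (
    original      => qr/(^\s*$)|(^---)/,
    factored      => qr/^(\s*$|---)/,
    literal_first => qr/^(---|\s*$)/,
);

# Hypothetical sample lines, chosen only for illustration.
my @lines = ("", "   ", "--- cut here", "text", "  --- indented", "a ---");

# Return a string of 1s and 0s, one per sample line, for a given pattern.
sub classify {
    my ($re) = @_;
    return join '', map { $_ =~ $re ? 1 : 0 } @lines;
}

for my $name (sort keys %patterns) {
    printf "%-14s %s\n", $name, classify($patterns{$name});
}
```

Each pattern yields 111000 for these samples, so the rewrites change speed, not meaning. Any speed difference depends on the interpreter and the data, and is not measured here.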
worley@compass.com (Dale Worley) (08/13/90)
X-Name: Larry Wall

   It was misinterpreting $) in patterns as a variable.  At patchlevel 27
   it's interpreted correctly as an end of line check and a terminating paren.
   Which means you can't interpolate $) into a pattern directly.

   $/ has never been a problem.

Well, what exactly are the rules for which variables can be used in
regexps and which can't?  That is, why is interpreting "$)" as e-o-l
and paren correct and interpreting it as a variable incorrect?

I guess I don't really need an answer here, but I hope that the book
will be enough of a language reference that all such questions will be
answered by it.

Dale Worley             Compass, Inc.           worley@compass.com
--
"I have the same insecurities as Woody Allen."
"Yes, but he's paid more for having them."
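One way to see the rule in practice, sketched for a current Perl 5 (an assumption; the thread concerns 1990 patchlevels): inside a pattern, `$)` is read as an end-of-line test followed by a closing paren, so the variable's value has to be copied into a named scalar before it can be interpolated. The sub names and the `$egid` variable are hypothetical.

```perl
use strict;
use warnings;

# "$)" inside the pattern is the $ anchor plus a closing paren,
# so this asks whether the string ends in "x".
sub ends_with_x { return $_[0] =~ /(x$)/ ? 1 : 0 }

# To match against the value of the $) variable, copy it to a
# named scalar first and interpolate that instead.
sub matches_egid {
    my ($text) = @_;
    my $egid = $);                      # value of the $) variable
    return $text =~ /^\Q$egid\E$/ ? 1 : 0;
}

print ends_with_x("box"), ends_with_x("boxes"), "\n";
```

The `\Q...\E` quoting is there only so that any characters in the copied value are matched literally.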
white@cg-atla.UUCP (Frank ) (10/05/90)
I am using 'perl' PL18. Running the following script
causes "WORD WITH PARENS" to be printed. What's the difference?
----------------- cut here -------------------------------------------
#!/usr/local/bin/perl

$_ = "This line contains a\nword beginning a line";
if ( /^word/ ) {
    print "WORD NO PARENS\n";
} elsif ( /^(word)/ ) {
    print "WORD WITH PARENS\n";    # I get this message !!
}
----------------- cut here -------------------------------------------

Chip White (Uunet!samsung!cg-atla!white)
Principal Software Engineer
AGFA Compugraphic Division
200 Ballardvale Street
Wilmington, Massachusetts 01887
MS-200-3-7K
Phone: (508) 658-5600 (x5440)
CompuDial: (508) 658-0200 (x5440)
--
Chip White                              AGFA Compugraphic
...!{decvax,samsung}!cg-atla!white      200 Ballardvale St.
                                        Wilmington, Mass. 01887
lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (10/06/90)
In article <9134@cg-atla.UUCP> white@cg-atla.UUCP (Frank ) writes:
:
: I am using 'perl' PL18. Running the following script
: causes "WORD WITH PARENS" to be printed. What's the difference?
: ----------------- cut here -------------------------------------------
: #!/usr/local/bin/perl
:
: $_ = "This line contains a\nword beginning a line";
: if ( /^word/ ) {
:     print "WORD NO PARENS\n";
: } elsif ( /^(word)/ ) {
:     print "WORD WITH PARENS\n";    # I get this message !!
: }
This is documented behavior at patchlevel 18--the man page says you can't
expect ^ to behave consistently in mid-string if $* isn't set.
As of the next patch, ^ should never match in mid string unless $* is set.
Larry
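For readers on a current Perl 5, the consistent behavior described above can be sketched with the `/m` modifier, which plays the role that setting `$*` to 1 plays in this thread. That substitution is an assumption made for illustration: `/m` is not available at the patchlevels discussed here, and later Perl 5 releases dropped support for `$*`. The sub names are hypothetical.

```perl
use strict;
use warnings;

my $text = "This line contains a\nword beginning a line";

# Without /m, ^ anchors only at the very start of the string.
sub anchored_single { return $_[0] =~ /^(word)/  ? 1 : 0 }

# With /m, ^ also matches just after each embedded newline.
sub anchored_multi  { return $_[0] =~ /^(word)/m ? 1 : 0 }

print "single-line: ", anchored_single($text), "\n";   # prints 0
print "multi-line:  ", anchored_multi($text),  "\n";   # prints 1
```

With or without the parentheses, the single-line form gives the same answer here, which is the consistency the original script was missing.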
merlyn@iwarp.intel.com (Randal Schwartz) (10/06/90)
In article <9134@cg-atla.UUCP>, white@cg-atla (Frank ) writes:
|
| I am using 'perl' PL18. Running the following script
| causes "WORD WITH PARENS" to be printed. What's the difference?
| ----------------- cut here -------------------------------------------
| #!/usr/local/bin/perl
|
| $_ = "This line contains a\nword beginning a line";
| if ( /^word/ ) {
|     print "WORD NO PARENS\n";
| } elsif ( /^(word)/ ) {
|     print "WORD WITH PARENS\n";    # I get this message !!
| }

Quoting from perl(1):

     By default, the ^ character is only guaranteed to match at the
     beginning of the string, the $ character only at the end (or
     before the newline at the end) and perl does certain
     optimizations with the assumption that the string contains
     only one line.  The behavior of ^ and $ on embedded newlines
                     ============================================
     will be inconsistent.  You may, however, wish to treat a
     =======================
     string as a multi-line buffer, such that the ^ will match
     after any newline within the string, and $ will match before
     any newline.  At the cost of a little more overhead, you can
     do this by setting the variable $* to 1.  Setting it back to
     0 makes perl revert to its old behavior.

The "Fine" Manual says all.

++$*; $_ = "\nJust another Perl hacker,"; /^J.*/; print $&
--
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Welcome to Portland, Oregon, home of the California Raisins!"=/
phillips@cs.ubc.ca (George Phillips) (10/06/90)
In article <9134@cg-atla.UUCP> white@cg-atla.UUCP (Frank ) writes:
>
> I am using 'perl' PL18. Running the following script
>causes "WORD WITH PARENS" to be printed. What's the difference?
>----------------- cut here -------------------------------------------
>#!/usr/local/bin/perl
>
>$_ = "This line contains a\nword beginning a line";
>if ( /^word/ ) {
>    print "WORD NO PARENS\n";
>} elsif ( /^(word)/ ) {
>    print "WORD WITH PARENS\n";    # I get this message !!
>}

Yep, this is a bug.  There's even a passage in the manual page which
says, more or less, that ^ does not necessarily work as advertised.
If you're using ^ in a regular expression and you're not sure if the
string has a newline in it, you'd better do something like:

    if (/^regexp/ && $` eq "") {
        # yep, it really did do an anchored match

So here is a fixed version of your script:

$_ = "This line contains a\nword beginning a line";
if ( /^word/ ) {
    print "WORD NO PARENS\n";
} elsif ( /^(word)/ && $` eq "" ) {
    print "WORD WITH PARENS\n";    # I get this message !!
}

--
George Phillips phillips@cs.ubc.ca {alberta,uw-beaver,uunet}!ubc-cs!phillips
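The `$`` guard can be exercised on a current Perl 5 as well. In this sketch the `/m` modifier is used only to imitate a ^ that may match after an embedded newline, as PL18 sometimes did; that stand-in, the sub name, and the sample strings are assumptions for illustration, not part of the original fix.

```perl
use strict;
use warnings;

# /m lets ^ match after a newline, imitating the inconsistent ^
# described in this thread. The $` test then accepts the match only
# when nothing precedes it, i.e. when it begins at the string start.
sub word_at_string_start {
    my ($text) = @_;
    return ($text =~ /^(word)/m && $` eq "") ? 1 : 0;
}

print word_at_string_start("This line contains a\nword beginning a line"), "\n";
print word_at_string_start("word first\nword again"), "\n";
```

The first call reports 0 because the match is found after the newline and the prematch is non-empty; the second reports 1.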