[comp.lang.perl] $1, $2 not being set - array/scalar context confusion?

alfie%cs.warwick.ac.uk@nsfnet-relay.ac.uk (Nick Holloway) (03/22/90)

There was recently a thread about a problem with $1 not being set when
a subroutine was called in an array context. I think I understood what
was happening. I am having a problem that is similar in some ways.

What I would like, is either confirmation that I have discovered a bug
in perl (and that it will be fixed - Larry?), or that I have done
something silly, and an explanation of why, and how to avoid it.

I have a problematic routine that is supposed to do range expansion, so
"1,3-5" becomes "(1,3,4,5)". It splits at the delimiter (either "," or
" "), and then looks for terms of the form "\d+" or "\d+-\d+". The
latter causes the problem, since if called in an array context (which is
pretty useful if you want to use the array of values!), the pattern
"^(\d+)-(\d+)$" fails to set $1 and $2.

I am willing to be convinced that this is some problem to do with
array/scalar context, but my doubts stem from:

    a) I have an explicit statement to return the array at the end 
       (This was a solution for the previous poster's problem). From what I
       understood from before - the context the routine was called in
       was affecting the evaluation of "if". Does this apply here?  

    b) If the routine is called in a scalar context, with a pattern
       that contains a "\d+-\d+" term, the routine works as expected in
       later calls.

[I currently have a call to &expand("0-0") in the perl script to start
 it working, commented with "BUG workaround", and my bug logo (see
 .sig) -- it would be nice to be able to remove it]

Following my signature is the code that causes me grief. Also there is
the result of running it through perl, patchlevel 15. (same as 6 & 8 -
I didn't ever have 12).

	Here's hoping that someone can help me,
				Alfie
--
Nick Holloway |  `O O'  | alfie@cs.warwick.ac.uk, alfie@warwick.UUCP,
[aka `Alfie'] | // ^ \\ | ..!uunet!mcsun!ukc!warwick!alfie
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
	sub expand {
	    local ( $arg ) = shift;
	    local ( @nums, $i );

	    print "debug: input = $arg\n";
	    $arg =~ s/\s+([-,])\s+/\1/g;
	    $arg =~ s/\s+/,/g;
	    split ( /,/, $arg );
	    for ( @_ ) {
		if ( /^\d+$/ ) { 
		    print "debug: term = $_\n";
		    push ( @nums, $_ ); 
		    next;
		} elsif ( /^(\d+)-(\d+)$/ ) {
		    print "debug: term = $_ : from ", $1, " to ", $2, "\n";
		    for ( $i = $1; $i <= $2; $i++ ) {
			push ( @nums, $i );
		    }
		    next;
		} else {
		    print "debug: bad term - abandon ship!\n";
		    @nums = ();
		    last;
		}
	    }
	    print "debug: output = @nums\n\n";
	    @nums;
	}

	@a = &expand ( "0-2" );		# fails to expand
	$a = &expand ( "1" );		# doesn't trigger correct operation
	@a = &expand ( "1-3" );		# fails to expand
	$a = &expand ( "2-4" );		# This call starts it working
	@a = &expand ( "3-5" );		# works correctly
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
	debug: input = 0-2
	debug: term = 0-2 : from  to 
	debug: output = 

	debug: input = 1
	debug: term = 1
	debug: output = 1

	debug: input = 1-3
	debug: term = 1-3 : from  to 
	debug: output = 

	debug: input = 2-4
	debug: term = 2-4 : from 2 to 4
	debug: output = 2 3 4

	debug: input = 3-5
	debug: term = 3-5 : from 3 to 5
	debug: output = 3 4 5
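
The calls in the test script differ only in the context they impose on
&expand: assigning to @a asks for an array, assigning to $a asks for a
scalar. As a minimal sketch in present-day Perl (not the patchlevel 15
interpreter discussed here, so it does not reproduce the bug), the
following shows how a routine can observe its calling context, and what
a trailing array expression yields in each case. The names context and
three are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: report the context this sub was called in.
sub context {
    return wantarray ? "array" : "scalar";
}

# Hypothetical helper: ends with a bare array, like the original expand.
sub three {
    my @nums = (2, 3, 4);
    @nums;    # last expression evaluated is the return value
}

my @a = context();    # array context
my $s = context();    # scalar context
print "$a[0] $s\n";   # prints "array scalar"

my @list  = three();  # array context: the elements
my $count = three();  # scalar context: the number of elements
print "@list / $count\n";   # prints "2 3 4 / 3"
```

So in the script above, the "$a = &expand(...)" calls receive a count
rather than the list itself, which is harmless there because only the
debug output is being inspected.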

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (03/22/90)

In article <1990Mar21.165957.9520@uvaarpa.Virginia.EDU> alfie%cs.warwick.ac.uk@nsfnet-relay.ac.uk writes:
: There was recently a thread about a problem with $1 not being set when
: a subroutine was called in an array context. I think I understood what
: was happening. I am having a problem that is similar in some ways.
: 
: What I would like, is either confirmation that I have discovered a bug
: in perl (and that it will be fixed - Larry?), or that I have done
: something silly, and an explanation of why, and how to avoid it.

It is a bug, and will be fixed in patch 16.  The problem is that perl
was passing the desire for an array on to embedded BLOCKs that weren't
in terminal statements.  It shouldn't do that.  (Of course, it shouldn't
pass the desire for an array to a conditional expression in any event,
but that's another problem.)

By the way, I think you can simplify your routine a little:

sub expand {
    local($tmp) = @_;
    $tmp =~ s/-/../g;
    local(@nums) = eval "($tmp)";
    warn $@ if $@;
    @nums;
}

Alternately, there's

sub expand {
    local($tmp) = @_;
    $tmp =~ s/(\d+)-(\d+)/join(',',$1..$2)/eg;
    warn "Illegal char '$1'" if $tmp =~ /([^\d,\s])/;
    split(',',$tmp);
}

or

sub expand {
    local($tmp) = @_;
    local(@nums);
    push(@nums, $1 .. ($2?$3:$1)) while $tmp =~ s/^,?(\d+)(-(\d+))?//;
    warn "Don't recognize $tmp" if $tmp ne '';
    @nums;
}

Depends on what you're optimizing for at the moment, I s'pose...

Larry
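
To compare the three routines side by side, here is a rough sketch that
runs each on the same input. It assumes a present-day Perl 5, so local
has been swapped for my and explicit returns added; the names
expand_eval, expand_subst and expand_loop are hypothetical labels, not
from the post. Note that the first variant evaluates its argument as
Perl code, so it is only suitable for trusted input.

```perl
use strict;
use warnings;

# Variant 1: turn "a-b" into "a..b" and let Perl evaluate the list.
sub expand_eval {
    my ($tmp) = @_;
    $tmp =~ s/-/../g;
    my @nums = eval "($tmp)";
    warn $@ if $@;
    return @nums;
}

# Variant 2: expand each range in place, then split on commas.
sub expand_subst {
    my ($tmp) = @_;
    $tmp =~ s/(\d+)-(\d+)/join(',', $1 .. $2)/eg;
    warn "Illegal char '$1'" if $tmp =~ /([^\d,\s])/;
    return split(',', $tmp);
}

# Variant 3: nibble one term at a time off the front of the string.
sub expand_loop {
    my ($tmp) = @_;
    my @nums;
    push(@nums, $1 .. ($2 ? $3 : $1)) while $tmp =~ s/^,?(\d+)(-(\d+))?//;
    warn "Don't recognize $tmp" if $tmp ne '';
    return @nums;
}

my %variant = (
    eval  => \&expand_eval,
    subst => \&expand_subst,
    loop  => \&expand_loop,
);
for my $name (sort keys %variant) {
    my @r = $variant{$name}->('1,3-5');
    print "$name: @r\n";    # each line ends in "1 3 4 5"
}
```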

alfie%cs.warwick.ac.uk@nsfnet-relay.ac.uk (Nick Holloway) (03/23/90)

In comp.lang.perl, lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
| It is a bug, and will be fixed in patch 16.  

It's nice to know that my brain hadn't fried, and that I was not
misunderstanding what should happen (and to hear the magic words "will
be fixed").

| By the way, I think you can simplify your routine a little:

I had thought I had seen smaller routines to do the job, but that was
before I became a perl addict. I can see how all three routines work,
but I think I prefer the 2nd (purely subjective - I have not let
matters like performance cloud the issue :-). My original version
allowed you to use either a space or comma to separate the terms (so 
"1 3 - 5" is the same as "1,3-5"), and returned an empty array upon
finding an error, so here is Larry's version 2 of expand with these
spliced back in. Cheers, Larry, for the smaller routine.

sub expand {
    local($tmp) = @_;
    $tmp =~ s/\s+([-,])\s+/\1/g;		# remove non-sep spaces
    $tmp =~ s/\s+/,/g;				# replace sep spaces with ","
    $tmp =~ s/(\d+)-(\d+)/join(',',$1..$2)/eg;	# perform range expansion
    if ( $tmp =~ /([^\d,])/ ) {			# check remainder is valid
	warn "bad character '$1' in expand\n";
	$tmp = "";				# all bets are off!
    }
    split(',',$tmp);				# split and return elements
}

--
Nick Holloway |  `O O'  | alfie@cs.warwick.ac.uk, alfie@warwick.UUCP,
[aka `Alfie'] | // ^ \\ | ..!uunet!mcsun!ukc!warwick!alfie

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (03/24/90)

In article <1990Mar23.120241.20340@uvaarpa.Virginia.EDU> alfie%cs.warwick.ac.uk@nsfnet-relay.ac.uk writes:
: sub expand {
:     local($tmp) = @_;
:     $tmp =~ s/\s+([-,])\s+/\1/g;		# remove non-sep spaces
:     $tmp =~ s/\s+/,/g;				# replace sep spaces with ","
:     $tmp =~ s/(\d+)-(\d+)/join(',',$1..$2)/eg;	# perform range expansion
:     if ( $tmp =~ /([^\d,])/ ) {			# check remainder is valid
: 	warn "bad character '$1' in expand\n";
: 	$tmp = "";				# all bets are off!
:     }
:     split(',',$tmp);				# split and return elements


If you say

	print join(':',&expand('1, 2, 3 ,4'));

you'll see that you have a bug.  The first substitution only works if there's
whitespace on both sides of the comma or hyphen.  Changing the pluses to
stars would fix that.

Larry
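
The fix described above can be sketched as follows. With the pluses,
neither ", " nor " ," matches the first substitution, so the second one
turns each leftover space into an extra comma and the split yields empty
elements. With stars, the whitespace on either side becomes optional.
This is a sketch assuming a present-day Perl 5 (my instead of local,
$1 instead of \1 in the replacement, explicit returns), not the exact
code either poster ended up with.

```perl
use strict;
use warnings;

sub expand {
    my ($tmp) = @_;
    $tmp =~ s/\s*([-,])\s*/$1/g;    # stars, not pluses: strip any spaces around "-" or ","
    $tmp =~ s/\s+/,/g;              # remaining spaces are separators
    $tmp =~ s/(\d+)-(\d+)/join(',', $1 .. $2)/eg;    # perform range expansion
    if ($tmp =~ /([^\d,])/) {       # check remainder is valid
        warn "bad character '$1' in expand\n";
        return ();                  # all bets are off!
    }
    return split(/,/, $tmp);        # split and return elements
}

print join(':', expand('1, 2, 3 ,4')), "\n";    # prints 1:2:3:4
print join(':', expand('1 3 - 5')), "\n";       # prints 1:3:4:5
```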