[comp.lang.perl] bit vectors, lvalues, and magical relational operators

tchrist@convex.com (Tom Christiansen) (11/20/90)

Here's three postings in one:

1. BIT VECTORS

I've been playing with bit vectors (which are really pretty neat as long
as you're careful not to == them), and I'm not entirely clear on what the
third value is in the vec() call.  I thought it was a length analogous
to substr, but empirical evidence suggests otherwise.  I looked briefly at
do_vec(), and it looks like you're using it as some sort of multiplier.
Could you shed any light here?  The only one of these that gives results I
understand is the first one:

    vec($a, 40, 1) = 1; &pvec($a);
    vec($b, 40, 4) = 1; &pvec($b);
    vec($c, 40, 1) = 4; &pvec($c);
    vec($d, 40, 4) = 4; &pvec($d);

    sub pvec {
	local($v) = @_;
	local($i);
	local($len) = length($v);
	printf "%5d: ", $len;
	print "\n"; return;
	for ($i = 0; $i < ($len*8); $i++) {
	    print vec($v,$i,1);
	    print ' ' if ($i + 1) % 8 == 0;
	}
	print "\n";
    }

And how come I can do |, &, ^, and ~ on bits, but not << or >>?


2. LVALUEs

On lvalues, I thought assign returned an lvalue, but that's not quite
so in all contexts, is it?  None of these work:

    1. ($a = $b) = $c;
    2. chop(@list = <>);
    3. ($x ? $a : $b) = $c;

Even though these are just fine:

    4. ($a = $b) =~ tr/a-z/A-Z/;
    5. chop($a = $b);
    6. print ++($a = 'red');  # "ree"

I don't see why 5 should work and 2 not, nor why 6 works but 1 and 3
do not.  I'm sure the answer is buried deep inside perl.y somewhere.


3. MAGICAL RELATIONAL OPERATORS

Here's another idea: in the spirit of || and && being a bit
magical by allowing you to say:

    $a = $b || $c;

instead of 

    ($a = $b) || ($a = $c);

how about having relational operators like < and > being defined to return
one of their operands in such a way as to make this possible:

    if ($a < $b < $c)
    if ($a > $b > $c)

They're already leftly-associative, so that's ok.  All they'd
need to do is return their left-hand operand if true
instead of 1.  How many scripts would this break?  I think
people usually do 

    if ($a < $b) 

which would still work fine, instead of:

    $a = 1 + ($b > $c);

which would probably not.

Of course, once you've done this, you'd want the same treatment
on "<=", ">=", "lt", "le", "gt", and "ge" as well.

How does that sound?


--tom

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (11/21/90)

In article <108948@convex.convex.com> tchrist@convex.com (Tom Christiansen) writes:
: Here's three postings in one:
: 
: 1. BIT VECTORS
: 
: I've been playing with bit vectors (which are really pretty neat as long
: as you're careful not to == them), and I'm not entirely clear on what the
: third value is in the vec() call.  I thought it was a length analogous
: to substr, but empirical evidence suggests otherwise.  I looked briefly at
: do_vec(), and it looks like you're using it as some sort of multiplier.
: Could you shed any light here?  The only one of these that gives results I
: understand is the first one:
: 
:     vec($a, 40, 1) = 1; &pvec($a);
:     vec($b, 40, 4) = 1; &pvec($b);
:     vec($c, 40, 1) = 4; &pvec($c);
:     vec($d, 40, 4) = 4; &pvec($d);
: 
:     sub pvec {
: 	local($v) = @_;
: 	local($i);
: 	local($len) = length($v);
: 	printf "%5d: ", $len;
: 	print "\n"; return;
: 	for ($i = 0; $i < ($len*8); $i++) {
: 	    print vec($v,$i,1);
: 	    print ' ' if ($i + 1) % 8 == 0;
: 	}
: 	print "\n";
:     }

The third argument (BITS) is the number of bits per element in the vector, so
each element can contain an unsigned integer in the range 0 .. 2**BITS-1.
As many elements are packed into each byte as possible, and the ordering
is such that vec($foo,0,1) is guaranteed to go into the lowest bit of the
first byte of the string.  To find out the position of the byte in which
an element is going to be put, you have to divide the maximum OFFSET (40
in the example above) by the number of elements per byte.  When BITS is 1,
there are 8 elements per byte, so the 40th bit will be put into the byte
at offset 5, making a string 6 long.  When BITS is 4, there are 2 nybbles
per byte, so the value goes into the byte with offset 20, and the length
is 21.
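
As a rough check of that arithmetic, here is a minimal sketch. The helper name vec_byte_offset is made up for illustration, and the code uses "my" and "use strict", which are assumptions about the perl it runs under rather than anything from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): index of the byte that
# holds element OFFSET when each element is BITS wide.
sub vec_byte_offset {
    my ($offset, $bits) = @_;
    return int($offset * $bits / 8);
}

for my $bits (1, 4) {
    my $v = '';
    vec($v, 40, $bits) = 1;    # extends $v with zero bytes as needed
    printf "BITS=%d: byte offset %d, length %d\n",
        $bits, vec_byte_offset(40, $bits), length($v);
}
```

This should report byte offsets 5 and 20 with lengths 6 and 21, matching the figures above.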

Your program (with the return commented out) prints something like this:

    6: 00000000 00000000 00000000 00000000 00000000 10000000
   21: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
       00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
       00000000 00000000 00000000 00000000 10000000
    6: 00000000 00000000 00000000 00000000 00000000 00000000
   21: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
       00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
       00000000 00000000 00000000 00000000 00100000

The lengths we've explained.  Note the second and fourth cases differ
simply in having a 1 vs a 4 in the nybble in question.  In the third
case, you've written a 4 into a single bit, and since 4 % 2 is 0, you
get a 0 bit.
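
The truncation can be seen in isolation with a small sketch. The 17 case is an extra illustration that is not in the original post, and the lexical variables are an assumption about the perl in use.

```perl
use strict;
use warnings;

my ($c, $d, $e) = ('', '', '');
vec($c, 0, 1) = 4;     # 4 is binary 100; its low bit is 0
vec($d, 0, 4) = 4;     # 4 fits in a nybble unchanged
vec($e, 0, 4) = 17;    # 17 is binary 10001; its low four bits are 0001

# Read the elements back to see what was actually stored.
printf "%d %d %d\n", vec($c, 0, 1), vec($d, 0, 4), vec($e, 0, 4);
```

Only the low BITS bits of the assigned value survive, so this should print 0, 4 and 1.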

By the way, as of the next patch, there are pack/unpack options to handle
bit strings and hex strings, so you'll be able to simplify your vector
printing subroutine:

    sub pvec {
	local($v) = @_;
	printf "%5d: ", length($v);
	local($bits) = unpack("b*", $v);
	$bits =~ s/(........)/$1 /g;
	print $bits,"\n";
    }

: And how come I can do |, &, ^, and ~ on bits, but not << or >>?

Nobody's asked for them.  Remember, I'm not a language designer, just
a squeaky wheel greaser.

As Stephane said in soc.culture.french, "perl est franchement blue-collar."

Again, after the next patch, you could say

    sub leftshift {
	local($bits) = unpack("B*",$_[0]);
	substr($bits,0,$_[1]) = '';
	$bits .= '0' x $_[1];		# keep same length
	$_[0] = pack("B*", $bits);
    }

Note the capital B template, which unpacks in the opposite bit order
to lower case b.
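
To make the two bit orders concrete, here is a sketch that unpacks one byte both ways and then runs the same shifting logic on a two-byte string. The name leftshift_sketch and the sample values are illustrative assumptions, and the routine is rewritten with "my" variables rather than copied verbatim.

```perl
use strict;
use warnings;

# "b" lists each byte's bits lowest first, "B" highest first.
my $byte = "\x01";
print unpack("b*", $byte), "\n";    # 10000000
print unpack("B*", $byte), "\n";    # 00000001

# Same logic as leftshift above, using lexical variables.
sub leftshift_sketch {
    my $n    = $_[1];
    my $bits = unpack("B*", $_[0]);
    substr($bits, 0, $n) = '';      # drop the n leading bits
    $bits .= '0' x $n;              # pad on the right, keeping the length
    $_[0] = pack("B*", $bits);      # modifies the caller's variable
}

my $s = "\x01\x80";
leftshift_sketch($s, 1);
print unpack("H*", $s), "\n";       # 0300
```

Because the routine assigns through $_[0], the caller's string is changed in place; a shift count larger than the bit length is not handled in this sketch.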

: 2. LVALUEs
: 
: On lvalues, I thought assign returned an lvalue, but that's not quite
: so in all contexts, is it?  None of these work:
: 
:     1. ($a = $b) = $c;
:     2. chop(@list = <>);
:     3. ($x ? $a : $b) = $c;
: 
: Even though these are just fine:
: 
:     4. ($a = $b) =~ tr/a-z/A-Z/;
:     5. chop($a = $b);
:     6. print ++($a = 'red');  # "ree"
: 
: I don't see why 5 should work and 2 not, nor why 6 works but 1 and 3
: do not.  I'm sure the answer is buried deep inside perl.y somewhere.

There, and consarg.c.  I suspect 1 is being interpreted as an assignment
to a list, though it still oughta work even if that's the interpretation.
However, since it does something stupid, nobody ever uses that construct.
I thought 2 worked once--maybe I busted it.  I've never tried to support 3--
I'd have to change how lvalues are passed on the stack, I think.  Maybe not.
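
For reference, forms 4 through 6 all depend on a scalar assignment yielding the assigned variable as something that can be modified. A minimal sketch of just those three forms, using lexical variables (an assumption; the originals use $a and $b) and making no claim about forms 1 through 3:

```perl
use strict;
use warnings;

my $src = 'red';
my $dst;

($dst = $src) =~ tr/a-z/A-Z/;   # form 4: translates $dst, not $src
print "$dst $src\n";            # RED red

chop($dst = $src);              # form 5: chops the copy only
print "$dst $src\n";            # re red

print ++($dst = 'red'), "\n";   # form 6: string increment gives "ree"
```

In each case the source variable is left alone and only the assignment target changes.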

But lvalues is one of those places where the evolutionary nature of Perl
shows--the processing of lvalues internally is rather ad hoc (not that
the rest of Perl isn't).  No doubt things could be more consistent.
If I didn't have my family to support I'd take six months off and turn
Perl into a real language.

: 3. MAGICAL RELATIONAL OPERATORS
: 
: Here's another idea: in the spirit of || and && being a bit
: magical by allowing you to say:
: 
:     $a = $b || $c;
: 
: instead of 
: 
:     ($a = $b) || ($a = $c);
: 
: how about having relational operators like < and > being defined to return
: one of their operands in such a way as to make this possible:
: 
:     if ($a < $b < $c)
:     if ($a > $b > $c)
: 
: They're already leftly-associative, so that's ok.

Well, actually, they're non-associative at the moment, but that could change.

: All they'd
: need to do is return their left-hand operand if true
: instead of 1.  How many scripts would this break?

Most of them.  Take $a < $b < $c.  What if the left operand is 0?
What if $a is greater than $b but less than $c?  I think you have to
separate out success and failure from return values a la Icon to get
the semantics you want.
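
The zero case can be simulated with an ordinary subroutine. This is only a sketch: magic_lt is a hypothetical stand-in for the proposed operator, and returning 0 on failure is an assumption, since the proposal does not say what a failed comparison yields. The second case is a further consequence of this particular stand-in, not something claimed above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the proposed "<": returns its left operand
# when the comparison holds, and 0 otherwise.
sub magic_lt {
    my ($left, $right) = @_;
    return $left < $right ? $left : 0;
}

# 0 < 5 holds, but the returned left operand is 0, which tests false.
my $zero_case = magic_lt(0, 5) ? "true" : "false";
print "0 < 5 gives $zero_case\n";                 # false

# Chained as (1 < 5) < 3: the inner call yields 1, so the outer call
# tests 1 < 3 and succeeds even though 5 < 3 does not hold.
my $chained = magic_lt(magic_lt(1, 5), 3) ? "true" : "false";
print "1 < 5 < 3 gives $chained\n";               # true

# The explicit form gives the intended answer.
my ($lo, $mid, $hi) = (1, 5, 3);
my $explicit = ($lo < $mid && $mid < $hi) ? "true" : "false";
print "explicit test gives $explicit\n";          # false
```

Writing the two comparisons out with && avoids both problems, at the cost of naming the middle operand twice.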

I'd rather see something like

	if ($b in $a..$c) {

On the other hand, $a..$c currently implies a list of integers--what if
$b is fractional?
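
The fractional case is easy to see with today's operators. In this sketch in_range is a hypothetical helper, the "in" operator above remains imaginary, and grep over the expanded list stands in for a list-membership reading of it.

```perl
use strict;
use warnings;

# Hypothetical helper: inclusive range test that works for non-integers.
sub in_range {
    my ($x, $lo, $hi) = @_;
    return $lo <= $x && $x <= $hi;
}

my $x = 2.5;
my $in_list = grep { $_ == $x } 1 .. 4;    # membership in the list 1,2,3,4
print "list membership: ", ($in_list ? "yes" : "no"), "\n";        # no
print "range test: ", (in_range($x, 1, 4) ? "yes" : "no"), "\n";   # yes
```

A value of 2.5 lies between 1 and 4 but is not one of the integers the range expands to, so the two readings disagree.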

Larry