[comp.lang.misc] Pointers vs. arrays

cik@l.cc.purdue.edu (Herman Rubin) (11/17/90)

In article <5961@lanl.gov>, jlg@lanl.gov (Jim Giles) writes:
> From article <2732@l.cc.purdue.edu>, by cik@l.cc.purdue.edu (Herman Rubin):
> > [...]
> > Well, here is an example which will definitely not work on all machines.
> > It is desired to move 3 bytes into the three least significant bytes of
> > a 4-byte word (other sizes are appropriate), and repeat this operation.
> > This can be done efficiently by using pointers to words on a machine with
> > unaligned reads.  Even with overhead for unaligned reads, it is hard to
> > see how to do this efficiently otherwise.
 
> It's hard to see what this has to do with the array vs. pointer
> issue which was being discussed.  It looks to me like you've got
> a word that's being 'mapped' as an array of bytes and your loop
> is filling the bottom three bytes.  Where are the pointers?

I neglected to mention that one does not care what goes into the leading
byte.  The use of pointers here is NOT conforming to the usual standards.

The code for this would be

	*y++ = *x++;
        decrease x, treated as a byte pointer, by 1;

This uses one unaligned read and one aligned write, as compared to
three reads and writes.
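
A minimal sketch of the loop above in C, under stated assumptions. The function name `copy3_wide` and its signature are hypothetical. The post's form dereferences a word pointer at an unaligned address, which ISO C does not define, so the sketch swaps in `memcpy` for the 4-byte load; on hardware with unaligned reads a compiler can typically turn that into a single load. The source must have 3*n + 1 readable bytes, because every load takes one byte beyond its 3-byte record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not from the post: copy n 3-byte records from x
 * into n 4-byte words at y, loading 4 bytes per step with a source
 * stride of 3.  The fourth byte of each word is whatever follows the
 * record in memory, so x must have 3*n + 1 readable bytes. */
void copy3_wide(uint32_t *y, const unsigned char *x, size_t n)
{
    assert(y != NULL && x != NULL);
    while (n-- > 0) {
        /* Stands in for "*y++ = *x++" with x treated as a word pointer. */
        memcpy(y, x, sizeof(uint32_t));
        y++;
        x += 3;   /* net effect of "advance one word, back up one byte" */
    }
}
```

Which three bytes of each word count as "least significant" depends on the machine's byte order; the sketch only guarantees that the four bytes are copied in memory order.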
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)

jlg@lanl.gov (Jim Giles) (11/18/90)

From article <2742@l.cc.purdue.edu>, by cik@l.cc.purdue.edu (Herman Rubin):
> [...]
> 	*y++ = *x++;
>         decrease x, treated as a byte pointer, by 1;

Ok.  I see now.  I interpreted your initial request differently.
The equivalent non-pointer version is (including the declarations
you left out):

      bit.32 :: y(Number_of_elements)
      bit.24 :: x(Number_of_elements)
      ...
      do i=1,Number_of_elements
	 y(i) = x(i)
      end do

> [...]
> This uses one unaligned read and one aligned write, as compared to
> three reads and writes.

So does my version.  At least, assuming that 32-bit numbers are
aligned.  (Note: 'bit' data types are unsigned; for signed integers,
the declaration is 'int'.  Your use of pointers is thus seen as
an example of needing to get around restrictions caused by inadequate
control over data types.)

J. Giles

hrubin@pop.stat.purdue.edu (Herman Rubin) (11/20/90)

In article <6291@lanl.gov>, jlg@lanl.gov (Jim Giles) writes:
> From article <2742@l.cc.purdue.edu>, by cik@l.cc.purdue.edu (Herman Rubin):
> > [...]
> > 	*y++ = *x++;
> >         decrease x, treated as a byte pointer, by 1;
> 
> Ok.  I see now.  I interpreted your initial request differently.
> The equivalent non-pointer version is (including the declarations
> you left out):
> 
>       bit.32 :: y(Number_of_elements)
>       bit.24 :: x(Number_of_elements)
>       ...
>       do i=1,Number_of_elements
> 	 y(i) = x(i)
>       end do
> 
> > [...]
> > This uses one unaligned read and one aligned write, as compared to
> > three reads and writes.
> 
> So does my version.  At least, assuming that 32-bit numbers are
> aligned.  (Note: 'bit' data types are unsigned; for signed integers,
> the declaration is 'int'.  Your use of pointers is thus seen as
> an example of needing to get around restrictions caused by inadequate
> control over data types.)

Your version may be far less efficient, depending on the hardware.  At
least on some hardware, an unaligned read is not that much slower than
an aligned read.  Also, the instruction sequences are different.  Mine
says to take 4 bytes at a particular location and move them to another,
and then change the pointer for the source location.  Yours says to form
a 3-byte unit at a given location, and move it into the three bytes
at the destination.

As far as the result, there is no important difference.  But as far as
the implementation, there will be.  Yours would be interpreted as taking
those 3 bytes only, and moving them to the destination, with some convention
about the 4th byte at the destination.  Mine will take 4 bytes, and just
move them to the destination.  It involves no processing, other than getting
and putting the bytes.  It will not work on all machines, but it will be
much faster on the ones on which it works.
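
For contrast, a sketch in C of the three-byte interpretation described above: take only the 3 bytes of each record and apply a convention for the fourth. The name `copy3_bytewise` is hypothetical, and two choices here are assumptions rather than anything stated in the thread: the first byte of a record is taken as least significant, and the fourth byte is set to zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: assemble each 3-byte record of x into a 32-bit
 * word, one byte at a time.  Assumed conventions: first byte of the
 * record is least significant, top byte of the result is zero. */
void copy3_bytewise(uint32_t *y, const unsigned char *x, size_t n)
{
    assert(y != NULL && x != NULL);
    for (size_t i = 0; i < n; i++) {
        y[i] = (uint32_t)x[3 * i]
             | (uint32_t)x[3 * i + 1] << 8
             | (uint32_t)x[3 * i + 2] << 16;
    }
}
```

This form needs only 3*n source bytes and works regardless of alignment rules, at the cost of three byte reads plus shifts per word unless the compiler combines them.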

Now conceivably something could be added to the language to allow the
unaligned read/aligned write procedure if the hardware will permit it.
But it still takes the programmer to let the compiler know this.  There
are other situations calling for unaligned reads/writes if they can be
done.  Pointers allow the user who understands the hardware to implement
them.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)

jlg@lanl.gov (Jim Giles) (11/29/90)

From article <16900@mentor.cc.purdue.edu>, by hrubin@pop.stat.purdue.edu (Herman Rubin):
> In article <6291@lanl.gov>, jlg@lanl.gov (Jim Giles) writes:
>> From article <2742@l.cc.purdue.edu>, by cik@l.cc.purdue.edu (Herman Rubin):
>> > [...]
>> > 	*y++ = *x++;
>> >         decrease x, treated as a byte pointer, by 1;
> [...]
> Your version may be far less efficient, depending on the hardware.  At
> least on some hardware, an unaligned read is not that much slower than
> an aligned read.  Also, the instruction sequences are different.  Mine
> says to take 4 bytes at a particular location and move them to another,
> and then change the pointer for the source location.  Yours says to form
> a 3-byte unit at a given location, and move it into the three bytes
> at the destination.

Oh.  I _still_ misunderstood what you were after.  Well, how about this:

   Type int_overlay
      int.32 :: field
   End type int_overlay

   int.32 :: y(0:Number_of_elements-1)
   int.8  :: x(0:3*Number_of_elements)

   map x as int_overlay

   do i = 0,Number_of_elements-1
      y(i) = x(3*i).field
   end do

Remember what I have said before: mapping is a _storage_-associated
operation.  This code picks up a four byte field every three bytes through
the X array and stores these values in the Y array.  Assuming that 32 bits
is a word alignment, this does one unaligned load of a whole word and one
aligned store of that word per trip through the loop.
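
The bounds in the declarations above follow from the access pattern: trip i reads bytes 3*i through 3*i+3, so the last trip ends at byte 3*Number_of_elements and X needs 3*Number_of_elements + 1 bytes. A rough C analogue with that check made explicit might look like the following. The name `overlay_copy`, its signature, and its return convention are hypothetical, and `memcpy` is used here as a stand-in for the mapping, which C does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C analogue of the mapped loop.  x_len is the number of
 * bytes available in x.  Returns 0 on success, -1 if x is too short. */
int overlay_copy(uint32_t *y, size_t n, const unsigned char *x, size_t x_len)
{
    assert(y != NULL && x != NULL);
    /* Trip i reads x[3*i] .. x[3*i + 3]; the last trip (i = n - 1)
     * ends at x[3*n], so 3*n + 1 bytes are required. */
    if (n > 0 && x_len < 3 * n + 1)
        return -1;
    for (size_t i = 0; i < n; i++)
        memcpy(&y[i], &x[3 * i], sizeof y[i]);   /* 4-byte field at stride 3 */
    return 0;
}
```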

> [...]                   It will not work on all machines, but it will be
> much faster on the ones on which it works.

I prefer using portable features whenever possible.  Which is why I
support the addition of features like mappings to existing languages
(or new ones for that matter).  If mappings were available in your
language, the above code would port everywhere.  To be sure, the code
might be very inefficient on hardware which _really_ penalizes unaligned
memory traffic - but your pointer version would be too AND it wouldn't
port everywhere.

J. Giles