[comp.lang.c] When it is amoral...

bph@buengc.BU.EDU (Blair P. Houghton) (05/04/89)

			   Hey!  That's me!! :-) --vvvvvvvvvvvvvvvvv
In article <2747@buengc.BU.EDU> bph@buengc.bu.edu (Blair P. Houghton) writes:
>
[...You've read the movie, now go see the book!!...]

Thanks to all who pushed me upright.  I seem to have been tottering on
the brink of looneytunes for the past few days (pulling ten all-nighters
in 2.5 weeks is no way to live, especially when yer a grad stoont and
don't have to put up with that final-exam shit anymore.  Yup.  I go
40hrs sleepless because I _want_to_... %-|)

Anyway, it's now well established that I've harbored a fundamental
misconception about C and pointer indexing for a number of years.

I think I've got it identified, and I hope someone else benefits from
my story.  In Jim Kingdon's emailed reply to my original posting, he
mentioned that I should be wary of scaling in pointer arithmetic.  I
wrote back:

	I know all about the automatic scaling of an integer in
	pointer arithmetic.

	I've just always felt a little guilty about using what I'd
	interpreted to be a convenience.  Rather than a proper
	feature of the C syntax, I'd always felt that adding an
	integer to a pointer was allowed as being "the obvious
	thing that should be done in that situation."  There are
	lots of 'obvious' things that are _not_ implemented that
	way, however, and that should have been a clue.  Still, I
	just took it for granted that what was going on was a
	promotion from integer to pointer, then the addition.

	Now I know much, much better.  A pointer isn't really a
	data type at all.  It's a different sort of beastie,
	germane only within a program, and therefore it is
	fallacious to see it as having the properties of a
	countable, measurable, or otherwise physical entity.

The way pointers work implies that they don't refer at all to
anything physical, even to memory locations.

It doesn't change the fact that I'd like to be able to add and subtract
pointers regardless of what trouble I _might_ get into.  Considering the
general level of danger incurred by programming in something so potentially
obfuscatory as C, it's a small barrier to remove.  The arguments of
"why would you want to do _that_?" don't hold water.  I counter with
"Why would you want variadic functions?" and "Why would you want to
define mathematical routines when you can write your own in assembler
and link to them?"

				--Blair
				  "And one of these days I'll know what
				   the middle pedal on a piano is for."

gwyn@smoke.BRL.MIL (Doug Gwyn) (05/04/89)

In article <2763@buengc.BU.EDU> bph@buengc.bu.edu (Blair P. Houghton) writes:
-	Now I know much, much better.  A pointer isn't really a
-	data type at all.  It's a different sort of beastie,
-	germane only within a program, and therefore it is
-	fallacious to see it as having the properties of a
-	countable, measurable, or otherwise physical entity.
-The way pointers work implies that they don't refer at all to
-anything physical, even to memory locations.

I don't think you have it figured out yet.

-It doesn't change the fact that I'd like to be able to add and subtract
-pointers regardless of what trouble I _might_ get into.

Fine; cast them into the appropriate integral type and do whatever
integral arithmetic your heart desires.  Don't expect anyone else to
appreciate the beauty of your code when you've done this..

guy@auspex.auspex.com (Guy Harris) (05/04/89)

>It doesn't change the fact that I'd like to be able to add and subtract
>pointers regardless of what trouble I _might_ get into.  Considering the
>general level of danger incurred by programming in something so potentially
>obfuscatory as C, it's a small barrier to remove.

I guess the question is "what do you expect to happen when you add
pointers?"  The definition of "pointer + int" doesn't refer to low-level
bit-banging - i.e., the "multiply by the size of the object and then add
the (binary) values of the pointer and the result of the multiplication"
is perhaps more of a guide to understanding for people unfamiliar with
C-style pointer addition but familiar with bit-banging, and a guide to
the implementer, than a definition.  (Check out the Dec. 7, 1988 draft
- the "multiply by the size of the object..." stuff is relegated to a
footnote.)

Can you come up with a similar definition for the addition of two
pointers? If not, it sounds like it may be implementation-dependent, in
which case C implementations may let you do what you want by casting the
pointers properly (perhaps to some integral type), in which case you
*could* do what you want - you just have to say "pretty please" to the C
compiler.  I have no problem whatsoever with that; I *like* it when the
compiler discovers I've screwed up when it detects a type mismatch - it
saves me from discovering it at run time.  Think of it as a fail-safe.
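
For illustration only (not part of the original post): a minimal sketch of the "pointer + int" definition stated in terms of array elements rather than bit-banging. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers: each reads element n of the array p points into. */
int nth_by_arith(const int *p, size_t n)
{
    return *(p + n);            /* pointer + integer, then indirection */
}

int nth_by_index(const int *p, size_t n)
{
    return p[n];                /* defined by the language as *(p + n) */
}

/* Subtracting the original pointer recovers the element count, not bytes. */
ptrdiff_t steps_between(const int *p, size_t n)
{
    assert(p != NULL);
    return (p + n) - p;
}
```

Nothing here depends on sizeof(int) or on how the machine represents an address, which is the sense in which the multiply-by-size description is only a guide.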

peter@ficc.uu.net (Peter da Silva) (05/04/89)

Basically a pointer is a cardinal, not an ordinal.
-- 
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.

Business: uunet.uu.net!ficc!peter, peter@ficc.uu.net, +1 713 274 5180.
Personal: ...!texbell!sugar!peter, peter@sugar.hackercorp.com.

feg@clyde.ATT.COM (Forrest Gehrke) (05/05/89)

In article <2763@buengc.BU.EDU>, bph@buengc.BU.EDU (Blair P. Houghton) writes:
> 
> The way pointers work implies that they don't refer at all to
> anything physical, even to memory locations.
> 

On some systems, particularly one that does not deal in virtual 
memory, a pointer can be referring to a real memory location.
Which leads me to wonder what you would do, as you insist upon
being able to do, with the sum of two pointers?  If you can 
find a use for this, why stop there?  There's also multiplication
and division......

> It doesn't change the fact that I'd like to be able to add and subtract
                                                         ^^^ 
> pointers regardless of what trouble I _might_ get into.  Considering the
> general level of danger incurred by programming in something so potentially
> obfuscatory as C, it's a small barrier to remove.  The arguments of
> "why would you want to do _that_?" don't hold water.  I counter with
> "Why would you want variadic functions?" and "Why would you want to
> define mathematical routines when you can write your own in assembler
> and link to them?"

Perhaps this question is asked because while we can think of uses
for wanting these things, we haven't thought of what we could do 
with the sum of pointers.  You haven't yet provided any 
motives either.

Forrest Gehrke

bph@buengc.BU.EDU (Blair P. Houghton) (05/05/89)

In article <1558@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
>
>Can you come up with a similar definition for the addition of two
>pointers?

As Gwyn said (and he thinks I rejected it, but I didn't, he should check
out that I was referring to his statement, not mine...) "addition of two
pointers is meaningless."  Addition of 0xfffe and 0xfffe puts you over the
top of a data segment and that's bogus.  Add that, as I posted earlier
today, I now see that the "scaling" of pointers isn't to be trusted,
either, and this addition doesn't even do what it appears to do, whether
that's what I want or not.

Addition of two pointers?  I'd never do it.  Addition of two things that
have pointer type?  That I might do.

	float array[100];
	float *a1, *a2;

	float *diffa;

	struct gomessyourself brray[100];
	struct gomessyourself *b1, *b2;

	/* ...Code setting the array values, and pointing
	   a1, a2, and b1 into their respective arrays... */

	dobedobedoo(array,brray,&a1,&a2,&b1);

	/* Want to align the pointer differences; how far
	   apart are the a's ? */

	diffa = a2 - a1;  /* Bogus in normal C */

	/* Now align the b's */

	b2 = b1 + diffa;

What does that do?  It tries to add a float-pointer to a
struct-ugly-pointer.  Mismatch errors fill my screen with a phosphorescent
radiance.  And well they should.

Normal C, however, allows this sort of thing by telling one to declare

	int diffa;

and merrily arithmeticise.  No warnings, no complaints, not even a bump
in the great, grassy field of complacen-C....

Mind you, it's real nice to be able to do it this way, and if one
couldn't, we'd all be screaming for it, but it feels the same as adding
shorts, longs, and ints together haphazardly.

				--Blair
				  "Okay, `long diffa'.
				   There, you happy?"

guy@auspex.auspex.com (Guy Harris) (05/05/89)

>Normal C, however, allows this sort of thing by telling one to declare
>	
>	int diffa;
>
>and merrily arithmeticise.  No warnings, no complaints, not even a bump
>in the great, grassy field of complacen-C....

Or, even better, "ptrdiff_t diffa", in (p)ANS C at least, where
"ptrdiff_t" is defined in <stddef.h>.

>Mind you, it's real nice to be able to do it this way, and if one
>couldn't, we'd all be screaming for it, but it feels the same as adding
>shorts, longs, and ints together haphazardly.

Well, think of "ptrdiff_t" as syntactic sugar-coating, or syntactic
buffering to prevent an upset stomach (you remember, the little "B"s
bouncing around your stomach, which don't have the sharp points that the
little "A"s do, and thus don't poke into your stomach lining); you now
have objects of various flavors of pointer type, of various flavors of
integral type - and of a type that means "difference between pointers",
although that type happens to be one of the integral types.  Now, while
pointers come in different types, "difference between pointers" happens
to come in only one type - but then, measurements come in different
types (cm, inches, grams, furlongs, fortnights, etc.), but ratios of
measurements come in one type, too....
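
For illustration only (not part of the original post): a minimal sketch of carrying an element distance from one array over to another with ptrdiff_t. The struct and function names are hypothetical stand-ins for the ones in the quoted example.

```c
#include <assert.h>
#include <stddef.h>

struct record { int id; double value; };   /* hypothetical element type */

/*
 * Move b1 by the same number of elements that separates a1 and a2.
 * Assumes a1 and a2 point into the same float array and that the
 * result still lands inside (or one past the end of) b1's array.
 */
struct record *align_like(const float *a1, const float *a2,
                          struct record *b1)
{
    ptrdiff_t diffa;

    assert(a1 != NULL && a2 != NULL && b1 != NULL);
    diffa = a2 - a1;            /* a count of floats, not of bytes */
    return b1 + diffa;          /* rescaled to sizeof(struct record) */
}
```

The distance is a plain element count, so it carries over between arrays of unrelated element types; the compiler applies each type's own size.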

hutch@lzaz.ATT.COM (R.HUTCHISON) (05/05/89)

Re: addition of two pointers

How 'bout in a binary search to find the middle char in an array of
characters?

	midpoint_pointer = (start_pointer + end_pointer) / 2;

Yes, I realize that you can...

	midpoint_pointer = ((end_pointer - start_pointer) / 2) + start_pointer;

... but I think people were asking when you might ever want to add two
pointers and where it could possibly be meaningful.  After all, if the
two variables were not pointers and represented distances (in feet) 
and I wanted the midpoint, I might choose the first approach.

Bob Hutchison
lzaz!hutch

peter@ficc.uu.net (Peter da Silva) (05/06/89)

In article <563@lzaz.ATT.COM>, hutch@lzaz.ATT.COM (R.HUTCHISON) writes:
> 	midpoint_pointer = (start_pointer + end_pointer) / 2;

You're right. It's a valid operation.

(what's the average of the 7 clubs and the 9 clubs?)

dts@quad.uucp (David T. Sandberg) (05/06/89)

In article <4093@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>In article <563@lzaz.ATT.COM>, hutch@lzaz.ATT.COM (R.HUTCHISON) writes:
>> 	midpoint_pointer = (start_pointer + end_pointer) / 2;
>
>You're right. It's a valid operation.

It is?  What about when the difference between the two is an odd
number of multibyte units?  Writing to *midpoint_pointer after
setting it up in such a fashion is going to corrupt some data
(unless, of course, you *want* to modify the last byte(s) of one
unit and the first byte(s) of the next ;')

>(what's the average of the 7 clubs and the 9 clubs?)

More to the point, what's the difference between the 7 clubs and the
8 clubs?

Perhaps you mean that it is an *allowed* operation, without regard
to the wisdom of same.  Or perhaps you guys are only talking about
char pointers (and then what happens if you run into a multibyte char
machine?).

-- 
  char *david_sandberg()
  {
      return ( dts@quad.uucp || uunet!rosevax!sialis!quad!dts );
  }

peter@ficc.uu.net (Peter da Silva) (05/07/89)

In article <127@quad.uucp>, dts@quad.uucp (David T. Sandberg) writes:
> It is?  What about when the difference between the two is an odd
> number of multibyte units?

You have to align the result before using it, of course.

> >(what's the average of the 7 clubs and the 9 clubs?)

> More to the point, what's the difference between the 7 clubs and the
> 8 clubs?

	clubs_t clubs[13];

	printf("%d\n", (int)(&clubs[7] - &clubs[8]));

The result better be -1.
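
For illustration only (not part of the original post): a minimal sketch of the same subtraction with the result kept in a ptrdiff_t. The definition of clubs_t is an assumption, since the post does not give one.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { char rank; char suit; } clubs_t;  /* hypothetical layout */

/* Element distance a - b; both must point into the same array. */
ptrdiff_t club_diff(const clubs_t *a, const clubs_t *b)
{
    assert(a != NULL && b != NULL);
    return a - b;
}
```

The answer does not depend on sizeof(clubs_t). In C99 and later, a ptrdiff_t can be printed directly with the "%td" conversion.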

mcdaniel@uicsrd.csrd.uiuc.edu (Tim McDaniel) (05/07/89)

Suppose that pointer addition means that "&A[i] + &A[j] == &A[i+j]".
(Is this what you had in mind, Peter?)

In a previous article, I showed that "&A[i] + &A[j] == &A[i+j]"
requires (in general) that pointers be represented internally as pairs
"(location,base_of_array)".  To briefly recap, if we want
        Total = A + B;
we can't just add the addresses as integers.  For example, suppose
that A and B point into an array Array that starts at address 100, and
that A is 102 and B is 104.  A+B must then be 106.  Adding A+B as
integers would yield 102+104 == 206, which is incorrect.  We must use
pairs as above and compute Total by
        Total.base = A.base;            /* or B.base */
        Total.loc = (A.loc - A.base) + B.loc;
(so Total.loc = (102-100)+104 == 106, as required).
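
For illustration only: the pair representation above, sketched with addresses modelled as plain unsigned integers. The struct and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical "fat pointer": an address plus the base of its array. */
struct fatptr {
    unsigned long loc;
    unsigned long base;
};

/* Pointer "addition" in the sense &A[i] + &A[j] == &A[i+j]. */
struct fatptr fat_add(struct fatptr a, struct fatptr b)
{
    struct fatptr total;

    assert(a.base == b.base);   /* both must refer to the same array */
    total.base = a.base;
    total.loc = (a.loc - a.base) + b.loc;
    return total;
}
```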


In article <127@quad.uucp>, dts@quad.uucp (David T. Sandberg) writes:
> More to the point, what's the difference between the 7 clubs and the
> 8 clubs?
I think he meant "midpoint", not "difference".

In article <4097@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>You have to align the result before using it, of course.

I agree that these results would have to be aligned.  But how?

For
            some_type * M, * A, * B;
let's compare
  a:        M = (A + B) / 2;
versus
  b:        M = A + (B - A) / 2;

a: M, A, and B take 6 addresses to represent: (M.loc,M.base),
(A.loc,A.base), (B.loc,B.base).  Assume that A and B point into the
same array, so A.base==B.base.  The best implementation I can think of
is
        t = (A.loc + B.loc) / 2                   /* average */
                - A.base;                         /* begin aligning */
        M.loc = (t - t % sizeof *A) + A.base;     /* rest of aligning */
        M.base = A.base;
You see, it's not guaranteed that "(A.loc+B.loc)/2" is aligned for
A.base.  We have to see how far it is from A.base (the first
subtraction), align that to "sizeof (some_type)" (second subtraction
and the modulo), and add A.base back in to get the address.
Total operation count: 1 divide (or right shift), 1 modulo (or
bitwise-AND if "sizeof *A" is a power of 2), 4 adds/subtracts.  Note
that a modulo is usually as expensive as a divide.

We can do better if A.base is known to be aligned on a "sizeof (*A)"
boundary.  However, many systems can't guarantee this for arbitrarily-
large types.  For example, if A.base was malloced, it may only be at
an 8-byte boundary (for a VAX).  Even if we know that A.base is so
aligned (for example, if some_type is long, and all longs are so
aligned), we can only improve it to
        t = (A.loc + B.loc) / 2;
        M.loc = t - t % sizeof *A;
        M.base = A.base;
Operations: 1 divide (or right shift), 1 modulo (or bitwise-AND if
"sizeof *A" is a power of two), 2 adds/subtracts.  Even this solution
ignores overflow for the first add.

b: If no single data object takes more than half of memory, overflow
is impossible.  Representing the 3 pointers uses only 3 addresses:
Mloc, Aloc, Bloc. 
        Mloc = Aloc + (Bloc - Aloc) / sizeof *A / 2 * sizeof *A;
Operations: 1 divide (or right shift if "sizeof *A" is a power of 2),
1 multiply (or left shift) to rescale to bytes, 2 adds/subtracts.
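
For illustration only: the two midpoint computations side by side, with addresses modelled as plain unsigned integers. The function and parameter names are hypothetical, and `size` stands in for "sizeof *A".

```c
#include <assert.h>

/* Approach (a): average the raw addresses, then realign relative to base. */
unsigned long mid_by_sum(unsigned long a, unsigned long b,
                         unsigned long base, unsigned long size)
{
    unsigned long t;

    assert(size > 0 && a >= base && b >= base);
    t = (a + b) / 2 - base;     /* a + b can overflow for high addresses */
    return (t - t % size) + base;
}

/* Approach (b): halve the element distance, then step forward from a. */
unsigned long mid_by_diff(unsigned long a, unsigned long b,
                          unsigned long size)
{
    assert(size > 0 && a <= b);
    return a + (b - a) / size / 2 * size;
}
```

With base 100, 4-byte elements, and A and B at elements 1 and 6 (addresses 104 and 124), both versions yield 112, that is, element 3. Only (a) needs the base and risks overflow in the addition.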


So pointer addition doubles the size of pointers, runs slower, and is
susceptible to overflow.  As far as I can tell, the only thing it
gains us is the ability to write
        midpoint = (start + end) / 2;
instead of
        /* midpoint is the midpoint of start and end, because
         * A + (B-A)/2 == A + B/2 - A/2 == A/2 + B/2 == (A+B)/2. */
        midpoint = start + (end - start) / 2;


Peter, you can't just say "let there be pointer addition", and
                         FIAT LUX!
there is pointer addition, and it is good.  You have to propose an
actual implementation.  Furthermore, if this implementation is more
costly than the current situation, you have to show that the gains
outweigh the costs.  You have done neither.

--

             Tim, the Bizarre and Oddly-Dressed Enchanter

Center for      |||  Internet, BITNET:  mcdaniel@uicsrd.csrd.uiuc.edu
Supercomputing  |||  UUCP:     {uunet,convex,pur-ee}!uiucuxc!uicsrd!mcdaniel
Research and    |||  ARPANET:  mcdaniel%uicsrd@uxc.cso.uiuc.edu
Development,    |||  CSNET:    mcdaniel%uicsrd@uiuc.csnet
U of Illinois   |||  DECnet:   GARCON::"mcdaniel@uicsrd.csrd.uiuc.edu"

dts@quad.uucp (David T. Sandberg) (05/08/89)

In article <925@garcon.cso.uiuc.edu> mcdaniel@uicsrd.csrd.uiuc.edu (Tim McDaniel) writes:
>In article <127@quad.uucp>, dts@quad.uucp (David T. Sandberg) writes:
>> More to the point, what's the difference between the 7 clubs and the
>> 8 clubs?
>
>I think he meant "midpoint", not "difference".

Yes, "midpoint" is a more correct description of what I meant.  In
retrospect, I should consider myself fortunate that someone actually
understood what I was trying to say.   ;')


gwyn@smoke.BRL.MIL (Doug Gwyn) (05/08/89)

In article <4093@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>In article <563@lzaz.ATT.COM>, hutch@lzaz.ATT.COM (R.HUTCHISON) writes:
>> 	midpoint_pointer = (start_pointer + end_pointer) / 2;
>You're right. It's a valid operation.
>(what's the average of the 7 clubs and the 9 clubs?)

What's the average of the 7 of clubs and the 10 of clubs?
What if sizeof(club) > 1?  Shouldn't we end up pointing into
the middle of a club?

How is any sane language definition supposed to factor in
such application-specific notions of what is "appropriate"
behavior?

C provides a sufficiently simple way to specify exactly what
is intended in such situations.

throopw@bert.dg.com (Wayne A. Throop) (05/08/89)

> bph@buengc.BU.EDU (Blair P. Houghton)
> Thanks to all who pushed me upright.  

I think either your seat or your tray table is not yet in the
full upright position, because what you say here:

> I think I've got it [..that is, the misunderstanding about pointer 
> arithmetic..] identified, and I hope someone else benefits from
> my story.  In Jim Kingdon's emailed reply to my original posting, he
> mentioned that I should be wary of scaling in pointer arithmetic.  I
> wrote back:
> 	I know all about the automatic scaling of an integer in
> 	pointer arithmetic.
>       [... and ...]
> 	Now I know much, much better.  A pointer isn't really a
> 	data type at all.  
> [... and ...]
> The way pointers work implies that they don't refer at all to
> anything physical, even to memory locations.

This is wrong.  Very wrong.  First off, the integer is not scaled
in pointer/integer arithmetic.  The pointer is.  Second, pointers
are data types every bit as much as floating point, or integers, or
characters are.  Third, pointers work as if they are references to
any of the types of objects expressible in C (though not all instances
of those objects can be referred to by a pointer value).

The pointer is scaled so that the details of a particular machine's
implementation of pointers do not leak into portable code.  The details
are hidden by the compiler, so that machine-independent pointer
manipulations can be done.  This is a major innovation in C, and
generally (as low level features go) a Good Thing.

Pointers are data types, in that they are objects which contain 
information just as any other C type does.  As with, say, floating
point types, C remains very cagey about the exact structure of
pointer data expressed in bits.

And finally pointers have well-defined operations which can be performed
against them, in particular indirection and arithmetic, and well-defined
meanings for the values they can take on.  So I'd say they have a very
good definition of what they "work as if".

> It doesn't change the fact that I'd like to be able to add and subtract
> pointers regardless of what trouble I _might_ get into.  Considering the
> general level of danger incurred by programming in something so potentially
> obfuscatory as C, it's a small barrier to remove.  

So why try to gild the lily?  This barrier HAS been removed, in that
when pointers and integers are mixed, very reasonable, machine-independent
semantics are attached to the process.

The point is, Blair expects an object defined as one type to behave both
as if it were a pointer and an offset at the same time.  This is NOT sensible.

--
If it could be demonstrated that any complex organ existed which could not
possibly have been formed by numerous, successive, slight modifications,
my theory would absolutely break down.
                              --- Charles Darwin
--
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw

diamond@diamond.csl.sony.junet (Norman Diamond) (05/08/89)

In article <563@lzaz.ATT.COM> hutch@lzaz.ATT.COM (R.HUTCHISON) writes:

>Re: addition of two pointers
>
>How 'bout in a binary search to find the middle char in an array of
>characters?
>
>	midpoint_pointer = (start_pointer + end_pointer) / 2;

Yup, and if you have an array of ints?  Let's see, half-way between
the 1st and 6th elements, we have the 3.5th element.

Well yeah, you've got chars not ints.  Only the machine uses word
addressing so that your pointers use some extra bits to indicate
the offset within a word.

If you're planning to leave your present employer before they ever
have to port your code to another machine, you could do this:

   midpoint_pointer = (char *) (((unsigned long) start_pointer +
      (unsigned long) end_pointer) / 2);

--
Norman Diamond, Sony Computer Science Lab (diamond%csl.sony.co.jp@relay.cs.net)
  The above opinions are my own.   |  Why are programmers criticized for
  If they're also your opinions,   |  re-inventing the wheel, when car
  you're infringing my copyright.  |  manufacturers are praised for it?

peter@ficc.uu.net (Peter da Silva) (05/08/89)

In article <925@garcon.cso.uiuc.edu>, mcdaniel@uicsrd.csrd.uiuc.edu (Tim McDaniel) writes:
> Suppose that pointer addition means that "&A[i] + &A[j] == &A[i+j]".
> (Is this what you had in mind, Peter?)

No, I had in mind ((ptrdiff_t)&A[i]) - ((ptrdiff_t)&A[j]). It's an
intermediate result only. Which you point out in great detail.

> Peter, you can't just say "let there be pointer addition", and
>                          FIAT LUX!
> there is pointer addition, and it is good.  You have to propose an
> actual implementation.  Furthermore, if this implementation is more
> costly than the current situation, you have to show that the gains
> outweigh the costs.  You have done neither.

Chill out, dude :->.

I was arguing *against* it as a general operation. I was just noting that there
*is* (despite my previous beliefs) an application for it.

bph@buengc.BU.EDU (Blair P. Houghton) (05/09/89)

In article <5779@xyzzy.UUCP> throopw@bert.dg.com (Wayne A. Throop) enscreeds:
>> bph@buengc.BU.EDU (Blair P. Houghton)
>> Thanks to all who pushed me upright.  

Yeah, thanks for the whiplash... ( ;-) just kidding :-) )

>The point is, Blair expects an object defined as one type to behave both
>as if it were a pointer and an offset at the same time.  This is NOT sensible.

Nay, I want it to behave as an object, and not as some nebulous changeling
requiring maintenance-by-fiat.  If one is yea-big and the other is yo-big,
then I damn well want the difference between them to be yea-yo, at the
very least when I _tell_ it to be so.

It's sensible aplenty.  The only question is:  does it add sufficient
functionality to justify the trouble it allows?  That there's a question
for the more scientific amongst this gaggle of Computo-Scientists.

>If it could be demonstrated that any complex organ existed which could not
>possibly have been formed by numerous, successive, slight modifications,
>my theory would absolutely break down.

Interesting logic in there: "If it could...then...would absolutely"

It implies that it must be that "it" could not; since therefore the theory
would break down absolutely, and since the theory doesn't, then therefore
"it" couldn't "be demonstrated".

Good.  That saves me a lot of time trying to demonstrate it.

>                              --- Charles Darwin

I get flamed by the most impressive people...

				--Blair
				  "Speaking of fiat..."

stacey@hcr.UUCP (Stacey Campbell) (05/10/89)

In article <4093@ficc.uu.net> peter@ficc.UUCP writes:
>In article <563@lzaz.ATT.COM>, hutch@lzaz.ATT.COM (R.HUTCHISON) writes:
>> 	midpoint_pointer = (start_pointer + end_pointer) / 2;
>
>You're right. It's a valid operation.
>
>(what's the average of the 7 clubs and the 9 clubs?)

Que?

On a segmented architecture p1 + p2 blows up badly.

How about;

	mid = start + (end - start) / 2;
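
For illustration only (not part of the original post): a minimal sketch of the binary search under discussion, using the difference form of the midpoint. The function name and the half-open [start, end) convention are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Binary search for key in the sorted chars [start, end).
 * Returns a pointer to a matching element, or NULL if there is none.
 */
const char *find_char(const char *start, const char *end, char key)
{
    assert(start != NULL && start <= end);
    while (start < end) {
        /* pointer - pointer gives a count; pointer + count gives a pointer */
        const char *mid = start + (end - start) / 2;

        if (*mid == key)
            return mid;
        if (*mid < key)
            start = mid + 1;
        else
            end = mid;
    }
    return NULL;
}
```

No two pointers are ever added, so the same loop works unchanged for any element type and on segmented machines.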
-- 
Stacey Campbell, HCR Corporation, {lsuc,utzoo,utcsri}!hcr!stacey

jzs@bridge2.ESD.3Com.Com (Jeremy A. Siegel) (05/10/89)

>>The point is, Blair expects an object defined as one type to behave both
>>as if it were a pointer and an offset at the same time.  This is NOT sensible.

>Nay, I want it to behave as an object, and not as some nebulous changeling
             ^^              ^^^^^^^^^
>requiring maintenance-by-fiat.  If one is yea-big and the other is yo-big,
>then I damn well want the difference between them to be yea-yo, at the
>very least when I _tell_ it to be so.

My, my!  I'd nearly stopped reading this thread; it seems to have 
generated so much theoretical discussion (pun(?) intended).  But I
just had to try on this one...

There is no such thing as "a pointer" -- the pointer (*) syntax is
a *constructor*, producing a new and exciting object each time it is
used!  That is, there is one type "pointer to char" which is ALWAYS
a certain size, and which always behaves exactly the same under the
defined arithmetic operations... there's another type "pointer to 
long" that is also self-consistent: but it admittedly does not behave
the same as a "pointer to char", and there's no reason why it should.
(Are the two arrays (int a1[10]) and (int a2[5]) the same size?)
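
For illustration only (not part of the original post): a minimal sketch of how the same "+ 1" moves different pointer types by different distances. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes covered by one step of a "pointer to char". */
size_t char_stride(const char *p)
{
    assert(p != NULL);
    return (size_t)((const char *)(p + 1) - (const char *)p);
}

/* Bytes covered by one step of a "pointer to long". */
size_t long_stride(const long *p)
{
    assert(p != NULL);
    return (size_t)((const char *)(p + 1) - (const char *)p);
}
```

The two functions have identical bodies but differ in result because the parameter types differ. Likewise, sizeof(int[10]) is twice sizeof(int[5]).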

It really is important that pointers and offsets not be confused; it is
also important that offsets and integers not be confused -- even though
they're kind of the same.  (I'm not really sure what I mean by that, 
except that an integer -- e.g. 5 -- when used in an operation with a
pointer should not be viewed as an integer but as an offset, which has
"units"; that is, not "five" but "five of those").

As people have said here already, don't think of a pointer as a certain
bit-pattern, or of incrementing one as integer addition.  (Get hold of a
PDP-10, or its assembly manual, and look at byte pointers and the
instructions that modify them -- it'll cure you in a hurry.)

Rather than agree/disagree on the merits of earlier arguments presented,
how about a plea for peace: there was a statement about pointers not having 
"units".  Let's pretend they do, but we don't know what to call them -- and
that you certainly can add two pointers, but the thing you get has different
units, and there are no other operations defined on it, so who cares?

(Maybe look at pointers as logs?  I can add and subtract them and I know
how to use the result for multiplication and division.  I can also 
multiply them, but I don't know what to do with the result.  [Note the
*I*, not *you*, since there's sure to be some *you* out there just
dying to explain about products of logarithms.  Direct those followups
to sci.math where I won't have to read them :-] )

--Jeremy Siegel
  3Com Corp.
  Mountain View

scs@sloth.pika.mit.edu (Steve Summit) (05/11/89)

I think everybody has figured out by now why pointer addition
doesn't work or make sense, but I'll throw in another
perspective just for good measure.

I think Blair's original confusion stemmed from wanting to treat
a pointer as an actual memory address.  It's true that pointers
are represented on many if not most machines by actual memory
addresses, and that pointers are generated with the "&" operator
which is named "address of", and that thinking about machine
addresses is often a helpful way to think about pointers; but as
has been amply pointed out, a pointer is properly a higher-level
language construct which is removed from, and insulates the
programmer from the details of, the implementation.

The one time I wanted to add pointers was when writing a dynamic
linker.  It seems reasonable, at first, to use pointers to
describe the addresses (locations) of the symbols within an
object module being read in.  (I'm already groping though; I
talked about pointers as addresses because that's how they're
usually implemented; now I'm turning around and trying to
implement an address as a pointer as if they were the same.)

Additionally, I may well have a pointer (call it "base") to the
spot in memory into which the object module is being dynamically
read.

One of the things a linker must do is relocation.  Suppose an
object module defines a symbol x, and that the symbol's
address/location is 4 (relative to the beginning of the object
module; that is, the object module essentially defines a frame of
reference that assumes that the module begins at location/address 0.)
If I am using a pointer as my generic address type, I might
read the "4" out of the object module's symbol table and cram it
(via any suitable means) into a variable (call it "loc") of
pointer type.

Once the object module is read in, the actual address/location in
memory of the symbol x will be base + loc (x was at "loc"
relative to the start of the module, which is being read in at
address/location "base").

So, if I had declared both base and loc as pointers, the compiler
would have complained when I tried to compute base + loc.  (In
fact, that is how I did attempt to write it, at first; and in
figuring out why it couldn't work, I gained a deeper understanding
of the relationship, and differences, between pointers and
addresses, which is what I am trying to impart here.)

The problem is that, in writing a linker, I have dropped
completely beneath the machine-independent high-level abstract
model which the language provides.

What I eventually did was to represent addresses/locations as
unsigned integers, no longer attempting to disguise the fact that
I was, in fact, dealing with actual machine addresses, which are
(for a flat-address machine) honest-to-God numbers, not
pointers.

On a non-flat-address space machine, the appropriate type for a
machine address may be some semi-complicated structure, and
computing base + loc might require a subroutine call (which C++
could hide for me...) to a routine which knew about the memory
model of the machine in use.

Obviously, such code is unportable, but code which gets real
close to the machine (assemblers, linkers, debuggers, kernels)
does tend to have its nonportable aspects.  (Don't lose heart,
though; they can also have their portable aspects.)

The moral is, if you want (and have good reason) to talk about
actual machine addresses, don't beat around the bush with
pointers.  Use integers or structures or whatever you have to
use to accurately describe the machine addresses you're actually
using.

                                            Steve Summit
                                            scs@adam.pika.mit.edu

throopw@dg-rtp.dg.com (Wayne A. Throop) (05/13/89)

> bph@buengc.BU.EDU (Blair P. Houghton)
>>The point is, Blair expects an object defined as one type to behave both
>>as if it were a pointer and an offset at the same time.  This is NOT sensible.
> Nay, I want it to behave as an object, and not as some nebulous changeling
> requiring maintenance-by-fiat.  If one is yea-big and the other is yo-big,
> then I damn well want the difference between them to be yea-yo, at the
> very least when I _tell_ it to be so.
> It's sensible aplenty.  

"Nebulous changeling"? "Yea-big"?  "Yo-big"?  This is *sensible*?

I'm guessing here, but I suppose that Blair is being flip.  But I have
no idea whatsoever why the rather clear-cut, simple and elegant semantics
that C has assigned to pointer arithmetic should be characterized as
"nebulous".  Pointers DO behave as objects, so I don't know why Blair
makes special mention that he wishes they did behave so.

And finally, several people have pointed out why a single object
representing both position and offset is not a good idea, so I have
NO idea what all this yea-yo stuff is about.... maybe it's a bilingual
joke I don't get, a takeoff on "viva yo" or "yeah me" or something.

A show of hands here.  How many think a pointer in C doesn't "behave
as an object"?   Yeah, I thought so...

--
"You'd be surprised... they're all separate little countries down there."
                                        --- Ronald Wilson Reagan
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw