[comp.lang.c] pointer sizes, was: Re: What does char **ch mean?

wolfram@akela.informatik.rwth-aachen.de (Wolfram Roesler) (05/15/91)

gwyn@smoke.brl.mil (Doug Gwyn) writes:

>>that the declaration char **ch; is equivalent to char *ch;
>No, they're not at all equivalent.  They might not even have the same size.
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I advised somebody something similar to that, telling him (like I learned
from the FAQ) that weird machines have weird pointers, that (char*)0 and
0L might have different binary representations and the like. His response
to this was:
	"I claim there are no machines like this"
What do you gurus say about this? How about an example of a machine or OS
where this is true?

Wolfram

wulkan@torolab6.vnet.ibm.com ("Mike Wulkan") (05/16/91)

I don't claim to be a guru, but the IBM AS/400 has 4 byte longs and
16 byte pointers.
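
To see what any particular compiler does, something like the following can be used. This is only a sketch: the function names print_sizes and long_holds_char_pointer are made up for this illustration, and the numbers printed are whatever the implementation chooses, not anything the language promises.

```c
#include <stdio.h>

/* Print the sizes this thread is arguing about.  The casts to
   unsigned long keep the printf calls valid on pre-C99 compilers. */
void print_sizes(void)
{
    printf("long            : %lu bytes\n", (unsigned long)sizeof(long));
    printf("char *          : %lu bytes\n", (unsigned long)sizeof(char *));
    printf("char **         : %lu bytes\n", (unsigned long)sizeof(char **));
    printf("int *           : %lu bytes\n", (unsigned long)sizeof(int *));
    printf("void (*)(void)  : %lu bytes\n",
           (unsigned long)sizeof(void (*)(void)));
}

/* Returns 1 if a long is at least as wide as a char * here, else 0.
   This is only a size check; it says nothing about whether converting
   a pointer to long and back preserves the pointer. */
int long_holds_char_pointer(void)
{
    return sizeof(long) >= sizeof(char *);
}
```

On a machine like the one described above, long_holds_char_pointer() would presumably return 0.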

Mike

andy@research.canon.oz.au (Andy Newman) (05/16/91)

In article <wolfram.674309645@akela> Wolfram Roesler writes:
>I advised somebody something similar to that, telling him (like I learned
>from the FAQ) that weird machines have weird pointers, that (char*)0 and
>0L might have different binary representations and the like. His response
>to this was:
>	"I claim there are no machines like this"
>What do you gurus say about this? How about an example of a machine or OS
>where this is true?

Transputers have signed address spaces (starting at 0x80000000 and ending
at 0x7FFFFFFF for 32 bit implementations).  Null pointers should have
the integer value 0x80000000 and not 0 (address 0 is smack bang in the
middle of the address space).

I remember great concern by one compiler writer over what value should be
stored for null pointers. The person used 0 so as not to break all the source
that assumes that null pointers are 0. This threw away half the address
space of the machine (only 2Gb [on a 32 bit machine], what a bummer!).
-- 
Andy Newman (andy@research.canon.oz.au) Canon Info. Systems Research Australia

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/16/91)

In article <wolfram.674309645@akela>, wolfram@akela.informatik.rwth-aachen.de (Wolfram Roesler) writes:
> I advised somebody something similar to that, telling him (like I learned
> from the FAQ) that weird machines have weird pointers, that (char*)0 and
> 0L might have different binary representations and the like. His response
> to this was:
> 	"I claim there are no machines like this"

When I had access to a Prime P400, 0L was 32 bits, but char* pointers
were 48 bits.  Given that pointers contained 2 "ring" bits with three
possible rings (kernel, supervisor, user), there were three distinct
bit patterns that identified (segment 0, word 0), and all of them had
the same effect in user-mode code.  I once had access to a machine where
0L was 64 bits but char* was 32 bits.  If I remember correctly, function
pointers were longer than data pointers on 68000-based Apollos.

On a 386, (far char *)0 could quite plausibly be 48 bits (segment:16,
byte within segment:32) as that's what the hardware is prepared to handle.
I imagine OS/2 would find this useful.  0L would still be 32 bits.

If anyone ever figured out how to put C on a B6700 (what do they call
them now, A series?) pointers have tag 5 and integers have tag 0, so
the binary representation would certainly be different.

Take a look at the address encoding on System 370/XA.
In older System/370s, there were 256 binary encodings of any address
(24-bit address in 32-bit words, top 8 bits ignored) so a compiler
could quite legitimately have implemented (char *)0 as 0xFF000000.
XA machines have 31-bit addresses.  (With the _really_ big machines,
it gets complicated.)

On DEC-10s an address had (indirect: 1, index register: 4,
field width: 6, field offset: 6, word number: 18) or something like
that.  The code for "7-bit character at address 0", which I take to
be the obvious reading of (char *)0, would have to be something like
0000700000000.  (If you want 9-bit bytes, to make it more like a
32-bit machine, use 0001100000000.)  This is not the same as 0L.
0L _would_ be a useful code for (int *)0.

These are just some of the _less_ exotic machines...
-- 
There is no such thing as a balanced ecology; ecosystems are chaotic.

rh@smds.UUCP (Richard Harter) (05/16/91)

In article <wolfram.674309645@akela>, wolfram@akela.informatik.rwth-aachen.de (Wolfram Roesler) writes:

> >>that the declaration char **ch; is equivalent to char *ch;
> >No, they're not at all equivalent.  They might not even have the same size.

> I advised somebody something similar to that, telling him (like I learned
> from the FAQ) that weird machines have weird pointers, that (char*)0 and
> 0L might have different binary representations and the like. His response
> to this was:
> 	"I claim there are no machines like this"
> What do you gurus say about this? How about an example of a machine or OS
> where this is true?

Prime machines (their native line, not the UNIX boxes they are reselling)
are good exercises for the sloppy at heart.  Prior to PRIMOS 19.4, some
pointers were 32 bits and some were 48 bits.  If memory serves me correctly
(int *) was 32, (char *) was 48.  At some point they changed it so that all
pointers were 48 bits.  Also (char *)0 was definitely not 0L; the extra 16
bits hold information -- segment number or ring number or some such.  As a
general rule, machines with segment architecture do funny things with
pointers that enforce your respect for language rules.

My observation is that funny architectures and such don't matter as long
as your code is lint clean -- you never see the problems because you follow
the rules that avoid the problems.  However these fussy rules are only
for portability freaks.  If your friend is only going to have one job in
his life, work on only one machine in his life, and only use one compiler
I see no reason why he should worry about these issues.
-- 
Richard Harter, Software Maintenance and Development Systems, Inc.
Net address: jjmhome!smds!rh Phone: 508-369-7398 
US Mail: SMDS Inc., PO Box 555, Concord MA 01742
This sentence no verb.  This sentence short.  This signature done.

exspes@gdr.bath.ac.uk (P E Smee) (05/16/91)

In article <wolfram.674309645@akela> wolfram@akela.informatik.rwth-aachen.de (Wolfram Roesler) writes:
>gwyn@smoke.brl.mil (Doug Gwyn) writes:
>
>>>that the declaration char **ch; is equivalent to char *ch;
>>No, they're not at all equivalent.  They might not even have the same size.
>                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
>I advised somebody something similar to that, telling him (like I learned
>from the FAQ) that weird machines have weird pointers, that (char*)0 and
>0L might have different binary representations and the like. His response
>to this was:
>	"I claim there are no machines like this"
>What do you gurus say about this? How about an example of a machine or OS
>where this is true?

Don't know about current machines, but certainly on Multics (which
hasn't been dead for that long) there was absolutely no resemblance
between the null pointer (char *) 0, and any other form of 0.
Pretending there was could get you into trouble.

Further, there's no reason to believe that (char *) 0 and 0L will be
the same on all future machines.  The standard doesn't require it.  All
the standard requires is that if you cast 0 to a pointer (something *)
0, the conversion will result in whatever bit pattern that machine/OS
uses as null pointers.  There are absolutely NO guarantees about what
that will look like internally -- if your code cares, then it is NOT
portable, and it WILL come back to haunt you (or your replacement)
someday.
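
A minimal sketch of the distinction, assuming nothing beyond standard C. The struct and the function names (node_init, null_is_all_bits_zero) are invented for this example. Assigning or comparing with the constant 0 is portable; assuming that the stored bytes are all zero is not.

```c
#include <string.h>

struct node {
    struct node *next;
    int value;
};

/* Portable: the constant 0 in a pointer context is converted to a
   null pointer, whatever bit pattern the machine uses for one. */
void node_init(struct node *n)
{
    n->next = 0;
    n->value = 0;
}

/* Reports whether the null char * on this implementation happens to
   be stored as all-bits-zero.  Looking at the bytes is legal; writing
   code that depends on the answer (for example, using memset or
   calloc to "null out" pointers) is the non-portable part. */
int null_is_all_bits_zero(void)
{
    char *p = 0;
    unsigned char zeros[sizeof p];

    memset(zeros, 0, sizeof zeros);
    return memcmp(&p, zeros, sizeof p) == 0;
}
```

After node_init(&n), the tests n.next == 0 and !n.next are both true on every conforming implementation, including those where null_is_all_bits_zero() returns 0.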

-- 
Paul Smee, Computing Service, University of Bristol, Bristol BS8 1UD, UK
 P.Smee@bristol.ac.uk - ..!uunet!ukc!bsmail!p.smee - Tel +44 272 303132

exspes@gdr.bath.ac.uk (P E Smee) (05/16/91)

In article <1991May16.102900.13063@gdr.bath.ac.uk> P.Smee@bristol.ac.uk (Paul Smee) writes:
>In article <wolfram.674309645@akela> wolfram@akela.informatik.rwth-aachen.de (Wolfram Roesler) writes:
>>gwyn@smoke.brl.mil (Doug Gwyn) writes:
>>
>>>>that the declaration char **ch; is equivalent to char *ch;
>>>No, they're not at all equivalent.  They might not even have the same size.
>>                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>
>>I advised somebody something similar to that, telling him (like I learned
>>from the FAQ) that weird machines have weird pointers, that (char*)0 and
>>0L might have different binary representations and the like. His response
>>to this was:
>>	"I claim there are no machines like this"
>>What do you gurus say about this? How about an example of a machine or OS
>>where this is true?
>
>Don't know about current machines, but certainly on Multics (which
>hasn't been dead for that long) there was absolutely no resemblance
>between the null pointer (char *) 0, and any other form of 0.

Apropos current machines, I just recovered my copy of 'Portable C and
Unix System Programming' by J.E. Lapin (a pseudonym) of Rabbit Software
Corp.  (A handy book, by the way, everyone should have one.
Prentice-Hall, ISBN 0-13-686494-5.)  It says:

    On processors such as the 8086, the representation of a null
    pointer may differ from the arithmetic (integer sized) constant 0.
    On the 68000, code generation that exploits the difference between
    data and address registers may break code that expects the null
    pointer to be identical to an integer 0.

-- 
Paul Smee, Computing Service, University of Bristol, Bristol BS8 1UD, UK
 P.Smee@bristol.ac.uk - ..!uunet!ukc!bsmail!p.smee - Tel +44 272 303132

rjc@cstr.ed.ac.uk (Richard Caley) (05/16/91)

In article <wolfram.674309645@akela>, Wolfram Roesler (wr) writes:

wr> What do you gurus say about this? How about an example of a machine or OS
wr> where this is true?

Ohhhhhhh Nooooooooooooo!

--
rjc@cstr.ed.ac.uk	Hell, if it's that time of the year, I think
			I'll go back to rec.arts.books and start the
			Rushdie debate again.

dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) (05/17/91)

In <466@smds.UUCP> rh@smds.UUCP (Richard Harter) writes:

     If your friend is only going to have one job in his life, work on
     only one machine in his life, and only use one compiler...

I would hate to have a friend like that.
--
Rahul Dhesi <dhesi@cirrus.COM>
UUCP:  oliveb!cirrusl!dhesi

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/17/91)

In article <1991May15.234259.2613@research.canon.oz.au>, andy@research.canon.oz.au (Andy Newman) writes:
> I remember great concern by one compiler writer over what value should be
> stored for null pointers [on transputers].
> The person used 0 so as not to break all the source
> that assumes that null pointers are 0. This threw away half the address
> space of the machine (only 2Gb [on a 32 bit machine], what a bummer!).

There is no reason why placing NULL at address 0 should throw away
anything more than one byte of the address space.  C requires a guarantee
that no user-declared variable and no object allocated by malloc() and
friends will ever live at the address which NULL converts to.  It does
_not_ require that NULL convert to the lowest address, and in fact doesn't
define pointer comparison except within (or at the top end of) a single
object.
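
As an illustration of that last point, here is a sketch of a search loop that uses relational comparison only between pointers into one array (or one past its end) and uses only equality against NULL. The name find_int is hypothetical.

```c
#include <stddef.h>

/* Return a pointer to the first element equal to key, or NULL.
   Every `<` comparison is between pointers into the same array, or
   one past its end, which is the only case C defines.  The result
   is never ordered against NULL, only tested for equality, so it
   does not matter where NULL "lives" in the address space. */
const int *find_int(const int *base, size_t n, int key)
{
    const int *end = base + n;   /* one past the end: legal to form */
    const int *p;

    for (p = base; p < end; p++) {
        if (*p == key)
            return p;
    }
    return NULL;
}
```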

I've used a 68000-lookalike where user address space was negative, so
that 0 was not in the user's accessible address space.  On that machine,
NULL converted to 0, not to 0x80000000.  That was sensible.

-- 
There is no such thing as a balanced ecology; ecosystems are chaotic.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/17/91)

In article <466@smds.UUCP>, rh@smds.UUCP (Richard Harter) writes:
> Prime machines (their native line, not the UNIX boxes they are reselling)
> are good exercises for the sloppy at heart.  Prior to PRIMOS 19.4, some
> pointers were 32 bits and some were 48 bits.  If memory serves me correctly
> (int *) was 32, (char *) was 48.  At some point they changed it so that all
> pointers were 48 bits.  Also (char *)0 was definitely not 0L; the extra 16
> bits hold information -- segment number or ring number or some such.

Wrong.  The format of pointers was
    - a 16-bit word, holding
	is pointer valid? 1 bit
	is pointer long?  1 bit
	which ring of protection?  2 bits (3 rings)
	which segment? 12 bits
    - a 16-bit word, holding
	which 16-bit word of that segment?  16 bits
    - an optional extension word (indicated by the "long" bit in word 1)
	which byte of that word? 1 bit
      OR
	which bit of that byte? 4 bits
The P400 I used didn't actually have any instructions that manipulated
bit addresses, but byte addresses were used.  The pointer valid? bit
was used for omitted arguments in procedure calls, amongst other things.
To choose a good NULL code, you'd need to understand the ramifications of
the PCL instruction.  (The machine _insists_ on passing 48-bit pointers
to procedures; the PCL instruction directly supports only pass by reference.)
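
The layout above can be sketched in C as follows. Loud assumptions: the fields are taken to be packed from the most significant bit of the first word in the order listed (valid, long, ring, segment), which the description above does not actually state; the polarity of the "valid" bit is not interpreted; and the extension word is returned raw, because the position of the byte and bit fields inside it is not given. The names prime_ptr and prime_decode are invented for this example.

```c
/* Decoded fields of a pointer in the format described above. */
struct prime_ptr {
    unsigned valid_bit;  /* raw "is pointer valid?" bit, polarity not assumed */
    unsigned is_long;    /* 1 if an extension word follows */
    unsigned ring;       /* 2-bit ring of protection */
    unsigned segment;    /* 12-bit segment number */
    unsigned word;       /* 16-bit word offset within the segment */
    unsigned ext;        /* raw extension word, 0 for a short pointer */
};

/* w[0] and w[1] are always read; w[2] is read only when the "long"
   bit is set.  ASSUMED bit order in w[0], high to low:
   valid(1) long(1) ring(2) segment(12). */
struct prime_ptr prime_decode(const unsigned short w[3])
{
    struct prime_ptr p;
    unsigned first = w[0] & 0xFFFFu;

    p.valid_bit = (first >> 15) & 1u;
    p.is_long   = (first >> 14) & 1u;
    p.ring      = (first >> 12) & 3u;
    p.segment   = first & 0x0FFFu;
    p.word      = w[1] & 0xFFFFu;
    p.ext       = p.is_long ? (w[2] & 0xFFFFu) : 0u;
    return p;
}
```

Under that assumed ordering, first words that differ only in the ring field decode to the same segment and word, which is how several distinct bit patterns can all name (segment 0, word 0).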

> If your friend is only going to have one job in
> his life, work on only one machine in his life, and only use one compiler
> I see no reason why he should worry about these issues.

Exactly.
The universe is not only stranger than we imagine,
it is stranger than we _can_ imagine.
This applies to computers and C compilers too.
-- 
There is no such thing as a balanced ecology; ecosystems are chaotic.

jerry@talos.npri.com (Jerry Gitomer) (05/17/91)

rh@smds.UUCP (Richard Harter) writes:

:My observation is that funny architectures and such don't matter as long
:as your code is lint clean -- you never see the problems because you follow
:the rules that avoid the problems.  However these fussy rules are only
:for portability freaks.  If your friend is only going to have one job in
:his life, work on only one machine in his life, and only use one compiler
:I see no reason why he should worry about these issues.
:-- 
	I think you forgot the :-)  
-- 
Jerry Gitomer at National Political Resources Inc, Alexandria, VA USA
I am apolitical, have no resources, and speak only for myself.
Ma Bell (703)683-9090  (UUCP:  ...uunet!uupsi!npri6!jerry )

rh@smds.UUCP (Richard Harter) (05/18/91)

In article <2274@talos.npri.com>, jerry@talos.npri.com (Jerry Gitomer) writes:
> rh@smds.UUCP (Richard Harter) writes:

> :However these fussy rules are only
> :for portability freaks.  If your friend is only going to have one job in
> :his life, work on only one machine in his life, and only use one compiler
> :I see no reason why he should worry about these issues.

> 	I think you forgot the :-)  

Quite right.  Mea Culpa.  Usenet is famed for its contributors who are
dangerously humor impaired and cannot recognize irony or sarcasm even
when they are laid on with a trowel.  Extensive experience has shown
that the net is quieter and more decorous when the humor impaired are
protected from their propensity to flame by the judicious placement of
smileys.

:-)

-- 
Richard Harter, Software Maintenance and Development Systems, Inc.
Net address: jjmhome!smds!rh Phone: 508-369-7398 
US Mail: SMDS Inc., PO Box 555, Concord MA 01742
This sentence no verb.  This sentence short.  This signature done.

exspes@gdr.bath.ac.uk (P E Smee) (05/20/91)

In article <5805@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>
>There is no reason why placing NULL at address 0 should throw away
>anything more than one byte of the address space.  

Well, almost.  Though I can see how putting a hole right in the middle
of the address space could be painfully inconvenient.  Also, on
segmented architectures (e.g. Multics) putting NULL at address {0,0}
(more precisely 0|0, and it STILL didn't have a bit pattern which
looked anything like 0) would mean throwing away an entire segment, not
just the one byte.

-- 
Paul Smee, Computing Service, University of Bristol, Bristol BS8 1UD, UK
 P.Smee@bristol.ac.uk - ..!uunet!ukc!bsmail!p.smee - Tel +44 272 303132

meissner@osf.org (Michael Meissner) (05/21/91)

In article <466@smds.UUCP> rh@smds.UUCP (Richard Harter) writes:

| My observation is that funny architectures and such don't matter as long
| as your code is lint clean -- you never see the problems because you follow
| the rules that avoid the problems.  However these fussy rules are only
| for portability freaks.  If your friend is only going to have one job in
| his life, work on only one machine in his life, and only use one compiler
| I see no reason why he should worry about these issues.

In supporting a C compiler for a funny machine (the Data General
MV/Eclipse), this is mostly true.  The one exception is calls to
qsort, bsearch, etc.  where invariably the wrong type of pointer is
passed to the comparison routine.  Lint doesn't catch it because it
loses the argument type information.
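
For what it's worth, a sketch of a comparison routine written the careful way, for sorting an array of char *. The names cmp_strings and sort_strings are made up for this example. The usual mistake is to declare the parameters as char ** (or to pass strcmp itself); that only works where every pointer type has the same representation.

```c
#include <stdlib.h>
#include <string.h>

/* qsort passes the comparison routine pointers to two ELEMENTS, as
   const void *.  The elements here are char *, so each argument is
   really a pointer to a char *.  Declare the parameters exactly as
   qsort expects and convert to the right pointer type inside. */
static int cmp_strings(const void *a, const void *b)
{
    const char *const *sa = (const char *const *)a;
    const char *const *sb = (const char *const *)b;

    return strcmp(*sa, *sb);
}

/* Sort an array of n strings into strcmp order. */
void sort_strings(const char **v, size_t n)
{
    qsort(v, n, sizeof v[0], cmp_strings);
}
```

Because cmp_strings has exactly the type qsort is declared to take, no cast is needed at the call, and the compiler can check the call against the prototype.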
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

You are in a twisty little passage of standards, all conflicting.