[net.unix-wizards] big ptrs, small ints in C

dmr (12/10/82)

There has been enough muttering about the size of pointers that
I suppose I should say something on the subject.

The most obvious machine on which 16-bit integers and 32-bit
pointers are plausible is the Motorola 68000.  If you make a compiler
for the 68000 according to these specs, and follow the manual faithfully,
you get a useful product that successfully handles, for example, most
Unix utilities.  I know of some carelessly coded programs that
did cause problems, but from what I've heard there is mostly
little trouble.  When you use such a compiler to develop new
applications, as we have been doing locally, things are smooth indeed.

However, there is one big difficulty.  The manual states unambiguously
that the type of "sizeof" is unsigned (formerly int; but in any case
int-sized) and the type of ptr-ptr is int.  This makes it difficult to
have a large array, and the ability to use lots of storage is presumably
one of the reasons one wants to use big pointers.

The most obvious solution is to change the language definition to make
the type of sizeof and p-p depend explicitly on the implementation,
and in fact this is my current inclination.  However, just waving the
wand does not solve all problems; in particular a lot of Unix programs
stop working, especially those that contain sizeof or p-p as function
arguments.  (This includes a substantial fraction of those with calls
to read, write, and qsort.)

Let's say that the size of ptrs, and the type of sizeof and p-p, are
freely selectable.  You are implementing a system on the 68000.
Consider these choices and their consequences.

1)  Simplify life and go for 32-bit ints and ptrs.  Unreliable tests
on a small sample of programs indicate you will pay 10-20% typically
in execution time; it rises to a factor of 2 on programs with lots
of multiplication (as in subscripting) or division.

2)  16-bit ints, 32-bit ptrs, short sizeof.  You give up the ability
to declare large arrays.  However, Unix utilities should port with
little trouble, and it should be possible to allocate big arrays
dynamically. (Nothing says that long subscripts can't be handled.)
ptr-ptr is an open question, but I bet it occurs seldom enough
not to be a really hot issue.

3)  16-bit ints, 32-bit ptrs, long sizeof. Least portability for existing
code, and depends on a change in the written standard which will most
assuredly not become a de facto standard merely by being written down.
But it seems the best choice for new applications in today's technology.


A couple of related points that people wondered about.  The current
language definition (as distributed with System III) says that
pointers can be converted to "sufficiently long" integers and back again.
Also, char ptrs are guaranteed to have the most resolution, and
other pointers can be explicitly converted to char pointers and back.

				Dennis Ritchie

lee%usc-cse@Usc-Ecl (12/18/82)

Date:         15 Dec 1982 9:58-PST
The current situation with the M68000 is tempting people to achieve
speed optimization by having the default int size be 16 bits even
though addresses and registers on the M68000 are 32 bits wide.  While
one can save perhaps 10% today with this decision, in a remarkably
short time chip processors will have real 32-bit behavior and this
choice will appear archaic.  The long range benefits from the uniformity
of behavior (and storage usage) over the large class of 32-bit
processors will far outstrip the short term performance improvements.
						-- Lee