[comp.lang.c] ANSIfication of 4BSD

chris@mimsy.UUCP (Chris Torek) (03/28/88)

(I may regret this, but here goes. . . .)  Here is a hypothetical
question that has nothing to do with new languages (or does it?
oh never mind).

Suppose you had the chance to decide whether the next 4BSD
release---what I have been calling 4.3-tahoe---were changed
to be more in line with the dpANS, in that:

 0. the compiler were to accept (void *) as a type, and freely
    interconvert (void *) and any other pointer type;

 1. missing functions (e.g., strtol, memmove) were written;

 2. functions that now take or return (char *) (e.g., malloc)
    were changed to take or return (void *);

 3. sizeof() were changed to yield (unsigned int) constants rather
    than (int) constants;

 4. the unsigned widening rules were changed from sign-preserving
    to value-preserving;

or any combination of the above.  I PROMISE NOTHING!  (For that matter,
I really have little say in the whole affair.)  But 0. is already half
done, and the other half is only a one-line change in the new
compilers; I had started on 1. when I noticed this.

Note that 0. and 1. break nothing.  No one uses (void *) in 4BSD
yet, for the simple reason that the compiler bombs if you do.  (Try
it.  It thinks variables of type (void *) are undefined.)  The
missing functions are generally simple and convenient.
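
To give a feel for how small these functions are, here is a rough
sketch of an overlap-safe copy in the spirit of memmove.  It is not
the 4BSD source; the name copy_overlapping is made up for this
example, and it is written in ANSI-style C (prototypes, const,
size_t), which the compilers discussed here would not accept.  It
also assumes that pointers into the same buffer can be compared with
`<', which holds on flat-address machines.

```c
#include <stddef.h>

/*
 * copy_overlapping is a hypothetical name, chosen so as not to
 * collide with the library's memmove.  It copies n bytes from src
 * to dst and gives the right answer even when the regions overlap.
 */
void *copy_overlapping(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (d < s) {
        /* dst is below src: copy forward, low addresses first */
        while (n-- > 0)
            *d++ = *s++;
    } else if (d > s) {
        /* dst is above src: copy backward, high addresses first */
        d += n;
        s += n;
        while (n-- > 0)
            *--d = *--s;
    }
    /* d == s: source and destination coincide, nothing to copy */
    return dst;
}
```

The direction test is the whole trick: copying forward when dst lies
above src would overwrite source bytes before they had been read.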

Number 2 breaks code in such a way that it still works.  Declarations
for malloc and others would appear only in <stdlib.h> and in the lint
libraries.  Since the existing compilers internally make all pointers
the same, calls to a (char*) malloc work the same as those to a (void*)
malloc; while lint would yell, old code would continue to compile
and run.  (As a bonus, `int *p = malloc((unsigned)10*sizeof(*p));'
would no longer have to be written `int *p = (int *)malloc...'.  With
#3, the (unsigned) cast can be discarded as well.)
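
A minimal sketch of what calling code looks like once both 2. and 3.
are in place, assuming a compiler with a (void *) malloc declared in
<stdlib.h> and an unsigned sizeof.  The helper name make_ints is
invented for illustration.

```c
#include <stdlib.h>

/* make_ints is a hypothetical helper: allocate n ints, each set to fill. */
int *make_ints(size_t n, int fill)
{
    /* (void *) converts to (int *) implicitly: no cast on the result, */
    /* and sizeof is already unsigned: no (unsigned) cast on the size. */
    int *p = malloc(n * sizeof(*p));
    size_t i;

    if (p == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        p[i] = fill;
    return p;
}
```

Writing sizeof(*p) rather than sizeof(int) keeps the allocation
correct if the type of p is later changed.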

Number 3 breaks code; there is no doubt as to that.  It probably does
not break a great deal of code, and that sizeof() yields a signed
integer has been a bug all along.  (Consider that on a PDP-11 with
split I&D, the data space size is 65,534 bytes [there is a shim at data
address 0], and that this only fits in an unsigned int.)  But it does
break code.
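
One way such breakage can show up is in comparisons against a
possibly negative int.  The sketch below assumes a compiler where
sizeof already yields an unsigned type, and emulates the old signed
behaviour with an explicit (int) cast; the function names are
invented for illustration.

```c
#include <stddef.h>

/* Old behaviour, emulated with a cast: both operands are plain ints. */
int fits_signed(int n)
{
    char buf[16];

    return n < (int)sizeof(buf);
}

/*
 * New behaviour: sizeof(buf) is unsigned, so n is converted to an
 * unsigned type before the comparison.  A negative n becomes a very
 * large positive number and the test fails.
 */
int fits_unsigned(int n)
{
    char buf[16];

    return n < sizeof(buf);
}
```

With n == -1 (say, an error return from read) the first function
answers 1 and the second answers 0, though the source text differs
only by the cast.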

Number 4 breaks more code, and is ugly to boot.  I think this is a
terrible rule, but I doubt it will change before the dpANS becomes
a true ANS, so we will be stuck with it eventually.  (`noalias',
on the other hand, may well be struck down.  I am betting that we,
the forces of good, shall prevail against this abomination.  What?
Oh, yes, I shall write to X3J11 ... soon :-) .  Const and volatile
are likely to stay, but to do them right requires more compiler
work than I want to do.)

Before anyone asks, jamming prototypes into PCC is out.

(Another possibility is to have options for some or all of these.
This is more work and is thus less likely to get done.)

Anyway, if you want to vote for or against any or all of the
things listed above, send me mail.  Again, I make no promises
beyond collecting and summarising replies.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

amos@taux01.UUCP (Amos Shapir) (04/03/88)

The bugs that men do live on after them...

This is an interesting suggestion, since I did have the chance to make such
changes, and one of them - signed sizeof - is my own idea.

Because of #4 in Chris' article (persistence of unsigned-ness), when I was
converting PCC to the tahoe at CCI, I made the decision to do the reverse of
#3 (signed sizeof); originally sizeof was unsigned. The chain of events leading
to that decision is rather interesting, and may teach us something (though
I'm not sure what):

- The first version of the machine had no FPU;
- When the FPU was put in, it was discovered that there was no more
 room in the microcode storage, and some integer instructions had to go;
- The least used instructions were unsigned mod (and div? I forget);
- Replacing them required using the 'ediv' instruction, which was not
 used or tested much till then;
- When ediv was brought into frequent use, a bug was discovered in its
 interaction with the page-fault mechanism;
- Since there was no time to fix it, a work-around was used, which required
 putting all operands in registers;
- Complicated expressions involving unsigned terms ran out of
 registers and caused 'simplify expression' messages;
- Many such expressions were generated by sizeof, which caused conversion to
 unsigned of the whole expression, even if all other terms were signed;
- Since the machine architecture itself limits sizes of objects to 1Gb,
 there was no reason not to change sizeof to int.
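
The step about sizeof dragging a whole expression into unsigned
arithmetic can be sketched as follows.  This is an illustration on a
compiler with an unsigned sizeof, not the tahoe PCC itself; the
(int) cast stands in for a signed sizeof, and the function names are
made up.

```c
#include <stddef.h>

/* Signed sizeof, emulated with a cast: an ordinary signed divide. */
int records_signed(int nbytes)
{
    char rec[4];

    return nbytes / (int)sizeof(rec);
}

/*
 * Unsigned sizeof: nbytes is converted to unsigned first, so the
 * compiler must emit an unsigned divide even though the only other
 * term in the expression is a signed int.
 */
size_t records_unsigned(int nbytes)
{
    char rec[4];

    return nbytes / sizeof(rec);
}
```

For non-negative nbytes the two agree; the difference lies in which
divide instruction is generated, and in the result when nbytes is
negative.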

About (void *): I totally agree; I would have put it in myself had it been
standard then (1982).

-- 
	Amos Shapir			(My other cpu is a NS32532)
National Semiconductor (Israel)
6 Maskit st. P.O.B. 3007, Herzlia 46104, Israel  Tel. +972 52 522261
amos%taux01@nsc.com  34 48 E / 32 10 N