[net.unix-wizards] Help wanted with C/4.?BSD/VAX register short problem

ado@elsie.UUCP (Arthur David Olson) (11/30/85)

As documented on the 4.1BSD "cc" manual page, "The compiler currently ignores
advice to put char, unsigned char, short or unsigned short variables in
registers.  It previously produced poor, and in some cases incorrect, code for
such declarations."

Since the smart folks at Berkeley were unable to fix the problem, I figure
the best way to deal with it is a work-around.  The notion I've come up with
is to allow the "first pass" (which lives in the same executable as the
"second pass" on 4.1BSD systems) to put shorts (et al.) into registers.
Then, just before code is generated in the "second pass" (specifically, at the
beginning of "p2compile") you'd change nodes whose "op"s were "REG" and whose
"type"s were "short" (et al.) into "NAME" nodes--and would fill the "name" field
of the node with the string "r5" or "r6" or whatever.  The theory is that if
the compiler produces correct code for non-register shorts (et al.) then it
will produce correct code for register shorts provided that you make them
look like non-register shorts.

In testing out this approach, it would be a help to have samples of source code
that caused the compiler to produce "poor, and in some cases incorrect, code."
If you have such samples, I'd appreciate your mailing them to me.  Thanks.
--
UNIX is an AT&T Bell Laboratories trademark.
C is a Mel Blanc/Jack Benny trademark.
--
	UUCP: ..decvax!seismo!elsie!ado    ARPA: elsie!ado@seismo.ARPA
	DEC, VAX and Elsie are Digital Equipment and Borden trademarks

kre@munnari.OZ (Robert Elz) (11/30/85)

In article <5300@elsie.UUCP>, ado@elsie.UUCP (Arthur David Olson) writes:
> Since the smart folks at Berkeley were unable to fix the problem,

That's not quite right - Berkeley did "fix the problem" (which was that
the compiler produced bad code); they did it by making the perfectly legal
decision to ignore the "register" declaration for chars & shorts.

> The theory is that if
> the compiler produces correct code for non-register shorts (et al.) then it
> will produce correct code for register shorts provided that you make them
> look like non-register shorts.

Well, perhaps...  It's not quite that simple.  It's not terribly hard to
make the compiler produce correct code; what's hard is to make it produce
correct code that runs faster than simply leaving short & char variables
in memory (since if you use registers and get no real speed advantage,
then you are losing the possibility of putting other variables that
would give some benefit in a register).

What follows is a test for correctness of C compilers in this
area.  It is a shell archive: cut on the dotted line and extract it
with sh (not csh).  Then read the README file carefully.  Then
type "make".  Make takes less than a minute on an unloaded 780
running 4.2bsd.  When it finishes, you might know that your compiler
is broken.  If you don't know that, then you have wasted your time,
since nothing here will demonstrate that your compiler is not broken.

If your compiler implements "register" types, and doesn't fail this
test, then you should probably look at the assembler code produced,
and see if the code to process the register variable is any better
than the code produced to process the non register variable.

Note: This code is intended to run on any system with a C compiler
that supports printf.  However, this has not been tested and I
guarantee nothing.  The makefile will only work on systems with make
of course, but you should be able to guess what it is doing easily
enough, and emulate "make" by hand, or with some kind of command
script if your system has no "make".  If you don't have printf, then
the basic code should still work, you will just need to change the
way the diagnostics are produced.  If you have a compiler with a broken,
or no, preprocessor, then you may need to edit the source for each test.

Robert Elz	seismo!munnari!kre	kre%munnari.oz@seismo.css.gov

: ---------------------------------------- cut here

echo x - "README" 2>&1
sed "s/^X//" >"README" <<'!The!End!'
XThis is a simple test for one possible bug in implementations of C.
X
XSimply extract this shar archive (maybe you have done that already)
Xand type "make".  If all is OK, you will see several "Testing"
Xlines printed to stdout.  If there are problems, lines containing
Xthe word "Fails" will appear.
X
XNote: the existence of any "Fails" lines in the output implies
Xthat your implementation of C is deficient.  The converse is not
Xtrue.  Absence of "Fails" lines says nothing at all about anything.
XOn a cpu with 8 bit chars, and 16 bit shorts, absence of any
X"Fails" lines probably suggests that the compiler is OK in this
Xrespect.  Other cpu architectures may need the initialization of
Xthe "values" array extended.  You can put any numbers in it that
Xyou like (and which allow the source to compile).  There is no such
Xthing as "bad data" here: for any number that you can think of to put
Xin that array, the tests in the code should never fail.
X
XVERY IMPORTANT NOTE: Please, if you run this test, DO NOT report
Xyour findings to the net.  No-one (except you) cares if your compiler
Xworks or not.  If you have a broken compiler, complain to whoever
Xsupplied it.
X
XRobert Elz	seismo!munnari!kre  kre%munnari.oz@seismo.css.gov
!The!End!

echo x - "Makefile" 2>&1
sed "s/^X//" >"Makefile" <<'!The!End!'
X#
X# C compiler "register type" test
X#
X
XOPT= -O
X
Xtest:
X	@-for type in short char int unsigned long "unsigned short" \
X			"unsigned char" 		;\
X	do \
X		cc $(OPT) -o cctst -Dtype="$$type" cctst.c	;\
X		echo "Testing $$type"			;\
X		./cctst					;\
X	done
X
Xclean:
X	rm -f cctst core a.out *.o *junk* *[Ee]rr* *[Mm]ade*
!The!End!

echo x - "cctst.c" 2>&1
sed "s/^X//" >"cctst.c" <<'!The!End!'
X/*
X * The numbers in this array are test cases that will (hopefully)
X * find faults on processors with 8 bit char variables, and 16 bit
X * short variables.  Other processor types should add new values.
X * Any number that fits is OK; if you can find *any* value that
X * fails, you have a broken compiler.
X */
Xlong values[] = {
X	0,
X	-1,
X	(1 << 15),
X	(1 << 16) - 1,
X	(1 << 15) + 1,
X	(1 << 15) - 1,
X	~0,
X	(1 << 7),
X	(1 << 8) - 1,
X	(1 << 7) + 1,
X	(1 << 7) - 1,
X	(1 << 31),
X	(1 << 31) + 1,
X	(1 << 31) - 1,
X	(1 << 32) - 1,
X};
X
X#ifndef type
X#define	type	short
X#endif
X
Xmain()
X{
X	register type rtype;
X	auto type atype;
X	register long *vp;
X
X	for (vp = values; vp < &values[sizeof values/sizeof values[0]]; vp++) {
X		rtype = *vp;
X		atype = *vp;
X
X		if (rtype != atype)
X			printf("Fails simple assignment for %d\n", *vp);
X
X		if (++rtype != ++atype)
X			printf("Fails ++ equality for %d\n", *vp);
X
X		if (rtype != atype)
X			printf("Fails incremented assignment for %d\n", *vp);
X
X		rtype = *vp;
X		atype = *vp;
X
X		if (--rtype != --atype)
X			printf("Fails -- equality for %d\n", *vp);
X
X		if (rtype != atype)
X			printf("Fails decremented assignment for %d\n", *vp);
X	}
X	exit(0);
X}
!The!End!
exit

ado@elsie.UUCP (Arthur David Olson) (12/01/85)

In article <1016@munnari.OZ>, kre@munnari.OZ (Robert Elz) writes:
> What follows is a test for correctness of C compilers in this
> area.

The posting is appreciated; however, it doesn't seem to show up whatever
problem it is that convinced the Berkeley folks to ignore the "register"
advice in "register short" (et al.) declarations.  This is the way the
Berkeley folks got rid of register shorts (code from "pcc/local.c";
its trade secret status precludes a more complete posting):

    #ifdef TRUST_REG_CHAR_AND_REG_SHORT
    	if( t==INT || t==UNSIGNED || t==LONG || t==ULONG	/* tbl */
    		|| t==CHAR || t==UCHAR || t==SHORT 		/* tbl */
    		|| t==USHORT || ISPTR(t)) return(1);		/* tbl */
    #else
    	if( t==INT || t==UNSIGNED || t==LONG || t==ULONG	/* wnj */
    		|| ISPTR(t)) return (1);			/* wnj */
    #endif

If I generate a C compiler from the distributed 4.1BSD sources with
TRUST_REG_CHAR_AND_REG_SHORT defined, the generated compiler passes the
"correctness test."  So the question remains:  why did the Berkeley folks
get rid of register shorts?
--
Short is a Randy Newman trademark.
--
	UUCP: ..decvax!seismo!elsie!ado    ARPA: elsie!ado@seismo.ARPA
	DEC, VAX and Elsie are Digital Equipment and Borden trademarks

tanner@ki4pv.UUCP (Tanner Andrews) (12/09/85)

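The test program passes a long (*vp) to "printf" with a "%d" format.  On
machines where int is narrower than long (most non-32-bit machines), the
"%d" in each "printf" should be changed to read "%ld", or (more
informatively) "%lx".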

-- 
<std dsclm, copies upon request>	   Tanner Andrews, KI4PV
uucp:					...!decvax!ucf-cs!ki4pv!tanner