[net.lang.c] more on C optimization

chris@eneevax.UUCP (03/09/84)

Just for contrast, here's a place where the (4.1BSD) C optimizer did
something really fancy.

---part of comm.h---
/* Input and output translation is done via optional translation tables and
   bits to set and clear.  This allows any combination of input and output
   parity (or lack thereof). */
struct Translate {
    char *tr_tab;		/* Translation table (if any) */
    int   tr_bic;		/* Bits to clear */
    int   tr_bis;		/* Bits to set */
};
[...]
struct Translate InTr;		/* Input (link to tty) translation */
[...]
/* Apply the translation given by tp to c */
#define ApplyTranslation(tp, c)		\
    if ((tp)->tr_tab)			\
	c = (tp)->tr_tab[(c) & 0177];	\
    c &= ~(tp)->tr_bic;			\
    c |=  (tp)->tr_bis;

---part of proca.c---
	register int    c;
	[...]
		c = 7;		/* Ding-a-ling */
		ApplyTranslation (&InTr, c);

---The corresponding assembly code (edited slightly for readability)---
	movl	$7,r11
	tstl	_InTr
	jeql	L74
	movl	_InTr,r0
	extzv	$0,r11,r11,r1	# the interesting one
L2000041:
	cvtbl	(r0)[r1],r11
L74:
	bicl2	_InTr+4,r11
	bisl2	_InTr+8,r11

---------------------------
Apparently an "extzv $0,$7,x,y" instruction is faster than a
"bicl3 $-128,x,y".  Well, instead of generating the constant "$7"
for the size operand of the extzv, c2 noticed that r11 already had
7 in it and just used r11!  At first I thought it was a bug!
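
For anyone who wants to check that the two forms really compute the
same thing, here is a rough C model.  The functions extzv(), bicl3()
and check_masks() are made-up names for illustration, and the extzv
model is simplified: it only covers a register operand with
0 < size < 32 and pos + size <= 32.  It is a sketch of the arithmetic,
not a description of what c2 does internally.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of "extzv pos,size,base,dst" (assumption: register
   operand, 0 < size < 32, pos + size <= 32).  Extracts `size` bits
   starting at bit `pos` and zero-extends the result. */
uint32_t extzv(unsigned pos, unsigned size, uint32_t base)
{
    return (base >> pos) & ((UINT32_C(1) << size) - 1);
}

/* Model of "bicl3 mask,src,dst": dst = src with the mask bits cleared. */
uint32_t bicl3(uint32_t mask, uint32_t src)
{
    return src & ~mask;
}

/* Asserts that both instruction forms agree with the C expression
   (x & 0177) for every x below `limit`. */
void check_masks(uint32_t limit)
{
    uint32_t x;

    for (x = 0; x < limit; x++) {
        assert(extzv(0, 7, x) == bicl3((uint32_t)-128, x));
        assert(extzv(0, 7, x) == (x & 0177));
    }
}
```

With r11 holding 7, extzv(0, r11, r11) in this model is extzv(0, 7, 7),
which is why using the register as the size operand gives the same
answer as the literal "$7".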

Admittedly the code might be better if the "tstl" and "movl" were
collapsed into a single "movl" which also sets the condition codes.
Oh well, I guess that's asking too much of c2, what with L74:.
-- 
Chris Torek, Dept of CS, University of Maryland, College Park, MD
...!umcp-cs!chris   chris%umcp-cs@CSNet-Relay

chris@umcp-cs.UUCP (03/15/84)

All right, how come no one has replied yet???  I posted something
a while ago about something /lib/c2 did that was "neat", namely,
change

	movl	$7,r11
	...
	bicl3	$-128,r11,r1

into

	movl	$7,r11
	...
	extzv	$0,r11,r11,r1	# look closely now

(instead of "extzv $0,$7,r11,r1").  If you just think for a minute,
the "extzv" is unnecessary in the first place!  r11 is known to hold
7, and the low seven bits of 7 are just 7, so r1 could be loaded with
the constant directly.  If c2 does limited flow analysis to watch
over register contents, how come it doesn't eliminate instructions
that are effectively no-ops?  (That's a rhetorical question, by the
way.)
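
The test such an optimizer would need is small.  A sketch in C, with
the caveat that mask_is_noop() and translation_index() are names I
made up for illustration and are not anything c2 is known to contain:

```c
#include <stdint.h>

/* Hypothetical check: if a register is known to hold `known_value`,
   masking it with `keep_mask` changes nothing exactly when no bit
   outside the mask is set. */
int mask_is_noop(uint32_t known_value, uint32_t keep_mask)
{
    return (known_value & keep_mask) == known_value;
}

/* The table index ApplyTranslation computes for a character c. */
uint32_t translation_index(uint32_t c)
{
    return c & 0177;
}
```

For c = 7, mask_is_noop(7, 0177) is true and translation_index(7) is
7, so the masking instruction could be dropped and the index treated
as a constant.  For a value such as 0200 the check fails and the mask
has to stay.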
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris.umcp-cs@CSNet-Relay