[net.unix-wizards] arguments in registers

chris@umcp-cs.UUCP (04/27/86)

In article <155@cbmvax.cbmvax.cbm.UUCP> grr@cbmvax.UUCP (George Robbins) writes:
>[Passing arguments in registers] would be fine if C had nested
>procedures or inlines or something, but a disaster otherwise.

In fact, a compiler is free to optimise any function into inline code,
so long as it provides a `regular' version of those that can be called
from external modules.  For example, given the following as a complete
C module, the compiler can eliminate the routine `local1' completely,
but not `local2' nor `global':

	static int
	local1()
	{
		static int count;

		return (++count);
	}

	static void
	local2()
	{
		void ext0(), ext1();

		switch (local1()) {
		case 0:
			ext0();
			break;

		case 1:
			ext1();
			break;
		}
	}

	void
	global(pfv)
		void (**pfv)();
	{

		*pfv = local2;
	}

[Aside to Joe Yao:  Hello!  I just used `M' in vi again.]

Here the routine `local1' is not accessible outside this module,
so a compiler may elide it completely and replace the call in
`local2' with a direct reference to an unnamed static variable.
However, `local2' is externally accessible, since global() (which
is itself reachable) sets the supplied pointer to point at local2.
(A really clever compiler will discover unreachable local functions
and remove them, which may reveal more unreachable locals.)
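
To make that concrete, here is roughly what the module amounts to once
`local1' has been elided, written back out as C.  This is a hand-made
sketch, not the output of any real compiler: the name `hidden_count' is
invented (a real compiler would leave the variable unnamed), the sketch
uses prototypes so that it compiles alone, and the bodies given for
ext0() and ext1() are stand-ins for routines that really live in some
other module.

```c
/* Hypothetical stand-ins for the external routines, so this compiles alone. */
int ext0_calls, ext1_calls;

void ext0(void) { ext0_calls++; }
void ext1(void) { ext1_calls++; }

/* What used to be `count' inside local1(); the name is invented here. */
static int hidden_count;

static void
local2(void)
{
	/* was: switch (local1()) */
	switch (++hidden_count) {
	case 0:
		ext0();
		break;

	case 1:
		ext1();
		break;
	}
}

/* Still needed in full: this is how local2 escapes from the module. */
void
global(void (**pfv)(void))
{

	*pfv = local2;
}
```

Since the counter starts at zero and is incremented before it is tested,
the first call through the pointer reaches ext1(); later calls match no
case.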

One can also come up with examples wherein a good compiler might
provide a function externally yet also expand calls to it in line
within that module:

	int
	do_the_obvious(a, b)
		int a, b;
	{

		return (a > b ? a : b);
	}

	void
	something()
	{
		register int *p, *q;
		...
		otherproc(do_the_obvious(*p++, *q++));
		...
	}

There is no reason a compiler cannot pretend the call to
`do_the_obvious' was a built-in `max' function, and generate
something like this Vax code:

	_do_the_obvious: .globl _do_the_obvious
		.word	0		# only scratch regs used
		movq	4(ap),r0	# get a and b into r0 and r1
					# assume return a
		cmpl	r0,r1		# if a > b, skip this:
		bgtr	0f
		movl	r1,r0		# return b
	0:	ret

	_something: .globl _something
		...
		movl	(r11)+,r0	# get *p++
		movl	(r10)+,r1	# get *q++
		cmpl	r0,r1		# if the value from *p is greater
		bgtr	0f		# than that from *q, skip this:
		movl	r1,r0		# move the from-*q value to r0
	0:	pushl	r0		# result onto stack
		calls	$1,_otherproc	# go run otherproc
		...
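
In C terms, the effect of the code above is as if the call had been
rewritten by hand.  The following is only a sketch under some
assumptions: `p' and `q' are made parameters so the routines can be run
alone, the names something_call() and something_inline() are invented,
and otherproc() is a stand-in that merely records its argument.

```c
/* Hypothetical stand-in for otherproc(): just records its argument. */
int last_arg;

void otherproc(int v) { last_arg = v; }

int
do_the_obvious(int a, int b)
{

	return (a > b ? a : b);
}

/* The call as written in the source. */
void
something_call(int *p, int *q)
{

	otherproc(do_the_obvious(*p++, *q++));
}

/* The call as expanded in line; r0 and r1 mirror the Vax registers. */
void
something_inline(int *p, int *q)
{
	register int r0 = *p++;		/* get *p++ */
	register int r1 = *q++;		/* get *q++ */

	if (!(r0 > r1))			/* if a > b, skip this: */
		r0 = r1;
	otherproc(r0);
}
```

Unlike the usual `max' macro, which would evaluate one of its arguments
twice, the expansion evaluates *p++ and *q++ exactly once each, just as
the real call does.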

Incidentally, I have not seen any optimising C compilers myself;
are there any available that would have done what I did above?
(Just curious.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

kwh@bentley.UUCP (KW Heuer) (04/30/86)

In article <1205@umcp-cs.UUCP> umcp-cs!chris (Chris Torek) writes:
>Incidentally, I have not seen any optimising C compilers myself;
>are there any available that would [inline expand]?

Yes.  I once got "caught" trying to time a call to nullf().

Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint

johnl@ima.UUCP (John R. Levine) (05/06/86)

In article <1205@umcp-cs.UUCP> chris@maryland.UUCP (Chris Torek) writes:
>In article <155@cbmvax.cbm.UUCP> grr@cbmvax.UUCP (George Robbins) writes:
>>[Passing arguments in registers] would be fine if C had nested
>>procedures or inlines or something, but a disaster otherwise.
>
>In fact, a compiler is free to optimise any function into inline code,

Actually, you can always pass arguments in registers if you're smart about it.
The compiler for the IBM RT/PC does.  (A clever idea added after I stopped
working on it.)  The first few arguments to a procedure are always passed in
registers, but space is left for them in the stack frame.  If they aren't
declared register in the routine, the routine's prolog saves them.  Note that
this saves code space, since you have one set of store instructions in the
routine's prolog rather than replicating the code at each call.  If the
arguments are declared register, well, they're already in registers.
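
A toy model of that convention in C, for illustration only.  This is not
RT/PC code: the array `argreg' stands in for the argument registers,
`struct frame' for the space left in the stack frame, and NARGREGS (the
number of register arguments) is an assumed value.  All names here are
invented.

```c
#define NARGREGS 4		/* assumed; "the first few" in the text */

struct frame {
	int home[NARGREGS];	/* space the caller leaves for the arguments */
};

static int argreg[NARGREGS];	/* models the argument registers */
static int stores;		/* memory stores executed, for illustration */

/* Callee whose parameters are NOT declared register: the prolog saves them. */
int
sum_in_memory(struct frame *fp)
{
	int i;

	for (i = 0; i < 2; i++) {	/* prolog: the only copy of the stores */
		fp->home[i] = argreg[i];
		stores++;
	}
	return (fp->home[0] + fp->home[1]);
}

/* Callee whose parameters ARE declared register: nothing to save. */
int
sum_in_regs(struct frame *fp)
{

	(void)fp;			/* home slots exist but go unused */
	return (argreg[0] + argreg[1]);
}

/* The caller does the same thing either way: load registers, leave space. */
int
call2(int (*f)(struct frame *), int a, int b)
{
	struct frame fr;

	argreg[0] = a;
	argreg[1] = b;
	return (f(&fr));
}
```

The caller never stores anything itself; whether any stores happen at
all is decided once, in the callee.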

As far as passing args to system calls in registers goes, the big win there
is that the kernel's job of validating the arguments is made easier.  If the
args are in memory, the kernel has to make sure the address is valid, go
through some address mapping calculations, possibly take page faults, and so
forth.  It's much easier if the user program puts the args in registers, since
then the validation is done for free by hardware.
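
A small sketch of the difference, as a model rather than real kernel
code.  The user address space is modelled as an array of words, a "user
address" is an index into it, and all names (user_mem, saved_regs, the
two fetch routines) are invented.  Address mapping and page faults are
not modelled at all; only the range check is shown.

```c
#define USER_WORDS 256

static int user_mem[USER_WORDS];	/* models the user address space */

struct saved_regs {
	int r[4];			/* registers saved at trap time */
};

/* Args in memory: the kernel must validate the address before fetching. */
int
fetch_args_from_memory(unsigned long uaddr, int nargs, int *out)
{
	int i;

	if (nargs < 0 || uaddr >= USER_WORDS ||
	    (unsigned long)nargs > USER_WORDS - uaddr)
		return (-1);		/* bad address: fail the call */
	for (i = 0; i < nargs; i++)
		out[i] = user_mem[uaddr + i];
	return (0);
}

/* Args in registers: the trap already saved them; nothing to validate. */
void
fetch_args_from_regs(const struct saved_regs *sr, int nargs, int *out)
{
	int i;

	for (i = 0; i < nargs && i < 4; i++)
		out[i] = sr->r[i];
}
```

This covers only fetching the argument words themselves.  An argument
that is itself a pointer (a buffer address, say) still has to be checked
before the kernel uses it, however it was passed.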
-- 
John R. Levine, {ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl
Levine@YALE.EDU

The opinions expressed above are those of a 12-year-old hacker who has
broken into my account and not those of any person or organization.

davidsen@steinmetz.UUCP (Davidsen) (05/14/86)

In article <109@ima.UUCP> johnl@ima.UUCP (John R. Levine) writes:
>In article <1205@umcp-cs.UUCP> chris@maryland.UUCP (Chris Torek) writes:
>>In article <155@cbmvax.cbm.UUCP> grr@cbmvax.UUCP (George Robbins) writes:
>>>[Passing arguments in registers] would be fine if C had nested
>>>procedures or inlines or something, but a disaster otherwise.
>>

The original "B" compiler for the GECOS operating system passed the first
two arguments in registers and it worked fine. There are no inherent
limitations, and it makes the code run much faster.

Someone doing software metrics told me that 90% of the procedure calls in
UNIX source code use fewer than three arguments, and a check of about 50k
lines of code written by people at our site suggests that's true, even more
so if you eliminate printf, which hopefully isn't called as often as the
internal routines.

There are three possible outcomes when the first N arguments are passed in
registers:
 a) the values are never stored in memory and the procedure runs faster.
 b) the values are stored in memory when needed to preserve them during a
procedure call, or when the registers are required for inline code.
 c) the compiler just stores the value in memory (stack of course) and
goes on as if the calling program had done it.

Note that in the worst case, the code to save the values has been moved
from the calling routines to the called routine, making the code smaller,
if not faster.
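
A back-of-the-envelope version of that code size argument, as a sketch.
It assumes each argument costs one store instruction wherever it gets
saved, which is an assumption made for illustration; real counts depend
on the machine.  The function names are invented.

```c
/* Store instructions in the program text if every caller saves the args. */
int
stores_in_callers(int call_sites, int nargs)
{

	return (call_sites * nargs);
}

/* Store instructions in the program text if the called routine saves them. */
int
stores_in_callee(int call_sites, int nargs)
{

	(void)call_sites;		/* one prolog, however many callers */
	return (nargs);
}
```

For a two-argument routine called from 10 places, that is 20 store
instructions with caller-side stores against 2 with a single prolog in
the called routine.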

In 1972 I produced a language called IMP from the original B, and it ran
on the GE 600 series and the Intel 8080 (and could cross compile in either
direction). If it can be done on an 8080, it can be done anywhere. The
code was smaller and faster than that produced by any C compiler I've ever
seen for the 8080, due in part to the frequent occurrence of case (a)
above, where the values were never stored.
-- 
	-bill davidsen

	seismo!rochester!steinmetz!--\
       /                               \
ihnp4!              unirot ------------->---> crdos1!davidsen
       \                               /
        chinet! ---------------------/        (davidsen@ge-crd.ARPA)

"Stupidity, like virtue, is its own reward"