[net.unix-wizards] 16032 calling sequences

thomson@utcsrgv.UUCP (Brian Thomson) (06/09/83)

		    CAST YOUR VOTE TODAY!

You have the opportunity to influence the future of an NS16032 C compiler
developed here at the University of Toronto.  We have not been able to
decide between National's favoured cxp calling sequence and the faster
jsr/bsr.

Our compiler and optimizer are now complete.  Because of our vacillation,
we have parameterized them to generate either calling sequence.  Now it
is time to install libraries and we have to bite the bullet.

We have made some measurements.  

On a sample of 7 common utilities (pstat, ed, tar, ...) we find the
text size of programs using jsr is 5% greater than the optimizing 4.1bsd
VAX compiler produces.  When we recompile using cxp, the cumulative
text size is 5% less than that on the VAX.
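
[Worked arithmetic: both figures above are stated relative to the VAX, so
the direct cxp-versus-jsr difference has to be derived. The sketch below
assumes both percentages apply to the cumulative text size of the same set
of programs; the variable names are illustrative only.]

```python
# Text sizes relative to the VAX compiler's output (VAX = 1.00),
# taken from the figures quoted in the post.
jsr_vs_vax = 1.05   # jsr text is 5% larger than the VAX text
cxp_vs_vax = 0.95   # cxp text is 5% smaller than the VAX text

# Assumption: both percentages refer to the same cumulative total,
# so the two ratios can be divided directly.
cxp_vs_jsr = cxp_vs_vax / jsr_vs_vax

size_saving_pct = (1 - cxp_vs_jsr) * 100              # cxp saving relative to jsr
jsr_growth_pct = (jsr_vs_vax / cxp_vs_vax - 1) * 100  # jsr growth relative to cxp

print(f"cxp text is about {size_saving_pct:.1f}% smaller than jsr text")
print(f"jsr text is about {jsr_growth_pct:.1f}% larger than cxp text")
```

[Under that assumption, choosing cxp saves roughly a tenth of the text
size compared with jsr.]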

We then ran a speed test using the 'dc' calculator utility on
a 5MHz 16032 processor.  One test calculated 10 ** 1000, and there
was no significant speed difference.  A second test, involving no
multiplying but lots of loops and string executions, showed an 8%
speed advantage for jsr.

What do you think, and why?  The issues are:

 1) Speed vs. size, i.e. a technical decision
 
 2) Compatibility.  We understand that National-sponsored
    C implementations are constrained to use cxp, while
    MIT's effort uses jsr.

 3) Whether 1) is more important than 2).

We would be particularly interested in responses from other implementors,
especially if you can tell us the reasons for your choice.

				    Brian Thomson,  utcsrgv!thomson
				    David Galloway, utcsrgv!drg
				    CSRG Univ. of Toronto

jbray%bbn-unix@sri-unix.UUCP (06/10/83)

From:  James Bray <jbray@bbn-unix>

In response to your quandary regarding jsr vs. cxp in a 16032-targeted
compiler, I would like to raise the following points. First, I should note
that I don't know the 16000-series, but hear that they are good chips.
  In a general sense,  it seems there are often situations imposed by a
particular architecture in which one must choose between space and speed:
for example, because of constraints in the architecture of the underlying
micromachine, the sis (subtract-immediate-short) macroinstruction on the
Perkin-Elmer 3200-series machines is one machine-cycle (200ns on most of them)
longer than an equivalent shi (subtract-halfword-immediate) instruction. The
sis accepts nibble operands, and is 16 bits long. The shi accepts halfword
operands, and is 32 bits long. There were other instances in which one could
trade a halfword for a cycle or two. One wonders if the -O flag should have
an s/t argument to indicate space vs time... But this wouldn't work for a
calling sequence, which must be standard -- which brings me to the second
point: perhaps NS will reimplement the cxp instruction in later versions to
speed it up; if you have contacts with them, you might try to find out
whether that is possible, or whether it is already up against the limits of
the microarchitecture (I am assuming it is microprogrammed).
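
[Illustrative sketch of the space-vs-time flag idea mentioned above, using
the sis/shi pair as the example. This is not any real compiler's code: the
function name, the goal strings, and the assumption that a nibble operand
means an unsigned value from 0 to 15 are all hypothetical.]

```python
def pick_subtract(imm, goal):
    """Choose between the short and long subtract-immediate forms.

    goal is "space" or "time", standing in for the hypothetical
    s/t argument to -O.
    """
    if goal not in ("space", "time"):
        raise ValueError("goal must be 'space' or 'time'")

    # Assumption: a nibble operand is an unsigned 4-bit value (0..15).
    fits_nibble = 0 <= imm <= 15

    if goal == "space" and fits_nibble:
        return "sis"   # 16 bits long, but one machine cycle slower
    return "shi"       # 32 bits long, one machine cycle faster


print(pick_subtract(3, "space"))    # short form when optimizing for space
print(pick_subtract(3, "time"))     # long form when optimizing for time
print(pick_subtract(100, "space"))  # operand too large for a nibble
```

[A per-instruction choice like this can vary freely from one compilation
to the next, which is exactly what a calling sequence cannot do.]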
  Good luck. It sounds like a good chip, so it deserves a good compiler.

--Jim Bray @ BBN-UNIX

mann%Shasta%su-score@sri-unix.UUCP (06/10/83)

From:  Tim Mann <mann%Shasta@su-score>

One of the other people in the research group I work in here
at Stanford is currently porting a message-based operating
system kernel written in C (a Thoth descendant) from the 68000
to a Vax 750.  Sending a message takes 25% longer on the Vax
than on an 8 MHz 68000 (and our newer 68000 machines run at 10
MHz).  The main reason for the slowdown is the slow function
call instruction on the Vax.  The Vax can add faster than the
68000, but context switching is a little slower, and function
calls are much slower.  It sounds like your situation with the
16032 is similar to ours with the Vax, except that in your 
case you still have a choice, while we are more or less stuck
with the existing Berkeley C compiler.

I think in general, with modern processors that don't have
address space limitations, speed is much more important than
size.  Memory is cheap and getting cheaper.  Unfortunately
(depending on your goals in developing this compiler),
compatibility with the rest of the world may be more important
than either.  I suggest, as long as you aren't developing
the compiler for commercial use, that it would be best to 
leave the option in the compiler to use cxp, but make jsr the
default and compile all your standard libraries with it.  If
you ever sell or give the compiler to anyone else, they can
reset the option if it's important to them, or you can reset
it if you ever need to interface with other languages that use
cxp in their compilers.

	--Tim

glenn@nsc.UUCP (06/15/83)

Discussion of NS16000 calling sequences really belongs in net.micro.16k,
especially given recent complaints about net.unix-wizards overcrowding.

I have posted a followup to that newsgroup.

		-- Glenn Skinner