[comp.arch] Stack Architectures Can Have Registers

crowl@cs.rochester.edu (Lawrence Crowl) (05/27/88)

I have noticed a pervasive, but unwarranted, assumption regarding stack
architectures.  Namely, that the push and pop operations are to memory.  This
need not be the case.  The critical feature of a stack architecture is the
expression evaluation mechanism, not the lack of registers.  So, the key issue
is "0 operand" instructions versus "2 or 3 operand" instructions.  When
comparing a register-to-register architecture with a stack architecture, one
must allow the stack architecture to have registers.  

This quells two objections to stack architectures.  The first objection is that
stack architectures must go to memory for local variables while register-to-
register architectures need not.  A stack architecture with registers will have
the same fast storage.  The second objection is that common subexpressions are
more difficult.  This is only marginally true.  A simple "copy top of stack to
register" instruction will succeed in saving the results of the subexpression.
The identification of common subexpressions is architecture independent.
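
As a minimal sketch of the point above, and not a model of any real machine,
here is a toy interpreter for a 0-operand stack machine whose operands live in
registers.  The opcode names (pushr, copyr, add, mul) and the register names
r0-r2 are hypothetical, invented for illustration.  It evaluates
(a + b) * (a + b), using "copy top of stack to register" to save the common
subexpression instead of recomputing it.

```python
def run(program, regs):
    """Interpret a toy 0-operand stack machine with a register file.

    Opcode and register names are made up for this sketch.
    """
    stack = []
    regs = dict(regs)              # work on a copy of the register file
    for op, *arg in program:
        if op == "pushr":          # push a register's value onto the stack
            stack.append(regs[arg[0]])
        elif op == "copyr":        # copy top of stack to a register, no pop
            regs[arg[0]] = stack[-1]
        elif op == "add":          # 0-operand: both inputs come off the stack
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        else:
            raise ValueError("unknown opcode: " + op)
    return stack, regs


# (a + b) * (a + b) with a in r0 and b in r1; a + b is computed once.
program = [
    ("pushr", "r0"),
    ("pushr", "r1"),
    ("add",),
    ("copyr", "r2"),   # save the common subexpression in r2
    ("pushr", "r2"),   # reuse it as the second multiplicand
    ("mul",),
]
stack, regs = run(program, {"r0": 3, "r1": 4, "r2": 0})
print(stack[-1], regs["r2"])
```

No instruction in the program touches memory for a local variable, and the
decision to reuse a + b was made before any instruction format was chosen.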

-- 
  Lawrence Crowl		716-275-9499	University of Rochester
		      crowl@cs.rochester.edu	Computer Science Department
...!{allegra,decvax,rutgers}!rochester!crowl	Rochester, New York,  14627

barmar@think.COM (Barry Margolin) (05/27/88)

In article <10076@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes:
>This quells two objections to stack architectures.  The first objection is that
>stack architectures must go to memory for local variables while register-to-
>register architectures need not.

The machine I use, a Symbolics 36xx, solves this problem without
adding user-visible registers.  Instead, the top N words of the stack
are stored in high-speed cache memory.  I don't know what N is, but I
think it is several hundred.  This provides the speed of registers
without complicating the rest of the architecture.  I think many
stack-based architectures use a similar scheme.
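
To make the idea concrete, here is a rough sketch of a stack whose top N words
sit in fast storage and whose older words spill to slower memory.  This is an
assumption-laden illustration, not a description of how the 36xx actually
manages its stack cache: the class name CachedStack, the one-word-at-a-time
spill and refill policy, and the tiny N used below are all hypothetical.

```python
class CachedStack:
    """Stack with the top n words in fast storage (illustrative only)."""

    def __init__(self, n):
        self.n = n
        self.fast = []             # top of stack, at most n words
        self.memory = []           # older words, spilled to slow memory
        self.memory_accesses = 0   # count of slow-memory reads and writes

    def push(self, value):
        if len(self.fast) == self.n:
            # Fast storage is full: spill its oldest word to memory.
            self.memory.append(self.fast.pop(0))
            self.memory_accesses += 1
        self.fast.append(value)

    def pop(self):
        if not self.fast:
            if not self.memory:
                raise IndexError("pop from empty stack")
            # Fast storage is empty: refill one word from memory.
            self.fast.append(self.memory.pop())
            self.memory_accesses += 1
        return self.fast.pop()


s = CachedStack(4)                 # N = 4 only to keep the example small
for word in range(6):
    s.push(word)
spills = s.memory_accesses         # only the two oldest words were spilled
popped = [s.pop() for _ in range(6)]
print(spills, popped, s.memory_accesses)
```

As long as the working depth stays within N, pushes and pops never reach slow
memory, which is the sense in which the cache gives register-like speed.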

This is similar to the PDP-10's registers, which could also be
accessed as the first 16 words of memory.  In fact, since the PC could
point to these locations, it was possible to run a program in a PDP-10
with no memory installed!

Barry Margolin
Thinking Machines Corp.

barmar@think.com
uunet!think!barmar

ok@quintus.UUCP (Richard A. O'Keefe) (05/27/88)

In article <10076@sol.ARPA>, crowl@cs.rochester.edu (Lawrence Crowl) writes:
> I have noticed a pervasive, but unwarranted, assumption regarding stack
> architectures.  Namely, that the push and pop operations are to memory.  This
> need not be the case.  The critical feature of a stack architecture is the
> expression evaluation mechanism, not the lack of registers.  So, the key issue
> is "0 operand" instructions versus "2 or 3 operand" instructions.

An example of this is the 80387 (maybe the 8087 is too, I just don't
happen to have a manual for it).

Are the bandwidth considerations between a CPU and a coprocessor such as
the 80387 significantly different from code density considerations?
That is:  were the 0-operand instructions put in the 80*87 to reduce
coprocessor interface time or for some other reason?

hjm@cernvax.UUCP (hjm) (05/31/88)

Stack architectures can, and indeed do, have registers.  Take a look at the
INMOS transputer, for example.  This very nice chip has a three deep stack of
registers, and the T800 has 4 Kbytes of internal RAM which can be considered to
be 1024 32-bit registers or 4KB of code space, or any mixture of the two (it's
really a sort of software controllable cache).  Stack pushes and pops can be
performed either to external or to internal memory, and the internal ones take
only one cycle at 20 or 30 MHz, whether you're loading a local constant or a
4-bit immediate.

As for the theory that stack-based machines have denser code, what do you
think of the Transputer's 8-bit instructions, which it picks up four at a time?
You can run a 20 MHz T800 with 100 ns access-time DRAMs if you try very hard,
which is not bad for a 10 MIPS machine.  It even has time to run 8 DMA engines
in parallel with the CPU at this speed.  And no extra wait states either!

Stack-based architectures aren't doomed; they are changing just as fast as the
rest of the world.

	Hubert Matthews

	(...blah, blah, my opinions, blah, blah, ...)