[comp.arch] Processor Design

herndon@umn-cs.UUCP (Robert Herndon) (04/21/87)

  During the great flame wars many moons back over RISC vs. CISC,
most of the intelligent people on the net soon realized that the
question 'Is "reduced instruction set" better than "complex
instruction set"?' was comparable to "Can submarines swim?".
Those of us familiar with CDC 6600 assembly language sat back
and snickered, since that machine's design answered most of the
"RISC" questions correctly without getting sidetracked.  Since
then, Mashey and others have done a fair amount to enlighten
us readers with many interesting postings on the considerations
and trade-offs in processor/cache/memory design.
  The CDC-6600 was designed to run Fortran fast.  The machine
language reflects this, and the machine still, many years later,
runs Fortran impressively fast.  The simple load/store
architecture, simple set of registers (admittedly with the
unusual addressing registers) and three-address register
operations are hard to beat.  Unfortunately, character handling
was, as on many old processors, very strange, and I/O was bizarre.
  My first question is this:  have any processors since then
managed to incorporate:
   1) A register windowing scheme
      (the one big win from RISCs)
   2) Three-address register ops
      (a big win for optimal code generation)
   3) A simple load/store memory addressing scheme
   4) Byte & word addressing
   5) A 32 or 64 bit word size
together to create a nice FAST processor that's easy to generate
code for?
  The CDC machine used A registers to reference memory.  Storing
an address into an A register caused a read from or a write to
memory (depending on which register was stored to).  This was
kind of ugly for the assembly code because it made for side effects,
but a win because concurrent memory ops could be overlapped easily.
Cheaper processors would not do overlapped fetches; more expensive
ones would.  Instruction schedulers (peep-hole optimizers that
rearrange op-codes for maximum concurrency) existed to take
advantage of overlappable operations.
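
  For readers who have not met the A-register convention, here is
a toy sketch in Python.  It assumes the usual 6600 pairing: setting
A1 through A5 loads the matching X register from memory, setting A6
or A7 stores the matching X register to memory, and A0 has no memory
side effect.  The class and method names are invented for
illustration, and the model ignores timing and overlap entirely.

```python
class ARegisterModel:
    """Toy model of the A/X register coupling (not cycle-accurate)."""

    def __init__(self, memory_words=64):
        self.memory = [0] * memory_words
        self.A = [0] * 8  # address registers A0..A7
        self.X = [0] * 8  # operand registers X0..X7

    def set_a(self, i, address):
        """Write an address into Ai and apply the implied memory operation."""
        self.A[i] = address
        if 1 <= i <= 5:
            # A1..A5: setting the address loads Xi from memory
            self.X[i] = self.memory[address]
        elif i in (6, 7):
            # A6..A7: setting the address stores Xi to memory
            self.memory[address] = self.X[i]
        # A0: no memory side effect


m = ARegisterModel()
m.memory[10] = 42
m.set_a(1, 10)       # side effect: X1 <- memory[10]
m.X[6] = m.X[1] + 1  # ordinary register-to-register work
m.set_a(6, 11)       # side effect: memory[11] <- X6
```

  Note that no explicit load or store instruction appears anywhere;
the memory traffic is entirely a consequence of which A register
gets written, which is the side-effect kluge complained about here.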
  My second question is:  has anyone created a more elegant
mechanism for memory reference that:
  1) Supports register windowing & automatic memory spill
  2) Avoids the side-effect kluge of Cyber addressing
  3) Provides for overlapping memory references?
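
  To make item 1 concrete, here is a minimal Python sketch of a
register-window file that spills the oldest window to memory when
a call runs out of windows and refills it on the matching return.
This is a hypothetical design for illustration, not any particular
machine: the names are invented, the windows do not overlap for
parameter passing (as they do in the Berkeley RISC designs), and
the spill is done "by hardware" rather than by a software trap.

```python
class WindowedRegisterFile:
    """Toy register-window file with automatic spill and fill."""

    def __init__(self, num_windows=4, regs_per_window=8):
        self.num_windows = num_windows
        self.regs_per_window = regs_per_window
        self.physical = [[0] * regs_per_window for _ in range(num_windows)]
        self.depth = 0          # current call depth (0 = outermost frame)
        self.resident_base = 0  # shallowest depth still held in registers
        self.spill_stack = []   # stands in for the memory spill area
        self.spills = 0
        self.fills = 0

    def _slot(self, depth):
        # Windows are reused circularly as the call depth grows.
        return depth % self.num_windows

    def call(self):
        self.depth += 1
        if self.depth - self.resident_base >= self.num_windows:
            # Overflow: save the oldest resident window before reusing it.
            victim = self._slot(self.resident_base)
            self.spill_stack.append(list(self.physical[victim]))
            self.resident_base += 1
            self.spills += 1
        self.physical[self._slot(self.depth)] = [0] * self.regs_per_window

    def ret(self):
        if self.depth == 0:
            raise IndexError("return from outermost frame")
        self.depth -= 1
        if self.depth < self.resident_base:
            # Underflow: the caller's window was spilled, so restore it.
            self.physical[self._slot(self.depth)] = self.spill_stack.pop()
            self.resident_base -= 1
            self.fills += 1

    def read(self, r):
        return self.physical[self._slot(self.depth)][r]

    def write(self, r, value):
        self.physical[self._slot(self.depth)][r] = value


rf = WindowedRegisterFile(num_windows=4)
rf.write(0, 111)      # value in the outermost frame
for _ in range(5):    # five nested calls overflow four windows
    rf.call()
for _ in range(5):
    rf.ret()
```

  After the five returns the outermost frame sees its original
register contents again, at the cost of two spills and two fills.
The interesting design question is whether those spill and fill
transfers can be overlapped with other memory references.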

  Finally, can anyone suggest additional questions, or
corrections to my questions, to help improve both speed and
usability in processors?

				Robert Herndon