[net.arch] Simple Instructions in Parallel

rb@ccird2.UUCP (Rex Ballard) (04/23/86)
In article <5100045@ccvaxa> aglew@ccvaxa.UUCP writes:
>
>>>A side note: the IBM PC/RT's ROMP chip is microprogrammed!? True, usually
>>>only one microword per instruction, but what does this say about Radin's
>>>and Patterson's argument that microcoding adds 10% to the cycle time?
>>
>>If there is only one microinstruction per macroinstruction what possibly
>>could that word hold that is not derivable from the instruction?  Perhaps
>>the ROMP decodes the instruction but a microengine that makes a 1 to 1
>>decoding of an instruction is ridiculous.
>
>This came out of that collection of IBM PC/RT ROMP articles. A 1 to 1 
>microengine makes partial sense, for the following reasons: (1) that doesn't
>mean all instructions are 1 to 1. Some have sequencing.  (2) Essentially, 1
>to 1 microcode would mean producing a microaddress from the instruction 
>through decoding logic, and then using this address to index into the 
>microinstruction store. Really, not very different from simple decoding, or
>using a ROM in your decoder, but it does imply more levels of logic than
>using a great big broad PLA. It can also be a lot denser.  
>
>(3) rather fortuitously, this jibes with the original subject of my posting.
>Microinstructions are for sequencing, right? This seems to make 1 to 1
>microengines silly, right? But what if you use decodes from several 
>instructions to form your microaddress? This is one way you can use to resolve
>concurrency problems. 

That sounds more to me like horizontal expansion.  The instruction may be an
index into a table of very wide "words": say, 16 bits in, 64 bits out.  This
would allow such things as opening several FIFO latches at once, which would
be impractical to encode in a single simple instruction.  Perhaps the source,
destination, and opcode fields are separate indices into separate "micro-code
arrays".  If there are multiple ALUs, FIFOs, and registers, the larger
internal word would be needed to manage the "overhead" of the relatively
simple external (instruction) word.
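
To make the "separate arrays" guess concrete, here is a minimal sketch in
Python.  Everything in it is an assumption for illustration: the field layout
([opcode:8][src:4][dst:4]), the table contents, and all of the names.  None of
it describes the actual ROMP.

```python
# Hypothetical 16-bit instruction layout: [opcode:8][src:4][dst:4].
def split_fields(instr):
    opcode = (instr >> 8) & 0xFF
    src = (instr >> 4) & 0xF
    dst = instr & 0xF
    return opcode, src, dst

# Three separate "micro-code arrays", one per instruction field.
# The contents are invented; a real machine would hold control bits here.
OPCODE_ROM = [0] * 256
OPCODE_ROM[0x01] = 0x00000011           # e.g. "ALU add" plus "latch result"
SRC_ROM = [1 << n for n in range(16)]   # one-hot: open source latch n
DST_ROM = [1 << n for n in range(16)]   # one-hot: open destination latch n

def control_word(instr):
    """Expand a 16-bit instruction into one 64-bit horizontal control word.

    Assumed layout of the result: [opcode field:32][src field:16][dst field:16].
    """
    opcode, src, dst = split_fields(instr)
    return (OPCODE_ROM[opcode] << 32) | (SRC_ROM[src] << 16) | DST_ROM[dst]

if __name__ == "__main__":
    # opcode 0x01, source 2, destination 3
    print(hex(control_word(0x0123)))
```

Each field is looked up independently, so three small tables (256 + 16 + 16
entries) stand in for one 65536-entry table of 64-bit words, and several
latches can be opened by the same control word.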

Is there such a thing as a "fully horizontal CISC"?  That is, a machine in
which indexing, offsets, ALU ops, and so on are all completed within a single
"bus state" (not necessarily one clock state)?  Obviously, complex stores such
as "push register set" and "frame setup" would require multiple bus states.
Anything close would also be interesting.