[comp.arch] Interlocked Pipelines - basic questions

koll@NECAM.tdd.sj.nec.com (Michael Goldman) (03/27/91)

I have been reading about RISC machines that either do or don't have
interlocking pipelines.  I read also that MIPS, which originally stood for
Microprocessor without Interlocking Pipeline Stages (I think) has introduced
interlocked pipelines.  I am ignorant about interlocking.

1.  What does the interlocking refer to ?

2.  Why is it considered good or bad ?

3.  What, if any, are the implications for interrupt handling ?  (The reason I
    ask is that I will soon be going to a new job which will be using 2 MIPS
    R3000s in a Real-Time system and I wondered if it would be better to
    dedicate one CPU to interrupt handling and the other to applications to
    get the most out of the pipelining for the applications ?)
    
4.  Does this mean MIPS will have to change its name to:
     With Interlocking Microprocessor Pipelined Stages ?  (Forget I said that ;)

jcallen@Encore.COM (Jerry Callen) (03/27/91)

In article <1991Mar26.171547.2400@sj.nec.com> koll@NECAM.tdd.sj.nec.com (Michael Goldman) writes:
>    ...I will soon be going to a new job which will be using 2 MIPS
>    R3000s in a Real-Time system...

What benchmark are you running that only gets 2 MIPS out of R3000s??? :-)

-- Jerry "See, I TOLD you 88Ks are faster!" Callen
   jcallen@encore.com

henry@zoo.toronto.edu (Henry Spencer) (03/27/91)

In article <1991Mar26.171547.2400@sj.nec.com> koll@NECAM.tdd.sj.nec.com (Michael Goldman) writes:
>... I read also that MIPS, which originally stood for
>Microprocessor without Interlocking Pipeline Stages (I think) has introduced
>interlocked pipelines.

MIPS stands for two things:  a Stanford project some years ago, which did
indeed use MIPS as an acronym with the expansion you give, and a company,
which doesn't.  The company's products have always had interlocking of
certain kinds.

>1.  What does the interlocking refer to ?

To having an instruction which needs not-yet-available data wait around
until the data is on hand.  The alternative is to require the compiler
to ensure that the situation never happens, by generating code that
explicitly stalls when needed results would not be ready in time.
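
The compiler-side alternative can be sketched roughly as follows. This
is only an illustration: the (opcode, destination, sources) tuple
format, the name pad_load_delays, and the single-slot load delay are
all assumptions made for the example, not details of any real MIPS
toolchain.

```python
# Hypothetical three-field instruction format: (opcode, destination, sources).
NOP = ("nop", None, ())


def pad_load_delays(program, load_delay=1):
    """Insert nops so no instruction reads a loaded register too early.

    load_delay is the assumed number of instruction slots after a load
    during which its destination register is not yet valid. Results of
    non-load instructions are assumed usable by the very next instruction.
    """
    out = []
    pending = {}  # register -> index in `out` at which it becomes readable
    for op, dest, srcs in program:
        # Pad until every source register of this instruction is readable.
        ready_at = max((pending.get(r, 0) for r in srcs), default=0)
        while len(out) < ready_at:
            out.append(NOP)
        out.append((op, dest, srcs))
        if dest is not None:
            if op == "load":
                pending[dest] = len(out) + load_delay
            else:
                # Overwritten by a non-load: readable immediately.
                pending.pop(dest, None)
    return out


if __name__ == "__main__":
    prog = [("load", "r2", ()), ("add", "r4", ("r2",))]
    for insn in pad_load_delays(prog):
        print(insn)
```

A load followed immediately by a use of its result gets one nop between
them; if an unrelated instruction already sits in that slot, nothing is
added.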

>2.  Why is it considered good or bad ?

It's good if delays are unpredictable, because otherwise the compiler
would have to insert a worst-case stall every time.  It's bad because
it adds hardware complexity.  The Stanford project established that
you *could* do without it, given a smart compiler and predictable delays.
Unpredictable cache delays and the desire to make compiled programs move
between different versions of the machine without recompilation tend to
encourage commercial hardware builders to use interlocks.
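
To put rough numbers on the worst-case argument, here is a small
sketch. The function name stall_cycles and the latencies used are
invented for illustration; real cache behaviour is far messier.

```python
def stall_cycles(latencies, worst_case):
    """Return (padded, interlocked) total stall cycles for a run of loads.

    latencies: the actual delay, in cycles, each load turned out to need.
    worst_case: the delay a compiler must assume when there are no interlocks.
    """
    if any(lat > worst_case for lat in latencies):
        raise ValueError("a load exceeded the assumed worst case")
    padded = worst_case * len(latencies)  # compiler pads every load
    interlocked = sum(latencies)          # hardware waits only as needed
    return padded, interlocked


if __name__ == "__main__":
    # Assumed mix: nine cache hits of 1 cycle and one miss of 10 cycles.
    print(stall_cycles([1] * 9 + [10], worst_case=10))
```

With that assumed mix, padding every load for the miss case costs 100
stall cycles, while waiting only as long as each load actually takes
costs 19.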

>3.  What, if any, are the implications for interrupt handling ?

None that I can think of offhand.  Pipelining does tend to make interrupts
fun, but whether the pipeline is interlocked or not is less significant.
-- 
"[Some people] positively *wish* to     | Henry Spencer @ U of Toronto Zoology
believe ill of the modern world."-R.Peto|  henry@zoo.toronto.edu  utzoo!henry

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (03/28/91)

In article <1991Mar26.213709.9246@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:

| >1.  What does the interlocking refer to ?
| 
| To having an instruction which needs not-yet-available data wait around
| until the data is on hand.  The alternative is to require the compiler
| to ensure that the situation never happens, by generating code that
| explicitly stalls when needed results would not be ready in time.

  An example is register scoreboarding: an operation that is going to
modify the contents of a register sets a flag, so that any subsequent
instruction that uses that register as input stalls until the result
has arrived.

Consider:
  x1:	mov	memloc1,r2
  x2:	mov	memloc2,r3
  x3:	add	r2,#1,r4
  x4:	sub	r4,r3,r5
  x5:	mov	r5,memloc3

Note that the compiler has put the use of r2 before the use of r3,
since r2 was loaded first and is the more likely to be ready. If the
load in {x1} has not completed when {x3} starts, {x3} stalls until the
load is complete.

In a machine with lots of parallelism {x4} could be started, since {x2}
may complete first due to caching. However it needs the r4 result of
{x3} so it stalls, too. Had there been an independent instruction to
put after {x3}, a good compiler would have put it there, since {x3}
must complete before {x4} can proceed.
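
The stalls in this example can be traced with a toy scoreboard model.
It is a sketch under simplifying assumptions: single in-order issue,
the last operand of each instruction taken as its destination, invented
latencies (3 cycles for a load, 1 for everything else), and no handling
of write-after-write hazards. The name issue_times is hypothetical.

```python
def issue_times(program, latency):
    """Return the cycle at which each instruction issues, in program order.

    program: list of (label, dest_register_or_None, source_registers).
    latency: dict mapping label -> cycles until that result is ready.
    """
    ready = {}   # scoreboard: register -> cycle its pending value arrives
    cycle = 0    # earliest cycle at which the next instruction may issue
    issued = {}
    for label, dest, srcs in program:
        # Stall while any source register is still flagged as pending.
        start = max([cycle] + [ready.get(r, 0) for r in srcs])
        issued[label] = start
        if dest is not None:
            ready[dest] = start + latency[label]
        cycle = start + 1  # single issue, in order
    return issued


if __name__ == "__main__":
    program = [
        ("x1", "r2", ()),            # mov memloc1,r2
        ("x2", "r3", ()),            # mov memloc2,r3
        ("x3", "r4", ("r2",)),       # add r2,#1,r4
        ("x4", "r5", ("r4", "r3")),  # sub r4,r3,r5
        ("x5", None, ("r5",)),       # mov r5,memloc3
    ]
    # Assumed latencies: loads take 3 cycles, everything else 1.
    latency = {"x1": 3, "x2": 3, "x3": 1, "x4": 1, "x5": 1}
    print(issue_times(program, latency))
```

Under those assumed latencies {x3} issues at cycle 3 instead of cycle 2,
a one-cycle stall waiting on r2, and the remaining instructions follow
one per cycle.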
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
        "Most of the VAX instructions are in microcode,
         but halt and no-op are in hardware for efficiency"