[comp.arch] R4000 vs. superpipelining

davec@nucleus.amd.com (Dave Christie) (02/16/91)
I've re-read one of Jouppi's superpipelining papers (it's been a couple
of years) - the one from ASPLOS III: "Available Instruction-Level
Parallelism for Superscalar and Superpipelined Machines" in which
he states in the abstract: "Superpipelined machines can issue only one
instruction per cycle, but they have cycle times shorter than the
latency of *any* functional unit" (emphasis mine).  I don't particularly
worship Mr. Jouppi, but I think this is a useful distinction.

However, he also allows further on that machines with a single-cycle ALU
but multi-cycle memory access and/or floating point can be thought of
as "slightly superpipelined", and defines a metric, "degree of
superpipelining" which is basically the average operation latency *for
a given instruction mix*.
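
To make the metric concrete, here is a rough sketch of it as a
frequency-weighted average of operation latencies.  The function name,
the operation classes, the mix and the latency tables are all made up
for illustration; they are not measured figures for the R3000, the
R4000 or any other real part.

```python
def degree_of_superpipelining(mix, latency):
    """Average operation latency (in cycles), weighted by instruction mix.

    mix     -- maps operation class to its relative frequency
    latency -- maps operation class to its latency in cycles
    Both tables are hypothetical inputs supplied by the caller.
    """
    total = sum(mix.values())
    if total <= 0:
        raise ValueError("instruction mix must have positive total weight")
    # Normalise by the total so the mix need not sum exactly to 1.
    return sum(freq * latency[op] for op, freq in mix.items()) / total


# Hypothetical instruction mix (fractions of dynamic instructions).
mix = {"alu": 0.5, "load": 0.25, "store": 0.125, "branch": 0.125}

# Two hypothetical implementations of the same architecture.
shallow = {"alu": 1, "load": 2, "store": 1, "branch": 2}
deep = {"alu": 1, "load": 3, "store": 1, "branch": 3}

print(degree_of_superpipelining(mix, shallow))  # 1.375
print(degree_of_superpipelining(mix, deep))     # 1.75
```

Both made-up machines have a single-cycle ALU, so both come out
"slightly superpipelined", just to differing degrees.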

This metric is the only thing that can keep the term reasonably useful,
but since it depends on instruction mix, and hence on compilers and
architecture, it is less useful for comparing different architectures
than for comparing different implementations of the same architecture.
It would be quite valid for a comparison of the R4000 with the R3000
(both being slightly superpipelined, just to differing degrees).  SPEC
would be the obvious instruction mix.  However, it does not take into
account actual runtime dependency stalls, so it isn't an indicator of
performance.  Not terribly useful after all....
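
A toy illustration of the dependency-stall point, under loudly stated
assumptions: a single-issue, in-order machine with full interlocks,
and invented latencies (3-cycle load, 1-cycle add).  The function name
and the instruction encoding are hypothetical.  The two sequences
below have an identical instruction mix, so any mix-based average
latency is the same for both, yet they stall differently.

```python
def stall_cycles(program, latency):
    """Count issue stalls on a toy single-issue, in-order pipeline.

    program -- list of (op, dest_register, source_registers) tuples
    latency -- maps op to cycles until its result is available
    """
    ready = {}       # register -> cycle at which its value is available
    next_slot = 0    # earliest cycle the next instruction could issue
    stalls = 0
    for op, dest, sources in program:
        # Issue no earlier than the next slot and no earlier than the
        # cycle at which every source operand is ready.
        issue = max([next_slot] + [ready.get(r, 0) for r in sources])
        stalls += issue - next_slot
        ready[dest] = issue + latency[op]
        next_slot = issue + 1
    return stalls


latency = {"load": 3, "add": 1}   # invented latencies, not a real chip

# Each add immediately follows the load it depends on.
unscheduled = [
    ("load", "r1", ()),
    ("add", "r2", ("r1",)),
    ("load", "r3", ()),
    ("add", "r4", ("r3",)),
]

# Same four instructions, with the loads hoisted ahead of the adds.
scheduled = [
    ("load", "r1", ()),
    ("load", "r3", ()),
    ("add", "r2", ("r1",)),
    ("add", "r4", ("r3",)),
]

print(stall_cycles(unscheduled, latency))  # 4
print(stall_cycles(scheduled, latency))    # 1
```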

So, I suppose I could accept the term "superpipelined" for the R4000,
as long as the degree is given, as well as for the R3000, plus other
current microprocessors.  But it still is about as meaningful as RISC,
MIPS and "Your check is in the mail".

----------------------------
Dave Christie      My opinions only.