[comp.sys.m88k] What the heck is "instruction folding"?

rfg@paris.ics.uci.edu (Ronald Guilmette) (03/12/90)

The following is a short excerpt from the March 1990 issue of Unix Review
(page 26):

	In December 1989, Dolphin { Server Technology } announced the Orion,
	a project that is to take performance far beyond today's 88k
	processors by building a processor using the 88k instruction set
	that is capable of executing up to eight instructions in parallel
	to achieve a theoretical peak performance of 1000 MIPS.  Dolphin
	Marketing Manager Lars Lauritzsen says the ECL processor uses a
	patented technique that the company calls "instruction folding",
	a result of Norsk's research in improving performance.

...and later on...

	... the Orion processor should ship in the first half of 1992.
	Orion is the subject of a mutual exchange of patented technology
	between Dolphin and Motorola.

I wonder what this is all about.  Perhaps someone reading this can enlighten
me.

I understand that it might make sense to make a VLIW version of an 88k,
but what is this "instruction folding" stuff all about?  That's a new one
on me!  Is it just marketing hype or is there really something novel here?

Since the claim is made that it is already patented, I can't see any
reason for undue secrecy.  Perhaps one of the many Motorola engineers
who reads this newsgroup on a regular basis will enlighten us all about
"instruction folding".

// rfg

ram@shukra.Sun.COM (Renu Raman) (03/14/90)

In article <25FAED94.24113@paris.ics.uci.edu> rfg@paris.ics.uci.edu (Ronald Guilmette) writes:
>The following is a short excerpt from the March 1990 issue of Unix Review
>(page 26):
>
>	In December 1989, Dolphin { Server Technology } announced the Orion,
>	a project that is to take performance far beyond today's 88k
>	processors by building a processor using the 88k instruction set
>	that is capable of executing up to eight instructions in parallel
>	to achieve a theoretical peak performance of 1000 MIPS.  Dolphin
>	Marketing Manager Lars Lauritzsen says the ECL processor uses a
>	patented technique that the company calls "instruction folding",
>	a result of Norsk's research in improving performance.
>
>...and later on...
>
>	... the Orion processor should ship in the first half of 1992.
>	Orion is the subject of a mutual exchange of patented technology
>	between Dolphin and Motorola.
>
>I wonder what this is all about.  Perhaps someone reading this can enlighten
>me.
>
>I understand that it might make sense to make a VLIW version of an 88k,
>but what is this "instruction folding" stuff all about?  That's a new one
>on me!  Is it just marketing hype or is there really something novel here?
>
>Since the claim is made that it is already patented, I can't see any
>reason for undue secrecy.  Perhaps one of the many Motorola engineers
>who reads this newsgroup on a regular basis will enlighten us all about
>"instruction folding".
>
>// rfg

    If an instruction does not occupy a pipeline slot, then one says "<that>
    instruction is folded".  For example, floating-point ops can be detected
    or pre-decoded, say in some small I-cache or pre-fetch queue, and shipped
    to the FP controller directly without entering the main integer pipeline.
    This saves one pipeline slot, which in effect increases your instruction
    bandwidth requirements and reduces your FP instruction's start-to-result
    latency.  One then speaks of "floating-point folding".  The same applies
    to branches, where it is called "branch folding", and I am guessing Moto
    marketing has generalized it to "instruction folding".
   Renukanthan Raman				ARPA:ram@sun.com
   Sun Microsystems			UUCP:{ucbvax,seismo,hplabs}!sun!ram
   M/S 18-412, 2500 Garcia Avenue,               TEL:415-336-1813
   Mt. View,  CA 94043
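
A minimal sketch of the folding idea described above, assuming a toy instruction stream in which each non-folded instruction takes exactly one integer-pipeline slot. The function name, the `FP_OPS` set, and the sample stream are hypothetical and only illustrate the bookkeeping; nothing here models a real 88k or the Orion design.

```python
# Toy model of "folding": FP ops are diverted from a pre-fetch queue to the
# FP controller and never occupy an integer-pipeline slot.
# Assumption: every non-folded instruction costs exactly one integer slot.

FP_OPS = {"fadd", "fsub", "fmul", "fdiv"}  # mnemonics used only as labels


def integer_slots_used(stream, fold_fp):
    """Return (integer slots consumed, ops shipped directly to the FP unit)."""
    slots = 0
    fp_queue = []
    for op in stream:
        if fold_fp and op in FP_OPS:
            fp_queue.append(op)  # folded: bypasses the integer pipeline
        else:
            slots += 1           # occupies one integer-pipeline slot
    return slots, fp_queue


stream = ["ld", "fadd", "add", "fmul", "st", "bcnd"]
print(integer_slots_used(stream, fold_fp=False))  # (6, [])
print(integer_slots_used(stream, fold_fp=True))   # (4, ['fadd', 'fmul'])
```

With folding enabled, the same six instructions consume four integer slots instead of six, which is the "saved pipeline slot" per FP op in the explanation.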

rfg@ics.uci.edu (Ronald Guilmette) (03/14/90)

In article <132876@sun.Eng.Sun.COM> ram@sun.UUCP (Renu Raman) writes:
>In article <25FAED94.24113@paris.ics.uci.edu> rfg@paris.ics.uci.edu (Ronald Guilmette) writes:
>>The following is a short excerpt from the March 1990 issue of Unix Review
>>(page 26):
>>
>>	In December 1989, Dolphin { Server Technology } announced the Orion,
>>	a project that is to take performance far beyond today's 88k
>>	processors by building a processor using the 88k instruction set
>>	that is capable of executing up to eight instructions in parallel
>>	to achieve a theoretical peak performance of 1000 MIPS.  Dolphin
>>	Marketing Manager Lars Lauritzsen says the ECL processor uses a
>>	patented technique that the company calls "instruction folding",
>>	a result of Norsk's research in improving performance.
>>
>>...and later on...
>>
>>	... the Orion processor should ship in the first half of 1992.
>>	Orion is the subject of a mutual exchange of patented technology
>>	between Dolphin and Motorola.
>>
>>I wonder what this is all about.  Perhaps someone reading this can enlighten
>>me.
>>
>>I understand that it might make sense to make a VLIW version of an 88k,
>>but what is this "instruction folding" stuff all about?  That's a new one
>>on me!  Is it just marketing hype or is there really something novel here?
>>
>>Since the claim is made that it is already patented, I can't see any
>>reason for undue secrecy.  Perhaps one of the many Motorola engineers
>>who reads this newsgroup on a regular basis will enlighten us all about
>>"instruction folding".
>>
>>// rfg
>
>    If an instruction does not occupy a pipeline slot, then one says "<that>
>    instruction is folded".  For example, floating-point ops can be detected
>    or pre-decoded, say in some small I-cache or pre-fetch queue, and shipped
>    to the FP controller directly without entering the main integer pipeline.
>    This saves one pipeline slot, which in effect increases your instruction
>    bandwidth requirements and reduces your FP instruction's start-to-result
>    latency.  One then speaks of "floating-point folding".  The same applies
>    to branches, where it is called "branch folding", and I am guessing Moto
>    marketing has generalized it to "instruction folding".
>   Renukanthan Raman				ARPA:ram@sun.com
>   Sun Microsystems			UUCP:{ucbvax,seismo,hplabs}!sun!ram
>   M/S 18-412, 2500 Garcia Avenue,               TEL:415-336-1813
>   Mt. View,  CA 94043

So where is the patentable technology in "instruction folding"?  This sounds
too trivial to qualify for a patent.  Is there a lawyer from Scandinavia
in the house?

Also, what the heck has "instruction folding" got to do with VLIW machines?
Hummm... maybe these fellows at Dolphin are saying that they have a VLIW
machine when in fact they have a machine that batches up groups of instruction
decodes but which still has to stuff the instructions down a single FP pipe one
at a time.  Anybody else wanna partake of this wild speculation?

I have to say that I find it quite curious that the only response I got to
my original question was from a guy at Sun.  C'mon now.  I *know* there's
a lot of Moto guys who read this group.  What's making you all so shy?

Oh yeah.  And by the way.  Where is that figure of 1000 MIPS (peak) coming
from?  Lemme see. If you run 8 processors in parallel that means each one
must get a peak of 125 MIPS, right?  Are we then talking about 125 MHz
parts?  Did this figure perhaps pass through a marketing department on
its journey to the popular press?  Hey!  Just asking.  I don't know nuttin
about ECL.  Maybe it *will* go up to that by the first half of 1992.
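
The back-of-the-envelope arithmetic above can be written out as follows. This is only a sketch: it assumes each of the eight parallel units retires at most one instruction per clock at peak, which the article does not state, and `required_clock_mhz` is a made-up helper name.

```python
def required_clock_mhz(peak_mips, parallel_units, instr_per_cycle_per_unit=1.0):
    """Clock rate (MHz) needed to reach peak_mips under the stated assumptions."""
    per_unit_mips = peak_mips / parallel_units       # 1000 / 8 = 125 MIPS each
    return per_unit_mips / instr_per_cycle_per_unit  # MIPS / (instr/cycle) = MHz


print(required_clock_mhz(1000, 8))  # 125.0
```

If a unit could somehow retire more than one instruction per clock, the required clock would drop proportionally; the 125 MHz figure holds only under the one-per-clock assumption.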

// rfg

bsw@cci632.UUCP (Brad Werner) (03/16/90)

rfg@ics.uci.edu (Ronald Guilmette) writes:
*ram@sun.UUCP (Renu Raman) writes:
*>rfg@paris.ics.uci.edu (Ronald Guilmette) writes:
*>>from the March 1990 issue of Unix Review (page 26):
*>>	In December 1989, Dolphin { Server Technology } announced the Orion,
*>>	a project that is to take performance far beyond today's 88k
*>>	processors by building a processor using the 88k instruction set
*>>	that is capable of executing up to eight instructions in parallel
*>>	to achieve a theoretical peak performance of 1000 MIPS.  
*>
*>    If an instruction does not occupy a pipeline slot, then one says "<that>
*>    instruction is folded", 
*
*Also, what the heck has "instruction folding" got to do with VLIW machines?
*Hummm... maybe these fellows at Dolphin are saying that they have a VLIW
*machine when in fact they have a machine that batches up groups of instruction
*decodes but which still has to stuff the instructions down a single FP pipe one
*at a time.  Anybody else wanna partake of this wild speculation?
*

Yes.  Wild speculation follows:
  "...executing eight instructions in parallel..." -> could mean that they
  have four FP and four IP units fed by the kind of high-volume pre-fetch
  which rfg describes later.  Or maybe a 6/2 IP/FP ratio is more suited to
  their needs, I just picked a symmetric P unit division assuming that they
  might want to prototype with CMOS or whatever existing 88100s before getting
  deep into the ECL concerns.  Eight in parallel could mean eight of each,
  but I assumed marketing types would have insisted on calling that 16 in ||.
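
To make the 4/4 versus 6/2 split concrete, here is a rough sketch of how many instructions from one eight-wide group could issue in a single cycle for a given unit mix. It assumes every unit accepts one instruction per cycle and ignores data dependencies entirely; `issued_per_cycle` and the sample group are hypothetical.

```python
# Assumption: each IP or FP unit accepts one instruction per cycle, and
# instructions in a group are independent of one another.

FP_OPS = {"fadd", "fsub", "fmul", "fdiv"}  # mnemonics used only as labels


def issued_per_cycle(group, ip_units, fp_units):
    """Count how many instructions of the group can issue in one cycle."""
    n_fp = sum(1 for op in group if op in FP_OPS)
    n_int = len(group) - n_fp
    return min(n_int, ip_units) + min(n_fp, fp_units)


group = ["ld", "add", "st", "sub", "or", "and", "fadd", "fmul"]  # 6 int, 2 FP
print(issued_per_cycle(group, ip_units=4, fp_units=4))  # 6
print(issued_per_cycle(group, ip_units=6, fp_units=2))  # 8
```

For an integer-heavy group like this one, the 6/2 mix issues all eight instructions while the symmetric 4/4 mix issues only six, which is the kind of trade-off the speculation alludes to.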

  So the actual IP/FP units would be 88k code compatible, yet the VLIW fed
  into the pre-decode could be multi-slot 88k or whatever is convenient.
  Some information could be included to specify P unit scheduling which
  gets into folding, and related issues if the VLIW does not just specify 
  eight 88k instructions.  That would get into one class of scheduling
  issues in the compiler in order to deal with the multiple P pipes.
  If the pre-fetcher+ does some scheduling, this speculation is more 
  interesting.

  This may be a side issue, but while I'm into speculation mode I don't
  want to forget it.  Say a 'bcnd' comes down the pipe.  The N field
  currently specifies some pipe guidelines--whether to execute the next
  instruction unconditionally or not.  The VLIW containing a bcnd could
  have additional information specifying which slots of this VLIW and the
  next W to execute unconditionally (too CISCy for you?).
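
One way to picture the bcnd idea is as a per-slot generalization of the delay slot: a bit mask attached to the branch says which slots of the following word still execute when the branch is taken. The sketch below is purely speculative, like the paragraph it follows; the mask encoding, `SLOTS`, and `surviving_slots` are invented for illustration and are not part of the 88k instruction set.

```python
SLOTS = 8  # "up to eight instructions in parallel", per the article


def surviving_slots(next_word, uncond_mask, branch_taken):
    """Return the instructions of the next word that execute.

    Bit i of uncond_mask set means slot i executes even if the branch in the
    previous word is taken (hypothetical encoding).
    """
    if len(next_word) > SLOTS:
        raise ValueError("a word holds at most %d slots" % SLOTS)
    if not branch_taken:
        return list(next_word)  # fall-through: every slot executes
    return [op for i, op in enumerate(next_word) if (uncond_mask >> i) & 1]


next_word = ["add", "ld", "fadd", "st", "sub", "fmul", "or", "and"]
print(surviving_slots(next_word, 0b00000101, branch_taken=True))  # ['add', 'fadd']
print(len(surviving_slots(next_word, 0b00000101, branch_taken=False)))  # 8
```

A mask with only bit 0 set would behave like a single delay slot; wider masks let the compiler fill more of the word with work that is safe on both paths.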

  For either case (the two previous paragraphs), classic 88k instructions
  don't all enter the same one of eight pipes, so I believe the general term
  instruction folding is appropriate.  Moot side note: Didn't Norsk come
  up with the term (and patent)?

I'd better stop now before I get too far down the pipe and the queues
have to be flushed.
-Brad Werner; USENET: ...!cci632!ccird1!bsw; these are my speculations.