jbs@WATSON.IBM.COM (05/14/91)
I believe what Fujitsu has actually done is provide a vector
instruction to compute x(j)=a(j)*x(j-1)+b(j) for j=1,...,n (n being the
length of a vector register), given a, b and x(0). This runs slower than
most vector instructions because it can't be pipelined (i.e. the add for
x(j) feeds the multiply for x(j+1)). It may still be advantageous,
however, if it lies in a loop that can otherwise be vectorized, since
without it time is wasted moving data between the vector and scalar
units.
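
To make the dependency concrete, here is a minimal scalar sketch of the
recurrence in Python. It only illustrates the arithmetic and is not a
model of Fujitsu's instruction; the name linear_recurrence and the
argument layout are my own assumptions.

```python
def linear_recurrence(a, b, x0):
    """Return [x(1), ..., x(n)] where x(j) = a(j)*x(j-1) + b(j).

    Hypothetical helper for illustration: a[0] holds a(1), b[0] holds
    b(1), and x0 is the starting value x(0).
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    x = []
    prev = x0
    for aj, bj in zip(a, b):
        # The multiply needs the result of the previous iteration's add,
        # so iteration j cannot start before iteration j-1 has finished.
        prev = aj * prev + bj
        x.append(prev)
    return x
```

For example, with a = [2, 3], b = [1, 1] and x(0) = 1 this gives
x(1) = 2*1+1 = 3 and x(2) = 3*3+1 = 10.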
Of course, I may have this all wrong. Perhaps someone from
Fujitsu can give us the real story.
James B. Shearer