[comp.arch] Chaining on IBM 3090 VF

yk@a.nl.cs.cmu.edu (Yasusi Kanada) (01/02/89)

I recently read an article on the IBM 3090 in the 88-12 issue of Transaction
of Information Processing (written in Japanese).  In this article, the author
(at IBM Tokyo Research Center) writes that the following instruction sequence
is executed in a CHAINED manner, so that a result is generated every cycle.

	VL	VR0,A(R1)
	VA	VR0,B(R2)
	VST	VR0,C(R3)

Is that true?  Thanks in advance.

-Yasusi Kanada

--

newton@kahuna.UUCP (Mike Newton) (01/04/89)

In article <3950@pt.cs.cmu.edu> yk@a.nl.cs.cmu.edu (Yasusi Kanada) writes:
>I recently read an article on the IBM 3090 in the 88-12 issue of Transaction
>of Information Processing (written in Japanese).  In this article, the author
>(at IBM Tokyo Research Center) writes that the following instruction sequence
>is executed in a CHAINED manner, so that a result is generated every cycle.
>
>	VL	VR0,A(R1)
>	VA	VR0,B(R2)
>	VST	VR0,C(R3)
>
>Is that true?  Thanks in advance.
>
>-Yasusi Kanada

Once the pipeline is loaded, this will give you a result every cycle.
To overcome the bottleneck in fp multiply, I believe they alternate
between a bank of 2 or 3 multipliers.

I strongly urge you to find an old copy of the IBM Journal of R & D from
last year (Feb?) if you have access to a 3090 (w/ or w/o VF).  I'd
give you the exact date and more precise info on the above, but my copy is
about 3000 miles from here.

Some useful facts for high-speed code generation (my speciality): 'LA'
instructions are effectively executed by the instruction fetcher and
so usually cost 0 clock cycles.  Also: avoid overlapping args for SS
instructions, and avoid self-modifying code like the plague.  To a good
first-order approximation, program execution time = clock cycle time *
number of instructions executed (i.e., it's basically a risc :-) !! )
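
As a rough illustration of that first-order estimate, here is a minimal
Python sketch.  The function name, the instruction counts, and the cycle
time are all made up for the example and are not measured 3090 figures;
the sketch simply treats 'LA' as free and charges one cycle for every
other executed instruction.

```python
def estimate_seconds(instr_counts, cycle_time_ns):
    """First-order estimate: one cycle per executed instruction, 'LA' free.

    instr_counts maps an opcode to how many times it was executed
    (hypothetical numbers below); cycle_time_ns is an assumed cycle time.
    """
    cycles = sum(count for op, count in instr_counts.items() if op != "LA")
    return cycles * cycle_time_ns * 1e-9


# Hypothetical dynamic instruction counts and cycle time, illustration only.
counts = {"L": 400_000, "A": 300_000, "ST": 200_000, "LA": 100_000}
print(estimate_seconds(counts, cycle_time_ns=20.0))
```

This ignores cache misses, SS-instruction overlap penalties, and anything
else that makes an instruction cost more than one cycle, so it is only a
starting point.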

- mike

newton@csvax.caltech.edu		Caltech Submillimeter Observatory
(which is forwarded to)			POB 4339 / Hilo HI 96720
 cit-vax!kahuna!newton			808 935 1909

yk@a.nl.cs.cmu.edu (Yasusi Kanada) (01/19/89)

Dr. Yasumura of Hitachi Central Research Laboratory replied to me as follows:

>I asked Nakatani-san of IBM Research Lab. and he admitted that the article
>was wrong, i.e. there is no chaining feature in 3090/VF.

So none of the instructions in the sequence VL, ..., VST are executed
concurrently with one another.  I'm sorry for the ambiguity in my last
posting.  The VST instruction probably generates one element of the result
every cycle, but that was not what I was asking about.  Thanks to all who
responded to my posting.
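
To make the difference concrete, here is a toy timing model in Python.
It is only a sketch: the startup latencies and the vector length are
invented numbers, not 3090/VF figures, and the model ignores memory-bank
conflicts and instruction-issue overhead.  Without chaining, each vector
instruction processes all n elements before the next one starts; with
chaining, elements flow on to the next instruction as they are produced.

```python
def unchained_cycles(n, startups):
    """Each instruction finishes all n elements before the next begins."""
    return sum(s + n for s in startups)


def chained_cycles(n, startups):
    """Pipelines fill once, then one final result emerges per cycle."""
    return sum(startups) + n


# Hypothetical startup latencies (in cycles) for VL, VA, VST.
startups = [10, 6, 4]
n = 128  # assumed vector length for the example

print("unchained:", unchained_cycles(n, startups))
print("chained:  ", chained_cycles(n, startups))
```

In this model the unchained sequence still delivers one element per cycle
within each instruction, but the three instructions run back to back, so
the total is roughly three times the chained figure for long vectors.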

-Yasusi Kanada


--