newton@kahuna.UUCP (Mike Newton) (01/05/89)
[ Found this on comp.arch. Although we're not necessarily interested in
vector processors, no one else seems to be taking up the slack. Pipelining
things is a form of parallelism. Call it ``empire building.'' 8-)
--- Steve ]

In article <3950@pt.cs.cmu.edu> yk@a.nl.cs.cmu.edu (Yasusi Kanada) writes:
>I read an article of IBM 3090 in 88-12 issue of Transaction of Information
>Processing (written in Japanese) recently. In this article, the author
>(in IBM Tokyo Research Center) writes that the following instruction sequence
>is executed in CHAINED manner, so the result is generated every cycle.
>
>	VL	VR0,A(R1)
>	VA	VR0,B(R2)
>	VST	VR0,C(R3)
>
>Is that true? Thanks in advance.
>
>-Yasusi Kanada

Once the pipeline is loaded this will give you a result every cycle. To
overcome the bottleneck in fp multiply, I believe they alternate between a
bank of 2 or 3 multipliers.

I strongly urge you to find an old copy of IBM Journal of R & D from last
year (Feb?) if you have access to a 3090 (w/ or w/o VF). I'd give you the
exact date and more precise info above, but my copy is about 3000 miles
from here.

Some useful facts for high-speed code generation (my specialty): 'LA'
instructions are effectively executed by the instruction fetcher and so
usually cost 0 clock cycles. Also: avoid overlapping args for SS
instructions and self-modifying code like the plague.

To a good first-order approximation,

	program execution time = clock cycle time * number of instructions executed

(ie: it's basically a risc :-) !! )

- mike

newton@csvax.caltech.edu	Caltech Submillimeter Observatory
(which is forwarded to)		POB 4339 / Hilo HI 96720
cit-vax!kahuna!newton		808 935 1909
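
The chaining claim and the first-order timing rule above can be illustrated with a toy cycle-count model. This is only a sketch under stated assumptions: the function names are made up for the example, and the fill latency of 4 cycles and vector length of 128 are placeholder values, not measured 3090 figures.

```python
# Toy cycle-count model; all numbers and names are illustrative assumptions.

def chained_cycles(n_elements, fill_latency):
    """Chained VL/VA/VST: pay the pipeline fill once, then one result per cycle."""
    return fill_latency + n_elements


def unchained_cycles(n_elements, fill_latency, n_instructions=3):
    """Hypothetical case where each vector instruction must finish
    before the next one starts, so the fill is paid every time."""
    return n_instructions * (fill_latency + n_elements)


def first_order_time(cycle_time, instructions_executed):
    """First-order estimate: execution time = cycle time * instructions executed."""
    return cycle_time * instructions_executed


if __name__ == "__main__":
    n, fill = 128, 4  # assumed vector length and fill latency
    print("chained:  ", chained_cycles(n, fill), "cycles")
    print("unchained:", unchained_cycles(n, fill), "cycles")
```

In this model the chained sequence approaches one result per cycle as the vector gets longer, because the fill latency is amortized over all elements.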