gsg0384@uxa.cso.uiuc.edu (08/17/89)
Hi, I've heard that vector machines are more expensive than multi-cpu parallel machines. I've got two questions about vector machines. 1. For compiler design, I think vector machine architecture is easier. Is this true? 2. Our machine is Ardent Titan. Each cpu has 64-register length vector registers. The problem is that this machine does not vectorize do loops with division. How much harder is to implement division than the other three operations, + - x? Is this a hardware limitation? I'd rather have a vector machine than a multi-cpu parallel machine for my application. I just want more people in computer industries to pay more attention to vector machines. Hugh
mccalpin@masig3.ocean.fsu.edu (John D. McCalpin) (08/18/89)
In the above referenced message, Hugh ??? <gsg0384@uxa.cso.uiuc.edu> writes: >I've heard that vector machines are more expensive than multi-cpu parallel >machines. I've got two questions about vector machines. There is no reason, in principle, why this should be true. However, almost all of the machines designed around vector pipelines are quite expensive, while there are a fair number of parallel boxes becoming available. This may change if Ardent and/or Stellar make any headway in the workstation market. >1. For compiler design, I think vector machine architecture is easier. > Is this true? For operations that map onto the vector hardware, this is likely to be true. There is also considerably more experience in the industry at vectorizing scientific codes than parallelizing them. One of the reasons for this is that "vectorizing" means basically one thing, while "parallelizing" means many different things depending on the model (MIND or SIMD), the size of the pieces, the communication and synchronization options, etc..... >2. Our machine is Ardent Titan. Each cpu has 64-register length vector > registers. The problem is that this machine does not vectorize do loops > with division. How much harder is to implement division than the other > three operations, + - x? Is this a hardware limitation? The Ardent Titan uses the Weitek 2264/2265 pipelined floating-point adder and multiplier chips. Divide is not implemented in the hardware, but is iterated using one or both of the chips (I don't remember which way Ardent does it). Pipelined divide units are not likely to be found on anything in the Titan's price range. >I'd rather have a vector machine than a multi-cpu parallel machine for >my application. I just want more people in computer industries to pay >more attention to vector machines. I'll second that motion! -- John D. McCalpin - mccalpin@masig1.ocean.fsu.edu - mccalpin@nu.cs.fsu.edu mccalpin@delocn.udel.edu
frazier@oahu.cs.ucla.edu (Greg Frazier) (08/18/89)
In article <MCCALPIN.89Aug17175558@masig3.ocean.fsu.edu> mccalpin@masig3.ocean.fsu.edu (John D. McCalpin) writes: +>In the above referenced message, Hugh ??? <gsg0384@uxa.cso.uiuc.edu> writes: +>>I've heard that vector machines are more expensive than multi-cpu parallel +>>machines. I've got two questions about vector machines. +> +> There is no reason, in principle, why this should be true. I think the reason behind this is that multicomputers can use off-the-shelf processors in their processing nodes. Vector machines, on the other hand, require custom parts. Greg Frazier o Internet: frazier@CS.UCLA.EDU CS dept., UCLA /\ UUCP: ...!{ucbvax,rutgers}!ucla-cs!frazier ----^/---- /
slackey@bbn.com (Stan Lackey) (08/18/89)
In article <112400003@uxa.cso.uiuc.edu> gsg0384@uxa.cso.uiuc.edu writes: >I've heard that vector machines are more expensive than multi-cpu parallel >machines. I've got two questions about vector machines. It depends upon what you're comparing. A Cray is more expensive than a Sequent. I know of vector machines that are cheaper than simple single-processor scalar machines. You statement may be a result of a perception that "for a fixed MIPS, parallel processing is cheaper than a single processor." Which is a probably a valid statement, if your application gets anywhere near linear speedup. >1. For compiler design, I think vector machine architecture is easier. > Is this true? Today your statement is true. The reason is that vector machines have been around much much longer than parallel machines (in an industrial sense) and there has been many more manyears of development put into automatic vectorization. Now that usable parallel machines (i.e, shared global memory) exist, my guess is that compilation technology for them will catch up to exploit the massive performance capacity. >2. Our machine is Ardent Titan. Each cpu has 64-register length vector > registers. The problem is that this machine does not vectorize do loops > with division. How much harder is to implement division than the other > three operations, + - x? Is this a hardware limitation? I am guessing, but it sounds like there is no vector divide instruction. If their divider is a successive approximation type, especially if it uses the multiplier, which is likely, it cannot be pipelined anyway. BTW the Alliant has a standalone divider gate array. Only one is used in scalar instructions. But for vectors there are two of them, which operate on alternate operand pairs. >I'd rather have a vector machine than a multi-cpu parallel machine for >my application. I just want more people in computer industries to pay >more attention to vector machines. For a vector machine you'll need to go to the minisupercomputer guys, I'm afraid. It looks like the suppliers of single-chip micros are not to the point where they are (willing or able?) to include them. Intel has a first step with the dual instruction issue capability if the i860, but they're not there yet; also their data cache is far too small for it to be effective. :-) Stan
mkraiesk@midas.UUCP (Mark Kraieski) (08/25/89)
in article <112400003@uxa.cso.uiuc.edu>, gsg0384@uxa.cso.uiuc.edu says: > Nf-ID: #N:uxa.cso.uiuc.edu:112400003:000:698 > Nf-From: uxa.cso.uiuc.edu!gsg0384 Aug 16 21:38:00 1989 > > > > Hi, > > I've heard that vector machines are more expensive than multi-cpu parallel > machines. I've got two questions about vector machines. I was involved in the compiler work for Gould's NP1 vector computer and would like to shed some light on these questions. > > 1. For compiler design, I think vector machine architecture is easier. > Is this true? Not that not all code can be vectorized, in fact, vectorization is just a subset of all parallel optimizations. A vector compiler must handle all the normal scalar operations PLUS the vector operations. Also, unless the vectorization is syntax driven, a vectorizer (such as those from PSR and KAI) must be used to determine when it is safe to vectorize. > > 2. Our machine is Ardent Titan. Each cpu has 64-register length vector > registers. The problem is that this machine does not vectorize do loops > with division. How much harder is to implement division than the other > three operations, + - x? Is this a hardware limitation? I am not familiar with Ardent's box but of the 4 operations, division takes the most time. Our original architecture used a recipricol/multiply instead of divide since we could chain the operations and do it faster. My guess is that the gates required to do a fast divide were not available on the Titan so they just punted. But note that a compiler could vectorize the portion of the expression that doesn't do division for a potential speed up. For example, given the array operation A = (B+C+D) / E we can make 2 loops, one vector and one scalar: A = B+C+D A = A/E Depending on the vector loop overhead and the number of elements (and size of cache if we want to get technical), the partially vectorized loop can outperform an unvectorized version. Note that vector operations are not faster than scalar - a multiply takes the same in either case. But for scalar code we have to execute multiple instructions to set us up for the next operation while vector code has just one instruction fetch and no branches (if the array is less than the vector length). > > I'd rather have a vector machine than a multi-cpu parallel machine for > my application. I just want more people in computer industries to pay > more attention to vector machines. > > Hugh I'd rather have a multi-cpu system where each cpu has vector capabilities so my fine grain parallelism is done within a node but other non-vector- izable code could still benefit from multiple processors. | "Far better it is to dare mighty things, to Mark E. Kraieski | win glorious triumphs, even though checkered Encore Computer Corp., MS#404 | by failure, than to take rank with those Computer Performance Evaluation | poor spirits who neither enjoy much nor 6901 West Sunrise Blvd. | suffer much, because they live in the gray Ft. Lauderdale, FL 33313-4499 | twilight that knows not victory or defeat." (305) 587-2900 | -- Theodore Roosevelt