mdr@reed.UUCP (Mike Rutenberg) (12/07/89)
Ok, I hate to harp on it, but I have recently seen lots of boasts in the non-technical Canadian media about Myrias and their super-duper fast computer.

Does anyone have *any* details of the architecture that they can talk about? Is Myrias simply a /dev/null for tax dollars (both Canadian and DARPA) or are there real machines with a real architecture that are being used for something real?

Mike
--
Mike Rutenberg    BITNET: mdr@reed.bitnet    UUCP: uunet!tektronix!reed!mdr
grunwald@foobar.colorado.edu (Dirk Grunwald) (12/08/89)
We have one here. I haven't used it, but I've seen speedup curves for it. At best, it does about 10 Mflops, because they're using 68020s with 68881s.

The Myrias architecture is based on parallel loops. Completely parallel. It's not a shared-memory machine.

Imagine you have a parent process with a page of memory. It spawns a child. The parent and child both touch the page, so the page is copied on write. We now have a Master (M), Parent (P) and Child (C) copy. When the child dies, we 'merge' the different pages. Semantics are:

  + parent or child touches a bit -> get their new bit
  + both parent & child touch the same bit -> get junk

Do this using M xor P xor C, for the entire page. It's not clear to me that they have any special hardware for this. Some say aye, some say nay. They do the merging in the background.

Aside from the above (which is a simple O/S hack), the box is basically a lot like an iPSC/2: circuit-switched networks to shuffle 4K pages around. One could wonder why they didn't just repackage the iPSC/2, since that would be faster.

If you have a parallel Fortran loop with no cross-iteration dependencies (=), this is a godsend, 'cause it's cheap parallelism. If you have *any* cross-iteration dependencies (< or >), then it sucks, because there is *no* (according to the meeting I went to) synchronization between processors.

For the certain applications it was designed for, it's a reasonable design (although not terribly fast).

Dirk Grunwald -- Univ. of Colorado at Boulder
(grunwald@foobar.colorado.edu)
(grunwald@boulder.colorado.edu)
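The merge rule described above (M xor P xor C over the whole page) can be tried out in a few lines. This is only a sketch of the rule as stated in the post, not Myrias code: the function name `merge_page` and the byte values are made up for illustration, and only the 4K page size comes from the post.

```python
PAGE_SIZE = 4096  # 4K pages, as mentioned in the post


def merge_page(master, parent, child):
    """Merge three copies of a page bytewise as M xor P xor C."""
    if not (len(master) == len(parent) == len(child)):
        raise ValueError("pages must be the same size")
    return bytes(m ^ p ^ c for m, p, c in zip(master, parent, child))


# Master copy: the page as it was when the child was spawned.
master = bytes(PAGE_SIZE)

# Disjoint writes: the parent touches byte 0, the child touches byte 1.
parent = bytearray(master)
child = bytearray(master)
parent[0] = 0x0F
child[1] = 0xF0
merged = merge_page(master, parent, child)  # both updates survive

# Conflicting writes: both touch byte 2 with different values.
parent[2] = 0xAA
child[2] = 0x55
conflict = merge_page(master, parent, child)  # byte 2 becomes 0xFF: junk
```

Where only one side changed a byte, the unchanged copy cancels against the master and the new value survives. Where both changed it, the result is neither value; if both happen to write the same value, the two writes cancel and the master's old byte comes back.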
ingoldsb@ctycal.UUCP (Terry Ingoldsby) (12/13/89)
In article <13683@reed.UUCP>, mdr@reed.UUCP (Mike Rutenberg) writes:
> Does anyone have *any* details of the architecture that they can talk about?
> Is Myrias simply a /dev/null for tax dollars (both Canadian and DARPA)
> or are there real machines with a real architecture that are being
> used for something real?

Myrias is based in Edmonton, Alberta. They are real, and have a real (as in working) multiprocessor machine. I believe the machine uses 68030 processors. It seems to me that they have several US DOD contracts.
--
Terry Ingoldsby                ctycal!ingoldsb@calgary.UUCP
Land Information Systems                  or
The City of Calgary            ...{alberta,ubc-cs,utai}!calgary!ctycal!ingoldsb
serafini@amelia.nas.nasa.gov (David B. Serafini) (12/14/89)
In article <515@ctycal.UUCP> ingoldsb@ctycal.UUCP (Terry Ingoldsby) writes:
>In article <13683@reed.UUCP>, mdr@reed.UUCP (Mike Rutenberg) writes:
>> Does anyone have *any* details of the architecture that they can talk about?
>
>Myrias is based in Edmonton Alberta. They are real, and have a real
>(as in working) multiprocessor machine. I believe the machine uses
>68030 processors.
>--
> Terry Ingoldsby ctycal!ingoldsb@calgary.UUCP
> Land Information Systems or
> The City of Calgary ...{alberta,ubc-cs,utai}!calgary!ctycal!ingoldsb

The processors are 68020s. They are going to the 88000 and hope to ship mid '90.

The machine is hierarchical. There are 4 processors on a board, with a full interconnect between these. There are 16 boards in a card cage. There are two busses that interconnect the boards. Any board can use either bus. There are 5 serial com lines coming out of each cage, so they can be interconnected. I think they've built a 512 proc. machine, but I might be wrong. I believe the com lines are FDDI using the AMD chip set. The number of lines is determined by how many chips fit on a board.

It's a real machine. They've sold some. The software is more important than the hardware since they're trying to build a programming paradigm that will be both easy to use and easy to port. They claim that converting old code takes hours or days instead of months. Basically anything that can be vectorized on a Cray can be parallelized on the Myrias. They downplay the issues of interconnect performance (latency and bandwidth) more than they should (IMHO), but for some applications it has great potential for scalability. Like the i860 Intel iPSC, the 88000 Myrias will have performance like a full-up Y-MP, if you can get at it.

<dbs>
David B. Serafini                    serafini@ralph.arc.nasa.gov
Rose Engineering and Research        @NASA/Ames Research Center
MS 227-6                             Moffett Field, CA 94035
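The hierarchy above works out to 4 x 16 = 64 processors per cage, so a 512-processor machine would be 8 cages. A small sketch of that arithmetic follows. The flat numbering of processors (cage first, then board, then slot) is an assumption made here for illustration; the posts do not say how Myrias actually numbers its PEs.

```python
from typing import Tuple

PROCS_PER_BOARD = 4    # from the post
BOARDS_PER_CAGE = 16   # from the post
PROCS_PER_CAGE = PROCS_PER_BOARD * BOARDS_PER_CAGE  # 64


def locate(pe: int) -> Tuple[int, int, int]:
    """Map a flat PE index to (cage, board, slot).

    The numbering scheme is hypothetical, not documented Myrias behaviour.
    """
    cage, within_cage = divmod(pe, PROCS_PER_CAGE)
    board, slot = divmod(within_cage, PROCS_PER_BOARD)
    return cage, board, slot


def cages_needed(total_pes: int) -> int:
    """Smallest number of cages that holds total_pes processors."""
    return -(-total_pes // PROCS_PER_CAGE)  # ceiling division


cages_for_512 = cages_needed(512)  # 8 cages
last_in_first_cage = locate(63)    # (0, 15, 3)
first_in_second_cage = locate(64)  # (1, 0, 0)
```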
grunwald@foobar.colorado.edu (Dirk Grunwald) (12/15/89)
DBS> the hardware since they're trying to build a programming paradigm that will be
DBS> both easy to use and easy to port. They claim that converting old code takes
DBS> hours or days instead of months. Basically anything that can be vectorized
DBS> on a Cray can be parallelized on the Myrias. They downplay the issues of

While it may be possible, I don't think it's practical. According to the talk Myrias gave here (we have one somewhere, see earlier note) there is no synchronization possible. Thus, you can't cheaply parallelize:

    Do I = 2, N
      A(I) = B(I) * C(I)
      D(I) = A(I-1) * C(I)
    end

On the Cray, this would be vectorized:

    A(2:N) = B(2:N) * C(2:N)
    D(2:N) = A(1:N-1) * C(2:N)

On a machine with synchronization, you could say:

    Doall I = 2, N
      A(I) = B(I) * C(I)
      POST(A,I)
      WAIT(A,I-1)
      D(I) = A(I-1) * C(I)
    end

or

    Doall I = 2, N
      A(I) = B(I) * C(I)
    end
    Doall I = 2, N
      D(I) = A(I-1) * C(I)
    end

The Myrias forces the latter, because of no synchronization. You could optimize this a little:

    S = (N-1)/Processors
    Doall IP = 1, Processors
      LO = 2 + (IP-1)*S
      Do I = LO, LO + S - 1
        A(I) = B(I) * C(I)
        if (I != LO) D(I) = A(I-1) * C(I)
      end
    end
    Doall IP = 1, Processors
      LO = 2 + (IP-1)*S
      D(LO) = A(LO-1) * C(LO)
    end

(more or less, and assuming Processors divides N-1 evenly: you just strip-mine the loop based on the number of processors, execute all first statements, and only the second statements that are local to your strip, merge pages and then assign all cross-process iterations)

But you'll need to force a page merge between the two doall loops (I think they call them 'pardo' or something). It's not clear to me that this is going to be faster than e.g. a CM-2 or a Cray.

For loops involving no cross-iteration dependence, however, it should work well. I believe this is what they had intended, by the way, because the designers (a physicist?) had several problems with no cross-iteration dependence.
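The two-pass and strip-mined schemes above can be checked against the sequential loop with a small simulation. This is a sketch under stated assumptions, not Myrias code. A "doall" is modelled by running its iterations in shuffled order, and the page merge between passes is modelled by simply finishing one Python loop before starting the next. The function names, the `rng` parameter and the test data are invented for illustration. Arrays are 1-based to match the Fortran, so index 0 is unused.

```python
import random


def sequential(a, b, c, n):
    """The original loop, iterations in order."""
    a, d = list(a), [0.0] * (n + 1)
    for i in range(2, n + 1):
        a[i] = b[i] * c[i]
        d[i] = a[i - 1] * c[i]
    return a, d


def two_phase(a, b, c, n, rng):
    """Two doall loops with a merge point between them."""
    a, d = list(a), [0.0] * (n + 1)
    order = list(range(2, n + 1))
    rng.shuffle(order)          # iterations may run in any order
    for i in order:
        a[i] = b[i] * c[i]
    # merge point: every A(I) is now visible to every task
    rng.shuffle(order)
    for i in order:
        d[i] = a[i - 1] * c[i]
    return a, d


def strip_mined(a, b, c, n, processors, rng):
    """One strip per processor; strip boundaries are fixed up after the merge."""
    a, d = list(a), [0.0] * (n + 1)
    size = -(-(n - 1) // processors)  # ceiling of (N-1)/processors
    strips = [(lo, min(lo + size - 1, n)) for lo in range(2, n + 1, size)]
    rng.shuffle(strips)         # strips may run in any order
    for lo, hi in strips:
        for i in range(lo, hi + 1):
            a[i] = b[i] * c[i]
            if i != lo:         # A(I-1) was computed inside this strip
                d[i] = a[i - 1] * c[i]
    # merge point, then the iterations that cross a strip boundary
    for lo, _ in strips:
        d[lo] = a[lo - 1] * c[lo]
    return a, d


n = 10
b = [0.0] + [float(i) for i in range(1, n + 1)]
c = [0.0] + [0.5 * i for i in range(1, n + 1)]
a0 = [0.0] * (n + 1)
a0[1] = 7.0                     # A(1) is an input to D(2)

expected = sequential(a0, b, c, n)
same_two_phase = two_phase(a0, b, c, n, random.Random(1)) == expected
same_strips = strip_mined(a0, b, c, n, 3, random.Random(2)) == expected
```

Within a strip, each task only reads values of A that it wrote itself. The single read that crosses a strip boundary is deferred until after the merge, which is why no POST/WAIT is needed.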
cmt@myrias.com (Chris Thomson) (12/18/89)
In article <13683@reed.UUCP> mdr@reed.UUCP (Mike Rutenberg) writes:
>Ok, I hate to harp on it, ... Does anyone have *any* details of the
>architecture that they can talk about? Is Myrias simply a /dev/null ...
>or are there real machines with a real architecture that are being
>used for something real?

Yes, we are real, and so is our system. Architectural info to follow in subsequent postings.

We have 8 systems installed:
- Myrias, Edmonton, Canada: 768 PE's (varies)
- Alberta Research Council, Edmonton, Canada: 64 PE's
- Department of Defense, Maryland, USA: 128 PE's
- Department of National Defense, Ottawa, Canada: 128 PE's
- University of Calgary, Calgary, Canada: 64 PE's
- Colorado Center for Applied Parallel Processing, Boulder, USA: 64 PE's
- Air Force Weapons Lab, Albuquerque, USA: 64 PE's
The seven offsite systems have been shipped since April 1989.

It's not really for me to say, but my impression of what is done with these systems is that it is "real". Perhaps some of our users will comment. Application areas I know of include seismic, chemistry, biochemistry, physics, ray tracing, marine biology, and others.
--
Chris Thomson, Myrias Research Corporation     uunet!myrias!cmt or cmt@myrias.com
900 10611 98 Ave, Edmonton Alberta, Canada     Tel 403-428-1616 Fax 403-421-8979
brb@myrias.com (Brian Baird) (12/19/89)
In article <629968724.4593@myrias.com> cmt@myrias.com (Chris Thomson) writes:
Chris> We have 8 systems installed:
Chris> - Myrias, Edmonton, Canada: 768 PE's (varies)
Chris> - Alberta Research Council, Edmonton, Canada: 64 PE's
Chris> - Department of Defense, Maryland, USA: 128 PE's
Chris> - Department of National Defense, Ottawa, Canada: 128 PE's
Chris> - University of Calgary, Calgary, Canada: 64 PE's
Chris> - Colorado Center for Applied Parallel Processing, Boulder, USA: 64 PE's
Chris> - Air Force Weapons Lab, Albuquerque, USA: 64 PE's
Chris> The seven offsite systems have been shipped since April 1989.
The eighth site (for those of you keeping count) is
- University of Alberta, Edmonton, Canada: 64 processors
--
Brian Baird brb@myrias.com
Myrias Research, Edmonton {uunet,alberta}!myrias!brb