dan@rna.UUCP (Dan Ts'o) (09/27/85)
We would like to hear from people who know about or who have used multiple 680XX's on a bus. We are tentatively considering a ~10 processor machine using 68020's on VME for a particular real-time data collection, analysis and display application.

I know that some people at Calgary have a similar project, using the Harmony OS. Anybody know more about that project?

Is such a machine really easy to build with off-the-shelf boards? I would assume that each processor's instruction space should reside in local memory and preferably most of its data requirements as well. What is the practical VME memory bandwidth of a typical VME system using standard memory boards and backplanes?

Thanks.

Cheers,
Dan Ts'o
Dept. Neurobiology
Rockefeller Univ.
1230 York Ave.
NY, NY 10021
212-570-7671
...cmcl2!rna!dan
rna!dan@cmcl2.arpa
cleary@calgary.UUCP (John Cleary) (09/30/85)
> We would like to hear from people who know about or who have
> used multiple 680XX's on a bus. ..
> I know that some people at Calgary have a similar project,
> using the Harmony OS. Anybody know more about that project ?

I am at Calgary, and yes, we have a multiprocessor 68000 system going, called the Calgary Mesh Machine (CM^2). It is a mesh-connected torus with each machine connected to its 4 nearest neighbours. Each machine has an independent clock. Communication between neighbours is via a 4K block of dual ported memory. This allows very high speed transfer without interrupting the destination processor until the entire message is there. Currently we use the shared memory as a fast message passing device but are thinking hard about how to use it for a more flexible concurrent Prolog implementation.

We use a home grown kernel with Thoth-like message passing between processes -- this is the JADE system inherited from a distributed systems and monitoring project here at Calgary. (It knows about Unix and can use Unix I/O via a host.) Current main applications (still being worked on) are ray tracing and timewarp based simulation.

> Is such a machine really easy to build with off-the-shelf
> boards ? I would assume that each processor's instruction space should
> reside in local memory and preferably most of its data requirements as
> well. What is the practical VME memory bandwidth of a typical VME system
> using standard memory boards and backplanes ?

It was easy to build. Current component costs are approx $800/board. We have a 3x3 prototype array almost going and have had a 2x2 array going for some time. We designed the boards and did the artwork ourselves. Off the shelf boards are expensive and tend to have a lot of things that are irrelevant in a very simple environment such as this.
The current hardware configuration per board is:

    68010 - 8MHz clock
    512KB local RAM
    2x4K dual ported RAM plus two off board connectors to RAM on other boards
    2xRS232C connectors
    timer and interrupts from each of four neighbours.

One of the boards has a 1Mb/sec net connection (Omninet). If I ever get money to build the mark II it will have more memory, floating point support, and an ethernet connection.

John G. Cleary,
Dept. Computer Science,
The University of Calgary,
2500 University Dr., N.W.
Calgary, Alberta, CANADA T2N 1N4.
Ph. (403)284-6015
Usenet: ...{ubc-vision,ihnp4}!alberta!calgary!cleary
CRNET (Canadian Research Net): cleary@calgary
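A minimal sketch of the dual-ported mailbox idea described in this post, modelled in Python rather than 68010 code. Only the 4K block size and the "interrupt after the whole message is there" behaviour come from the post. The `Mailbox` class, the 2-byte length prefix, the `full` flag, and the `notify` callback (standing in for the neighbour interrupt) are assumptions made for illustration, not the CM^2 or JADE design.

```python
# Hypothetical model of one 4K dual-ported mailbox between two neighbours.
# Assumed layout: 2-byte big-endian length, followed by the payload.
MAILBOX_SIZE = 4096
HEADER = 2


class Mailbox:
    def __init__(self, notify):
        self.mem = bytearray(MAILBOX_SIZE)  # stands in for the shared RAM
        self.full = False                   # True only once a message is complete
        self.notify = notify                # stands in for the neighbour interrupt

    def send(self, payload: bytes) -> bool:
        """Copy a whole message in, then signal the receiver once."""
        if self.full or len(payload) > MAILBOX_SIZE - HEADER:
            return False                    # previous message not drained, or too big
        self.mem[HEADER:HEADER + len(payload)] = payload
        self.mem[0:HEADER] = len(payload).to_bytes(HEADER, "big")
        self.full = True
        self.notify()                       # receiver is disturbed only now
        return True

    def receive(self) -> bytes:
        """Read the complete message and free the mailbox."""
        if not self.full:
            raise RuntimeError("mailbox is empty")
        length = int.from_bytes(self.mem[0:HEADER], "big")
        data = bytes(self.mem[HEADER:HEADER + length])
        self.full = False
        return data
```

The receiver does no work while the sender is copying; in this model it sees a single notification per message, which is the property the post attributes to the dual ported memory.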
baba@spar.UUCP (Baba ROM DOS) (09/30/85)
> We would like to hear from people who know about or who have
> used multiple 680XX's on a bus. We are tentatively considering a ~10
> processor machine using 68020's on VME for a particular real-time
> data collection, analysis and display application.
>
> Cheers,
> Dan Ts'o

The August 1, 1985 issue of Computer Design contains a rather interesting article on metastability problems in multiprocessor VME systems. It appears to be a little trickier than expected to design VME bus and memory arbitration logic that is not vulnerable to metastability problems in synchronization. In particular, some folks at CMU's robotics lab discovered that "as few as two" 8-MHz Motorola VM02 68000 boards would lock up "within 4 to 10 minutes".

Baba ROM DOS
jbn@wdl1.UUCP (10/01/85)
We have a system with several bus masters on one VMEbus, and ran into the following problems:

1) The Omnibyte CPU card incorrectly performed its slot 1 bus arbitration function, due to a problem with the Motorola bus arbiter chip. Omnibyte replaced the chip with a small daughter board with two chips, which fixed the problem.

2) The Ironics RAM card was discovered to raise DTACK before it was done with the data; as long as the cycle came from a M68000 this was OK, because the M68000 kept the lines up just long enough for it to work, but our own DMA peripheral didn't and the RAM board would write bad parity into memory. No fix; we've switched to DY-4 RAM cards.

Moral: ask your board vendor ``Have you run this thing with multiple bus masters?'' Both these cards work beautifully until you put on the second bus master.

John Nagle
witters@fluke.UUCP (John Witters) (10/01/85)
> We would like to hear from people who know about or who have
> used multiple 680XX's on a bus. We are tentatively considering a ~10
> processor machine using 68020's on VME for a particular real-time
> data collection, analysis and display application.

I'd suggest reading the August 1st 1985 issue of Computer Design before you rush off and do this. The article of interest is titled "Metastability haunts VME bus and Multibus II system designers" on page 29. I'll quote the relevant section below. I haven't looked too closely at this, but it seems to me that the interrupt daisy chain scheme should suffer from the same problem.

If you can't overcome the metastability problems, maybe you could loosely couple the processors using the VMSbus instead of the VMEbus. The VMSbus is a synchronous high speed serial bus, so by definition you shouldn't have metastability problems. Another solution is to use a different bus request level for each board in your system, but this limits one to only four bus masters since the VMEbus has only four bus request levels.

Multiprocessor system fails

An early victim of metastability in a VMEbus product was John Willis, head of the Rapid Bus multiprocessor project at Carnegie Mellon's Robotics Laboratory (Pittsburgh, PA). Willis had planned to use Motorola's VM02 cards as 68000-based CPU nodes in a multiprocessor design, but eventually discarded the VM02 parts. According to Willis, a VMEbus-based system using as few as two 8-MHz VM02 cards would lock up within 4 to 10 minutes. The Motorola specification states that up to 16 VM02 boards can operate reliably in a multimaster configuration.

Willis was able to isolate the failure to the VM02 card's dual-port arbiter, bus request arbiter, and bus requester. His biggest source of trouble was the dual-port arbiter, which controls access to each VM02 card's dual-port memory.
The dual-port arbiter decides whether the VM02's onboard 68000 processor will access memory, or whether a processor on another board will access the memory via the VMEbus. Metastability problems arose in the arbiter's synchronization device, which was responsible for synchronizing the two bus-request sources with its own clock. (The arbiter's clock is synchronous with the 68000's clock.) Because the arbiter makes its arbitration decisions in about 20 ns, the output of its synchronizer has only 20 ns to settle to a stable state, but needs at least 50 ns to ensure reliable operation.

The VM02's bus-grant arbiter proved unreliable for the same reason -- it was trying to make dual-port arbitration decisions in only 20 ns.

The bus requester's function is to issue bus requests and sample the bus-grant signal. Each master on the VMEbus has a requester, and the requesters are arranged in a daisy chain fashion, such that the bus-grant signal issued by the arbiter passes serially through each requester. Each requester decides whether to intercept the bus-grant signal and access the bus, or pass the signal on to the next requester on the daisy chain. If the master associated with a particular requester has a request pending, the requester will intercept the bus-grant signal. If the master has made no request, it will pass the grant signal along to the next requester.

This scheme is unreliable, Willis claims, because of the asynchronous relationships between the bus-grant and bus-request signals. He points out that the device in each requester that synchronizes bus grants and bus requests is not allowed enough time to settle. If the rising edge of a request signal at one synchronizer's input coincides with the rising edge of a grant signal at the other synchronizer's input, the requester will behave unpredictably, and may grant the bus to two masters at once.
Support for Willis' argument comes from Dave Barr of Indocomp (Drayton Plains, MI), the designer of that company's line of VMEbus-based multiprocessor systems. During a test in which two masters on the VMEbus periodically wrote data to a global memory (also on the VMEbus), Barr found that the bus collisions between requesting masters caused incorrect data to be written to memory.

To avoid bus-collision problems on the VMEbus, Barr scrapped the VMEbus' arbitration strategy in favor of a synchronous arbitration scheme which uses a single 4-MHz clock to coordinate bus requests and bus grants. This solution places an upper limit on arbitration speed, but guarantees system reliability. Barr could have modified the VMEbus' arbitration logic, but admits that his lack of familiarity with the problem of metastability led him to avoid it altogether. A potentially simple fix was not implemented because of a lack of understanding.

Ken Marrin
Senior Editor

--
John Witters
John Fluke Mfg. Co. Inc.
P.O.B. C9090 M/S 243F
Everett, Washington 98206
(206) 356-5274
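The daisy-chained requesters described in the quoted article can be modelled in a few lines of Python. This is a sketch only: `propagate_grant` and its `glitch_slot` parameter are hypothetical names, and the "glitch" is a crude stand-in for a requester whose synchronizer has not settled, so that it takes the bus and also passes the grant on. It does not model real signal timing.

```python
def propagate_grant(pending, glitch_slot=None):
    """Walk the bus-grant signal down the daisy chain, slot 0 first.

    pending[i] is True if master i has a bus request outstanding.
    glitch_slot (hypothetical knob) marks one requester as having resolved
    inconsistently: it takes the bus AND passes the grant downstream.
    Returns the list of masters that believe they own the bus.
    """
    owners = []
    for slot, wants_bus in enumerate(pending):
        if slot == glitch_slot:
            owners.append(slot)   # this master starts a bus cycle...
            continue              # ...but the grant still travels on
        if wants_bus:
            owners.append(slot)   # normal case: intercept the grant
            break                 # the grant stops at this requester
    return owners


# Well-behaved chain: the first requesting master gets the bus, nobody else.
print(propagate_grant([False, True, True]))
# One unsettled requester is enough for two masters to drive the bus.
print(propagate_grant([True, True], glitch_slot=0))
```

In the well-behaved case exactly one master ends up in the list. The second call is the failure the article describes, with two masters granted the bus at once.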
jack@boring.UUCP (10/04/85)
I think that putting multiple CPU's on a VME bus isn't going to do you a lot of good, unless you either have a lot of local memory, or a very large cache.

Also, the bus master arbitration of the VME bus is very poor, I think. Although it is a nice and quite general scheme, performance looks awful. It's OK for a DMA device wanting to do a block transfer, but if you are in a system with 4-8 CPU's and you're all trying to execute off the bus, you'll probably spend most of your time arbitrating.

At the HTS"A", we're working on a project to put multiple 32016s (with a lot of cache) on the same VME bus. What we did is abuse one of the BREQ/BG pairs for bus arbitration: if you want to do a bus request, you wait until you get the BG bit. Then, you do your request, and pass the BG on to your neighbour. If you don't want to do anything, you pass the bit on immediately. The CPU that's having the BG at the moment pulls down BREQ, so as soon as BG 'falls out' of the cardcage, you notice it and generate a new BG at the beginning of the bus.

This gives a very efficient and fair scheme if you have a reasonable number of bus masters, who are expected to do large numbers of small transfers, instead of small numbers of large ones. Also, you can still use ordinary VME boards.

--
Jack Jansen, jack@mcvax.UUCP
The shell is my oyster.
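A rough Python model of the circulating-BG scheme described in this post, assuming each master does at most one small transfer per visit of the grant. The function name `circulate_grant` and the representation of work as a per-master count of outstanding transfers are illustrative assumptions. BREQ handling is reduced to "start a new pass at slot 0 once the grant leaves the last slot".

```python
def circulate_grant(outstanding, passes):
    """Simulate the grant bit travelling down the cardcage `passes` times.

    outstanding[i] is the number of small transfers master i still wants.
    Returns the order in which masters were served.
    """
    remaining = list(outstanding)       # do not modify the caller's list
    served = []
    for _ in range(passes):
        # A fresh BG is generated at the beginning of the bus.
        for slot in range(len(remaining)):
            if remaining[slot] > 0:
                remaining[slot] -= 1    # one transfer, then pass BG on
                served.append(slot)
            # An idle master passes the grant on immediately.
        # BG has 'fallen out' of the cardcage; the next pass regenerates it.
    return served


print(circulate_grant([2, 0, 3], passes=3))  # [0, 2, 0, 2, 2]
```

Busy masters alternate rather than one of them draining its whole queue first, which is the fairness property claimed for many small transfers.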
fred@mot.UUCP (Fred Christiansen) (10/04/85)
> The August 1, 1985 issue of Computer Design contains a rather interesting
> article on metastability problems in multiprocessor VME systems. It appears
> to be a little trickier than expected to design VME bus and memory arbitration
> logic that is not vulnerable to metastability problems in synchronization. In
> particular, some folks at CMU's robotics lab discovered that "as few as two"
> 8-MHz Motorola VM02 68000 boards would lock up "within 4 to 10 minutes".
>
> Baba ROM DOS

I read the article and found the line of reasoning a little weak in places. VME was being blamed, yet the example cited used VM02 boards, which are Versabus based, not VMEbus based. So, I dropped in on a hardware sharpie and asked what the scoop really was. The gist of his statement was that the problem was one found in all the popular busses, although more particularly the asynch busses. HW designers are aware of the problem and have been successful in working around it.

--
<< Generic disclaimer >>
Fred Christiansen ("Canajun, eh?") @ Motorola Microsystems, Tempe, AZ
UUCP: {seismo!terak, trwrb!flkvax, utzoo!mnetor, ihnp4!btlunix}!mot!fred
ARPA: oakhill!mot!fred@ut-sally.ARPA
Telephone: +1 602-438-3472
jxw@fas.ri.cmu.edu.ARPA (John Willis) (10/05/85)
The reference in Computer Design, August 1 to VMEbus problems is (loosely) based on problems we experienced several years ago using Motorola's VM02 cards on a Versabus. We bought the first and second cards off the production line for use in prototyping a very early version of our RapidBus multiprocessors. Despite these cards being recommended for use in a multiple processor configuration, I do not believe that Motorola Microsystems actually tried to use two in the same backplane for many months after their release.

The designers had ignored the asynchronous relationship between local and Versabus requests to the dual port arbiter, resulting in frequent maloperation (4 to 20 minutes under heavy load). It required nearly two years, and several very pointed questions, to get Motorola to produce an ECO. As far as I know, later versions of the card increased the arbiter resolution time up to fifty nanoseconds, dramatically improving reliability. We have since switched to processor cards from IBM Instruments (CS-9000) and BioResearch for later and much larger prototypes.

In talking with Ken Marrin for the article, we tried to point out the importance of recognizing and intelligently handling each of the many asynchronous interfaces designed into both VMEbus and Versabus, with Motorola as one example. The VM02 is only one of several cards we have run into with problems correctly handling the asynchronous interface in MSI.

It is disappointing that Motorola has not coupled their support for asynchronous bus design with responsible literature helping designers to handle asynchronous interface designs correctly. I believe that many of their claimed performance figures for VMEbus ignore synchronizer resolution delays, resulting in either unreliable or slower systems.

VMEbus multiprocessors can be built reliably, but the bus specifications don't tell you everything you need to know.
For further information, I urge you to read some of the good papers coming out of the Washington University Asynchronous Systems Group (T. Chaney et al), or a very practical article by Stoll at Intel in VLSI Design several years ago.

-John
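To see why lengthening the arbiter resolution time from 20 ns to 50 ns can matter so much, here is a sketch using the commonly quoted synchronizer model MTBF = exp(t_r / tau) / (T_w * f_clock * f_data). The formula is the standard textbook one, not something stated in this thread. The values of `tau`, `t_window` and `f_data` below are placeholders chosen for illustration, not measurements of any part on the VM02. Only the 8 MHz clock and the 20 ns and 50 ns figures come from the posts above.

```python
import math


def synchronizer_mtbf(t_resolve, tau, t_window, f_clock, f_data):
    """Mean time between synchronizer failures, in seconds.

    t_resolve: time allowed for the flip-flop output to settle (s)
    tau:       device resolution time constant (s), assumed value
    t_window:  metastability window of the device (s), assumed value
    f_clock:   sampling clock frequency (Hz)
    f_data:    rate of asynchronous input transitions (Hz), assumed value
    """
    return math.exp(t_resolve / tau) / (t_window * f_clock * f_data)


# Placeholder device and traffic figures (assumptions, not VM02 data).
params = dict(tau=2.5e-9, t_window=1e-9, f_clock=8e6, f_data=1e6)

mtbf_20ns = synchronizer_mtbf(20e-9, **params)
mtbf_50ns = synchronizer_mtbf(50e-9, **params)

# The improvement factor depends only on tau: exp((50 ns - 20 ns) / tau).
print(f"MTBF at 20 ns: {mtbf_20ns:.3g} s")
print(f"MTBF at 50 ns: {mtbf_50ns:.3g} s")
print(f"improvement:   {mtbf_50ns / mtbf_20ns:.3g}x")
```

Because the settling time sits in the exponent, a modest increase in resolution time buys orders of magnitude in reliability, at the cost of slower arbitration. That is the trade-off behind the remark that performance figures ignoring synchronizer delays give either unreliable or slower systems.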