folta@tove.cs.umd.edu (Wayne Folta) (11/21/90)
I am not a hardware guy, but I have been asked to make some suggestions regarding a multi-processor system's design. This system will be pulling in data at a high rate, and it should allow for many processors to work on this data at once (each processor independent of the others). It also should use off-the-shelf components, and it must be fairly rugged and small (portable).

So... I have come up with a crazy idea, using a NuBus, say, to act as a broadcasting medium, broadcasting the incoming data to multiple CPUs. Could anyone please comment on the feasibility of doing this:

1. There is 36Mbyte/sec of incoming data. It will be read by a CPU, plugged into, say, a NuBus. It will write this data onto the bus.

2. All of the other processors on the bus will have their own memory, but all of the memories will have the same address space. Thus, one write to the bus would "copy" the data to N CPUs' memories at once(?). The CPUs will read/write only their own local memory, so the address collisions won't matter--no one will attempt to read across the bus.

* Is this possible?

* Is NuBus (or other off-the-shelf bus) fast enough for, say, 50Mbytes/sec of throughput?

* Would multiple memory boards at the same address create a mirrored-memory effect like I want?

* At these speeds, how many boards could I fit on the bus?

* Could the CPUs be reading their memories while the broadcasting CPU is writing to them? (I have heard of "dual-ported" memory. Would this do it? If so, does it come as fast as 25ns?)

* Would it be relatively easy for a CPU to disable its local memory, so that it ignores the bus temporarily? (This would allow more leisurely processing of some data.)

* What if I wanted to have each CPU's memory divided in two: one part shared as above, and one part with a unique address, for communication? Is this possible?

* Is there a much better way to do what I want to do?

Thanks for your help on a crazy idea.

-- Wayne Folta (folta@cs.umd.edu 128.8.128.8)
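The mirrored-memory idea in point 2 can be sketched in software as a toy model. This is only an illustration under stated assumptions: `Node`, `BroadcastBus`, and the `listening` flag are hypothetical names, not part of NuBus or any real bus, and the model says nothing about timing, arbitration, or electrical loading.

```python
class Node:
    """One processor board: local memory mapped at the shared address range."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.listening = True  # False models a board that ignores the bus for a while

    def snoop_write(self, addr, data):
        """Latch a bus write into local memory, unless this board has opted out."""
        if not self.listening:
            return
        if addr < 0 or addr + len(data) > len(self.memory):
            raise IndexError("write falls outside the shared window")
        self.memory[addr:addr + len(data)] = data


class BroadcastBus:
    """Hypothetical bus: every write is seen by every attached board."""

    def __init__(self, nodes):
        self.nodes = nodes

    def write(self, addr, data):
        for node in self.nodes:
            node.snoop_write(addr, data)


def demo():
    nodes = [Node(16) for _ in range(3)]
    bus = BroadcastBus(nodes)
    bus.write(0, b"\x01\x02\x03\x04")   # one write, copied into all three memories
    nodes[1].listening = False          # board 1 stops listening
    bus.write(4, b"\xaa\xbb")           # boards 0 and 2 get this, board 1 does not
    return [bytes(n.memory[:6]) for n in nodes]
```

Calling `demo()` returns the first six bytes of each board's memory. Boards 0 and 2 hold both writes, while board 1 holds only the first, which corresponds to the "disable its local memory" question in the list above.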
wangjw@usceast.cs.scarolina.edu (Jingwen Wang) (11/28/90)
Dear Mr. Folta,

Your intuition is correct. In fact, we built an 8-processor multiprocessor with a similar architecture in China using the TMS 320C25. The broadcast bus is a 16-bit parallel bus linking all the processors' communication memories. The difference is that in our system each processor can broadcast messages to all the others; it was designed to meet the communication requirements of continuous system simulation applications. Each processor uses a dual-port memory as the communication memory attached to the bus. We ran simulations of this system, and the results indicated very attractive performance compared with a shared global memory architecture. The system has a PC-AT computer as the front-end host, together with a graphics terminal for dynamic visual display.

Although your basic idea is wonderful, there is still the problem of meeting the time constraints of your system. An incoming data rate of 36Mbytes/sec can hardly be managed by even the fastest processors to date: a processor simply cannot execute that many instructions per second. Even if DMA transfers are used, it is still a headache. The design would not be trivial, since high data rates bring many reliability problems.

This is only for your reference. Hope it helps a little.

Jingwen Wang
Department of Electrical & Computer Engineering
University of South Carolina
Columbia, SC 29208
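The time-constraint concern can be made concrete with a little arithmetic. The sketch below is a rough budget only, and it assumes that "Mbyte" means 10^6 bytes. The bus width and the 10 MIPS processor figure are illustrative inputs chosen for the example, not numbers taken from either post.

```python
def per_transfer_ns(rate_bytes_per_s, bus_width_bytes):
    """Time available per bus transfer, in ns, to sustain the given data rate."""
    transfers_per_s = rate_bytes_per_s / bus_width_bytes
    return 1e9 / transfers_per_s


def instructions_per_byte(mips, rate_bytes_per_s):
    """Instructions a processor can spend on each incoming byte."""
    return mips * 1e6 / rate_bytes_per_s


RATE = 36e6  # incoming data rate from the original post, taken as 36 * 10^6 bytes/s

budget_16bit = per_transfer_ns(RATE, 2)   # 16-bit bus, 2 bytes per transfer
budget_32bit = per_transfer_ns(RATE, 4)   # 32-bit bus, 4 bytes per transfer
ipb = instructions_per_byte(10, RATE)     # hypothetical 10 MIPS processor
```

Under these assumptions a 32-bit bus has about 111 ns per transfer and a 16-bit bus about 56 ns. A 10 MIPS processor would have fewer than 0.3 instructions per incoming byte, so it could not touch every byte in software.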