andrew@alice.UUCP (Andrew Hume) (02/01/90)
o'dell is not kidding. some parts of the long distance network use a crossbar switch to switch fast lines (T3 = 45Mbps, i think). the biggest are approx 1Kx1K, but there are only 2 or 3 of these: they are expensive.
panek@hp-and.HP.COM (Jon Panek) (02/02/90)
I think there might be an advantage in taking the inherently simpler approach proposed in the basenote. So far, most of the responses have quickly extrapolated to the NxM cross-bar architecture. While this is obviously the most general-purpose and most flexible one, it also incurs the highest implementation cost.

By having a single linear bus with CPUs and Memory distributed along it in sections which can either be connected to the segments on either side of it or not, the implementation aspect becomes much more tractable. One obvious result of this is that the scheduler must become much smarter: assigning tightly-coupled tasks to physically proximate CPUs.

Rather than having single on/off switches to connect segments of busses, perhaps a dedicated limited-function CPU could also straddle the boundaries and serve as a message-transmitter/receiver across otherwise disconnected segments of the bus. It would grab bus cycles during dead time of the main CPUs. In this way, any CPU could talk with any other CPU, and the only penalty would be longer latency for physically disparate boxes.

Any Master's candidates looking for a research topic???

Jon P
panek@hp-and.HP.COM
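P.S. A rough sketch of the kind of scheduler policy I have in mind, with all names and numbers invented for illustration: pack each group of tightly-coupled tasks onto a contiguous run of CPUs, then open the boundary switch behind each group so that every group gets its own private bus segment.

/*
 * Sketch only -- invented names and numbers.  NCPU processors sit on one
 * linear bus with a switch between each adjacent pair.  Communicating
 * task groups are packed onto contiguous CPUs, and the switch at each
 * group boundary is opened so each group gets a private segment.
 */
#include <stdio.h>

#define NCPU 8

int main(void)
{
    int group_size[] = { 3, 2, 3 };     /* three tightly-coupled task groups */
    int ngroups = 3;
    int switch_closed[NCPU - 1];        /* 1 = adjacent segments joined */
    int cpu = 0, g, i;

    for (i = 0; i < NCPU - 1; i++)
        switch_closed[i] = 1;           /* start with one fully connected bus */

    for (g = 0; g < ngroups; g++) {
        printf("group %d -> CPUs %d..%d\n", g, cpu, cpu + group_size[g] - 1);
        cpu += group_size[g];
        if (cpu < NCPU)
            switch_closed[cpu - 1] = 0; /* open the switch at the group boundary */
    }

    for (i = 0; i < NCPU - 1; i++)
        printf("switch %d: %s\n", i, switch_closed[i] ? "closed" : "open");
    return 0;
}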
mo@flash.bellcore.com (Michael O'Dell) (02/02/90)
It has been brought to my attention that because of a previous posting, some folks have concluded that I am now affiliated with Bellcore. That is NOT the case beyond a courtesy most graciously extended to me by the kind folks at Bellcore. As you can see by the line above, Organization is something I'm seldom accused of....

	-Mike O'Dell
-----------------------------
"I can barely speak for myself, much less anyone else..."
rex@mips.COM (Rex Di Bona) (02/03/90)
In article <6960003@hp-and.HP.COM> panek@hp-and.HP.COM (Jon Panek) writes:
>I think there might be an advantage in taking the inherently simpler
>approach proposed in the basenote. So far, most of the responses have
>quickly extrapolated to the NxM cross-bar architecture. While this is
>obviously the most general-purpose and most flexible one, it also incurs
>the highest implementation cost.

Quite true. There are other ways of connecting, such as the perfect shuffle or its topological equivalents.

>By having a single linear bus with CPUs and Memory distributed along it
>in sections which can either be connected to the segments on either side
>of it or not, the implementation aspect becomes much more tractable. One
>obvious result of this is that the scheduler must become much smarter:
>assigning tightly-coupled tasks to physically proximate CPUs.

I have been working on a similar system (not here, but for my PhD at The University of Sydney) and it is possible, though there are some problems (of course :-). If you are careful, the bus can be reduced to a single combinatorial circuit, which is really nice.

>Rather than having single on/off switches to connect segments of busses,
>perhaps a dedicated limited-function CPU could also straddle the boundaries
>and serve as a message-transmitter/receiver across otherwise disconnected
>segments of the bus. It would grab bus cycles during dead time of the
>main CPUs. In this way, any CPU could talk with any other CPU, and the
>only penalty would be longer latency for physically disparate boxes.

In this case, why not try to improve the bus own/release times instead, so that a CPU can talk to others by just grabbing the required segment(s) of the bus? If you have this limited-function CPU do store-and-forward, you end up either with "async cycles", which raises all the usual problems of store-and-forward networks (acknowledgments, lost signals, etc., etc.; see any networking text for a good list), or with long (and I mean long) delays in completing a cycle. In any case, you will eventually want to make these interconnect CPUs as powerful as the real CPUs (why waste that board/system space? "we can just run a small async job" is usually the first argument raised), and you will end up with either a transputer array/hypercube (only CPUs talking) or (and this one IS interesting) a network of nodes (maybe hypercubed), with each node being similar in design to a Sequent-type multi-CPU backplaned machine.

>Any Master's candidates looking for a research topic???
>
>Jon P
>panek@hp-and.HP.COM

----
DISCLAIMER: this article concerns work that I have done at The University of Sydney, Australia. It does NOT refer to any work that I am doing at MIPS, and should not be taken as an indication that MIPS is either involved, or not involved, in this area. (I just wanted to make this clear).

Rex.
--
Rex di Bona                          Penguin Lust is NOT immoral!
rex@mips.com                         apply STD disclaimers here.
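P.S. To make "just grabbing the required segment(s)" concrete, here is a toy sketch (invented names, not my actual simulator): a CPU acquires the run of segments between itself and its target, performs the cycle, and releases them.

/*
 * Toy sketch, invented names -- not my actual simulator.  A CPU owns the
 * run of bus segments between itself and the target CPU for the duration
 * of one cycle, then releases them.
 */
#include <stdio.h>

#define NSEG 7                          /* switches between 8 CPUs */

static int seg_busy[NSEG];              /* 1 = segment currently owned */

/* try to own every segment between CPU 'from' and CPU 'to' */
static int grab_segments(int from, int to)
{
    int lo = from < to ? from : to;
    int hi = from < to ? to : from;
    int i;

    for (i = lo; i < hi; i++)
        if (seg_busy[i])
            return 0;                   /* contention: caller retries later */
    for (i = lo; i < hi; i++)
        seg_busy[i] = 1;
    return 1;
}

static void release_segments(int from, int to)
{
    int lo = from < to ? from : to;
    int hi = from < to ? to : from;
    int i;

    for (i = lo; i < hi; i++)
        seg_busy[i] = 0;
}

int main(void)
{
    if (grab_segments(1, 4)) {
        printf("CPU 1 owns segments 1..3 and talks directly to CPU 4\n");
        release_segments(1, 4);
    }
    return 0;
}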
borrill@bus.Sun.COM (Paul Borrill) (02/06/90)
In article <1990Jan30.174807.14657@ncsuvx.ncsu.edu> aras@ecerl3.UUCP () writes:
>Has anyone in the group run across articles, research, etc, on
>partitioning of single or multiple buses to create independent bus
>segments? I am planning to work on this, to partition the buses on the
>fly, reflecting changes in the locality of data exchanges among the
>processes.
>
>The idea is this:
>
>Given N processors all attached to a single bus. If several processes
>on processors have a high percentage of data communication among
>themselves (among these processors) why not:
>
>   - assign these processes to processors that are physically
>adjacent on the bus
>   - partition the bus so that it turns into several bus "segments",
>each independent of the others
>   - for global communication, use a second bus, or, if only
>segment-to-segment communication is necessary, combine two segments on
>the fly.
>
>Responses will be appreciated.
>
>Caglan M. Aras  aras@eceris.ncsu.edu | Experts know more and
>N. C State Univ.                     | more about less and
>ECE Dept. Robotics Lab               | less till they know
>Raleigh, NC 27695                    | everything about nothing!

The question you are asking has been a hot topic in the Futurebus+ working group over the past year, and many of the tough issues, including deadlock, hierarchical cache protocols, and address management, have been addressed in the P896.1 Futurebus+ specifications.

You can get an overview of the Futurebus+ family of standards (called "What is Futurebus+") from the VME International Trade Association, phone (602) 951-8866. I tried publishing it here, but whoever moderates this category must have blocked it, probably due to its size.

The P896.1 spec has just been released from the Working Group: IEEE P896.1: Futurebus+ Draft 8.2, published February 1990 by the IEEE Computer Society, 1730 Massachusetts Avenue, N.W., Washington, D.C. 20036-1903. Call 1-800-CS-BOOKS.

The Futurebus+ Working Group currently has over 800 people on its mailing list and meets every other month for a week-long workshop. Over 100 companies are actively involved in the final stages of its definition. This is a VERY LARGE IEEE activity. Anyone who is seriously interested in Futurebus+, or the related activities, is strongly urged to consider attending the meetings. (As an IEEE activity, all meetings are open to the public; however, Working Group rules require that you attend at least two of the past four meetings in order to vote.)

The Futurebus+ mailings (which are about 1-1/2" thick and are issued every other month) are available from the Futurebus+ Executive Secretary, Anatol Kaganovich, at (408) 991-2599. A great deal of material on multiple-segment buses has appeared in these mailings in the past. Back copies of mailings can be obtained from Lisa Granoien at the IEEE Computer Society, (202) 371-0101.

The next IEEE Futurebus+ Workshop will be held at the DoubleTree Hotel, Santa Clara, from March 12 through March 16, 1990. Information on meeting agendas, special events, etc., can be obtained from the Futurebus+ Information Center, at VITA, phone (602) 951-8866.

Kind regards, Paul.
mshute@r4.uucp (Malcolm Shute) (02/08/90)
In article <6960003@hp-and.HP.COM> panek@hp-and.HP.COM (Jon Panek) writes:
>[...] So far, most of the responses have
>quickly extrapolated to the NxM cross-bar architecture. While this is
>obviously the most general-purpose and most flexible one, it also incurs
>the highest implementation cost.

But going further, to the NxN case, *can* allow a sudden halving of the implementation cost relative to the NxM case (with M -> N): you can dispense with half the switches if you treat each bus as being owned by a particular node.

>By having a single linear bus with CPUs and Memory distributed along it
>in sections which can either be connected to the segments on either side
>of it or not, the implementation aspect becomes much more tractable.

This sounds like a variation of a nearest-neighbour vector network:

   -O-O-O-O-O-O-O-O-O-O-O-O-O-O-O-

where "-O-" is a module containing all of the CPUs and memories which share a common local bus.

And in article <AGLEW.90Jan31205148@dwarfs.csg.uiuc.edu> aglew@dwarfs.csg.uiuc.edu (Andy Glew) writes:
>One of the busses may be designated the "global" bus, listened to by all
>processors.
>   Others might be allocated to connect groups of processors (and I/O
>controllers) as needed. This reconfiguration would be done infrequently.
>   So, for example, if you have an I/O going on, allocate it a bus
>for the (long) duration of the I/O.
>   Or, if a group of programs appear to communicate heavily, allocate
>them a private bus.

This sounds like an implementation of a tree network.

Malcolm Shute.         (The AM Mollusc:   v_@_ )        Disclaimer: all
aglew@oberon.csg.uiuc.edu (Andy Glew) (02/13/90)
>And in article <AGLEW.90Jan31205148@dwarfs.csg.uiuc.edu> aglew@dwarfs.csg.uiuc.edu (Andy Glew) writes:
>
>>One of the busses may be designated the "global" bus, listened to by all
>>processors.
>>   Others might be allocated to connect groups of processors (and I/O
>>controllers) as needed. This reconfiguration would be done infrequently.
>>   So, for example, if you have an I/O going on, allocate it a bus
>>for the (long) duration of the I/O.
>>   Or, if a group of programs appear to communicate heavily, allocate
>>them a private bus.
>
>This sounds like an implementation of a tree network.
>
>Malcolm Shute.         (The AM Mollusc:   v_@_ )        Disclaimer: all

A tree is a fixed topology. I mean a system where any components can talk directly on a private link, just requiring a bit of setup.

Hmmm... this sounds a lot like a fiber-optic WAN that somebody from IBM just presented. Bandwidth is limited by the speed of the electronics attached to the net. Since the aggregate bandwidth of the optics >> the bandwidth of the receiving electronics, you can have almost unlimited simultaneous conversations. Except that:

  - a receiver can only listen to one frequency band at a time;
  - a sender can only send on one frequency at a time;
  - senders can only send on a fixed frequency (tuneable lasers are
    expensive/impractical);
  - receivers can only change the frequency they are listening to
    infrequently (tuneable receivers are slow: currently physical
    (piezoelectric), eventually electroacoustic).

Modulo switching protocol difficulties, here's your complete crossbar. What type of system are we going to put on this interconnect?
--
Andy Glew, aglew@uiuc.edu
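P.S. As a toy model of how such a "crossbar" gets used (all names invented): each node transmits on its own fixed wavelength channel, and a private link is set up by (slowly) retuning the destination's receiver to the sender's channel.

/*
 * Toy model of the wavelength-division "crossbar"; all names invented.
 * Node i transmits on fixed channel i; to receive from node s, node d
 * retunes its receiver to channel s (a slow, infrequent operation).
 */
#include <stdio.h>

#define NNODE 4

static int rx_channel[NNODE];   /* channel each node's receiver is tuned to */

static void tune_receiver(int dst, int src)
{
    rx_channel[dst] = src;      /* slow in real life (piezoelectric tuning) */
}

static void send(int src, int dst, const char *msg)
{
    if (rx_channel[dst] == src)
        printf("node %d -> node %d on channel %d: %s\n", src, dst, src, msg);
    else
        printf("node %d is not listening on channel %d; message lost\n",
               dst, src);
}

int main(void)
{
    tune_receiver(3, 0);        /* set up a 0 -> 3 private link */
    send(0, 3, "hello");        /* succeeds */
    send(1, 3, "hello");        /* fails: node 3 is tuned to channel 0 */
    return 0;
}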
ccplumb@lion.waterloo.edu (Colin Plumb) (02/13/90)
In article <AGLEW.90Jan31205148@dwarfs.csg.uiuc.edu> aglew@dwarfs.csg.uiuc.edu (Andy Glew) writes:
>One of the busses may be designated the "global" bus, listened to by all
>processors.
>   Others might be allocated to connect groups of processors (and I/O
>controllers) as needed. This reconfiguration would be done infrequently.
>   So, for example, if you have an I/O going on, allocate it a bus
>for the (long) duration of the I/O.
>   Or, if a group of programs appear to communicate heavily, allocate
>them a private bus.

There exists a commercial product that looks like this: the Cogent Research (Beaverton, Oregon) XTM. A box of processors has a parallel broadcast bus, used for exchanging synchronisation information, and 4 32-way crossbar switches connecting the 20-Mbit/sec serial DMA links of the transputers. These get used for sending larger messages (like file I/O) around. The link controller is a dedicated processor that also lives on the global broadcast bus.

It's a lot faster than a LAN (I forget exactly, but channel setup times are a few microseconds), but a lot slower than a processor bus. So somebody thinks it's a good idea... :-)
--
-Colin
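P.S. A rough sketch of the control/data split, with invented names and made-up costs (this is not Cogent's actual interface): small synchronisation messages go over the shared broadcast bus, while bulk data pays a one-time crossbar channel setup and then streams over a private 20-Mbit/sec serial link.

/*
 * Rough sketch with invented names and made-up costs -- not Cogent's
 * actual software interface.  Synchronisation messages use the shared
 * broadcast bus; bulk transfers pay a one-off crossbar channel setup
 * and then stream over a private serial link.
 */
#include <stdio.h>

#define BROADCAST_US_PER_MSG   1.0      /* pretend cost, microseconds */
#define CHANNEL_SETUP_US       5.0      /* "a few microseconds" */
#define LINK_US_PER_KBYTE    410.0      /* ~20 Mbit/s serial link */

static void sync_broadcast(const char *event)
{
    printf("broadcast bus: %-10s ~%.0f us\n", event, BROADCAST_US_PER_MSG);
}

static void bulk_send(int from, int to, int kbytes)
{
    double us = CHANNEL_SETUP_US + kbytes * LINK_US_PER_KBYTE;
    printf("crossbar link %d->%d: %d Kbytes in ~%.0f us\n", from, to, kbytes, us);
}

int main(void)
{
    sync_broadcast("barrier");
    bulk_send(0, 7, 64);                /* e.g. a file I/O transfer */
    return 0;
}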