dakramer@phoenix.Princeton.EDU (David Anthony Kramer) (11/28/89)
I am looking for references on the VME bus and board standard. First, what is the official (IEEE?) standard? Second, are there any books/papers which detail the standard but are a little more readable than the official standard documentation? I will post a summary to the net if there is sufficient interest. Thanks in advance,

David Kramer
Department of Electrical Engineering
Princeton University, Princeton, NJ 08544
Internet: dakramer@olympus.princeton.edu
linimon@attctc.Dallas.TX.US (Mark Linimon) (11/28/89)
In article <11759@phoenix.Princeton.EDU> dakramer@phoenix.Princeton.EDU (David Anthony Kramer) writes:
>I am looking to references on the VME bus and board standard. Firstly what
>is the official (IEEE?) standard? Secondly are there any books/papers which
>detail the standard, but are a little more readable than the official
>standard documentation.

VMEbus has its own non-profit trade group: VITA, the VMEbus International Trade Association (10229 N. Scottsdale Road, Suite E, Scottsdale AZ 85253, 602-951-8866; P.O. Box 192, 5300 AD Zaltbommel, The Netherlands, 31.4180.14661). They offer both the reference specifications and vendor guides:

  VMEbus  Revision C.1
  VMSbus  Revision C.2
  VMXbus  Revision B
  VSBbus  Revision C
  VMEbus Compatible Products Directory (twice yearly; a recent one was 300+ pp.)
  VMEbus Software Source Directory (somewhat oriented to the European market;
    the edition I have may only be available in Europe)

(Disclaimer: I work for a VMEbus manufacturer but otherwise have no connection with VITA.)

Copies of the VMEbus specification are also usually available from VMEbus manufacturers, in particular Mizar and Motorola. The Motorola book is published by Micrology pbt, 2618 S. Shannon, Tempe, Arizona 85282, 602-966-5936.

VMEbus is also known as the IEC 821 bus and IEEE standard P1014/D1.2; see below for the address of the IEEE. However, there may be slight changes in these standards versus the C.1 spec.

David Pointer (dpointer@uicsrd.csrd.uiuc.edu) contributed the following:
------------------------------------------------------------------
I pulled this info from the 1989 IEEE Publications catalog:

  ANSI/IEEE Std 1014-1987, Standard for a Versatile Backplane Bus: VMEbus
  Order code: SH11544
  List: $42.00    IEEE Members: $21.00

It looks like credit card orders can be placed with:

  IEEE Service Center
  Cash Processing Sales Dept.
  445 Hoes Lane, P.O. Box 1331
  Piscataway, NJ 08855-1331
  (201) 562-5346

The IEEE can be reached at 345 East 47th Street, New York, NY 10017 USA.
Sorry, I don't have their phone number handy.
------------------------------------------------------------------

I've seen an "Introduction to VMEbus" book but don't have a pointer to it.

Disclaimer: I believe all this information to be accurate. However, if anyone who has more up-to-date information will email me, I will post an updated message.

Mark Linimon
Mizar, Inc.
linimon@mizarvme
{attctc, sun!texsun, convex, texbell}!mizarvme!linimon
afgg6490@uxa.cso.uiuc.edu (11/29/89)
IEEE Standard for a Versatile Backplane Bus: VMEbus
ANSI/IEEE Std 1014-1987
Published by the Institute of Electrical and Electronics Engineers, Inc.
345 East 47th Street, New York, NY 10017 USA
Distributed in cooperation with Wiley-Interscience, a division of John Wiley & Sons, Inc.
ISBN 0-471-61601-X
Library of Congress Catalog Number 87-46413

I believe that Motorola distributes a version, possibly more recent than what I finally sent off for, as part of their technical publications. You'd probably want a VSBbus standard as well.

I know of no "easy" books on the VMEbus, but the standard isn't so difficult to read. Reading a few other bus standards at the same time for comparison helps. Also try to get some tech info for the VMEbus controller chips (Motorola's, Force's, or the recent VIC chip) as well as typical boards, since many systems do not stretch the VME features to the limit; it's more a question of what is done rather than what is permitted.
afgg6490@uxa.cso.uiuc.edu (11/29/89)
A while back M. Faiman of the UIUC said something to me that crystallized my feelings on the subject of busses: it's time we got some RISC busses. I'd like to start a conversation string on this. Topics:

  - What (CISCy) features are there in existing busses that could be
    eliminated? VMEbus? Multibus? Futurebus?
  - What is the set of "good ideas" that should be in new busses?
alvitar@weasel.austin.ibm.com (Phillip L. Harbison) (11/30/89)
In article <112400007@uxa.cso.uiuc.edu>, afgg6490@uxa.cso.uiuc.edu writes:
> A while back M. Faiman of the UIUC said something to me
> that crystallized my feelings on the subject of busses:
> It's time we got some RISC busses.
> I'd like to start a conversation string on this.
> Topics:
> What (CISCy) features are there in existing busses that
> could be eliminated?

[1] No justified data busses!

A justified data bus requires byte shuffling such that a byte transfer always takes place on the first byte path and a word transfer takes place on the first two byte paths. I suppose the same concept can be extended to 32-bit transfers over a 64-bit bus. Multibus I, Multibus II, and most PC busses use a justified data bus.

An unjustified data bus always transfers bytes within a "stripe" of memory space over the same byte path. For example, on a 32-bit unjustified data bus, the byte path is the address ANDed with 011 (binary); therefore, all bytes at addresses ending in 00 transfer over byte path 0, addresses ending in 01 over byte path 1, etc. Only one transceiver is required for each byte path. NuBus uses an unjustified data bus, and I believe Futurebus does too. VME is inconsistent, since it uses unjustified transfers on the 16-bit bus but justified transfers on the 32-bit bus.

The extra justification transceivers use a lot of board space (especially on tiny Eurocards), waste power, and add extra delay. The following table shows the number of transceivers required to implement justified and unjustified busses.

                 Number of Transceivers
  Bus Size    Justified    Unjustified    Transfer Sizes
  --------    ---------    -----------    --------------
  8-bit           1             1         8 bits
  16-bit          3             2         8, 16 bits
  32-bit          8             4         8, 16, 32 bits
  64-bit         20             8         8, 16, 32, 64 bits

Keep in mind that this circuitry must be repeated on every card. For a 16-card system using a 32-bit justified data bus, that's 64 extra chips! Of course the shuffle network has to be implemented somewhere, but why not do it on the CPU card? Most modern micros implement this on the chip anyway.
(The 68020, 68030, 88200, 386, and probably many more.)

[2] Get rid of daisy-chained signal lines!

I'm talking about signals that go into a board on one pin and out on some other pin. In other words, a board must occupy every slot or you must have a jumper connection to maintain continuity. Not only is this a configuration hassle for the user, but it makes it difficult to implement hot card replacement (removing or inserting cards with power on). This is very important in many fault-tolerant systems.

[3] Get rid of centralized resource managers.

Picture this: your $5000 CPU card, $4000 DRAM card, $2000 disk controller, and $2000 serial port controller are all working fine; however, because a $500 system controller card is broken, the whole system is useless. Most older busses used centralized bus arbiters. NuBus, Futurebus, and Multibus II all use distributed arbiters.

> What is the set of "good ideas" that should be in new busses?

[1] Geographic Addressing

Have some pins on each connector that identify the slot number. The board can use this information to select an address range. Given that each board can be uniquely addressed automatically, it is possible to design systems with few or no jumpers and DIP switches. The system configuration software can probe each slot to determine which resources are present, then install the appropriate drivers, kernel options, device inodes, etc. This feature is supported by NuBus, Futurebus, and Multibus II.

[2] Try-Again Signal

There should be a signal or combination of signals indicating that the pending bus cycle should be retried at a later time. This provides a hook for some cache-consistency protocols, and makes it easier to implement bus couplers between two busses. NuBus has this, and I believe Futurebus and Multibus II also have it. This is a glaring deficiency of the VME bus, since the only options are to complete a bus cycle or issue a bus error. The latter is a severe reaction to what may be a temporary resource conflict.
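An aside on Harbison's point [1] above: the unjustified byte-lane rule (lane fixed by the low address bits) versus the justified rule (data always packed into the lowest lanes) can be sketched in a few lines. This is an illustrative model of the two conventions, not code from any specification:

```python
def unjustified_lanes(addr, size):
    """Byte lanes used for a naturally aligned transfer of `size` bytes
    at `addr` on a 32-bit unjustified bus: the starting lane is simply
    addr & 3, so no shuffle network is needed on slave cards."""
    assert size in (1, 2, 4) and addr % size == 0  # naturally aligned
    first = addr & 3
    return list(range(first, first + size))

def justified_lanes(addr, size):
    """On a justified bus the data is always packed into the
    lowest-numbered lanes, regardless of address - hence the
    byte-shuffling transceivers counted in the table above."""
    assert size in (1, 2, 4) and addr % size == 0
    return list(range(size))
```

For example, a byte at address 0x1001 travels on lane 1 of an unjustified bus but lane 0 of a justified one; full-width aligned transfers use the same lanes either way.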
[3] Cache Consistency (or Coherency)

This is a must for multiprocessor systems, or even single-processor systems where there are other bus masters (DMA devices) and the processor has a cache. Support should be provided for snooping the bus, preempting a cycle, and performing a write-back operation. Futurebus has already done a great job defining a family of consistency protocols. NuBus has most of the hooks but, last I heard, no standardized protocol.

[4] Pin & Socket Connectors

Use decent connectors, like the DIN 41612 connectors used in VME, NuBus, Futurebus, and Multibus II. I hate the edge connectors used in Multibus I and the PC. Those infernal EISA multi-level connectors are even worse: all the cost of pin & socket connectors without the reliability. Pin & socket connectors have good density and superior retention force and resistance to contaminants. Another good selection would be those 3- and 4-row modular connectors made by AMP. Talk about pin density!

[5] Support Redundant Busses

Allow two busses to be operated in parallel, sharing the work load, with a clean mechanism for switching all the traffic to one bus if the other bus breaks.

[6] Hardware Semaphores

Provide semaphore support for multiple processors. This should work much like bus arbitration, with waiting processors spinning on the busy semaphore without using any bus cycles.

Just a few ideas to get things started. This is one of my favorite subjects, as if you couldn't tell by the size of this article. :-) If you need to reply to me, use the address below instead of the one in the header, since I'll be long gone from IBM (thank God!) before most of the net gets this.

----
Live: Phil Harbison, soon to be at Xavax, Inc.
Mail: alvitar@uahcs1.UUCP or alvitar@xavax.UUCP
"Skin it back!" - The Unknown Blues Band
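An aside on [6] above: one way to picture a hardware semaphore that lets waiting processors spin without consuming bus cycles is a grant queue kept in the bus interface. The interface below is purely hypothetical - a toy model of the idea, not any real bus's mechanism:

```python
class HardwareSemaphore:
    """Toy model of the hardware-semaphore idea: a processor issues one
    bus cycle to try the semaphore; if it is busy, the bus interface
    queues the request and signals a grant later, so the CPU spins on a
    local flag instead of re-issuing bus cycles."""
    def __init__(self):
        self.owner = None
        self.waiters = []

    def acquire(self, cpu):
        if self.owner is None:       # one bus cycle: test-and-set succeeds
            self.owner = cpu
            return True
        self.waiters.append(cpu)     # interface remembers the request
        return False                 # CPU now spins locally, off the bus

    def release(self):
        # Like bus arbitration: ownership passes to the next waiter
        # without any further "is it free yet?" bus traffic.
        self.owner = self.waiters.pop(0) if self.waiters else None
        return self.owner
```

The point of the analogy to arbitration is the release path: the hand-off happens in the interface logic, not by polling over the shared bus.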
davidb@brad.inmos.co.uk (David Boreham) (11/30/89)
There is a book on the VMEbus. It is written by one Wade Peterson and is published by VITA (I think). If I can find the flyer Wade gave me at Buscon, I'll post the full details.

CISCness of VMEbus: how about

  1) Address modifiers (remove)
  2) Address-only cycles (remove)
  3) D16 data width (remove)
  4) A24 address size (remove)
  5) Most of the arbitration options (use ROR and PRI, say)
  6) ROAK interrupts
  7) Interrupt vectors (remove)

That would do me for starters. I don't think there's any hope for Multibus :)

David Boreham, INMOS Limited | mail(uk): davidb@inmos.co.uk or ukc!inmos!davidb
Bristol, England             |     (us): uunet!inmos.com!davidb
+44 454 616616 ex 547        |  Internet: davidb@inmos.com
linimon@attctc.Dallas.TX.US (Mark Linimon) (12/01/89)
In article <3070@cello.UUCP> alvitar@weasel.austin.ibm.com (Phillip L. Harbison) writes:
>Of course the shuffle network has to be implemented somewhere, but why
>not do it on the CPU card? Most modern micros implement this on the chip
>anyway. (The 68020, 68030, 88200, 386, and probably many more.)

Note that at least one of the RISC designs does _not_, and many others may not either. I will agree, however, that this is one area where one might reasonably expect an MPU to perform this function, given how inefficient the implementation is anywhere else.

Mark Linimon
Mizar, Inc.
linimon@mizarvme

Disclaimer: Mizar neither knows nor cares that I have opinions.
lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay) (12/02/89)
In article <3070@cello.UUCP> alvitar@weasel.austin.ibm.com (Phillip L. Harbison) writes:
>[1] No justified data busses!
>[2] Get rid of daisy-chained signal lines!
>[3] Get rid of centralized resource managers.
>[1] Geographic Addressing
>[2] Try-Again Signal
>[3] Cache Consistency (or Coherency)
>[4] Pin & Socket Connectors
>[5] Support Redundant Busses
>[6] Hardware Semaphores

To which I would add:

[7] Power Mate Before Signal Mate

This allows a board to be inserted into a running system without (necessarily) glitching the system. The logic gets to power up before the signal lines are connected.

[8] Bus Isolation Mode

A freshly-inserted/reset board should be in BI mode. This means that it "isn't there", electrically and logically. To the rest of the system, the slot appears to be empty/jumpered. Other boards are free to respond to addresses which the isolated board would normally have responded to. This feature is useful for physically adding/removing boards to/from a running system; for electronically deleting failed boards; for electronic swap-in of spare boards; and for "blocking" during self-test.

[9] Built-In Self Test (Built-In Test Equipment)

Each board should have BIST (BITE) features. The bus should be involved: the system should be able to force a board into BIST mode. This also puts the board's interface into BI mode, so that boards which temporarily/permanently aren't useful won't be called upon. There is a question as to how a board comes back out of BIST mode, depending on why it went in and what the result was. Also, if the board has failed, how much can the system find out about the failure?

[10] No Byzantine Agreement

When a system comes up, there should be one board which knows that it is in charge of bringing up the rest (and the rest should agree). Software ("Byzantine") schemes don't cut it. The initial board should run BIST, then start putting the other boards through their BIST.
The simplest scheme is just the board in slot zero, but perhaps something more robust can be found that isn't too complicated. (Maybe we can assume that the operator put a board in slot zero, but we certainly can't assume that that board will never fail.)
--
Don   D.C.Lindsay   Carnegie Mellon Computer Science
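An aside: the slot-zero "monarch" bring-up of [10], combined with the BIST and bus-isolation ideas in [8] and [9], might look roughly like this. The `run_bist` hook here is a stand-in for whatever the real bus mechanism would be; this is only an illustrative sketch:

```python
def bring_up(slots, run_bist):
    """Sketch of monarch bring-up: the board in slot 0 self-tests,
    then walks the other slots, leaving any board that fails BIST in
    bus-isolation (BI) mode.  `run_bist(slot)` is a hypothetical hook
    returning True on pass, False on failure, None for an empty slot."""
    assert run_bist(0), "monarch itself must pass self-test"
    online, isolated = [0], []
    for slot in slots:
        if slot == 0:
            continue                 # monarch already online
        result = run_bist(slot)
        if result is None:
            continue                 # empty slot: nothing to do
        (online if result else isolated).append(slot)
    return online, isolated
```

A failed board simply never leaves BI mode, so the rest of the system can proceed as if its slot were empty.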
lamaster@athena.arc.nasa.gov (Hugh LaMaster) (12/04/89)
Are there any comments on how well Futurebus meets these improvements? There have been several articles describing Futurebus in IEEE Micro, as well as the draft standard itself. My understanding is that the 32-bit standard is now complete, and 64-bit+ extensions are now being drafted.

Hugh LaMaster, m/s 233-9,   UUCP ames!lamaster
NASA Ames Research Center   ARPA lamaster@ames.arc.nasa.gov
Moffett Field, CA 94035     Phone: (415) 694-6117
afgg6490@uxa.cso.uiuc.edu (12/04/89)
..> Distributed arbitration...

I go both ways on this. I have worked with several systems where bus arbitration time was one of the principal bottlenecks, i.e. full bus bandwidth could never be achieved in a real system because (1) all those asynch signals were being synchronized, and (2) the system spent much of its time asking "please can I use the bus" instead of actually using it.

NuBus distributed arbitration seems simple enough, but the wave-like form of scheduling it produces probably is not acceptable for hard real-time (fixed-priority) scheduling (see Shan: "a bus access should be scheduled at the same priority as a bus access").

Futurebus+ distributed arbitration seems to further complicate things. Potentially all of the bits in the arbitration code can flip - at least O(N). All are asynch signals... Because it's distributed, it seems to me that you are limited by the response time of the *slowest* arbitration logic in the system. I shudder at the thought of trying to assemble a system for a customer, customized with a single board available from only one vendor, and finding out that this crucial board synchronizes all bus signals to a 4 MHz clock. Ouch!

Other things that scare me about Futurebus+ distributed arbitration include the long and short arbitration cycles. And another arbitration mode is proposed as an improvement! And all of the priority manipulation rules...

What we need is a distributed arbitration scheme that (1) is positive - instead of waiting a potentially long interval for the arbitration code to stabilize, it grants when all contestants acknowledge the same winner (OK, that's already been done); (2) is fast in special circumstances - like an idle bus (without requiring a special arbitration mode); and (3) supports stripped-down priority manipulation (like, none) in systems where it is appropriate.
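An aside: the basic parallel-contention scheme being discussed (the NuBus/Futurebus style, before any fairness or priority protocol is layered on) can be simulated in a few lines. This models only the logical effect of the wired-OR arbitration lines, none of the settling-time and synchronization issues raised above:

```python
def arbitrate(contenders, width=5):
    """Parallel contention sketch: every contender drives its
    arbitration number onto open-collector lines.  Scanning from the
    MSB down, a contender withdraws when the wired-OR bus carries a 1
    in a bit position where its own number has a 0.  The highest
    arbitration number wins - i.e. a fixed numerical priority."""
    active = set(contenders)
    for bit in reversed(range(width)):
        mask = 1 << bit
        bus = any(c & mask for c in active)           # wired-OR of this line
        if bus:
            active = {c for c in active if c & mask}  # lower numbers drop out
    (winner,) = active
    return winner
```

Note that with no layering at all, the arbitration code *is* a fixed priority, which is the questioner's point: the extra logic only appears when fairness protocols manipulate the codes.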
afgg6490@uxa.cso.uiuc.edu (12/04/89)
>[9] Built In Self Test ( Built In Test Equipment )
>
>Each board should have BIST (BITE) features. The bus should be
>involved: the system should be able to force a board into BIST mode.
>This also puts the board's interface into BI mode, so that boards
>which temporarily/permanently aren't useful, won't be called upon.
>There is a question as to how a board comes back out of BIST mode,
>depending on why it went in, and what the result was. Also, if the
>board has failed, how much can the system find out about the failure.

Don't oblige potentially expensive features. Encourage them. Support them. But do not oblige boards to support a complicated BIST protocol. I've seen boards spend a lot of logic supporting a BIST interface, only to have no test hardware at all. That logic could have been used more productively on more basic things, like ECC.

Put another way: accept "No" responses to external BIST requests. A simple No. Not a No that needs to be bundled in some complicated packet and returned.
davidb@brad.inmos.co.uk (David Boreham) (12/04/89)
I said that I'd post the full title of Wade Peterson's book on the VMEbus:

  "The VMEbus Handbook" by Wade D. Peterson

Available from:

  VITA
  10229 N. Scottsdale Road, Suite E
  Scottsdale, Arizona 85253 USA
  +1 (602) 951-8866

Although I've not read this book in detail, it looks like a good treatment of the VMEbus and is useful for prospective VMEbus card designers. It costs $39.95 plus $5 postage.

David Boreham, INMOS Limited | mail(uk): davidb@inmos.co.uk or ukc!inmos!davidb
Bristol, England             |     (us): uunet!inmos.com!davidb
+44 454 616616 ex 547        |  Internet: davidb@inmos.com
dave@dtg.nsc.com (David Hawley) (12/09/89)
> From: lamaster@athena.arc.nasa.gov (Hugh LaMaster)
> Message-ID: <5693@eos.UUCP>
>
> Are there any comments on how well FutureBus meets these improvements?
[elimination of CISC features; good ideas that should be in new busses]
> There have been several articles describing FutureBus in IEEE Micro,
> as well as the draft standard itself. My understanding is that the
> 32 bit standard is now complete, and 64bit+ extensions are now being
> drafted.

The new Futurebus+ development supersedes, as well as extends, the old IEEE 896.1 Futurebus standard. Here are some comments:

> From: alvitar@weasel.austin.ibm.com (Phillip L. Harbison)
> (soon to be alvitar@uahcs1.UUCP or alvitar@xavax.UUCP)
> Message-ID: <3070@cello.UUCP>
> [1] No justified data busses!
> [2] Get rid of daisy-chained signal lines!
> [3] Get rid of centralized resource managers.

Futurebus has none of these. 32-bit transfers are standard; wider data paths and byte-lane enables are supported.

> [1] Geographic Addressing
> [2] Try-Again Signal
> [3] Cache Consistency (or Coherency)
> [4] Pin & Socket Connectors
> [5] Support Redundant Busses
> [6] Hardware Semaphores

Futurebus has explicit support for all of these except redundant busses. Futurebus is the only standard bus I know of that can support copy-back cache protocols without using busy-retry. It now uses a Metral pin-and-socket connector (4-row, 2 mm grid, 192 pins).

> From: lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay)
> Message-ID: <7172@pt.cs.cmu.edu>
> [7] Power Mate Before Signal Mate
> [8] Bus Isolation Mode
> [9] Built In Self Test ( Built In Test Equipment )
> [10] No Byzantine Agreement

All that is required for live insertion is that the board be brought to the same ground as the system (static discharge) before insertion, as long as power is sequenced correctly on the board to ensure that the bus drivers are disabled. The live insertion facility is available in Futurebus+, but not required; many systems do not need it.
Bus isolation, built-in self-test, and "monarch" selection on power-up are mentioned in Futurebus+, but an exact implementation is not specified.

Some things that Futurebus+ provides that are required in any bus and have not been mentioned yet:

[11] Clean Electrical Environment

Futurebus+ uses BTL (backplane transceiver logic), which is designed to operate in a backplane's transmission-line environment. This allows reliable incident-wave switching (no reflections) on all signals, preventing the settling-time performance loss typical of TTL systems.

[12] High-Performance Block Data Transfers

Futurebus+ source-synchronous/packet-mode transfers allow data to be transferred on the backplane at the physical limits of the media, which is somewhere between 50 and 100 MHz for BTL. Unfortunately, specialized silicon must be developed to take advantage of this protocol. Futurebus also specifies a fully handshaken "compelled" transfer mode for more traditional implementations, up to 25 mega-transfers/second. A true RISC bus would probably allow only one type and size of block transfer.

[13] Split Transactions

A write-only protocol has significant advantages in system-wide access latency. Of course, the design of all boards, especially memory boards, gets more complicated. Futurebus+ supports (but does not require) split transactions - a "CISC" compromise.

Futurebus meets most of these "requirements" for a RISC bus, but it is still CISC (and still valuable, I believe). So what really defines a RISC bus? Think about your system implementation requirements, and strip anything you can without compromising your cost and performance goals. These will vary for I/O, memory, multiprocessing, real-time, or fault-tolerant busses. If you are designing for one of these, you can build a RISC bus. If you want to support more than a few of these, look at CISC Futurebus.

Dave Hawley
National Semiconductor Corp.
dave@dtg.nsc.com   (408) 721-6742
Disclaimer: I do not represent the IEEE or the P896.x Futurebus+ committee.
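An aside on the transfer-rate figures above: peak throughput is just transfer rate times bus width. The 64-bit width in the examples below is an assumption for illustration; the 25 M and 50-100 M transfers/sec rates are the ones quoted in the post:

```python
def peak_mbytes_per_sec(transfers_per_sec, width_bits):
    """Peak backplane throughput: transfer rate times bus width.
    Purely arithmetic - ignores arbitration, address cycles, and
    all other protocol overhead."""
    return transfers_per_sec * (width_bits // 8) / 1e6
```

So on an assumed 64-bit data path, the compelled mode tops out around 200 MB/s, while source-synchronous BTL at 50-100 MHz would give 400-800 MB/s, which is consistent with the 500 MB/s figure claimed later in the thread.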
des@dtg.nsc.com (Desmond Young) (12/15/89)
One comment: there is a group of (future) Futurebus+ users in Europe that have sent out fliers (should that be flyers?) encouraging people to join their effort to "define useable subsets of Futurebus+". This does hint at the very (very) CISC nature of Futurebus+. It has (almost) every facility ever dreamt of. As an analogy to CISC processors, it almost has a single instruction to compile a program :-). (Well, a wee bit of an exaggeration.) Anyway, if you want to go fast, it has too much baggage.

My opinion, etc.
johnt@opus.WV.TEK.COM (John Theus;685-2564;61-183;625-6654;hammer) (12/15/89)
In article <411@blenheim.nsc.com> des@dtg.nsc.com (Desmond Young) writes:
>
>One comment, there is a group of (future) Futurebus+ users in Europe
>that have sent out fliers (should that be flyers?) encouraging
>people to join their effort to:
> "define useable subsets of Futurebus +".
> This does hint at the very (very) CISC nature of Futurebus+. It has
>every facility (almost) ever dreamt of.
>Anyway, if you want to go fast, it has too much baggage.

I'm not sure what flyer Des has seen, but the one I have was not produced by the FMUG group in England, but by a couple of guys located in the northeast. They are trying to form FIA, the Futurebus Implementors Association. In my opinion they are trying to make a buck off the Futurebus bandwagon by setting up this group and then charging management fees. They held their first meeting last week at the site of the Futurebus+ meeting. I should point out that this activity is in no way associated with the IEEE-sponsored Futurebus effort, and that the two principals have not been contributors to the development of the standard.

The stated purpose of FIA is to produce a subset document of the Futurebus+ standard and then lead the development of a silicon implementation. They passed out a strawman proposal to show us how they would subset the documents. Basically, their proposal went down in flames, for several reasons. The simple ones were: hey, we're already doing all that; and technically, you don't know what you are talking about. In the latter case, their subset would have broken several of the protocols. Futurebus+ has a 4-bit transaction command code that is transmitted with each address. In their subset, they picked some of the codes but not others that are required to use the first set.
For example, they included some of the cache-coherence codes but left out the codes for invalidate and copyback. In addition, their subset would not have supported any of the lock facilities. In the first case, several commercial silicon companies are working on chip designs that, assuming they follow the spec, will work together properly.

At the board level we realize this is a much harder problem. The boards for this bus will occupy a very large design space, much larger than any previous standard bus - the CISC factor. Futurebus+ as a whole provides a large number of facilities that no single board will fully implement. As examples: coherent caches, message passing, live insertion, dual redundant buses, split transactions (write-only protocols), etc. At the physical layer there are issues of board size, electrical environment, and physical environment. The range extends from the Navy putting Futurebus+ boards in combat vehicles to HP talking about Futurebus+ boards in a desktop PC.

To focus this range of applications into distinct groups, the 896.2 standard (Futurebus is the IEEE 896.x family of standards) will include chapters called profiles. At present two profiles are being developed: one for general-purpose computing, the VMEbus replacement, and the other for I/O applications, which is being led by DEC. As we better understand the needs of these two applications, these profiles may merge. Clearly the Navy's profile for their combat applications will be very different. The profiles specify the physical, electrical, and protocol decisions that are required to ensure interoperability among all boards built to the profile.

The question is whether RISC vs. CISC is a meaningful distinction for a bus, especially an industry-standard bus. Probably most well-designed proprietary buses are RISC, simply because a company usually designs a bus for a specific application and it's not cost-effective to carry excess baggage.
I don't expect ever to see an industry-standard RISC bus, simply because the market for it will be too small. In fact I think Futurebus+ will be the last great bus. As more and more systems require greater than a few Gbytes/sec of bandwidth, buses will be replaced by switch-based interconnect schemes such as SCI, and I wouldn't call switch routing RISC.

Finally, as you might expect from my title, I have a different opinion about the obtainable performance of Futurebus+. Fast is of course a relative term, so I'll just state that I expect to be building hardware next year that can sustain more than 500 MBytes/sec on a 64-bit-wide Futurebus+.

John Theus                                 johnt@opus.wv.tek.com
Futurebus+ Parallel Protocol Coordinator
Tektronix, Inc.  Interactive Technologies Div.
- shipping the Futurebus-based XD88 workstations
afgg6490@uxa.cso.uiuc.edu (12/15/89)
>Nubus, Multibus II, and Futurebus all use the same basic parallel
>contention logic for resolving multiple requests. A "fairness"
>protocol is layered over this logic. In order to allow real-time
>priority scheduling, a priority protocol also must be layered.
>None of this is easy; all of it takes time, bus lines, or both.

Does the "distributed arbitration" protocol not implicitly provide priority arbitration directly? I.e., cannot the arbitration code be a fixed numerical priority? I think the problem is that when non-priority arbitration codes get layered on top of the distributed arbitration scheme (ones in which you effectively manipulate the level of the arbitration code), you have to devote sufficient logic that the simplicity of fixed-priority arbitration gets lost.

>Some aspects of Futurebus+ arbitration are limited by the slowest logic
>in the system, some by the slowest in a group of competitors, and some
>only by the speed of the winning board. This makes the implementation
>of the protocol critical. Synchronizing to any clock is suicide. The
>protocol is complex, but the committee had a number of historical,
>political, and schedule constraints, as well as functional ones (eg,
>real-time priority scheduling). Bus interface silicon should eventually
>hide some of this complexity, as well as improve speed.

(1) Could you list, for the benefit of readers, which actions are limited by what? (Else I'll have to dig through my spec.)

(2) It would be interesting to see which bus-interface implementations for existing busses, both discrete and integrated, are truly asynchronous, versus synchronous ones that synchronize the asynch bus signals to an internal clock. Saying "bus interface silicon should hide complexity" is a bit of a cop-out - yeah, sure, but look how long it took for decent VMEbus interface chips to come out (particularly the VITA chip, for people who didn't want to buy Motorola's VME chip).
I do not know of any VME interface chip that is truly asynchronous. It sure would be nice if the spec made real asynch implementations easier. (Of course, there is some hope that the resurgence of asynch techniques, a la Sutherland, may make asynch implementations easier for the average Joe designing the bus interface logic. That's been the real bottleneck - not that asynch is terribly hard, just that it's less well known.)

Note that I am not against Futurebus and Futurebus+ - the fact that they are becoming standards is great. What I am asking is what a RISC bus would look like: faster than Futurebus+, or less complex.
afgg6490@uxa.cso.uiuc.edu (12/15/89)
OK, let me make some suggestions for a RISC bus:

(1) All transactions are disconnected or split. Possibly an arbitration preemption line if the response is immediately available. (I.e., you don't assume connected and then change over to split depending on ACK. You assume split. Connected = split with the response as a separate, immediate transaction.)

(2) Throw out all the fancy synchronization operations. Provide (i) a LOCK signal that can be applied only to a single resource of less than bus width. Let software protocols handle multiple-resource locking - don't require the bus interfaces to track it. If you feel adventurous, provide (ii) a remote load-store-fixed or compare-and-swap, or (iii) a remote fetch-and-add. These because they possibly permit combining. Probably only provide one of them.

Now I'll go out on a limb. (N->infinity) Forget about arbitration fairness. Software can implement fairness at the process level (e.g. by counting blocked bus cycles and scheduling processes to even them out).
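An aside on suggestion (2): the reason remote compare-and-swap or fetch-and-add fit a split-transaction bus is that the atomic step executes at the responder (the memory), so no bus-wide lock is needed between the request and the response. A toy responder, with an entirely made-up command encoding:

```python
class Memory:
    """Toy split-transaction memory responder: each request is one
    transaction, and the atomic read-modify-write happens locally at
    the memory, between request and response."""
    def __init__(self):
        self.cells = {}

    def request(self, op, addr, value=0):
        old = self.cells.get(addr, 0)
        if op == "read":
            pass
        elif op == "write":
            self.cells[addr] = value
        elif op == "fetch_add":          # atomic at the memory; combinable
            self.cells[addr] = old + value
        elif op == "compare_swap":       # value is an (expected, new) pair
            expected, new = value
            if old == expected:
                self.cells[addr] = new
        return old                       # response transaction carries old value
```

Two processors issuing fetch_add requests to the same cell can never interleave mid-operation, which is exactly what a plain LOCK-plus-split scheme fails to guarantee.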
afgg6490@uxa.cso.uiuc.edu (12/15/89)
>One comment, there is a group of (future) Futurebus+ users in Europe >that have sent out fliers (should that be flyers?) encouraging >people to join their effort to: > "define useable subsets of Futurebus +". Can you give me any pointers to these folks? I'd like to contact them.
afgg6490@uxa.cso.uiuc.edu (12/15/89)
Another RISC bus suggestion:

-> Don't have combined A/D lines.

Although it is very attractive to take your 32 address lines and your 32 data lines and combine them for a 64-bit-wide data path, it is a lot sillier when you have 256 lines in total (an extra 32 address lines gets lost in the shuffle). There are a lot of address-only transactions in a cache-coherent system. Conversely, the block data transfers that we are trying to optimize with 256-bit transfers would tend to use the data bus for a long time. (Is this true? I/O transfers use blocks >> 256 bits, but do (should) cache systems?) So address-only transactions for processors get blocked by block transfers for I/O. Let 'em pass.
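An aside: the "let 'em pass" argument can be put in numbers. With a multiplexed A/D bus, an address-only transaction queued behind a block transfer waits out every data beat; with separate address lines it waits only for the block's single address beat. A toy latency model (in bus beats; the beat accounting is illustrative, not from any spec):

```python
def addr_only_latency(block_beats, multiplexed):
    """Bus beats an address-only transaction (e.g. a snoop or
    invalidate) waits when queued behind one block transfer of
    `block_beats` data beats, plus one beat for its own turn."""
    if multiplexed:
        # Waits for the block's address beat and all its data beats.
        return (1 + block_beats) + 1
    # Separate address lines: only the block's address beat is in the way.
    return 1 + 1
```

Behind a 64-beat block transfer, the address-only transaction sees 66 beats of latency on a multiplexed bus versus 2 with a dedicated address bus - a 33x difference that grows with block length.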
jjg@walden.UUCP (John Grana) (12/17/89)
In article <112400018@uxa.cso.uiuc.edu> afgg6490@uxa.cso.uiuc.edu writes:
>Another RISC bus suggestion:
>
>Although it is very attractive to take your 32 address lines
>and your 32 data lines and combine them for a 64 bit wide data path,
>it is a lot sillier when you have 256 lines in total (an extra 32
>address lines gets lost in the shuffle).

Speaking of combining the address and data lines for a 64-bit data path,
the latest VMEbus specification (rev. D?) will define a new type of
block transfer mode - BLT64 or VME64 (I'm not sure what they plan on
calling it). It is like the present Block Mode (an address/data cycle,
then data-only cycles) except that:

1) The first cycle is an address-only cycle.
2) All cycles after that are 64 bits (both the address and data lines
   transfer the data).
3) One or two new timing parameters have been added (I don't recall what
   they are...).

John Peters of Performance Technologies Inc. in Rochester, NY came up
with the initial timing and byte-lane ordering. He also designed a
"proof of concept" board set and is running > 60 Mbytes/sec on various
VMEbus backplanes.

John Grana
jjg@walden.UUCP
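As a sanity check on the > 60 Mbytes/sec figure, here is a
back-of-the-envelope cycle budget for the BLT64 mode described above:
one address-only cycle, then 8-byte beats on the combined A/D lines.
The 100 ns cycle time and 64-byte block length are assumptions for
illustration; the actual rev. D timing parameters aren't given here:

```c
#include <assert.h>

/* One address-only cycle, then 64-bit (8-byte) data beats. */
long blt64_cycles(long block_bytes)
{
    return 1 + (block_bytes + 7) / 8;
}

/* Throughput in Mbytes/sec for an assumed cycle time in nanoseconds
   (bytes per microsecond == Mbytes per second). */
double blt64_mbytes_per_sec(long block_bytes, double cycle_ns)
{
    return (double)block_bytes * 1000.0
         / ((double)blt64_cycles(block_bytes) * cycle_ns);
}
```

Under these assumptions a 64-byte block takes 9 cycles and moves about
71 Mbytes/sec - the same ballpark as the figure quoted.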
johnt@opus.WV.TEK.COM (John Theus;685-2564;61-183;625-6654;hammer) (12/19/89)
In article <112400016@uxa.cso.uiuc.edu> afgg6490@uxa.cso.uiuc.edu writes:
>
>OK, let me make some suggestions for a RISC bus:
>
>(1) All transactions are disconnected or split.
>    Possibly an arbitration preemption line if the response is
>    immediately available. (I.e., you don't assume connected
>    and then change over to split depending on ACK. You assume split.
>    Connected = split with an immediate, separate response.)
>
>(2) Throw out all the fancy synchronization operations.
>    Provide (i) a LOCK signal that can be applied only to a single
>    resource of less than bus width. Let software protocols handle
>    multiple-resource locking - don't require the bus interfaces
>    to track it.
>    If you feel adventurous, provide (ii) a remote load-store-fixed
>    or compare-and-swap, or (iii) a remote fetch-and-add. These
>    because they possibly permit combining.
>    Probably provide only one of them.

I think this is a good example of why designing a RISC bus is difficult.
If you did item (1) and item (2)(i), you would have a flawed lock
operation. A split-transaction interconnect requires at least one of the
operations in (2)(ii) or (2)(iii) to have a true atomic operation. Which
one? That's why both Futurebus+ and SCI (Scalable Coherent Interface)
have these lock operations.

Before I launch into a long lock discussion, I want to point out that
the basic decision to use split transactions multiplies the complexity
of the bus interface logic several times over that of a connected
protocol. So even if the bus protocols are RISC, the interface
implementation is very complex.

For those of you who haven't thought about this first problem, let me
try to explain locks in a split-transaction environment. A split
transaction consists of a request transaction, e.g. a processor (the
requester) requests a read from memory, followed eventually by a
response transaction from memory (the responder) that returns the
requested data. Only writes are performed on the bus, since memory
becomes a bus master as the responder.
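The flaw John points to - a simple LOCK signal combined with split
transactions - can be made concrete with a toy model. Suppose requesters
A and B each attempt a test-and-set as two split requests (a read, then
a write of 1), and the responder simply services requests in arrival
order; nothing in the protocol keeps B's read out of the window between
A's read and A's write. The model below is pure illustration:

```c
#include <assert.h>

static int sem;   /* the semaphore location held by the responder */

/* The responder services each request in arrival order, with no state
   tying a requester's earlier read to its later write. */
int  responder_read(void)   { return sem; }
void responder_write(int v) { sem = v; }
```

Interleaving B's read between A's read and A's write makes both
requesters observe 0 and both believe they acquired the semaphore -
exactly the non-atomicity described below.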
Several other transactions can occur on the bus between a request and
its response. For the typical processor-generated semaphore of a read
followed by a write (swap, test-and-set, etc.), the processor might
simply make a read request followed by a write request with its
accompanying data. The responder would return the requested read data as
the first response, and a write acknowledge as the second response.
Besides being very inefficient, there is no guarantee the responder will
not receive a request from another party between the two semaphore
requests. If this occurred, the semaphore would not be atomic. The lock
protocols might require the "bus" to prevent another request from being
issued, or they might prevent the responder from acting on another
request, but this would largely defeat the purpose of using split
transactions, especially in a switch environment.

Split-response transactions require a different technique to lock the
read and write operations. The solution is to perform the lock operation
with a single request. Accompanying this request is a command for the
responder to execute. For example, to perform a swap operation, the
requester becomes master, addresses the responder with a swap lock
transaction command, and sends the data to be written. Then the master
disconnects. When the responder acts upon the request, it executes the
command by first reading the addressed data and storing it in a
temporary buffer. The responder then writes to memory the data that was
sent along with the request. The responder executes the read and write
memory operations atomically. The buffered data is sent back to the
requester in the form of a response transaction.

The fetch-and-add command is executed by the requester sending the value
to add in the request. The responder returns the original unmodified
value to the requester, and then stores the sum in the addressed
location.
The compare-and-swap is executed by the requester sending the compare
value and the swap value in the request. The responder returns the
original unmodified value to the requester, and if the compare value is
equal to the original value, it then stores the swap value in the
addressed location.

So, the next problem for the bus designer is deciding which lock
operations to support. Most processors can generate a swap of some form,
but the load occurs first, followed by the store - which is great for a
connected bus, but backwards for a split environment. A lot of system
designers would like a fetch-and-add, since it is a more powerful
operation than swap, but as far as I know, only Intel has produced a
mainstream processor with this instruction. Right now I know of no
processor that can directly generate a lock operation for a
split-transaction interconnect. Fetch-and-add allows combining in switch
environments, and allows the return of many (one per bit of data width)
unique values in a single transaction on a bus with broadcast.

Futurebus+ provides 3 bits for lock command encoding. Four codes are
reserved for the future, and the others are nop, swap, compare-and-swap,
and fetch-and-add. SCI provides 4 bits for lock command encoding. Eight
codes are reserved for the future, and the 3 bits that make up the other
8 codes are used to directly control the hardware facilities that would
be required if you implemented all of the above Futurebus+ commands.
With this approach you can generate all the lock permutations that the
hardware could support. In my opinion this goes beyond CISC, since no
one knows how to use most of the lock operations SCI implements.

On your split-transaction RISC bus, how would you do locks?

John Theus			johnt@opus.wv.tek.com
Futurebus+ Parallel Protocol Coordinator
Tektronix, Inc.  Interactive Technologies Div.
- shipping the Futurebus-based XD88 workstations
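The single-request lock scheme described in the last few paragraphs -
the command ships with the request, the responder executes the
read-modify-write atomically and returns the original value as the
response - can be sketched as below. The command names follow the
article; the enum values are illustrative placeholders, not the actual
Futurebus+ 3-bit or SCI 4-bit encodings:

```c
#include <assert.h>

/* Lock command carried in the request transaction. */
enum lock_cmd { LK_NOP, LK_SWAP, LK_CAS, LK_FETCH_ADD };

/* Responder side: one atomic read-modify-write per lock request.
   Reads the addressed location into a temporary buffer, applies the
   command, and returns the original (unmodified) value, as the
   response transaction would. */
int responder_lock(enum lock_cmd cmd, int *addr, int operand, int compare)
{
    int old = *addr;                          /* read into temporary buffer */
    switch (cmd) {
    case LK_SWAP:      *addr = operand;                     break;
    case LK_CAS:       if (old == compare) *addr = operand; break;
    case LK_FETCH_ADD: *addr = old + operand;               break;
    case LK_NOP:                                            break;
    }
    return old;                               /* response to the requester */
}
```

Because the whole read-modify-write happens in one responder action,
no request from another party can land between the read and the write -
the problem the naive two-request semaphore cannot avoid.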