wrd3156@fedeva.UUCP (Bill Daniels) (05/19/88)
Some folks in my organization have been led to believe that a "screaming" modem/transceiver can lock up an ethernet by asserting carrier forever. Supposedly token stuff like GM/MAP does not permit this. I have little knowledge of lans and laning but I can see in the literature that etherneted TCP/IP has about 99.999% of the non-IBM networking in the universe. It just doesn't seem prudent to buck such massive trends. What do you think?

--
bill daniels
federal express, memphis, tn
{hplabs!csun,gatech!emcard}!fedeva!wrd3156
rpw3@amdcad.AMD.COM (Rob Warnock) (05/19/88)
In article <299@fedeva.UUCP> wrd3156@fedeva.UUCP (Bill Daniels) writes:
+---------------
| Some folks in my organization have been led to believe that a "screaming"
| modem/transceiver can lock up an ethernet by asserting carrier forever.
| Supposedly token stuff like GM/MAP does not permit this...
| What do you think?
+---------------

Any piece of hardware can fail. Your token-ring transmitters can go into "screaming" mode, too. But well-designed hardware tries to avoid failing in ways that will take down the whole net. In particular, Ethernet transceivers (see the DEC/Intel/Xerox Ethernet spec) have what is called "jabber control" to prevent exactly this kind of "screaming" (which is usually a controller board fault, b.t.w., not the transceiver itself). The odds that a controller will go berserk *AND* that the jabber control will have failed at the same time are much less than either fault alone. And each fault is itself very rare.

Ethernet (*any* net!) usually suffers much more from:

(1) badly planned cable runs [such that the cable gets continual motion, for example];
(2) poor/broken wires [due to #1, or just rough handling];
(3) badly trained installers;
(4) accidental damage by unrelated maintenance workers [as when changing a fluorescent light];
(5) poor/broken software on the hosts [*sigh*];
(6) the very robustness of upper-level protocols which hide problems from you.

(All of these also affect token systems.) Still, it's a *very* reliable technology.

As an aside, there seem to be a number of token advocates whose major style of promoting token rings is to knock Ethernet, usually with some panic stories about Ethernet "locking up" or "overloading". (From the tone of your question, you have one or more of these in your organization.) Some of this comes from not understanding Ethernet (which is a "controlled CSMA/CD" system, and does not "collapse" like uncontrolled CSMA, or even uncontrolled CSMA/CD), while some comes from politicking a vested interest.

While it is certainly possible to have a badly overloaded Ethernet (witness some of the "broadcast storms" some diskless workstations can get into), it is also just as possible to have a badly overloaded token net. *ANY* shared resource will experience a sharp increase in delay as the average load exceeds 70-85%. (See any basic book on queueing theory.)

There are good and bad points about both Ethernet and token rings. (Just ask about recovering from a lost token... Oops! There I go, doing what I was criticizing... ;-} ;-} ) Any technology needs to be analyzed for its suitability before being used. Token rings have a place in certain constrained process-control environments. But note that even here you can't permit "general timesharing" on the same net as your process-control, or you'll blow your real-time constraints. Conversely, a dedicated Ethernet can be run as a "virtual token bus", and meet essentially the same performance constraints. All such "guarantees", however, assume there will be *NO* data errors, as these completely upset the real-time constraints. (Hence my comment above about lost tokens.)

The major differences in performance between Ethernet and 10 Mbit/sec token rings occur in the very-high-average-load regime, where you never want to design a general-purpose net to run. At reasonable loads (under 70-85% or so), the two technologies are practically identical. Also, token rings do better at the higher data rates (above 50 Mbit/sec) or for geographically very large nets (diameter >2500 meters), regimes where CSMA/CD doesn't work as well (or at all!).

Anyway, Ethernet's there, it's (relatively) cheap (finally!), and everyone from clone-makers to Big Blue supports it. Where it fits, it fits very well indeed...

Rob Warnock
Systems Architecture Consultant

UUCP: {amdcad,fortune,sun,attmail}!redwood!rpw3
ATTmail: !rpw3
DDD: (415)572-2607
USPS: 627 26th Ave, San Mateo, CA 94403
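
The remark above about delay rising sharply past 70-85% load can be illustrated with the simplest textbook queue. What follows is a minimal sketch, assuming an M/M/1 queue (random arrivals, one server); that model is an assumption chosen for illustration and is not a model of Ethernet or of a token ring. The function name `relative_delay` is made up for this example.

```python
# Illustrative M/M/1 queue only; not a model of CSMA/CD or token passing.

def relative_delay(load: float) -> float:
    """Mean time in system divided by mean service time for an M/M/1 queue.

    For M/M/1 this ratio is 1 / (1 - load), where load is the fraction
    of capacity in use.
    """
    if not 0.0 <= load < 1.0:
        raise ValueError("load must satisfy 0 <= load < 1")
    return 1.0 / (1.0 - load)


if __name__ == "__main__":
    for load in (0.10, 0.50, 0.70, 0.85, 0.95):
        ratio = relative_delay(load)
        print(f"load {load:4.0%}  delay = {ratio:5.2f} x service time")
```

Under this assumption the delay is twice the service time at 50% load, about 3.3 times at 70%, about 6.7 times at 85%, and 20 times at 95%, which is the general shape the post describes for any shared resource.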
phil@amdcad.AMD.COM (Phil Ngai) (05/20/88)
In article <21674@amdcad.AMD.COM> rpw3@amdcad.UUCP (Rob Warnock) writes:
>In article <299@fedeva.UUCP> wrd3156@fedeva.UUCP (Bill Daniels) writes:
>| Some folks in my organization have been led to believe that a "screaming"
>| modem/transceiver can lock up an ethernet by asserting carrier forever.
>| Supposedly token stuff like GM/MAP does not permit this...
>| What do you think?
>
>Any piece of hardware can fail. Your token-ring transmitters can go into
>"screaming" mode, too. But well-designed hardware tries to avoid failing in
>ways that will take down the whole net. In particular, Ethernet transceivers
>(see the DEC/Intel/Xerox Ethernet spec) have what is called "jabber control"
>to prevent exactly this kind of "screaming" (which is usually a controller
>board fault, b.t.w., not the transceiver itself). The odds that a controller
>will go berserk *AND* that the jabber control will have failed at the same time
>are much less than either fault alone. And each fault is itself very rare.

In addition to Rob's comments, it should be noted that the jabber control is supposed to be implemented so as to eliminate the chance of the network being overloaded if any one component fails. In one design I saw, the jabber control was replicated three times. Any one of them could shut down the transceiver if carrier were asserted too long.

The jabber control functions independently of the controller or the transmit or receive circuitry. It listens to the trunk cable; there is no possible failure of the transmitter or receiver that could disable it.

Of course, this increases the chance that one node will be cut off if the jabber control activates when it shouldn't. But this is consistent with the philosophy of protecting the network even at the cost of slightly decreased availability for a particular node.

All this notwithstanding, we have hundreds of transceivers in active use at my company and I haven't ever seen one fail.

--
Make Japan the 51st state!
I speak for myself, not the company. Phil Ngai, {ucbvax,decwrl,allegra}!amdcad!phil or phil@amd.com
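
The triple-replicated jabber control described in this post can be sketched as three independent timers whose outputs are ORed together. This is only an illustration of the trade-off, not the circuit from any real transceiver: the names `TimerState`, `JabberTimer`, and `transmitter_inhibited` are invented here, and the 50 ms cutoff is a hypothetical figure rather than a value taken from a specification.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class TimerState(Enum):
    WORKING = "working"  # trips only when carrier lasts too long
    DEAD = "dead"        # has failed and never trips
    STUCK = "stuck"      # has failed and always trips


@dataclass
class JabberTimer:
    state: TimerState = TimerState.WORKING
    limit_ms: float = 50.0  # hypothetical cutoff, not from any spec

    def trips(self, carrier_ms: float) -> bool:
        """True if this timer asks for the transmitter to be cut off."""
        if self.state is TimerState.STUCK:
            return True
        if self.state is TimerState.DEAD:
            return False
        return carrier_ms > self.limit_ms


def transmitter_inhibited(timers: Iterable[JabberTimer], carrier_ms: float) -> bool:
    """Any single timer may shut the transmitter down (logical OR)."""
    return any(timer.trips(carrier_ms) for timer in timers)


if __name__ == "__main__":
    # Two of three timers have died silently; the survivor still protects the net.
    degraded = [JabberTimer(TimerState.DEAD), JabberTimer(TimerState.DEAD), JabberTimer()]
    print(transmitter_inhibited(degraded, carrier_ms=500.0))   # screaming controller
    # One stuck timer cuts off a healthy node sending a normal packet (roughly 1.2 ms).
    false_trip = [JabberTimer(TimerState.STUCK), JabberTimer(), JabberTimer()]
    print(transmitter_inhibited(false_trip, carrier_ms=1.2))
```

The OR arrangement shows both sides of the design choice: the network stays protected as long as one timer works, while a single timer that fails "on" isolates its own node.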
smb@ulysses.homer.nj.att.com (Steven Bellovin) (05/20/88)
This afternoon, we had some sort of network lockup that could have been a two-point failure. Ulysses (a Sun-3/280) suddenly started muttering ``ie1: Ethernet jammed''. The lights on the transceiver showed continuous receive, as if someone were indeed talking continuously. I've only seen something like this once before, and I deliberately applied the technique I discovered accidentally last time: I unterminated the coax. That caused the jabberer to see a collision, and hence to shut up. The network recovered immediately, and everything was able to talk once again.

The particular net in question is a very difficult one to debug. It's our backbone, and consists of a very short segment of coax with lots of repeaters to other segments. Some of those repeaters and transceivers are ancient; our segment may be connected via a 5 or 6 year-old 3Com transceiver. I have no idea which host was misbehaving; it could even have been Ulysses, since the repeater may have isolated such a failing segment.

All that, right on the heels of this discussion (and while a network equipment sales rep was in my office, trying to sell me a network management gizmo!) got me thinking. There is in general *no way to know* if a jabber-detect has failed -- there is no standard diagnostic for it! Thus, the second failure (of a controller) can happen at any time in the future; the two don't have to be coincident. (As has been noted, some jabber circuits are redundant, but not necessarily all of them.)

--Steve Bellovin
ulysses!smb, smb@ulysses.att.com
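
The point about undetected jabber-detect failures can be put in numbers with a small probability sketch. Everything here is an assumption made for illustration: time is cut into equal periods, the jabber circuit fails with probability `p_jabber` per period, the controller starts screaming with probability `p_scream` per period, the two are independent, and the 1% figures are invented. The function name `lockup_probability` is likewise hypothetical.

```python
def lockup_probability(p_jabber: float, p_scream: float, periods: int,
                       tested_each_period: bool) -> float:
    """Chance of at least one network lockup within the given number of periods.

    A lockup needs a screaming controller while the jabber circuit is failed.
    If the circuit is tested (and repaired) every period, a jabber failure
    only matters in the period in which it occurs. If it is never tested,
    the failure stays hidden and waits for the controller fault.
    """
    healthy, hidden, locked = 1.0, 0.0, 0.0
    for _ in range(periods):
        exposed = hidden + healthy * p_jabber   # protection is down this period
        healthy *= 1.0 - p_jabber
        locked += exposed * p_scream            # controller screams, nothing stops it
        survived = exposed * (1.0 - p_scream)
        if tested_each_period:
            healthy += survived                 # fault found and repaired
            hidden = 0.0
        else:
            hidden = survived                   # fault remains latent
    return locked


if __name__ == "__main__":
    for periods in (1, 5, 20):
        tested = lockup_probability(0.01, 0.01, periods, tested_each_period=True)
        latent = lockup_probability(0.01, 0.01, periods, tested_each_period=False)
        print(f"{periods:2d} periods: tested {tested:.6f}  never tested {latent:.6f}")
```

Over a single period the two cases agree, since both faults must then coincide. Over many periods the never-tested case grows much faster, which is the argument for having some diagnostic for the protection circuit.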
kwe@bu-cs.BU.EDU (kwe@bu-it.bu.edu (Kent W. England)) (05/20/88)
In article <299@fedeva.UUCP> wrd3156@fedeva.UUCP (Bill Daniels) writes:
>Some folks in my organization have been led to believe that a "screaming"
>modem/transceiver can lock up an ethernet by asserting carrier forever.
>Supposedly token stuff like GM/MAP does not permit this.

A screaming baseband transceiver can take down a single Ethernet segment. A screaming broadband modem can take down a broadband CATV network. MAP runs on a broadband network, using token bus.

A broadcast medium like Ethernet or broadband CATV can be disabled by hardware failures in transmitters. This is independent of the medium acquisition methodology. MAP and Ethernet/802.3 are equivalent in this respect.

Kent England, Boston University
kwe@bu-cs.BU.EDU (kwe@bu-it.bu.edu (Kent W. England)) (05/20/88)
In article <10303@ulysses.homer.nj.att.com> smb@ulysses.homer.nj.att.com (Steven Bellovin) writes:
>This afternoon, we had some sort of network lockup that could have been
>a two-point failure. Ulysses (a Sun-3/280) suddenly started muttering
>``ie1: Ethernet jammed''.
>
>The particular net in question is a very difficult one to debug. It's
>our backbone, and consists of a very short segment of coax with lots of
>repeaters to other segments. Some of those repeaters and transceivers
>are ancient; our segment may be connected via a 5 or 6 year-old 3Com
>transceiver. I have no idea which host was misbehaving; it could even
>have been Ulysses, since the repeater may have isolated such a failing
>segment.
>

Old transceivers may not implement jabber control. Old repeaters do not provide fault isolation.

For new equipment, specify 802.3 compliance and test compliance. Transceivers should implement jabber control, and new repeaters should implement the new IEEE 802.3 repeater specification, which provides a degree of fault isolation and should interrupt repeating of jabbering [illegal] signals.

I believe all implementations of the multiport repeater follow the new 802.3 repeater rules. I think all new implementations of Ethernet concentrators (a la the new twisted pair concentrators) should implement the new rules.

Kent England, Boston U
phil@amdcad.AMD.COM (Phil Ngai) (05/21/88)
In article <10303@ulysses.homer.nj.att.com> smb@ulysses.homer.nj.att.com (Steven Bellovin) writes:
>There is in general *no way to
>know* if a jabber-detect has failed -- there is no standard diagnostic
>for it! Thus, the second failure (of a controller) can happen at any
>time in the future; the two don't have to be coincident.

This is a very important principle. I call it the "testing your spare tire" policy. Redundancy without an alarm to notify you when it has been invoked is very dangerous. I tend to think of things like this as belonging under network management.

The designers of Ethernet were concerned about this. That is why the version 2 has the "heartbeat" or collision presence test at the end of every packet. Unfortunately, jabber detect is not automated like this. There are transceiver testers that can be used for this. Either Cabletron or Titn made one that had such a test; unfortunately, I looked at this two years ago and don't remember which one. In any case, you'd have to manually go out and hook up the transceiver tester to check the jabber detect.

--
Make Japan the 51st state! I speak for myself, not the company.
Phil Ngai, {ucbvax,decwrl,allegra}!amdcad!phil or phil@amd.com
eshop@saturn.ucsc.edu (Jim Warner) (05/21/88)
In article <21695@amdcad.AMD.COM> phil@amdcad.UUCP (Phil Ngai) writes:
>
>This is a very important principle. I call it the "testing your
>spare tire" policy. Redundancy without an alarm to notify you
>when it has been invoked is very dangerous.
>

Not quite. If a transceiver's jabber circuit causes it to disconnect, it will stay disconnected until either (a) the power is cycled or (b) it is explicitly reset over control lines that are not implemented in any products I know of. Nothing more will be transmitted until the fault is cleared by (a) or (b). I'd say that's pretty good notification that the fail-safe has tripped.

>The designers of Ethernet were concerned about this. That is why
>the version 2 has the "heartbeat" or collision presence test
>at the end of every packet.

I have a transceiver cable breakout box. I used it on several systems to disconnect the collision pair between them and their transceivers. I expected to see messages start appearing on the console. What I got instead was silence. The OS was never notified. I did this to a 3com 3C501 and ran the self test that came on a diskette with the interface. It told me that my system "passed with flying colors."

jim warner
smb@ulysses.homer.nj.att.com (Steven Bellovin) (05/21/88)
The whole discussion about Ethernets locking up raises another issue: what do folks use for network control, monitoring, and management? We have a moderately complex topology: one building wired with thinwire Ethernet according to the DECconnect wiring scheme (which I described recently in a long posting to comp.sys.sun), linked by LANbridges and fiber transceivers to another building; it in turn has an IP-level gateway to two other Ethernets, one of which is a multi-organization backbone. There are assorted other links as well, using varying technologies. The question is this: what can we buy/build to monitor this? I'm especially concerned about the Ethernets; I'm interested in routine monitoring for collision rates, plus fault isolation in case of network meltdowns, collision storms, etc. Presumably we need a TDR (and baseline photographs of each segment); what else do we need? Are etherfind(1) and traffic(1) on our Suns sufficient? How about an Excelan Lanalyzer, or a Cabletron LAN MD or LAN SPECIALIST? Does anyone have any experience with any of those products? --Steve Bellovin ulysses!smb smb@ulysses.att.com
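
For the routine collision-rate monitoring asked about here, one low-tech approach is to sample the cumulative per-interface counters a host already keeps and work from the differences. The sketch below assumes two such samples are available; how they are collected is left out, and the names `Sample` and `collision_rate` are made up for this example.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    packets_out: int  # cumulative output packets on the interface
    collisions: int   # cumulative collisions seen by the interface


def collision_rate(earlier: Sample, later: Sample) -> float:
    """Collisions per output packet over the interval between two samples."""
    sent = later.packets_out - earlier.packets_out
    if sent <= 0:
        return 0.0  # idle interval (or counters were reset): nothing to report
    return (later.collisions - earlier.collisions) / sent


if __name__ == "__main__":
    before = Sample(packets_out=1000, collisions=10)
    after = Sample(packets_out=3000, collisions=110)
    print(f"collision rate: {collision_rate(before, after):.1%}")
```

Working from differences rather than totals matters because the totals since boot smooth over exactly the short bursts that a collision storm produces.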
phil@amdcad.AMD.COM (Phil Ngai) (05/22/88)
In article <3366@saturn.ucsc.edu> eshop@saturn.ucsc.edu (Jim Warner) writes:
>In article <21695@amdcad.AMD.COM> phil@amdcad.UUCP (Phil Ngai) writes:
..This is a very important principle. I call it the "testing your
..spare tire" policy. Redundancy without an alarm to notify you
..when it has been invoked is very dangerous.
..
.Not quite. If a transceiver's jabber circuit causes it to disconnect
.it will stay disconnected until either (a) the power is cycled or
.(b) it is explicitly reset over control lines that are not implemented

Sorry about that, my brain was going faster than my fingers at that point. There are two concerns. First, you want a way to test backup features which are normally rarely put in use. Second, you want to be notified when a backup feature is put in use.

An example of the first is the heartbeat signal for the transceiver's collision detect function. There is nothing similar for jabber.

An example of the second would be for the OS to complain if it wasn't receiving heartbeat. As you have noted, jabber is very noticeable when it trips, so that is not a problem.

.I have a transceiver cable breakout box. I used it on several
.systems to disconnect the collision pair between several systems
.and their transceivers. I expected to see messages start appearing
.on the console. What I got instead was silence. The OS was never
.notified. I did this to a 3com 3C501 and ran the self test that
.came on a diskette with the interface. It told me that my system
."passed with flying colors."

I think that says something about 3Com.

--
I speak for myself, not the company.
Phil Ngai, {ucbvax,decwrl,allegra}!amdcad!phil or phil@amd.com
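
The second concern, having the OS complain when heartbeat stops arriving, could look something like the following in driver logic. This is a sketch under assumptions, not code from any real driver: it supposes the interface can report, after each transmitted packet, whether the heartbeat pulse was seen, and the class name `HeartbeatMonitor` and the threshold of three packets are invented for illustration.

```python
from typing import Optional


class HeartbeatMonitor:
    """Counts consecutive transmissions that were not followed by a heartbeat."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold  # hypothetical: misses tolerated before warning
        self.missed = 0

    def after_transmit(self, heartbeat_seen: bool) -> Optional[str]:
        """Call once per transmitted packet; returns a warning or None."""
        if heartbeat_seen:
            self.missed = 0
            return None
        self.missed += 1
        if self.missed == self.threshold:
            # Warn once when the threshold is reached, not on every packet.
            return f"no heartbeat after {self.missed} packets: check collision pair"
        return None


if __name__ == "__main__":
    monitor = HeartbeatMonitor()
    for seen in (True, False, False, False, False):
        warning = monitor.after_transmit(seen)
        if warning:
            print(warning)
```

Requiring several misses in a row before warning is a design choice to avoid noise from a single glitch; a disconnected collision pair, as in the breakout-box experiment, would reach the threshold within a few packets.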
ron@topaz.rutgers.edu (Ron Natalie) (05/25/88)
Broken devices can blow nearly all networks. The continuously jabbering transceiver on Ethernet is subject to the same problem as broken RF modems on MAP broadbands. The fact that one uses token passing rather than carrier sense to arbitrate the bus does not help when some host decides not to play by the rules. This whole thing has nothing whatsoever to do with TCP/IP or the higher level protocols; it's a media issue.

-Ron
ron@topaz.rutgers.edu (Ron Natalie) (05/25/88)
Well, transceivers do fail. A jabbering transceiver is very, very hard to find. More frequently what fails is the vampire tap, which can be found with TDR, but finding a malicious transceiver is very hard. BRL had one that would intermittently transmit for about 15 seconds straight. This only became really noticeable as the Braindamaged microcode in one of our Ethernet blew up when they couldn't get on the net for seven seconds.

-Ron