XJELDC@gemini.ldc.lu.se (Jan Engvald LDC) (07/06/90)
>Date: Tue, 19 Jun 90 17:13:43 -0700
>From: Greg Wohletz <greg%duke.cs.unlv.edu@RELAY.CS.NET>
>Subject: strange behaviour involving repeaters
>To: cisco@spot.Colorado.EDU
>Message-id: <9006200655.AA23113@spot.Colorado.EDU>
>
>We are experiencing some very strange problems. First let me draw
>you a diagram of part of our network.
...
> o our cisco has 6 ethernet interfaces, the problem exists on
>   all of them.
>
>So, for some reason the gateway is ignoring, or throwing away most packets that
>have passed through the repeater. However, all our other machines can send
>and receive packets that pass through the repeater without any
>problems.

I don't think the problem is the repeater, but the Cisco. We have the same problems in a setup where we have two MT800s cascaded, to which about 10 Retix bridges, a Dataco remote bridge, a DEC VAX and two Ciscos are connected. Everybody can talk to anybody, except that hosts behind the remote bridge cannot talk through the Cisco. The Cisco decides that 80% of those packets are bad, but an Ethernet monitor and all other equipment say they are OK.

This problem was introduced when we upgraded from old Ethernet cards to the MCI cards. We swapped cards back and forth a couple of times: with the old cards we got 0 lost packets in 100000; with the MCI cards we lost 80% of the packets from the Dataco bridge and 0.1% - 0.2% of those from any of the Retix bridges. (Whenever we have had an error rate above 0.01% it has always been due to some faulty hardware or configuration, so 0.2% is MUCH too high!)

The above problems with MCI cards were reported to Cisco via our Swedish representative in autumn 1989, but Cisco didn't believe it then. Later the problem was recognised, as it has occurred at other places too, and I have been told that there will soon be a firmware fix for the MCI card that will correct it. I have no information on what the MCI card is doing wrong or how that is corrected, though.

For Ethernet transceivers there is dedicated test equipment that can tell you whether the device under test is within allowed limits, but I have not found any such equipment that can test the other end of the transceiver cable, the controller.

Jan Engvald, Lund University Computing Center
________________________________________________________________________
Address:   Box 783
           S-220 07 LUND
           SWEDEN
Office:    Soelvegatan 18
Telephone: +46 46 107458
Telefax:   +46 46 138225
Telex:     33533 LUNIVER S
E-mail:    xjeldc@ldc.lu.se
Earn/Bitnet: xjeldc@seldc52
(Span/Hepnet: Sweden::Gemini::xjeldc)
VAXPSI:    psi%24020031020720::xjeldc
(X.400:    C=se; A=TeDe; P=Sunet; O=lu; OU=ldc; S=Engvald; G=Jan)
"James_W._Morrison.ESAE"@Xerox.COM (07/06/90)
Greg;

Try changing the transceivers on either side of the repeater, one at a time. Sometimes transceivers will break packets when heavily loaded with traffic (especially if they go flaky rather than breaking totally). If changing the transceivers doesn't fix the problem, then the repeater has to be breaking packets, either because it can't handle the traffic load or because it's flaky.

If all else fails, hook an o-scope up to the ethernet using a "T" adapter in series with the terminated end and look for broken packets (packets with an amplitude of more than -2 volts). A broken packet may be mangled throughout, or only in a portion near the end. Disconnect the repeater and see if the broken packet condition still exists. If so, you have a machine or transceiver somewhere on that physical segment that's flaky/broken. Using the buddy system, go and disconnect each machine on that segment one at a time until you find the mangler. Voila: fix this guy and all will be well.

The reason this causes problems on both sides of the repeater is that it looks like an open, and repeaters only pass what they see and don't try to "qualify" any packet, good or broken.

hope this helps,

Jim Morrison
Network Systems Analyst
Xerox Corp. El Segundo, CA.
robelr@bronze.ucs.indiana.edu (Allen Robel) (07/07/90)
>I don't think the problem is the repeater, but the Cisco. We have the same
>problems in a setup where we have two MT800s cascaded, to which about 10

Hmmm. We recently evaluated a product called NQA The Prophet that is basically a Physical/MAC layer LAN analyser. The first thing this product told us was that the MCI on the cisco for this LAN was clocking 3 times too fast. I talked to cisco about this and they requested information on this analyser. After looking over what I had sent them, they did admit that this was a problem with some of their earlier interfaces. I asked them how one could differentiate these interfaces from their newer ones and have since not gotten a response.

Anyway, could this problem manifest itself in the symptoms mentioned above and in earlier notes? As this is a physical layer problem, tools like the Sniffer, LANWatch, etc. wouldn't catch it.

Just a thought.

regards,

Allen Robel                          robelr@bronze.ucs.indiana.edu
University Computing Services        ROBELR@IUJADE.BITNET
Network Research & Planning          voice: (812)855-7171
Indiana University                   FAX: (812)855-8299
hedrick@cs.rutgers.edu (07/07/90)
I'm always suspicious of vendors that claim somebody is sending data "too fast for an Ethernet". You may recall that a number of people claimed that Sun was violating Ethernet specs, when it turned out that their own interfaces and/or software were simply not capable of dealing with high traffic levels. We heard a claim recently from a vendor of fiber Ethernet that MCIs ran "faster than 10Mbps". When we finally traced this down through their technical people, we came to believe the problem was that their equipment simply can't handle packet rates as high as the MCI can.

The 10Mbps speed is a fairly fundamental feature of Ethernet which should be determined by the controller chip (though I guess it probably depends upon something like a crystal to do timing, so we have to assume the cisco designer was competent enough to use the right frequency crystal). Apparently the firmware does determine the minimum interpacket spacing. If you suspect you are dealing with a device that can't take packets as fast as the MCI can generate them, you can always use the "transmitter-delay" interface parameter to insert additional delay.
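To put rough numbers on "high packet rates" and interpacket spacing, here is a minimal sketch of the arithmetic, assuming the standard 10 Mbps Ethernet figures (64-byte minimum frame, 8 bytes of preamble, 96-bit minimum gap). The function name and the `extra_delay_us` parameter are hypothetical; the latter only models some additional per-packet delay in microseconds and is not a statement about the units or behaviour of the cisco "transmitter-delay" setting.

```python
# Illustrative arithmetic only; constants are the usual 10 Mbps Ethernet figures.
BIT_RATE = 10_000_000        # bits per second on the wire
PREAMBLE_BYTES = 8           # preamble plus start-of-frame delimiter
MIN_FRAME_BYTES = 64         # minimum frame size, including the CRC
INTERFRAME_GAP_BITS = 96     # minimum spacing between frames (9.6 microseconds)


def max_frames_per_second(frame_bytes, extra_delay_us=0.0):
    """Upper bound on back-to-back frame rate for a given frame size.

    extra_delay_us is a hypothetical additional delay inserted after
    every frame, in microseconds.
    """
    bits_on_wire = (PREAMBLE_BYTES + frame_bytes) * 8 + INTERFRAME_GAP_BITS
    seconds_per_frame = bits_on_wire / BIT_RATE + extra_delay_us * 1e-6
    return 1.0 / seconds_per_frame


if __name__ == "__main__":
    # Back-to-back minimum-size frames at the minimum legal spacing.
    print(round(max_frames_per_second(MIN_FRAME_BYTES)))
    # Same frames with an assumed extra 20 microseconds after each one.
    print(round(max_frames_per_second(MIN_FRAME_BYTES, extra_delay_us=20.0)))
```

With minimum-size frames the bound is roughly 14,880 frames per second, so a receiver that cannot keep up with that rate may drop packets even though the sender is within spec. Any added delay lowers the sender's peak rate accordingly.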
robelr@bronze.ucs.indiana.edu (Allen Robel) (07/07/90)
>I'm always suspicious of vendors that claim somebody is sending data
>"too fast for an Ethernet". You may recall that a number of people
>claimed that Sun was violating Ethernet specs, when it turns out that

The problem WAS with the crystal cisco was using for timing and it is a problem that cisco has admitted to.

Allen Robel                          robelr@bronze.ucs.indiana.edu
University Computing Services        ROBELR@IUJADE.BITNET
Network Research & Planning          voice: (812)855-7171
Indiana University                   FAX: (812)855-8299
BILLW@mathom.cisco.com (WilliamChops Westfield) (07/07/90)
Ok, here is the complete story.

In early 1989, a batch of MCI boards was built with an incorrect TYPE of crystal. This incorrect crystal caused the wire clocking to run .03% faster than 10 MHz. The Ethernet spec only allows .01% variation, so we were 3x over the variation limit (which is not nearly the same thing as being 3x too fast!). In most cases, the interfaces continued to work just fine, and interoperated with all other devices on the ethernet cable. However, some devices, notably some twisted pair ethernet transceivers, didn't like it.

Most of the MCIs that were causing problems in the field have been fixed. All boards manufactured since May 1989 should have the correct crystals, including all version 3 MCIs. Boards with the wrong crystals still interoperate with most other equipment.

You can identify suspect boards by inspecting the ethernet encoder crystals, which are near the ethernet connectors on the board. The out-of-spec crystals are tiny little things (.2 x .2 x .5 inches or so) labeled "fs200". Anything else is correct.

William Westfield
cisco Engineering.
-------
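The difference between "3x over the variation limit" and "3x too fast" can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted above (10 MHz nominal, .01% allowed, .03% observed); the function and constant names are made up for illustration.

```python
NOMINAL_HZ = 10_000_000       # nominal 10 MHz wire clock
SPEC_TOLERANCE = 0.0001       # .01% allowed variation, as quoted above
BAD_CRYSTAL_OFFSET = 0.0003   # .03% fast, as quoted above


def clock_report(offset_fraction):
    """Summarise a clock that runs offset_fraction away from nominal."""
    actual_hz = NOMINAL_HZ * (1 + offset_fraction)
    return {
        "actual_hz": actual_hz,
        "excess_hz": actual_hz - NOMINAL_HZ,
        # How many times the allowed variation the offset represents.
        "times_tolerance": offset_fraction / SPEC_TOLERANCE,
        # How fast the clock runs relative to nominal (1.0 means exact).
        "speed_ratio": actual_hz / NOMINAL_HZ,
        "within_spec": abs(offset_fraction) <= SPEC_TOLERANCE,
    }


if __name__ == "__main__":
    for key, value in clock_report(BAD_CRYSTAL_OFFSET).items():
        print(key, value)
```

The out-of-spec clock runs about 3 kHz above 10 MHz: three times the permitted variation, but a speed ratio of only about 1.0003, nowhere near 3.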
hedrick@athos.rutgers.edu (Chuck Hedrick) (07/10/90)
One thing to be careful about with MCIs: in normal operation an MCI will report more errors than older interfaces. Collisions cause fragments. I have a feeling that the MCI sees these as individual packets with errors, and older cards don't see them at all. At any rate, on busy networks, our MCIs show .1 to 1% input errors. However, tests with ping show pretty clearly that there are no actual errors occurring. We've tended to ignore input errors unless the rate gets over 1% or there are other symptoms of problems.

This doesn't mean that you have nothing to worry about. I don't know your situation, so I can't tell. But the mere fact that the MCI reports higher error rates than older interfaces does not automatically mean there are problems with the MCIs.
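The rule of thumb above can be written down as a small check. This is a sketch only: the counter names are hypothetical stand-ins for whatever input-error and input-packet counts the interface reports, and the 1% threshold is the one mentioned above, not a general recommendation.

```python
def input_error_rate(input_errors, packets_input):
    """Fraction of received packets that were counted as input errors."""
    if packets_input == 0:
        return 0.0
    return input_errors / packets_input


def worth_investigating(input_errors, packets_input,
                        threshold=0.01, other_symptoms=False):
    """Apply the rule of thumb: look closer only if the reported rate
    exceeds the threshold (1% by default) or something else looks wrong."""
    return other_symptoms or input_error_rate(input_errors, packets_input) > threshold


if __name__ == "__main__":
    # 0.5% reported errors on a busy network, no other symptoms.
    print(worth_investigating(500, 100_000))
    # 2% reported errors.
    print(worth_investigating(2_000, 100_000))
```

A ping test between hosts on either side of the gateway remains the more direct check, since it measures packets actually lost rather than errors the interface chose to count.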