bob@pirates.UUCP (Bob Fawcett) (12/04/90)
I would like to know what can cause excessive collisions on Ethernet. I realize that collisions are a fact of life on Ethernet. My net analyzer shows a large number of local collisions (as high as 20%). I can't isolate it to just one or two machines, and I don't have any one machine that is completely non-functional. In other words, I haven't nailed it down to a problem with one machine (bad card or ?). What other things should I look for? Ground problems? My cable scanner doesn't show any bad segments of cable.

Most of the machines are on Thinnet. There is fiber between buildings. Total of about 150 stations campus-wide.

Any ideas? Thanks in advance.

Bob Fawcett
Director, Academic Computing
Armstrong State College
bob@pirates.uucp
dave@monu6.cc.monash.edu.au (Dave Schwarz) (12/05/90)
In article <467@pirates.UUCP>, bob@pirates.UUCP (Bob Fawcett) writes:
> I would like to know what can cause excessive collisions on Ethernet. I
> realize that collisions are a fact of life on ethernet.
> My net analyzer shows a large number of local collisions (as high as 20%).
> I can't isolate to just one or two machines. I also don't have any one
> machine which is completely non functional. In other words I haven't
> nailed it down to a problem with one machine (bad card or ?). What other
> things should I look for? Ground problems? My cable scanner doesn't show
> any bad segments of cable.
> Most of the machines are on Thinnet. There is fiber between buildings.
> Total of about 150 stations campus wide.
>
> Any Ideas?

Try looking at any AUI cables you may have; we just had a very similar problem with one of our subnets. The transfer rate had dropped from approx 200 Kbytes/sec to 20 Kbytes/sec. The problem only appeared when we swapped a DEMPR for a DELNI (two DEC products); both were driving a 20 m piece of AUI cable connected to a F/O repeater. When we swapped we got the reduction in bandwidth. After much checking we discovered that the cable was too long; we shortened it back to 5 meters and it worked. It seems that the DELNI didn't have enough oomph to drive long runs of cable.

dave....
--
Dave Schwarz @ Monash Uni Caulfield Campus        This space now for hire
900 Dandynong Rd, East Caulfield, Vic, Australia  (and I know that DANDENONG
dave@monu6.cc.monash.edu.au                        doesn't have a Y in it !)
Dave@banyan.cc.monash.edu.au
Dave@vx24.cc.monash.edu.au (Yuk a vax)
gaj@hpctdja.HP.COM (Gordon Jensen) (12/06/90)
>I would like to know what can cause excessive collisions on Ethernet. I
>realize that collisions are a fact of life on ethernet.
>My net analyzer shows a large number of local collisions (as high as 20%).
>I can't isolate to just one or two machines. I also don't have any one
>machine which is completely non functional. In other words I haven't
>nailed it down to a problem with one machine (bad card or ?). What other
>things should I look for? Ground problems? My cable scanner doesn't show
>any bad segments of cable.
>Bob Fawcett

First thing to check for is grounding problems, especially on your ThinLAN. If there are two grounds on a segment, 60 Hz current can flow in the shield. Enough IR drop can occur to trigger a repeater's carrier-sense circuit. Since there is no signal to lock to, repeaters that I've seen just source 10 MHz out all ports. This is bad. Extra grounds can occur when the T connector isn't covered with its prophylactic.

A quick test is to throw an *analog* scope on the cable and sync to line, with the timebase set to show multiple cycles of 60 Hz.

Good luck,
Gordon
iiitih@cybaswan.UUCP (Ivan Izikowitz) (12/06/90)
If you don't think you have a faulty card, then you probably have a heavily loaded network. Just take a look at any of the published performance curves for the 802.3 protocol - throughput is severely degraded once the offered load exceeds a certain value (I think about 40% of channel capacity?)

Ivan @ The Institute for Industrial Information Technology, Innovation Centre
Swansea SA2 8PP
Phone: (+44) 792 295213 | JANET: iiitih@uk.ac.swan.pyr
Fax:   (+44) 792 295532 | UUCP:  ..!ukc!cybaswan.UUCP!iiitih
john@newave.UUCP (John A. Weeks III) (12/07/90)
In article <467@pirates.UUCP> bob@pirates.UUCP (Bob Fawcett) writes:
> I would like to know what can cause excessive collisions on Ethernet. I
> realize that collisions are a fact of life on ethernet.
> What other things should I look for? Ground problems? My cable scanner
> doesn't show any bad segments of cable.

I was recently fighting a problem like this. It turned out to be a bad connection between a transceiver and a drop cable to a router. Unknown to me at the time, the number of retries was very high. This extra traffic led to excessive collisions.

-john-
--
===============================================================================
John A. Weeks III        (612) 942-6969                     john@newave.mn.org
NeWave Communications            ...uunet!rosevax!tcnet!wd0gol!newave!john
===============================================================================
spurgeon@.uucp (Charles E. Spurgeon) (12/12/90)
In article <2184@cybaswan.UUCP> iiitih@cybaswan.UUCP (Ivan Izikowitz) writes:
>If you don't think you have a faulty card, then you probably have a
>heavily loaded network. Just take a look at any of the published
>performance curves for the 802.3 protocol - throughput is severely
>degraded once the offered load exceeds a certain value (I think about
>40% of channel capacity?)

I think that the 40% figure you refer to comes from simulations of "Ethernet" that don't happen to reflect the real Ethernet protocol all that well. Ethernet traffic tends to be bursty, and one-second samples of traffic showing 40% utilization would be nothing to get worried about. Even a constant load of 40% on an Ethernet (a situation that is unusual) would still not be all that big a deal.

For the empirical evidence as to Ethernet's ability to move data, see the SIGCOMM paper presented a couple of years back. Here's the access info from the Network Manager's Reading List:

The following technical report from the Digital Equipment Corporation's Western Research Lab documents empirical evidence showing that the 10 megabit Ethernet system is capable of transmitting large amounts of data in a reliable fashion. The report is also useful for its analysis of what makes a good Ethernet implementation. Included is a brief set of guidelines for the network manager who wants their Ethernet system to run as well as possible.

o Measured Capacity of an Ethernet: Myths and Reality
  David R. Boggs, Jeffrey C. Mogul, Christopher A. Kent.
  Proceedings of the SIGCOMM '88 Symposium on Communications Architectures
  and Protocols, ACM SIGCOMM, Stanford, CA., August 1988, 31 pps.

From the Abstract: "Ethernet, a 10 Mbit/sec CSMA/CD network, is one of the most successful LAN technologies. Considerable confusion exists as to the actual capacity of an Ethernet, especially since some of the theoretical studies have examined operating regimes that are not characteristic of actual networks.
Based on measurements of an actual implementation, we show that for a wide class of applications, Ethernet is capable of carrying its nominal bandwidth of useful traffic, and allocates the bandwidth fairly."

This paper is also available over the Internet via electronic mail from the DEC Western Research archive server. Send a message to the following address with the word "help" in the Subject line of the message for detailed instructions. The address is WRL-Techreports@decwrl.dec.com.

You may also request a copy of the report through the U.S. postal system by writing to:

Technical Report Distribution
DEC Western Research Laboratory, UCO-4
100 Hamilton Avenue
Palo Alto, California 94301
henry@zoo.toronto.edu (Henry Spencer) (12/12/90)
In article <2184@cybaswan.UUCP> iiitih@cybaswan.UUCP (Ivan Izikowitz) writes:
>... Just take a look at any of the published
>performance curves for the 802.3 protocol - throughput is severely
>degraded once the offered load exceeds a certain value (I think about
>40% of channel capacity?)

What published performance curves for what protocol? The throughput of 802.3, aka Ethernet, is monotonically increasing as load increases. There is no "severe degradation". Even under massive overload it continues to move data, although collisions limit it to something like 70% of the theoretical channel capacity under those conditions. (Note, this assumes multiple sources of traffic. A single source of traffic can run an Ethernet at circa 100% of theoretical, so 70% is down somewhat compared to that ideal state.)

Many of the early simulation studies of "Ethernet" were actually studying different protocols with inferior performance, either because the folks involved thought they could "improve" Ethernet or because they didn't understand it very well to begin with (often both). The numbers and curves from those studies are completely irrelevant to real Ethernet, although myths derived from them are persistent among Ethernet's detractors.
--
"The average pointer, statistically,    |Henry Spencer at U of Toronto Zoology
points somewhere in X." -Hugh Redelmeier| henry@zoo.toronto.edu   utzoo!henry
cornutt@freedom.msfc.nasa.gov (David Cornutt) (12/15/90)
The percent utilization of the channel capacity of an Ethernet (or any CSMA-type network) depends not so much on the total volume of traffic as on the number of nodes that have traffic ready to transmit simultaneously. As Henry Spencer noted in a previous article, an Ethernet with only one node transmitting can get pretty close to 100% utilization. (Such situations do occur; we have an application here where there may be about 70 nodes on a net, but only one or two nodes generating the lion's share of the traffic.) The limiting factor is the probability of getting a collision, which is roughly proportional to the number of nodes that are generating large amounts of traffic.

There is a derivation in the Tanenbaum book (*Computer Networks*, second edition, Prentice Hall, 1988) which can be expanded to show the expected utilization for n nodes transmitting (or attempting to) simultaneously. The theoretical worst case occurs at about n = 100, where the channel utilization is down to about 37%. (I have seen values close to this in a campus network that I once worked on.) In practice, an Ethernet starts to break down at this point as controllers begin giving up due to exceeding their max retry settings.

There are ways to make CSMA networks get better utilization under these conditions by introducing a random go/no-go decision into retransmissions. A node attempting to retransmit picks a random number such that it has an x% chance of attempting the retransmit; if the random draw loses, the node does not attempt retransmission but backs off again. The lower x gets, the better the overall channel utilization gets. The tradeoff is that the average latency for individual packets becomes very long as x decreases, which is why you don't see many commercial implementations of this scheme.

Of course, none of the above figures take into account the overhead introduced by upper-layer protocols.
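The 1/e limit behind that 37% figure is easy to check numerically. Here is a small Python sketch (my addition, not from the post): if n ready stations each transmit in a contention slot with probability 1/n, the chance that exactly one of them succeeds is n * (1/n) * (1 - 1/n)^(n-1), which falls toward 1/e ~= 0.37 as n grows.

```python
import math

def success_prob(n, p):
    """Probability that exactly one of n ready stations transmits in a slot."""
    return n * p * (1 - p) ** (n - 1)

for n in (2, 10, 100):
    # Each station transmits with probability 1/n, the per-slot optimum.
    print(n, round(success_prob(n, 1.0 / n), 3))

print(round(1 / math.e, 3))  # the limiting value, about 0.368
```

For n = 100 the per-slot success probability is already within a percent of 1/e, which is where the "about 37%" worst-case utilization for minimal-length packets comes from.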
--
David Cornutt, New Technology Inc., Huntsville, AL  (205) 461-6457
(cornutt@freedom.msfc.nasa.gov; some insane route applies)
"The opinions expressed herein are not necessarily those of my employer,
not necessarily mine, and probably not necessary."
mart@csri.toronto.edu (Mart Molle) (12/15/90)
In article <1990Dec14.191255.20529@freedom.msfc.nasa.gov> cornutt@freedom.msfc.nasa.gov (David Cornutt) writes:
>The percent utilization of the channel capacity of an Ethernet (or any
>CSMA-type network) depends not so much on the total volume of traffic as
>on the number of nodes that have traffic ready to transmit simultaneously.
>As Henry Spencer noted in a previous article, an Ethernet with only one
>node transmitting can get pretty close to 100% utilization. (Such
>situations do occur; we have an application here where there may be about
>70 nodes on a net, but only one or two nodes generating the lion's share of
>the traffic.) The limiting factor is the probability of getting a
>collision, which is roughly proportional to the number of nodes that are
>generating large amounts of traffic. There is a derivation in the
>Tanenbaum book (*Computer Networks*, second edition, Prentice Hall, 1988)
>which can be expanded to show the expected utilization for n number of
>nodes transmitting (or attempting to) simultaneously. The theoretical
>worst case occurs about n = 100, where the channel utilization is down to
>about 37%. (I have seen values close to this in a campus network that I
>once worked on.) In practice, an Ethernet starts to break down at this
>point as controllers begin giving up due to exceeding their max retry
>settings.

Pay no attention to Tanenbaum's calculation of Ethernet throughput. It is a gross simplification, based on a model first put forward by Metcalfe and Boggs in 1976 that assumes slotted operation, a ``global queue in the sky'' backoff algorithm, etc. Also, you've neglected to include their full model, which includes the effect of packet lengths. Basically, this model says the channel consists of a repeating pattern of ``cycles'', each of which consists of a run of [short] ``wasted'' slots (whose analysis is assumed to be the same as slotted Aloha) followed by a single [long] ``useful'' slot.
The 37% figure you quote above is for minimal-length packets, where ``useful'' slots are the same size as ``wasted'' slots and the whole thing degenerates into slotted Aloha. If you make the ``useful'' slots bigger (i.e., you put a non-trivial amount of data into each packet), then the model predicts much higher attainable throughputs. For example, if the ``useful'' slots are 1/a times longer than the ``wasted'' slots, the capacity is about 1/(1 + a*(e-1)), where e is the base of the natural logarithm and 1/e is the capacity of slotted Aloha.

If you want a more accurate analysis of the throughput for the unslotted 1-persistent CSMA/CD used in Ethernet, go read the articles by Sohraby, Molle and Venetsanopoulos, and by Takagi and Kleinrock, in IEEE Transactions on Communications, February 1987. (BTW, neither paper appears in the widely referenced ``Myths and Reality'' paper from Sigcomm 88, which instead cites an earlier paper by Takagi and Kleinrock that gave totally wrong answers due to errors in the analysis....) These papers show that CSMA/CD can get very high channel efficiencies even in the limit of infinitely many active stations. However, they fail to include the truly bizarre influences of the truncated binary exponential backoff algorithm used on Ethernet and thus are not the last word on the subject.

>There are ways to make CSMA networks get better utilization under these
>conditions by introducing a random go/no-go decision into retransmissions.
[Description of p-persistent CSMA deleted]

There are lots of other CSMA protocols in the world that look better than Ethernet. I don't think p-persistent stands out in any way in this group. Obviously, there are other reasons (like compatibility) that make people stick with the standard...

Mart L. Molle
Computer Systems Research Institute
University of Toronto
Toronto, Canada M5S 1A4
(416) 978-4928
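The closed form quoted above is easy to tabulate. This short Python sketch (my addition, using the 1/(1 + a*(e-1)) approximation as stated in the post) shows how predicted capacity climbs as packets get longer relative to a contention slot:

```python
import math

def mb_capacity(a):
    """Metcalfe-Boggs-style capacity estimate.

    a is the ratio of a contention ("wasted") slot to a packet
    ("useful") slot; a = 1 degenerates to slotted Aloha (1/e).
    """
    return 1.0 / (1.0 + a * (math.e - 1.0))

# Small a means long packets: capacity approaches 1.
for a in (1.0, 0.1, 0.04, 0.01):
    print(f"a = {a}: capacity ~= {mb_capacity(a):.3f}")
```

With a = 1 the estimate is exactly 1/e (about 0.368), and for packets 100 times the slot length it is above 98%, matching the claim that longer packets yield much higher attainable throughput.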
lws@comm.wang.com (Lyle Seaman) (12/21/90)
iiitih@cybaswan.UUCP (Ivan Izikowitz) writes:
>If you don't think you have a faulty card, then you probably have a
>heavily loaded network. Just take a look at any of the published

Right.

>performance curves for the 802.3 protocol - throughput is severely
>degraded once the offered load exceeds a certain value (I think about
>40% of channel capacity?)

Rong. Yeah, a lot of the papers say that, but they're talking about the entire load evenly distributed over the entire network. And even then, they cite figures of 65% of capacity. I routinely see usage approaching 90% with very little trouble (admittedly, an unusual LAN configuration, as well).
--
Lyle                 Wang            lws@capybara.comm.wang.com
508 967 2322         Lowell, MA, USA
Source code: the _ultimate_ documentation.
pcg@cs.aber.ac.uk (Piercarlo Grandi) (12/22/90)
On 14 Dec 90 21:33:57 GMT, mart@csri.toronto.edu (Mart Molle) said:
mart> In article <1990Dec14.191255.20529@freedom.msfc.nasa.gov>
mart> cornutt@freedom.msfc.nasa.gov (David Cornutt) writes:
cornutt> The percent utilization of the channel capacity of an Ethernet
cornutt> (or any CSMA-type network) depends not so much on the total
cornutt> volume of traffic as on the number of nodes that have traffic
cornutt> ready to transmit simultaneously. [ ... look at Tanebaum's book
cornutt> and see that ... ] The theoretical worst case occurs about n =
cornutt> 100, where the channel utilization is down to about 37%.
mart> Pay no attention to Tanenbaum's calculation of Ethernet
mart> throughput. It is a gross simplification, [ ... ] For example if
mart> the ``useful'' slots are 1/a times longer than ``wasted'' slots,
mart> the capacity is about 1/ (1 + a * (e-1)), where e is the base of
mart> the natural logarithm and 1/e is the capacity of slotted Aloha.
mart> If you want a more accurate analysis of the throughput for
mart> unslotted 1-persistent CSMA/CD used in Ethernet, go read the
mart> articles by Sohraby, Molle and Venetsanopoulos, and by Takagi and
mart> Kleinrock, in IEEE Transactions on Communications, February 1987.
mart> [ ... ] These papers show that CSMA/CD can get very high channel
mart> efficiencies even in the limit of infinitely many active stations.
But this is just the utilization factor of the channel. Okay, Ethernet is
not slotted Aloha, and it can get very high utilization factors, basically
because latency is very small, the abort on a collision is nearly
instantaneous, and the retry by another station has a good chance of
success.
However, what about delay? The medium gets near its rated throughput, but
the average station will have to wait a pretty long time, retrying quite a
bit. Suppose we have 100 stations on the net: each of them, if efficiency
is 100%, will get about 12 KB per second of bandwidth, 1% of the total,
and will wait (assuming equal-sized packets) 99% of the time for a chance
to send its packet, by waiting for silence on the wire or for retransmit
timeouts. Even with 10 stations, say communicating in pairs, things are
fairly bleak.
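The arithmetic here can be sanity-checked in a couple of lines of Python (my sketch, with assumed round numbers rather than anything from the post):

```python
# Back-of-envelope share of a fully utilized 10 Mbit/s Ethernet split
# evenly among 100 equally busy stations (assumed numbers, for illustration).
bits_per_second = 10_000_000
stations = 100

per_station_bytes = bits_per_second / 8 / stations
print(per_station_bytes)   # 12500.0 bytes/s: roughly 12 KB/s per station
print(1 - 1 / stations)    # 0.99: fraction of time each station spends waiting
```

So even at the ideal 100% channel efficiency, each station's share is on the order of 12 KB/s, and it spends the other 99% of the time deferring or backing off.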
How do we square the model that says that we should get near optimal
efficiency with the reality that we do not get it? A hint is given by
comparing:
cornutt> (I have seen values close to this in a campus network that I
cornutt> once worked on.) In practice, an Ethernet starts to break down
cornutt> at this point as controllers begin giving up due to exceeding
cornutt> their max retry settings.
mart> However, they fail to include the truly bizarre influences of the
mart> truncated binary exponential backoff algorithm used on Ethernet
mart> and thus are not the last word on the subject.
In practice, even if the medium efficiency is high, the big problem is
that the network interfaces are bad: they are not fast enough. If you
take into account the limitations of network interfaces, the picture
changes suddenly, in particular for rooted communication patterns, in
which a lot of the traffic goes to a single interface, which gets
swamped.
If the idea that Ethernet-the-wire-and-protocol can achieve 100%
efficiency (but with long and variable delays) is true, and I think this
is now established, it is interesting but not very relevant, because the
real bottleneck is the network interface, and those are usually quite
horrid.
--
Piercarlo Grandi | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk
rauletta@gmuvax2.gmu.edu (R. J. Auletta) (04/15/91)
We have been having some problems with an ethernet installation that our Computer Network Services organization seems unwilling to resolve. I am looking for some insight from those who might have a sense of whether what we are seeing is normal.

The problem is characterized as follows.

1) Interactive sessions (typing etc.) tend to get periodically interrupted every couple of seconds for a tenth of a second or more (the echo time becomes longer than the time to type a 5-10 character word). This seems directly related to any burst of ethernet traffic over about 10,000 bytes/sec as reported by etherd on a Sun. When the ethernet load is low, the ethernet is very responsive.

2) The following indications appear on an American Network Connections ANC-80 8-port 802.3 fanout transceiver while the problem is present:

RCV light intermittently active
REM COL blinks as the RCV goes out [every time]. (Remote Collision)
LOC COL off
TRVR PRES light is dimly on. (?)
SQE is off.

3) Traffic on the ethernet (as reported by etherd on a Sun workstation) is about 25K-75K bytes per second, running about 50-150 packets per second, when (1) is observed. Most of the traffic is between just two Sun workstations.

4) Running netx (a tcp exerciser) on a Vax3600 to a VS2000, showing a network load of about 10%, shows almost continuous collisions even when only the two machines are active on the network. (Every blink of the RCV light on the fanout unit results in the REM COL blinking; the LOC COL stays off.)

Is this normal? At the described load would one expect to experience the poor interactive response that ethernets are known for? Might this be due to the "supposed" problem with Sun's interpretation of the ethernet standard in regard to back-to-back packets?

The general form of the ethernet is a thick riser with one thinnet transceiver and several AUI transceivers, with a bridge to a fiber-optic segment.
What I am looking for are some suggestions as to what we might look for to isolate the problem (such as "this sounds like a noise problem", or "excessive reflections", or "load is just too high").

Characterized but still confused,

R J Auletta
rauletta@sitevax.gmu.edu
andrew@jhereg.osa.com (Andrew C. Esh) (04/16/91)
In article <4150@gmuvax2.gmu.edu> rauletta@gmuvax2.gmu.edu (R. J. Auletta) writes:
>We have been having some problems with an ethernet installation
>that our Computer Network Services organization seems unwilling
>to resolve. I am looking for some insight from those who might
>have a sense of whether what we are seeing is normal.

Unwilling to resolve? Pardon me, but this sounds like an attitude problem. It's their job to resolve just this sort of thing. Maybe they are stuck, and can't think of an approach. Keep at them.

>The problem is characterized as follows.
>
>1) Interactive sessions (typing etc) tends to get periodically
>interrupted every couple of seconds for a tenth of a second or
>more (the echo time becomes longer than the time to type a 5-10 character word.)
>(This seems directly related to any burst of ethernet traffic
>over about 10,000 bytes/sec as reported by etherd on a Sun.)
>(When the ethernet load is low, the ethernet is very
>responsive.)
>
>2) The following indications on an American Network Connections
>ANC-80 8-port fanout transceiver 802.3 while the problem is present.
>
>RCV light intermittently active
>REM COL blinks as the RCV goes out [everytime]. (Remote Collision)
>LOC COL off
>TRVR PRES light is dimly on. (?)
>SQE is off.

Collisions! There's most of it right there!

>3) Traffic on the ethernet is (as reported by etherd on a
>Sun workstation) about 25K-75K bytes per second running
>about 50-150 packets per second when (1) is observed.
>Most of the traffic is between just two Sun workstations.
>
>4) Running netx (a tcp exerciser) on a Vax3600 to a VS2000
>showing a network load of about 10% shows almost continuous collisions even
>when only the two machines are active on the network.
>(Every blink of the RCV light on the fanout unit results in the REM COL
>blinking, the LOC COL stays off.)
>
>Is this normal? At the described load would one expect to experience
>poor interactive response that ethernets are known for?
>Might this be due to the "supposed" problem with Sun's interpretation
>of the ethernet standard in regards to back to back packets?
>
>The general form of the ethernet is a thick riser with one thinnet
>transceiver and several AUI transceivers with a bridge to
>a fiber-optic segment.
>
>What I am looking for are some suggestions as to what we might
>look for to isolate the problem (such as "this sounds like a noise
>problem", or "excessive reflections", or "load is just too high").
>
>R J Auletta
>rauletta@sitevax.gmu.edu

I would suggest checking everything between the ANC-80 and the main backbone (or whatever it connects to). I would concentrate on the cable, but check transceivers too. If you can get a cable scanner or a TDR, that will probably show you whether the cable is bad. Check the end connectors, and try it with a VOM, testing for a short between the shield and the conductor. Wiggle and twist the ends as you do this test, since it could be intermittent. Also find a convenient ground and see if there is any voltage potential between the shield and ground. A ground-faulted shield makes sending a signal down the wire like trying to blow a marble through a garden hose full of holes.

Bad cable will usually give you the kind of reflections that cause the collisions you seem to be seeing. Even mediocre cable will run a 20% load with less than one collision per second.

Also, could you say more about this ANC-80 thing? Is it a Multiport Repeater, or a Bridge, or what? 8-port fanout transceiver? Not sure what that might be.
--
Andrew C. Esh                    andrew@osa.com
Open Systems Architects, Inc.    Mpls, MN 55416-1528
(612) 525-0000
Punch down, turn around, do a little crimpin'
Punch down, turn around, plug it in and go ...
brian@telebit.com (Brian Lloyd) (04/16/91)
SQE simply asserts the collision signal momentarily during the interpacket gap. This is to let the interface know that the transceiver is still alive. Normally the interface ignores SQE because it sees it at a particular time immediately following the transmission of a packet, but when you plug the transceiver into a fanout box, everyone sees the SQE/collision signal and interprets it as a remote collision because they didn't just send a packet. I personally find the behavior of SQE to be annoying for this and other reasons.

Turn off SQE and your "remote collision" problems will be greatly reduced.
--
Brian Lloyd, WB6RQN              Telebit Corporation
Network Systems Architect        1315 Chesapeake Terrace
brian@napa.telebit.com           Sunnyvale, CA 94089-1100
voice (408) 745-3103             FAX (408) 734-3333