rando@skipper.dfrf.nasa.gov (Randy Brumbaugh) (01/17/90)
We recently observed some bizarre behavior on our net. The problem is solved, but we still don't fully understand the cause. It may be very clear to someone who understands repeaters and heartbeat-SQE.

First, some info on our setup. A thick IEEE 802.3 cable runs around our building. At strategic locations, multi-port repeaters tap this thick cable and branch out into thinwire cabling to individual office areas. On our thinwire segment, we have 3 Suns, 1 Masscomp, 1 Kinetics Fastpath and 2 386-PCs with 3c503 boards.

This setup worked fine, EXCEPT that the PCs couldn't communicate with anybody: no telnet, ftp, or PCNFS. Watching traffic on the LAN showed that for a telnet the PC would ARP 3 times. Each time a reply was sent, but apparently ignored by the PC. The problem seemed to be the 3c503 receiver.

Then we discovered that if the thick interface on the repeater was disabled, everything worked - the PCs could talk (through the repeater) to other thinwire segments as well as hosts on the same segments. After a lot of theories were discarded, we found that the SQE was enabled on the transceiver connected to the thick cable and the repeater. Disabling SQE fixed the problem.

I have three questions:

1- What exactly is SQE? I think it is supposed to be turned off on repeater transceivers. Is this true? Why?

2- Why did everything work fine EXCEPT the 3c503 cards on the PC? Note- I'm not knocking the 3Com card- if the SQE was set wrong, it was the only one to notice- maybe that makes it better. Maybe. The cards seemed to transmit, but not receive. The symptoms were completely repeatable.

3- If IEEE 802 says SQE should be off for repeater transceivers, does it say it should be on for others? Is it required? Can similar bad things happen if SQE is accidentally disabled on a workstation transceiver?

Thanks,
Randy Brumbaugh
rando@skipper.dfrf.nasa.gov
karn@jupiter.bellcore.com (Phil R. Karn) (01/17/90)
In article <408@skipper.dfrf.nasa.gov> rando@skipper.dfrf.nasa.gov (Randy Brumbaugh) writes:
>We recently observed some bizarre behavior on our net.
>The problem is solved, but we still don't fully understand
>the cause. It may be very clear to someone who understands
>repeaters and heartbeat-SQE.

SQE is an "enhancement" that was made to Ethernet when it was standardized as IEEE 802.3. Basically, it calls for the transceiver to pulse the collision detect pair to the controller after each packet has been transmitted "in order to make sure the pair is still working". The problem is that some older controllers may interpret this signal as indicating a collision, treating it as an unsuccessful transmission when in fact it went out fine.

As with the SAP encapsulation scheme used in IEEE 802.3, SQE is a gratuitous, brain-damaged idea that never should have seen the light of day. The subtle incompatibilities between DIX Ethernet and IEEE 802.3 have caused nothing but grief, but I guess the committee could not just leave well enough alone.

We use lots of DIX Ethernet hardware around here, along with the original encapsulation scheme. Whenever I install a new transceiver that has the SQE option, I always turn it off. Everything works just fine.

Phil
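
To give a rough feel for the failure Phil describes, here is a minimal sketch, assuming a controller that reads every heartbeat as a collision and then follows the usual 802.3 truncated binary exponential backoff (16 attempts, exponent capped at 10, 51.2 microsecond slot at 10 Mb/s). The function name and return values are made up for illustration, and real controllers may react to a misread heartbeat differently.

```python
import random

ATTEMPT_LIMIT = 16    # 802.3 gives up after 16 attempts
BACKOFF_LIMIT = 10    # backoff exponent stops growing at 10
SLOT_TIME_US = 51.2   # 512 bit times at 10 Mb/s


def send_with_false_collisions(rng):
    """Model one frame sent by a controller that misreads heartbeat.

    Assumption: every attempt is followed by a heartbeat pulse that the
    controller counts as a collision.
    Returns (copies_on_wire, total_backoff_us, reported_status).
    """
    copies = 0
    backoff_us = 0.0
    for attempt in range(1, ATTEMPT_LIMIT + 1):
        copies += 1  # the frame actually went out intact
        if attempt == ATTEMPT_LIMIT:
            break  # attempt limit reached, controller gives up
        # Wait a random number of slot times in [0, 2**k - 1].
        k = min(attempt, BACKOFF_LIMIT)
        backoff_us += rng.randrange(2 ** k) * SLOT_TIME_US
    return copies, backoff_us, "excessive collisions"


copies, waited_us, status = send_with_false_collisions(random.Random(0))
```

In this toy model the frame reaches the wire on every attempt, yet the controller still reports a failed transmission, which is the mismatch Phil is complaining about.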
jstewart@ncs.dnd.ca (John Stewart) (01/18/90)
In article <408@skipper.dfrf.nasa.gov> rando@skipper.dfrf.nasa.gov (Randy Brumbaugh) writes:
>segments as well as hosts on the same segments. After a lot of
>theories were discarded, we found that the SQE was enabled on the
>transceiver connected to the thick cable and the repeater.
>Disabling SQE fixed the problem.
>
>I have three questions:
>
>1- What exactly is SQE? I think it is supposed to be turned off
> on repeater transceivers. Is this true? Why?

SQE (sometimes called heartbeat) sends a signal back to the controller on successful transmission of a packet. The signal pair that it uses is the collision detect pair, so the controller has to know when to interpret a pulse on the pair as a collision and when as SQE.

Now, one of the design criteria of repeaters is that if they see a collision on any segment, they must jam the other segments to simulate the collision on every other connected network. (Read the specs if you don't believe me; they have to do this to keep the signals on all connected networks close.)

So, we have a repeater that does not understand SQE. If a packet is sent to it (and there will be many :-), the repeater will take that packet and propagate it to all connected networks. If it happens to send to the transceiver with SQE, then it will receive a collision detect. It will then jam all the connected networks to copy the perceived collision.

I am not sure if there are repeaters that use SQE or not, but I know that any that I have worked with do not. Off the top of my head, I cannot think of a reason why it would not be possible for a repeater to work with SQE.

>
>2- Why did everything work fine EXCEPT the 3c503 cards on the PC?
> Note- I'm not knocking the 3 Com card- if the SQE was set wrong,
> it was the only one to notice- maybe that makes it better. Maybe.
> The cards seemed to transmit, but not receive. The symptoms were
> completely repeatable.

Good question. Timing? Luck? Maybe the 3Com card does funny (or correct) things after seeing a collision? This would be an interesting one to follow up.

>
>3- If IEEE 802 says SQE should be off for repeater transceivers,
> does it say it should be on for others? Is it required? Can
> similar bad things happen if SQE is accidentally disabled on a
> workstation transceiver?
>

If you turn off SQE, and the controller needs it, then the controller will inform your driver (which will inform... etc) that it cannot transmit. It's safe to bet that it can be disabled for most, if not all devices.

John Stewart.
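
The repeater behavior John describes in his answer to question 1 can be sketched as a toy model. This is only an illustration of his explanation, not of real repeater hardware: the `Port` class, the port names, and `repeat_frame` are all hypothetical, and the model assumes the repeater cannot tell a heartbeat from a collision.

```python
from dataclasses import dataclass


@dataclass
class Port:
    name: str
    sqe_enabled: bool  # heartbeat setting on this port's transceiver


def repeat_frame(ports, source):
    """Return the sorted names of ports jammed while repeating one frame.

    Assumption: a heartbeat on any outgoing port looks like a collision
    to the repeater, which then jams every other port.
    """
    jammed = set()
    for out in ports:
        if out.name == source:
            continue  # a frame is not repeated back to its source port
        if out.sqe_enabled:
            # Heartbeat follows the transmission and is taken for a
            # collision, so all other ports get a jam signal.
            jammed.update(p.name for p in ports if p.name != out.name)
    return sorted(jammed)


ports = [Port("thick", True), Port("thin1", False), Port("thin2", False)]
jams_from_thin = repeat_frame(ports, "thin1")
jams_from_thick = repeat_frame(ports, "thick")
```

With heartbeat enabled only on the thick-cable transceiver, a frame arriving from a thinwire port triggers a jam on the thinwire ports in this model, while a frame arriving from the thick cable does not. Setting `sqe_enabled` to `False` on the thick port removes the jams entirely.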
henry@utzoo.uucp (Henry Spencer) (01/18/90)
In article <19024@bellcore.bellcore.com> karn@jupiter.bellcore.com (Phil R. Karn) writes:
>SQE is an "enhancement" that was made to Ethernet when it was
>standardized as IEEE 802.3... [assorted unkind words about it]

On the whole I agree with Phil about the various little stupidities perpetrated by 802.3. The key point, which Phil wasn't too explicit about, is that SQE is another aspect where the interface and the transceiver *must* agree or there will be trouble.
--
1972: Saturn V #15 flight-ready| Henry Spencer at U of Toronto Zoology
1990: birds nesting in engines | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
wyatt@cfa.HARVARD.EDU (Bill Wyatt) (01/18/90)
>>1- What exactly is SQE? I think it is supposed to be turned off
>> on repeater transceivers. Is this true? Why?
>
> SQE (sometimes called heartbeat) sends a signal back to the controller
> on successful transmission of a packet. The signal pair that it uses is
> the collision detect pair, so the controller has to know when to
> interpret the pulse on the pair as a collision or SQE.
>
> Now; one of the design criteria of repeaters is that if they see a
> collision on any segment, then they must then jam the other segments [...]
> So; we have a repeater that does not understand SQE. [...]
> it just happens to send
> to the transceiver with SQE, then it will receive a collision detect.
>
> It will then jam all the connected networks to copy the perceived
> collision.
>
> I am not sure if there are repeaters that use SQE or not; but I know
> that any that I have worked with do not. Off the top of my head, I can
> not think of a reason why it would not be possible for a repeater to
> work with SQE.

No absolute reason. The DEC DEREP works fine with SQE-enabled transceivers. It had to, as DEC's old H4000 transceiver had no way to turn off SQE.

I think there must be a good reason the 802.3 standard says repeaters shouldn't have SQE circuits. Maybe it preserves a little more of the time budget, as there's no dead time where the repeater has to ignore collision detect; it can then jam the other segment that much sooner if a collision is detected.

>>3- If IEEE 802 says SQE should be off for repeater transceivers,
>> does it say it should be on for others? Is it required? Can
>> similar bad things happen if SQE is accidentally disabled on a
>> workstation transceiver?
>>
> If you turn off SQE, and the controller needs it, then the controller
> will inform your driver (which will inform... etc) that it can not
> transmit.
>
> It's safe to bet that it can be disabled for most, if not all devices.

While disabling SQE may work for some devices, it's not a good idea. The use of SQE is necessary since data corruption and general havoc will ensue if a node's collision detection has failed. Data integrity is a vital commodity on a shared bus!

Bill Wyatt, Smithsonian Astrophysical Observatory (Cambridge, MA, USA)
UUCP : {husc6,cmcl2,mit-eddie}!harvard!cfa!wyatt
Internet: wyatt@cfa.harvard.edu
SPAN: cfa::wyatt
BITNET: wyatt@cfa
jstewart@ncs.dnd.ca (John Stewart) (01/19/90)
In article <286@cfa.HARVARD.EDU> wyatt@cfa.HARVARD.EDU (Bill Wyatt) writes:
...
>While disabling SQE may work for some devices, it's not a good idea.
>The use of SQE is necessary since data corruption and general havoc
>will ensue if a node's collision detection has failed. Data integrity
>is a vital commodity on a shared bus!

It's been about 2 years since I last went through the 802.3 spec, but I seem to remember that the CD line would be used as the SQE for a certain time period after transmission. So:

1) if no SQE, but the controller expects one, then the controller thinks that the transceiver is dead.

2) if SQE, and the controller doesn't expect one, it will be interpreted as a collision, and the controller will try a re-transmit.

So, if the above is correct, then 1) above would seem to be the lesser of two evils. Note that disabling SQE does not disable the collision detect!

Am I correct, or should I get the spec for bed-time reading again?

John Stewart.
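
The two mismatch cases John lists can be written down as a small decision function. This is a sketch of his recollection only: the window length `SQE_WINDOW_US` is a hypothetical placeholder (the real limits are in the 802.3 text), and the `Verdict` names and `after_transmit` function are invented for illustration.

```python
from enum import Enum


class Verdict(Enum):
    OK = "ok"
    FALSE_COLLISION = "heartbeat read as collision"
    TRANSCEIVER_SUSPECT = "no heartbeat; transceiver flagged"


# Hypothetical window length, NOT taken from the standard.
SQE_WINDOW_US = 4.0


def after_transmit(controller_expects_sqe, pulse_delay_us):
    """Judge the collision-detect pair just after a frame was sent.

    pulse_delay_us is the delay from end of transmission to a pulse on
    the pair, or None if the pair stayed quiet. Pulses outside the
    window are ignored by this toy model.
    """
    pulse_in_window = (
        pulse_delay_us is not None and 0 <= pulse_delay_us <= SQE_WINDOW_US
    )
    if controller_expects_sqe:
        # Case 1: heartbeat expected but missing.
        return Verdict.OK if pulse_in_window else Verdict.TRANSCEIVER_SUSPECT
    # Case 2: heartbeat present but not expected.
    return Verdict.FALSE_COLLISION if pulse_in_window else Verdict.OK
```

Only the two diagonal cases, where the transceiver setting and the controller's expectation agree, come back as `Verdict.OK`. Real collision detection during transmission is outside this model, which fits John's note that disabling SQE does not disable collision detect.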