[comp.dcom.lans] SQE and strange behavior

rando@skipper.dfrf.nasa.gov (Randy Brumbaugh) (01/17/90)

We recently observed some bizarre behavior on our net.
The problem is solved, but we still don't fully understand
the cause.  It may be very clear to someone who understands
repeaters and heartbeat-SQE.

First, some info on our setup.  A thick IEEE 802.3 cable runs
around our building.  At strategic locations, multi-port
repeaters tap this thick cable and branch out into thinwire
cabling to individual office areas.  On our thinwire segment,
we have 3 Suns, 1 Masscomp, 1 Kinetics Fastpath and 2 386-PCs
with 3c503 boards.

This setup worked fine, EXCEPT that the PCs couldn't communicate
with anybody.  Not telnet, not ftp, not PCNFS.  Watching traffic on
the LAN showed that for a telnet the PC would ARP 3 times.  Each time
a reply was sent, but apparently ignored by the PC.  The problem
seemed to be the 3c503 receiver.  Then we discovered that if
the thick interface on the repeater was disabled, everything worked:
the PCs could talk (through the repeater) to other thinwire
segments as well as hosts on the same segment.  After a lot of
theories were discarded, we found that SQE was enabled on the
transceiver between the thick cable and the repeater.
Disabling SQE fixed the problem.

I have three questions:

1-  What exactly is SQE?  I think it is supposed to be turned off
 on repeater transceivers.  Is this true?  Why?

2-  Why did everything work fine EXCEPT the 3c503 cards on the PC?
 Note- I'm not knocking the 3Com card- if the SQE was set wrong,
 it was the only one to notice- maybe that makes it better. Maybe.
 The cards seemed to transmit, but not receive.  The symptoms were
 completely repeatable.

3-  If IEEE 802 says SQE should be off for repeater transceivers,
 does it say it should be on for others?  Is it required?  Can
 similar bad things happen if SQE is accidentally disabled on a
 workstation transceiver?

 Thanks,
  Randy Brumbaugh
  rando@skipper.dfrf.nasa.gov

karn@jupiter.bellcore.com (Phil R. Karn) (01/17/90)

In article <408@skipper.dfrf.nasa.gov> rando@skipper.dfrf.nasa.gov (Randy Brumbaugh) writes:
>We recently observed some bizarre behavior on our net.
>The problem is solved, but we still don't fully understand
>the cause.  It may be very clear to someone who understands
>repeaters and heartbeat-SQE.

SQE is an "enhancement" that was made to Ethernet when it was
standardized as IEEE 802.3. Basically, it calls for the transceiver to
pulse the collision detect pair to the controller after each packet has
been transmitted "in order to make sure the pair is still working". The
problem is that some older controllers may interpret this signal as
indicating a collision, treating it as an unsuccessful transmission when
in fact it went out fine.
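
To make that failure mode concrete, here is a minimal Python sketch of it.
It is a toy model, not anything taken from a real controller: the names
`old_controller_outcome` and `transmit` are made up for illustration, and
it assumes an older controller that reads any activity on the collision
pair as a collision, whenever it arrives.

```python
def old_controller_outcome(cd_during_tx: bool, cd_after_tx: bool) -> str:
    """Result reported by a hypothetical controller that knows nothing about SQE."""
    # Any activity on the collision pair is read as a collision,
    # even a pulse that arrives after the packet has already gone out.
    if cd_during_tx or cd_after_tx:
        return "collision"
    return "sent"


def transmit(sqe_enabled: bool, real_collision: bool = False) -> str:
    """Send one packet through a transceiver with SQE switched on or off."""
    cd_after_tx = sqe_enabled  # the heartbeat pulse that follows the packet
    return old_controller_outcome(real_collision, cd_after_tx)


print(transmit(sqe_enabled=True))
print(transmit(sqe_enabled=False))
print(transmit(sqe_enabled=False, real_collision=True))
```

With SQE on, the sketch reports a collision for a packet that actually went
out fine; with SQE off, only a real collision is reported.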

As with the SAP encapsulation scheme used in IEEE 802.3, SQE is a
gratuitous, brain-damaged idea that never should have seen the light of
day. The subtle incompatibilities between DIX Ethernet and IEEE 802.3 have
caused nothing but grief, but I guess the committee could not just leave
well enough alone.

We use lots of DIX Ethernet hardware around here, along with the
original encapsulation scheme. Whenever I install a new transceiver that
has the SQE option, I always turn it off. Everything works just fine.

Phil

jstewart@ncs.dnd.ca (John Stewart) (01/18/90)

In article <408@skipper.dfrf.nasa.gov> rando@skipper.dfrf.nasa.gov (Randy Brumbaugh) writes:
>segments as well as hosts on the same segments.  After a lot of
>theories were discarded, we found that the SQE was enabled on the
>transceiver connected to the thick cable and the repeater.
>Disabling SQE fixed the problem.
>
>I have three questions:
>
>1-  What exactly is SQE?  I think it is supposed to be turned off
> on repeater transceivers.  Is this true?  Why?

SQE (sometimes called heartbeat) sends a signal back to the controller
on successful transmission of a packet. The signal pair that it uses is
the collision detect pair, so the controller has to know when to
interpret a pulse on the pair as a collision and when as SQE.

Now, one of the design criteria of repeaters is that if they see a
collision on any segment, they must jam the other segments to
simulate the collision on every other connected network. (Read the specs
if you don't believe me; they have to do this to keep the signals on
all connected networks close.)

So, we have a repeater that does not understand SQE. If a packet is sent
to it (and there will be many :-), the repeater will take that packet
and propagate it to all connected networks. If one of the transceivers
it sends through has SQE enabled, then it will receive a collision detect.

It will then jam all the connected networks to copy the perceived
collision.
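
The chain of events above can be sketched as a toy model in Python. This
is only an illustration under stated assumptions: the `Port` class, the
`repeat` and `demo` functions and the port names are hypothetical, and the
repeater is assumed to treat any signal on a transceiver's collision pair
as a real collision.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    name: str
    sqe_enabled: bool = False  # is SQE switched on at this port's transceiver?
    log: list = field(default_factory=list)


def repeat(ports, source):
    """Repeat one packet that arrived on `source`; return True if the repeater jammed."""
    outputs = [p for p in ports if p.name != source]
    for port in outputs:
        port.log.append("packet")
    # After the repeater transmits, an SQE-enabled transceiver pulses the
    # collision pair. This repeater mistakes that pulse for a collision.
    fooled_by = {p.name for p in outputs if p.sqe_enabled}
    if fooled_by:
        # Jam every segment other than the one the "collision" came from.
        for port in ports:
            if port.name not in fooled_by:
                port.log.append("jam")
    return bool(fooled_by)


def demo(thick_sqe):
    ports = [Port("thick", sqe_enabled=thick_sqe), Port("thin1"), Port("thin2")]
    jammed = repeat(ports, source="thin1")
    return jammed, {p.name: p.log for p in ports}


print(demo(thick_sqe=True))
print(demo(thick_sqe=False))
```

In this model, a packet from one thinwire segment gets both thinwire
segments jammed whenever SQE is enabled on the thick-cable transceiver, and
nothing is jammed once SQE is turned off there.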

I am not sure if there are repeaters that use SQE or not; but I know
that any that I have worked with do not. Off the top of my head, I can
not think of a reason why it would not be possible for a repeater to 
work with SQE.

>
>2-  Why did everything work fine EXCEPT the 3c503 cards on the PC?
> Note- I'm not knocking the 3 Com card- if the SQE was set wrong, 
> it was the only one to notice- maybe that makes it better. Maybe.
> The cards seemed to transmit, but not receive.  The symptoms were
> completely repeatable.

Good question. Timing? Luck? Maybe the 3Com card does funny (or
correct) things after seeing a collision? This would be an interesting
one to follow up.
>
>3-  If IEEE 802 says SQE should be off for repeater transceivers,
> does it say it should be on for others?  Is it required?  Can
> similar bad things happen if SQE is accidentally disabled on a
> workstation transceiver?
>
If you turn off SQE, and the controller needs it, then the controller
will inform your driver (which will inform... etc) that it cannot
transmit.

It's safe to bet that it can be disabled for most, if not all, devices.


John Stewart.

henry@utzoo.uucp (Henry Spencer) (01/18/90)

In article <19024@bellcore.bellcore.com> karn@jupiter.bellcore.com (Phil R. Karn) writes:
>SQE is an "enhancement" that was made to Ethernet when it was
>standardized as IEEE 802.3... [assorted unkind words about it]

On the whole I agree with Phil about the various little stupidities
perpetrated by 802.3.  The key point, which Phil wasn't too explicit
about, is that SQE is another aspect where the interface and the
transceiver *must* agree or there will be trouble.
-- 
1972: Saturn V #15 flight-ready|     Henry Spencer at U of Toronto Zoology
1990: birds nesting in engines | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

wyatt@cfa.HARVARD.EDU (Bill Wyatt) (01/18/90)

 >>1-  What exactly is SQE?  I think it is supposed to be turned off
 >> on repeater transceivers.  Is this true?  Why?
 > 
 > SQE (sometimes called heartbeat) sends a signal back to the controller
 > on successful transmission of a packet. The signal pair that it uses is
 > the collision detect pair, so the controller has to know when to
 > interpret the pulse on the pair as a collision or SQE.
 >
 > Now; one of the design criteria of repeaters is that if they see a
 > collision on any segment, then they must then jam the other segments 
 [...]
 > So; we have a repeater that does not understand SQE. 
 [...]
 >                     it just happens to send
 > to the transceiver with SQE, then it will receive a collision detect.
 > 
 > It will then jam all the connected networks to copy the perceived
 > collision.
 > 
 > I am not sure if there are repeaters that use SQE or not; but I know
 > that any that I have worked with do not. Off the top of my head, I can
 > not think of a reason why it would not be possible for a repeater to 
 > work with SQE.

No absolute reason. The DEC DEREP works fine with SQE-enabled
transceivers. They had to, as DEC's old H4000 transceiver had no way
to turn off SQE. I think there must be a good reason the 802.3
standard says repeaters shouldn't have SQE circuits. Maybe it
preserves a little more of the time budget, as there's no dead time
where the repeater has to ignore collision detect; it can then
jam the other segment that much sooner if one is detected.

 >>3-  If IEEE 802 says SQE should be off for repeater transceivers,
 >> does it say it should be on for others?  Is it required?  Can
 >> similar bad things happen if SQE is accidentally disabled on a
 >> workstation transceiver?
 >>
 > If you turn off SQE, and the controller needs it, then the controller
 > will inform  your driver (which will inform... etc) that it can not
 > transmit.
 > 
 > It's safe to bet that it can be disabled for most, if not all devices.

While disabling SQE may work for some devices, it's not a good idea.
The use of SQE is necessary since data corruption and general havoc
will ensue if a node's collision detection has failed. Data integrity 
is a vital commodity on a shared bus!

Bill Wyatt, Smithsonian Astrophysical Observatory  (Cambridge, MA, USA)
    UUCP :  {husc6,cmcl2,mit-eddie}!harvard!cfa!wyatt
 Internet:   wyatt@cfa.harvard.edu
     SPAN:   cfa::wyatt                 BITNET: wyatt@cfa

jstewart@ncs.dnd.ca (John Stewart) (01/19/90)

In article <286@cfa.HARVARD.EDU> wyatt@cfa.HARVARD.EDU (Bill Wyatt) writes:
...
>While disabling SQE may work for some devices, it's not a good idea.
>The use of SQE is necessary since data corruption and general havoc
>will ensue if a node's collision detection has failed. Data integrity 
>is a vital commodity on a shared bus!

It's been about 2 years since I last went through the 802.3 spec, but I
seem to remember that the CD line would be used as the SQE for a certain
time period after transmission. So:

1) if no SQE, but the controller expects one, then the controller
   thinks that the transceiver is dead.

2) if SQE, and the controller doesn't expect one, it will be interpreted
   as a collision, and the controller will try a re-transmit.

So, if the above is correct, then 1) above would seem to be the
lesser of two evils. Note that disabling SQE does not disable
the collision detect!
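
For reference, the two mismatch cases can be laid out as a small table in
Python. This is a sketch of the reasoning as I remember it, not of the
spec: the function name `outcome` and the result strings are hypothetical,
and it only covers a transmission during which no real collision occurred.

```python
def outcome(transceiver_sends_sqe: bool, controller_expects_sqe: bool) -> str:
    """What the controller concludes after a transmission with no real collision."""
    if controller_expects_sqe and not transceiver_sends_sqe:
        return "transceiver looks dead"       # case 1 above
    if transceiver_sends_sqe and not controller_expects_sqe:
        return "false collision, retransmit"  # case 2 above
    return "ok"                               # both ends agree


for sends in (False, True):
    for expects in (False, True):
        print(f"SQE sent={sends!s:5}  expected={expects!s:5}  ->  {outcome(sends, expects)}")
```

Only the two mismatched combinations cause trouble; when the transceiver
and the controller agree, either setting comes out as "ok".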

Am I correct, or should I get the spec for bed-time reading again?

John Stewart.