[fa.tcp-ip] Ethernet problems

raj@UCI-ICSA.ARPA (Richard Johnson) (10/17/85)

I'm not sure if this is the correct Bboard for this.  If not,
please let me know where it should be sent.

We have an ethernet which consists of 5 VAX 750s plugged into a DEC
Delni.  This is connected onto a length of cable which has 4 other
Int. Sol. machines and 2 suns.  We just recently ran about 500 ft. of
cable (which together with the already existing cable placed us
around 300 meters or so) and moved 2 of the Int. Sol. machines to the
end of this length. (All systems are running 4.2BSD [Suns run SUN's
version].) Basically our configuration looks like this:

A  B  C                          +-- added length --+
 \ | /                           V                  V
 delni ----------Sun--Sun--IS--IS--------------IS--IS
 / | \                                         (x) (y)
D  E  IS

About a week or so after this change was made we started having
problems. About every 2-4 weeks suddenly every machine on the net
claims that every other machine is down.  You can't connect to any
other machines.  I have found that if I disconnect the extra length
(recently added) the problem seems to go away. It would seem to not
be a bad ethernet board in one of the machines because I can connect
either one of the 2 IS's ('x' or 'y') onto the net by itself and the
net is still upset. However, I can reconnect the extra length (along
with the 2 machines at the end of it) after about 15-30 min's and
everything is fine for another few weeks.  Sometimes the problem
seems to just go away on its own, too!  (By the way, a "netstat -i"
says we are sending and receiving lots of packets but getting about
as many input errors as input packets!)
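
To put a number on "about as many input errors as input packets", here is
a minimal sketch in Python that computes Ierrs/Ipkts per interface from
"netstat -i" style text.  The sample rows, interface names and counts are
invented for illustration and are not output from the machines described
here; the column names are assumed to follow the usual BSD header.

```python
# SAMPLE is invented for illustration; it is NOT real output from this net.
# Column names are assumed to match the usual BSD `netstat -i` header.
SAMPLE = """\
Name  Mtu   Network   Address    Ipkts   Ierrs   Opkts  Oerrs  Collis
il0   1500  uci-net   host-a     120000  110000  95000  12     340
lo0   1536  loop-net  localhost  5000    0       5000   0      0
"""

def input_error_ratios(text):
    """Return {interface: Ierrs / Ipkts} for every row that lines up."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    ipkts_col = header.index("Ipkts")
    ierrs_col = header.index("Ierrs")
    ratios = {}
    for row in rows:
        if len(row) != len(header):
            continue  # skip rows whose fields do not match the header
        ipkts = int(row[ipkts_col])
        ierrs = int(row[ierrs_col])
        if ipkts > 0:
            ratios[row[0]] = ierrs / ipkts
    return ratios

ratios = input_error_ratios(SAMPLE)
# Flag interfaces where more than half as many errors as packets came in.
suspect = [name for name, r in ratios.items() if r > 0.5]
```

The 0.5 threshold is arbitrary; a ratio anywhere near 1 on every machine
at once points at the shared medium rather than at one host.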

Has anyone ever seen anything like this?  I'm guessing it means we're
too close to the max. length for the ethernet, but I calculate the
total length as around 300 meters and the standard says 500 meters.
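
The length arithmetic, as a rough sketch in Python.  The only inputs are
the figures quoted above (about 500 ft. added, about 300 meters total, a
500 meter limit); the length of the cable before the change is not stated
anywhere, so it is derived here and should be read as an estimate.

```python
FEET_TO_METERS = 0.3048          # exact, by definition of the foot
SEGMENT_LIMIT_M = 500.0          # single-segment limit cited above

added_m = 500 * FEET_TO_METERS   # the new run, "about 500 ft."
total_m = 300.0                  # the estimate of the total given above
existing_m = total_m - added_m   # implied length before the change (derived)
margin_m = SEGMENT_LIMIT_M - total_m

print(f"added:    {added_m:.1f} m")
print(f"existing: {existing_m:.1f} m (derived, not measured)")
print(f"margin:   {margin_m:.1f} m under the limit")
```

On these numbers the new run is a little over 150 meters and the segment
sits well inside the limit, which is why length alone looks unlikely.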

------------------------------------------------------------------------
Richard Johnson                             raj@uci.edu           (ARPA)
UCI ICS Research Systems Manager            ucbvax!ucivax!raj     (UUCP)

radzy@calma.UUCP (10/19/85)

In article <376.498353365@uci.edu> you write:
>I'm not sure if this is the correct Bboard for this.  If not,
>please let me know where it should be sent.

I would have sent it to net.lan, but I don't think net.tcp-ip is
INappropriate for it.  Ignore any flames.

>                                We just recently ran about 500 ft. of
>cable (which together with the already existing cable placed us
>around 300 meters or so)
>
>About a week or so after this change was made we started having
>problems. About every 2-4 weeks suddenly every machine on the net
>claims that every other machine is down.
>                 I have found that if I disconnect the extra length
>(recently added) the problem seems to go away.
>                    However, I can reconnect the extra length (along
>with the 2 machines at the end of it) after about 15-30 min's and
>everything is fine for another few weeks.
>
>Has anyone ever seen anything like this?  I'm guessing it means we're
>too close to the max. length for the ethernet, but I calculate the
>total length as around 300 meters and the standard says 500 meters.

I doubt the problem is with length.  The ethernet specification says
2.5 kilometers, not 500 meters (unless you're using experimental ether,
in which case it's still 1 kilometer).

I had problems something like that a while back.  The symptoms were
that the network was (always) slow, and there were lots of illegal
packets.  Often, you couldn't get connected, same as your net now,
but it wasn't periodic, like you're seeing.

The problem turned out to be a combination of two things, neither
of which is quite kosher, but neither of which (alone) would cause
the net to behave quite so lousy:
	1.  I had added a few sections of the thin cable to one
		end of the net, for IBM PCs to use with their
		internal transceivers.  The thin cable has a different
		impedance than the thick cable.
	2.  Several of our transceivers were not placed at correct
		locations.  The result of this was that the machines
		connected to those transceivers had far more problems
		than the rest.

Could something like either of those problems be related to yours?
For the first, check if you have a bad connector (or transceiver,
if you are using 3-com type).  Also, make sure the cable is top
quality (I'm not sure of the nomenclature here, but there are a
couple of kinds and one kind costs about 1/2 what the other kind
costs -- get the expensive one).

For the second, make sure your cable has the marks at about 9-foot
intervals, and make sure all your transceivers are on one of the 
marks.
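
A quick way to check a list of tap positions against the marks, as a
sketch in Python.  It assumes the thick-cable marks are 2.5 meters apart
(a bit over 8 feet, which fits the "about 9-foot" figure) and that you
have measured each transceiver's distance from one end of the cable.  The
positions and the tolerance below are made up for illustration.

```python
MARK_SPACING_M = 2.5   # assumed spacing of the marks on thick Ethernet cable

def off_mark(positions_m, tolerance_m=0.05):
    """Return the tap positions that are not within tolerance of a mark."""
    bad = []
    for pos in positions_m:
        nearest_mark = round(pos / MARK_SPACING_M) * MARK_SPACING_M
        if abs(pos - nearest_mark) > tolerance_m:
            bad.append(pos)
    return bad

# Hypothetical measurements, in meters from one terminator.
taps = [0.0, 2.5, 7.5, 11.0]
misplaced = off_mark(taps)
```

Anything that shows up in the result is a transceiver worth moving to the
nearest mark.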


Hope this helps.

-- 
Tim (radzy) Radzykewycz
	calma!radzy@ucbvax.ARPA
	ucbvax!calma!radzy

HEDRICK@RED.RUTGERS.EDU (Charles Hedrick) (10/20/85)

As I read it, the spec says 500 meters.  2.5 Km is for configurations
involving multiple bridges, and also a 1 Km length of fiber optic
cable.

If it really takes you 15 - 30 min to recover once you remove the
extra length, this suggests that the problem involves software tables.
This is about the amount of time it takes Unix systems to clear their
ARP tables.  I would look carefully at the two machines on the end of
the extra cable.  Is it possible that they are set up with wrong network
configuration information, such that somehow duplicate addresses or
forwarding loops are resulting?
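
One way to look for duplicate addresses is to dump the ARP table on a
few machines and see whether any IP address maps to more than one
Ethernet address.  A minimal sketch in Python, assuming you have already
pulled the (IP, Ethernet address) pairs out of the table by whatever means
your system provides; the entries below are hypothetical.

```python
from collections import defaultdict

def duplicate_addresses(entries):
    """Map each IP seen with more than one hardware address to those addresses."""
    seen = defaultdict(set)
    for ip, mac in entries:
        seen[ip].add(mac.lower())   # compare case-insensitively
    return {ip: sorted(macs) for ip, macs in seen.items() if len(macs) > 1}

# Hypothetical ARP entries collected from several hosts.
entries = [
    ("10.0.0.5", "08:00:2b:01:02:03"),
    ("10.0.0.6", "08:00:20:aa:bb:cc"),
    ("10.0.0.5", "02:07:01:00:11:22"),   # same IP, different hardware address
    ("10.0.0.6", "08:00:20:AA:BB:CC"),   # same host, different case
]
conflicts = duplicate_addresses(entries)
```

A non-empty result means two interfaces are answering for the same IP
address, which would fit a failure that clears only when the tables time
out.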

-------