[comp.unix.xenix] TCP/IP network "crash"

shepperd@dms.UUCP (Dave Shepperd) (09/29/89)

Is it just my network, or are all TCP/IP Ethernet networks so "fragile"?

It seems anybody can write a program that opens a socket to a remote
node and then does something to lock up the network on both systems.
Sometimes I've noticed that this can even bring down the network on
nodes not involved in the connection.

We're doing X window development using the X11R3 distribution on several
Xenix/386 systems using NCD X window servers all running TCP/IP over Ethernet.
All too often we accidentally do something with X that knocks the
386s off the network. Whatever happens (I don't know what it is) isn't
always fixed by just restarting the network on the first system to die.
Sometimes I've had to stop and restart the network on the 386s
and ALL the servers. This is icky.

Unfortunately, I don't have a network analyzer or any promiscuous mode
software to see what might be happening on the wire, but I do have
transceivers with LEDs indicating send/receive traffic. They don't
indicate anything different than they normally do during one of these
crashes. I.e., they don't go into steady send or receive and there are
no more collisions than normal (collisions are pretty rare in any event).

There is also DECnet, LAT and LAVC traffic on the same wire which has
apparently never been affected by any of the IP traffic even during one of
these crashes. There are some non-Unix boxes on the net that speak TCP,
as does one of the VMS Vaxen. Their network doesn't crash or get
stuck when one of the Xenix systems' network dies, but the TCP on VMS
can be crashed by opening a socket and doing something incorrectly.

I should point out that there have never been any messages produced by
the TCP software on the console during one of these crashes, so how
does one go about figuring out what the hell is happening?

Thanks for any help anyone can provide.


-- 
Dave Shepperd.	    shepperd@dms.UUCP or motcsd!dms!shepperd
Atari Games Corporation, 675 Sycamore Drive, Milpitas CA 95035.
(Arcade Video Game Manufacturer, NOT Atari Corp. ST manufacturer).

hedrick@geneva.rutgers.edu (Charles Hedrick) (09/29/89)

Your software is buggy.  Now and then we've run into implementations
where for some reason or other the software hung.  These have
generally been new implementations.  Such bugs were regarded
(correctly) as serious problems, and fixed.  It's also possible that a
bug or misconfiguration has resulted in a "broadcast storm".  In that
case, your software isn't hung -- it's just being saturated by lots of
packets.  I would suggest getting one of the MS-DOS TCP/IP
implementations, and running netwatch.  That should show you what is
going on if it's a broadcast storm.  If it's a fragile Ethernet device
driver, looking at the net may not show anything.  Probably that can
only be debugged if you have source to the software.
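
For what it's worth, the broadcast-storm check described above can be
sketched in a few lines. This is not netwatch or any part of it, only a
rough illustration in Python. It assumes some promiscuous-mode tool has
already captured raw Ethernet frames as byte strings; the function name
summarize_frames and the choice to report the top three senders are
made up for this example.

```python
from collections import Counter

# Ethernet broadcast destination address (ff:ff:ff:ff:ff:ff).
BROADCAST = b"\xff" * 6


def summarize_frames(frames):
    """Summarize how much of a capture is broadcast traffic.

    `frames` is an iterable of raw Ethernet frames as bytes objects
    (an assumption: how they were captured is outside this sketch).
    """
    total = 0
    broadcast = 0
    by_source = Counter()
    for frame in frames:
        if len(frame) < 14:
            # Too short to hold an Ethernet header; ignore it.
            continue
        total += 1
        dst, src = frame[0:6], frame[6:12]
        if dst == BROADCAST:
            broadcast += 1
            # Track which stations are sending the broadcasts.
            by_source[":".join(f"{b:02x}" for b in src)] += 1
    ratio = broadcast / total if total else 0.0
    return {
        "total": total,
        "broadcast": broadcast,
        "ratio": ratio,
        "top_sources": by_source.most_common(3),
    }
```

If the ratio jumps toward 1.0 during a "crash" and one or two source
addresses account for most of the broadcasts, a storm is the likely
culprit. If the capture looks normal, that points back at the driver or
the TCP code on the hung machine.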