[comp.sys.ti.explorer] hosed networks

jwz@teak.berkeley.edu (Jamie Zawinski) (11/21/89)

This keeps happening to me: at random times, certain hosts out in net-land
(unix machines) will appear to stop responding.  No connections can be 
opened to them, no matter how many times I hit resume or call IP:RESET.
The problem doesn't go away after several hours.  However, some other hosts
will respond, and I don't see the pattern.

I touch three IP hosts regularly: spice.cs.cmu.edu, which is on the other
side of the country; teak.berkeley.edu, which is upstairs; and 
pasteur.berkeley.edu, which is also local.  The most common (but by no 
means exclusive) way that my network gets wedged is that Spice and Pasteur
will seem to stop responding, but Teak won't; and other machines have no
problem connecting to these, so it's not the network itself.

Warm-booting often causes this, but it happens spontaneously as well.
Cold-booting always makes it go away.  I'm using IP 3.43.

Does this happen to anyone else?

	-- Annoyed.

snicoud@ATC.BOEING.COM (Stephen Nicoud) (11/21/89)

    Date: 20 Nov 89 21:01:22 GMT
    From: pasteur!jwz%teak.berkeley.edu@ucbvax.berkeley.edu.ARPANET  (Jamie Zawinski)


    This keeps happening to me: at random times, certain hosts out in net-land
    (unix machines) will appear to stop responding.  No connections can be 
    opened to them, no matter how many times I hit resume or call IP:RESET.
    The problem doesn't go away after several hours.  However, some other hosts
    will respond, and I don't see the pattern.

    I touch three IP hosts regularly: spice.cs.cmu.edu, which is on the other
    side of the country; teak.berkeley.edu, which is upstairs; and 
    pasteur.berkeley.edu, which is also local.  The most common (but by no 
    means exclusive) way that my network gets wedged is that Spice and Pasteur
    will seem to stop responding, but Teak won't; and other machines have no
    problem connecting to these, so it's not the network itself.

    Warm-booting often causes this, but it happens spontaneously as well.
    Cold-booting always makes it go away.  I'm using IP 3.43.

    Does this happen to anyone else?

	    -- Annoyed.

Dear Annoyed, :-)

I don't know if this is related, but when a failed IP connection occurs
to a host, I've discovered that a property (:connection-failures) is
added to the host object's property list.

	(send (si:parse-host "HOST") :connection-failures)

Removing this property allows me to try again (otherwise, subsequent
attempts fail immediately without trying to connect).

	(remprop (si:parse-host "HOST") :connection-failures)
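
The two forms above can be combined into a loop over several hosts. This
is only a rough sketch, not tested on a real Explorer: it uses nothing
beyond si:parse-host, the :connection-failures message, and remprop as
shown above, and it assumes the three host names from the original
message are known to the local host table (si:parse-host may signal an
error for a name it cannot resolve).

```lisp
;; Sketch only: clear the :connection-failures property on each host
;; that currently has one, so the next connection attempt is really tried.
;; The host names are the ones mentioned in the original message;
;; substitute your own.
(dolist (name '("spice.cs.cmu.edu" "teak.berkeley.edu" "pasteur.berkeley.edu"))
  (let ((host (si:parse-host name)))
    ;; Only touch hosts that actually recorded a failure.
    (when (send host :connection-failures)
      (remprop host :connection-failures))))
```

If a host still fails after this, the connection is really being
attempted and refused, so the cached failure was not the cause.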

I'd be interested in what you do or do not find out.

Hope this black magic helps. :-)

	-- Dr. Net Voodoo

jwz@TEAK.BERKELEY.EDU (Jamie Zawinski) (11/21/89)

Nope, that ain't it.  The :connection-failures property prevents it
from trying to connect; in my situation, it actually tries and fails.

John Nguyen suggests (ip:reset-ip-routing-table nil).
Next time it happens, I'll give that a shot.

		-- Jamie

Rice@SUMEX-AIM.STANFORD.EDU (James Rice) (11/21/89)

Yes, I've seen this too.  When it's happened to me it has
always been correlated with the Unix boxes in question
doing heavy NFS activity.  Arresting the processes on the
unix boxes causing this behaviour seems to let the Exp get
in and then I can resume the process on the unix box.  I
have no idea what causes it, but I wish it would go away.
I'm tempted to suspect that it's a unix bogosity of not
scheduling the ftp server properly.

Unix is the real virus....



Rice.