[comp.sys.sequent] Sequent arp

fletcher@cs.utexas.edu (Fletcher Mattox) (10/21/90)

We seem to have a problem when a Sequent arps for a SparcStation 1+.

The Sequent (cs.utexas.edu) is a Balance 21k running Dynix 3.0.4.
The SS1+ (blox.cs.utexas.edu) is running SunOS 4.1.

This etherfind was run on another Sun on the same wire as the
SS1+ and the Sequent.  It is monitoring the arp exchange when
someone on the Sequent types "telnet blox".

	Script started on Sat Oct 20 13:06:08 1990
	blitz% etherfind -t -u -arp -between blox cs
	Using interface ie0
							  icmp type
	       lnth proto         source     destination   src port   dst port
	 0.00    60  arp   cs.utexas.edu blox.cs.utexas. 
	 0.00    60  arp blox.cs.utexas.   cs.utexas.edu 
	 2.82    60  arp   cs.utexas.edu blox.cs.utexas. 
	 2.82    60  arp blox.cs.utexas.   cs.utexas.edu 
	12.32    60  arp   cs.utexas.edu blox.cs.utexas. 
	12.32    60  arp blox.cs.utexas.   cs.utexas.edu 
	24.96    60  arp   cs.utexas.edu blox.cs.utexas. 
	24.96    60  arp blox.cs.utexas.   cs.utexas.edu 
	69.26    60  arp   cs.utexas.edu blox.cs.utexas. 
	69.26    60  arp blox.cs.utexas.   cs.utexas.edu 
	211.58    60  arp   cs.utexas.edu blox.cs.utexas. 
	211.58    60  arp blox.cs.utexas.   cs.utexas.edu 
	306.34    60  arp   cs.utexas.edu blox.cs.utexas. 
	306.34    60  arp blox.cs.utexas.   cs.utexas.edu 
	495.78    60  arp   cs.utexas.edu blox.cs.utexas. 
	495.78    60  arp blox.cs.utexas.   cs.utexas.edu 
	^C
	blitz% exit
	blitz% 
	script done on Sat Oct 20 13:18:43 1990

I've looked at the contents of the arp packets.  Nothing unusual there.
It's as if the Sequent just isn't seeing the arp response from
the SS1+.
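
For anyone who wants to check the trace mechanically, here is a small Python sketch that pairs each packet from the Sequent with the packet that follows it. The tuples are hand-copied from the etherfind output above; the names TRACE and unanswered_requests are made up for illustration. It assumes a packet from cs to blox is a request and the reverse is a reply, which the etherfind summary lines do not actually state.

```python
# (time in seconds, source, destination), hand-copied from the etherfind
# output above. Hostnames are shortened to "cs" and "blox".
TRACE = [
    (0.00, "cs", "blox"), (0.00, "blox", "cs"),
    (2.82, "cs", "blox"), (2.82, "blox", "cs"),
    (12.32, "cs", "blox"), (12.32, "blox", "cs"),
    (24.96, "cs", "blox"), (24.96, "blox", "cs"),
    (69.26, "cs", "blox"), (69.26, "blox", "cs"),
    (211.58, "cs", "blox"), (211.58, "blox", "cs"),
    (306.34, "cs", "blox"), (306.34, "blox", "cs"),
    (495.78, "cs", "blox"), (495.78, "blox", "cs"),
]


def unanswered_requests(trace, requester="cs", responder="blox"):
    """Return the times of requester packets that are not immediately
    followed by a responder packet carrying the same timestamp."""
    missing = []
    for i, (t, src, _dst) in enumerate(trace):
        if src != requester:
            continue
        nxt = trace[i + 1] if i + 1 < len(trace) else None
        if nxt is None or nxt[1] != responder or nxt[0] != t:
            missing.append(t)
    return missing


print(unanswered_requests(TRACE))
```

An empty list means every packet from the Sequent was answered within etherfind's 0.01 second display resolution, so the replies are on the wire; whatever is going wrong is on the receiving side.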

But it works sometimes.  If you try the above experiment 10 times,
you'll usually get at least one successful arp entry into the
Sequent's cache, and TCP/IP then proceeds normally.

It even happens (to a lesser extent) on our Symmetry running 3.0.12.
That makes me wonder if there's a timing problem here.  Could the
SS1+ be getting the arp response back on the wire before the
Sequent is prepared to deal with it?  Hm.  I dunno.

(Yes, I could add a permanent entry in the Sequent's arp cache.
I don't want to.)

By the way, as long as I'm talking about Sequent arps:
Isn't 495.78 seconds a little too persistent?  That's
when the telnet session finally timed out.  If a host hasn't
responded to an arp within a few seconds, it is never going to.
The above strategy causes TCP/IP to take over 8 minutes(!) to time out
to a dead host.
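
To put numbers on the retry behavior, this Python sketch computes the gap between successive arps from the Sequent. The times are copied from the etherfind trace above; REQUEST_TIMES and retry_gaps are names invented for the example, and the sketch only restates the observed timing. It says nothing about what inside Dynix drives the retries.

```python
# Times (seconds) at which the Sequent sent an arp for blox,
# copied from the etherfind trace above.
REQUEST_TIMES = [0.00, 2.82, 12.32, 24.96, 69.26, 211.58, 306.34, 495.78]


def retry_gaps(times):
    """Return the interval between each attempt and the next, in seconds."""
    return [round(later - earlier, 2) for earlier, later in zip(times, times[1:])]


gaps = retry_gaps(REQUEST_TIMES)
print(gaps)

# Elapsed time from the first arp to the last one, in minutes.
print(round(REQUEST_TIMES[-1] / 60, 1))
```

The gaps mostly grow (2.82, 9.5, 12.64, 44.3, 142.32, 94.76, 189.44 seconds), and the last arp goes out about 8.3 minutes after the first.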