[comp.protocols.tcp-ip] Suspected problem with new end-to-end

oconnor@SCCGATE.SCC.COM (Michael J. O'Connor) (12/15/87)

We've observed a problem with our gateway's interaction with the Arpanet and
we believe that the problem might be associated with the end-to-end upgrade
since it was not observed prior to the current end-to-end testing period.  We
realize that the end-to-end testing was scheduled to end on Dec. 13 and that
all of our tests were made on Dec. 14 but something is not as it used to be
and the new end-to-end is our best guess as to the cause.  Regardless of the
culprit, the problem exists and can be reproduced reliably.
	The problem arises when certain net 10 hosts establish connections to
our gateway which require that a new X.25 virtual circuit be established.  The
connection is made, an X.25 virtual circuit is established, one (1) packet is
received and the connection hangs.  When the circuit is timed out due to
inactivity, it is immediately reestablished and another packet is received.
Any outbound traffic from our gateway that uses that virtual circuit will free
it up and traffic will commence flowing from the remote node.  In our tests we
used one (1) ICMP echo-request to free the hang-ups.
	Our gateway (SCCGATE-GW.SCC.COM) is a Sun-3 running Sun's 3.4 OS and
Sun's DDN X.25 software.  We are connected to port 11 on PSN 20.
	Sun supplied a virtual circuit monitor as part of their DDN package
which we used in our tests.  This monitor displays all of the X.25 circuits
between our Sun and the PSN.  The display consists of the locally assigned
circuit number (which indicates whether the circuit was initiated locally or
remotely), the current state of the circuit, the number of packets sent, the
number of packets received, and the IP address of the net 10 destination.  An
X.25 line monitor would be a better tool but we do not have access to one.
	In order to reproduce the problem, we send traffic to destinations via
routes that differ from the remote system's idea of the route to us.  For
instance, when we Telnet to the Milnet address of SRI-NIC.ARPA, the return
traffic comes from their Arpanet interface.
	The following table summarizes our test results to date:

Destination Name and Address	| Outbound VC	| Inbound VC	| Hangs?
--------------------------------+---------------+---------------+------
twg.arpa	26.5.0.73	| 10.7.0.20	| 10.4.0.51	| Yes
sri-nic.arpa	26.0.0.73	| 10.7.0.20	| 10.0.0.51	| Yes
dugway-mil-tac.	26.0.0.120	| 10.7.0.20	| 10.5.0.5	| Yes
sac-milnet-gw.a	26.0.0.105	| 10.7.0.20	| 10.2.0.80	| Yes
arpa-milnet-gw.	26.0.0.106	| 10.7.0.20	| 10.2.0.28	| Yes
milnet-gw.isi.e	26.0.0.103	| 10.7.0.20	| 10.2.0.22	| Yes
enet1-gw.bbn.co	8.5.0.18	| 10.4.0.82	| 10.2.0.5	| Yes
egp-gate.mitre.	128.29.31.2	| 10.1.0.111	| 10.5.0.111	| Yes
gateway.mitre.o	128.29.31.10	| 10.5.0.111	| 10.1.0.111	| Yes
violet.berkeley	128.32.136.22	| 10.2.0.78	| 10.0.0.78	| Yes
bikini.cis.ufl.	128.227.2.1	| 10.8.0.20	| 10.9.0.20	| No
bikini.cis.ufl.	128.227.2.1	| 10.1.0.17	| 10.9.0.20	| No
terp.umd.edu	128.8.10.90	| 10.8.0.20	| 10.1.0.17	| No
trantor.umd.edu	128.8.10.14	| 10.1.0.17	| 10.8.0.20	| No
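
	For anyone who wants to look for a pattern in the table above, here is
a rough Python sketch that groups the inbound VCs by PSN.  It assumes the usual
net 10 convention that the last octet of the address is the PSN number; the
helper names (psn, inbound_psns, all_asymmetric) are ours, made up for
illustration.  It only tabulates what we measured and does not establish a
cause.

```python
# Rows copied from the table above:
# (destination as listed, outbound VC, inbound VC, hangs?)
ROWS = [
    ("twg.arpa",        "10.7.0.20",  "10.4.0.51",  True),
    ("sri-nic.arpa",    "10.7.0.20",  "10.0.0.51",  True),
    ("dugway-mil-tac.", "10.7.0.20",  "10.5.0.5",   True),
    ("sac-milnet-gw.a", "10.7.0.20",  "10.2.0.80",  True),
    ("arpa-milnet-gw.", "10.7.0.20",  "10.2.0.28",  True),
    ("milnet-gw.isi.e", "10.7.0.20",  "10.2.0.22",  True),
    ("enet1-gw.bbn.co", "10.4.0.82",  "10.2.0.5",   True),
    ("egp-gate.mitre.", "10.1.0.111", "10.5.0.111", True),
    ("gateway.mitre.o", "10.5.0.111", "10.1.0.111", True),
    ("violet.berkeley", "10.2.0.78",  "10.0.0.78",  True),
    ("bikini.cis.ufl.", "10.8.0.20",  "10.9.0.20",  False),
    ("bikini.cis.ufl.", "10.1.0.17",  "10.9.0.20",  False),
    ("terp.umd.edu",    "10.8.0.20",  "10.1.0.17",  False),
    ("trantor.umd.edu", "10.1.0.17",  "10.8.0.20",  False),
]


def psn(addr):
    """Return the last octet of a net 10 address (assumed to be the PSN number)."""
    return int(addr.split(".")[3])


def inbound_psns(rows, hangs):
    """Sorted, de-duplicated PSN numbers of the inbound VCs with the given result."""
    return sorted({psn(inbound) for _, _, inbound, h in rows if h == hangs})


def all_asymmetric(rows):
    """True if every test used a different address outbound and inbound."""
    return all(outbound != inbound for _, outbound, inbound, _ in rows)


if __name__ == "__main__":
    print("inbound PSNs, hang:   ", inbound_psns(ROWS, True))
    print("inbound PSNs, no hang:", inbound_psns(ROWS, False))
    print("all routes asymmetric:", all_asymmetric(ROWS))
```

Every row is asymmetric, so asymmetry alone does not separate the "Yes"
rows from the "No" rows; the grouping by inbound PSN is just one more way to
slice the same data.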

	In case our explanation was not clear enough, we'll describe a
test scenario.  After ensuring that a VC to 10.4.0.51 did not exist, we
would ping twg.arpa.  We would receive 1 echo-reply and the newly created
VC to 10.4.0.51 would show 1 packet received.  The outbound counter on the
VC to 10.7.0.20 would steadily increment as the ping program continued.  If
we simply sat and watched, eventually the VC to 10.4.0.51 would time out and
be removed.  A new VC to 10.4.0.51 would be created immediately thereafter and
we would then receive another echo-reply from twg.arpa.  This new VC would
also show 1 packet received.  If we sent as little as 1 echo-request to
10.4.0.51, the flood-gates would open and traffic would flow normally.
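
	The scenario above can be restated as a toy model in Python.  This
is only a sketch of the symptoms as seen through the VC monitor; it says
nothing about what the PSN or the Sun X.25 software actually does internally,
and the class and method names are hypothetical.

```python
class StalledCircuitModel:
    """Toy model of the observed symptom on a remotely initiated VC.

    The VC delivers one inbound packet and then nothing more until the
    local side sends at least one packet on that same VC.
    """

    def __init__(self):
        self.packets_received = 0
        self.packets_sent = 0
        self.unblocked = False  # becomes True after any local outbound packet

    def remote_sends(self, count):
        """The remote node offers `count` packets; return how many arrive."""
        if self.unblocked:
            delivered = count
        elif self.packets_received == 0:
            delivered = min(count, 1)  # only the first packet gets through
        else:
            delivered = 0  # the hang
        self.packets_received += delivered
        return delivered

    def local_sends(self, count=1):
        """Local outbound traffic on this VC (e.g. one ICMP echo-request)."""
        self.packets_sent += count
        if count > 0:
            self.unblocked = True


if __name__ == "__main__":
    vc = StalledCircuitModel()       # new VC from 10.4.0.51
    print(vc.remote_sends(5))        # one echo-reply arrives
    print(vc.remote_sends(5))        # then nothing
    vc = StalledCircuitModel()       # idle timeout, VC re-created
    print(vc.remote_sends(5))        # one more echo-reply
    vc.local_sends(1)                # one echo-request to 10.4.0.51
    print(vc.remote_sends(5))        # traffic now flows
```

In the model, the idle timeout is represented simply by creating a fresh
object, which matches the monitor showing the new VC with 1 packet received.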
	We believe that this problem could explain some of the recent
comments about network troubles in tcp-ip, for instance Eric Johnson's
mail delivery problems to violet.berkeley.edu from hosts in cis.ufl.edu.
	We wish we could explain the cause of the problem instead of just
describing symptoms, but that's all our current resources allow.


			Bob & Mike

Bob Enger	enger@bluto.scc.com
Mike O'Connor	oconnor@sccgate.scc.com