pogran@CCQ.BBN.COM (Ken Pogran) (12/24/87)
Folks,
Here's where we stand on resolving ARPANET PSN 7 problems:
Please refer to my message of 15 December entitled "An ARPANET
Update" for a description of the problems referred to here.
We have successfully tested fixes for the "one packet problem"
and the "pinging yourself" problem. These patches should be
deployed ARPANET-wide within the next day or so. We have
identified the "multiple of 128 bytes" problem as a host software
problem. Here are the details:
1. The "one packet problem" (otherwise known as the "stuck VC
problem," "thrashing VC problem," etc.). Known to affect Sun
X.25 hosts.
When an 1822-connected host begins to send data to an
X.25-connected host, the destination PSN, to which the
X.25-connected host is attached, must open an X.25 VC to
the destination host. Under PSN 7, the PSN opens the VC,
sends the first IP datagram, and waits for an RR from the
host before allowing the source PSN to send additional data
across the network (and to return a RFNM to the source host
for the first packet). This behavior is different from PSN
6, where up to 8 datagrams could be sent to the destination
host. Under PSN 6, a source host could conceivably receive
the RFNM for the first such datagram before the datagram was
acknowledged by the destination host.
RRs are often piggy-backed on traffic flowing over the same
VC in the opposite direction. However, conditions such as
Mailbridge homing in the DDN can produce asymmetric flows.
Many X.25 implementations send an RR to acknowledge a
packet based on the expiration of a timer, if there is no
reverse traffic. Sun X.25, however, does not; instead it
waits for the window to become "half full".
The behavior of the "interoperability" mechanism of PSN 7,
together with the behavior of the Sun X.25, created a "deadly
embrace" in which only one datagram would be received on the
VC.
Behavior of the PSN 7 interoperability mechanism is being
changed to eliminate this condition. The patch to do this
has been tested in our lab and will be deployed to the
ARPANET shortly.
NOTE: A patch was deployed last week in an attempt to fix
this problem. That patch did not work, and was removed from
the network last night. We had been unable to test that
patch in the lab beforehand, because at the time we did not
have a Sun with an X.25 interface hooked up to our lab net.
2. The "pinging yourself" problem. The timing bug described in
my message of 15 December has been fixed in a patch tested
earlier today. Mike Petry at UMd was our "guinea pig", and
reports that the problem he saw has in fact been corrected by
this patch. This patch will also be deployed to the ARPANET
shortly.
3. The "multiple of 128 bytes" problem. Using our Sun with X.25
interface in our lab net, and with a datascope on the X.25
link between the PSN and the Sun, we tried "pinging" the Sun
from another host. We found that with packets of length 127,
128, 255, 256 ... the datascope showed the "ping" going to
the Sun, but no response from the Sun. With packets of other
lengths, the datascope showed the "ping" and its reply going
across the link. The packets from the PSN are well-formed in
every respect. At this point we can only assume there's a
bug in the host code.
--> Has anyone OTHER than folks with Suns with X.25
interfaces seen this problem? If so, please send a message
to ARPAUPGRADE@BBN.COM.
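For readers who want to see the "deadly embrace" of item 1 in miniature, here is a rough Python sketch of a one-way VC with no reverse traffic. It is a toy model, not PSN or Sun code: the function name, the policy names "timer" and "half_window", the X.25 window of 8, and the tick-based timing are all assumptions made for illustration. Only the limits of 1 outstanding datagram (PSN 7) and 8 (PSN 6), and the two host acknowledgement behaviors, come from the description above.

```python
def simulate(psn_limit, host_policy, total=20, window=8, max_ticks=100):
    """Toy model of a one-way VC. Returns how many datagrams reach the host.

    psn_limit   -- datagrams the PSN will have outstanding before it waits
                   for an RR (1 models PSN 7, 8 models PSN 6)
    host_policy -- "timer": host sends an RR when its timer expires
                   "half_window": host sends an RR only once the number of
                   unacknowledged datagrams reaches half the window
    window      -- assumed X.25 window size (hypothetical value)
    """
    delivered = 0   # datagrams handed to the destination host
    unacked = 0     # delivered but not yet acknowledged by an RR
    for _ in range(max_ticks):
        if delivered == total:
            break
        progressed = False
        # The PSN forwards datagrams until it hits its outstanding limit.
        while delivered < total and unacked < psn_limit:
            delivered += 1
            unacked += 1
            progressed = True
        # The host decides whether to send an RR. There is no reverse
        # traffic, so nothing can be piggy-backed.
        if host_policy == "timer" and unacked > 0:
            unacked = 0
            progressed = True
        elif host_policy == "half_window" and unacked >= window // 2:
            unacked = 0
            progressed = True
        if not progressed:
            break   # neither side can move: the VC is stuck
    return delivered


for limit in (1, 8):
    for policy in ("timer", "half_window"):
        print(limit, policy, simulate(limit, policy))
```

In this model only the combination of a limit of 1 with the "half_window" policy stalls, after a single datagram. The other three combinations deliver everything, which is consistent with the problem appearing only after the move from PSN 6 to PSN 7.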
Happy holidays, everyone.
Ken Pogran
BBN COMMUNICATIONS
melohn@Sun.COM (Bill Melohn) (01/05/88)
After catching up on my mail over the holidays, it appears that everyone now believes the problems related to "stuck VCs" have been fixed. However, when I drop my host Sun.COM back to the unpatched version of our kernel (the one that doesn't attempt to kludge around the BBN bug by always sending an RR with each packet) I still notice "stuck VCs", easily reproducible on one-way VCs between us and machines on IMPs 11 and 68. Either the latest version of Andy's patch has not been fully deployed, or it too does NOT fix the problem.

On the 128-byte packet problem: we are in the process of getting packet traces from the ARPAnet to see exactly what the packet traffic looks like when we appear to lose the 128-byte packets. I should point out that this too only appears to happen between us and 1822 hosts running the new end to end; I suspect that we will find another PSN 7.0 bug at the root of this problem. More as soon as I have the traces.

We are in the process of testing a new version of our software to handle multiple incoming VCs from the same IP host. Because multiple VCs are used for X.25 loopback by the PSN under the new end to end, we feel we have little choice but to support them. I do feel that requiring this support without any warning that it would be required by the new end to end was a mistake, one that our mutual customers may have to live with until we can test and manufacture a new software release. It also conceptually wastes VCs, which are limited resources in many X.25 implementations, because in many more cases it encourages two or more one-way VCs between host pairs where a single VC would have existed under the old end to end.
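
For anyone collecting traces of the "multiple of 128 bytes" problem, a small helper like the following may be handy for choosing which ping sizes to sweep. It assumes the failing lengths reported earlier in the thread (127, 128, 255, 256 ...) continue as "each multiple of 128 and one byte below it"; that continuation, the function names, and the question of which headers count toward "length" are assumptions, not something the thread establishes.

```python
def is_suspect(length, block=128):
    """True if length is a positive multiple of block, or one byte below one."""
    return length > 0 and length % block in (0, block - 1)


def suspect_lengths(max_len, block=128):
    """All suspect lengths from 1 through max_len, in increasing order."""
    return [n for n in range(1, max_len + 1) if is_suspect(n, block)]


print(suspect_lengths(512))
```

Sweeping these sizes alongside a few neighbors (126, 129, 254, 257) would show whether the loss really tracks the 128-byte boundary.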