[comp.protocols.tcp-ip] Strange problems with VAX/VMS DDN X.25 hosts

WANCHO@SIMTEL20.ARPA.UUCP (06/07/87)

Karl Goodloe's machine is a VAX 11/750 running 4.2bsd with some Mt.
Xinu fixes from a couple years back.  The problem he describes below
seems to happen only when VAX/VMS systems running Wollongong TCP/IP
over a DDN X.25 connection attempt to send mail over a certain size to
his host.  If this situation looks familiar to anyone, or you might
have some clue as to what's happening, please contact Karl directly
and copy me in your message.

Here are the details from Karl:

Date: Wednesday, 3 June 1987  10:11-MDT
From: kgoodloe at miser.ARPA (Karl G. Goodloe)
To:   fwancho
Re:   Mail Problem

We really haven't installed any software updates on MISER.  We have
done a lot of work on the sendmail.cf file to accommodate the change in
the host table.  None of the ones that you gave us [Eric Fair's
versions of the sendmail.cf files on SRI-NIC.ARPA] were compatible
with the way that we have set up to exchange mail with our INTEL-310
systems.  We use uucp; however, some changes were made in the normal
way that uucp hosts are taken care of in the configuration file.  When
the problem appeared, I changed back to the old sendmail.cf just to
make sure that the changes weren't causing the problem.

The problem occurs when we receive mail from a VAX/VMS host.  A
sendmail process is created for the message, and the corresponding
dfAA<PID> file is created.  No corresponding qfAA<PID> file is ever
created, so it is a bit difficult to know where the message is coming
from and more difficult to tell who it was supposed to be addressed
to.  We have always been able to find the addressee from the context
of the message in the df* file, and eventually the host name shows up
in the xfAA<PID> file where some hand shaking between the two machines
is recorded.  Nothing is ever entered into the syslog file.

The worst of the problem is that the df* file just receives about the
first five lines of the message, with the next line repeated millions
of times until the file system is full.  As if this wasn't bad enough,
a few minutes later it will start up another process for the same
message, with a new sendmail process and a new giant dfAA<PID> file.
The file grows at about 100K/minute, so things get pretty bad in a
hurry.  Even if you see what is happening before the file system fills
and other mail is rejected, twenty minutes later the VAX wakes up and
sends it again.  [Perhaps it is fortunate that Karl's machine is
connected to a slow 9.6Kbps line or that growth rate would be even
more dramatic!]

This has gone on for days when it starts on a week-end.  I presently
have a shell script that watches for this to happen, and when the file
gets larger than 100K, the process is killed and the files deleted.
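
For anyone who wants a similar watchdog, here is a minimal sketch in
Python rather than Karl's actual shell script, which is not shown.  It
only reports runaway data files.  The function names are hypothetical,
and reading the PID out of the file name assumes the dfAA<PID> naming
described above.

```python
import os

LIMIT = 100 * 1024  # the 100K threshold mentioned in the message


def oversized_data_files(queue_dir, limit=LIMIT):
    """Return the names of df* files in queue_dir larger than limit bytes."""
    hits = []
    for name in sorted(os.listdir(queue_dir)):
        if not name.startswith("df"):
            continue  # only the message body files are of interest
        path = os.path.join(queue_dir, name)
        try:
            size = os.path.getsize(path)
        except OSError:
            continue  # file vanished between listing and stat
        if size > limit:
            hits.append(name)
    return hits


def pid_from_name(name):
    """Extract the PID, assuming the name has the form dfAA<PID>."""
    return int(name[4:])
```

A caller could pass each reported PID to os.kill() and then remove the
matching df* and xf* files, which is what the script described above
does by hand.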

For the moment, I am considering it to be a problem with the new
installations.  They all seem to be VAX/VMS hosts running Wollongong
software and connecting to DDN with X.25 protocol.  Since one of these
is Nuclear Effects [a local host, NEL.ARPA, in Karl's organization],
we have someone nearby with considerable interest in solving the
problem.

Very short messages are not a problem, but they have to be less than
1000 bytes.  The people at Nuclear Effects have also discovered that
they cannot FTP large files from NEL.ARPA to MISER.ARPA.  The other
way works fine.  They assume that this is just another manifestation
of the same problem, but this doesn't cause any trouble for MISER when
they try.  The transfer just sits there and does nothing until the job is
killed.  The other two hosts that we have had trouble with are
AFOTEC.ARPA (26.1.0.87) and DPG-MT.ARPA (26.8.0.120).

If you have any ideas, both I and Dave Watson (dwatson@MISER.ARPA),
the NEL.ARPA system administrator, would be glad to hear from you.

Karl

PS: Overnight we had a similar problem caused by a message from
AMC-4.ARPA, which is apparently a PYRAMID-90X running UNIX.

hedrick@TOPAZ.RUTGERS.EDU.UUCP (06/08/87)

The 4.2 version of sendmail has some places where it should check for
the connection being closed and does not.  This can put it into
infinite loops.  We have also seen the situation where it writes a
file while in this loop.  As far as I know, the 4.3 version of
sendmail does not have this problem.  It is available from Berkeley by
anonymous ftp, because it is part of the MX update.  It will work
under 4.2.
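
To illustrate the kind of check being described, here is a minimal
sketch in Python.  It is not sendmail's code, and collect_message is a
hypothetical name.  It reads message lines until the lone "." that ends
an SMTP DATA section, and it stops as soon as the read reports end of
file, which is the signal that the peer closed the connection.  A loop
that omits that test has nothing to stop it once the connection drops.
Dot-stuffing is left out for brevity.

```python
import io


def collect_message(stream):
    """Read lines until a lone '.'; return (lines, complete).

    complete is False when the stream ended before the terminator,
    i.e. the connection was closed in the middle of the message.
    """
    lines = []
    while True:
        line = stream.readline()
        if line == "":
            # End of file: the peer went away.  Without this test the
            # loop would spin forever on empty reads.
            return lines, False
        line = line.rstrip("\r\n")
        if line == ".":
            return lines, True
        lines.append(line)


if __name__ == "__main__":
    truncated = io.StringIO("Subject: test\r\nfirst line\r\n")
    print(collect_message(truncated))
```

A caller that gets complete == False should discard the partial data
file instead of queueing it.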