[comp.mail.sendmail] Trouble sending mail to MMDF sites

mcb@ncis.tis.llnl.gov (Michael C. Berch) (05/05/89)

Our site hosts the MHSNEWS (X.400) and IFIP-GTWY (RFC987 stuff)
Internet mailing lists, and recently a number of the addresses on the
list have become undeliverable.  A number of things have changed
around here, including an upgrade from SunOS 3.5 to 4.0 on the sending
host (tis.llnl.gov), an IP address change, and a couple mods to our
sendmail config, but none of them should really affect anything.
We're running Sun Sendmail 4.0.

The undeliverable sites all appear to be running MMDF (at least that's
what the banner looks like), and they all give the same error in
response to the MAIL command, as shown below.  Occasionally the response is 
"451 Nameserver timeout during parsing" instead of what's shown.
Sites that have done this include sh.cs.net, brl.mil, cc5.bbn.com, and
a couple of others.

I should point out that during this period our host has delivered mail
to zillions of other Internet hosts, most of them sendmail sites plus
a few TOPS-20 and VMS and what-have-you, without any problems, and our
zone name server and its parent seem to be working just fine.

Anybody got a clue?  David Herron, are you out there?

Michael C. Berch  
mcb@ncis.llnl.gov / uunet!ncis.llnl.gov!mcb

-----
Transcript of SMTP session:
Running AA18351
arpa-mhs-bboard@sh.cs.net... Connecting to sh.cs.net via tcp...
Trying 192.31.103.3...  connected.
220 sh.cs.net Server SMTP (Complaints/bugs to:  Postmaster@SH.CS.NET)
>>> HELO tis.llnl.gov
250 sh.cs.net - you are a charlatan
>>> MAIL From:<ifip-gtwy-request@tis.llnl.gov>
451 [ Invalid argument ] Nameserver timeout for 'ifip-gtwy-request@tis.llnl.gov
>>> QUIT
221 sh.cs.net says goodbye to [128.115.26.3] at Wed May  3 19:17:35.
arpa-mhs-bboard@sh.cs.net... Deferred: [ Invalid argument ] Nameserver timeout for 'ifip-gtwy-request@tis.llnl.gov

iglesias@orion.cf.uci.edu (Mike Iglesias) (05/08/89)

In article <181@ncis.tis.llnl.gov> mcb@ncis.tis.llnl.gov (Michael C. Berch) writes:
>I should point out that during this period our host has delivered mail
>to zillions of other Internet hosts, most of them sendmail sites plus
>a few TOPS-20 and VMS and what-have-you, without any problems, and our
>zone name server and its parent seem to be working just fine.

The problem is that the MMDF systems can't resolve your IP address
to your host name.  My system can't resolve tis.llnl.gov to an IP
address, and several of the systems here (VMS and Unix) can't get
mail delivered to llnl.gov sites.  I think there's a problem with
a nameserver somewhere - I can't tell if it's yours or not.  If you're
not running the llnl.gov server, maybe that one is broken?


Mike Iglesias
University of California, Irvine

david@ms.uky.edu (David Herron -- One of the vertebrae) (05/09/89)

In article <181@ncis.tis.llnl.gov> mcb@ncis.tis.llnl.gov (Michael C. Berch) writes:
>...  A number of things have changed
>around here, including an upgrade from SunOS 3.5 to 4.0 on the sending
>host (tis.llnl.gov), an IP address change, and a couple mods to our
>sendmail config, but none of them should really affect anything.
>We're running Sun Sendmail 4.0.
>
>The undeliverable sites all appear to be running MMDF (at least that's
>what the banner looks like), 

MMDF's "banner" looks like the one you see below for sh.cs.net, with
its message about "Complaints/bugs to: ...".

>I should point out that during this period our host has delivered mail
>to zillions of other Internet hosts, most of them sendmail sites plus

>Anybody got a clue?  David Herron, are you out there?

yea .. I'm out here and I got some clues.


>Transcript of SMTP session:
>Running AA18351
>arpa-mhs-bboard@sh.cs.net... Connecting to sh.cs.net via tcp...
>Trying 192.31.103.3...  connected.
>220 sh.cs.net Server SMTP (Complaints/bugs to:  Postmaster@SH.CS.NET)
>>>> HELO tis.llnl.gov
>250 sh.cs.net - you are a charlatan
		^^^^^^^^^^^^^^^^^^^^
>>>> MAIL From:<ifip-gtwy-request@tis.llnl.gov>
>451 [ Invalid argument ] Nameserver timeout for 'ifip-gtwy-request@tis.llnl.gov
>>>> QUIT
>221 sh.cs.net says goodbye to [128.115.26.3] at Wed May  3 19:17:35.
				^^^^^^^^^^^^^
>arpa-mhs-bboard@sh.cs.net... Deferred: [ Invalid argument ] Nameserver timeout for 'ifip-gtwy-request@tis.llnl.gov

MMDF is a little bit stricter about some things than a lot of people
are accustomed to.

For instance, at the places I've marked, what's happening is that the
SMTP daemon has accepted the connection from your daemon.  It does
the system call that finds the IP address of the remote end of the
connection (getpeername(), I believe).  Then it asks gethostbyaddr() what
the host name is for that IP address.  If that name doesn't match the one
you gave in HELO, or no name comes back at all, it prints that charlatan
message -- tho' it can be configured to simply NOT allow the connection
at this point.
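
The check described above can be sketched roughly as follows. This is a
minimal illustration in Python, not MMDF's actual code: the function name
helo_matches_peer and the injectable reverse_lookup parameter are
assumptions made for the example; only socket.gethostbyaddr() is a real
library call.

```python
import socket


def helo_matches_peer(helo_name, peer_ip, reverse_lookup=socket.gethostbyaddr):
    """Compare the name given in HELO with the reverse lookup of the peer's IP.

    Returns (ok, resolved_name). resolved_name is None when the reverse
    lookup fails, for example because the nameserver timed out.
    helo_matches_peer is a hypothetical helper, not part of MMDF.
    """
    try:
        # gethostbyaddr returns (primary_name, alias_list, address_list).
        name, aliases, _addresses = reverse_lookup(peer_ip)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return False, None

    def norm(host):
        # Host names compare case-insensitively; ignore a trailing dot.
        return host.lower().rstrip(".")

    known = {norm(name)} | {norm(alias) for alias in aliases}
    return norm(helo_name) in known, name
```

A server would get peer_ip from the accepted connection (for instance
from getpeername() on the socket) and could then decide whether to greet
the client normally, complain, or refuse the connection.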

Farther down it's saying goodbye to your IP address.  Again, it wasn't
able to get the host name for your IP address.

Both imply nameserver timeouts, something that's definitely happening.

451 is a "transient" error, meaning that you're supposed to go away
and come back later.  Hopefully that's what your software is actually doing.
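
To illustrate the "transient" point: the first digit of an SMTP reply
code tells the sender whether to carry on, retry later, or give up. The
sketch below assumes a hypothetical helper named classify_reply; it is
not taken from sendmail or MMDF.

```python
def classify_reply(line):
    """Classify an SMTP reply line by the first digit of its 3-digit code.

    classify_reply is a hypothetical helper for illustration only.
    Returns "ok" for 2yz/3yz, "retry" for 4yz (transient failure),
    and "fail" for 5yz (permanent failure).
    """
    code = int(line[:3])  # raises ValueError if the line has no numeric code
    if 200 <= code < 400:
        return "ok"
    if 400 <= code < 500:
        return "retry"  # queue the message and try again later
    if 500 <= code < 600:
        return "fail"  # bounce the message back to the sender
    raise ValueError("not an SMTP reply code: %r" % code)
```

Applied to the 451 line in the transcript, this gives "retry", which
matches the "Deferred" status that sendmail reported.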

Now, the nameservers at sh.cs.net and relay.cs.net do not, even today,
know any answers for tis.llnl.gov.  Bizarrely, they know pointers to the
nameservers for both "gov" and "llnl.gov" but still haven't found an
answer for tis.llnl.gov...  And finally, the server at LLL-CRG.LLNL.GOV
DOES know the right answer.

So anyway.  MMDF tries to verify the addresses and such in both the
out-of-band information *AND* the header -- rather, it verifies the stuff
in the header as part of normalizing the header.  There are a number of
options for configuring MMDF's responses to this.  For example, you can
turn off the header parsing entirely, or have it only "try": if it runs
into a nameserver timeout, it goes ahead and accepts the message but marks
it with a warning.  And so on.  I don't know how they have sh.cs.net
configured, or even whether they have all the newest options enabled.
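
The range of behaviours described above could be modelled along these
lines. This is only a sketch: the names HeaderCheck, OFF, TRY, STRICT and
handle_address are invented for the example and are not MMDF's real
option names, and the warning text is made up.

```python
from enum import Enum


class HeaderCheck(Enum):
    """Hypothetical policy names; these are not MMDF configuration keywords."""
    OFF = "off"        # do not verify addresses at all
    TRY = "try"        # verify, but accept with a warning on a timeout
    STRICT = "strict"  # verify, and defer the message on a timeout


def handle_address(policy, lookup_timed_out):
    """Return (smtp_code, warning) for one address under the given policy."""
    if policy is HeaderCheck.OFF or not lookup_timed_out:
        return 250, None
    if policy is HeaderCheck.TRY:
        # Accept the message but record that verification did not complete.
        return 250, "nameserver timeout; address not verified"
    # STRICT: a transient failure, so the sender should retry later.
    return 451, None
```

Under this model, the 451 reply in the transcript corresponds to the
STRICT case with a lookup that timed out.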

I think, though, the real problem is the recent developments on the Arpanet.
The "net 10" portion is going away and apparently a large chunk of it 
disappeared very recently.  As it was described to me, net 10 was
previously the "glue" that held the nameservers together, but now
the nameservers are somewhat disconnected from each other.  Certainly
the one on sh.cs.net isn't able to get answers from the one at llnl.gov.
I tried getting the answer from sh.cs.net a number of times over a
one-hour period, something which, on our nameserver, would've gotten the
answer ... usually.




How does SendMail react to nameserver timeouts?
-- 
<- David Herron; an MMDF guy                              <david@ms.uky.edu>
<- ska: David le casse\*'      {rutgers,uunet}!ukma!david, david@UKMA.BITNET
<- By all accounts, Cyprus (or was it Crete?) was covered with trees at one time
<- 		-- Until they discovered Bronze