[comp.mail.sendmail] Strange behaviour with MX record lookups, mailers, ...

kre@cs.mu.OZ.AU (Robert Elz) (01/31/91)

I received the following message from a new Internet site here in
Australia ...

   From: r.liu@trl.OZ.AU (Richard Liu)
   Subject: Re: MX records for hosts on oz.au

   Hmmm. I did an nslookup and typed
   > set querytype=MX
   > aarnet.edu.au
   Server:  shiva.trl.OZ.AU
   Address:  137.147.20.34

   Non-authoritative answer:
   aarnet.edu.au.OZ.AU     preference = 200, mail exchanger = munnari.OZ.AU
   aarnet.edu.au.OZ.AU     preference = 500, mail exchanger = uunet.UU.NET
   Authoritative answers can be found from:
   ...

   I think there is something screwed up with our name server. It's putting the
   .OZ.AU on the end of any MX lookups. "aarnet.edu.au." (with the trailing
   .) works.

The relevant point being that the correct answer for this query should
have been ...

	aarnet.edu.au.  86400   MX      10 jatz.aarnet.EDU.AU.

I thought that perhaps my reply may be of some more general interest
(none of this IMHO stuff here...) so I thought I would send it to
the newsgroups/mailing lists this article is being sent to...

It follows (quoting a later message from the same source on the
same topic):

      It also seems to behave this way on munnari's name server.

Yes, and on all that are in .oz.au (people who live in other
domains add other things to the end, which end up causing fewer
problems).

      This causes mail to someone@aarnet.edu.au to be relayed
      to munnari. (from trl).

Are you sure?   If it really does, then your mailer is broken.
(Just looking at what nslookup suggests should happen does not
necessarily indicate what the mailer will actually do - eg:
as you indicated, munnari's nslookup here does just the same
thing, yet munnari's mailer (obviously) doesn't do that).

      Any idea how we change this ?

Yes - but first an explanation of what is happening, and
why (while it will fool nslookup) it shouldn't matter to mail.

Whenever you look up a name with the resolver, unless the name
ends with a '.' (ie: munnari.oz.au. or trl.oz.au.) the
resolver adds your default domain (I'll assume it's trl.oz.au,
it's in /etc/resolv.conf) to the name, and looks up that.

So, when you look up "jatz.aarnet.edu.au" the resolver first
tries, "jatz.aarnet.edu.au.trl.oz.au", and then when that fails
it tries "jatz.aarnet.edu.au.oz.au" (ie: omits the first
component of your default domain, and adds what's left).

That continues until the lookup succeeds, or until there are
only two components left in the default domain that was last
appended (ie: it will never attempt "jatz.aarnet.edu.au.au",
as that would have only 1 added default component).  After that
the original name that you used is looked up (just as if you
had typed it with a trailing '.' initially).
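
The search order just described can be sketched in a few lines of
Python.  This is only an illustration of the description above, not
the resolver's actual code; the function name search_candidates is
made up for the example.

```python
def search_candidates(name, default_domain):
    """List the names the resolver would try, in order, as described above."""
    if name.endswith("."):
        return [name]  # already absolute: no default domain is added
    labels = default_domain.strip(".").split(".")
    candidates = []
    # Append the default domain, then each shorter parent, stopping once
    # fewer than two components would be appended.
    while len(labels) >= 2:
        candidates.append(name + "." + ".".join(labels) + ".")
        labels = labels[1:]
    # Finally the name exactly as typed, treated as absolute.
    candidates.append(name + ".")
    return candidates


for candidate in search_candidates("jatz.aarnet.edu.au", "trl.oz.au"):
    print(candidate)
```

With a default domain of trl.oz.au this gives
jatz.aarnet.edu.au.trl.oz.au., then jatz.aarnet.edu.au.oz.au., and
finally jatz.aarnet.edu.au. itself.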

This allows you to do things like "telnet shiva", or
"ftp usage.unsw" which would otherwise not work.

Some modified resolvers don't do this if there was more than
one '.' in the original name, which is probably a good idea,
but is not material here.

Normally, other than in the wildest circumstances (eg: if there
really was a host called "jatz.aarnet.edu.au.trl.oz.au")
this works well - except when what you are looking for is
an MX record, and there is a wildcard MX in one of the
default domains that is added.

Ie: at the minute, when you lookup "jatz.aarnet.edu.au.oz.au"
for the MX record, it succeeds, and returns the default MX
for *.oz.au (as there is no au.oz.au really), so the lookup
process terminates, and returns the wrong thing (or rather,
returns something that you weren't expecting).
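
A toy model of that false halt, in Python, using the aarnet.edu.au
query from the nslookup output quoted earlier.  The MX values are the
ones quoted in this article; the NODES set (names assumed to exist in
the tree), the function names, and the simplified wildcard rule (a
wildcard is consulted only directly below the closest existing
ancestor of the name) are assumptions made for the sketch, not a real
name server.

```python
# Toy data only: MX values as quoted in this article, NODES assumed.
MX = {
    "*.oz.au.": [(200, "munnari.oz.au."), (500, "uunet.uu.net.")],
    "aarnet.edu.au.": [(10, "jatz.aarnet.edu.au.")],
}
NODES = {"au.", "oz.au.", "trl.oz.au.", "edu.au.", "aarnet.edu.au."}


def lookup_mx(fqdn):
    """Return the MX list for an absolute name (simplified wildcard rule)."""
    fqdn = fqdn.lower()
    if fqdn in NODES:
        return MX.get(fqdn, [])  # the name exists: wildcards never apply
    labels = fqdn.rstrip(".").split(".")
    for i in range(1, len(labels)):
        ancestor = ".".join(labels[i:]) + "."
        if ancestor in NODES:
            # Only a wildcard directly below the closest existing
            # ancestor can answer for this name.
            return MX.get("*." + ancestor, [])
    return []


def first_mx(candidates):
    """Stop at the first candidate that returns any MX records."""
    for name in candidates:
        records = lookup_mx(name)
        if records:
            return name, records
    return None, []


# Names tried for "aarnet.edu.au" with default domain trl.oz.au.
tried = ["aarnet.edu.au.trl.oz.au.", "aarnet.edu.au.oz.au.", "aarnet.edu.au."]
print(first_mx(tried))               # halts at aarnet.edu.au.oz.au.
print(first_mx(["aarnet.edu.au."]))  # trailing '.': the real MX
```

The first call stops at the wildcard for *.oz.au and never reaches
the real record; the second, with the trailing '.', finds the
preference 10 MX for jatz.aarnet.edu.au.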

This shouldn't be a problem in practice, other than in confusing
people who run nslookup to find MX records.  No human cares what
the MX record for something is in general (ie: you don't need
that for any practical work, only network admin and mailer
people ever look them up).  The system mailer is the only
application that ever wants MX records, and unless it has been
severely broken, it should disable adding the default domains
before it does the MX lookup (there's a resolver option to do
that).

Before I go into why it sometimes is a problem, I will answer
one seeming objection ... ie: that it prevents your local people
mailing to user@shiva - and would require them to always use
user@shiva.trl.oz.au.

First - that is not necessarily a bad thing, as the mailer can't
possibly do the "correct" thing in all cases - it's possible for
users (by setting the LOCALDOMAIN environment variable) to
make the default domain be whatever they like, but the mailer
doesn't get to see this, as it's a running daemon process, and
the only communication with users is via smtp usually (even where
the user's mail agent runs the system mailer as a sub-process,
and the mailer does get to see the environment, that won't still
be available later if the message is queued for some reason, and
only processed by a later queue run).

Because the mailer can't do the correct thing, it probably
shouldn't do any of this - you don't really want mail to
"user@shiva" to possibly end up at different places, depending
on whether or not the mail happened to be delivered immediately
by the mailer, or was queued for later...

With that in mind, I always guarantee that no unexpected
default domain will ever be added by my sendmail, by forcing
sendmail.cf to always add a trailing '.' to every name before
it is looked up by the resolver.

To make it easy for people to mail to "user@host" etc, I have
sendmail.cf do several explicit lookups with various different
default domains, as I choose, in the order I choose, so a
message to "user@host" arriving at munnari will try to be
sent to (in order) "user@host.cs.mu.oz.au.",
"user@host.its.unimelb.edu.au", "user@host.maths.mu.oz.au",
... (several other possible default domains).  The rules
for "user@host.onedomain" where "onedomain" isn't "oz" or
a top level domain (au, nl, edu, gov, ...) are similar.

Note that the default domains used are fixed, predictable,
but not in any way based on a single default domain string
(ie: I can have cs.mu.oz.au and ucs.unimelb.edu.au, and others).
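
The idea can be sketched in Python (the real thing is done with
sendmail.cf rules, which are not shown here).  The first three
default domains are the ones listed above; the host names, the
KNOWN_HOSTS set standing in for the name server, and the function
name qualify are all hypothetical.

```python
# Default domains tried, in a fixed order (the real list is longer).
DEFAULT_DOMAINS = ["cs.mu.oz.au", "its.unimelb.edu.au", "maths.mu.oz.au"]

# Hypothetical stand-in for the name server: absolute names that resolve.
KNOWN_HOSTS = {
    "hosta.its.unimelb.edu.au.",
    "hostb.cs.mu.oz.au.",
    "hostb.maths.mu.oz.au.",
}


def qualify(address, exists=KNOWN_HOSTS.__contains__):
    """Qualify user@host by trying each default domain in order."""
    user, _, host = address.partition("@")
    if "." in host:
        return address  # this sketch only handles bare host names
    for domain in DEFAULT_DOMAINS:
        # The trailing '.' guarantees the resolver adds nothing itself.
        candidate = f"{host}.{domain}."
        if exists(candidate):
            return f"{user}@{candidate}"
    return None  # no default domain produced a name that resolves


print(qualify("user@hostb"))
print(qualify("user@hosta"))
```

Because the list is fixed, "user@hostb" always resolves to the
cs.mu.oz.au host here, even though a hostb also exists under
maths.mu.oz.au, and the answer does not depend on who sent the
message or when it was processed.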

Really though, the user's mail agent should be adding default
domains: it can do it in the way that the user expects (so that
the "shiva" that answers ftp commands is the same one reached
by mail to "user@shiva").   MH will do that, ucbmail (Mail)
doesn't (so, use MH!).   I have no idea about elm, mush,
etc - if they don't do this, they should be fixed so they do.

However, many people don't set up their sendmail.cf like this,
and default domains can still work (using the system default
domain from resolv.conf), even in the presence of the wildcard
MX record.  That's because the first thing that sendmail does
when it sees a host name, is attempt to see if it is an alias.
To do this, it typically looks for A records for the host
(using the standard gethostbyname()) which also maps any
CNAME it finds along the way (this isn't the best way to do it,
but it's satisfactory).   The wildcard MX gets ignored here,
and you don't get a false halt at "jatz.aarnet.edu.au.oz.au"
and instead end up finding the "A" as "jatz.aarnet.edu.au"
(or if there was no A record, then that lookup fails, and the
name is left alone - which is why looking for the A isn't the
best way, it fails to find a CNAME whose result has no A record).
But if you had given just "shiva" the lookup would have found
"shiva.trl.oz.au" and the name would be translated to the full
name for you, before any MX lookups get started.

Here's where the first potential sendmail bug appears (and the
one that is most prevalent, as I think it's actually distributed
this way on some systems by the system vendors).

Sendmail has a compilation option "NO_WILDCARD_MX".  If it is
set, you are promising sendmail that there is no wildcard MX
record anywhere up the default domain tree which could confuse
things.  With this option set, instead of looking for an A
record (or even a CNAME) in the above alias lookup, sendmail
instead asks the nameserver for records of any type (ie:
type=any), which will of course, find an MX record as well as
A records and CNAME records.
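
The difference between the two kinds of alias lookup can be shown
with a toy model in Python.  It is not sendmail's code: the record
table, the NODES set of names assumed to exist, the simplified
wildcard rule, and the function names are assumptions for the
sketch.  The MX values and shiva's address are the ones quoted in
this article; the address for jatz is a placeholder.

```python
# Toy records: MX values and shiva's address as quoted in this article;
# jatz's address is a placeholder from the documentation range.
RECORDS = {
    ("*.oz.au.", "MX"): ["200 munnari.oz.au.", "500 uunet.uu.net."],
    ("shiva.trl.oz.au.", "A"): ["137.147.20.34"],
    ("jatz.aarnet.edu.au.", "A"): ["192.0.2.10"],
}
NODES = {"au.", "oz.au.", "trl.oz.au.", "shiva.trl.oz.au.",
         "edu.au.", "aarnet.edu.au.", "jatz.aarnet.edu.au."}


def query(fqdn, qtype):
    """Return (type, data) pairs for fqdn; qtype "ANY" matches every type."""
    owner = fqdn.lower()
    if owner not in NODES:
        labels = owner.rstrip(".").split(".")
        owner = None
        for i in range(1, len(labels)):
            ancestor = ".".join(labels[i:]) + "."
            if ancestor in NODES:
                owner = "*." + ancestor  # simplified wildcard rule
                break
    return [(rtype, value)
            for (name, rtype), values in RECORDS.items()
            if name == owner and qtype in (rtype, "ANY")
            for value in values]


def canonicalize(name, default_domain, qtype):
    """Return the first search-list candidate with records of qtype."""
    parts = default_domain.split(".")
    suffixes = [".".join(parts[i:]) for i in range(len(parts) - 1)]
    for candidate in [f"{name}.{s}." for s in suffixes] + [name + "."]:
        if query(candidate, qtype):
            return candidate
    return name  # nothing found: leave the name alone


print(canonicalize("jatz.aarnet.edu.au", "trl.oz.au", "A"))
print(canonicalize("jatz.aarnet.edu.au", "trl.oz.au", "ANY"))
print(canonicalize("shiva", "trl.oz.au", "A"))
```

Looking only for A records skips the wildcard and settles on
jatz.aarnet.edu.au.; asking for any type halts at
jatz.aarnet.edu.au.oz.au. because the wildcard MX answers.  A bare
"shiva" becomes shiva.trl.oz.au. either way.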

This is intended as a nameserver optimisation - this one
(initial) request causes your local server to fetch all
possible information about the name being looked up, its
CNAME if any, MX's, and A records - initially only the
CNAME is of interest, the idea is that it's likely that
soon after we will be wanting the MX, and then usually, the
A records - by having them all returned to the local server
on the first lookup the later lookups will be much faster.

This seems like a big win - but you must realise that it
only makes a difference if the data wasn't already in your
local server's cache - ie: only if the destination of this
message isn't one that has been sent to recently (within 12
hours, or a day, or so usually), and so really only makes
any difference to mail to unusual destinations, which isn't
normally a case you would consider worthy of much optimisation.

Consequently, while there are arguments for providing this
facility, there really aren't any for actually using it,
and certainly any vendor who supplies compiled sendmail binaries
with this option set has supplied buggy code, as how can they
possibly know whether there will be a wildcard MX or not?
(I suspect that those vendors simply are incapable of thinking).


The next (more serious, and fortunately, less common) bug,
is that sendmail deliberately disables the resolver default
domain appending in several places where it calls the
nameserver.  This is fine, and what should happen.  Unfortunately
there are modified versions of sendmail around, in which that
particular line of code has been commented out, so default
domains get added in places where they shouldn't (under any
circumstances) be added (it will work in some setups, but not
in others).

To actually fix things, the easiest thing to do is to first
modify your sendmail.cf so host names always have a '.'
appended whenever they go near the resolver (and so you add
the default domains that you want yourself).  To do this
you need to be able to test whether the lookup succeeded,
for which there are various methods which depend on exactly
which version of sendmail you're running (not as between
5.64 and 5.65, or anything like that, but whether yours has
the IDA mods, or my mods, or no mods, ...)

Next, you should get a copy of the sendmail sources, check
that the lines that disable default domains being appended are
intact (grep for XXX, they stand out that way), and then compile
without the NO_WILDCARD_MX definition enabled.

These two together should fix any problems you're having
(actually, either of them should be sufficient, but doing
both is a useful safety measure).

kre

rickert@mp.cs.niu.edu (Neil Rickert) (01/31/91)

In article <6587@munnari.oz.au> kre@cs.mu.OZ.AU (Robert Elz) writes:
>I received the following message from a new Internet site here in
>Australia ...
>
>   From: r.liu@trl.OZ.AU (Richard Liu)
>   Subject: Re: MX records for hosts on oz.au
>
>[NWR: much text deleted]
>   ...
>
>I thought that perhaps my reply may be of some more general interest
>(none of this IMHO stuff here...) so I thought I would send it to
>the newsgroups/mailing lists this article is being sent to...
>
>[NWR: I have deleted most of a lengthy, and generally correct but
>     occasionally misleading commentary which followed.]
>
>The next (more serious, and fortunately, less common) bug,
>is that sendmail deliberately disables the resolver default
>domain appending in several places where it calls the
>nameserver.  This is fine, and what should happen.  Unfortunately
>there are modified versions of sendmail around, in which that
>particular line of code has been commented out, so default
>domains get added in places where they shouldn't (under any
>circumstances) be added (it will work in some setups, but not
>in others).
>
 The real problem is that while sendmail deliberately disables local
domain qualification, it does it in the wrong place.  Or, more precisely,
it reenables domain qualification too late, so that header addresses are
not properly qualified.  This is why the code is often disabled.

 I believe the latest (5.65) IDA versions of sendmail do this correctly.

-- 
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
  Neil W. Rickert, Computer Science               <rickert@cs.niu.edu>
  Northern Illinois Univ.
  DeKalb, IL 60115                                   +1-815-753-6940

jf@ap.co.umist.ac.uk (John Forrest) (02/01/91)

I have to admit this was too long to read through quickly.
However, a couple of points:

My gut reaction is that this is a wildcard MX problem - if you
have a wildcard MX for *.dom1.dom2 and you are in ...dom1.dom2,
then sendmail will often throw the wildcard on during lookup.
To get around this, you either:

1) Get rid of the wildcard. (best in my opinion, but subject to
   admin problems).
2) Get hold of a sendmail binary that does not do domain
   resolution - you have to give it the full name. (less
   useful, but some people prefer the certainty).

This is worth checking first - although it is difficult since
lookups using host(1) for T_ALL or T_MX will return with the
wildcard bit filled in. Still worth trying with the names you
intend though.

Second point. If you want to stop name resolution, the best way
is via the binary - there is a resolver flag to stop it. You
have to be careful putting '.' on the end, since a bad lookup
might leave it there. Having said this, I wish we had a global
convention that '.' on the end indicates a fully resolved
address - having to choose between fully and partially resolved
addresses can be tricky.

John Forrest
Dept of Computation
UMIST