[comp.mail.sendmail] domain query subroutine res_search

markw@airgun.wg.waii.com (Mark Whetzel) (04/16/91)

I am working with another programmer on porting the IDA sendmail
to the IBM RT running AIX 2.2.1.  (Yes, it's yucky IBM, but it works and
it's paid for :-)

So far so good on making it work, but we may have found a bug in AIX
at the latest maintenance level: code that works on one RT (2705 level)
won't work on another RT (1773 level) at a higher maintenance level.

The problem area is the code that handles domain server queries in
domain.c; in particular, it uses the res_search system subroutine.  I
can't find any documentation for this routine.  I also checked AIX V3
(RS6000), SunOS (4.0.1), and ConvexOS, and none of them document it
either.  I can find res_init, res_mkquery, and res_send, but not
res_search.  What is happening is that res_search, when called to look
up MX records, returns -1 with h_errno set to TRY_AGAIN (value 2)
rather than NO_DATA (value 4).  NO_DATA indicates that the host record
is valid, but that no records of the requested type could be found.

The nameserver is reachable, and a piece of test code that queries
with type = T_A works properly and returns a valid query record, but
queries of type T_MX fail with this TRY_AGAIN error.  This causes
sendmail to defer the mail while it waits for a positive response from
the nameserver.  We currently do not have many MX records in our
nameserver, and the system name being queried does not have any MX
records on file.
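
For anyone who wants to reproduce the comparison, a test along the
following lines should do.  This is only a sketch and not the actual
test code: the names herr_name and probe are made up for illustration,
and it assumes a resolver library that provides res_search and an
<arpa/nameser.h> that defines PACKETSZ, C_IN, T_A and T_MX.

```c
/* Assumption: _DEFAULT_SOURCE is only needed on glibc in strict -std= modes. */
#define _DEFAULT_SOURCE 1
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>
#include <netdb.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: map an h_errno value to its symbolic name. */
const char *herr_name(int herr)
{
    switch (herr) {
    case HOST_NOT_FOUND: return "HOST_NOT_FOUND";
    case TRY_AGAIN:      return "TRY_AGAIN";
    case NO_RECOVERY:    return "NO_RECOVERY";
    case NO_DATA:        return "NO_DATA";
    default:             return "unknown";
    }
}

/*
 * Hypothetical helper: issue one query of the given type (T_A, T_MX)
 * for host and print what came back.  Returns the answer length, or
 * -1 if res_search failed.
 */
int probe(const char *host, int type)
{
    unsigned char answer[PACKETSZ];
    int n;

    errno = 0;
    h_errno = 0;
    n = res_search(host, C_IN, type, answer, sizeof(answer));
    if (n < 0)
        printf("%s type %d: failed (errno=%d, h_errno=%d %s)\n",
               host, type, errno, h_errno, herr_name(h_errno));
    else
        printf("%s type %d: answer of %d bytes\n", host, type, n);
    return n;
}
```

Calling probe(host, T_A) and then probe(host, T_MX) for the same host
puts the two results side by side.  On some systems the program has to
be linked with -lresolv.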

Here is the code fragment from domain.c from the IDA sendmail:
        [some code deleted]
typedef union {
        HEADER qb1;
        char qb2[PACKETSZ];
} querybuf;
extern int h_errno;
querybuf answer;
	[some code deleted]
        errno = 0;
        n = res_search(host, C_IN, T_MX, (char *)&answer, sizeof(answer));
        if (n < 0)
        {
                if (tTd(8, 1))
                        printf("getmxrr: res_search failed (errno=%d, h_errno=%d)\n",
                                errno, h_errno);
                switch (h_errno)
                {
# ifndef NO_DATA
#  define NO_DATA       NO_ADDRESS
# endif /* NO_DATA */
                  case NO_DATA:
                  case NO_RECOVERY:
                        /* no MX data on this host */
                        goto punt;

                  case HOST_NOT_FOUND:
                        /* the host just doesn't exist */
                        *rcode = EX_NOHOST;
                        break;

                  case TRY_AGAIN:
                        /* couldn't connect to the name server */
                        if (!UseNameServer && errno == ECONNREFUSED)
                                goto punt;

                        /* it might come up later; better queue it up */
                        *rcode = EX_TEMPFAIL;
                        break;
                }
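
To make it plain why a TRY_AGAIN here ends in deferred mail, the
decision in that switch can be pulled out into a small standalone
function.  This is a sketch for illustration only and is not code from
IDA sendmail: the enum, the function name classify_mx_failure and its
parameters are all made up, and what the punt label actually does is
not shown in the fragment above.

```c
/* Assumption: _DEFAULT_SOURCE is only needed on glibc in strict -std= modes. */
#define _DEFAULT_SOURCE 1
#include <netdb.h>
#include <errno.h>

/* Hypothetical outcome codes for this sketch; these are not sendmail names. */
enum mx_action {
    MX_PUNT,      /* corresponds to "goto punt" in the fragment */
    MX_NOHOST,    /* corresponds to *rcode = EX_NOHOST */
    MX_TEMPFAIL,  /* corresponds to *rcode = EX_TEMPFAIL (mail is queued) */
    MX_UNKNOWN    /* h_errno value the fragment does not handle */
};

/*
 * Mirror the switch on h_errno: herr is the h_errno value, err the
 * errno value, and use_name_server stands in for UseNameServer.
 */
enum mx_action classify_mx_failure(int herr, int err, int use_name_server)
{
    switch (herr) {
    case NO_DATA:
    case NO_RECOVERY:
        /* no MX data on this host */
        return MX_PUNT;
    case HOST_NOT_FOUND:
        /* the host just doesn't exist */
        return MX_NOHOST;
    case TRY_AGAIN:
        /* couldn't connect to the name server */
        if (!use_name_server && err == ECONNREFUSED)
            return MX_PUNT;
        /* it might come up later; better queue it up */
        return MX_TEMPFAIL;
    default:
        return MX_UNKNOWN;
    }
}
```

With h_errno=4 (NO_DATA) this gives MX_PUNT, while h_errno=2 (TRY_AGAIN)
gives MX_TEMPFAIL unless the connection was refused and the name server
is not required, which is the difference between the two RT systems.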

Any pointers on what may be wrong?  Where is this routine discussed?  Is
the documentation on all of these systems lacking?  I am going to report
this to IBM, but with an undocumented routine that may be tricky.  As I
indicated, everything works OK on a different system.

PS. I have checked that /etc/resolv.conf has the proper contents: it is
identical to the file on other systems at our site, and other hostname
lookups (telnet, rlogin, host, etc.) work correctly.
I have tested this on the RS6000 and also get h_errno=4 there, just like
on the 2705 level RT.

Thanks for any light you can shed on this funny routine and its origins.
Markw
-- 
Mark Whetzel     My comments are my own, not my company's.
Western Geophysical - A division of Western Atlas International,
A Litton/Dresser Company           DOMAIN addr: markw@airgun.wg.waii.com
				   UUNET address:  uunet!airgun!markw

jackv@turnkey.tcc.com (Jack F. Vogel) (04/17/91)

In article <934@airgun.wg.waii.com> markw@airgun.wg.waii.com (Mark Whetzel) writes:
>I am working with another programmer on porting the IDA sendmail
>to the IBM RT running AIX 2.2.1.  (Yes, it's yucky IBM, but it works and
>it's paid for :-)
 
>The problem area is the code that handles domain server queries in
>domain.c; in particular, it uses the res_search system subroutine.  I
>can't find any documentation for this routine.  I also checked AIX V3
>(RS6000), SunOS (4.0.1), and ConvexOS, and none of them document it
>either.

Hmmm, this is interesting. I didn't think AIX on the RT or RS6000 even had
the res_search() code. res_search() was introduced in BIND 4.8, if memory
serves, and the code shipped on the RT or 6000 is much earlier than
that. By the way, the simplest way to get the information you want
is to get the BIND distribution; it's available from Berkeley or most any
of the other Internet archive sites. AIX 1.2 on the PS/2 or 370 has this
code, since I ported it, but because that was done in the service stream it
is not in the documentation. This will be remedied in the AIX 1.2.1 docs. Is
it possible that someone has hacked your system and added the later version
of BIND (and, from the sound of it, got it wrong :-})?? You might try running
sendmail with debug on and taking a look at the info from the nameserver (that
assumes, of course, that your resolver was built with DEBUG defined).
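
If it is easier to do this from a small test program than from sendmail,
the resolver's debug option can be switched on directly. The following
is only a sketch: the function name enable_resolver_debug is made up,
and it assumes a resolver that exposes the _res structure and the
RES_DEBUG option bit in <resolv.h>. Whether anything is actually
printed still depends on how the resolver library was built.

```c
/* Assumption: _DEFAULT_SOURCE is only needed on glibc in strict -std= modes. */
#define _DEFAULT_SOURCE 1
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>

/*
 * Hypothetical helper: initialize the resolver state and set the
 * RES_DEBUG option bit so that later queries can emit debug output.
 * Returns 0 on success, -1 if res_init() fails.
 */
int enable_resolver_debug(void)
{
    if (res_init() < 0)
        return -1;
    _res.options |= RES_DEBUG;
    return 0;
}
```

Call it once before the first res_search() or res_send() in the test
program.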

Disclaimer: I don't speak for the company.




-- 
Jack F. Vogel			jackv@locus.com
AIX370 Technical Support	       - or -
Locus Computing Corp.		jackv@turnkey.TCC.COM