[comp.soft-sys.andrew] problem with console, help

bill@allegra.tempo.nj.att.com (Bill Schell) (08/21/90)

Has anyone else had problems like the following under patch level 6?

I just put up andrew from scratch (patch level 6).  All seems to work
fine except console and help.  When console is started, it will pause
for about two minutes only displaying "Initializing default.." in a
blank window. It then will work normally, after displaying

"console:<OpenConsoleSocket> gethostbyname failed  (12:54:00 PM )"

in its text window.

Help does a similar thing.  After I type 'help'  it sits for a few minutes
and then prints out:

	"No 'localhost' found in host table; creating new window."

However, localhost is in my /etc/hosts as:
127.1           localhost loopback loghost

These look like they could be manifestations of the same problem.
Has anyone seen this before? BTW, I'm on a Sun-4 under SunOS 4.0.3.

Thanks,
	Bill Schell
	AT&T Bell Labs, Murray Hill, NJ
	bill@research.att.com

gk5g+@ANDREW.CMU.EDU (Gary Keim) (08/21/90)

I think this is a resolver problem.  From andrew/README:


    RESOLVLIB denotes the full path of the domain name resolver
    library.  It is used only if RESOLVER_ENV is defined, which it
    is unless your system.h or site.h file undefines it.  The
    default value (the empty string) is useful if the resolver code
    is in your libc.a.  If the resolver code is in a separate
    library, such as /usr/lib/libresolv.a, that name should be the
    definition for RESOLVLIB; define it in your site.mcr file.

Did you set RESOLVLIB?

Gary Keim
ATK Group

Craig_Everhart@TRANSARC.COM (08/21/90)

The RESOLVER_ENV and RESOLVLIB settings are known sources of configuration
errors, but I don't think this problem lies there.  That is, Andrew
doesn't provide its own version of gethostbyname(); it uses whatever one
is in the libraries linked to it.  (That being said, there is usually a
gethostbyname() in the resolver libraries, and you may need to set
RESOLVLIB correctly if you're a site that expects the name resolver to
work.)

> "console:<OpenConsoleSocket> gethostbyname failed  (12:54:00 PM )

> Help does a similar thing.  After I type 'help'  it sits for a few minutes
> and then prints out:

> 	"No 'localhost' found in host table; creating new window."

> However, localhost is in my /etc/hosts as:
> 127.1           localhost loopback loghost

The problem could be decomposed to ask:
	[a] what is being passed to gethostbyname()?
	[b] will your gethostbyname() find something reasonable for that argument?
The answer to [b] depends on RESOLVLIB.  Given that you refer to
``/etc/hosts'' in your message, I'll assume that the names you expect to
hand to gethostbyname() are in your /etc/hosts file and that your name
resolver is disabled somehow.

Now, the question [a] remains: what's passed to gethostbyname()?  This
is usually the result of the Andrew GetHostDomainName() call, from
overhead/util/lib/hname.c .  This function will call the system function
gethostname(), but sometimes it will append the result of the system
function getdomainname() to it.  (It appends the name only if there are
no dots in the gethostname() result, i.e. if it's not fully qualified.)
Since you're on a Sun-4, you have the getdomainname() call, and by default
GETDOMAIN_ENV is defined.  If it doesn't make sense to append this
value, you can disable this behavior by defining the AndrewSetup value
``ThisDomainSuffix'' to something appropriate for your system, to turn
the local machine address into a fully-qualified domain name.  If you
really don't want anything to be appended, you can define
``ThisDomainSuffix'' as simply a dot (as in
    ThisDomainSuffix: .
) and nothing will be appended.  Make the target ``hname.test'' in your
overhead/util/lib build directory to test the behavior.

This addresses the probable issues with the Console error messages, but
it doesn't really get at the complaint from Help, that presumably really
does pass "localhost" to gethostbyname().  Some remarks:
[1] the line you quote from your /etc/hosts looks odd to me, since it
names 127.1 rather than 127.0.0.1, but maybe this works fine and I'm
just backward.
[2] your gethostbyname() may in fact be consulting the domain system
rather than /etc/hosts, so the ``...found in host table'' message may be
more metaphorical than literal, and the name ``localhost'' may not
resolve correctly via your local resolvers.
[3] are all those spaces after the ``127.1'' text OK, rather than tabs?
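
Points [1] and [3] can be checked against the classic inet_aton()
rules, under which an address written as a.b treats b as the low 24
bits, so 127.1 and 127.0.0.1 denote the same address.  The sketch below
is Python, and parse_hosts_line is a hypothetical helper, not anything
from Andrew or SunOS.  Whether the hosts-file reader on a particular
system follows exactly these rules is an assumption.

```python
import socket


def parse_hosts_line(line):
    """Split one /etc/hosts-style line into (dotted_quad, [names]).

    Returns None for blank or comment-only lines.  Fields may be
    separated by any run of spaces or tabs.
    """
    line = line.split("#", 1)[0]      # drop a trailing comment
    fields = line.split()             # any whitespace separates fields
    if len(fields) < 2:
        return None
    # inet_aton accepts short forms such as "127.1"; inet_ntoa prints
    # the address back in full four-part notation.
    packed = socket.inet_aton(fields[0])
    return socket.inet_ntoa(packed), fields[1:]
```

Fed the line quoted above, parse_hosts_line returns
("127.0.0.1", ["localhost", "loopback", "loghost"]), and it gives the
same answer when the spaces are replaced by a tab.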

Hm.  After looking at the atk/help/src/helpa.c source file,
``localhost'' isn't passed to gethostbyname() after all.  Instead, a
``wmHost'' variable is assigned the value of the environment variable
``WMHOST'' (should there be any), or the value returned from the
straight gethostname() call (no getdomainname() stuff).  If the value of
the ``wmHost'' variable can't successfully be passed to gethostbyname(),
help prints that error message.  There are several levels of bogosity
here: the error message is misleading, and the whole ``WMHOST'' protocol
should have been replaced by something more generic when the support for
X11 (and ``DISPLAY'') went in.  But at least you should be able to use
this information to make help work.
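
As a rough illustration of that logic, here is a Python sketch (helpa.c
itself is C, and this is not its code).  The names pick_wm_host and
wm_host_resolves are hypothetical, and treating an empty WMHOST the
same as an unset one is an assumption.

```python
import os
import socket


def pick_wm_host(environ=None, gethostname=socket.gethostname):
    """Choose the host name the way the description above suggests:
    WMHOST from the environment if present, else plain gethostname()."""
    if environ is None:
        environ = os.environ
    # Assumption: an empty WMHOST counts as "not set".
    return environ.get("WMHOST") or gethostname()


def wm_host_resolves(name, resolver=socket.gethostbyname):
    """Return True if the resolver can find the name, False otherwise."""
    try:
        resolver(name)
    except OSError:
        # socket.gaierror and socket.herror are both OSError subclasses.
        return False
    return True
```

If wm_host_resolves(pick_wm_host()) comes back False on the affected
machine, setting WMHOST to a name that gethostbyname() does accept
should be enough to avoid the complaint from help.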

		Craig

jlevine@oracle.com (Jonathan Levine) (08/30/90)

In article <Added.Eao9gGW00Udf4Pe09g@andrew.cmu.edu> bill@allegra.tempo.nj.att.com (Bill Schell) writes:

>Has anyone else had problems like the following under patch level 6?

>I just put up andrew from scratch (patch level 6).  All seems to work
>fine except console and help.  When console is started, it will pause
>for about two minutes only displaying "Initializing default.." in a
>blank window. It then will work normally, after displaying
>
>"console:<OpenConsoleSocket> gethostbyname failed  (12:54:00 PM )
>
>in its text window.

Because I have the problem below, I tried using console for the first time
today, and found that it core dumps (bus error) for me.  dbx was of little
help in determining where the problem lies.

>Help does a similar thing.  After I type 'help'  it sits for a few minutes
>and then prints out:
>
>	"No 'localhost' found in host table; creating new window."
>
>However, localhost is in my /etc/hosts as:
>127.1           localhost loopback loghost

I have had this problem since I applied earlier andrew patches
(unfortunately, I applied patches 1-5 in a lump, so I don't know which  
patch actually caused the problem).  As far as I know, this problem does
not occur in the unpatched (X11R4) distribution.  BTW, my localhost is
127.0.0.1, in both my /etc/hosts and under yp.

Jon



--
-----------------------------------------------------------------
From the Oracle*Desk of:		"Paradise is exactly like
Jonathan Levine 			 where you are right now,
Oracle*Mail Development			 only much, much better."
4248 1 Ann B. Davis Drive			-- Laurie Anderson

gk5g+@ANDREW.CMU.EDU (Gary Keim) (08/31/90)

Excerpts from misc: 30-Aug-90 Re: problem with console, h.. Jonathan
Levine@uunet.uu (1588)

> >Help does a similar thing.  After I type 'help'  it sits for a few minutes
> >and then prints out:
> >
> >	"No 'localhost' found in host table; creating new window."
> >
> >However, localhost is in my /etc/hosts as:
> >127.1           localhost loopback loghost


I built Andrew (patch6) on a Sun4_40 machine here at CMU.  I had
RESOLVER_ENV defined and RESOLVLIB set to /usr/lib/libresolv.a.  The
machine had an /etc/hosts that was extensive and included an entry for a
remote named(8) server.  I was experiencing the problems described
previously.  The machine did NOT have an /etc/resolv.conf file.  I put
one on the machine and everything was golden.  That file lists the
Internet addresses (in dot notation) of the named(8) servers to query.

(fallscreek)gk5g> cat /etc/resolv.conf
nameserver	128.2.35.50
nameserver	128.2.84.1
nameserver	128.2.13.21
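
For anyone checking their own file, a small Python sketch that pulls
the nameserver entries out of resolv.conf text.  The function name
nameservers is hypothetical, and the parsing is deliberately simple: it
only recognizes ``nameserver'' lines and skips lines starting with '#'
or ';'.  It does not validate the addresses.

```python
def nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines, in file order."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue                   # blank line or comment
        fields = line.split()          # spaces or tabs both work
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers
```

Reading the file with open("/etc/resolv.conf").read() and passing the
text in gives the list of servers the resolver would be told to query.
An empty list, or a missing file, matches the failing setup described
here.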

I don't know if this file is absolutely necessary in the default case. 
Documentation for resolver(5) says:

    DESCRIPTION 
    The resolver configuration file contains information that is
    read by the resolver routines the first time they are invoked by
    a process. The file is designed to be human readable and
    contains a list of name-value pairs that provide various types
    of resolver information. 

    On a normally configured system this file should not be
    necessary. The only name server to be queried will be on the
    local machine and the domain name is retrieved from the system. 

Anyone have additional knowledge on this subject?   The failure that has
been reported is with gethostbyname(3n):

    Gethostbyname and gethostbyaddr each return a pointer to an
    object with the following structure. This structure contains
    either the information obtained from the name server, named(8),
    or broken-out fields from a line in /etc/hosts. If the local
    name server is not running these routines do a lookup in
    /etc/hosts. 

My (admittedly limited) understanding:

First the resolver (via gethostbyname(3n)) tries a remote name server,
then one running on the local machine if it exists, and finally it'll
look in /etc/hosts.  If you have a name server on your local machine you
don't need /etc/resolv.conf.  If you have a remote name server you need
/etc/resolv.conf.  And finally, if you aren't using the resolver you
only need /etc/hosts.  It would seem to me that the resolver should
consult /etc/hosts if it can't access the name server.  It doesn't seem
to be doing that.
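
One way to see which names fail, and how long each attempt blocks
before giving up, is to time the lookups directly.  The following is a
Python sketch, not part of Andrew; probe is a hypothetical helper, and
the resolver and clock parameters exist only so it can be exercised
without a network.

```python
import socket
import time


def probe(names, resolver=socket.gethostbyname, clock=time.monotonic):
    """Try each name; return a list of (name, address_or_None, seconds)."""
    results = []
    for name in names:
        start = clock()
        try:
            address = resolver(name)
        except OSError:
            # Lookup failed (covers socket.gaierror and socket.herror).
            address = None
        elapsed = clock() - start
        results.append((name, address, elapsed))
    return results
```

Calling probe(["localhost", socket.gethostname()]) on an affected
machine should show whether the long pause comes from the lookup
itself, and for which name.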

Will those of you who are experiencing this problem please mail me the
answers to these questions:

1) machine and configuration.
2) RESOLVER_ENV defined?
3) RESOLVLIB defined or using libc?
4) Running a local name server?
5) Using a remote name server?

Thanks,
Gary Keim
ATK Group