[comp.unix.wizards] trouble with getpwuid

roy@phri.UUCP (Roy Smith) (06/30/87)

	I've been having some strange problems with getpwuid(3) on a
Sun-3.2 system.  Here's a little program I wrote to demonstrate the
problem:

----------------
# include <pwd.h>
# include <stdio.h>

main ()
{
	int i;
	struct passwd *pw;
	extern int errno;

	i = getuid();
	printf ("i = %d, errno = %d\n", i, errno);
	perror ("");

	pw = getpwuid (i);
	printf ("pw = %#x, errno = %d\n", pw, errno);
	perror ("");

	if (pw != NULL)
		printf ("name = %s, dir = %s\n", pw->pw_name, pw->pw_dir);
}
----------------

	I've got Sun-3's running both 3.1 and 3.2.  Alanine is a 3.1
machine; inosine runs 3.2.  I can compile this program on either machine
and run it on the same machine with no problem.  If, however, I compile it
on alanine and run it on inosine (3.2 is supposed to run 3.[01] binaries),
I get:

inosine% pw.alanine
i = 101, errno = 0
Error 0
pw = 0, errno = 49
Can't assign requested address

	It correctly gets my uid, then the getpwuid() fails.  The kicker is
that if I log in to another 3.2 machine and run the 3.1 binary, it works
fine (sort of; see below):

cytosine% pw.alanine
i = 101, errno = 0
Error 0
pw = 0x215ac, errno = 48
Address already in use		<-- huh?
name = roy, dir = /usr/goober/roy

	So, there is obviously something wrong with the way we have inosine
set up which allows 3.2 binaries using getpwuid to work properly, but not
3.1 binaries.  As far as I can tell, however, both inosine and cytosine are
configured the same.

	As long as the requested passwd entry is in the local /etc/passwd
(i.e. you don't need a yellow pages call), it works fine no matter what
machine I run it on; for example, a version of pw.c which has "getpwuid(7)"
hardwired into it (correctly) gives me back the local entry for ingres:

Error 0
pw = 0x215bc, errno = 0
name = ingres, dir = /usr/ingres

	I'm totally stumped.  What could I possibly have set up wrong on
inosine which would cause this behaviour?  Also, why do I get an "address
already in use" indication in errno when the getpwuid() call works?  I know
errno isn't cleared on non-error calls, but I don't understand where this
is coming from anyway.  It doesn't happen on non-yp calls (such as the
ingres version above).

	BTW, I got onto this because Adobe's "enscript" (a 3.0 binary
copied to the 3.2 server) started dumping core on inosine.  Some dbxing
showed that a getpwuid() call was failing, but enscript doesn't bother to
check for errors (grrrrr...) and later gets a segmentation violation
referencing through the null pointer that getpwuid() returns.
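
	For what it's worth, the check that would have avoided the core
dump is short.  This is only a sketch of the general idea, not anything
taken from enscript; uid_to_name is a hypothetical helper that falls back
to the numeric uid when the lookup fails.

```c
#include <assert.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * uid_to_name is a hypothetical helper.  It copies the login name for
 * uid into buf, or the decimal uid if getpwuid() fails, so the caller
 * never follows a NULL passwd pointer.
 * Returns 1 if a real entry was found, 0 if the fallback was used.
 */
int
uid_to_name(uid_t uid, char *buf, size_t len)
{
	struct passwd *pw;

	assert(buf != NULL && len > 0);
	pw = getpwuid(uid);
	if (pw == NULL || pw->pw_name == NULL) {
		/* lookup failed: fall back to the number itself */
		snprintf(buf, len, "%lu", (unsigned long)uid);
		return 0;
	}
	snprintf(buf, len, "%s", pw->pw_name);
	return 1;
}
```

	Whether a numeric fallback is the right answer depends on the
program; the point is only that the NULL return gets handled somewhere.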

	The obvious and practical thing to do is to just recompile
everything under 3.2, but that's not very satisfying.
-- 
Roy Smith, {allegra,cmcl2,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016

roy@phri.UUCP (Roy Smith) (07/11/87)

	About a week ago, I posted a somewhat confused query about problems
we were having with getpwuid() on SunOS-3.2 when running a 3.0 binary,
namely that when handed a valid uid, getpwuid would return NULL.  I now know
what was wrong.

	The key to the puzzle was supplied by Mark Plotnick (allegra!mp) who
reported that on some of his SunOS-3.2 machines, bind(2) would fail,
reproducibly, in certain situations.  I passed this off as not being our
problem until I noticed that talk (straight off the 3.2 distribution tape)
didn't work, failing in a bind(2) call which should have worked.

	I'm not sure of all the details, but the problem was that /etc/hosts
on that machine was totally bogus.  When I put a proper hosts file in place
and rebooted, everything worked.

	What I still can't figure out is why things failed as they did.  YP
was clearly working because lots of things that depend on it worked (ls -l,
for example).  If YP is up, it shouldn't make any difference what's in (or
not in) the local hosts file.  But, in the getpwuid() example, if I used a
uid which was in the local /etc/passwd file, it worked fine, but one in the
YP passwd file failed.  In the case of talk, the bind that was failing gave
the local host's address (not "localhost", but the real local internet
address).  That address wasn't in the local /etc/hosts but was in the YP
/etc/hosts.  On the other hand, lots of applications which involved host
lookups using YP (rlogin, sendmail, rcp, NFS) worked fine.
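
	The bind failure can be checked in isolation.  The sketch below
looks up the machine's own name and tries to bind a UDP socket to the
address that comes back.  It does not explain why 3.2 behaved as it did;
it only exercises the same kind of call.  Both function names are
hypothetical.  On BSD-derived systems bind(2) reports EADDRNOTAVAIL
("Can't assign requested address") when the address does not belong to a
local interface, which matches the message in the first post.

```c
#include <assert.h>
#include <errno.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * try_bind_addr (hypothetical): bind a UDP socket to the given IPv4
 * address, any port.  Returns 0 on success, else the errno value
 * from socket() or bind().
 */
int
try_bind_addr(struct in_addr addr)
{
	struct sockaddr_in sin;
	int s, err = 0;

	s = socket(AF_INET, SOCK_DGRAM, 0);
	if (s < 0)
		return errno;
	memset(&sin, 0, sizeof sin);
	sin.sin_family = AF_INET;
	sin.sin_addr = addr;
	sin.sin_port = 0;		/* let the kernel pick a port */
	if (bind(s, (struct sockaddr *)&sin, sizeof sin) < 0)
		err = errno;
	close(s);
	return err;
}

/*
 * try_bind_own_name (hypothetical): resolve this host's name and bind
 * to the first address returned.  Returns 0 on success, an errno
 * value on failure, or -1 if the name did not resolve to IPv4.
 */
int
try_bind_own_name(void)
{
	char name[256];
	struct hostent *hp;
	struct in_addr addr;

	if (gethostname(name, sizeof name) < 0)
		return errno;
	name[sizeof name - 1] = '\0';
	hp = gethostbyname(name);
	if (hp == NULL || hp->h_addrtype != AF_INET)
		return -1;
	assert((size_t)hp->h_length == sizeof addr);
	memcpy(&addr, hp->h_addr_list[0], sizeof addr);
	return try_bind_addr(addr);
}
```

	If try_bind_own_name() comes back with EADDRNOTAVAIL, the name
service is handing out an address for the local host that the kernel does
not consider its own, which is worth comparing against the hosts data.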

	Well, as Andy Shore said, "I couldn't imagine the getuid or getpwuid
might fail, but in the YP world, you never know!"

Don't adjust the horizontal.  Don't adjust the vertical.  Your
system's not broken, you've simply entered the Yellow Pages!
-- 
Roy Smith, {allegra,cmcl2,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016