[comp.sys.sun] FTP problem on 3/60's

dav@hplabs.hp.com (David L. Markowitz) (12/03/88)

I have an unusual problem on a small, isolated Sun network.  This network
has a Sun 3/180, a few /50's and /60's, and a VAX w/ Excelan TCP/IP.  The
Suns are running SunOS 3.4.  This setup has been working fine for months,
but a recent system disk crash and rebuild seem to have fouled some things
up.

The symptom is that when FTP'ing FROM any 3/60 TO anywhere (including
itself), FTP is unable to bind any sockets.  It fails with a "Can't assign
requested address" error.  FTP thus cannot do an "ls", "send" or "get", or
any other operation requiring data transfer.  "cd" succeeds.

All the appropriate entries are in the services and protocols YP maps (I
believe).  Any suggestions?


	David L. Markowitz		Rockwell International
	...!sun!sunkist!arcturus!dav	dav@arcturus.UUCP

white@cs.unc.edu (Brian T. White) (12/13/88)

We had this problem when a ypclient machine did not have its own host name
in its own version of /etc/hosts.  This also disabled the "talk" program.
To fix, add the host name to /etc/hosts and reboot.  Apparently, something
in the kernel looks at /etc/hosts at boot time.  The message in /etc/hosts
that "If the yellow pages is running, this file is only consulted when
booting" is true;  it just doesn't mention that a machine consults its own
/etc/hosts at boot time to get its own host name for the purposes of ftp
and talk.  Without the /etc/hosts entry, booting apparently proceeds
normally;  you just can't use talk or ftp.

[[ I assure you, nothing in the kernel looks in "/etc/hosts".  It would be
absurd to hardwire a file name into the kernel like that!  What is almost
certainly true is that the FTP DAEMON (ftpd) wants to find its own
hostname at initialization time and is not able to do so because the local
host's entry is not in the host table.  --wnl ]]

dav@hplabs.hp.com (David L. Markowitz) (12/14/88)

felix!arcturus!dav@hplabs.hp.com (David L. Markowitz) (me) wrote:
> I have an unusual problem on a small, isolated Sun network.  ...
> ... The symptom is that when FTP'ing FROM any 3/60 TO anywhere (including
> itself), FTP is unable to bind any sockets.  It fails with a "Can't assign
> requested address" error.  FTP thus cannot do an "ls", "send" or "get", or
> any other operation requiring data transfer.  "cd" succeeds.

Many people wrote to remind me to have a full copy of /etc/hosts on these
machines while booting them.  In particular, each machine's own hostname
must be listed (or rc.boot must use the internet address instead of a
symbolic name).  This did indeed solve the problem!  Thanks to those who
wrote (you know who you are).
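
The check described above (is this machine's own name in its local hosts
file?) can be sketched in shell.  This is only an illustration: the function
name "host_in_file" is made up for this example, and it assumes a POSIX-style
sh and awk, which may be newer than what shipped with SunOS 3.x.

```shell
# host_in_file FILE NAME   (hypothetical helper, not a SunOS utility)
# Exits 0 if NAME appears as a hostname or alias in the hosts-format
# FILE, and 1 otherwise.  Field 1 of each line is the address, so the
# search starts at field 2.
host_in_file() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                  # strip trailing comments
        { for (i = 2; i <= NF; i++)
              if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Possible use before rebooting a YP client:
#   host_in_file /etc/hosts "`hostname`" || echo "own name missing from /etc/hosts"
```

The helper only confirms that the name is present; it does not check that
the address listed for it is the right one.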

	David L. Markowitz		Rockwell International
	...!sun!sunkist!arcturus!dav	dav@arcturus.UUCP
	The above opinions are merely that, and only mine.

jdh@bu-it.bu.edu (Jason Heirtzler) (12/16/88)

felix!arcturus!dav@hplabs.hp.com (David L. Markowitz) says:
>The symptom is that when FTP'ing FROM any 3/60 TO anywhere (including
>itself), FTP is unable to bind any sockets.  It fails with a "Can't assign
>requested address" error.  FTP thus cannot do an "ls", "send" or "get", or
>any other operation requiring data transfer.  "cd" succeeds.

This has happened often enough to me that I ought to mention it.

If your client's *own* /etc/hosts has an incorrect address for its yp
server, things break with "bind: can't assign requested address," which is
confusing, since in all other respects the yp seems okay.

The first time it happened to me (okay, it happened several times...) I
was trying to use vt100tool.  This was under SunOS 3.x, but it probably
applies to 4.x as well.

The next question is why it even works in the first place, under these
circumstances. The mind boggles.

---Jason Heirtzler
   Boston University
   jdh@bu-it.bu.edu

trinkle@purdue.edu (12/21/88)

> ...What is almost
> certainly true is that the FTP DAEMON (ftpd) wants to find its own
> hostname at initialization time and is not able to do so because the local
> host's entry is not in the host table.  --wnl ]]

(mild) Shame on you.  The problem is with ifconfig.  It looks in
/etc/hosts on the "local" root (how can it look anywhere else when the
network is not "running"?).  Because stderr is not directed to the console
in the rc script, you never see the error.  Ifconfig seems to show the right
stuff for the interface, but it is not completely initialized.  By the
time ftpd starts, YP is running.  This was discussed about a year ago I
think.  I know because we were on the receiving end of the info.
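
Since the failure goes unnoticed only because stderr is lost, one way to
make it visible is to wrap the command so that its error output, plus a
warning on a non-zero exit, goes somewhere a person will see it.  The sketch
below is an assumption-laden illustration, not what the stock rc scripts do:
"run_or_warn" is a made-up name, a POSIX-style sh is assumed, and whether
ifconfig actually exits non-zero when the name lookup fails is not confirmed
anywhere in this thread.

```shell
# run_or_warn LOG COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND with its stderr appended to LOG.  If COMMAND exits
# non-zero, a warning line is appended to LOG as well.  The exit
# status of COMMAND is passed back to the caller.
run_or_warn() {
    _log=$1
    shift
    _status=0
    "$@" 2>> "$_log" || _status=$?
    if [ "$_status" -ne 0 ]; then
        echo "WARNING: '$*' exited with status $_status" >> "$_log"
    fi
    return "$_status"
}

# In an rc script the log target would most likely be the console, e.g.:
#   run_or_warn /dev/console /etc/ifconfig ec0 "`hostname`" -trailers up
```

Even if the command reports success, anything it writes to stderr still
ends up in the log target rather than being dropped.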

[[ Righto!  Sorry for being misleading there.  The fix is the same.
Thanks to everyone else who sent in similar (correct) information.
--wnl ]]

Daniel Trinkle			trinkle@cs.purdue.edu			ARPA
Department of Computer Sciences	trinkle%purdue.edu@relay.cs.net		CSNET
Purdue University		{ucbvax,decvax}!purdue!trinkle		UUCP
West Lafayette, IN 47907	(317) 494-7844				PHONE

dana@dino.bellcore.com (Dana A. Chee) (12/21/88)

white@cs.unc.edu (Brian T. White) writes:

> We had this problem when a ypclient machine did not have its own host name
> in its own version of /etc/hosts....

For those interested, what really happens is that ifconfig 'fails'.  When
ifconfig is called to set up the interface (in /etc/rc.boot), one of the
things it does is to try to set the internet address for the interface.
But this fails since the address can't be found for the host name.  The
effect is that the kernel has no address recorded for the given interface,
so it fails the 'bind()'.

The method we use to get around this is to do the ifconfig again in
/etc/rc.local after the yp stuff has been started (after the mount -vat).

Below is an example:

# re-do the ifconfig to put the hostname in
/etc/ifconfig ec0 `/bin/hostname` -trailers up > /dev/null
--
Dana Chee				(201) 829-4488
Bellcore
Room 2Q-250
445 South Street			ARPA: dana@bellcore.com
Morristown,  NJ  07960-1910		UUCP: {gateways}!bellcore!dana

slevy@nic.mr.net (Stuart Levy) (12/30/88)

> We had this problem when a ypclient machine did not have its own host name
> in its own version of /etc/hosts.  This also disabled the "talk" program.
> To fix, add the host name to /etc/hosts and reboot.  Apparently, something
> in the kernel looks at /etc/hosts at boot time.

What's happening here is that the reverse-ARP code, which diskless Suns
use to find their addresses at boot time, is sloppy.  The address is
stored in a 16-byte struct sockaddr_in, including a 4-byte internet
address and 8 bytes of zero padding.  The RARP code fills in the IP
address but doesn't zero the 8 pad bytes.  This works for many things, but
if you try to bind() a socket to a specific IP address as ftpd and talkd
do, it fails unless those pad bytes are really zero.

They normally get cleared by the "ifconfig ??0 $hostname" in /etc/rc.boot.
But if the local /etc/hosts doesn't list whatever you have for hostname --
or if it does list it but at an IP address *different* from the one picked
up by the RARP code -- the ifconfig fails, and ftpd etc. report strange
errors.
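
The second failure case above (the name is listed, but at a different
address) can be checked with a small lookup and compare.  This is a sketch
under assumptions: "hosts_addr" and "addr_matches" are names invented for
this example, a POSIX-style sh and awk are assumed, and the address to
compare against (the one the machine actually received at boot) has to be
supplied by hand.

```shell
# hosts_addr FILE NAME   (hypothetical helper)
# Prints the address of the first line in the hosts-format FILE that
# lists NAME as a hostname or alias.  Prints nothing if NAME is absent.
hosts_addr() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                  # strip trailing comments
        { for (i = 2; i <= NF; i++)
              if ($i == name) { print $1; exit } }
    ' "$1"
}

# addr_matches FILE NAME ADDR   (hypothetical helper)
# Exits 0 only if FILE lists NAME at exactly ADDR (string comparison).
addr_matches() {
    [ "$(hosts_addr "$1" "$2")" = "$3" ]
}

# Possible use, with the boot-time address filled in by hand:
#   addr_matches /etc/hosts "`hostname`" 192.9.200.7 || echo "hosts entry disagrees"
```

The comparison is textual, so the same address written in two different
forms would be reported as a mismatch.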

	Stuart Levy