[comp.bugs.4bsd] Serious problem in SunOS 3.3 subnetting w/r/t booting Sun-2 clients

earle@smeagol.UUCP (03/14/87)

I have just installed SunOS 3.3 (with subnetting support) on a net of
Sun-2's and Sun-3's; some diskless, most diskful.  I have encountered
a serious problem ...

I am on a Class B network, 128.149.0.0, on subnet 10, with host numbers 1-XX.
My main gateway, smeagol, is at 128.149.10.1.  Because of this setup, I
assumed that I wanted to use the subnet mask 255.255.255.0, and so I changed
all the invocations of ifconfig in /etc/rc.boot to add the 
`netmask 255.255.255.0' parameter.  So far so good.
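
To make the arithmetic behind that choice concrete, here is a minimal sketch (in Python, purely illustrative and unrelated to any SunOS code) that splits smeagol's address into network and host parts under the subnet mask and under the Class B default. The helper name split_address is an assumption made for this example.

```python
import ipaddress

def split_address(addr: str, mask: str):
    """Return (network, host_part) for a dotted-quad address and netmask."""
    a = int(ipaddress.IPv4Address(addr))
    m = int(ipaddress.IPv4Address(mask))
    network = ipaddress.IPv4Address(a & m)   # bits covered by the mask
    host = a & ~m & 0xFFFFFFFF               # bits left over for the host
    return str(network), host

# smeagol under the subnet mask, then under the Class B default mask
print(split_address("128.149.10.1", "255.255.255.0"))  # ('128.149.10.0', 1)
print(split_address("128.149.10.1", "255.255.0.0"))    # ('128.149.0.0', 2561)
```

With the 255.255.255.0 mask the third octet (10) is part of the network number; with the default mask it is folded into the host number instead.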

Now, when a diskless client boots, after the message
`using nn buffers containing <humongous amount> bytes of main memory'
comes out, and before /etc/rc.boot is invoked (and, in the normal case, fsck
begins), the new version of ip_icmp puts out the message
	Setting subnet mask to 0x(nnnnnnnn) 
where the `nnnnnnnn' seems to be matched to the current mask used by the
disk server for the network interface (in my case, 0xffffff00).  On a
Sun-3 client, this message comes out, rc.boot is invoked, and everything
proceeds normally with no problem (as an aside, a diskful Sun-2 running
SunOS 3.3 has no trouble, either, but then again the `Setting subnet mask'
message does not appear).  However, with the 255.255.255.0 subnet mask
on the server, attempts to boot a Sun-2 diskless client *fail* after the
`Setting subnet mask to 0xffffff00' message, with this resulting error message:
	nd: output error 51
Assuming this is from errno, we have nd claiming
>	51  ENETUNREACH  Network is unreachable
>	     A socket operation was attempted to an unreachable network.

All I have been able to determine so far is:
(1) If one sets the mask to 255.255.255.254 on the server (obviously
wrong, but an experiment), then the *server* begins emitting these same
`nd: output error 51' messages when the client begins its boot attempt.

(2) The ONLY way I have seen to enable the Sun-2 clients to boot is to
manually reset the netmask to `255.255.0.0' (i.e., the Class B default)
for the particular interface (in my case, ie0).  Then they will boot
properly; the Sun-3 clients don't give a fsck what the mask is, they'll
boot either way.  Now, the problem is that this is *wrong*, I want the
subnet mask to always be 255.255.255.0 (Indeed, on the Sun-2 clients I
invoke `/etc/ifconfig ie0 netmask 255.255.255.0 -trailers up' inside
rc.boot, and there is nary a peep - it only matters during that initial
boot phase), but if the Sun-2 clients should glitch and crash at random
they will be hung out to dry forever if I do that!  So ...
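
As a sanity check on the three masks tried above, the sketch below (Python, illustrative only) tests whether the server and a client fall on the same network under each mask. The client address 128.149.10.2 is made up for the example, and the helper names are assumptions. The hex column is the form the kernel prints in the `Setting subnet mask' message. This only shows the address arithmetic; it does not explain why the Sun-2 boot path behaves differently from the Sun-3.

```python
import ipaddress

def mask_hex(mask: str) -> str:
    """Dotted-quad netmask as the 32-bit hex form the kernel prints."""
    return "0x%08x" % int(ipaddress.IPv4Address(mask))

def same_network(a: str, b: str, mask: str) -> bool:
    """True if both addresses have the same network part under the mask."""
    m = int(ipaddress.IPv4Address(mask))
    return (int(ipaddress.IPv4Address(a)) & m) == (int(ipaddress.IPv4Address(b)) & m)

SERVER = "128.149.10.1"   # smeagol, from the post
CLIENT = "128.149.10.2"   # hypothetical diskless client on subnet 10

for mask in ("255.255.0.0", "255.255.255.0", "255.255.255.254"):
    print(mask, mask_hex(mask), same_network(SERVER, CLIENT, mask))
# 255.255.0.0 0xffff0000 True
# 255.255.255.0 0xffffff00 True
# 255.255.255.254 0xfffffffe False
```

Under both sensible masks the two hosts share a network, so the mask value alone should not make the server unreachable. Under 255.255.255.254 they do not, which is at least consistent with the server reporting ENETUNREACH in experiment (1).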

- Hasn't anyone else seen this problem?  Unless I've grossly overlooked
something, this should have been caught before any 3.3 tapes were ever
produced!

- Why would (all things being equal) things be different between a Sun-2
and a Sun-3?

- Is there any way around this problem, so I can leave my netmask the way
I want it?

- Is there anything missing in the equation?  Anything that should be done
on the server that isn't obvious???   Arggghhh ...

Anxiously,
-- 
	Greg Earle	UUCP: sdcrdcf!smeagol!earle; attmail!earle
	JPL		ARPA: elroy!smeagol!earle@csvax.caltech.edu
AT&T: +1 818 354 4034	      earle@jplpub1.jpl.nasa.gov (For the daring)
- if it GLISTENS, gobble it!!

mkhaw@teknowledge-vaxc.ARPA (Michael Khaw) (03/19/87)

In article <975@smeagol.JPL.NASA.GOV> earle@smeagol.JPL.NASA.GOV (Greg Earle) writes:
>I have just installed SunOS 3.3 (with subnetting support) on a net of
>Sun-2's and Sun-3's; some diskless, most diskful.  I have encountered
>a serious problem ...
>
>I am on a Class B network, 128.149.0.0, on subnet 10, with host numbers 1-XX.
>My main gateway, smeagol, is at 128.149.10.1.  Because of this setup, I
[ goes on to describe problems booting diskless sun-2's from a sun-3
server ]

Since I've been unable to reach smeagol.jpl.nasa.gov via internet
mail, I'm posting my request here:

My site has a similar setup:  mostly sun-3s, plus one diskless sun-2/50.
We are currently at OS 3.2 but will upgrade to 3.4 when it comes out (RSN).
Since we also have a class B internet, I suspect we will face the same
problem and are interested in solutions.

Thanks,
Mike Khaw
-- 
internet:	mkhaw@teknowledge-vaxc.arpa
usenet:		{hplabs|sun|ucbvax|decwrl|sri-unix}!mkhaw%teknowledge-vaxc.arpa
voice:		415/424-0500
USnail:		Teknowledge, Inc., 1850 Embarcadero Rd., POB 10119, Palo Alto, CA 94303