[comp.protocols.tcp-ip] Broadcast RPC question

barnett@vdsvax.steinmetz.ge.com (Bruce G. Barnett) (10/12/88)

We have a problem on our network with diskless Suns running SunOS 4.0.
At times they are unable to boot, due to some condition on our
network.

I have been working with Sun on this for several months and still do not have
a solution OR workaround.

Our environment:
	Company wide network:	3.0.0.0	(ge-net)
	Local Subnet:		3.1.4.0	(ge-crd-net)
		Netmask:	255.255.252.0
		Broadcast:	3.1.4.0
	About 800 hosts in the local subnet
	Type of systems on our network: You name it. We got it.

When a 4.0 diskless client boots up, it does a broadcast RPC, using UDP,
to the 3.0.0.0 network.
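
For reference, the addressing above can be worked out with Python's standard `ipaddress` module. This is only a sketch: the variable names are illustrative, and reading 3.0.0.0 as the old all-zeros broadcast for the whole class A network (rather than for the local subnet) is an assumption about what the client intends.

```python
import ipaddress

# Local subnet and netmask as listed in the environment description.
subnet = ipaddress.ip_network("3.1.4.0/255.255.252.0")
# Class A network 3 (ge-net); the /8 prefix is the classful default.
ge_net = ipaddress.ip_network("3.0.0.0/8")
# Destination the diskless client uses for its broadcast RPC.
dest = ipaddress.ip_address("3.0.0.0")

print("prefix length:      ", subnet.prefixlen)
print("addresses in subnet:", subnet.num_addresses)
print("all-zeros broadcast:", subnet.network_address)    # the form configured here
print("all-ones broadcast: ", subnet.broadcast_address)
print("dest inside subnet: ", dest in subnet)
print("dest is net 3 zero: ", dest == ge_net.network_address)
```

The destination falls outside the 3.1.4.0 subnet entirely; it matches only the all-zeros address of network 3 as a whole.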

Since Sun hasn't fixed the problem yet, I am trying (in the meantime) to get
the other vendors to fix their software. (Desperate times indeed!)

However, one DEC representative has told me that Sun is doing the
Wrong Thing, and it's not DEC's problem.

Here are the packets that *I Think* are causing the problem:

	Bruce G. Barnett

======================DATA FOLLOWS=====================

Example packet sent from diskless client (via etherfind)

UDP from grymoire.1023 to ge-net.sunrpc  108 bytes
 ff ff ff ff ff ff 08 00 20 01 9c 27 08 00 45 00
 00 80 00 00 00 00 ff 11 b0 49 03 01 05 23 03 00
 00 00 03 ff 00 6f 00 6c 00 00 1f 73 91 f3 00 00
 00 00 00 00 00 02 00 01 86 a0 00 00 00 02 00 00
 00 05 00 00 00 01 00 00 00 18 1f 74 bb f0 00 00
 00 00 00 00 00 00 00 00 00 01 00 00 00 01 00 00
 00 01 00 00 00 00 00 00 00 00 00 01 86 ba 00 00
 00 01 00 00 00 01 00 00 00 14 00 00 00 01 00 00
 00 03 00 00 00 01 00 00 00 05 00 00 00 23
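
The headers in this dump can be decoded by hand; the following Python sketch does it with `struct` for the first 66 bytes. The field names are illustrative, and reading the payload as a portmapper call (program 100000, version 2, procedure 5) is an interpretation of the bytes, not something etherfind reported.

```python
import socket
import struct

# First 66 bytes of the dump: Ethernet, IP, UDP, start of the RPC call.
frame = bytes.fromhex(
    "ffffffffffff 080020019c27 0800"                # Ethernet header
    "45000080 00000000 ff11b049 03010523 03000000"  # IP header
    "03ff006f 006c0000"                             # UDP header
    "1f7391f3 00000000 00000002 000186a0 00000002 00000005"  # RPC header
)

eth_dst, eth_src, eth_type = struct.unpack("!6s6sH", frame[0:14])
(ver_ihl, tos, total_len, ident, frag, ttl, proto,
 ip_cksum, ip_src, ip_dst) = struct.unpack("!BBHHHBBH4s4s", frame[14:34])
sport, dport, udp_len, udp_cksum = struct.unpack("!HHHH", frame[34:42])
xid, msg_type, rpc_vers, prog, vers, proc = struct.unpack("!6I", frame[42:66])


def ones_complement_sum(data):
    """Fold the 16-bit one's-complement sum used by the IP header checksum."""
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


print("link-layer broadcast:", eth_dst == b"\xff" * 6)
print("IP", socket.inet_ntoa(ip_src), "->", socket.inet_ntoa(ip_dst), "ttl", ttl)
print("UDP", sport, "->", dport, "length", udp_len)
print("RPC program", prog, "version", vers, "procedure", proc)
# A valid header sums to 0xffff when the checksum field is included.
print("IP checksum valid:", ones_complement_sum(frame[14:34]) == 0xFFFF)
```

So the frame is an Ethernet broadcast carrying a UDP datagram from 3.1.5.35 port 1023 to 3.0.0.0 port 111, with a zero UDP checksum and a valid IP header checksum.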

Typical response from several machines:

ICMP from ge-net to grymoire dst unreachable bad port
  bad packet was: UDP from grymoire.1023 to ge-net.sunrpc  108 bytes
 08 00 20 01 9c 27 aa 00 04 00 60 10 08 00 45 00
 00 38 02 dc 00 00 ff 01 ad c5 03 00 00 00 03 01
 05 23 03 03 a8 6c 00 00 00 00 45 00 00 80 00 00
 00 00 ff 11 00 00 03 01 05 23 03 00 00 00 03 ff
 00 6f 00 6c 00 00
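
The reply can be picked apart the same way. This Python sketch decodes the Ethernet, IP, and ICMP headers (the first 42 bytes); the variable names are illustrative, and the note that an AA-00-04-00 Ethernet prefix indicates a DECnet Phase IV address is background knowledge, not something shown in the trace itself.

```python
import socket
import struct

# First 42 bytes of the reply: Ethernet, IP, and ICMP headers.
reply = bytes.fromhex(
    "080020019c27 aa0004006010 0800"                # Ethernet header
    "45000038 02dc0000 ff01adc5 03000000 03010523"  # IP header
    "0303a86c 00000000"                             # ICMP header
)

eth_dst, eth_src, eth_type = struct.unpack("!6s6sH", reply[0:14])
proto = reply[23]                        # IP protocol field (1 = ICMP)
ip_src = socket.inet_ntoa(reply[26:30])  # IP source address
ip_dst = socket.inet_ntoa(reply[30:34])  # IP destination address
icmp_type, icmp_code = reply[34], reply[35]

print("sent to a unicast Ethernet address:", eth_dst != b"\xff" * 6)
print("sender has a DECnet-style address: ", eth_src[:4] == bytes.fromhex("aa000400"))
print("IP", ip_src, "->", ip_dst, "protocol", proto)
print("ICMP type", icmp_type, "code", icmp_code)  # 3/3 is port unreachable
```

Note that the IP source address of the reply is 3.0.0.0, the destination of the original broadcast, not the address of the machine that sent it.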

cball@ishmael (10/14/88)

>/* Written 12:51 pm  Oct 12, 1988 by barnett@vdsvax.steinmetz.ge.com in ishmael:comp.protocols.tcp-ip */
>/* ---------- "Broadcast RPC question" ---------- */
>We have a problem on our network with diskless Suns running SunOS 4.0.
>At times they are unable to boot, due to some condition on our
>network.

What are the ifconfig command arguments in the boot file(/etc/rc.boot
in SunOS 3.5)?
We had an intermittent problem with diskless stations using any SunOS that
supports subnets (3.4 or later) on a net with Vaxes running Wollongong TCP.
The fix was to explicitly set the netmask in the ifconfig command.  Try
something like:

ifconfig le0 $hostname netmask 255.255.252.0 broadcast 3.1.4.0 -trailers up

This workaround was suggested in a discussion about 18-24 months ago.

Charles Ball
cball@Inmet.com
Intermetrics, Inc.

barnett@vdsvax.steinmetz.ge.com (Bruce G. Barnett) (10/17/88)

In article <155100002@ishmael>, cball@ishmael writes:
>What are the ifconfig command arguments in the boot file(/etc/rc.boot
>in SunOS 3.5)?

The netmask was added a year ago when we converted to subnets.
It was the first thing I tried when this problem surfaced with
SunOS 4.0.

I did get a message from Sun about this problem.

Make sure the ifconfig line has `hostname` on the line.
The standard rc.local is missing it.

	Bruce Barnett

08071TCP@MSU.BITNET (Doug Nelson) (10/19/88)

Your detailed trace allows me to confirm my suspicions.  The hosts which
reply with a "port unreachable" are definitely in error.  They are
making the following mistakes (I'm using the Host Requirements draft
RFC as my point of reference):

1.  A host should not send an ICMP error reply in response to a
    link layer broadcast (section 3.2.2).

2.  The DEC host either a) thinks that 3.0.0.0 is a valid IP broadcast
    address, in which case it should not send an ICMP error reply (section
    3.2.2), or b) doesn't think 3.0.0.0 is an IP broadcast address,
    in which case it should have discarded the datagram immediately upon
    receipt as not being destined for itself (section 3.2.1.9).

3.  The host is responding with an IP source address of 3.0.0.0, rather
    than its own IP address (sections 3.3.5 and 3.3.6).
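
The three points above can be condensed into a small decision function. This is a rough Python sketch of that logic, not text from the draft RFC: the function name, its parameters, and the example host address 3.1.4.10 are all hypothetical, and multicast is ignored.

```python
import ipaddress

ETHER_BROADCAST = "ff:ff:ff:ff:ff:ff"


def respond_to_closed_port(link_dst, ip_dst, my_addr, my_broadcasts):
    """Hypothetical helper: decide how to react to a UDP datagram for a closed port.

    Returns None to stay silent, or the IP source address to put in the
    ICMP port-unreachable reply.
    """
    dst = ipaddress.ip_address(ip_dst)
    me = ipaddress.ip_address(my_addr)
    broadcasts = {ipaddress.ip_address(b) for b in my_broadcasts}

    # Point 1: no ICMP error in response to a link-layer broadcast.
    if link_dst.lower() == ETHER_BROADCAST:
        return None
    # Point 2a: no ICMP error in response to an IP broadcast.
    if dst in broadcasts:
        return None
    # Point 2b: a datagram not addressed to this host is silently discarded.
    if dst != me:
        return None
    # Point 3: the reply carries the host's own address as its source.
    return str(me)


# The datagram from the trace: Ethernet broadcast, IP destination 3.0.0.0.
print(respond_to_closed_port(ETHER_BROADCAST, "3.0.0.0", "3.1.4.10", ["3.1.4.0"]))
# An ordinary unicast datagram to the same (hypothetical) host.
print(respond_to_closed_port("aa:00:04:00:60:10", "3.1.4.10", "3.1.4.10", ["3.1.4.0"]))
```

Under these rules the traced datagram produces no reply at all, whichever way the host classifies 3.0.0.0.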

To be fair, there are quite a number of TCP/IP implementations which make
one or more of these mistakes.  It will be nice to have the Host Requirements
RFC in hand to use as a definitive reference for such errant software.

Even now while this document is still in draft form, I'd recommend pointing
vendors in this direction if they aren't already aware of its existence, and
if they are, I'd suggest that they read it.

Doug Nelson
Michigan State University