[comp.unix.wizards] rwho problems between 4.2 and 4.3 hosts

esj%bikini.cis.ufl.edu@RELAY.CS.NET (Eric Johnson) (05/25/87)

	  On our local area network we have some hosts running 4.2, 
	some running 4.3, and some Sun workstations. Our 4.3 hosts don't
	see the rwho packets the 4.2 hosts send out, and our 4.2 hosts
	don't see ANY rwho packets, even the ones they send out. The Suns
	see everything fine. (I love 'em)
	  We recently changed our network numbers to our assigned class
	B address. Before we did this the 4.2 hosts saw all (including
	4.3 hosts) packets. 
	  An Excelan Lanalyzer shows that after each host broadcasts a 
	rwho packet, the 4.2 hosts send back a 
		icmp: UNREACH NET
	packet. 

	  In all other respects everybody communicates fine. rlogin, rcp,
	smtp, and such work fine. 

	  What Gives? 

	  Thanks in Advance.
_________________________________________________________________
In Real Life:			UUCP: ..!akgua!ufcsv!esj
                                rfc733ish: esj%ufl.edu@CSNET-RELAY.ARPA
Eric S. Johnson II              rfc822: esj@ufl.edu
  University of                   -or-  esj%ufl.edu@relay.cs.net
    Florida                     Ma-Bell-Net: (904)-335-8000

hedrick@topaz.rutgers.edu (Charles Hedrick) (05/26/87)

I assume the things that 4.2 doesn't see are broadcasts.  4.3 will by
default send broadcasts to 128.x.255.255.  4.2 only understands
128.x.0.0.  However you can set the broadcast address in 4.3 using
ifconfig.  I'd try a few until you get through.  Reasonable things
to try are 128.x.0.0 and 0.0.0.0, both of which are legal under the
old standards, and 128.x.255.255 and 255.255.255.255, both of which
are legal under the new standards.  Also, you should set
ip_forward to 0 on all machines that are not gateways.  You are
getting icmp errors back from 4.2 because they try to forward the
packets and can't.  This is a mistake.  Finally, I believe certain
implementations of 4.2 don't compute UDP checksums correctly.  If
you have any of these, you may have to turn off UDP checksumming
on all of your systems.  There's a kernel variable for that too,
with some obvious name like udp_checksums.
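
A minimal sketch of the four candidates listed above, in Python.  The
network 128.2.0.0 is only an assumption for illustration (the "x" is
site-specific), and broadcast_candidates() is a hypothetical helper, not
part of any BSD tool:

```python
import ipaddress

def broadcast_candidates(network):
    """Return the four broadcast addresses worth trying, old styles first."""
    net = ipaddress.ip_network(network)
    return [
        str(net.network_address),    # old standard: host part all 0s
        "0.0.0.0",                   # old standard: all 0s
        str(net.broadcast_address),  # new standard: host part all 1s
        "255.255.255.255",           # new standard: all 1s
    ]

for candidate in broadcast_candidates("128.2.0.0/16"):
    print(candidate)
```

Each value printed is one to hand to ifconfig as the broadcast address on
the 4.3 hosts until the 4.2 hosts start seeing the packets.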

pdb@sei.cmu.edu (Patrick Barron) (05/26/87)

In article <7523@brl-adm.ARPA> esj%bikini.cis.ufl.edu@RELAY.CS.NET (Eric Johnson) writes:
>	  We recently changed our network numbers to our assigned class
>	B address. Before we did this the 4.2 hosts saw all (including
>	4.3 hosts) packets. 
>	  An Excelan Lanalyzer shows that after each host broadcasts a 
>	rwho packet, the 4.2 hosts send back a 
>		icmp: UNREACH NET
>	packet. 

I've seen something like this just recently.  4.2 hosts didn't like broadcast
packets sent by hosts with 4.3-like networking (specifically, Ultrix 1.2).
One of the 4.2 hosts was sending ICMP network unreachable messages in response
to each, and the rest (Suns running Sun Unix 3.2) were trying to forward the
packet.  Problem was that the Ultrix machines and the 4.2 machines had differ-
ing ideas about what the IP broadcast address was.  The Ultrix machines thought
it was (in our case) 128.2.255.255, and the 4.2 machines thought it was
128.2.0.0.  You might want to check to make sure your broadcast addresses
agree.  4.3bsd will allow you to change the IP broadcast address with ifconfig.

--Pat.

forys@sigi.Colorado.EDU (Jeff Forys) (05/26/87)

In article <12249@topaz.rutgers.edu> hedrick@topaz.rutgers.edu
(Charles Hedrick) writes:
> However you can set the broadcast address in 4.3 using ifconfig.
> Reasonable things to try are 128.x.0.0 and 0.0.0.0 [...]

From what I've determined, this causes a problem iff the machine is
a gateway on a class B net where only the subnet addr is different.
Given an address of `128.net.0.0', and two interfaces `128.net.sub1'
and `128.net.sub2', it can't decide which one to use, and broadcast
packets never get out.  Packets get out if you set the broadcast
address to `128.net.sub.{0,1}', leaving a nasty kernel hack as the
only way to get `128.net.0.0' out of a gateway.

Can anyone confirm/deny the above behavior?
---
Jeff Forys @ UC/Boulder Engineering Research Comp Cntr (303-492-6096)
forys@Boulder.Colorado.EDU  -or-  ..!{hao|nbires}!boulder!forys

pdb@sei.cmu.edu (Patrick Barron) (05/26/87)

In article <12249@topaz.rutgers.edu> hedrick@topaz.rutgers.edu (Charles Hedrick) writes:
> [...]  Also, you should set
>ip_forward to 0 on all machines that are not gateways.  You are
>getting icmp errors back from 4.2 because they try to forward the
>packets and can't.  This is a mistake.  [...]

Just a slight correction:  the variable that controls IP packet forwarding
is "ipforwarding", not "ip_forward".  "ip_forward" is the routine that
actually does the work of forwarding a packet.

I'm not sure how it would work on 4.2 (I don't have access to 4.2 sources
anymore), but at least on Ultrix, if a host gets a packet that doesn't
belong to it, and ipforwarding == 0, then it will just generate an automatic
ICMP net unreachable message anyway, which may not be what you want.

--Pat.

chris@columbia.UUCP (05/26/87)

In article <1158@sigi.Colorado.EDU> forys@boulder.Colorado.EDU (Jeff Forys) writes:
>In article <12249@topaz.rutgers.edu> hedrick@topaz.rutgers.edu
>(Charles Hedrick) writes:
>> However you can set the broadcast address in 4.3 using ifconfig.
>> Reasonable things to try are 128.x.0.0 and 0.0.0.0 [...]

>From what I've determined, this causes a problem iff the machine is a gateway
>on a class B net where only the subnet addr is different.  Given an address of
>`128.net.0.0', and two interfaces `128.net.sub1' and `128.net.sub2', it can't
>decide which one to use, and broadcast packets never get out.  Packets get out
>if you set the broadcast address to `128.net.sub.{0,1}', leaving a nasty
>kernel hack as the only way to get `128.net.0.0' out of a gateway.

>Can anyone confirm/deny the above behavior?

You're mostly right; with 4.3bsd networking code, if you have subnets enabled,
128.x.0.0 is interpreted as a host on subnet zero, and the kernel won't
recognize that as a broadcast address regardless of what the per-interface
broadcast address is set to.
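
The arithmetic behind that can be sketched in Python.  This only
illustrates the masking, it is not the kernel code; the 8-bit subnet
field, the network 128.2, and the two interface addresses are all
assumptions made up for the example:

```python
import ipaddress

SUBNET_MASK = 0xffffff00  # assumed: 8-bit subnet field on a class B net

def subnet_of(addr, mask=SUBNET_MASK):
    """Mask off the host bits and return the subnet as a dotted quad."""
    value = int(ipaddress.IPv4Address(addr)) & mask
    return str(ipaddress.IPv4Address(value))

interfaces = ["128.2.1.5", "128.2.2.5"]  # hypothetical gateway interfaces
target = "128.2.0.0"                     # old-style net broadcast address

attached = [subnet_of(i) for i in interfaces]
print(subnet_of(target))              # subnet the target appears to be on
print(attached)                       # subnets the gateway is attached to
print(subnet_of(target) in attached)  # False: it looks like subnet zero
```

With subnets enabled, 128.2.0.0 masks down to subnet zero, which matches
neither attached subnet, so nothing marks it as a local broadcast.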

You can set up a fake route so that packets addressed to 128.x.0.0 would
normally be sent to one or another of your subnet interfaces, but most of the
programs that send broadcasts (rwhod, routed, etc) explicitly bypass the route
table.

If each interface has a unique broadcast address, or you don't mind having
all the 128.x.0.0 broadcasts going only to one subnet, you can get around the
problem with some simple fixes to in_broadcast() and ip_output(), but in
general, trying to use non-subnet broadcast addresses throughout a subnetted
network will open up a real can of worms.
							Chris

dce@mips.UUCP (David Elliott) (05/26/87)

Previous responses to this have confused me, so I thought I'd give it
a try from a more "user-oriented" standpoint.

From my readings of the ifconfig(8C) manual page and from learning
about how our network is set up, here's what I understand:

	There are two quantities involved: netmask and broadcast address.
	The netmask is a mask of 32 bits that says what part of an address
	is the "network" (1s in the mask) and what part is the "host" (0s).
	The broadcast address is an address that is understood by all 
	hosts on a given network (which is why you can't broadcast into other
	networks) as a special address for "broadcast" packets.

For example, suppose there are 3 machines whose addresses are
97.0.0.1, 97.0.1.2, and 97.1.0.3. If the netmask is set to 0xff000000,
all of these machines are on the network numbered 97, so the
broadcast address must be an address of the form 97.x.y.z, but
it CAN be any address except for the ones already assigned to the
machines listed.

If the netmask is 0xffff0000, 97.0.0.1 and 97.0.1.2 are still on the
same network, and the broadcast address must be of the form 97.0.x.y.
The machine with address 97.1.0.3 is not on this network, and valid
broadcast addresses in its network are of the form 97.1.x.y (where
x and y do not have to be the same as on the other network).

Other netmask values should, according to the documentation, work
similarly. This allows you to have hierarchical networks within any
physical organization.
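
The two netmask cases above can be checked with a short Python sketch.
network_of() is a hypothetical helper; the addresses and masks are the
ones from the example:

```python
import ipaddress

def network_of(host, netmask):
    """Keep the bits under the 1s of the netmask; zero the host bits."""
    value = int(ipaddress.IPv4Address(host)) & netmask
    return str(ipaddress.IPv4Address(value))

hosts = ["97.0.0.1", "97.0.1.2", "97.1.0.3"]

for mask in (0xff000000, 0xffff0000):
    print(hex(mask), [network_of(h, mask) for h in hosts])
```

Under 0xff000000 all three hosts reduce to 97.0.0.0; under 0xffff0000
the third reduces to 97.1.0.0 and so falls on a different network.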

Now, to answer the real question. In 4.2BSD, the broadcast address
defaults to being

	{host address} & {netmask}

That is, all of the 0s in the netmask are zeroed. So, if your netmask
is 0xffff0000 and your host address is 128.3.7.86, the broadcast
address defaults to 128.3.0.0.

In 4.3BSD, the broadcast address defaults to

	{host address} | ~{netmask}

That is, all of the 0s in the netmask are changed to 1s. So, using
the same netmask and host address as before, the broadcast address
defaults to 128.3.255.255.
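
Both defaults can be worked through in Python as a sketch of the
arithmetic only.  default_broadcasts() is a hypothetical helper, and the
host address and netmask are the ones used in the example above:

```python
import ipaddress

def default_broadcasts(host, netmask):
    """Return (4.2BSD-style, 4.3BSD-style) default broadcast addresses."""
    addr = int(ipaddress.IPv4Address(host))
    mask = netmask & 0xFFFFFFFF
    zeros = addr & mask                  # host bits cleared to 0
    ones = addr | (~mask & 0xFFFFFFFF)   # host bits set to 1
    return (str(ipaddress.IPv4Address(zeros)),
            str(ipaddress.IPv4Address(ones)))

old_style, new_style = default_broadcasts("128.3.7.86", 0xffff0000)
print(old_style)  # 128.3.0.0
print(new_style)  # 128.3.255.255
```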

I believe that the default netmask in all BSD systems is 0xffff0000.

When you set up a 4.3BSD system to talk to 4.2BSD-based systems (like
Suns), you need to change the ifconfig command(s) in /etc/rc.local
to set the broadcast address. Of course, you may not want the 4.3
broadcast packets to mix with others, so decide beforehand and be
prepared to change back.

The user-oriented documentation on netmask and broadcast addresses
is pretty bad. I got a call from someone on Friday night that could
not get a MIPS machine to talk to a Sun, and it ended up that they
had set the broadcast address to the same as the host address, thus
making all host-specific requests broadcasts. Not only did rwho and
ruptime not work, but rlogin and rsh were broken as well!

Anyway, if any of you network jocks can point out flaws in the
above information, please do so publicly. I'd like to add it to
the documentation for ifconfig and to our system installation
documentation.
-- 
David Elliott		{decvax,ucbvax,ihnp4}!decwrl!mips!dce
"With an a) like that, you've got a lot of nerve asking for a b)!"-P. Schaeffer

forys@sigi.Colorado.EDU (Jeff Forys) (05/28/87)

In article <427@quacky.UUCP> dce@quacky.UUCP (David Elliott) writes:
> The netmask is a mask of 32 bits that says what part of an address
> is the "network" (1s in the mask) and what part is the "host" (0s).

For a more complete understanding of why `netmask' is necessary, it
might be helpful to mention subnets.  Netmask allows installations
to partition a large network into smaller, manageable subnets.  This
is done by `stealing' bits from the host part of an address to
create a "subnet field".  For example, given a 32-bit Class B address
(below left), a specific site *may* choose to allocate 8 bits (up to
15 bits) from the *host* part as a `subnet field' (below right):

	    1          2          3		    1	       2	  3
 01234567 89012345 67890123 45678901	 01234567 89012345 67890123 45678901
+--------+--------+--------+--------+	+--------+--------+--------+--------+
|10    network    |       host      |	|10    network    | subnet |  host  |
+-----------------+-----------------+	+--------+--------+--------+--------+

The netmask on the left would be 0xffff0000, and on the right, 0xffffff00.
You are correct in saying that the 0's part of the mask is for the host.
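
As a rough illustration of the right-hand layout, here is a Python
sketch that pulls the three fields out of a Class B address.
split_class_b() is a hypothetical helper, the sample address is
arbitrary, and the mask is assumed to be contiguous:

```python
import ipaddress

def split_class_b(host, netmask=0xffffff00):
    """Split a Class B address into (network, subnet, host) fields."""
    addr = int(ipaddress.IPv4Address(host))
    network = "%d.%d" % (addr >> 24, (addr >> 16) & 0xFF)  # top 16 bits
    local = addr & 0xFFFF                # the original 16-bit host part
    host_mask = ~netmask & 0xFFFF        # bits still left for the host
    host_bits = bin(host_mask).count("1")
    subnet = local >> host_bits          # bits stolen for the subnet field
    return network, subnet, local & host_mask

print(split_class_b("128.3.7.86"))              # ('128.3', 7, 86)
print(split_class_b("128.3.7.86", 0xffff0000))  # ('128.3', 0, 1878)
```

With 0xffff0000 no bits are stolen, so the subnet field is empty and the
whole 16-bit local part is the host number.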

> the broadcast address must be an address of the form 97.x.y.z, but
> it CAN be any address except for the ones already assigned to the
> machines listed.

RFC-943 defines an address of all 1's to mean "all hosts".  As a result,
to reach "all hosts" on your fictitious "97." net, you must use the addr
"97.255.255.255" (assuming no IP multicasting).  In the same way, an "all
1's" subnet field should send the packet to all subnets (however, 4.3BSD
doesn't implement broadcasting across subnet boundaries).  4.2BSD predates
the RFC and uses the wrong broadcast address (all 0's).

> I believe that the default netmask in all BSD systems is 0xffff0000.

Netmask was *not* implemented in stock 4.2BSD.  It was hacked in, and
then later became part of 4.3 as the need for subnets arose.  You might
wanna check out RFC-950 for information on subnets.
---
Jeff Forys @ UC/Boulder Engineering Research Comp Cntr (303-492-6096)
forys@Boulder.Colorado.EDU  -or-  ..!{hao|nbires}!boulder!forys