[comp.protocols.tcp-ip] rwhod protocol and >42 users

dave@fps.com (Dave Smith) (10/04/89)

Has there been any thought given to extending the rwhod protocol so that
more than 42 (1024/sizeof whoent (==24)) users on a system will be reported 
correctly?  This limit is entirely too small for today's systems.
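
For reference, the arithmetic behind the 42-user figure can be checked with a short sketch. The field layout used here (8-byte tty line, 8-byte user name, 4-byte login time, 4-byte idle time) is an assumption made for illustration; the post itself only gives the 24-byte entry size and the 1024-byte table.

```python
import struct

# Assumed layout of one whoent (not spelled out in this thread):
# 8-byte tty line, 8-byte user name, 4-byte login time, 4-byte idle time.
# "!" selects network byte order with no padding between fields.
WHOENT_FORMAT = "!8s8sii"
WHOENT_SIZE = struct.calcsize(WHOENT_FORMAT)

# Space reserved for the user table in one rwhod packet, per the post above.
TABLE_BYTES = 1024

# Integer division: a partial entry cannot be sent.
MAX_USERS = TABLE_BYTES // WHOENT_SIZE

print(WHOENT_SIZE, MAX_USERS)  # prints "24 42"
```

The 16 bytes left over (1024 - 42*24) are too few for a 43rd entry, which is where the hard cutoff comes from.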

If I read the sources to rwhod correctly, if I define a new revision of the
protocol, rwhod's with different versions of the protocol will discard
the packets.  Is this correct?  Will anyone have a fit (ok, I know _someone_
will have a fit, will anyone _important_ have a fit?) if I arbitrarily
define a new revision of the protocol?

David L. Smith
FPS Computing, San Diego
ucsd!celerity!dave or dave@fps.com
"Repent, Harlequin!," said the TickTock Man

dls@mentor.cc.purdue.edu (David L Stevens) (10/07/89)

	It isn't a protocol change to increase the max rwho packet size. It's
an implementation change. We've been running with large rwho packets for years
(our Sequents support a couple hundred users without even blinking).
	You need to increase the size in "rwho.h" and increase the max udp
packet size (udp_sendspace and udp_recvspace) in sys/netinet/udp_usrreq.c.
And FINALLY, you need to ifdef out the code that refuses to fragment broadcast
packets in sys/netinet/ip_output.c.
	All of the changes are trivial. I can provide context diffs if you
ask me.
-- 
					+-DLS  (dls@mentor.cc.purdue.edu)

loverso@Xylogics.COM (John Robert LoVerso) (10/10/89)

In article <1178@celit.fps.com> dave@fps.com (Dave Smith) writes:
> Has there been any thought given to extending the rwhod protocol so that
> more than 42 (1024/sizeof whoent (==24)) users on a system will be reported 
> correctly?  This limit is entirely too small for today's systems.

Yes.  I played with this at Encore.  I did two approaches.  The
first one, based on a hack we had at SUNY/Buffalo, added a new
rwhod packet type that was just a continuation of the previous
information.  This allowed `n' packets; a modified rwhod would just
issue as many packets as needed.  Modified rwhod's would collect
these packets and concatenate them to the spool/rwho file.  Unmodified
rwhod's would just ignore the new packet types (except on certain
SystemV machines, where it would print stupid, obnoxious messages
on the console about `unknown type', sigh).  This worked.
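
A rough sketch of the continuation idea described above may help; it is not the Encore or SUNY/Buffalo code. The packet type values, the dictionary packet shape, and the function names are all hypothetical, and real packets would of course be packed binary structures rather than Python dictionaries.

```python
MAX_WHOENTS = 42  # entries that fit in one packet, per the original post

# Hypothetical packet type values, chosen only for this illustration.
TYPE_STATUS = 1
TYPE_CONTINUATION = 2


def split_into_packets(hostname, whoents):
    """Split a user list into one status packet plus continuation packets."""
    chunks = [whoents[i:i + MAX_WHOENTS]
              for i in range(0, len(whoents), MAX_WHOENTS)] or [[]]
    packets = []
    for index, chunk in enumerate(chunks):
        ptype = TYPE_STATUS if index == 0 else TYPE_CONTINUATION
        packets.append({"type": ptype, "host": hostname, "whoents": chunk})
    return packets


def collect(packets, understands_continuation=True):
    """Rebuild per-host user lists the way a listener might.

    A status packet starts a fresh list for its host; a continuation
    packet appends to it.  An unmodified listener ignores continuations.
    """
    spool = {}
    for packet in packets:
        if packet["type"] == TYPE_STATUS:
            spool[packet["host"]] = list(packet["whoents"])
        elif packet["type"] == TYPE_CONTINUATION and understands_continuation:
            spool.setdefault(packet["host"], []).extend(packet["whoents"])
    return spool


users = ["user%d" % n for n in range(100)]
packets = split_into_packets("bigvax", users)
```

With 100 users this yields three packets. A modified listener (`collect(packets)`) recovers all 100 entries, while an unmodified one (`collect(packets, understands_continuation=False)`) still sees the first 42, which is the backward-compatible behavior described above.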

The second approach just up'd the sendspace for udp packets to 8K;
this works with no protocol modification.  With this, fragmentation
happens automatically over networks with an MTU less than the packet
size.  However, several bugs in 4.2BSD prevents this from working.
4.3 is no problem.

With either approach, rwho/ruptime needed to be modified so that
they read more than 41 whoents from the spool/rwho file (i.e., just
read all of the file).

However, this is all BAD.  Either approach causes rwhod to send
out multiple broadcast packets.  On a relatively fast machine,
allowing IP to do the fragmentation, the broadcasts will hit the
net almost back-to-back.  This can cause many more problems than
benefits.  Many people will tell you that rwhod is ugly enough as
it is, just producing one broadcast packet every 1 [4.2] or 3 [4.3]
minutes.

If you are really hot on producing a new protocol, how about this.
Simple observation shows that most of the information rwhod gives
out doesn't change all that much.  So, perhaps a more reasonable
approach, would be to use a delta protocol, where any given packet
would include some new whoents and a few bits to delete old whoents.
Thus, over a period of time (10 minutes?), a rwhod-listener might
pick up a complete list of users.
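
To make the delta idea concrete, here is a minimal sketch under stated assumptions: the user table is modeled as a mapping from tty line to user name, a packet carries at most 42 additions plus a list of lines to delete, and the names `make_delta` and `apply_delta` are invented for this illustration. No such protocol exists in rwhod.

```python
def make_delta(previous, current, max_adds=42):
    """Describe the change from one user table to the next.

    Both tables map a tty line to a user name.  Only max_adds new or
    changed entries go in one packet; the rest wait for a later delta.
    """
    deletes = sorted(line for line in previous if line not in current)
    changed = sorted(line for line in current
                     if previous.get(line) != current[line])
    adds = {line: current[line] for line in changed[:max_adds]}
    return {"adds": adds, "deletes": deletes}


def apply_delta(table, delta):
    """Update a listener's copy of the table from one delta packet."""
    updated = dict(table)
    for line in delta["deletes"]:
        updated.pop(line, None)
    updated.update(delta["adds"])
    return updated


current = {"tty%02d" % n: "user%d" % n for n in range(100)}
advertised = {}  # what the sender has announced so far
listener = {}    # what a listener has pieced together
rounds = 0
while advertised != current:
    delta = make_delta(advertised, current)
    advertised = apply_delta(advertised, delta)
    listener = apply_delta(listener, delta)
    rounds += 1
```

In this run a listener that hears every packet has the full 100-user table after three rounds. The sketch does not handle a listener that starts late or drops a packet; a real design would presumably have to cycle through the whole table periodically, which is what the "over a period of time" remark above suggests.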

Finally, you can just chuck rwhod and use Sun's "rusers/rup" instead.
It works, albeit not very well.

John
-- 
John Robert LoVerso			Xylogics, Inc.  617/272-8140
loverso@Xylogics.COM			Annex Terminal Server Development Group

loverso@Xylogics.COM (John Robert LoVerso) (10/10/89)

In article <7370@xenna.Xylogics.COM>, I write:
% ...fragmentation happens automatically over networks with an MTU less
% than the packet size.  However, several bugs in 4.2BSD prevents[sic] this
% from working.  4.3 is no problem.

This is incorrect.  4.2, 4.3, and 4.3-tahoe all refuse to fragment a broadcast
packet.  Needless to say, I was playing with a modified kernel (i.e., fire).

John
(thanks to those who pointed this out)

dave@fps.com (Dave Smith) (10/11/89)

In article <7370@xenna.Xylogics.COM> loverso@Xylogics.COM (John Robert LoVerso) writes:
>If you are really hot on producing a new protocol, how about this.
>Simple observation shows that most of the information rwhod gives
>out doesn't change all that much.  So, perhaps a more reasonable
>approach, would be to use a delta protocol, where any given packet
>would include some new whoents and a few bits to delete old whoents.
>Thus, over a period of time (10 minutes?), a rwhod-listener might
>pick up a complete list of users.

This sounds like a reasonable idea.  Of course, for it to be at all useful,
multiple vendors would have to support it.  I'll try to put some work in
on it and then perhaps get it merged into the BSD stuff.

>
>Finally, you can just chuck rwhod and use Sun's "rusers/rup" instead.
>It works, albeit not very well.
>

I don't see any advantage to rusers over rwhod.  It generates a broadcast
packet (albeit a smaller one) and then it makes everyone on the network
do some work.  We had enough trouble already when people discovered
"perfmeter" and started hitting rstatd up for information every 10 seconds.

David L. Smith
FPS Computing, San Diego
ucsd!celerity!dave or dave@fps.com
"Repent, Harlequin!," said the TickTock Man

michaud@devax.dec.com (Jeff Michaud) (10/12/89)

> Finally, you can just chuck rwhod and use Sun's "rusers/rup" instead.
> It works, albeit not very well.

	Is rusers like remote finger (i.e., like "f @some-host-name")?

	A more efficient method all around would be to have a dedicated system
	act as a "remote who" server.  On a timer, the server system (let's call
	this one the master server) sends out a udp broadcast advertising
	itself as a "remote who" server.  Hosts that want to advertise who's
	logged on the system (call these slave servers) send the info directly
	to the "remote who" server.  Clients that want info on who's logged
	in on the network ask the local slave server (which knows who the master
	server is), which in turn asks the master server (or the slave server
	can store the master server's address/port away where clients can use
	it to request the info directly from the master server).

	To extend this just a bit further to improve on the availability
	problem if the master server is down, we can allow for multiple
	master servers to exist.  Slave servers will remember all the master
	servers and also send their "who's logged on me" messages to each
	of the masters.  3-4 servers on a LAN should be sufficient if
	a high assurance of availability is needed.
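
The scheme above can be sketched as an in-memory model, with no real sockets involved. The class and method names are invented for this illustration, the advertisement broadcast is reduced to a direct method call, and a master that is down is simulated with a flag.

```python
class MasterServer:
    """Holds the latest user list reported by each host."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.reports = {}

    def report(self, host, users):
        # A report sent to a master that is down is simply lost.
        if self.up:
            self.reports[host] = list(users)

    def query(self):
        if not self.up:
            raise ConnectionError(self.name + " is down")
        return dict(self.reports)


class SlaveServer:
    """Runs on every host; remembers the masters it has heard advertise."""

    def __init__(self, host):
        self.host = host
        self.masters = []

    def heard_advertisement(self, master):
        if master not in self.masters:
            self.masters.append(master)

    def send_report(self, users):
        # Sent directly to each known master instead of being broadcast.
        for master in self.masters:
            master.report(self.host, users)

    def who(self):
        # Ask the masters in turn until one answers.
        for master in self.masters:
            try:
                return master.query()
            except ConnectionError:
                continue
        return {}


masters = [MasterServer("m1"), MasterServer("m2")]
slaves = [SlaveServer("hostA"), SlaveServer("hostB")]
for slave in slaves:
    for master in masters:
        slave.heard_advertisement(master)

slaves[0].send_report(["dave"])
slaves[1].send_report(["john", "jeff"])
masters[0].up = False      # first master goes down
answer = slaves[0].who()   # still answered, by the second master
```

Because every slave reports to every master, any one surviving master can answer for the whole network, which is the availability argument made above.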

	This reduces udp broadcast traffic for an equivalent service to rwho
	down to 4/minute if there are 4 master servers, each of which advertises
	itself once a minute.  It also removes the beating on the
	disk to keep writing out to the /usr/spool/rwho/whod.* files
	(my workstation has been up 7 days and rwhod already has 47 minutes
	of cpu time, and this is on a fast DECstation 3100!).
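
As a back-of-the-envelope comparison of the broadcast rates, here is a small sketch. The 120-host network is a made-up figure; the 3-minute rwhod interval is the 4.3 value quoted earlier in this thread, and the once-a-minute master advertisement is the figure from the paragraph above.

```python
def rwhod_broadcasts_per_minute(hosts, interval_minutes):
    """With rwhod, every host broadcasts once per interval."""
    return hosts / interval_minutes


def master_broadcasts_per_minute(masters, interval_minutes=1):
    """With master servers, only the masters broadcast."""
    return masters / interval_minutes


hosts = 120  # hypothetical network size
rwhod_rate = rwhod_broadcasts_per_minute(hosts, 3)
master_rate = master_broadcasts_per_minute(4)
print(rwhod_rate, master_rate)  # prints "40.0 4.0"
```

The master-server rate stays fixed as hosts are added, while the rwhod rate grows with the number of hosts. The direct reports from slaves to masters still cost traffic, but they are not broadcasts.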

	This would probably be a good job for a location brokerage service?

/--------------------------------------------------------------\
|Jeff Michaud    michaud@decwrl.dec.com  michaud@decvax.dec.com|
|DECnet-ULTRIX   #include <standard/disclaimer.h>              |
\--------------------------------------------------------------/