[comp.sys.encore] Changing Annex packet size?

loverso@Xylogics.COM (John Robert LoVerso) (11/17/89)
In a comp.unix.wizards article, <9600@pyr.gatech.EDU>, David Brown writes:
> Sorry for posting this to comp.unix.wizards, but I was at a loss for a 
> better newsgroup (not to mention I'm desperate).

[Follow-ups to comp.sys.encore, for now]

> ...POOR performance using an Annex into a 750 running 4.3BSD...

Using rlogin or telnet from an Annex is no different from using
them from a BSD host.  I.e., input is normally echoed remotely, so
every character typed has to be sent [almost] immediately
(resulting in a 1 char packet) for the echo to get back in time
to preserve a "responsive" feel.  While you can tune the Annex so
that it delays sending characters, this will result in a slower
"feel" (and is usually only used for data transfers, not
interactive use).
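
To put a number on what a 1 char packet costs, here is a minimal
sketch of the header arithmetic.  It assumes minimum-size IPv4 and
TCP headers (20 bytes each, no options) and ignores link-level
framing; the function name is made up for illustration.

```python
IP_HEADER = 20   # minimum IPv4 header, no options (assumption)
TCP_HEADER = 20  # minimum TCP header, no options (assumption)

def payload_fraction(payload_bytes):
    """Fraction of an IP datagram that is actual user data."""
    return payload_bytes / (IP_HEADER + TCP_HEADER + payload_bytes)

# Compare a single keystroke with larger writes.
for n in (1, 40, 512):
    print(f"{n:4d} byte payload: {payload_fraction(n):6.1%} of the datagram is data")
```

A single keystroke is 1 byte of data in a 41 byte datagram, and
its echo costs the same again in the other direction.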

A better way to cut the number of packets is to use local echoing.
There are two ways to accomplish this.  The current Annex release
supports the 4.3 "line-by-line" mode in TELNET, which provides an
(almost) workable way to handle the local echo while in cooked
mode.  However, an even better way is the new TELNET Linemode
option, which provides support for both local echo and editing.
This should be available in the next Annex release.
In conjunction with the new BSD telnet daemon (see a recent article
in comp.protocols.tcp-ip by Dave Borman), which handles the Linemode
option, the Annex (or any other host) can handle the echo/editing
of cooked mode locally.
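
As a rough illustration of why local echo helps, the sketch below
counts data-carrying TCP segments for one typed line.  It assumes
keystrokes are never coalesced, that each keystroke draws exactly
one echo segment, and it ignores pure ACKs; the function names are
hypothetical.

```python
def segments_remote_echo(line):
    """Character-at-a-time: one segment per keystroke plus one per echo."""
    keystrokes = len(line) + 1   # +1 for the RETURN that ends the line
    return 2 * keystrokes

def segments_local_echo(line):
    """Line-at-a-time: the terminal server echoes and edits locally,
    then ships the finished line to the host in a single segment."""
    return 1

line = "ls -l /usr/spool"
print(segments_remote_echo(line), "segments with remote echo")
print(segments_local_echo(line), "segment with local echo")
```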

However, the echo traffic might not be the cause of the
performance loss.  telnetd and rlogind can quickly chew up a host
just processing input characters.  This is because all they do is
act as a user-level data switch between two kernel-level objects
(a socket and a pty).  When another user-level process (like an
editor) is reading from the slave pty, the context switch and
syscall overhead alone can kill your machine.
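
The "user-level data switch" can be sketched as a select loop.
This is not the rlogind/telnetd source, just a minimal stand-in:
two socketpairs play the roles of the TCP connection and the pty,
and relay() is a made-up name.  It counts the calls it makes so
the per-keystroke cost is visible.

```python
import select
import socket

def relay(net, pty, bufsize=1024):
    """Shuttle bytes between two descriptors until either side hits EOF.

    Returns the number of select/recv/send calls made, as a rough
    stand-in for the system calls the daemon would issue.
    """
    peer = {net: pty, pty: net}
    calls = 0
    while True:
        readable, _, _ = select.select([net, pty], [], [])
        calls += 1
        for src in readable:
            data = src.recv(bufsize)
            calls += 1
            if not data:             # EOF: the connection is gone
                return calls
            peer[src].sendall(data)  # counted as one call; fine for tiny writes
            calls += 1

# Demo: socketpairs stand in for the TCP connection and the pty.
user, net = socket.socketpair()
pty, app = socket.socketpair()
user.sendall(b"x")                   # one keystroke
user.shutdown(socket.SHUT_WR)        # then hang up so relay() returns
calls = relay(net, pty)
print(app.recv(16), "delivered after", calls, "calls in the relay")
```

Every keystroke costs the relay at least a select, a read, and a
write, on top of the wakeup of whatever process reads the other
side, and the echo makes the same trip in reverse.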

The "fix" for this is a set of patches posted long ago to mod.sources
that provide an "in-kernel rlogind" and "in-kernel telnetd".  The
names are misnomers, as the changes just add a means of linking a
socket directly to a pty.  None of the telnetd or rlogind code is
actually moved into the kernel.  For example, telnetd just makes
an ioctl() to connect the socket and pty.  This ioctl() doesn't
return until an IAC is received on the telnet connection, at which
point normal telnet escape processing occurs and the ioctl() is
simply called again.  I.e., this really just provides some
"kernel-assist" for rlogind/telnetd.  [Arguments on "what should
be in the kernel": please send them somewhere else].

These patches were originally done for rlogin/rlogind by Rick Ace at
NYIT.  Charles Hedrick of Rutgers added the telnetd support.  At
the IETF meeting two weeks ago, Chuck told me this code is still
in use on all their Suns for access from their terminal servers.
Anyway, the original postings can be found in the archives on uunet
in comp.sources.unix/volume4/{rlogind.Z,telnet.Z}.  Note that the
diffs are for 4.2BSD; I spent the shorter part of an afternoon
getting a SunOS3.5 kernel to build and work with the patches.
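
The control flow of the kernel-assisted telnetd described above
(the ioctl() returns only on IAC, the daemon handles the escape,
then calls the ioctl() again) can be simulated as below.  This is
not the patch itself: kernel_splice() is a hypothetical stand-in
for the blocking ioctl(), the input is a canned byte string rather
than a socket, and only complete 2- and 3-byte telnet commands are
handled (no subnegotiation).

```python
IAC = 255                           # telnet "Interpret As Command" byte
NEGOTIATION = {251, 252, 253, 254}  # WILL, WONT, DO, DONT: one option byte follows

def kernel_splice(data, pos, pty):
    """Stand-in for the ioctl(): copy bytes to the pty until an IAC shows up.

    Returns the index of the IAC, or len(data) if the input ran out.
    """
    end = data.find(bytes([IAC]), pos)
    if end == -1:
        end = len(data)
    pty.extend(data[pos:end])
    return end

def serve(data):
    """User-level loop: wake up only to do telnet escape processing."""
    pty, commands, wakeups, pos = bytearray(), [], 0, 0
    while pos < len(data):
        pos = kernel_splice(data, pos, pty)
        if pos == len(data):
            break
        wakeups += 1                # the "ioctl" returned: an IAC is waiting
        cmd = data[pos + 1]
        if cmd == IAC:              # IAC IAC is an escaped 0xFF data byte
            pty.append(IAC)
            pos += 2
        elif cmd in NEGOTIATION:
            commands.append((cmd, data[pos + 2]))
            pos += 3
        else:                       # other two-byte commands
            commands.append((cmd, None))
            pos += 2
    return bytes(pty), commands, wakeups

stream = b"ls\r\n" + bytes([IAC, 253, 1]) + b"pwd\r\n"
print(serve(stream))
```

The point of the design is the wakeup count: ordinary keystrokes
never reach user level, so the daemon runs once per telnet command
instead of once per character.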

As for how much of a difference it really makes, a while back I
set up a test where I used two Annexes to simulate rlogins directed
at a headless Sun3/260 running SunOS3.5.  The test faked users
typing at an even 5cps into cooked mode with echo ("cat > /dev/null").
I had some monitoring output on an ASCII terminal I was using as
the Sun's console.  Using a normal rlogind, the responsiveness of
the Sun was slightly degraded at 16 users, severely degraded at 32
users and unusable at 48 users.  Using the kernel-assisted rlogind,
there was no noticeable performance hit until there were over 40
"users" "typing" away!  A subset of the data I collected follows:

                  procs   faults    %cpu      net pkts
  type    users    r  b   sys  cs  us sy id    in  out

  normal   16      5  0  1126 192   6 84 10   145  152
  kernel   16      0  0    30   8   0 16 84   145  145

  normal   48     50  2   696 117   3 97  0   ***  *** (varied from 400-1400!)
  kernel   48      0  0    40  16   0 74 26   410  405

These numbers show a clear advantage with the kernel-assisted code,
which merely points out an implementation flaw that most (all?)
vendors using BSD-network code duplicate in their products.  More
importantly, with a VAX-11/750, any performance improvement you can
get will be a big win.  (btw, MIPSwise, an Annex-II has more
horsepower than a 750).
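
For anyone who wants the 16 user rows of the table above reduced
to per-character costs, here is a small sketch.  It assumes the
sys and cs columns are per-second rates (as vmstat reports them)
and uses the stated load of 16 users at 5cps; the dictionary keys
and function name are mine.

```python
# Rows copied from the table: syscalls, context switches, %idle at 16 users.
normal = {"sys": 1126, "cs": 192, "idle": 10}
kernel = {"sys": 30, "cs": 8, "idle": 84}

USERS = 16
CPS = 5                        # characters per second per simulated user
chars_per_sec = USERS * CPS    # total typing rate offered to the Sun

def per_char(row, column):
    """Average events per typed character, assuming per-second rates."""
    return row[column] / chars_per_sec

for column in ("sys", "cs"):
    ratio = normal[column] / kernel[column]
    print(f"{column}: {per_char(normal, column):.2f} vs "
          f"{per_char(kernel, column):.2f} per character "
          f"({ratio:.1f}x fewer with kernel assist)")
```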

-- 
John Robert LoVerso                     Xylogics, Inc.  617/272-8140
loverso@Xylogics.COM                    Annex Terminal Server Development Group