[comp.sys.apollo] TCP/IP woes

wicinski@NRL-CSS.ARPA (tim wicinski) (11/23/87)

Yes, I too have had MASSIVE TCP/IP problems.  the problems I have had
have been with the Domain/IX side of it, I never tried the Aegis side,
though it looked no better.  

routed/rwhod:  never would work. routed would find other route daemons,
but other route daemons could not find ours.  if we ran rwho, no one
would pick us up.  we bagged that quickly.

sendmail:  bad sendmail problems.  supposedly, according to our big
Eunuch weenies, sendmail should run as a user process and not a root
process.  (can anyone confirm this?)  also, mail was never forwarded
and was stuck in the queue.  example: if someone sent mail to me on my
apollo, and i had my .forward file set to forward it to another machine,
it would get stuck in my apollo's mail queue and never make it out.
running /usr/lib/sendmail -q as root and as joe user did no good; it
never attempted to run through the queue.  also, mailq would never show
any queued messages unless you were root.

ethernet storms:  from time to time we have these intense ethernet
storms when someone comes on the net with a bad broadcast address and
starts running their rwho daemon.  then about 150 machines freak out
when the rwho daemon broadcasts, and all 150 start sending arp
packets everywhere.  The load is incredible, and the burst rate and the
rate of occurrence (every minute) are enough to hang the Ethernet
controller within a half hour.  this is pretty frustrating, and
sometimes there is little we can do about it (this problem also
prevented our diskless suns from booting, but we did not figure out the
problem for a while).
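
One plausible mechanism for a storm like this (an assumption; nothing in
this thread confirms it is what happened on this net) is a mismatch in
broadcast conventions: 4.2BSD-derived hosts used a host part of all zeros
for broadcasts, while later systems use all ones.  A host that does not
recognize a destination as a broadcast may treat it as an ordinary address
and ARP for it.  A minimal Python sketch of that classification follows;
the network 192.0.2.0/24 and the function names are made up for
illustration.

```python
import ipaddress

# The "limited broadcast" address, valid on any network.
LIMITED_BROADCAST = ipaddress.ip_address("255.255.255.255")


def classify_destination(dest, network):
    """Classify an IPv4 destination relative to one local network.

    Returns "limited broadcast", "all-ones broadcast",
    "all-zeros broadcast" (the old 4.2BSD convention), or "unicast".
    """
    net = ipaddress.ip_network(network)
    addr = ipaddress.ip_address(dest)
    if addr == LIMITED_BROADCAST:
        return "limited broadcast"
    if addr == net.broadcast_address:   # host part all ones
        return "all-ones broadcast"
    if addr == net.network_address:     # host part all zeros
        return "all-zeros broadcast"
    return "unicast"


def recognizes_broadcast(dest, network, accepted_kinds):
    """True if a host accepting only `accepted_kinds` sees a broadcast."""
    return classify_destination(dest, network) in accepted_kinds


if __name__ == "__main__":
    net = "192.0.2.0/24"  # hypothetical network
    # Hypothetical host that understands only all-ones broadcasts.
    newer_host = {"all-ones broadcast", "limited broadcast"}
    for dest in ("192.0.2.255", "192.0.2.0", "255.255.255.255"):
        print(dest, classify_destination(dest, net),
              recognizes_broadcast(dest, net, newer_host))
```

In this model, a packet sent to 192.0.2.0 by a host using the all-zeros
convention is not recognized as a broadcast by hosts expecting all ones,
which is the kind of disagreement that can set many machines ARPing at
once.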

nfs:  nfs seems to work fine for most stuff, but in one case i can't get
it working:  I mount an nfs directory (i.e., only a directory from an NFS
partition was exported) as a root file system (in the // directory), and
from the gateway node everything is fine, but from the other machines
in the ring, it does not recognize the partition.  when i attempt to do
an ls on the partition from a non-gw node, all that shows up is .*

patch tapes:  not only have i not heard about the latest patch tape, but
i was told by the 800 # to call the local sales office for a patch
tape, and NO ONE there had ever heard of one.  I called our service guy, and
talked to three other service people, some tech support people, and the
so-called "s/w librarian", all of whom said the same thing:  never heard of
one.

oh well, can't have anything you want

tim wicinski 
naval research labs

nelson_r@apollo.UUCP (Rolf Nelson) (11/24/87)

   >> routed/rwhod:  never would work. routed would find other route daemons,
   >> but other route daemons could not.  if we ran rwho, no one would pick us
   >> up.  we bagged that quickly.

This is fixed in the October 2nd patch tape. Patch #66, #67. There is
a new /etc/rwhod.
    
   >> sendmail:  bad sendmail problems.  

There was a problem where sendmail using the SMTP protocol would die after
fork().  This is fixed in SR9.7 or on the October patch tape if you apply
patch #50 to SR9.6.  Your .forward problem sounds unrelated, please submit
a UCR to Apollo.
    
   >> ethernet storms:  from time to time we have these intense ethernet
   >> storms when someone comes on the net with a bad broadcast address,

This could be rwhod or routed, which are both fixed in the patch tape as
mentioned above.  An excerpt from the patch tape release notes describes
a similar problem.

 " Rwhod will generate increasing volumes  of  broadcast  traffic.   Most
  of these  broadcasts  will  never  be received, but they will impose an
  increased load on the network.  These increased  broadcasts  will  cause
  rwhod  to  use excessive  cpu  time, upwards of 90%. "

   >>nfs:  nfs seems to work fine for most stuff, but in one case i can't get
   >>it working:  I mount a nfs directory (ie, only  a directory from an NFS
   >>partition was exported) as a root file system (in the // directory), and
   >>from the gateway node everything is fine, but from the others machines
   >>in the ring, it does not recognize the partition.  when an attempt to do
   >>an ls on partition from a non-gw node, all that shows up is .*

Does TCP work from the non-gateway node?  Does this work from all nodes if
you mount at / instead of // ?  The // directory is a special directory in
that a copy of it exists on every node, so you may have to either mount the
nfs file system at each node or use the ns_helper to make this work right.
    
  >>patch tapes:  not only have i not heard about the latest patch tape, but
  >>i was told by the 800 # to call the local sales office for a patch
  >>tape, and NO ONE their ever heard of one.  I called our service guy, and
  >>talked to three other service people, some tech support people, and the
  >>so-called "s/w librarian" all who said the same thing:  never heard of
  >>one.  

The patch tape is produced about once a month by Apollo's Customer Service.
I believe it is sent out only by request and can be ordered by field service
personnel.  The following info on the patch tape was available on-line here
at Apollo.

  Apollo field personnel can order patch kits from the NACS Software Support
  Administrator as follows: 

  * by telephone, at 617-256-7159
  
  * by FAX, at 617-250-8022

  The information required to order a patch kit is the following:

  Customer name - site address
  Customer contact
  Contact phone number
  Node type(s) involved
  Software release
  Media type requested
  Patch required (e.g., October 2nd patch tape TCP/IP fixes #66 and #67
                        and sendmail fork fix #50)

  Instructions on how to install the patches are provided in hard copy
  release notes supplied with the patch kit. A complete description of
  the policies and procedures for the patch kit mechanism can be found
  in North American Customer Services Bulletin # 293.

Have your field service person order one for you. They really should be aware
of the patch tape; I'm sure it just slipped through the cracks. I hope this info
helps expedite the solution to your problems.

--  Rolf Nelson           UUCP: {mit-erl,yale,uw-beaver}!apollo!nelson_r 
    Apollo Computer       ARPA: apollo!nelson_r@EDDIE.MIT.EDU
    
    
    
-------

jen@mips.UUCP (Fred Jen) (11/24/87)

In article <8711231300.AA25389@nrl-css.ARPA> wicinski@NRL-CSS.ARPA (tim wicinski) writes:
>
>patch tapes:  not only have i not heard about the latest patch tape, but
>i was told by the 800 # to call the local sales office for a patch
>tape, and NO ONE their ever heard of one.  I called our service guy, and
>talked to three other service people, some tech support people, and the
>so-called "s/w librarian" all who said the same thing:  never heard of
>one.  
>
>oh well, can't have anything you want
>
>tim wicinski 
>naval research labs

Here is the content of the newest patch kit I've received:

Patch kit volume ID: patch_2oct87.

CHAPTER  1   PATCH KIT CONTENTS

  1.1   Patch 1 DN560s with 16 MHz CPUs
  1.2   Patch 2 Enhanced DEX for DN3000s
  1.3   Patch 3 Improved AEGIS for DN3000s
  1.4   Patch 6 SR9.5.1 version of /SAU8/INVOL (11/11/86)
  1.5   Patch 9 SAU8/DISP8B (Revision 1.9, 10/1/86)
  1.6   Patch 10 Text Stability on DN570s, DN580s and DN3000s
  1.7   Patch 12 Bourne Shells for BSD4.2 and System V
  1.8   Patch 14 /SYS/COLOR_MICROCODE for DN600s
  1.9   Patch 16 /SAU6/SPAD.UC and /SAU6/WCS.UC (Version 2.2)
  1.10  Patch 18 New WIN8 Diagnostics and Driver for DN3000s (Version 2.7)
  1.11  Patch 19 Updated FPU Diagnostics (Version 1.42)
  1.12  Patch 20 /SYSTEST/SAX.SLF (Revision 5.1)
  1.13  Patch 21 Updated /LIB/PMLIB
  1.14  Patch 23 Domain/Dialogue Product Changes
  1.15  Patch 24 New /DIALOGLIB
  1.16  Patch 25 Changes to /SAU Files for SR9.2.6
  1.17  Patch 26 /SAU7 Ring Diagnostics for DN4000s
  1.19  Patch 28 /SAU6 Display Diagnostic for DN5x0-T Nodes
  1.20  Patch 29 /SAU5 Display Diagnostic for DN5x0 Nodes
  1.21  Patch 30 Changes to /SYSTEST Files for DN5x0-T Nodes
  1.22  Patch 31 Native Ethernet Network Controller on DN3000s and DN4000s
  1.23  Patch 32 Changes to /LIB/GPRLIB for DN550s, DN560s, DN600s,
                 and DN660s
  1.24  Patch 33 Changes to /LIB/GPRLIB for GSR Graphics Product
  1.25  Patch 34 Improvements to PRINTF Command
  1.26  Patch 35 Changes to /LIB/PMLIB and /SAU7/CONFIG for DN4000s
                 and DSP4000s
  1.27  Patch 36/Patch 37  Error Reporting for Turbos
  1.28  Patch 39 Improvements to GMR2D Product
  1.29  Patch 40 Improved VT100 Emulator
  1.30  Patch 41 Changes to BSD4.2 LINT Command
  1.31  Patch 42 Changes to SYSV CPP and LINT Commands
  1.32  Patch 43 /SAU7/WIN.DEX for DN4000s
  1.33  Patch 44 New AEGIS and AEGIS.MAP for DN3000s (Revision 9.5.1.1)
  1.34  Patch 45 Changes to /SYSTEST/GRTEST
  1.35  Patch 46 Changes to SAX at SR9.6/SR9.6.1
  1.37  Patch 48 New INVOL for /SAU6, /SAU7, /SAU8, and /COM
  1.38  Patch 49 Changes to /LIB/TFP
  1.39  Patch 50 Changes to /LIB/PMLIB and /SYS/ENV
  1.40  Patch 51 New Ring Controller Diagnostic
  1.41  Patch 52 New Memory Diagnostic for DN3000s and DN4000s
  1.42  Patch 53 BSD4.2 TROFF and NROFF
  1.43  Patch 54 New SPE Driver (Rev 1.2)
  1.44  Patch 55 New Version of /COM/PAS (Rev. 7.3804)
  1.45  Patch 56 New Version of /COM/FTN (Version 9.55)
  1.46  Patch 57 New Version of /COM/CC (Version 4.85)
  1.45  Patch 58 New COLOR2_MICROCODE for DN580s and DN580-Ts
  1.46  Patch 59 New Ring Diagnostics for DN3000s
  1.47  Patch 60 New Utility - PREPVOL
  1.48  Patch 61 New AEGIS and AEGIS.MAP for DN4000s and DSP4000s
  1.49  Patch 62 Changes to Memory Diagnostic for Turbos
  1.50  Patch 63 Changes to /SAUn/DEX
  1.51  Patch 64 Changes to SAX Diagnostics
  1.52  Patch 65 Changes to /SAU7/CPU.DEX
  1.53  Patch 66/Patch 67 Changes to TCP/IP for Aegis and Domain/IX

As for the TCP/IP rev3.0 problems, patches 66 and 67 should fix them.

Here is the documentation for patches 66 and 67:

1.53  Patch 66/Patch 67 Changes to TCP/IP for Aegis and Domain/IX        



Patches 66 and 67 contain changes to Domain-TCP and BSD4.2-TCP.  They fix the
following  problems  occurring  with  Version  3.0  of TCP running on an SR9.5,
SR9.6, or SR9.7 node:

     o  TCP V3.0 Dynamic  Routing  does  not  work  properly.   The  Domain/IX
        /etc/routed  does  not  work  properly  at  sr9.5.   Routed is used on
        TCP/IP gateway nodes to implement the dynamic distribution of  routing
        information  across a TCP/IP internet.  Aegis TCP includes the program
        /sys/tcp/rip_server which is identical to the  Domain/IX  /etc/routed.
        The /etc/routed is a BSD4.3  implementation. 

        The failure mode for the TCP 3.0 Routed/Rip_Server is as follows:

        The  program  may  appear  to  work  for  some  period  of  time,  but
        eventually it  will  stop  working  correctly.   Gateways  may  inform
        neighboring  gateways  that  they  have  been disconnected from one of
        their  interfaces.   Gateways   may   also   broadcast   the   routing
        information  to  the  wrong port, so that the routing information will
        not be received by neighboring gateways.  When this happens,  gateways
        will drop routes involving Apollo gateways. 

        The  Routed/Rip_Server  will  generate increasing volumes of broadcast
        traffic on all networks to which  it  is  connected.   Most  of  these
        broadcasts  will  never be received, but they will impose an increased
        load on the network. 

        The tcp_server in this  kit  implements  a  -r  switch.   This  switch
        determines    which    routing    program   to   run   at   tcp_server
        initialization.  The default routing  when  using  the  -r  switch  is
        /etc/routed.   Users  wishing  to  run  Aegis tcp servers, can specify
        -r/sys/tcp/rip_server causing the rip_server  to  be  run  instead  of
        /etc/routed.   Without  the  switch, tcp_server runs /sys/tcp/makegate
        to acquire routing table entries.  

NOTE:     We recommend that you use the -r option on  gateways,  and  use  the
          default  (no  -r option) on non-gateway hosts.  For example, use the
          following line in your startup.19l  entry  for  tcp_server   gateway
          nodes running Domain-TCP:
          
                     cps /sys/tcp/tcp_server -r/sys/tcp/rip_server. 
          
          Note that there is no space between -r and the pathname. 
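
The switch handling described above can be modeled as follows.  This is
only a Python sketch of the behavior as these release notes state it, not
the actual tcp_server code; the function name is hypothetical, and how the
real program treats a bare -r followed by a separate pathname argument is
an assumption here.

```python
DEFAULT_WITH_SWITCH = "/etc/routed"       # default when -r is given alone
WITHOUT_SWITCH = "/sys/tcp/makegate"      # used when -r is absent


def routing_program(args):
    """Return the routing program a tcp_server-like command would start.

    `args` is the list of command-line arguments after the program name.
    The pathname must be attached directly to -r, with no space.
    """
    for arg in args:
        if arg.startswith("-r"):
            path = arg[2:]                # text glued onto -r, if any
            return path if path else DEFAULT_WITH_SWITCH
    return WITHOUT_SWITCH


if __name__ == "__main__":
    # Gateway running Domain-TCP, as in the NOTE above.
    print(routing_program(["-r/sys/tcp/rip_server"]))
    # -r with no pathname: falls back to the default routing program.
    print(routing_program(["-r"]))
    # Non-gateway host: no switch at all.
    print(routing_program([]))
    # With a space, this model sees a bare -r and ignores the pathname
    # (assumed behavior, shown only to illustrate the "no space" warning).
    print(routing_program(["-r", "/sys/tcp/rip_server"]))
```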
          

o  RWHOD  does not work properly in the BSD-TCP environment.  TCP3.0 Domain/IX
/etc/rwhod did not report all users logged in correctly. 

The Rwhod will generate increasing volumes  of  broadcast  traffic.   Most  of
these  broadcasts  will  never  be received, but they will impose an increased
load on the network.  These increased  broadcasts  will  cause  rwhod  to  use
excessive  cpu  time, upwards of 90%. The fixed version of RWHOD  doesn't load
the CPU up to 100%.

o  TCP_SERVER fails to recognize the dr1 interface.  

The  dr1  interface  is  a  requirement   for   nodes   running   Etherbridge,
Domain/Bridge-A  or  -B,  or   SR9.6 DN3000/4000 bridges containing two AT-bus
Apollo Token Ring controllers (dr0 and dr1). 

The failure mode is that the tcp will not recognize  the  dr1  interface.   It
will  continue to function on the dr0 interface.  The new tcp_server need only
be installed on the bridge gateway node. 

Broadcast packets of 255.255.255.255 [ffff] are  recognized  by  the  3.0  fix
kit 

o  The  previous version of TCP_SERVER does not support the broadcast  address
255.255.255.255; however the version  provided  with  this  patch   recognizes
broadcast packets of 255.255.255.255 [ffff]. 

o  TRPT  does  not  supply  the  info  in  the  correct  format  for  detailed
debugging. 

The new TRPT has been  enhanced  to  provide  an  improved  trace  information
display format and the help file TRPT.HLP has been augmented. 

Before  installing  Patch  66  or Patch 67, your workstation should be running
SR9.5, SR9.6, or SR9.7 of the Aegis  or Domain/IX operating system.  (use  the
BLDT   Shell   command   to   determine  which  version  is  running  on  your
workstation). Do not update your system software to one of the above  versions
AFTER  installing  Patch  66;  if  you  do,  you  will  reverse the changes to
/systest/ssr_util/trpt  and  /systest/ssr_util/trpt.hlp    effected   by   the
patch. 

Patch  66 should be installed on any node running TCP/IP, either Domain-TCP or
BSD4.2-TCP/IP.  Patch  67  should  be  installed   only   on   nodes   running
BSD4.2-TCP/IP. 

Patch 66 changes the following system files:

    sys/tcp/rip_server
    sys/tcp/tcp_server
    systest/ssr_util/trpt
    systest/ssr_util/trpt.hlp

Patch 67 changes the following system files:

    bsd4.2/etc/routed
    bsd4.2/etc/rwhod
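
The installation rules above can be summarized in a small lookup.  The
following Python sketch only restates what these notes say (supported
releases, which patch goes on which kind of node, and the files each patch
changes); the function and variable names are hypothetical and it is not
part of the patch kit.

```python
SUPPORTED_RELEASES = {"SR9.5", "SR9.6", "SR9.7"}

# Files changed by each patch, as listed in the release notes.
PATCH_FILES = {
    66: [
        "sys/tcp/rip_server",
        "sys/tcp/tcp_server",
        "systest/ssr_util/trpt",
        "systest/ssr_util/trpt.hlp",
    ],
    67: [
        "bsd4.2/etc/routed",
        "bsd4.2/etc/rwhod",
    ],
}


def patches_for_node(release, tcp_flavor):
    """Return the patch numbers to install on one node.

    `release` is the software release (e.g. "SR9.6");
    `tcp_flavor` is "Domain-TCP" or "BSD4.2-TCP/IP".
    """
    if release.upper() not in SUPPORTED_RELEASES:
        raise ValueError("release must be SR9.5, SR9.6, or SR9.7")
    if tcp_flavor == "Domain-TCP":
        return [66]                 # Patch 66 goes on any TCP/IP node
    if tcp_flavor == "BSD4.2-TCP/IP":
        return [66, 67]             # Patch 67 only on BSD4.2-TCP/IP nodes
    raise ValueError("unknown TCP flavor: " + tcp_flavor)


if __name__ == "__main__":
    for patch in patches_for_node("SR9.6", "BSD4.2-TCP/IP"):
        print("Patch", patch, "changes:", ", ".join(PATCH_FILES[patch]))
```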



This patch does not fix the problem related to UDP window size (this is a
problem related to the Imagen printer on Ethernet).  It will be fixed in
TCP/IP rev3.1 (rev3.1 is currently in beta test).

Currently, I'm running TCP/IP rev2.1, so I'm not sure if rev3.0 fixed the
trailer problem (a feature Apollo is supposed to support).  Does anybody know
the answer?

fred

-- 
-Fred Jen
UUCP: {decvax,ucbvax,ihnp4,hplabs}!decwrl!mips!jen
USPS: MIPS Computer Systems, 930 Arques, Sunnyvale, CA 94086, (408) 991-0220

krowitz@mit-richter.UUCP (David Krowitz) (11/24/87)

Oddly enough, I just received a patch tape which has some 30 or
40 patches on it for every release from sr9.2 through sr9.6.1
and notes on which patches should be installed for which
release levels. My tape came in the mail without my ever
asking for it ... the tape label says "009757" (which I
assume is the part number) "CRTG_PATCH_02OCT87" (it's a
cartridge tape) and "AEGIS PATCH KIT 10/19/87". See if your
local office or the 800 number can track it via the part
number. This tape does contain a number of TCP/IP updates.


 -- David Krowitz

mit-erl!mit-kermit!krowitz@eddie.mit.edu
mit-erl!mit-kermit!krowitz@mit-eddie.arpa
krowitz@mit-mc.arpa
(in order of decreasing preference)

giebelhaus@hi-csc.UUCP (Timothy R. Giebelhaus) (11/27/87)

Here is what I understand about how to get a patch tape.  If you can't 
get a tape any other way, send me mail and I will see if I can put
you in touch with the very helpful people who gave me my tape(s). 

If you have a call open with the 1-800-2APOLLO number which involves a problem
the patch tape would solve, you should get sent the tape in the mail.  I
would make sure that they plan to send it out, though.  I find it hard to
believe that someone at the 1-800 number did not know about the tape.  
They *SHOULD* know about the tape.  Ask the person who does not know about
the tape to contact the UNIX group there.  If that does not work, you
can give me his name and I'll tell him who to talk to.

As far as I can tell, the local office has nothing to do with the patch
tapes.  I give copies of whatever fixes I have to the local office, though.

They are having problems with getting the patch tapes written at Apollo.
With the demand higher than the supply, it is not a piece of cake to get
the patch tape.
-- 
---------------------------------
UUCP: {uunet, ihnp4!umn-cs}!hi-csc!giebelhaus
ARPA: hi-csc!giebelhaus@umn-cs.arpa
Nobody I know admits to sharing my opinions.  I don't even have a pet.

mkhaw@teknowledge-vaxc.ARPA (Mike Khaw) (11/28/87)

I gripe a lot about Apollos, but I have to hand it to them this time.
We got our patch tape last week -- without having asked for it, and just
a day or so after I'd seen the first posting here about the patch tape.

Mike Khaw
-- 
internet:  mkhaw@teknowledge-vaxc.arpa
usenet:	   {uunet|sun|ucbvax|decwrl|uw-beaver}!mkhaw%teknowledge-vaxc.arpa
USnail:	   Teknowledge Inc, 1850 Embarcadero Rd, POB 10119, Palo Alto, CA 94303