[comp.sys.apollo] TCP/IP gateway troubles in 10.2

bonnetf@apo.esiee.fr (bonnet-franck) (08/29/90)

I need a TCP/IP guru.

We have a very bothersome problem with TCP/IP communications.

Our gateway was a DN3500 running 10.1, with one 380 MB disk
and 16 MB of memory.

Our Domain Ring network is composed of 10.1 and 10.2 machines only.
Everything was working perfectly.
 
For geographic reasons I had to move the TCP/IP gateway to another machine.
So I installed the 802.3 controller in the new machine, which is also
a DN3500, with two 700 MB disks and 12 MB of memory.

BUT this machine is running 10.2 Domain/OS ... and has two disks instead of one.

After installing the controller and copying the config files (identical to before),
I restarted the new gateway machine.

Since then, we have been bothered ALL THE TIME by communication troubles
between our machines. TCP/IP seems to hang after a short time, and there is
no way to restart it EXCEPT by making a TCP/IP request from a machine OTHER
than one of our Apollos. We also have 3 HP machines (300, 500, 835), how lucky
we are ...

 - The config files are exactly the same (except for the name of the gateway, of course).
 - Nothing has changed on the other machines.
 - All the hardware and connections seem OK.
 - The Ethernet controller is OK (same one).
 - Using NAMED or not makes no difference to the problem.
                                                  
Does anybody know what is happening to our gateway? Is it a 10.2-specific BUG?
The release notes say TCP/IP is more efficient in 10.2. Is that a joke?

We are in trouble; help would be greatly appreciated.

-------------------------------------------------------------------------------|
bonnetf@apo.esiee.fr                     |                                     |
Frank Bonnet                             | Surfing ...                         |
E.S.I.E.E                                |                                     |
BP99 93162 Noisy le Grand cedex.FRANCE.  | the rest is details !               |
Fax   : 33 1 45 92 66 99                 |                                     |
-------------------------------------------------------------------------------|
 

dbfunk@ICAEN.UIOWA.EDU (David B Funk) (08/30/90)

In posting <9008291636.AA00462@apo.esiee.fr>, bonnetf@apo.esiee.fr (bonnet-franck) asks:

> We have a very bothersome problem with TCP/IP communications.
> Our gateway was a DN3500 running 10.1, with one 380 MB disk
> and 16 MB of memory.
 [stuff deleted]
>  
> For geographic reasons I had to move the TCP/IP gateway to another machine.
> So I installed the 802.3 controller in the new machine, which is also
> a DN3500, with two 700 MB disks and 12 MB of memory.
> BUT this machine is running 10.2 Domain/OS ... and has two disks instead of one.
> 
> Since then, we have been bothered ALL THE TIME by communication troubles
> between our machines. TCP/IP seems to hang after a short time, and there is
> no way to restart it EXCEPT by making a TCP/IP request from a machine OTHER
> than one of our Apollos. We also have 3 HP machines (300, 500, 835), how lucky
> we are ...
 [stuff deleted]
> Does anybody know what is happening to our gateway? Is it a 10.2-specific BUG?
> The release notes say TCP/IP is more efficient in 10.2. Is that a joke?

Yes, there is a specific bug in the routing daemon '/etc/routed' that was
released with sr10.2. It would start up just fine but after some time (hours
to days) it could "time-out" a network interface and consider it down.
When this happened, it would drop that interface from its routing tables &
quit broadcasting any routing information about it. The net effect is that
the gateway would quit looking like a gateway to other machines & quit routing.
There was a patch on the January (& newer) patch tapes to fix this problem.
Quoting from the release notes:


     1.53  Patch m0108 /etc/routed

     Patch m0108 includes fixes to the /etc/routed command for nodes running
     the SR10.2 version of Domain/OS. This patch is incompatible with
     all other versions of Domain/OS.

     Patch m0108 fixes the following problem (DDC72):

     The /etc/routed command was timing out active interfaces.  routed has
     been modified to prevent it from timing out and thus marking "down"
     interfaces that are configured "up". It does, however, time out
     interfaces which have been configured "down" via /etc/ifconfig.

     Install patch m0108 on nodes running the SR10.2 version of Domain/OS
     (use the bldt command to determine the revision of the operating
     system running on your workstation).

     Patch m0108 includes the following file:

        /etc/routed        1989/11/10 20:56:21 EST (Fri)


BTW, I would strongly recommend getting patch tape M68K_9007 (or newer) and installing
patches m0139, m0162, & m0165. These, together with m0108, are necessary for reliable
tcp/ip service on an sr10.2 machine (they fix the infamous sr10.2 pty problem).

Dave Funk

kerr@tron.UUCP (Dave Kerr) (08/31/90)

In article <9008291636.AA00462@apo.esiee.fr> bonnetf@apo.esiee.fr (bonnet-franck) writes:
>I need a TCP/IP guru.
>
>We have a very bothersome problem with TCP/IP communications.
>

[ description deleted ]

There is a patch for the 10.2 /etc/routed tcp/ip router
program. It's patch number 108.
-- 
Dave Kerr (301) 765-4453 (WIN)765-4453
tron::kerr                 Internal WEC vax mail
kerr@tron.bwi.wec.com      from an Internet site
kerr@tron.UUCP             from a smart uucp mailer