[comp.sys.apollo] Apollo TCP/IP gateways

lori@hacgate.scg.hac.com (Lori Barfield + 1/2) (02/02/90)

In article <9001311857.AA15864@mwunix.mitre.org> art@AERA785.MITRE.ORG ("Art McClinton") writes:
>I am having difficulty setting up a TCP/IP network between 20 Apollos
>and the other 1300 workstations on our ethernet.  I have no difficulty
>with the Apollos that have ethernet cards, but the ones that are on the token
>ring and have to use the gateway apollo to reach the ethernet are unable to 
>communicate to anything off the ring.

Yeah, me too.  From one node I get a socket error when I try to talk
to a printer daemon on another node.  And with rlogin I get
"network unreachable" to/from anything but the gateway node.
A little fiddling made that a timeout error instead.

Odd thing is, I had this working at one time.  Don't know what got
changed.  (And my notes are about as helpful as TFM.)

crp on my internet (ironically, much more sophisticated than rlogin!)
works without a hitch from anywhere to anywhere.

Help!


...lori

dbfunk@ICAEN.UIOWA.EDU (David B Funk) (02/02/90)

Art & Lori

Your problems:

>>with the Apollos that have ethernet cards, but the ones that are on the token
>>ring and have to use the gateway apollo to reach the ethernet are unable to 
>>communicate to anything off the ring.
>
>Yeah, me too.  From one node I get a socket error when I try to talk
>to a printer daemon on another node.  And with rlogin I get
>"network unreachable" to/from anything but the gateway node.

all look to be related to TCP/IP routing not working.
However, the causes and cures depend on which software version
you are running. In a case like this, both the OS revision on the nodes
showing the problem and the revision on the gateway node matter.
So when you post a problem statement like this, please state the
software revisions. It makes it a lot easier to try to help you.

Dave Funk

krowitz%richter@UMIX.CC.UMICH.EDU (David Krowitz) (02/02/90)

/com/crp does not rely upon TCP/IP for the underlying network
packet services. The most common error with TCP/IP utilities
(rlogin, lpr, telnet, ftp, etc.) is the TCP server (either
/etc/tcpd under SR10 or SR9.7 BSD4.2 or /sys/tcp/tcp_server
under SR9.7 Aegis) dying or losing its host tables. A hung,
or dead, tcp server is frequently the cause of a tcp socket
error. The "network unreachable" error is usually caused by
a local tcp server which has lost its host table, but can
also be caused by a gateway that has crashed or lost its
tcp server.

Under SR10, you can use the /etc/ping command to test your
network connections. If "/etc/ping <host on local net>" works,
but "/etc/ping <gateway>" doesn't work, then the problem is
that the gateway is either down or its tcp server and/or
routed daemon is dead. If /etc/ping doesn't work with any of
your local hosts, then the tcp server on your own machine
is the culprit.
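
A minimal sketch of that decision procedure as a shell function. It
assumes only that each ping attempt can be boiled down to an exit status
(0 meaning a reply came back). Whether SR10's /etc/ping reports failure
that way is an assumption, so the real invocations are left as comments.
The name "diagnose" and the host names are made up for illustration.

```shell
# diagnose LOCAL_STATUS GATEWAY_STATUS
# Each argument is the exit status of a ping attempt (0 = reply received).
# LOCAL_STATUS should be non-zero only if no local host answered at all.
diagnose() {
    if [ "$1" -ne 0 ]; then
        # Nothing on the local net answers: suspect this node's tcp server.
        echo "local tcp server suspect"
    elif [ "$2" -ne 0 ]; then
        # Local net is fine but the gateway is silent.
        echo "gateway down, or its tcp server or routed is dead"
    else
        echo "local net and gateway both answer"
    fi
}

# Hypothetical usage (assumes /etc/ping returns and sets an exit status):
#   /etc/ping some_local_host; lstat=$?
#   /etc/ping gateway_apollo;  gstat=$?
#   diagnose "$lstat" "$gstat"
```

If both pings succeed and you still can't get off the ring, the routing
table on the local node is the next thing to look at.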

Unless you have multiple gateways between your Apollo net and
the outside world, run /etc/routed *only* on your gateway. Use
the "/etc/route add default <gateway node> 1" command on your
non-gateway nodes instead of routed. The routing daemon
periodically flushes the tcp server's routing tables of "old"
routes, and if the routing daemon on the gateway fails to
update your local routing daemon you can lose all of the routing
info in your local tables. If you *must* run /etc/routed on your
local non-gateway nodes (i.e. networks with multiple gateways
to the outside world) then use the "-q" switch so that the
local nodes will operate in "quiet" (i.e. listen only) mode to
avoid unnecessary network traffic. Imagine what would happen if
all 2000+ nodes on the MIT campus network were to run routing
daemons all broadcasting their route tables to each other
at once!
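
A rough sketch of that rule of thumb as a shell function that prints
which command a node should run. The three role names are invented for
illustration, and the function only echoes the command so it can be
tried anywhere without touching the routing tables.

```shell
# routing_cmd ROLE [GATEWAY]
# ROLE is one of (hypothetical names):
#   gateway         the node that routes between ring and ethernet
#   single-gateway  non-gateway node on a net with exactly one gateway
#   multi-gateway   non-gateway node on a net with several gateways
routing_cmd() {
    case "$1" in
        gateway)        echo "/etc/routed" ;;
        single-gateway) echo "/etc/route add default $2 1" ;;
        multi-gateway)  echo "/etc/routed -q" ;;
        *)              echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Example: what a token-ring node behind one gateway should run.
routing_cmd single-gateway gateway_apollo
```

The printed line could then be pasted into the node's startup file by
hand, once you are sure which role the node really has.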


 -- David Krowitz

krowitz@richter.mit.edu   (18.83.0.109)
krowitz%richter.mit.edu@eddie.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)

chen@digital.sps.mot.com (Jinfu Chen) (02/03/90)

In article <7122@hacgate.scg.hac.com> lori@hacgate.hac.com writes:
>
>Yeah, me too.  From one node I get a socket error when I try to talk
>to a printer daemon on another node.  And with rlogin I get
>"network unreachable" to/from anything but the gateway node.
>A little fiddling made that a timeout error instead.

The SR10 TCP/IP manual is quite helpful for debugging and troubleshooting
problems. With a combination of /etc/ping and /usr/ucb/netstat, you can find
out whether the router on your gateway is behaving properly.

From our experience, the message "network unreachable" usually means either
that /etc/hosts and /etc/networks don't have the right addresses, or that
routed on the gateway isn't working. On SR10, the default /etc/rc.local uses
the -h option for non-gateway nodes, which I was told is wrong (from Mentor
CSB?). Right now I use -f -q for non-gateway nodes, and -f -g for the
gateway node.

If you are able to get a time-out message, start playing with the netstat
(the /usr/ucb one, not /com/netstat!) options to see if your non-gateway node
can see the network through the gateway (option -r).

My biggest complaint about Apollo's TCP/IP software is that under SR10.x,
our DSP90 (3mb ram) gateway node can't handle the load as well as under
SR9.7. The tcpd occasionally crashes. Apollo's response is to use an AT-bus
type of machine to be the gateway. However, I still have a hard time
convincing my manager to shell out $1200 for an ethernet card. How can you
answer his question, "if ECMB (ethernet controller on multibus) works under
sr9.7, it should work under sr10, otherwise, ask Apollo for a free upgrade
to an AT-bus card"?

--
Jinfu Chen                  (602)898-5338      |       Disclaimer:
Motorola, Inc.  Logic IC Div., Mesa, AZ        | 
..{somewhere}!uunet!dover!digital!chen        | My employer doesn't pay
chen@digital.sps.mot.com                       | me to express opinions.

kwongj@caldwr.UUCP (James Kwong) (02/03/90)

On the token-side (non-gateway) Apollos, try using the command:

'/etc/route add default gateway_apollo 1'

Where gateway_apollo is the hostname of the gateway
machine.                                    

I've noticed with SR9.7, if you loaded the Aegis version of TCP/IP, 
it doesn't require explicit routing to the gateway. If you loaded just 
BSD TCP/IP or are using the SR10.x version, you have to use the
/etc/route command. 

Check the routing table on the token side Apollos with the UNIX 
(not Aegis) command: 

'netstat -r'.

You should see something like:

   Routing tables
   Destination     Gateway         Flags    Hops  Ref  Use        Interface
   <default>       cache.water.ca. USG      1     0    6          dr0 
                   ^^^^^^^
                   your gateway machine should be listed here
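
If you'd rather not eyeball the table, here is a small sketch that pulls
the gateway out of output shaped like the sample above. It assumes the
default route's first column reads "<default>" (as shown) or "default",
and that the gateway is the second column; other netstat versions may
lay the table out differently. The function name is made up.

```shell
# default_gateway: read `netstat -r` style output on stdin and print the
# Gateway column of the default route. Exit status is 1 if no default
# route line is found.
default_gateway() {
    awk '$1 == "<default>" || $1 == "default" { print $2; found = 1; exit }
         END { exit !found }'
}

# Hypothetical usage on a token-side node:
#   netstat -r | default_gateway || echo "no default route set"
```

An empty result means the /etc/route command never took effect (or the
route has since been flushed).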

Also if you're using SR10.x, and have sub-nets, you need the
defaultmask in the /etc/hosts file.

With SR9.7, the netmask goes in the /sys/node_data/networks file,
something like this:
xxx.yyy.192.11  on dr0 ;  mask 255.255.255.0
xxx.yyy.32.252 on eth0 ; mask 255.255.255.0
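
To see what a mask like that does, here is a minimal sketch that ANDs an
address with its mask, octet by octet, to get the network number. The
10.1.x.x addresses are hypothetical stand-ins for the xxx.yyy ones
above, and the function name is made up.

```shell
# network_of ADDR MASK: print the network address (ADDR AND MASK).
network_of() {
    old_ifs=$IFS
    IFS=.
    set -- $1 $2          # split both dotted quads into $1..$8
    IFS=$old_ifs
    echo "$(($1 & $5)).$(($2 & $6)).$(($3 & $7)).$(($4 & $8))"
}

network_of 10.1.192.11 255.255.255.0    # ring side:  10.1.192.0
network_of 10.1.32.252 255.255.255.0    # ether side: 10.1.32.0
```

With a 255.255.255.0 mask the two interfaces land on different subnets,
which is why the token-side nodes need a route through the gateway to
reach anything on the ethernet.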

Since you are able to get to the ether side from the gateway Apollo,
routed is probably running OK on this machine.

Finally, tcp_server needs to be running. :-)
-- 
James Kwong  Calif. Depart. of H2O Resources, Sacramento, CA 95802
caldwr!kwongj@ucdavis.edu(Internet) ...!ucbvax!ucdavis!caldwr!kwongj (UUCP)
The opinions expressed above are mine, not those of the State of California or the California Department of Water Resources.