lori@hacgate.scg.hac.com (Lori Barfield + 1/2) (02/02/90)
In article <9001311857.AA15864@mwunix.mitre.org> art@AERA785.MITRE.ORG ("Art McClinton") writes:
>I am having difficulty setting up a TCP/IP network between 20 Apollos
>and the other 1300 workstations on our ethernet. I have no difficulty
>with the Apollo's that have ethernet cards, but the ones that are on the token
>ring and have to use the gateway apollo to reach the ethernet are unable to
>communicate to anything off the ring.

Yeah, me too. From one node I get a socket error when I try to talk
to a printer daemon on another node. And with rlogin I get
"network unreachable" to/from anything but the gateway node.
A little fiddling made that a timeout error instead.

Odd thing is, I had this working at one time. Don't know what got
changed. (And my notes are about as helpful as TFM.)

crp on my internet (ironically, much more sophisticated than rlogin!)
works without a hitch from anywhere to anywhere.

Help!

...lori
dbfunk@ICAEN.UIOWA.EDU (David B Funk) (02/02/90)
Art & Lori

Your problems:

>>with the Apollo's that have ethernet cards, but the ones that are on the token
>>ring and have to use the gateway apollo to reach the ethernet are unable to
>>communicate to anything off the ring.
>
>Yeah, me too. From one node I get a socket error when I try to talk
>to a printer daemon on another node. And with rlogin I get
>"network unreachable" to/from anything but the gateway node.

all look to be related to tcp/ip routing not working. However, the
causes and cures depend upon which version of the software you are
running. In this kind of case, it can depend upon the OS revision on
the nodes showing the problem and the revision on the gateway node.
So when you post a problem statement like this, please state the
software revisions. It makes it a lot easier to try to help you.

Dave Funk
krowitz%richter@UMIX.CC.UMICH.EDU (David Krowitz) (02/02/90)
/com/crp does not rely upon TCP/IP for the underlying network packet
services. The most common error with TCP/IP utilities (rlogin, lpr,
telnet, ftp, etc.) is the TCP server (either /etc/tcpd under SR10 or
SR9.7 BSD4.2, or /sys/tcp/tcp_server under SR9.7 Aegis) dying or losing
its host tables. A hung or dead tcp server is frequently the cause of
a tcp socket error. The "network unreachable" error is usually caused
by a local tcp server which has lost its host table, but can also be
caused by a gateway that has crashed or lost its tcp server.

Under SR10, you can use the /etc/ping command to test your network
connections. If "/etc/ping <host on local net>" works, but
"/etc/ping <gateway>" doesn't work, then the problem is that the
gateway is either down or its tcp server and/or routed daemon is dead.
If /etc/ping does not work with any of your local hosts, then the tcp
server on your own machine is the culprit.

Unless you have multiple gateways between your Apollo net and the
outside world, run /etc/routed *only* on your gateway. Use the
"/etc/route add default <gateway node> 1" command on your non-gateway
nodes instead of routed. The routing daemon periodically flushes the
tcp server's routing tables of "old" routes, and if the routing daemon
on the gateway fails to update your local routing daemon you can lose
all of the routing info in your local tables.

If you *must* run /etc/routed on your local non-gateway nodes (i.e.,
networks with multiple gateways to the outside world) then use the
"-q" switch so that the local nodes will operate in "quiet" (i.e.,
listen only) mode to avoid unnecessary network traffic. Imagine what
would happen if all 2000+ nodes on the MIT campus network were to run
routing daemons all broadcasting their route tables to each other at
once!

--
David Krowitz

krowitz@richter.mit.edu (18.83.0.109)
krowitz%richter.mit.edu@eddie.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)
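The ping checks described in the post above amount to a small decision table. Here is a minimal sketch of that logic in Python, reading the last case as "ping fails for every local host". The function name `diagnose` and the returned strings are hypothetical, chosen only for illustration. The two booleans stand for the outcome of running /etc/ping by hand against a local host and against the gateway.

```python
def diagnose(local_ping_ok: bool, gateway_ping_ok: bool) -> str:
    """Map two manual ping results to the likely culprit.

    local_ping_ok   -- True if pinging a host on the local net worked
    gateway_ping_ok -- True if pinging the gateway node worked
    (Hypothetical helper; the labels below are illustrative only.)
    """
    if not local_ping_ok:
        # No local host answers: suspect the tcp server on this node.
        return "local tcp server"
    if not gateway_ping_ok:
        # Local hosts answer but the gateway does not: the gateway is
        # down, or its tcp server and/or routed daemon is dead.
        return "gateway down, or its tcp server/routed is dead"
    # Both answer: these two checks alone do not point at a fault.
    return "no fault found by ping"


if __name__ == "__main__":
    print(diagnose(local_ping_ok=True, gateway_ping_ok=False))
```

A result of "no fault found by ping" only means these two checks passed. Routing beyond the gateway would still need to be checked separately.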
chen@digital.sps.mot.com (Jinfu Chen) (02/03/90)
In article <7122@hacgate.scg.hac.com> lori@hacgate.hac.com writes:
>
>Yeah, me too. From one node I get a socket error when I try to talk
>to a printer daemon on another node. And with rlogin I get
>"network unreachable" to/from anything but the gateway node.
>A little fiddling made that a timeout error instead.

The SR10 TCP/IP manual is quite helpful for debugging and
troubleshooting problems. With a combination of /etc/ping and
/usr/ucb/netstat, you can find out whether the router on your gateway
behaves properly or not.

From our experience, the message "network unreachable" usually means
either that /etc/hosts and /etc/networks don't have the right
addresses, or that routed on the gateway isn't working. On SR10, the
default /etc/rc.local uses the -h option for non-gateway nodes, which
I was told is wrong (from Mentor CSB?). Right now I use -f -q for
non-gateway nodes, and -f -g for the gateway node.

If you are able to get a message about a time-out, start playing with
netstat options (the /usr/ucb one, not /com/netstat!) to see if your
non-gateway node can see the network through the gateway (option -r).

My biggest complaint about Apollo's TCP/IP software is that under
SR10.x, our DSP90 (3 MB RAM) gateway node can't handle the load as
well as under SR9.7. The tcpd occasionally crashes. Apollo's response
is to use an AT-bus type of machine as the gateway. However, I still
have a hard time convincing my manager to shell out $1200 for an
ethernet card. How can you answer his question, "if ECMB (ethernet
controller on multibus) works under sr9.7, it should work under sr10,
otherwise, ask Apollo for a free upgrade to an AT-bus card"?

--
Jinfu Chen (602)898-5338                 | Disclaimer:
Motorola, Inc. Logic IC Div., Mesa, AZ   |
..{somewhere}!uunet!dover!digital!chen   | My employer doesn't pay
chen@digital.sps.mot.com                 | me to express opinions.
kwongj@caldwr.UUCP (James Kwong) (02/03/90)
On the token-ring side (non-gateway) Apollos, try using the command:

    /etc/route add default gateway_apollo 1

where gateway_apollo is the hostname of the gateway machine.

I've noticed with SR9.7, if you loaded the Aegis version of TCP/IP, it
doesn't require explicit routing to the gateway. If you loaded just
BSD TCP/IP or are using the SR10.x version, you have to use the
/etc/route command.

Check the routing table on the token-ring side Apollos with the UNIX
(not Aegis) command 'netstat -r'. You should see something like:

    Routing tables
    Destination   Gateway          Flags  Hops  Ref  Use  Interface
    <default>     cache.water.ca.  USG    1     0    6    dr0
                  ^^^^^^^ your gateway machine should be listed here

Also, if you're using SR10.x and have sub-nets, you need the
defaultmask in the /etc/hosts file. With SR9.7, the netmask goes in
the /sys/node_data/networks file, something like this:

    xxx.yyy.192.11 on dr0 ; mask 255.255.255.0
    xxx.yyy.32.252 on eth0 ; mask 255.255.255.0

Since you are able to get to the ether side from the gateway Apollo,
routed is probably running OK on this machine.

Finally, tcp_server needs to be running. :-)

--
James Kwong
Calif. Depart. of H2O Resources, Sacramento, CA 95802
caldwr!kwongj@ucdavis.edu (Internet)
...!ucbvax!ucdavis!caldwr!kwongj (UUCP)
The opinions expressed above are mine, not those of the State of
California or the California Department of Water Resources.
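The 'netstat -r' check in the post above can also be done on captured output rather than by eye. This is a rough sketch in Python. It assumes the whitespace-separated column layout of the sample in the post, with the destination first and the gateway second. The function name `default_gateway` is hypothetical, and real netstat output on a given release may differ.

```python
from typing import Optional

# Sample text modelled on the routing table quoted in the post.
SAMPLE = """\
Routing tables
Destination   Gateway          Flags  Hops  Ref  Use  Interface
<default>     cache.water.ca.  USG    1     0    6    dr0
"""


def default_gateway(netstat_output: str) -> Optional[str]:
    """Return the gateway listed for the default route, or None.

    Assumes whitespace-separated columns with the destination in the
    first field and the gateway in the second (hypothetical helper).
    """
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] in ("<default>", "default"):
            return fields[1]
    return None


if __name__ == "__main__":
    print(default_gateway(SAMPLE))  # prints: cache.water.ca.
```

If the function returns None for a non-gateway node, no default route is installed, which is the situation the /etc/route command is meant to fix.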