jec@iuvax.cs.indiana.edu (07/19/88)
I've experienced what I think is a bug with TCP/IP 3.1 and would like to know if anyone has noticed it and, even better, if anyone has a fix.

I have a script that I run that tries to do a rsh on each apollo from the VAX (Ultrix 2.2) in order to determine if TCP/IP services are functioning. The problem is that sometimes (not always), the rsh will hang forever. I do a:

    % rsh io.cs.indiana.edu /bin/echo UP

and it will sit there for tens of hours. I've noticed that it only seems to occur on diskless nodes. I'm running SR9.7, Domain/IX 9.5, TCP 3.1, and the nodes boot from DSP90s with 3MBs (one of the DSP90s is the gateway, but the other is a typical node in the network).

Any ideas?

Usenet:    iuvax!jec
ARPANet:   jec@iuvax.cs.indiana.edu
Phone:     (812) 335-7729
U.S. Mail: Indiana University
           Dept. of Computer Science
           021-E Lindley Hall
           Bloomington, IN. 47405
(Home of Bob Knight and the Indiana Hoosiers)
aad@stpstn.UUCP (Anthony A. Datri) (07/21/88)
I'd be happy if I could get 3.1 tcp to work at all. I install it, and get some error about the /lib/streams file being out of date, so I install it with the /lib/streams off of the 3.1 tcp tape, and get the same error.

--
@disclaimer(Any concepts or opinions above are entirely mine, not those of my employer, my GIGI, or my 11/34)
beak is beak is not
Anthony A. Datri,SysAdmin,StepstoneCorporation,stpstn!aad
kwongj@caldwr.caldwr.gov (James Kwong) (07/21/88)
In article <1894@stpstn.UUCP>, aad@stpstn.UUCP (Anthony A. Datri) writes:
>
> I'd be happy if I could get 3.1 tcp to work at all. I install it,
> and get some error about the /lib/streams file being out of date, so
> I install it with the /lib/streams off of the 3.1 tcp tape, and get
> the same error.--

Did you try shutting down the machine after you installed the new tcp stuff? I had a similar problem when I installed it over a modem. I couldn't shut the machine down after the installation, so I tried to restart tcp by hand, with no luck. It kept complaining about the stream being out of date. After I rebooted the machine everything was fine.

Hope this helps.

JK
--
James Kwong
Calif. Depart. of H2O Resources, Sacramento, CA 95802
caldwr!kwongj@ucdavis.edu (Internet)
...!ucbvax!ucdavis!caldwr!kwongj (UUCP)
The opinions expressed above are mine, not those of the State of California or the California Department of Water Resources.
weber_w@apollo.uucp (Walt Weber) (07/22/88)
In article <1894@stpstn.UUCP> aad@stpstn.UUCP (Anthony A. Datri) writes:
>
>I'd be happy if I could get 3.1 tcp to work at all. I install it,
>and get some error about the /lib/streams file being out of date, so
>I install it with the /lib/streams off of the 3.1 tcp tape, and get
>the same error.--

Anthony:

As your message does not give any indication as to HOW the software is failing, I will answer as though it is a misunderstanding of messages in the install procedure. (If this is a bad assumption on my part, please follow up with what operations are being performed when you get the failures, and some of the error text from the failure.)

The installation procedures for tcp3.1 include checking the release date of critical files like /lib/streams, and should give you an advisory message like

    "/lib/streams appears to be out of date..."

or

    "/lib/streams appears to be a newer release and may not need to be updated..."

and then ask if you wish to have the file updated. If you answer YES to have the file updated, it will update the file, but the update WILL NOT TAKE EFFECT UNTIL THE NEXT REBOOT. The release notes and installation procedures call this out clearly, I believe.

You should, therefore, replace the file (if it is out of date), shut down & reboot the node, and then use tcp3.1.

Please keep us posted (no pun intended) about your progress.

...walt...
--
Walt Weber           PHONE: (617) 256-6600 x7004
Apollo Computer      GENIE: W.WEBER
Chelmsford, People's Republic of Massachusetts
kts@quintro.UUCP (Kenneth T. Smelcer) (07/28/88)
In article <5400029@iuvax> jec@iuvax.cs.indiana.edu writes:
>
>	I have a script that I run that tries to do a rsh on each apollo
>from the VAX (Ultrix 2.2) in order to determine if TCP/IP services are
>functioning. The problem is that sometimes (not always), the rsh will
>hang forever. I do a:
>
>	% rsh io.cs.indiana.edu /bin/echo UP
>
>	and it will sit there for tens of hours. I've noticed that it only
>seems to occur on diskless nodes. I'm running SR9.7, Domain/IX 9.5, TCP 3.1,
>and the nodes boot from DSP90s with 3MBs, (one of the DSP90's is the gateway,
>but the other is a typical node in the network).

We have had the same problem talking to nodes within our Apollo network. On our system (6 DN3000's and a DSP90 server), both rsh and rlogin have the same problem. rlogin will try for a while and then return an "error 0" message, and rsh just hangs forever. If you kill the request (^C) and try again, the request always goes through.

I talked to Apollo service when we first saw this problem (when we installed SR9.7 with TCP3.0), and they said it was a problem with the routing tables. After some length of time, the routing table would seem to be out of date, and therefore the request would fail. However, that request would update the table, so the next rsh or rlogin would work just fine. I was told the problem was a known bug with SR9.7 and TCP 3.0 and was supposed to be fixed in 3.1. Well, it doesn't happen as often as it used to, but it is still a problem.

I would also be interested in any ideas on a work-around or fix for this problem.

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Ken Smelcer          Quintron Corporation - Quincy, Il.
UUCP: {elroy,lll-winken,laidbak}!spl1!quintro!kts
      or uunet!wucs1!wuibc!quintro!kts
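The kill-and-retry behaviour described above (a hung request that goes through when reissued) suggests one possible work-around for a polling script: put a time limit on each rsh and try once more before declaring the node down. What follows is only a minimal sketch in Python, assuming a present-day host with Python 3 and an rsh client on the path, neither of which is part of the setups described in this thread. The names probe and rsh_command, the 30-second limit, and the two-attempt cap are illustrative choices, not anything supplied by Apollo.

```python
import subprocess


def rsh_command(host):
    """Build the same check used in the thread: rsh <host> /bin/echo UP."""
    return ["rsh", host, "/bin/echo", "UP"]


def probe(command, timeout_s=30.0, attempts=2):
    """Return True if `command` prints UP within the time limit.

    Each attempt is killed after `timeout_s` seconds instead of being
    left to hang, and the command is reissued up to `attempts` times.
    Both values are assumptions; tune them for the network at hand.
    """
    for _ in range(attempts):
        try:
            result = subprocess.run(
                command,
                stdin=subprocess.DEVNULL,  # never wait on terminal input
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run has already killed the child; try again.
            continue
        if result.returncode == 0 and result.stdout.strip() == "UP":
            return True
    return False
```

A polling loop would then call probe(rsh_command("io.cs.indiana.edu")) for each node. This only keeps the script from stalling; it does nothing about the underlying routing-table problem.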
jec@iuvax.cs.indiana.edu (08/11/88)
I've noticed that ping also has some problems: If you ping an Apollo the first attempt will usually fail, but after that ping seems to work.

For instance the first time I try it:

[root@io:33]ping charybdis
PING charybdis.cs.indiana.edu: 56 data bytes
Timed out (1 second) waiting for echo reply        <--- fails
64 bytes from 98.0.0.38: icmp_seq=1. time=154. ms  <--- passes
64 bytes from 98.0.0.38: icmp_seq=2. time=26. ms
64 bytes from 98.0.0.38: icmp_seq=3. time=13. ms
64 bytes from 98.0.0.38: icmp_seq=4. time=17. ms
64 bytes from 98.0.0.38: icmp_seq=5. time=13. ms

The second time, however:

[root@io:36]!ping
ping charybdis
PING charybdis.cs.indiana.edu: 56 data bytes
64 bytes from 98.0.0.38: icmp_seq=0. time=23. ms   <--- passes
64 bytes from 98.0.0.38: icmp_seq=1. time=13. ms
64 bytes from 98.0.0.38: icmp_seq=2. time=14. ms
64 bytes from 98.0.0.38: icmp_seq=3. time=23. ms
64 bytes from 98.0.0.38: icmp_seq=4. time=14. ms
64 bytes from 98.0.0.38: icmp_seq=5. time=13. ms
feigin@batcomputer.tn.cornell.edu (Adam Feigin) (08/11/88)
In article <5400031@iuvax> jec@iuvax.cs.indiana.edu writes:
>
>	I've noticed that ping also has some problems: If you ping an
>Apollo the first attempt will usually fail, but after that ping seems
>to work.
>
>	For instance the first time I try it:
>
>[root@io:33]ping charybdis
>PING charybdis.cs.indiana.edu: 56 data bytes
>Timed out (1 second) waiting for echo reply <--- fails
>64 bytes from 98.0.0.38: icmp_seq=1. time=154. ms <--- passes
> ....
>	The second time, however:
>
>[root@io:36]!ping
>ping charybdis
>PING charybdis.cs.indiana.edu: 56 data bytes
>64 bytes from 98.0.0.38: icmp_seq=0. time=23. ms <--- passes

I don't seem to have this problem. Perhaps you need to set some options on your tcp_server when you start it up. You probably have the timeout option set too low.

apollo.lap csh[7]: ping gulag.sovcen 56 10
PING gulag.sovcen.upenn.edu: 56 data bytes
64 bytes from 128.91.17.137: icmp_seq=0. time=674. ms
64 bytes from 128.91.17.137: icmp_seq=1. time=11. ms
64 bytes from 128.91.17.137: icmp_seq=5. time=11. ms
.....

Adam
------------------------------------------------------------------------------
Internet: feigin@tcgould.tn.cornell.edu            Adam Feigin
Bitnet:   feigin@crnlthry                          Workstation Consultant
UUCP:     {backbones}!cornell!batcomputer!feigin   Cornell National Supercomputer
MaBell:   (607) 255-3985                           Facility, Visualization Group
                "Sometimes a little brain damage can help"
------------------------------------------------------------------------------
dennis@PEANUTS.NOSC.MIL (Dennis Cottel) (08/12/88)
> From: jec@iuvax.cs.indiana.edu
>
> If you ping an
> Apollo the first attempt will usually fail, but after that ping seems
> to work.

I've noticed that here as well. It doesn't happen every time, and seems more likely on the older nodes (DN320, DN550), so I attribute it to timing out before the appropriate part of the TCP server can be swapped in to answer.

Dennis Cottel
Naval Ocean Systems Center, San Diego, CA 92152
(619) 553-1645
dennis@nosc.MIL
sdcsvax!noscvax!dennis
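If the lost first echo is only a slow start, as suggested above, a monitoring script could tell "first probe lost, the rest answered" apart from "node down" rather than failing on the first timeout. Here is a minimal sketch in Python, assuming the reply lines look exactly like those quoted in this thread ("64 bytes from ...: icmp_seq=N. time=T. ms"); other ping versions print a different format. The function names and the three labels are hypothetical.

```python
import re

# Matches the sequence number in reply lines of the form shown in the
# thread, e.g. "64 bytes from 98.0.0.38: icmp_seq=1. time=154. ms".
REPLY_SEQ = re.compile(r"icmp_seq=(\d+)\.")


def reply_sequences(ping_output):
    """Return the icmp_seq numbers of every echo reply in the text."""
    return [int(m.group(1)) for m in REPLY_SEQ.finditer(ping_output)]


def classify(ping_output):
    """Label one ping run as 'down', 'up-after-warmup', or 'up'.

    'up-after-warmup' means replies arrived but probe 0 went unanswered,
    which is the pattern reported for the first ping of an Apollo.
    """
    seqs = reply_sequences(ping_output)
    if not seqs:
        return "down"
    if min(seqs) > 0:
        return "up-after-warmup"
    return "up"
```

Fed the two transcripts from the original report, classify gives "up-after-warmup" for the first run (replies start at icmp_seq=1) and "up" for the second (replies start at icmp_seq=0).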
krowitz@RICHTER.MIT.EDU (David Krowitz) (08/12/88)
Hmm, this is odd. I can use the BSD4.3 ping from my Alliant FX/40 to ping several different Apollos (1: the gateway between the ethernet and the ringnet, 2: a node on the ringnet, 3: another apollo which is the gateway from an ethernet at the University of Washington to the ringnet there, 4: a node on the U. of W. ringnet) with no problems. Is the problem occurring the first time you ping the Apollo after it has been booted, or the first time you try after waiting for some amount of time?

-- David Krowitz

krowitz@richter.mit.edu (18.83.0.109)
krowitz%richter@eddie.mit.edu
krowitz%richter@athena.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)

P.S. Our nodes are at SR9.7 running TCP 3.1