gretzky@unison.larc.nasa.gov (Mitch Wright) (03/04/89)
H-E-L-P-!
I currently have 2 Sun 3/60's running 4.0.1. One is a server with a 327Mb
disk and 4Mb of memory; the client is a 3/60 with a color monitor and 8Mb
of memory. They have been running smoothly (as smoothly as 4.0.1 can).
Well, trouble hit Friday when the client just up and died (went down to
the monitor prompt '>'). Tried rebooting it ...
>b
EEPROM boot device ... le(0,0,0)
Using IP Address 128.155.2.94 = 809B025E
Booting from tftp server at 128.155.2.83 = 809B0253
Downloaded 126056 bytes from tftp server.
Using IP Address 128.155.2.94 = 809B025E
le: missed packet
le: missed packet
le: missed packet
No bootparam server responding; still trying
le: missed packet
le: missed packet
and will continue like this until you L1-a the client. I have double and
triple checked all files in /etc/... I have rebooted the server (several
times). I have deleted the client completely and then added the client
again. Network traffic was quite slow, and I'm fresh out of ideas. I am
beginning to think it is a hardware problem, but I would hate to jump the
gun. I have had NO problems in the past with rebooting.
Any help will be greatly appreciated.
-=>gretzky<=-
..mitch
gretzky@eagle.larc.nasa.gov
gretzky@uxv.larc.nasa.gov
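
The monitor output above prints each address twice, once in dotted-quad form and once as eight hex digits. As a small sanity check (a sketch only; the helper name ip_to_hex is made up for illustration and is not part of any Sun tool), the pairing can be reproduced with Python's standard ipaddress module:

```python
import ipaddress


def ip_to_hex(dotted: str) -> str:
    """Return the 8-digit uppercase hex form of a dotted-quad IPv4 address."""
    # IPv4Address converts to a 32-bit integer; "08X" pads to 8 hex digits.
    return format(int(ipaddress.IPv4Address(dotted)), "08X")


# Addresses taken from the boot log in the post above.
print(ip_to_hex("128.155.2.94"))  # the diskless client
print(ip_to_hex("128.155.2.83"))  # the tftp server
```

Both values should match what the monitor printed, which at least confirms that the client is using the address you expect.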
ehrhart@aai8.istc.sri.com (Tim Ehrhart) (03/14/89)
> Well, trouble hit Friday when the client just up and died (went down to
> the monitor prompt '>'). Tried rebooting it ...
>
> >b
>
> EEPROM boot device ... le(0,0,0)
> Using IP Address 128.155.2.94 = 809B025E
> Booting from tftp server at 128.155.2.83 = 809B0253
> Downloaded 126056 bytes from tftp server.
> Using IP Address 128.155.2.94 = 809B025E
> le: missed packet
> le: missed packet
> le: missed packet
> No bootparam server responding; still trying
> le: missed packet
> le: missed packet

I experienced the same problems when I upgraded to 4.0 months ago. After
much head scratching and wire sniffing, here is what I discovered:

We had some VMS/VAXen on the wire running both DECnet and TCP/IP. There
are various versions of TCP/IP available for VMS, so your mileage may
vary. But nonetheless, most of them ~seem~ to be based on the PD version
of RPC from Sun.

What appears to happen is that when the client is requesting his
bootparam server (which corresponds to an indirect RPC request from the
portmapper to bootparamd), the portmapper process running on the VAX sends
back the wrong response. If it can't satisfy the request, it should simply
NOT ANSWER; instead it sends back an RPC error message. (I can't remember
exactly what the message was, it has been a while, but I think it was
"RPC service unavailable").

We have/had quite a few of these beasts, so the poor diskless client was
inundated with bogus RPC replies from the VAXen. The client didn't like
this, so it proceeded to send ICMP messages back to the VAXen ????. Just
about at the timeout of the request, the appropriate file server would
FINALLY respond (about 9ms later), but by then the client had timed out
his request and dropped the response packet from the file server, which
then started the process all over again.

Try to prove this by isolating your client and its file server from the
rest of the net and attempting the boot again. This is simple for me to do
because we make copious use of multi-port boxes.
Failing that, get out either tcpdump or etherfind and watch all packets
going to and from the affected client. It was AMAZING to watch how fast
the VAXen were pummeling the poor client (reply time was about 1ms), then
finally about 9ms later the file server replied. In my case, the file
server was a Sun-4 on the same multi-port box right beside the client, and
the VAXen were on distant parts of our campus ethernet.

Tim Ehrhart
ehrhart@spam.istc.sri.com
SRI International
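
The race Tim describes can be illustrated with a toy model. This is only a sketch of the timing argument, not the actual boot PROM or bootparam client logic: the host labels, the timeout values, and both client functions are assumptions made up for illustration, and the arrival times are rounded from the figures in the post (error replies at about 1ms, the real file server roughly 9ms after that).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reply:
    sender: str        # hypothetical host label
    arrival_ms: float  # time after the broadcast request, in ms
    usable: bool       # False models an RPC error reply


def naive_client(replies: List[Reply]) -> Optional[str]:
    """Treat the earliest reply as the answer, even if it is an error."""
    if not replies:
        return None
    first = min(replies, key=lambda r: r.arrival_ms)
    return first.sender if first.usable else None


def tolerant_client(replies: List[Reply], timeout_ms: float) -> Optional[str]:
    """Skip error replies; take the earliest usable one within the timeout."""
    usable = [r for r in replies if r.usable and r.arrival_ms <= timeout_ms]
    if not usable:
        return None
    return min(usable, key=lambda r: r.arrival_ms).sender


# Assumed timings, loosely based on the post.
replies = [
    Reply("vax-a", 1.0, usable=False),
    Reply("vax-b", 1.1, usable=False),
    Reply("fileserver", 10.0, usable=True),
]

print(naive_client(replies))                      # derailed by the bogus replies
print(tolerant_client(replies, timeout_ms=50.0))  # waits long enough
print(tolerant_client(replies, timeout_ms=9.0))   # times out just too early
```

In this model the boot only succeeds when error replies are ignored and the timeout is longer than the file server's response time. Isolating the client and server from the rest of the net, as suggested above, removes the bogus replies altogether.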
rsd@iroquois.dal.utexas.edu (Shane Davis) (04/07/89)
Been slow to deal with mail this month...better late than never, I
reckon...

Tim Ehrhart <ehrhart@aai8.istc.sri.com>:
...
>What appears to happen is that when the client is requesting his
>bootparam server (which corresponds to an indirect RPC request from the
>portmapper to bootparamd), the portmapper process running on the VAX sends
>back the wrong response. If it can't satisfy the request, it should simply
>NOT ANSWER, instead it sends back an RPC error message....

VAXen aren't the only culprits. We had 2 diskless 3/60's attempting to
boot off a 3/280 that was only 15 feet away from them, and a TI Explorer
from the other side of the campus stuck its nose into the boot process in
the same unfriendly manner. We isolated the Cabletron that all of the Suns
were on from the rest of the net and they booted normally.

--Shane Davis
VM and UNIX Systems Programmer
Univ. of Texas at Dallas Academic Computer Center
SHANE@UTDALVM1{.BITNET|.dal.utexas.edu} or rsd@dal.utexas.edu