patrick@casbs.stanford.edu (Patrick Goebel) (02/19/91)
We have a diskless 4/60 client feeding off a 4/330 server (both running SunOS 4.0.3). For almost a year now I have had virtually no problems with this arrangement. This past weekend I rebooted the diskless client and discovered to my dismay that it couldn't make it through the boot sequence! Specifically:

(1) The Power On memory test is passed.
(2) The bootstrapping program is loaded from the server disk (as is evidenced by the counting in hex up to 21a00 on the client console, and the appearance of the tftp daemon on the server).
(3) The freshly loaded boot program successfully retrieves the client's IP address from the server.
(4) The client hangs!

The resulting display on the client's console looks like this:

    Ethernet 8:0:20:8:85:a7 Host ID 5100dc39
    Testing
    Booting from: le(0,0,0)vmunix
    21a00
    Using IP Address 36.30.0.11 = 241E000B

This is what I have tried so far:

(1) at least 10 reboot attempts, including several power cycles with up to 24 hours between retries
(2) swapped out the transceiver cable
(3) swapped out the transceiver
(4) switched ports on the (David Systems) concentrator
(5) swapped out the CPU (thanks to Habib from Sun), including an upgrade of the boot PROM to Rev 1.3
(6) tried booting with vmunix.generic

Here are some oddities about our system:

    pcnfs (Network file system for PC's)
    afs (Andrew File System--however, all of SunOS is obtained from local disk)
    erpc (for the Xylogics Annex II terminal server)
    shared libraries (/usr/lib/libc.so.1.3.1, for domain name service)

We are NOT running NIS (formerly YP).

I should point out that I have previously rebooted the 4/60 with all the above in full force. Furthermore, the 4/330 server is running as smoothly as ever, with the exception of an inexplicable crash when exiting the X11/NeWS server a few days ago.
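As a sanity check on the "Using IP Address" line in the console display above, the hex value can be recomputed from the dotted-quad address. This is only a minimal sketch: the helper name ip_to_hex is made up for illustration, and all it does is confirm that the two forms printed by the boot program agree with each other.

```python
import ipaddress


def ip_to_hex(dotted):
    """Return an IPv4 address as eight uppercase hex digits (hypothetical helper)."""
    # IPv4Address validates the dotted quad; int() gives its 32-bit value.
    return format(int(ipaddress.IPv4Address(dotted)), "08X")


# Address reported by the client's boot program.
print(ip_to_hex("36.30.0.11"))  # prints 241E000B
```

Each octet maps to two hex digits (36 = 24, 30 = 1E, 0 = 00, 11 = 0B), so the address the boot program reports is at least internally consistent.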
Finally, I should note an observation concerning an intermittent pair of console messages we received on the 4/60 about once a day for the two weeks prior to the boot problem:

Feb 8 17:35:58 quercus.Stanford.EDU vmunix: le0: Transmission stopped
Feb 8 17:35:58 quercus.Stanford.EDU vmunix: le0: csr: 0x2e3<TINT,INTR,INEA,RXON,STRT,INIT>

It was this pair of messages that led us to focus so much on the ethernet hardware. However, note that the bootstrapping program is apparently being loaded from the server disk and the client's IP address is being found also from the server.

I can't explain it. Any ideas?

R. Patrick Goebel                 E-MAIL: patrick@casbs.Stanford.EDU
Network Administrator             VOICE:  (415) 321-2052
CASBS, 202 Junipero Serra Blvd.   FAX:    (415) 321-1192
Stanford, CA 94305                BEEPER: Temporarily Out of Order...
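For anyone wanting to double-check the csr value in the le0 messages above, here is a rough sketch that decodes 0x2e3 into flag names. It assumes le0 is the AMD Am7990 LANCE and that the bit layout below matches the CSR0 register in that chip's data sheet; the table and the decode_csr0 name are illustrative, not taken from the SunOS driver source.

```python
# Assumed CSR0 bit layout for the AMD Am7990 LANCE (highest bit first).
CSR0_BITS = [
    (0x8000, "ERR"), (0x4000, "BABL"), (0x2000, "CERR"), (0x1000, "MISS"),
    (0x0800, "MERR"), (0x0400, "RINT"), (0x0200, "TINT"), (0x0100, "IDON"),
    (0x0080, "INTR"), (0x0040, "INEA"), (0x0020, "RXON"), (0x0010, "TXON"),
    (0x0008, "TDMD"), (0x0004, "STOP"), (0x0002, "STRT"), (0x0001, "INIT"),
]


def decode_csr0(value):
    """Return the names of the bits set in value (hypothetical helper)."""
    return [name for mask, name in CSR0_BITS if value & mask]


flags = decode_csr0(0x2E3)
print(",".join(flags))  # prints TINT,INTR,INEA,RXON,STRT,INIT
print("TXON" in flags)  # prints False
```

Under that assumed layout the decoded names match the list the kernel printed, and the one thing missing is TXON: the receiver is on but the transmitter is not, which at least agrees with the "Transmission stopped" message.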