rich@eddie.MIT.EDU (Richard Caloggero) (07/02/88)
We recently encountered problems booting one of our diskless nodes. It seemed that the node couldn't find its partner. We checked all the relevant files and made sure netman was alive on the partner. All seemed in order to us, but the diskless node just refused to cooperate. Finally, we decided to boot it explicitly from another disked node (one other than its original partner). This worked. We then tried booting explicitly from its original partner. Again, this worked fine. So the question remains: why can't this poor diskless node get help from its partner? Is the partner not responding to the node's plea for help, or is the node not asking for help? Does anyone out there have any ideas? Please e-mail your responses to me and I'll post the results.

-- Rich (rich@eddie.mit.edu).
The circle is open, but unbroken. Merry meet, merry part, and merry meet again.
jec@iuvax.cs.indiana.edu (James E. Conley) (07/02/88)
We've had something like this happen on occasion, and it has usually turned out to be either:

(1) the /sys/net/diskless_list was wrong on the boot partner, or
(2) the node was listed in multiple /sys/net/diskless_list files.

Usenet: iuvax!jec
ARPANet: jec@iuvax.cs.indiana.edu
Phone: (812) 335-7729
U.S. Mail: Indiana University, Dept. of Computer Science, 021-E Lindley Hall, Bloomington, IN. 47405
(Home of Bob Knight and the Indiana Hoosiers)
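Cause (2) can be checked mechanically once the diskless_list files from each candidate boot partner have been gathered in one place. What follows is only a minimal sketch in Python, not an Apollo-supplied tool. It assumes, without confirmation from this thread, that each file is plain text with one hexadecimal node ID as the first token on a line and that '#' starts a comment. The function names and the partner labels are hypothetical.

```python
from collections import defaultdict


def parse_node_ids(text):
    """Return the set of node IDs found in the text of one diskless_list.

    ASSUMED format (not confirmed by the thread): the node ID is the first
    whitespace-separated token on a line, and '#' starts a comment.
    IDs are lowercased so that '3C4D' and '3c4d' compare equal.
    """
    ids = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            ids.add(line.split()[0].lower())
    return ids


def find_multiply_listed(lists_by_partner):
    """Map each node ID listed by more than one partner to those partners.

    lists_by_partner: dict of partner label -> contents of that partner's
    diskless_list file (hypothetical labels; read the files however you like).
    """
    partners_by_node = defaultdict(set)
    for partner, text in lists_by_partner.items():
        for node_id in parse_node_ids(text):
            partners_by_node[node_id].add(partner)
    return {
        node_id: sorted(partners)
        for node_id, partners in partners_by_node.items()
        if len(partners) > 1
    }


if __name__ == "__main__":
    # Made-up node IDs, purely for illustration.
    example = {
        "partner_a": "1a2b\n3c4d  # lab node\n",
        "partner_b": "3C4D\n5e6f\n",
    }
    print(find_multiply_listed(example))
```

Any node ID that appears in the result is being offered by more than one partner, which is the situation described in (2).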
jec@iuvax.cs.indiana.edu (James E. Conley) (07/02/88)
I should point out that we upgraded the boot PROMs on all of our diskless nodes about a year ago. It seems there were some bugs a while back, but I doubt that Apollo has been shipping those PROMs recently. If you have an older diskless machine (DN3000s in our case), you might check the PROM version (do an 'RE' at the '>' prompt). We are now running version 5.3 PROMs and things work better.
tmac@caen.engin.umich.edu (thomas allen mcleary) (07/02/88)
In article <9622@eddie.MIT.EDU>, rich@eddie.MIT.EDU (Richard Caloggero) writes:
> We recently encountered problems booting one of our diskless
> nodes. It seemed that the node couldn't find its partner. We
> checked all the relavant files and made sure netman was alive on the
> partner. All seemed in order to us, but the diskless node just
> refused to cooperate with us.

If you send me e-mail detailing exactly what the screen displays during your attempts, I can probably help you. Some things to check:

1) See if the node you're booting off of has been netsvc'ed.
2) Make sure the disked node has a /sys/node_data.<nodeid>.
3) Try RE, then DI N <disked node id>, then EX AEGIS.

The best way for me to help you is if you e-mail me.

--------------------------------------------------------------------------------
ARPAnet: tmac@caen.engin.umich.edu
USMAILnet: Tom McLeary, Computer Operations Support, Univ. of Michigan/CAEN, 231 Chrysler Center, Ann Arbor, Mi. 48109
BELLnet: (313) 936-3497
"It's not whether you win or lose; it's what you drive home in."
--------------------------------------------------------------------------------
rees@A.CC.UMICH.EDU (Jim Rees) (07/02/88)
> We recently encountered problems booting one of our diskless nodes.
> It seemed that the node couldn't find its partner. We checked all the
> relavant files and made sure netman was alive on the partner. All
> seemed in order to us, but the diskless node just refused to
> cooperate with us.

One way to narrow this down is to kill off the netman on the disked partner, then restart it in a window (just run /sys/net/netman). Then you'll be able to see if it actually got the request, and if so, what it did with it.
achille@cernvax.UUCP (achille) (07/04/88)
I've seen another problem with diskless nodes appearing at sr9.5 (sr9.6 actually): when a diskless node crashes (or is reset), it leaves its `node_data/boot_shell file locked, and the next boot sequence then fails with a 'uid request failed' message. The workaround (???) I've found is to "dlf -du" the boot_shell file at the end of the `node_data/startup.whatever file. This seems to work fine.

Hope this helps,
Achille Petrilli