tpm@eng.cam.ac.uk (tim marsland) (05/09/90)
We just had an NFS fileserver crash bring down an entire cluster
because (it seems) the cluster server got locked up waiting for the
(hard mounted) server to come back up, and the cluster clients
panicked after 30 seconds of waiting.

We're going to investigate the circumstances some more, and try out
ways around this particular problem, but it reminded me of the
question I'd been meaning to ask an HP wizard for a while which is:

Why _does_ a cluster client invoke panic() when its cluster server
stops responding? Why does it simply not sleep-and-retry?

If it's important to sync everything at cluster server boot time, then
why not get the cluster server to reboot its clients whilst it's coming
up?

Apologies in advance if this is a frequently asked question, or is
already somewhere in the fine manual. Just curious.

tim marsland,
information engineering division,
cambridge university engineering dept.
markl@hpbbi4.HP.COM (#Mark Lufkin) (05/29/90)
> > Why _does_ a cluster client invoke panic() when its cluster server
> > stops responding? Why does it simply not sleep-and-retry?
> >
> > The idea of doing this is to take into account the possibility
> > that the LAN hardware on the client is bad. Say the client could
> > SEND packets (so the server thinks it is alive) but cannot RECEIVE
> > packets (so it sees the server as being dead). In this case it holds
> > resources on the server but is effectively inoperable. To be certain,
> > the client simply panics if it loses contact with the server (after
> > going through a reasonable retry period and running the landiag
> > routines to see if there is a broken cable - in which case it WILL
> > wait indefinitely).
>
> Are you serious? You mean that in order to guard against an unlikely
> failure mode of your Ethernet hardware (mtbf anyone?) HP-UX chooses to
> always crash the machine? To paraphrase, what you're saying is that if
> the server loses contact for whatever reason (perhaps it crashes, is being
> dumped, or, say, freezes because of an NFS hard mount on a dead/dumping
> server, though not from broken cables) ==all the clients suicide on the
> off-chance that their Ethernet hardware has half broken.
> Wouldn't the alternative of simply waiting for something rather
> more likely to happen (e.g. the server recovers) be friendlier to your
> average user who was rather hoping to just wait five minutes for the
> server to reboot.. before getting back to what he/she was doing?
>
> Sun, DEC (and other vendors) diskless workstations happily survive server
> crashes/dumps -- do we assume they have significantly more reliable
> Ethernet hardware ;-)

The HPUX diskless implementation is stateful - that is to say, the
server keeps info on resources being used by the client machines.
Thus if the server crashes and comes back up again it will not have
info that it requires. The information kept is things like which
nodes have which files open and whether they are open for read or
write. This info is used to allow synchronisation of the buffer
caches and provides an altogether more robust system. Because of
this state information the client MUST panic if it loses contact
with the server. Note also that it does not panic immediately (the
number of retries can be set in the kernel) and it does a check to
make sure that the network has not been temporarily opened or
whatever.

The question of which implementation is right - stateful or
stateless - is open for discussion and people have different opinions
depending on who you talk to. Stateless is robust in the face of
server crashes; however, the stateful implementation allows true un*x
semantics.

>
> tim marsland, <tpm@eng.cam.ac.uk>
> information engineering division,
> cambridge university engineering dept.
> ----------

Mark Lufkin
WG-EMC OS Technical Support
HP GmbH, Boeblingen, W.Germany
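As a rough illustration of the state described above (which nodes have which files open, and whether for read or write), here is a minimal sketch in C. Every name in it (`server_state`, `server_record_open`, `server_may_cache`, `server_reboot`) is hypothetical, and the caching rule is a deliberately simplified assumption, not the actual HP-UX diskless logic.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names throughout; this is not HP-UX source. */
enum open_mode { OPEN_READ, OPEN_WRITE };

struct open_entry {
    int client_id;        /* cluster node holding the file open */
    int file_id;          /* stand-in for an inode number */
    enum open_mode mode;  /* open for read or for write */
    int in_use;           /* nonzero if this slot is occupied */
};

#define MAX_OPENS 64

struct server_state {
    struct open_entry opens[MAX_OPENS];
};

/* Record that a client opened a file. Returns 0, or -1 if the table is full. */
int server_record_open(struct server_state *s, int client_id, int file_id,
                       enum open_mode mode)
{
    assert(s != NULL);
    for (int i = 0; i < MAX_OPENS; i++) {
        if (!s->opens[i].in_use) {
            s->opens[i] = (struct open_entry){ client_id, file_id, mode, 1 };
            return 0;
        }
    }
    return -1;
}

/* Simplified rule assumed for this sketch: clients may cache a file's
 * blocks only while the table shows nobody holding it open for write. */
int server_may_cache(const struct server_state *s, int file_id)
{
    assert(s != NULL);
    for (int i = 0; i < MAX_OPENS; i++) {
        const struct open_entry *e = &s->opens[i];
        if (e->in_use && e->file_id == file_id && e->mode == OPEN_WRITE)
            return 0;
    }
    return 1;
}

/* A server crash and reboot: the in-memory table comes back empty. */
void server_reboot(struct server_state *s)
{
    assert(s != NULL);
    memset(s, 0, sizeof *s);
}
```

In this sketch, once `server_reboot()` has run, `server_may_cache()` answers "yes" for a file that some client may still be writing. That kind of inconsistency between server and clients appears to be what the panic is meant to rule out.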
cph@zurich.ai.mit.edu (Chris Hanson) (06/01/90)
In article <1720008@hpbbi4.HP.COM> markl@hpbbi4.HP.COM (#Mark Lufkin) writes:
From: markl@hpbbi4.HP.COM (#Mark Lufkin)
Newsgroups: comp.sys.hp
Date: 28 May 90 17:20:40 GMT
The HPUX diskless implementation is stateful - that is to say, the
server keeps info on resources being used by the client machines.
Thus if the server crashes and comes back up again it will not have
info that it requires. The information kept is things like which
nodes have which files open and whether they are open for read or
write. This info is used to allow synchronisation of the buffer
caches and provides an altogether more robust system. Because of
this state information the client MUST panic if it loses contact
with the server. Note also that it does not panic immediately (the
number of retries can be set in the kernel) and it does a check to
make sure that the network has not been temporarily opened or
whatever. The question of which implementation is right -
stateful or stateless - is open for discussion and people have
different opinions depending on who you talk to. Stateless is
robust in the face of server crashes; however, the stateful
implementation allows true un*x semantics.
Mark Lufkin
WG-EMC OS Technical Support
HP GmbH, Boeblingen, W.Germany
In general I have found that HP's diskless implementation works very
well, outperforming NFS by a considerable margin -- indeed I suspect
that the "statefulness" of this implementation is necessary to achieve
this kind of behavior. We have been using the software for quite a
while and are generally very happy with it. But having all of the
machines crash whenever the server goes down or the network is broken
is really a pain.
It seems to me that the diskless protocol could be extended in an
upwards-compatible way so that the client supplied the necessary
information to the server to permit resynchronization when the server
had lost the information for one reason or another. Certainly the
client knows all of the relevant state, such as what files are open.
Presumably there are some situations in which complete
resynchronization is impossible -- such as the client having an
enforcement-mode lock on some file which the server has given away to
another client while the first client was out of touch -- and in such
a case the client's processes that depend on that particular bit of
state ought to get errors and lose, but the remaining processes should
still be able to win. And I suspect that "impossible
resynchronization" is really quite rare, so that even if it has fairly
catastrophic consequences, that may not be much of a problem.
A little cleverness in the design of this software could eliminate a
lot of headaches for customers.
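The resynchronization idea above could be sketched roughly as follows: after the server comes back, each client replays what it believes it holds, and only the claims that conflict with what the server has since given away are rejected, so only the dependent processes get errors. The types and functions here (`claim`, `lock_entry`, `resync_one`, `resync_all`) are invented for illustration and do not correspond to any real HP-UX protocol message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
struct claim {
    int client_id;   /* node replaying its state */
    int file_id;     /* file the claim refers to */
    int wants_lock;  /* nonzero if the client believes it holds a lock */
    int pid;         /* client process that depends on this state */
};

struct lock_entry {
    int file_id;
    int owner_client;  /* node the server currently records as lock owner */
};

enum resync_result { RESYNC_OK, RESYNC_CONFLICT };

/* Check one replayed claim against the locks the server knows about.
 * Assumption: plain opens can always be re-registered; only a lock that
 * now belongs to a different client is an impossible case. */
enum resync_result resync_one(const struct lock_entry *locks, size_t nlocks,
                              const struct claim *c)
{
    assert(c != NULL);
    if (!c->wants_lock)
        return RESYNC_OK;
    for (size_t i = 0; i < nlocks; i++) {
        if (locks[i].file_id == c->file_id &&
            locks[i].owner_client != c->client_id)
            return RESYNC_CONFLICT;
    }
    return RESYNC_OK;
}

/* Replay every claim. The pids of processes whose state could not be
 * restored are written to failed[], which must have room for nclaims
 * entries. Returns how many were written. */
size_t resync_all(const struct lock_entry *locks, size_t nlocks,
                  const struct claim *claims, size_t nclaims, int *failed)
{
    size_t nfailed = 0;
    for (size_t i = 0; i < nclaims; i++) {
        if (resync_one(locks, nlocks, &claims[i]) == RESYNC_CONFLICT)
            failed[nfailed++] = claims[i].pid;
    }
    return nfailed;
}
```

The point of returning a list of pids instead of a single pass/fail result is that one lost lock would then cost one process, not the whole workstation.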
keith@picard.eng.ohio-state.edu (Keith M Boyer) (06/01/90)
In article <CPH.90Jun1021135@kleph.ai.mit.edu> cph@zurich.ai.mit.edu (Chris Hanson) writes:
>It seems to me that the diskless protocol could be extended in an
>upwards-compatible way so that the client supplied the necessary
>information to the server to permit resynchronization when the server
>had lost the information for one reason or another. Certainly the
>client knows all of the relevant state, such as what files are open.
>
>A little cleverness in the design of this software could eliminate a
>lot of headaches for customers.

While I would also like to see the clients behave more robustly, I can
see that the problems of true resynchronization might be difficult.

In our environment here at OSU it would be Damn Nice if the client would
AT LEAST try to reboot automatically after it fails. We have approx. 80
HP workstations in 5 different locations. Having to walk from one building
to another to cycle power is, at best, inconvenient.

++keith
-=-
- Keith M. Boyer          Department of Computer and Information Science
-- THE Ohio State University  2036 Neil Ave. Columbus OH USA 43210-1277
- keith@cis.ohio-state.edu  or  ...!osu-cis!cis.ohio-state.edu!keith
EVERYTHING SHOULD BE MADE AS SIMPLE AS POSSIBLE, BUT NOT SIMPLER - Albert Einstein
perry@hpfcdc.HP.COM (Perry Scott) (06/02/90)
>In our environment here at OSU it would be Damn Nice if the client would
>AT LEAST try to reboot automatically after it fails. We have approx. 80
>HP workstations in 5 different locations. Having to walk from one building
>to another to cycle power is, at best, inconvenient.
>
>- Keith M. Boyer          Department of Computer and Information Science
>-- THE Ohio State University  2036 Neil Ave. Columbus OH USA 43210-1277

Nice enhancement. There are a few problems with implementation.
panic() actually does some nice things, like savecore() for "real"
panics. It also puts something on the screen for the support people.
So clearly, panic() isn't where the problem lies.

Maybe instead of calling

    panic("lost contact");

we need to simply

    printf("lost contact");

followed by a delay, followed by a call to reboot().

I'll put in an enhancement request, and see where it goes.

Perry Scott
HP Ft Collins
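Putting the pieces of this thread together, the client's decision on losing contact might look something like the sketch below: keep retrying up to a tunable limit, wait indefinitely if the cable check reports a break, and otherwise either panic (the behaviour described earlier in the thread) or print, delay and reboot (the suggestion above). The `policy` structure, the `reboot_on_loss` switch and the function name are assumptions made for illustration; the real kernel tunable and the landiag interface are not shown in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; a user-space model, not kernel code. */
enum action {
    ACT_KEEP_RETRYING,   /* still inside the retry period */
    ACT_WAIT_FOR_CABLE,  /* local cable fault: wait indefinitely */
    ACT_PANIC,           /* behaviour described earlier in the thread */
    ACT_REBOOT           /* suggested: message, delay, then reboot */
};

struct policy {
    int max_retries;     /* retry limit, said to be settable in the kernel */
    int reboot_on_loss;  /* 0 = panic, 1 = print, delay and reboot */
};

/* Decide what a client should do after `retries_so_far` unanswered
 * attempts, given the result of a local cable check. */
enum action lost_contact_action(const struct policy *p, int retries_so_far,
                                int cable_broken)
{
    assert(p != NULL && p->max_retries >= 0);
    if (retries_so_far < p->max_retries)
        return ACT_KEEP_RETRYING;
    if (cable_broken)
        return ACT_WAIT_FOR_CABLE;
    return p->reboot_on_loss ? ACT_REBOOT : ACT_PANIC;
}
```

Keeping the decision in a pure function like this makes the two policies easy to compare: they differ only in the final branch, which matches the observation above that panic() itself is not where the problem lies.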
perry@hpfcdc.HP.COM (Perry Scott) (06/07/90)
Re: I'll put in an enhancement request, and see where it goes.

I did just that, and it's already fixed in the development kernel that's
currently under test, and is slated for the next 300/800 HP-UX release.

Perry Scott
rml@hpfcdc.HP.COM (Bob Lenk) (06/08/90)
In article <CPH.90Jun1021135@kleph.ai.mit.edu> cph@zurich.ai.mit.edu (Chris Hanson) writes:
> It seems to me that the diskless protocol could be extended in an
> upwards-compatible way so that the client supplied the necessary
> information to the server to permit resynchronization when the server
> had lost the information for one reason or another. Certainly the
> client knows all of the relevant state, such as what files are open.

The relevant information can include things the client doesn't keep
track of, such as what data was in the server's buffer cache and what
the layout of the server's swap space was.

> A little cleverness in the design of this software could eliminate a
> lot of headaches for customers.

I think it's possible to address this (as well as making sense of the
information the client does have), but it would take significant effort.

Bob Lenk
rml@hpfcla.hp.com
hplabs!hpfcla!rml