[comp.sys.hp] why do cluster clients panic after cluster servers die

tpm@eng.cam.ac.uk (tim marsland) (05/09/90)

We just had an NFS fileserver crash bring down an entire cluster because
(it seems) the cluster server got locked up waiting for the (hard mounted)
server to come back up, and the cluster clients panicked after 30 seconds
of waiting.  We're going to investigate the circumstances some more, and
try out ways around this particular problem, but it reminded me of the
question I'd been meaning to ask an HP wizard for a while which is:

	Why _does_ a cluster client invoke panic() when its cluster server
        stops responding?  Why does it simply not sleep-and-retry?

If it's important to sync everything at cluster server boot time, then why
not get the cluster server to reboot its clients whilst it's coming up?

Apologies in advance if this is a frequently asked question, or is already
somewhere in the fine manual.

Just curious.

tim marsland,
information engineering division,
cambridge university engineering dept.

markl@hpbbi4.HP.COM (#Mark Lufkin) (05/29/90)

> >	Why _does_ a cluster client invoke panic() when its cluster server
> >        stops responding?  Why does it simply not sleep-and-retry?
> >
> >	The idea of doing this is to take into account the possibility
> >	that the LAN hardware on the client is bad. Say the client could
> >	SEND packets (so the server thinks it is alive) but cannot RECEIVE
> >	packets (so it sees the server as being dead). In this case it holds
> >	resources on the server but is effectively inoperable. To be certain,
> >	the client simply panics if it loses contact with the server (after
> >	going through a reasonable retry period and running the landiag 
> >	routines to see if there is a broken cable - in which case it WILL
> >	wait indefinitely).
> 
> Are you serious?  You mean that in order to guard against an unlikely
> failure mode of your Ethernet hardware (mtbf anyone?) HP-UX chooses to
> always crash the machine?  To paraphrase, what you're saying is that if
> the server loses contact for whatever reason (perhaps it crashes, is being
> dumped, or, say, freezes because of a nfs hard mount on a dead/dumping
> server, though not from broken cables) then all the clients suicide on the
> off-chance that their Ethernet hardware has half broken.
> 	Wouldn't the alternative of simply waiting for something rather
> more likely to happen (e.g. the server recovers) be friendlier to your
> average user who was rather hoping to just wait five minutes for the
> server to reboot.. before getting back to what he/she was doing?
> 
> Sun, DEC (and other vendors') diskless workstations happily survive server
> crashes/dumps -- do we assume they have significantly more reliable
> Ethernet hardware ;-)

	The HPUX diskless implementation is stateful - that is to say, the
	server keeps info on resources being used by the client machines.
	Thus if the server crashes and comes back up again it will not have
	info that it requires. The information kept is things like which
	nodes have which files open and whether they are open for read or
	write. This info is used to allow synchronisation of the buffer
	caches and provides an altogether more robust system. Because of
	this state information the client MUST panic if it loses contact
	with the server. Note also that it does not panic immediately (the
	number of retries can be set in the kernel) and it does a check to
	make sure that the network has not been temporarily opened or
	whatever. The question of which implementation is right -
	stateful or stateless - is open for discussion, and people have
	different opinions depending on who you talk to. Stateless is
	robust in the face of server crashes; however, the stateful
	implementation allows true un*x semantics.

> 
> tim marsland, <tpm@eng.cam.ac.uk>
> information engineering division,
> cambridge university engineering dept.
> ----------

Mark Lufkin
WG-EMC OS Technical Support
HP GmbH, Boeblingen, W.Germany
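
A minimal sketch of why the server-side state matters, written as a toy
Python model rather than anything taken from HP-UX. The class name, the
method names, and the caching rule are all assumptions made for
illustration; the thread only says that the server records which nodes have
which files open, and whether for read or write, in order to synchronise
the buffer caches.

```python
class ClusterServer:
    """Toy model of a stateful cluster server (illustrative, not HP-UX code)."""

    def __init__(self):
        # path -> {node: mode}, where mode is "r" (read) or "w" (write)
        self.open_files = {}

    def open(self, node, path, mode):
        self.open_files.setdefault(path, {})[node] = mode

    def close(self, node, path):
        holders = self.open_files.get(path, {})
        holders.pop(node, None)
        if not holders:
            self.open_files.pop(path, None)

    def caching_allowed(self, path):
        """Assumed rule: client-side caching is safe only if at most one node
        has the file open, or every node has it open read-only."""
        holders = self.open_files.get(path, {})
        has_writer = "w" in holders.values()
        return len(holders) <= 1 or not has_writer

    def crash_and_reboot(self):
        # The table lives in memory only, so a crash wipes it.
        self.open_files = {}


server = ClusterServer()
server.open("node_a", "/data/log", "w")
server.open("node_b", "/data/log", "r")
before_crash = server.caching_allowed("/data/log")  # writer plus reader

server.crash_and_reboot()
after_crash = server.caching_allowed("/data/log")   # table is now empty
```

Before the crash the model refuses caching because one node writes while
another reads. After the reboot it would allow caching, even though both
clients still believe they have the file open. That mismatch is the
situation the panic is meant to rule out.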

cph@zurich.ai.mit.edu (Chris Hanson) (06/01/90)

In article <1720008@hpbbi4.HP.COM> markl@hpbbi4.HP.COM (#Mark Lufkin) writes:

   From: markl@hpbbi4.HP.COM (#Mark Lufkin)
   Newsgroups: comp.sys.hp
   Date: 28 May 90 17:20:40 GMT

	   The HPUX diskless implementation is stateful - that is to say, the
	   server keeps info on resources being used by the client machines.
	   Thus if the server crashes and comes back up again it will not have
	   info that it requires. The information kept is things like which
	   nodes have which files open and whether they are open for read or
	   write. This info is used to allow synchronisation of the buffer
	   caches and provides an altogether more robust system. Because of
	   this state information the client MUST panic if it loses contact
	   with the server. Note also that it does not panic immediately (the
	   number of retries can be set in the kernel) and it does a check to
	   make sure that the network has not been temporarily opened or
	   whatever. The question of which implementation is right -
	   stateful or stateless - is open for discussion, and people have
	   different opinions depending on who you talk to. Stateless is
	   robust in the face of server crashes; however, the stateful
	   implementation allows true un*x semantics.

   Mark Lufkin
   WG-EMC OS Technical Support
   HP GmbH, Boeblingen, W.Germany

In general I have found that HP's diskless implementation works very
well, outperforming NFS by a considerable margin -- indeed I suspect
that the "statefulness" of this implementation is necessary to achieve
this kind of behavior.  We have been using the software for quite
a while and are generally very happy with it.  But having all of the
machines crash whenever the server goes down or the network is broken
is really a pain.

It seems to me that the diskless protocol could be extended in an
upwards-compatible way so that the client supplied the necessary
information to the server to permit resynchronization when the server
had lost the information for one reason or another.  Certainly the
client knows all of the relevant state, such as what files are open.
Presumably there are some situations in which complete
resynchronization is impossible -- such as the client having an
enforcement-mode lock on some file which the server has given away to
another client while the first client was out of touch -- and in such
a case the client's processes that depend on that particular bit of
state ought to get errors and lose, but the remaining processes should
still be able to win.  And I suspect that "impossible
resynchronization" is really quite rare, so that even if it has fairly
catastrophic consequences, that may not be much of a problem.

A little cleverness in the design of this software could eliminate a
lot of headaches for customers.
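
The resynchronization idea above might be sketched roughly as follows. This
is a hypothetical Python model, not the HP diskless protocol: the function
name, the shape of a "claim", and the rule that only lock claims can
conflict are assumptions made for illustration.

```python
def resynchronize(current_locks, claims):
    """Rebuild the server's tables from what the clients report.

    current_locks: dict of path -> node that holds the lock right now
                   (it may have been given away while a client was
                   out of touch).
    claims:        list of (node, path, holds_lock) tuples sent by clients.

    Returns (open_table, locks, rejected). A rejected claim means the
    processes depending on it should get errors; everything else carries on.
    """
    locks = dict(current_locks)
    open_table = {}
    rejected = []
    for node, path, holds_lock in claims:
        owner = locks.get(path)
        if holds_lock and owner not in (None, node):
            # Lock was handed to someone else: cannot be resynchronized.
            rejected.append((node, path))
            continue
        if holds_lock:
            locks[path] = node
        open_table.setdefault(path, set()).add(node)
    return open_table, locks, rejected


# node_b was given the lock on /db/accounts while node_a was unreachable.
current_locks = {"/db/accounts": "node_b"}
claims = [
    ("node_a", "/db/accounts", True),   # conflicts with node_b
    ("node_a", "/tmp/scratch", False),  # plain open, no conflict
    ("node_c", "/src/main.c", True),    # lock nobody else holds
]
open_table, locks, rejected = resynchronize(current_locks, claims)
```

Only node_a's claim on /db/accounts is rejected in this example; its other
open file and node_c's lock are restored.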

keith@picard.eng.ohio-state.edu (Keith M Boyer) (06/01/90)

In article <CPH.90Jun1021135@kleph.ai.mit.edu> cph@zurich.ai.mit.edu (Chris Hanson) writes:
>It seems to me that the diskless protocol could be extended in an
>upwards-compatible way so that the client supplied the necessary
>information to the server to permit resynchronization when the server
>had lost the information for one reason or another.  Certainly the
>client knows all of the relevant state, such as what files are open.
>
>A little cleverness in the design of this software could eliminate a
>lot of headaches for customers.

While I would also like to see the clients behave more robustly, I can see
that the problems of true resynchronization might be difficult.
In our environment here at OSU it would be Damn Nice if the client would
AT LEAST try to reboot automatically after it fails. We have approx. 80
HP workstations in 5 different locations. Having to walk from one building
to another to cycle power is, at best, inconvenient.

++keith


-=-
-  Keith M. Boyer                Department of Computer and Information Science
-- THE Ohio State University          2036 Neil Ave. Columbus OH USA 43210-1277
-  keith@cis.ohio-state.edu       or       ...!osu-cis!cis.ohio-state.edu!keith
EVERYTHING SHOULD BE MADE AS SIMPLE AS POSSIBLE,BUT NOT SIMPLER-Albert Einstein

perry@hpfcdc.HP.COM (Perry Scott) (06/02/90)

>In our environment here at OSU it would be Damn Nice if the client would
>AT LEAST try to reboot automatically after it fails. We have approx. 80
>HP workstations in 5 different locations. Having to walk from one building
>to another to cycle power is, at best, inconvenient.
>
>-  Keith M. Boyer                Department of Computer and Information Science
>-- THE Ohio State University          2036 Neil Ave. Columbus OH USA 43210-1277

Nice enhancement.

There are a few problems with implementation.  panic() actually does
some nice things, like savecore() for "real" panics.  It also puts
something on the screen for the support people.  So clearly, panic()
isn't where the problem lies.

Maybe instead of calling panic("lost contact"); we need to simply
printf("lost contact"); followed by a delay, followed by a call to
reboot().

I'll put in an enhancement request, and see where it goes.

Perry Scott
HP Ft Collins
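
A rough sketch of the decision a client faces when the server goes quiet,
combining the behaviour described earlier in the thread (retry, then check
for a broken cable, then panic) with the print, delay, reboot alternative
suggested above. It is a Python model, not kernel code. The function name,
the parameter names, and the 30-second default delay are assumptions.

```python
def lost_contact_actions(retries_left, cable_ok, auto_reboot, delay_seconds=30):
    """Return the steps a client would take, as a list of strings.

    retries_left: retries remaining before the client gives up
    cable_ok:     result of a cable check (the thread mentions landiag)
    auto_reboot:  True for the proposed behaviour, False for plain panic
    """
    if retries_left > 0:
        return ["retry"]
    if not cable_ok:
        # Broken cable: per the thread, the client waits indefinitely.
        return ["wait for cable"]
    if auto_reboot:
        return ['printf("lost contact")', f"delay {delay_seconds}s", "reboot()"]
    return ['panic("lost contact")']
```

With auto_reboot set, a client that has run out of retries on a healthy
cable comes back up by itself instead of waiting for someone to cycle the
power.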

perry@hpfcdc.HP.COM (Perry Scott) (06/07/90)

Re:  I'll put in an enhancement request, and see where it goes.

I did just that, and it's already fixed in the development kernel that's
currently under test, and is slated for the next 300/800 HP-UX release.

Perry Scott

rml@hpfcdc.HP.COM (Bob Lenk) (06/08/90)

In article <CPH.90Jun1021135@kleph.ai.mit.edu> cph@zurich.ai.mit.edu (Chris Hanson) writes:

> It seems to me that the diskless protocol could be extended in an
> upwards-compatible way so that the client supplied the necessary
> information to the server to permit resynchronization when the server
> had lost the information for one reason or another.  Certainly the
> client knows all of the relevant state, such as what files are open.

The relevant information can include things the client doesn't keep
track of, such as what data was in the server's buffer cache and what
the layout of the server's swap space was.

> A little cleverness in the design of this software could eliminate a
> lot of headaches for customers.

I think it's possible to address this (as well as to make sense of the
information the client does have), but it would take significant effort.

		Bob Lenk
		rml@hpfcla.hp.com
		hplabs!hpfcla!rml