[comp.databases] Fault Tolerance vs. High Availability

sweiger@sequent.UUCP (Mark Sweiger) (08/08/90)

Tandem computer systems have achieved the laudable goal of both
hardware and software fault-tolerance, meaning that a transaction
will continue in the face of any *single* point of failure, whether
that failure is in hardware or in software.  Stratus computers
achieve (at least) hardware fault tolerance by replicating hardware
so that any given hardware component can fail without affecting uptime
and (presumably) throughput.  (It is not clear to me what kind
of software fault tolerance Stratus RDBMS offerings possess.
Can some type of hardware failure (power failure?) 
cause an in-progress transaction to be aborted, 
despite the redundant hardware?  Does the hardware fault tolerance
preclude the need for some software fault tolerance (hard to believe
it does; what happens if a disk fails in the middle of a transaction's
multi-page disk write, for example?))  Finally, Sequoia offers a fault
tolerant Unix implementation.  I don't know much about this one at
all; does anyone have a thumbnail description out there?  How
is fault tolerance spread between hardware and software?

One interesting observation about the different approaches
to fault tolerance is that Tandem's fault-tolerant 
implementation seems much more software-based than the Stratus 
implementation.  Tandem's Non-Stop systems typically have
only one additional component of each type (with the exception of disk
drives) and the fault tolerance is built mostly in the software.
Stratus, on the other hand, has fully redundant hardware throughout.
It seems that fully redundant hardware can substitute 
up to a point for a more difficult (Tandem) fault-tolerant 
software implementation, but if the power fails, it seems that you still
need to have at least logging and recovery software to maintain
data integrity.  Given that, does hardware redundancy really
give something more akin to high availability, rather than 
non-stop fault tolerance?
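The logging-and-recovery requirement above can be made concrete with a
minimal write-ahead-log sketch (illustrative only; the class and names
are my own, not any vendor's actual recovery implementation).  The rule
is simply that the log record reaches stable storage before the data
page is modified, so a crash can always be repaired by replaying the log:

```python
# Minimal write-ahead-log sketch (illustrative, not any vendor's API).
# Invariant: append the log record BEFORE modifying the data page.
import json

class WALSketch:
    def __init__(self):
        self.log = []        # stand-in for a log file on stable storage
        self.pages = {}      # stand-in for data pages on disk

    def write(self, txn_id, page, value):
        # 1. Log the intended change first (redo information).
        self.log.append(json.dumps({"txn": txn_id, "page": page, "new": value}))
        # 2. Only then update the page itself.
        self.pages[page] = value

    def recover(self, crashed_pages):
        # After power loss, replay the log to rebuild any pages whose
        # writes may never have reached the disk.
        for rec in self.log:
            r = json.loads(rec)
            crashed_pages[r["page"]] = r["new"]
        return crashed_pages

wal = WALSketch()
wal.write("T1", "A", 100)
wal.write("T1", "B", 200)
# Simulate a crash where page B never made it to disk:
recovered = wal.recover({"A": 100})
```

No amount of hardware replication removes the need for this invariant,
since a power failure can interrupt a multi-page write on every copy
at once.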

And then there is the claim of high availability, made by many
vendors, especially those without fault tolerance.  What does high
availability mean?  From what I have been able to figure out,
high availability means non-redundant hardware components like
memory, CPU, and bus with very long mean times between failure,
plus dual-ported disk drives with mirroring capability and an RDBMS
subsystem with logging and recovery.  Some vendors offer battery-backed-up
memory.  (What happens to these systems when power fails and
transaction writes are in progress?  Can RDBMS recovery really be
avoided upon warm recovery?  Seems unlikely.  If recovery can't
be avoided, what good is battery backup?)   Are other features
required for high availability?
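The disk mirroring mentioned above is the simplest of these mechanisms
to sketch (illustrative only; real controllers do this below the
filesystem): every write goes to both drives, so a read can be
satisfied by whichever drive survives:

```python
# Sketch of disk mirroring (illustrative, not a real driver): each
# write is issued to both drives; reads fall back to the survivor.
class MirroredDisk:
    def __init__(self):
        self.primary = {}
        self.mirror = {}
        self.primary_failed = False

    def write(self, block, data):
        # Both copies are updated on every write.
        self.primary[block] = data
        self.mirror[block] = data

    def read(self, block):
        # Fall back to the mirror if the primary has failed.
        if self.primary_failed:
            return self.mirror[block]
        return self.primary[block]

d = MirroredDisk()
d.write(7, "payroll record")
d.primary_failed = True          # simulate losing one drive
```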
-- 
Mark Sweiger			Sequent Computer Systems
Database Software Engineer	15450 SW Koll Parkway
				Beaverton, Oregon  97006-6063
(503)526-4329			...{tektronix,ogicse}!sequent!sweiger

dhepner@hpcuhc.HP.COM (Dan Hepner) (08/10/90)

From: sweiger@sequent.UUCP (Mark Sweiger)

Nice posting.

>Tandem computer systems have achieved the laudable goal of both
>hardware and software fault-tolerance, meaning that a transaction
>will continue in the face of any *single* point of failure, whether
>that failure is in hardware or in software.

Note that such claims about continuing in the face of a software
failure require a very specific definition of software failure.  If
we assume all software failures are the result of some bug or
another, it can be easily seen that some bugs will be fatal,
and that no "architecture" will be able to fix them on the fly.

The kind that can be "sustained" are transient OS / transaction-manager
bugs that are timing-dependent and infrequent, where the basic
technique is to fall back and try again.
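The fall-back-and-retry technique amounts to a loop like the following
(a sketch under my own assumptions; `TransientError` and the retry
count are hypothetical, not any vendor's interface):

```python
# Sketch of "fall back and try again" for transient, timing-dependent
# failures (illustrative names throughout).
class TransientError(Exception):
    pass

def run_with_retry(txn, attempts=3):
    for _ in range(attempts):
        try:
            return txn()
        except TransientError:
            # A timing-dependent bug is unlikely to recur on a clean
            # re-execution, so simply run the transaction again.
            continue
    # Deterministic bugs hit this path: no architecture masks them.
    raise RuntimeError("persistent failure")

calls = {"n": 0}
def flaky_txn():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError()   # fails twice, then succeeds
    return "committed"

result = run_with_retry(flaky_txn)
```

Note that a deterministic bug fails on every retry, which is exactly
the point made above: some bugs are fatal regardless of architecture.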

It should also be noted that such techniques are independent of
hardware fault tolerance or Tandem Computers, and could be logically 
asserted as being the basis for any software described as "robust".

  Stratus computers
>achieve (at least) hardware fault tolerance by replicating hardware
>so that any given hardware component can fail without affecting uptime
>and (presumably) throughput.  (It is not clear to me what kind
>of software fault tolerance Stratus RDBMS offerings possess.

Stratus and Sequoia both attempt to use the same restart techniques
as Tandem, although neither uses the particular implementation 
that Tandem Guardian systems use. 
 
>Can some type of hardware failure (power failure?) 
>cause an in-progress transaction to be aborted, 
>despite the redundant hardware? 

Aborting the current transaction(s) is typically seen as an option
in response to all manner of ills.  The extent to which this affects
users depends on additional software, which could be used to
transparently restart the transaction.  Long-term power failure has the
intuitive effect: the system quits, with restart being more or
less painful depending on yet more software.  Many applications 
simply cannot be subjected to long term power failures, and
are backed up by generators, batteries, uninterruptible power supplies
(UPS) of various sorts.

 Does the hardware fault tolerance
>preclude the need for some software fault tolerance (hard to believe
>it does, what happens if a disk fails in the middle of a transaction's
>multi-page disk write, for example.))

To the last point, there are two disk drives, and the failure
of one has little effect.

  Finally, Sequoia offers a fault
>tolerant Unix implementation.

So does Tandem, and Stratus either does or has announced that they do.

  I don't know much about this one at
>all; does anyone have a thumbnail description out there?  How
>is fault tolerance spread between hardware and software?

Sequoia is a symmetric multi-processor (68030) which schedules
jobs onto a processor/cache where they run more or less in isolation
until they need to affect main memory.  At that point, the cache
is flushed, which constitutes a checkpoint more or less analogous
to Tandem's checkpoint.  Processors are lockstepped pairs, memory
is redundant.  [More upon request.  HP OEMs Sequoia]
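The flush-as-checkpoint scheme described above can be sketched roughly
as follows (illustrative; the class is my own construction, not
Sequoia's implementation).  A job mutates its private cache in
isolation, the flush copies it atomically to redundant main memory, and
a processor failure discards only the work done since the last flush:

```python
# Sketch of checkpoint-on-cache-flush (illustrative, not Sequoia's
# actual mechanism).
class CheckpointedJob:
    def __init__(self, main_memory):
        self.main = main_memory   # redundant, checkpointed state
        self.cache = {}           # private, volatile working state

    def update(self, key, value):
        self.cache[key] = value   # runs in isolation, not yet durable

    def flush(self):
        # The flush IS the checkpoint: copy everything to main memory.
        self.main.update(self.cache)
        self.cache.clear()

    def fail(self):
        # A processor failure loses only the unflushed cache.
        self.cache.clear()

mem = {}
job = CheckpointedJob(mem)
job.update("x", 1)
job.flush()                      # checkpoint: x=1 is now durable
job.update("x", 2)
job.fail()                       # crash before flushing x=2
```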

>One interesting observation about the different approaches
>to fault tolerance is that Tandem's fault-tolerant 
>implementation seems much more software-based than the Stratus 
>implementation. 

This distinction is commonly noted.  The more important difference
is that Tandem's Guardian FT is "programmer assisted" fault tolerance,
whereas the others, including Tandem's own S2 Unix offering, offer FT 
generally transparent to the programmer.

 Tandem's Non-Stop systems typically have
>only one additional component of each type (with the exception of disk
>drives) and the fault tolerance is built mostly in the software.
>Stratus, on the other hand, has fully redundant hardware throughout.

Well, "one additional component of each type" could also be phrased
"fully redundant hardware throughout".  In HW, the striking difference
is that Tandem uses proprietary self-checking processors as compared
to Stratus' Quad Modular Redundancy, which uses two sets of lock-step
merchant microprocessors.   Effectively, two processors for Tandem,
four for Stratus. 

>It seems that fully redundant hardware can substitute 
>up to a point for a more difficult (Tandem) fault-tolerant 
>software implementation, but if the power fails, it seems that you still
>need to have at least logging and recovery software to maintain
>data integrity.

Power failure should be seen as a different problem than fault
tolerance.  It is quite imaginable to have a non-FT system which
handles power failures very gracefully, and a FT system which
fails dismally.  In practice of course, FT vendors have worked
on their power failure plan quite a bit. 

All modern OLTP systems use logging and recovery software, and this also
is independent of FT.

  Given that, does hardware redundancy really
>give something more akin to high availability, rather than 
>non-stop fault tolerance?

>And then there is the claim of high availability, made by many
>vendors, especially those without fault tolerance.  What does high
>availability mean?  From what I have been able to figure out,
>high availability means non-redundant hardware components like
>memory, CPU, and bus with very long mean times between failure.

A common definition of "high availability" has one non-FT processor
standing by ready to take over for another in the event that the
first one fails.  This has sometimes been called a "warm standby".
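A warm standby is typically driven by a heartbeat: the standby watches
for periodic "I'm alive" messages from the primary and takes over when
they stop arriving.  A minimal sketch (illustrative; the timeout value
and class are assumptions, not any vendor's product):

```python
# Sketch of a "warm standby" takeover decision (illustrative).
import time

class WarmStandby:
    def __init__(self, timeout=1.0):
        self.timeout = timeout
        self.last_heartbeat = time.monotonic()
        self.active = False      # standby starts idle

    def heartbeat(self):
        # Called each time the primary reports in.
        self.last_heartbeat = time.monotonic()

    def check(self, now=None):
        # Take over if the heartbeat is overdue.
        now = time.monotonic() if now is None else now
        if now - self.last_heartbeat > self.timeout:
            self.active = True   # primary presumed dead: fail over
        return self.active

s = WarmStandby(timeout=1.0)
s.heartbeat()
alive = s.check(now=s.last_heartbeat + 0.5)    # within timeout: no takeover
failed_over = s.check(now=s.last_heartbeat + 2.0)  # heartbeat missed
```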

We at HP describe our standard systems, which we market as having
very long mean times between failures, as "highly reliable".

>Also, dual-ported disk drives with mirroring capability.  And some RDBMS
>subsystem with logging and recovery.  Some vendors offer battery-backed-up
>memory.  (What happens to these systems when power fails and
>transaction writes are in progress?  Can RDBMS recovery really be
>avoided upon warm recovery? Seems unlikely.

Why does it seem unlikely?  While I can't speak for other battery
backed up RAM vendors, this is quite possible on current HP systems.  
The writes which needed completion are completed upon powerup.
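The complete-on-powerup idea can be sketched as follows (illustrative
only, not HP's actual mechanism): writes are queued in battery-backed
RAM, which survives the outage, so a warm restart simply finishes
whatever was pending rather than running full RDBMS log recovery for it:

```python
# Sketch of finishing in-flight writes from battery-backed RAM on
# powerup (illustrative names; not HP's implementation).
class BatteryBackedController:
    def __init__(self):
        self.nvram = []          # battery-backed: survives power loss
        self.disk = {}

    def write(self, block, data):
        # Durable the moment it lands in NVRAM, even if the disk
        # write never happens before the power fails.
        self.nvram.append((block, data))

    def powerup(self):
        # On warm restart, complete every write still queued in NVRAM.
        while self.nvram:
            block, data = self.nvram.pop(0)
            self.disk[block] = data

c = BatteryBackedController()
c.write(1, "row A")
c.write(2, "row B")
# power fails here; NVRAM contents survive the outage
c.powerup()
```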

  If recovery can't
>be avoided, what good is battery backup?)

Good point.

   Are other features
>required for high availability?

Fast fsck.

>Mark Sweiger			Sequent Computer Systems

Dan Hepner
Disclaimer: Not a statement of Hewlett-Packard