[comp.sys.ibm.pc.hardware] Disk Arrays

james@bigtex.cactus.org (James Van Artsdalen) (10/01/90)

In <1970@sixhub.UUCP>, davidsen@sixhub.UUCP (bill davidsen) wrote:

| In article <5777@holston.UUCP> barton@holston.UUCP (Barton A. Fisk) writes:

| The catch is drive failure causes loss of pieces of all files. Compaq
| offers several solutions to guard against this including mirroring
| and controller duplexing and data guarding. [...]

>   I can see how that would work using any of several schemes, but all of
> them seem to require not using the other drives until the failed unit is
> replaced. [...]

(Hopefully I didn't take Bill's comments too far out of context)

Actually, a key goal of data guarding is to keep running in the event
of a failure.  For example, with the Dell drive array, you can have a
drive completely fail (yank the power cable out) without any errors.
This can be any drive, not just one group (i.e., nothing magic about the
guard drive).  I'm not familiar with the Compaq scheme, but I assume
they use something similar.

The cost of this is one extra drive: if you have N drives of storage
capacity, you need N+1 drives to do data guarding with the Dell
scheme.  But for someone who needs continuous availability, it's
pretty useful.
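
To put rough numbers on that cost, here is a small sketch comparing the
drive count for guarding (N+1, as described above) with plain one-to-one
mirroring (2N).  The function names are made up for illustration, and the
mirroring figure is the generic case, not a statement about any particular
Dell or Compaq product.

```python
from fractions import Fraction

def drives_needed_guarding(n_data: int) -> int:
    """N drives of capacity plus one guard drive (the scheme in the post)."""
    return n_data + 1

def drives_needed_mirroring(n_data: int) -> int:
    """Assumption: simple mirroring keeps one full copy of every data drive."""
    return 2 * n_data

def usable_fraction(n_data: int, total_drives: int) -> Fraction:
    """Share of the raw capacity that is left for user data."""
    return Fraction(n_data, total_drives)

for n in (2, 4, 8):
    g = drives_needed_guarding(n)
    m = drives_needed_mirroring(n)
    print(f"N={n}: guarding needs {g} drives ({usable_fraction(n, g)} usable), "
          f"mirroring needs {m} drives ({usable_fraction(n, m)} usable)")
```

The relative overhead of guarding shrinks as N grows, while mirroring
always costs half of the raw capacity.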

Until the defective drive is replaced, there is no data guarding and
performance is reduced.  When the drive is finally replaced, the array
controller will rebuild the drive and go back to full speed and
guarding mode.  The drive rebuild can be done in "background" while
the system is running.  Performance is reduced until the rebuild is
finished, but the system is fully available.
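
The degraded-then-rebuild cycle described above can be sketched as a toy
model.  This is only an illustration under stated assumptions: it uses
XOR parity on a fixed last drive, which the post does not confirm is what
the Dell controller does, and the class and method names (GuardedArray,
fail, read, rebuild) are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

class GuardedArray:
    """Toy array: each stripe holds N data blocks plus one XOR guard block."""

    def __init__(self, data_stripes):
        # Assumption: the guard block lives on the last drive of every stripe.
        self.stripes = [list(s) + [xor_blocks(s)] for s in data_stripes]
        self.failed = None  # index of the failed drive, if any

    def fail(self, drive):
        """Simulate losing one drive; a second loss is not survivable here."""
        if self.failed is not None:
            raise RuntimeError("second drive failure: data is lost")
        self.failed = drive
        for stripe in self.stripes:
            stripe[drive] = None

    def read(self, stripe, drive):
        """Return a block, reconstructing it from the other drives if missing."""
        block = self.stripes[stripe][drive]
        if block is not None:
            return block
        others = [b for i, b in enumerate(self.stripes[stripe]) if i != drive]
        return xor_blocks(others)  # slower path: touches every other drive

    def rebuild(self):
        """Generator: restore one stripe per step so reads can interleave."""
        drive = self.failed
        if drive is None:
            return
        for n, stripe in enumerate(self.stripes):
            if stripe[drive] is None:
                stripe[drive] = self.read(n, drive)
            yield n
        self.failed = None  # guarding is restored

array = GuardedArray([[b"ab", b"cd"], [b"ef", b"gh"]])
array.fail(1)                 # "yank the power cable out"
print(array.read(0, 1))       # still served, via reconstruction
worker = array.rebuild()      # replacement drive installed
next(worker)                  # stripe 0 rebuilt in the "background"
print(array.read(1, 1))       # stripe 1 is still reconstructed on the fly
for _ in worker:              # let the rebuild finish
    pass
```

Writing the rebuild as a generator is just a convenient way to show that
normal reads can be served between rebuild steps; a real controller would
schedule this work itself.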
-- 
James R. Van Artsdalen          james@bigtex.cactus.org   "Live Free or Die"
Dell Computer Co    9505 Arboretum Blvd Austin TX 78759         512-338-8789

davidsen@sixhub.UUCP (Wm E. Davidsen Jr) (10/02/90)

In article <48109@bigtex.cactus.org> james@bigtex.cactus.org (James Van Artsdalen) writes:

| (Hopefully I didn't take Bill's comments too far out of context)

  I have no complaints.

| Actually, a key goal of data guarding is to keep running in the event
| of a failure.  For example, with the Dell drive array, you can have a
| drive completely fail (yank the power cable out) without any errors.

	[ ... ]

| Until the defective drive is replaced, there is no data guarding and
| performance is reduced.  

  I guess I sounded as though the whole thing would be dead with one
drive down. If you can afford to run with no guarding you can keep
going. This sounds somewhat like doing an XOR of all the sectors on N
disks and writing the result sector to the (N+1)th disk. You can also do
it with schemes like Hamming or Fire codes (error recovery schemes). With
any of these methods you can keep running, doing error recovery on every
read, and I hope I didn't mislead anyone on that. *I* wouldn't want to,
but I am highly paranoid.
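
As a minimal sketch of the XOR idea guessed at above (not a description
of any vendor's actual scheme): XOR the matching sectors of the N data
disks to get the guard sector, and XOR the survivors with the guard
sector to get back a lost one.  The 4-byte "sectors" and the helper name
xor_sectors are purely illustrative.

```python
from functools import reduce

def xor_sectors(sectors):
    """Byte-wise XOR of equal-length sectors."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*sectors))

# Three data disks, one tiny sector each.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]

# The guard sector that would be written to the (N+1)th disk.
guard = xor_sectors(data)

# Suppose disk 1 dies: XOR everything that is left, guard included.
survivors = [data[0], data[2], guard]
recovered = xor_sectors(survivors)

print(recovered == data[1])
```

This works because XOR-ing a value with itself cancels out, so the
surviving data sectors drop out of the guard sector and only the missing
one remains.  It also shows why every degraded read costs a read from
each remaining disk.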

  By the time you buy a machine with N+1 drives and a fancy controller,
and decide to give up one drive's worth of capacity for error protection,
you are probably spending enough to keep a spare drive on site, tested,
formatted, and ready to drop in and run.
-- 
bill davidsen - davidsen@sixhub.uucp (uunet!crdgw1!sixhub!davidsen)
    sysop *IX BBS and Public Access UNIX
    moderator of comp.binaries.ibm.pc and 80386 mailing list
"Stupidity, like virtue, is its own reward" -me