james@bigtex.cactus.org (James Van Artsdalen) (10/01/90)
In <1970@sixhub.UUCP>, davidsen@sixhub.UUCP (bill davidsen) wrote:

| In article <5777@holston.UUCP> barton@holston.UUCP (Barton A. Fisk) writes:
| The catch is drive failure causes loss of pieces of all files. Compaq
| offers several solutions to guard against this including mirroring
| and controller duplexing and data guarding. [...]

> I can see how that would work using any of several schemes, but all of
> them seem to require not using the other drives until the failed unit is
> replaced. [...]

(Hopefully I didn't take Bill's comments too far out of context)

Actually, a key goal of data guarding is to keep running in the event
of a failure. For example, with the Dell drive array, you can have a
drive completely fail (yank the power cable out) without any errors.
This can be any drive, not just one group (ie, nothing magic about the
guard drive). I'm not familiar with the Compaq scheme, but I assume
they use something similar.

The cost of this is one extra drive: if you have N drives of storage
capacity, you need N+1 drives to do data guarding with the Dell scheme.
But for someone who needs continuous availability, it's pretty useful.

Until the defective drive is replaced, there is no data guarding and
performance is reduced. When the drive is finally replaced, the array
controller will rebuild the drive and go back to full speed and
guarding mode. The drive rebuild can be done in "background" while the
system is running. Performance is reduced until the rebuild is
finished, but the system is fully available.
--
James R. Van Artsdalen          james@bigtex.cactus.org   "Live Free or Die"
Dell Computer Co    9505 Arboretum Blvd Austin TX 78759         512-338-8789
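
[Editorial sketch] The life cycle described in the post above (guarded, then running without guarding after a failure, then rebuilding in the background) can be modelled as a small state machine. This is only an illustration under stated assumptions: the names `ArrayState` and `GuardedArray` are hypothetical, nothing here reflects the actual Dell controller firmware, and the treatment of a second failure as fatal is an inference from "there is no data guarding" rather than something the post says.

```python
from enum import Enum


class ArrayState(Enum):
    GUARDED = "guarded"        # all N+1 drives healthy, full speed
    DEGRADED = "degraded"      # one drive failed, no guarding, reduced speed
    REBUILDING = "rebuilding"  # replacement installed, background rebuild


class GuardedArray:
    """Hypothetical model of an N+1 data-guarded drive array."""

    def __init__(self, data_drives):
        # N drives of capacity cost N+1 physical drives.
        self.total_drives = data_drives + 1
        self.state = ArrayState.GUARDED
        self.failed_drive = None

    def fail_drive(self, index):
        """Any single drive may fail; the array keeps running."""
        if not 0 <= index < self.total_drives:
            raise IndexError("no such drive")
        if self.state is not ArrayState.GUARDED:
            # Assumption: a second failure while unguarded loses data.
            raise RuntimeError("failure while unguarded: data lost")
        self.failed_drive = index
        self.state = ArrayState.DEGRADED

    def replace_drive(self):
        """Install a replacement; the rebuild runs in the background."""
        if self.state is not ArrayState.DEGRADED:
            raise RuntimeError("no failed drive to replace")
        self.state = ArrayState.REBUILDING

    def finish_rebuild(self):
        """Rebuild complete: back to full speed and guarding mode."""
        if self.state is not ArrayState.REBUILDING:
            raise RuntimeError("no rebuild in progress")
        self.failed_drive = None
        self.state = ArrayState.GUARDED

    @property
    def available(self):
        # The system stays available in all three states.
        return True

    @property
    def full_speed(self):
        return self.state is ArrayState.GUARDED
```

For example, `GuardedArray(4)` uses five physical drives; after `fail_drive(2)` it is still `available` but no longer `full_speed`, and it returns to `ArrayState.GUARDED` only after `replace_drive()` followed by `finish_rebuild()`.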
davidsen@sixhub.UUCP (Wm E. Davidsen Jr) (10/02/90)
In article <48109@bigtex.cactus.org> james@bigtex.cactus.org (James Van Artsdalen) writes:

| (Hopefully I didn't take Bill's comments too far out of context)

  I have no complaints.

| Actually, a key goal of data guarding is to keep running in the event
| of a failure. For example, with the Dell drive array, you can have a
| drive completely fail (yank the power cable out) without any errors.
  [ ... ]
| Until the defective drive is replaced, there is no data guarding and
| performance is reduced.

  I guess I sounded as though the whole thing would be dead with one
drive down. If you can afford to run with no guarding you can keep
going.

  This sounds somewhat like doing an XOR of all the sectors on N disks
and writing the result sector to the N+1 disk. You can also do it with
schemes like Hamming or Fire codes (error recovery schemes). In any of
these methods you can run doing error recovery on every read, and I
hope I didn't mislead anyone on that. *I* wouldn't want to, but I am
highly paranoid.

  By the time you buy a machine with N+1 drives and a fancy controller,
and decide to leave one Nth of your capacity for error protection, you
probably will spend enough to have a spare drive on site, tested,
formatted, and ready to drop in and run.
--
bill davidsen - davidsen@sixhub.uucp (uunet!crdgw1!sixhub!davidsen)
    sysop *IX BBS and Public Access UNIX
    moderator of comp.binaries.ibm.pc and 80386 mailing list
"Stupidity, like virtue, is its own reward" -me
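
[Editorial sketch] The XOR scheme guessed at in the post above can be shown in a few lines. This is a minimal illustration, not a description of the Dell or Compaq controllers: the function names are hypothetical, each "sector" is a short byte string, and the guard sector is simply stored last in the stripe. Because XOR is its own inverse, any one missing sector (data or guard) equals the XOR of all the surviving ones, which is what allows error recovery on every read while a drive is down.

```python
def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length byte strings."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def make_stripe(data_sectors):
    """Lay out N data sectors plus one guard (XOR) sector: N+1 in total."""
    return list(data_sectors) + [xor_blocks(data_sectors)]


def read_sector(stripe, index, failed=None):
    """Read one sector; if its drive has failed, recompute it from the rest."""
    if index != failed:
        return stripe[index]
    survivors = [s for i, s in enumerate(stripe) if i != failed]
    return xor_blocks(survivors)


def rebuild_drive(stripe, failed):
    """Regenerate the failed drive's sector onto a replacement drive."""
    repaired = list(stripe)
    repaired[failed] = read_sector(stripe, failed, failed=failed)
    return repaired
```

With three data sectors `b"AAAA"`, `b"BBBB"` and `b"CCCC"`, `make_stripe` yields four sectors, and `read_sector(stripe, k, failed=k)` returns the original contents for any `k` from 0 to 3, including the guard sector itself. A real controller would do this per stripe across whole drives; the sketch covers a single stripe only.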