dtynan@altos86.Altos.COM (Dermot Tynan) (08/31/90)
In article <1990Aug27.183821.13518@ico.isc.com>, rcd@ico.isc.com (Dick Dunn) writes:
> > Even "reliable" disks eventually die.
>
> True. So do reliable controllers.

I don't know what your hardware background is, but let me assure you that the following statement is Law:

    MTBF(controllers) >> MTBF(disks)    (i)

No-one can claim to produce a completely fault-free system. Most of the rhetoric is exactly that. "Fault Tolerant", "Fault Resilient", etc. No matter what you do, as long as there is a probability (no matter how small) of something failing, your system is not fault-free. The whole idea behind disk mirroring is not to replace disk backups (which can also be faulty), but to reduce the fault probability by a considerable margin. In general terms, if you want to make a system more resilient to failure, the first place to look is in any non-solid-state system. Ie, anything with moving parts. In the average system, this means the disk drives. While mirroring won't eradicate the probability of failure, it will reduce it considerably, at least from the user's point of view.

> What I want to get at--and it's something I didn't say at all in my previous
> posting--is that if you're looking for a certain level of reliability, it's
> a lot harder than just tossing on extra disks and mirroring.

See above. Nobody is trying to produce a fault-free system. We are just trying to reduce the likelihood of having to restore a filesystem. Believe me. Disk mirroring will slow down disk writes (which aren't the bulk of disk operations, anyway), but it will double your disk reliability.

> - Is there another way to get comparable recovery capability?
> To the second question, I'll suggest "journaling" as providing a lot of
> what you need, possibly at much less cost. I'm more interested in the
> first question.

Certainly "journaling" is another approach.
However, it puts the onus on the person writing the application, rather than hiding it in the OS, and furthermore, it is as valid to label "journaling" a marketing bullet item as it is disk mirroring. It is a question of what the user community wants. Altos, like most companies, is a slave to its user community. Most product development is based on what our customers want. They want mirroring. We implemented it. It has nothing to do with bullet items. It has to do with what the market wants.

> I had pointed out that it takes extra I/O bandwidth to handle mirroring;
> someone responded that if you have the right sort of controller, it will
> write both disks at once for you. OK, fine, now you've made the controller
> a single-point-of-failure.

    MTBF(controller) >> MTBF(disks)    Get it?

> I've seen as many motherboard and controller
> failures as disk failures. I don't pretend my experience is typical, but
> suppose that it might be. The disks are not the only failure points in the
> system.

I suggest that you have some serious design flaws here. See Law (i). Furthermore, even if the controller *does* die, you can snap on a new controller, and continue, a lot faster than you can replace a disk, and restore from backups. Assuming, of course, that your backups were done *right* before the disk died, or that you log all transactions to tape.

> If you're essentially running on one disk and just writing the
> other as a backup mirror, you're not getting the ongoing check that you
> really need for reliability.

Again, the reliability gained from even the simplest of mirroring schemes far exceeds not doing *any* mirroring. If, indeed, reliability is a concern. If this isn't enough, there are other things you can do. This sort of falls into the standard Cache argument, which goes like this... "With a 256K cache, you can get a 95% hit rate. So why bother using only a 64K cache?"
The correct answer, of course, is that the 64K cache may only give you an 80% hit rate (arbitrary figure), but it's still a lot better than 0%. And it's one quarter the cost!

> In this case, I'm not arguing that
> mirroring is worthless, but I do argue that it's inordinately expensive
> and only addresses one small part of the overall reliability problem. A
> single system with mirrored disks on one controller has only one element of
> redundancy.

A third time:

    MTBF(controller) >> MTBF(disks)

What exactly do you mean when you say "expensive"? Altos doesn't charge anything for disk mirroring, and since it was, for the most part, developed in conjunction with disk striping (which is worth its weight in gold), it didn't require any noticeable NRE. As for its performance expense, this is *only* borne by those who enable it (SCO and C2 could learn something here :), therefore there is *no* expense to those people (the majority, probably) who don't use it. For those who do, you've failed to convince me that the performance expense is not worth the gain.
						- Der
--
Dermot Tynan, Altos Computer Systems, San Jose, CA 95134
dtynan@altos86.Altos.COM (408) 432-6200 x4237

"Five to one, baby, one in five. No-one here gets out alive."
truesdel@sun418.nas.nasa.gov (David A. Truesdell) (08/31/90)
dtynan@altos86.Altos.COM (Dermot Tynan) writes:

[ Quite a bit about mirrored filesystems, which I won't repeat here. ]

Disk mirroring IS a relatively inexpensive method of "hardening" modest amounts of data. However, when you want to protect more than just a few disks' worth, the cost of buying duplicate drives can quickly get out of hand. A less expensive approach, if you have a LOT of data, is to use a RAID (Redundant Array of Inexpensive Disks) style system, which can use a single spare disk to protect the data on several others. When you are talking about 100's of gigabytes of data, that's a lot of disk drives you won't have to buy. (Purists may note that mirroring is considered a simple form of RAID.)

>In article <1990Aug27.183821.13518@ico.isc.com>, rcd@ico.isc.com (Dick Dunn) writes:
>> I had pointed out that it takes extra I/O bandwidth to handle mirroring;
>> someone responded that if you have the right sort of controller, it will
>> write both disks at once for you. OK, fine, now you've made the controller
>> a single-point-of-failure.

>	MTBF(controller) >> MTBF(disks)		Get it?

>> I've seen as many motherboard and controller
>> failures as disk failures. I don't pretend my experience is typical, but
>> suppose that it might be. The disks are not the only failure points in the
>> system.

>I suggest that you have some serious design flaws here.

Another design flaw would be to use a single controller to run both disks. Separate controllers, running separate disks, could allow the system to continue running in spite of the failure of a controller or a disk. If you get the software right, you would only have to come down long enough to replace a controller. (If you get the hardware right, you wouldn't have to do that!)
--
T.T.F.N.,
dave truesdell (truesdel@prandtl.nas.nasa.gov)
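The "single spare disk protecting several others" idea can be illustrated with XOR parity. This is only a sketch under assumptions: the post does not say which RAID scheme is meant, and the three tiny "disks" and the helper name xor_blocks are made up for the example. The parity block is the byte-wise XOR of the data blocks, so any one lost block can be rebuilt by XOR-ing everything that survives.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks holding one small block each.
data = [b"alpha-00", b"bravo-01", b"delta-02"]

# One extra disk holds the parity of all three.
parity = xor_blocks(data)

# Pretend disk 1 has died: rebuild its block from the survivors plus parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

Protecting three disks this way costs one extra drive instead of three, which is the saving the post describes. The price is that a rebuild has to read every surviving disk, and a second failure before the rebuild finishes loses data.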
meissner@osf.org (Michael Meissner) (08/31/90)
In article <3895@altos86.Altos.COM> dtynan@altos86.Altos.COM (Dermot Tynan) writes:
| See above. Nobody is trying to produce a fault-free system. We are just
| trying to reduce the likelihood of having to restore a filesystem. Believe
| me. Disk mirroring will slow down disk writes (which aren't the bulk of
| disk operations, anyway), but it will double your disk reliability.

If both mirrors are operational, it can speed up reads, since the system will get the data from whichever disk's read head is closer (assuming a smart OS and/or controller).

Another win with disk mirroring is the trick they used internally on at least one machine at Data General. The main OS machine had a disk farm that was getting to the point that backups could no longer be done in a reasonable time period. What they did was mirror some/all of their critical drives. Then they would break the mirror, and start backups on one side of the mirror (they could break the mirror without any disruption or taking the disk offline). Meanwhile, the users would be busily writing to the other (now non-mirrored) disk. This way backups did not have data changing underneath, they could use the much faster raw disk backup procedure (dump instead of tar in UNIX-speak), and the system did not have to be taken down. When the backups finished, they regrafted the mirrored disks back together, and the system would resync the disks during the idle loop.

The downside of any mirroring scheme, of course, is that you have to buy twice as many disk drives as you did previously (and I never was in a group that could afford it :-).
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Do apple growers tell their kids money doesn't grow on bushes?
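The break-the-mirror backup trick can be modelled in a few lines. This is a toy sketch, not a description of the Data General implementation: the class MirroredVolume, its method names, and the dirty-block set used to drive the resync are all assumptions made for illustration (the post only says the system resynced the disks during the idle loop).

```python
class MirroredVolume:
    """Toy two-way mirror that can be split for a backup and rejoined."""

    def __init__(self, nblocks):
        self.primary = [b""] * nblocks
        self.secondary = [b""] * nblocks
        self.split = False
        self.dirty = set()  # blocks written to the primary while split

    def write(self, block, data):
        self.primary[block] = data
        if self.split:
            # Secondary is frozen for the backup; remember what it missed.
            self.dirty.add(block)
        else:
            self.secondary[block] = data

    def break_mirror(self):
        """Freeze the secondary so a raw backup sees unchanging data."""
        self.split = True
        return self.secondary

    def resync(self):
        """Copy only the blocks that changed while the mirror was split."""
        for block in sorted(self.dirty):
            self.secondary[block] = self.primary[block]
        self.dirty.clear()
        self.split = False


vol = MirroredVolume(4)
vol.write(0, b"old")
frozen = vol.break_mirror()
vol.write(0, b"new")              # users keep writing during the backup
vol.write(2, b"more")
backup = list(frozen)             # the "dump" of the frozen side
vol.resync()
print(backup[0], vol.secondary[0])
```

While the mirror is split the volume has no redundancy at all, so the backup window is also a window of exposure to a single disk failure.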
frank@rsoft.bc.ca (Frank I. Reiter) (09/01/90)
In article <3895@altos86.Altos.COM> dtynan@altos86.Altos.COM (Dermot Tynan) writes:
>
>See above. Nobody is trying to produce a fault-free system. We are just
>trying to reduce the likelihood of having to restore a filesystem. Believe
>me. Disk mirroring will slow down disk writes (which aren't the bulk of
>disk operations, anyway), but it will double your disk reliability.

Maybe I've missed something, but it seems to me that the results should be orders of magnitude better than that.

Let's say that the odds of a particular drive failing on a particular day are 1 in 1000. The odds of both drives failing on that day are then 1 in 1,000,000, are they not? Does not mirroring mean that both drives must fail simultaneously in order for there to be loss of data?
--
_____________________________________________________________________________
Frank I. Reiter              UUCP:  {uunet,ubc-cs}!van-bc!rsoft!frank
Reiter Software Inc.                frank@rsoft.bc.ca,  a2@mindlink.UUCP
Surrey, British Columbia      BBS:  Mind Link @ (604)576-1214, login as Guest
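The arithmetic in this post can be checked directly. The sketch below assumes the two drives fail independently and uses the post's 1-in-1000 daily figure; the function names and the three-day repair window are illustrative assumptions, not anything from the thread.

```python
def p_both_fail(p_single):
    """Chance that two independent drives both fail in the same window."""
    return p_single * p_single


def p_any_fail(p_single, drives=2):
    """Chance that at least one of `drives` independent drives fails."""
    return 1.0 - (1.0 - p_single) ** drives


def p_second_fails_during_repair(p_daily, repair_days):
    """Chance the surviving drive fails before the dead one is replaced."""
    return 1.0 - (1.0 - p_daily) ** repair_days


p = 1 / 1000                      # the post's per-day failure odds
print(p_both_fail(p))             # both halves of the mirror on one day
print(p_any_fail(p))              # some drive needing replacement that day
print(p_second_fails_during_repair(p, repair_days=3))
```

"Simultaneously" really means "before the first failure has been repaired", so the gain depends on how quickly a dead drive is noticed and replaced. Mirroring also doubles the number of drives, so individual drive replacements become about twice as frequent even as data loss becomes far rarer.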
rcd@ico.isc.com (Dick Dunn) (09/01/90)
dtynan@altos86.Altos.COM (Dermot Tynan) writes, starting from the following:
> > > Even "reliable" disks eventually die.
> > True. So do reliable controllers.
> I don't know what your hardware background is...

Hmm...you probably don't want a bio right now, but I did spend some fair time working in a disk test engineering group. I won't make any great claim based on that, only that my experience with disk failures is more than casual and anecdotal. Whatever...

>...but let me assure you that the
> following statement is Law:
>
>	MTBF(controllers) >> MTBF(disks)    (i)

Now, see, here's how flame-fests get started...you assert something as a "Law" when I "know" it's not so. In the past (ten years or so, let's say) you were close enough to right. It's really no longer true. Depending on a handful of factors, either

	MTBF(controllers) > MTBF(disks)
or
	MTBF(controllers) ~ MTBF(disks)

> No-one can claim to produce a completely fault-free system. Most of the
> rhetoric is exactly that. "Fault Tolerant", "Fault Resilient", etc...

We agree there, and so we move on (as you suggest) to trying to find the hot spots for failures.

> ...In general
> terms, if you want to make a system more resilient to failure, the first
> place to look is in any non-solid-state system. Ie, anything with moving
> parts. In the average system, this means the disk drives...

This is a good place to start. It's conventional wisdom and common sense. (I'll add that the second place to look is wherever you've got true analog circuits--which is *also* in the disk subsystem, though it may be split between controller and drive.)

But now consider: *Every*body knows that the disks are potentially a serious weak point--not only are they mechanical, but they hold your "permanent" data. Even the disk manufacturers know it, and they don't like being the fall guys for every system failure. So they find ways to make their disks more reliable.
Now, it's not exactly news that the disk boys are in the hot seat, but in the past it was relatively harder to make reliable disks at a decent price, so we accepted higher failure rates and did other things to mitigate them. Disk manufacturers are doing a much better job these days. It's not cheap--the price of disk is one of the larger chunks of the total price of most systems.

What's really happened is that the disk manufacturers and system architects have agreed that disk reliability is important enough that they are spending enough money there to bring the reliability of the disk subsystem in line with the reliability of the rest of the system. That's just good engineering--it doesn't make sense to have one part of a system (particularly a critical part) far less reliable than the rest of the system...you go spend money on the unreliable part until it's good enough or until it's not wise to spend any more on it. The change in recent years is that it's possible to buy good enough reliability without screwing up the overall system cost. The true MTBF of small disks has probably increased by almost a factor of 10 in the last decade.

> ...Disk mirroring will slow down disk writes (which aren't the bulk of
> disk operations, anyway), but it will double your disk reliability.

1. Yes, writes aren't the bulk of the operations. However, they can commonly vary from about 1/3 (two reads for every write) to 1/10 of the total load. Your point is good, but you have to be a little careful about how much weight you give it.

2. Disk mirroring will double the reliability of the disks themselves, but that doesn't translate into a doubling of the reliability of even the disk subsystem, let alone the whole machine.

>...Certainly "journaling" is another approach. However, it puts the onus on
> the person writing the application, rather than hiding it in the OS...

Not necessarily. For an application writer, you might do that if the system doesn't support it.
But you folks are system designers; you'd put it in the system. (Nothing novel about that...after all, you've modified the file system for mirroring, right? You could just as easily have implemented journaling.)

> ...Altos, like most companies is a slave to its user community...
> ...They want mirroring. We implemented it...

All understood...design by customer is uncomfortable. But I'm more interested in looking at the real technical aspects of mirroring.

>	MTBF(controller) >> MTBF(disks)		Get it?

Now, now, don't get too pushy...:-) I still say "get better disks." MTBF of good modern disks is many years of power-on time. You will get card failures in that amount of time based on connector oxidation, if nothing else.

> > I've seen as many motherboard and controller
> > failures as disk failures. I don't pretend my experience is typical...

>...I suggest that you have some serious design flaws here. See Law (i).

I don't design hardware, and Law (i) isn't a law.

But while we're talking about MTBF, let's note that

	MTBF(hardware) >> MTBF(software)

for most systems. That's another reason I suggested journaling; it gives a second version of your data created by different code than the first.

> Furthermore, even if the controller *does* die, you can snap on a new
> controller, and continue, a lot faster than you can replace a disk, and
> restore from backups...

*After* you figure out that you've got a bad controller. Depending on the failure mode, you might have done some real damage in the meantime.

> > In this case, I'm not arguing that
> > mirroring is worthless, but I do argue that it's inordinately expensive
> > and only addresses one small part of the overall reliability problem...

> A third time:
>	MTBF(controller) >> MTBF(disks)

	while (strcmp(grab_input(), "MTBF(controller) >> MTBF(disks)") == 0)
		puts("buy better disks!");

> What exactly do you mean when you say "expensive"...
I mean that the cost of disk mirroring is a doubling of the cost of disk drives in the system...and they're already a major part of the cost of the system. -- Dick Dunn rcd@ico.isc.com -or- ico!rcd Boulder, CO (303)449-2870 ...I'm not cynical - just experienced.
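The disagreement over whether mirroring helps the whole disk subsystem can be put in numbers with a crude model. Everything here is an assumption made for illustration: independent failures, one controller in series with the disks, a failed mirror half that is not replaced within the period, and per-year failure probabilities that are invented rather than measured.

```python
def subsystem_failure(p_controller, p_disk, mirrored):
    """Chance that the disk subsystem loses service in one period.

    The controller is a single point of failure in series with the
    storage; a mirrored pair only fails if both disks fail.
    """
    p_storage = p_disk * p_disk if mirrored else p_disk
    return 1.0 - (1.0 - p_controller) * (1.0 - p_storage)


def improvement(p_controller, p_disk):
    """Ratio of unmirrored to mirrored subsystem failure probability."""
    plain = subsystem_failure(p_controller, p_disk, mirrored=False)
    mirror = subsystem_failure(p_controller, p_disk, mirrored=True)
    return plain / mirror


# Hypothetical per-year failure probabilities, for illustration only.
equal = improvement(p_controller=0.05, p_disk=0.05)     # MTBF(controller) ~ MTBF(disks)
better = improvement(p_controller=0.005, p_disk=0.05)   # MTBF(controller) >> MTBF(disks)
print(equal, better)
```

With the controller about as fragile as a disk, the subsystem improves by less than a factor of two; with a controller ten times more reliable, the improvement is roughly sevenfold. In this model the answer turns almost entirely on which of the two MTBF relations actually holds, which is the point both posters keep returning to.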
dtynan@altos86.Altos.COM (Dermot Tynan) (09/01/90)
In article <89@rsoft.bc.ca>, frank@rsoft.bc.ca (Frank I. Reiter) writes:
> In article <3895@altos86.Altos.COM> dtynan@altos86.Altos.COM (Dermot Tynan) writes:
> >but [disk mirroring] will double your disk reliability.
>
> Maybe I've missed something, but it seems to me that the results should be
> orders of magnitude better than that.
> Frank I. Reiter UUCP: {uunet,ubc-cs}!van-bc!rsoft!frank

This is true. However, I didn't want to be accused of using overrated figures. You can certainly guarantee doubling reliability, but on this network, you'd get flamed from on high if you took that too far...
						- Der
--
Dermot Tynan, Altos Computer Systems, San Jose, CA 95134
dtynan@altos86.Altos.COM (408) 432-6200 x4237

"Five to one, baby, one in five. No-one here gets out alive."
allbery@NCoast.ORG (Brandon S. Allbery KB8JRR/KT) (09/02/90)
As quoted from <3895@altos86.Altos.COM> by dtynan@altos86.Altos.COM (Dermot Tynan): +--------------- | In article <1990Aug27.183821.13518@ico.isc.com>, rcd@ico.isc.com (Dick Dunn) writes: | > I've seen as many motherboard and controller | > failures as disk failures. I don't pretend my experience is typical, but | > suppose that it might be. The disks are not the only failure points in the | > system. | | I suggest that you have some serious design flaws here. See Law (i). +--------------- ISC doesn't make hardware. That's the key to this discussion; you're discussing apples and oranges. I've seen many a Taiwanese clone motherboard and disk controller die in my time. I've also seen some Altos CPU boards and file processors die --- but only about once a year (over some fifty systems that I am de-facto system administrator for) and generally on older equipment. Not that the integrated approach automatically makes such problems rare --- I saw quite a few hardware failures on the Plexus equipment I used to manage --- but when done right, the integrated approach minimizes such problems. Altos has had its problems, certainly; the best hardware and software won't help when it's not appropriate for the market, which has been one of the biggest problems I've seen with Altos, but the 5000 series looks like it can/will address many of those problems. This much I will say about Telotech, Inc. and Altos: we're picky. In particular, *I'm* picky; if Altos hardware and software weren't up to snuff, I'd recommend dropping it, with a pretty good probability that it would be done. But I haven't, and we haven't, because it works. (I'm technical, not sales; I could care less about hype, all I care about is if it works.) +--------------- | > In this case, I'm not arguing that | > mirroring is worthless, but I do argue that it's inordinately expensive | > and only addresses one small part of the overall reliability problem. 
A | > single system with mirrored disks on one controller has only one element of | > redundancy. | | A third time: | MTBF(controller) >> MTBF(disks) | | What exactly do you mean when you say "expensive". Since Altos doesn't charge | anything for disk mirroring, and for the most part, is developed in conjunction | with disk striping (which is worth its weight in gold), doesn't require any | noticeable NRE. As for its performance expense, this is *only* borne by those | who enable it (SCO and C2 could learn something here :), therefore, there is | *no* expense to those people (the majority, probably) who don't use it. For | those who do, you've failed to convince me that the performance expense is not | worth the gain. +--------------- I've said enough on SCO and C2 security, so I'll let that one pass. Granted, most people won't care about disk mirroring. None of Telotech's customers, with perhaps one exception (and that only in the long term), will care about it. But Ti Kan mentioned airline ticket systems and ATM systems. In the one, disk mirroring prevents major frustration to employees and users (think about that next time you're waiting for a plane ticket...) and in the other, I would consider it essential. And mirroring is truly *optional*: it costs NOTHING if you don't enable it. I've been evaluating an AMS-5000 at work; I'm happy with it, modulo the stuff Altos has no control over (C2...). If I weren't happy with it, I'd not be complaining about C2 security --- I'd be installing another computer in its place. Again, I care nothing about hype or "brand loyalty", I care about machines that do what they're designed to do. ++Brandon -- Me: Brandon S. Allbery VHF/UHF: KB8JRR/KT on 220, 2m, 440 Internet: allbery@NCoast.ORG Delphi: ALLBERY uunet!usenet.ins.cwru.edu!ncoast!allbery America OnLine: KB8JRR
allbery@NCoast.ORG (Brandon S. Allbery KB8JRR/KT) (09/02/90)
As quoted from <truesdel.652083402@sun418> by truesdel@sun418.nas.nasa.gov (David A. Truesdell): +--------------- | dtynan@altos86.Altos.COM (Dermot Tynan) writes: | >In article <1990Aug27.183821.13518@ico.isc.com>, rcd@ico.isc.com (Dick Dunn) writes: | >> I've seen as many motherboard and controller | >> failures as disk failures. I don't pretend my experience is typical, but | >> suppose that it might be. The disks are not the only failure points in the | >> system. | | >I suggest that you have some serious design flaws here. | | Another design flaw would be to use a single controller to run both disks. | Separate controllers running, running separate disks, could allow the system | to continue running in spite of the failure of a controller or a disk. If you | get the software right, you would only have to come down long enough to replace | a controller. (If you get the hardware right, wouldn't have to do that!) +--------------- Worst case uses the standard controller; one disk goes, the controller switches to the other disk. No down-time. If the controller goes, you're screwed. But if you need disk mirroring that badly, you are using one or two HPFP boards as well as the standard controller. A controller goes, the mirror disk on another controller takes over. No downtime. And since disk striping and disk mirroring are the same mechanism in recent Altos OS'es (including the current OS for the Series 1000), you can also configure for RAID. And since all three controllers are capable of independent operation, you lose very little time doing the mirroring. ++Brandon -- Me: Brandon S. Allbery VHF/UHF: KB8JRR/KT on 220, 2m, 440 Internet: allbery@NCoast.ORG Delphi: ALLBERY uunet!usenet.ins.cwru.edu!ncoast!allbery America OnLine: KB8JRR