[net.unix-wizards] disk-block integrity after system crashes

shekita@crystal.UUCP (07/12/85)

The problem is this: We have a database file system that
sits on a Unix raw disk. Our current goal is to add recovery
to the database. In order to do this we need to know some
things about disk controllers.

Suppose a write operation is initiated (i.e., the controller
begins processing the write request) and a system crash
occurs. 

1) Will the write finish? It seems that it shouldn't, since
   RAM will probably get flakey as power drops, and then
   a block of garbage will get written to disk.

2) If the write doesn't finish, will the block be detectably
   bad? For example, would the block's CRC be wrong, causing
   the controller to return an error on subsequent reads?

In essence, we'd like to know if a block write can be considered
atomic, and if it's not atomic, we'd like to know if there's
a way to detect whether the write was interrupted and/or whether
garbage got written.

Granted, the answers to these questions will be device dependent,
but we (unfortunately) seek general information. Any particular
expertise that you could share would certainly be useful, though.
Incidentally, we currently run on an Eagle drive.
			Eugene Shekita
			Computer Science Department
			University of Wisconsin

hahn@AMES-NAS.ARPA (Jonathan Hahn) (07/13/85)

> Suppose a write operation is initiated (i.e., the controller
> begins processing the write request) and a system crash
> occurs. 
>
> 1) Will the write finish? It seems that it shouldn't, since
>    RAM will probably get flakey as power drops, and then
>    a block of garbage will get written to disk.

There is a significant difference between a "system crash" (i.e.
software crash) and an unexpected power failure (or other hardware
catastrophe)...

> 2) If the write doesn't finish, will the block be detectably
>    bad? For example, would the block's CRC be wrong, causing
>    the controller to return an error on subsequent reads?

In the event of a software crash, the disk sector(s) should be
written properly (i.e. data and ecc written out in proper format).
Of course, there's no telling how corrupted the data may have gotten
as a result of the crash.  The best protection against this is
one or more internal consistency checks of some sort.
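
As an illustration of one such consistency check, here is a minimal
sketch that stores a sequence number and a CRC-32 in the last bytes of
every block, so that a block whose contents were corrupted before or
during the write fails validation when read back. The 512-byte block
size, the trailer layout, and the names seal_block and check_block are
assumptions made for this example; they do not come from any particular
database or controller.

```python
import struct
import zlib

BLOCK_SIZE = 512                      # assumed block size (hypothetical)
TRAILER = struct.Struct("<II")        # sequence number, CRC-32
PAYLOAD_SIZE = BLOCK_SIZE - TRAILER.size


def seal_block(payload: bytes, seq: int) -> bytes:
    """Pad the payload and append (seq, crc) so a reader can validate it."""
    if len(payload) > PAYLOAD_SIZE:
        raise ValueError("payload too large for one block")
    body = payload.ljust(PAYLOAD_SIZE, b"\0")
    # The CRC covers the padded payload and the sequence number.
    crc = zlib.crc32(body + struct.pack("<I", seq))
    return body + TRAILER.pack(seq, crc)


def check_block(block: bytes):
    """Return (payload_area, seq) if the block is self-consistent, else None."""
    if len(block) != BLOCK_SIZE:
        return None
    body = block[:PAYLOAD_SIZE]
    seq, crc = TRAILER.unpack(block[PAYLOAD_SIZE:])
    if zlib.crc32(body + struct.pack("<I", seq)) != crc:
        return None
    return body, seq


# A block that is half new data and half old data fails the check.
old = seal_block(b"old record", 1)
new = seal_block(b"new record", 2)
torn = new[:100] + old[100:]
print(check_block(new) is not None)   # intact block validates
print(check_block(torn) is None)      # torn block is rejected
```

A check like this is independent of the drive's own ECC, so it can also
flag data that was written in proper format but was already garbage when
it left memory.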

In the event of a hardware failure such as a power failure during
a write, I think it's pretty much undefined and depends a lot on the 
hardware in question and timing particulars of the incident.

A formatted sector is made up of read-only, writable, and gap regions.
If the power went out while the disk head was over the read-only
or gap regions, the write would probably terminate successfully.
If the power went out during the writable region, you would probably
end up with a bad sector that returned hard ECC errors when read.
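
If an interrupted write does leave a sector that returns hard ECC
errors, recovery code can treat a failed read as evidence of a torn
write. The sketch below assumes that the driver reports such an error
to the caller as EIO, which is typical on Unix but should be confirmed
for the driver in question. The 512-byte block size and the name
read_block are hypothetical, and the pread argument exists only so the
error path can be exercised without a failing disk.

```python
import errno
import os

BLOCK_SIZE = 512  # assumed block size (hypothetical)


def read_block(fd: int, blockno: int, pread=os.pread):
    """Classify a block read as "ok", "bad" (hard read error), or "short"."""
    try:
        data = pread(fd, BLOCK_SIZE, blockno * BLOCK_SIZE)
    except OSError as exc:
        # Assumption: an uncorrectable ECC error surfaces as EIO.
        if exc.errno == errno.EIO:
            return "bad", None
        raise
    if len(data) != BLOCK_SIZE:
        return "short", data
    return "ok", data


def failing_pread(fd, nbytes, offset):
    """Stand-in for a device read that hits an unreadable sector."""
    raise OSError(errno.EIO, "simulated hard ECC error")


print(read_block(0, 0, pread=failing_pread)[0])
```

A "bad" result says only that the sector is unreadable; it does not say
whether the old or the new contents were intended to be there.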

I believe that most controllers are wired such that if they lose
power, all disk operations are immediately disabled since the disks
may still be powered.  You should check the technical manuals for
your controller and disk drive.


-jonathan hahn

rpw3@redwood.UUCP (Rob Warnock) (07/17/85)

Just how bad can a power failure be? Well, how about wiping out
the formatting (and therefore the data, to say the least) on several
(even "many") cylinders on the disk? (Under Unix, might as well just
reformat and hope your backup tapes are healthy!)

This can happen even if the disk drive has power-fail protection,
if the drive is in an "expansion box" and powered by a separate
power supply. As the power to the disk controller (in the main box)
goes down, so does the power to the drive cable terminating resistors
(which are normally pulled to +5 volts). This can, if you are unlucky,
cause the "WRITE ENABLE L" signal to drop below the TTL threshold and
start writing on the disk BEFORE the power to the disk (in the expansion
box) drops enough to shut off the write amps in the disk. It all depends
on the relative "hold-up" time of the two power supplies in the two boxes.
Conversely, if you have TWO disks on the same controller sharing a bussed
"control" cable, if the expansion box power drops first, you can wipe the
data & formatting on the disk in the main box.

One of the nice things about large DEC systems is that they have a power-
fail line which is bussed between all of the expansion boxes (if the
installer hooked them up correctly) and which causes ALL the boxes to
panic and protect themselves if ANY of the boxes loses power. (Of course,
if you are having troubles with power supplies, this bussed line makes it
hard to figure out which box is causing the problem, so sometimes it gets
unhooked for debugging and never gets put back...)

It is possible (and not too expensive) to protect disks fairly well from
this sort of thing, but a lot of the current low-cost "desktop" computers
don't bother. (*sigh*)


Rob Warnock
Systems Architecture Consultant

UUCP:	{ihnp4,ucbvax!dual}!fortune!redwood!rpw3
DDD:	(415)572-2607
USPS:	510 Trinidad Lane, Foster City, CA  94404