[comp.sys.mac.programmer] Error Handling and Recovery

rs4u+@andrew.cmu.edu (Richard Siegel) (04/04/88)

Has anyone come up with a bulletproof error recovery scheme, to recover
from things like disk and memory errors?

How does Mac App do it?

Rich

lsr@Apple.COM (Larry Rosenstein) (04/12/88)

In article <Added.MWJebfy00Vs18Fqk8u@andrew.cmu.edu> rs4u+@andrew.cmu.edu (Richard Siegel) writes:
>
>Has anyone come up with a bulletproof error recovery scheme, to recover
>from things like disk and memory errors?

I think MacApp has an error recovery scheme that can be used to write
bulletproof programs, but it requires some work on the part of the
programmer.  There are many aspects to the error handling in MacApp, so I
will just outline them.

First, we implemented an exception handling mechanism in Pascal.  The
programming model is that there is a stack of exception handlers.  A call to
the routine CatchFailures pushes a handler on the stack.  A call to Success
pops a handler off the stack.  Each CatchFailures call and its matching
Success call have to be nested within the same procedure.

If an exception occurs, then the program can call Failure.  This pops the top
handler off the stack, restores the necessary registers, and calls the
handler.  If the handler returns normally, Failure is called automatically,
further unwinding the stack.  The handler can also do a non-local GOTO if it
wants to continue processing.

The Failure call takes 2 parameters: a 2-byte error code and a 4-byte
message.  These values are passed to the exception handler.  (This gives a
3rd alternative for the handler: it can call Failure and pass different
parameters.)
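
The handler stack described above can be modelled in a few lines.  This is
only a sketch, written in Python rather than MacApp's Object Pascal: the
names run_protected, Resume, and FailureSignal are hypothetical, Python's own
exceptions stand in for the register restore, and returning a Resume value
stands in for the non-local GOTO.

```python
# Illustrative model only: names mirror the post, bodies are assumptions.

class FailureSignal(Exception):
    """Carries the error code and message from failure() to the handlers."""

    def __init__(self, error, message):
        super().__init__(error, message)
        self.error = error
        self.message = message


class Resume:
    """Returned by a handler that wants to continue (stands in for the GOTO)."""

    def __init__(self, value=None):
        self.value = value


handler_stack = []  # installed handlers, innermost last


def failure(error, message):
    """Signal a failure; the innermost installed handler sees it first."""
    raise FailureSignal(error, message)


def run_protected(body, handler):
    """Run body() with handler installed, as between CatchFailures and Success."""
    handler_stack.append(handler)            # CatchFailures pushes
    depth = len(handler_stack)
    try:
        result = body()
    except FailureSignal as sig:
        del handler_stack[depth - 1:]        # Failure pops the top handler
        outcome = handler(sig.error, sig.message)
        if isinstance(outcome, Resume):      # handler chose to continue
            return outcome.value
        raise                                # normal return: keep unwinding
    handler_stack.pop()                      # Success pops
    return result


# Usage: the inner handler cleans up and returns, the outer one recovers.
log = []


def inner_handler(error, message):
    log.append(("inner", error, message))    # returns normally


def outer_handler(error, message):
    log.append(("outer", error, message))
    return Resume("recovered")


def inner_work():
    return run_protected(lambda: failure(-108, 42), inner_handler)


result = run_protected(inner_work, outer_handler)
```

A handler that calls failure() itself with new values covers the third
alternative: the outer handlers then see the replaced code and message.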

In general, the message is used to indicate what operation failed, and the
error code to indicate why it failed.  The main event loop of MacApp
contains a handler that looks these values up in a set of tables, displays
an appropriate alert, and continues handling events.

A typical error alert in MacApp might read "Could not save 'Foo' because the
file is locked. Use the Get Info command in the Finder to unlock the file."

The "save 'Foo'" part describes what failed, and is based on the message
value.  The "the file is locked" part is derived directly from the error
code, as is the last sentence.
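
As a rough illustration of that lookup, here is a Python sketch (not MacApp
code).  The table contents, the message value 1, and the fallback wording are
all hypothetical; only the sample alert text and the classic Mac OS code -45
(fLckdErr, file is locked) come from outside the sketch.

```python
# Hypothetical tables: the real MacApp tables are not shown in the post.
OPERATIONS = {
    1: "save '{name}'",               # keyed by the 4-byte message value
}
REASONS = {
    -45: ("the file is locked",       # keyed by the 2-byte error code
          "Use the Get Info command in the Finder to unlock the file."),
}
GENERIC_OPERATION = "complete the operation"      # assumed fallback wording
GENERIC_REASON = ("an error occurred", "")        # assumed fallback wording


def error_alert_text(error, message, name):
    """Build the alert text: the message says what failed, the error says why."""
    operation = OPERATIONS.get(message, GENERIC_OPERATION).format(name=name)
    reason, advice = REASONS.get(error, GENERIC_REASON)
    text = f"Could not {operation} because {reason}."
    if advice:
        text += " " + advice
    return text


alert = error_alert_text(-45, 1, "Foo")
```

Error codes missing from the table fall through to the generic wording,
which mirrors the way unreported codes are folded into a generic message.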

We use a generic alert for all error messages, and substitute in the
strings.  This allows us to provide very detailed messages, instead of just
"Could not save the file."  

From the programmer's point of view, all s/he has to do is check for errors.
There are not very many error codes that make sense to report (others are
converted into a generic message), and MacApp generally takes care of
providing the right message values.

In order to encourage programmers to check for errors we supply utilities.
For example FailNIL takes a pointer or handle and calls Failure(memFullErr,
0) if the parameter is NIL.  Similarly, FailOSErr takes an OSErr parameter
and calls Failure if it is non-0.

This leads to code such as: x := NewHandle(...); FailNIL(x); or
FailOSErr(FSWrite(...));  After a while you get in the habit of checking for
errors.
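
For readers without MacApp at hand, the two utilities might be modelled like
this.  It is a Python sketch under assumptions: None stands in for NIL,
new_handle is a made-up allocator, and passing 0 as the message from
fail_os_err is a guess, since only FailNIL's call is spelled out above.

```python
MEM_FULL_ERR = -108   # memFullErr in the classic Mac OS headers
NO_ERR = 0            # noErr


class FailureSignal(Exception):
    """Stand-in for the unwinding that Failure performs."""

    def __init__(self, error, message):
        super().__init__(error, message)
        self.error = error
        self.message = message


def failure(error, message):
    raise FailureSignal(error, message)


def fail_nil(pointer_or_handle):
    """Signal memFullErr if an allocation returned NIL (None here)."""
    if pointer_or_handle is None:
        failure(MEM_FULL_ERR, 0)


def fail_os_err(err):
    """Signal a failure for any nonzero OS result code."""
    if err != NO_ERR:
        failure(err, 0)   # message 0 is an assumption; the post does not say


def new_handle(size, free_bytes):
    """Hypothetical allocator: returns None when the request cannot be met."""
    return bytearray(size) if size <= free_bytes else None


x = new_handle(16, free_bytes=1024)
fail_nil(x)               # passes silently; x is usable from here on
```

Because each check is a single call, it can sit on the same line as the
operation it guards, which is what makes the habit cheap to keep.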

A recent article of mine talked about MacApp's memory management strategy.
Basically, we try to reserve enough memory in advance to ensure that the
program will be able to function, save documents, and quit without bombing.
This requires the programmer to tell MacApp the amount of memory to reserve.

For disk errors, MacApp by default saves a document to a temporary file, and
only deletes the original if the save succeeds.  This requires the
programmer to tell MacApp the number of bytes needed to save the file.
MacApp will check to see if there is enough disk space, create the temp
file, tell the program to save into it, and rename the temp file to the
correct name.  It also copies the Finder information from the old file so
that the icon doesn't move, etc.
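
The same save-to-temp-then-rename pattern can be sketched on a modern system.
This is a Python illustration, not MacApp's code: safe_save and
write_contents are hypothetical names, the disk space check is left out, and
copying the permission bits is only a loose stand-in for copying the Finder
information.

```python
import os
import shutil
import tempfile


def safe_save(path, write_contents):
    """Write to a temp file beside path; replace the original only on success."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as temp_file:
            write_contents(temp_file)          # the program saves into the temp
        if os.path.exists(path):
            shutil.copymode(path, temp_path)   # carry over the old file's mode
        os.replace(temp_path, path)            # give the temp the correct name
    except BaseException:
        # The save failed: discard the temp file and leave the original alone.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

If write_contents raises part way through, the original file is untouched
and the partial temp file is removed before the error propagates.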

If there is not enough disk space to make a copy, then MacApp itself will
call Failure, and the user will be told that s/he is out of disk space.
There is one interesting situation.  There might not be enough disk space to
make a copy, but if the original was deleted first, there would be.  In that
situation, MacApp puts up an alert asking if it is OK to delete the original
file first.  
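
The three outcomes can be written as a small decision function.  This is a
Python sketch with hypothetical names and return values; it assumes that
deleting the original frees exactly original_size bytes, which ignores
allocation-block rounding.

```python
def plan_save(bytes_needed, free_bytes, original_size):
    """Decide how a save can proceed given the space on the volume."""
    if bytes_needed <= free_bytes:
        return "save_copy"                    # room for temp file and original
    if bytes_needed <= free_bytes + original_size:
        return "ask_delete_original_first"    # only fits once the original goes
    return "fail_disk_full"                   # call Failure: out of disk space


# Example: 50K free, 40K original, new version needs 70K.
decision = plan_save(70_000, 50_000, 40_000)
```

On a current system the free_bytes figure could come from
shutil.disk_usage(path).free.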

MacApp also checks for other cases that happen in an environment with file
servers.  For example, before saving a file we check to see if the
modification date is the same as when we read the file.  If not, someone
else may be using the file, and the user is given a message.  
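
A minimal version of the modification date check, in Python rather than
Pascal; the Document class and its method names are hypothetical, and a real
program would decide for itself what message to show.

```python
import os


class Document:
    """Remembers a file's modification time from when it was last read."""

    def __init__(self, path):
        self.path = path
        self.mtime_when_read = None

    def read(self):
        # Record the date before reading so a later change is not missed.
        self.mtime_when_read = os.stat(self.path).st_mtime_ns
        with open(self.path, "rb") as f:
            return f.read()

    def changed_on_disk(self):
        """True if the file was modified since read(), e.g. by another user."""
        return os.stat(self.path).st_mtime_ns != self.mtime_when_read
```

Before saving, the program would call changed_on_disk() and warn the user
instead of silently overwriting someone else's changes.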

If the user tries to revert and the file doesn't exist or the file type has
changed, s/he also gets a message, and the revert is aborted.

We tested the error handling in MacApp by adding all the necessary checks to
2 sample programs, and testing them as if they were real products.  We
didn't ship MacApp until these sample programs could pass this stress test. 

Those are some of the highlights; if there are any questions, let me know.

-- 
		 Larry Rosenstein,  Object Specialist
 Apple Computer, Inc.  20525 Mariani Ave, MS 27-AJ  Cupertino, CA 95014
	    AppleLink:Rosenstein1    domain:lsr@Apple.COM
		UUCP:{sun,voder,nsc,decwrl}!apple!lsr