[comp.software-eng] program correctness

hughes@math.Berkeley.EDU (eric hughes) (02/15/89)

I am sure I will ramble more than Dave, so ...

In article <1859@goofy.megatest.UUCP>, djones@megatest (Dave Jones) writes:
>So one has a choice: write the invariants in terms of the computer
>model, in which case you have tacitly agreed that the model is correct --
>or write the invariants in terms of the problem, and somehow "teach"
>the computer about those terms, in which case you tacitly agree on
>a model.  Hmmm...

The notion of program correctness is slightly different from the
correctness of a mathematical theorem.  The basic difference is that
programs are meant to be run on physical hardware, which is based on
IC's, which are based on physics, which is based on observation.  Each
of these layers pushes the question of truthfulness back a bit, and
finally back to our observations of physical reality.  Now we can
prove and prove our software and hardware all we want, but in the end
we rely on reality to prove, i.e. put to trial, our software.
In mathematics, however, we don't presume that the math per se is
less valuable because it rests on axioms that seem at once obvious
and arbitrary.

Thus there is no way around agreeing that "the model is correct."
Program correctness can only be used to increase the confidence that
the software will work.

The same problem exists in trying to teach the computer about the
problem.  Here you have substituted one model for another.  You
just can't get away from it.  But that's OK, because no one else can
either.  Just remember:

     Epistemology is the NP-complete problem of philosophy.

You heard it here first. :-)

>And there's another problem: The assertions often have universal 
>quantifiers and such that can not reasonably be calculated at runtime.

>One can perhaps program around some of this by programming a basic
>understanding of mathematical induction into the machine.  But when you 
>start to attempt to do proofs about all possible histories of events and
>such, things seem to get a little sticky.

This brings up what I consider to be *the* main hurdle for program
correctness right now: lack of definitions.  A mathematician does not
remember that a number has no other divisors than itself and one, but
rather remembers that the number is prime.  Prime numbers have emergent
properties of their own.  Now the definition of prime is a Pi_1 sentence,
meaning (roughly) that it uses a single universal quantifier.  But in
order to deal with this definition, there are all sorts of lemmas and
little theorems about prime numbers that permit one never to see the
quantifier in actual use.
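To make the encapsulation concrete, here is a hypothetical C sketch
(not from the original discussion): the Pi_1 definition of primality
quantifies over all divisors, but a program, like a mathematician,
can hide the quantifier behind one named predicate and reason with
the name alone.

```c
#include <stdbool.h>

/* The Pi_1 definition -- "for all d, if d divides n then d == 1 or
   d == n" -- hidden behind a single named predicate.  Callers reason
   with is_prime(n) and never see the quantifier itself.  The bounded
   loop below is the computational stand-in for "for all d". */
bool is_prime(unsigned n)
{
    if (n < 2)
        return false;
    for (unsigned d = 2; d * d <= n; d++)
        if (n % d == 0)
            return false;
    return true;
}
```

Lemmas about primes play the same role for the mathematician that
this function name plays for the caller: the messy quantifier is
written down once and then never looked at again.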

The effect of this is similar to data hiding.  The quantifier, which
truly could be a very messy thing to work with, has been encapsulated
by a range of other known facts about it.  In practice, this
movement of the complexity to the proximity of the definition makes
math possible.  The concepts, though complex, are easy for the mind
to process -- much easier than the raw predicates, whose size
otherwise grows very fast.

The point is you should never need to reason about "all possible
histories."  Upon choosing good definitions and proving useful
lemmas, such histories can usually be bound up into a single
property, which may be hard to discover (it generally is), but
which is direct and flexible to work with.
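A hypothetical C sketch of this idea: instead of reasoning about
every possible history of push() and pop() calls on a stack, we
maintain one invariant (0 <= top <= CAP) and check it at each
operation's boundary.  Any state satisfying the invariant stands in
for all the histories that could have produced it.

```c
#include <assert.h>

#define CAP 16

static int stack[CAP];
static int top = 0;             /* invariant: 0 <= top <= CAP */

/* The single property that replaces "all possible histories". */
static void check_invariant(void) { assert(0 <= top && top <= CAP); }

int push(int v)                 /* returns 0 on success, -1 if full */
{
    check_invariant();
    if (top == CAP)
        return -1;
    stack[top++] = v;
    check_invariant();
    return 0;
}

int pop(int *out)               /* returns 0 on success, -1 if empty */
{
    check_invariant();
    if (top == 0)
        return -1;
    *out = stack[--top];
    check_invariant();
    return 0;
}
```

Finding the right invariant is the hard part; once found, it is far
easier to carry through a proof than a quantifier over histories.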

Here is a real-life example to illustrate the amazing lack of such
understanding.  I was writing an allocator to function on top of
malloc(), and I couldn't for the life of me figure out how the
postcondition for malloc() could be expressed.  We all know what
it is, that the block returned is owned by the caller, and no
one else can use it.  Strictly speaking this is not true, since
unless each block is individually memory protected, a stray
pointer can "unlawfully" use the block.  I struggled with
this for a long time, and finally decided just to define the
property "vacant" for the blocks that malloc() returned.
I wrote down some of its characteristics and left it at that.
So, is my program correct?  Well, I documented a precondition
that "nothing else writes to the blocks malloc() gives me" and
knowing that my subroutines handled the blocks correctly, I
assume so.  But that kind of precondition is unnerving.
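The "vacant" property itself cannot be asserted at runtime, but a
weak observable stand-in can be: a block the caller owns should at
least be fully writable.  This is a hypothetical sketch of such a
wrapper, not the allocator from the article.

```c
#include <stdlib.h>
#include <string.h>

/* Sketch of a checkable fragment of malloc()'s postcondition:
   my_alloc() returns either NULL or a block of at least `size`
   bytes that the caller may write.  We cannot express "no one else
   uses it" -- that is the "vacant" property -- but we can exercise
   the whole block once, which at least catches blocks that are too
   small or unmapped. */
void *my_alloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        memset(p, 0xAA, size);  /* write every byte of the block */
    return p;
}
```

The gap between what we mean ("vacant") and what we can check (the
block is writable now) is exactly the gap the precondition papers
over.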

But I was left troubled that something seemingly so basic should be
so little understood explicitly, rather than merely by intuition.

Does anybody know if the ANSI library specifications include
preconditions and postconditions?  They sure should.

Eric Hughes
hughes@math.berkeley.edu   ucbvax!math!hughes