[comp.ai] Re^2: Reasons why you don't prove your programs are correct

kempf@tci.bell-atl.com (Cory Kempf) (01/11/90)

bzs@world.std.com (Barry Shein) writes:

>Yes, the flap in the early 80's was over automatic program
>verification. It seemed that DOD was considering putting $150M into
>this so the debate became quite bloody.

I should think that the idea would be silly on the face of it... after
all, wouldn't it need to have a subroutine that could solve the
halting problem?  Last time I looked, it was still considered
impossible.  Maybe on the new Cray IV??*

+C

* Have you heard about the new Cray IV?  It can solve an infinate loop
in just 7 seconds!
-- 
Cory Kempf		Technology Concepts	     phone: (508) 443-7311 x341
uucp:	{anywhere}!uunet!tci!kempf, kempf@tci.bell-atl.com
DISCLAIMER: TCI is not responsible for my opinions, nor I for theirs

simpson@saturn.ind.trw.com (Scott Simpson) (01/12/90)

In article <548@tci.bell-atl.com> kempf@tci.bell-atl.com (Cory Kempf) writes:
>* Have you heard about the new Cray IV?  It can solve an infinate loop
>in just 7 seconds!

Yah, and it takes six halt instructions just to get it to stop!
Scott Simpson    TRW Information Networks Division    simpson@trwind.trw.com

bzs@world.std.com (Barry Shein) (01/12/90)

>I should think that the idea would be silly on the face of it... after
>all, wouldn't it need to have a subroutine that could solve the
>halting problem?  Last time I looked, it was still considered
>impossible.  Maybe on the new Cray IV??*
>
>+C

The halting problem states (very roughly) "There exists at least one
program who's halting conditions cannot be determined".

People don't often write that program*, the fact that there was one
(or a few) theoretically possible counter-example would hardly be
sufficient to invalidate the concept of automatic program
verification.

Computer science education makes far too much of this interesting
observation and seems to leave people forever without hope that any
automatic examination of a loop can be fruitful (e.g. in a debugging
compiler.)

* People do write infinite loops frequently, but those are usually
easy to detect automatically with some inequality manipulations.
-- 
        -Barry Shein

Software Tool & Die, Purveyors to the Trade         | bzs@world.std.com
1330 Beacon St, Brookline, MA 02146, (617) 739-0202 | {xylogics,uunet}world!bzs

kers@hplb.hpl.hp.com (Kers) (01/12/90)

joshua@athertn.Atherton.COM (Flame Bait) writes:

|   "Proving your programs are correct" seems to mean different things
|   to different people:

|   One group thinks it means that your program does exactly what your
|   spec says it should.

   ...

|   The first sort of proof is obviously useless, the spec becomes the
|   program, and bugs in the spec become bugs in the program.  Also,
|   specifing most programs is impossible; and if you could, then you
|   could write a compiler for the specification language.

I've seen this kind of remark several times in this multi-group discussion.
*It just isn't true*. Just because you can write a specification for a program
doesn't mean you can write a compiler for the specification language which
generates code, let alone acceptably efficient code.

Suppose I specify an square root function "sqrt" by:

    abs( sqrt( x ) ** 2 - x ) < 0.01

[Er - I'm not a numerical analyst; this fixed bound of 0.01 looks very suspect
to me, and I suspect that it's not the right kind of specification. But this
is a different issue.]

I claim this is a good specification (modulo the [...]) of the behaviour of 
"sqrt". Now tell me how you're going to generate code for sqrt from it, and
convince me that the code will be efficient, and that you have a *general*
mechanism that will do this for examples further removed from the Opening Book
of moves in the Specification Game.

I'll also think that the remark "specifying most programs is impossible" is
wrong. "bloody difficult, especially after the fact" is the predicate that
springs to mind.

Regards,
Kers.
--

Regards, Kers.
"If anything anyone lacks, they'll find it all ready in stacks."

ian@oravax.UUCP (Ian Sutherland) (01/13/90)

bzs@world.std.com (Barry Shein) writes:

>Yes, the flap in the early 80's was over automatic program
>verification. It seemed that DOD was considering putting $150M into
>this so the debate became quite bloody.

In this discussion, does the term "automatic program verification"
mean "FULLY automatic", that is, you just hand it the program and the
specification and it says "yes" or "no", or are we talking about
USER-GUIDED verification?

>I should think that the idea would be silly on the face of it... after
>all, wouldn't it need to have a subroutine that could solve the
>halting problem?

It depends on whether your specification language allows you to state
that the program halts, and it depends on whether your verifier is (in
the above sense) fully automatic.  If it does, and it is
(respectively) then there will be some programs that it gives an
incorrect answer for.  If the verifier is also allowed to answer "I
don't know", then it can always answer correctly but if so, must
sometimes answer "I don't know".
-- 
Ian Sutherland		ian%oravax.uucp@cu-arpa.cs.cornell.edu

Sans Peur

ian@oravax.UUCP (Ian Sutherland) (01/13/90)

In article <1990Jan12.030424.1397@world.std.com> oravax!cornell!batcomputer!rpi!zaphod.mps.ohio-state.edu!swrinde!cs.utexas.edu!yale!mintaka!bloom-beacon!bu.edu!bu-cs!xylogics!world!bzs bzs@world.std.com (Barry Shein) writes:
>The halting problem states (very roughly) "There exists at least one
>program who's halting conditions cannot be determined".

This a little TOO rough for me.  To speak roughly, but a little less
roughly than above, the unsolvability of the halting problem states
that there is no recursive procedure which will take a program P and an
input I and tell you whether P halts if started on I.  For a given
recursive procedure, there is always some P and some I such that the
given recursive procedure gives the wrong answer.  The particular P
and I depend on the recursive procedure, they are not fixed for all
time.

>Computer science education makes far too much of this interesting
>observation and seems to leave people forever without hope that any
>automatic examination of a loop can be fruitful (e.g. in a debugging
>compiler.)

Yes, I agree wholeheartedly.  In fact, I think the recursive
unsolvability of the halting problem has little to do with the
practical concerns of computing or verification.
-- 
Ian Sutherland		ian%oravax.uucp@cu-arpa.cs.cornell.edu

Sans Peur

kers@hplb.hpl.hp.com (Kers) (01/16/90)

joshua@athertn.Atherton.COM (Flame Bait) writes:

| kers@hplb.hpl.hp.com (Kers) [that's me] writes:
|  >I'll also think that the remark "specifying most programs is impossible" is
|   >wrong. "bloody difficult, especially after the fact" is the predicate that
|   >springs to mind.
|
|   No problem.  Specifiy the following programs, in a machine readable
|   format useful to a program verifier.  These should be commerical products,
|   so they must look good and feel good so that people want to use them:

Let me admit that I don't have a verifier to hand, so the machine-readability
of any specification I might supply is a little beside the point. (There are
any number of specification languages one might use, with varying degrees of
accessibilility and surface-level checking.)

Let me also admit that "look good" and "feel good" are things I have no idea
how to specify. Can we agree that much of the program behaviour is independent
of the look-and-feel?

[Lest this should seem a cop-out, there's no reason why one shouldn't specify
the "look" in a formal fashion. It's just that "looking good" would have to be
a property that was asserted about particular - eg - screen layouts; it can't 
be derived from mathematics. I just happen to think that this isn't a
particularly important problem *for verification* - it's part of the harder
problem of whether the specification is *appropriate* to the problem at hand.
Discussion encouraged.]

|       A rule based expert system shell.
|	   Remember to specify input and output for an window based system.

What makes the I/O for a window system harder than for a terminal?

|       A source code control facility like SCCS or RCS
|	   If you use system calls you must have some way of proving that
|	   they do what your spec says they do.

No, *I* don't.

I hope *someone* has; one day I'd like there to be a formal proof that they
*do* do what they're supposed to. The behaviour of system calls is given, like
the behaviour of "+"; the correctness of the program is relative to the
correctness of the system calls.

In fact, I can state the correctness of the program in just those relative
terms; *if* these system calls S1 ... Sn have meanings M1 ... Mn, then the
program is correct (has the specified behaviour). You can then examine my
assumptions about the system and decide whether they are justified.

|       <Put most any real world program here>
|
|   My last example points out a current problem with proving program correct.
|   They must only use libraries which have been proven correct, must run on 
|   an operating system which has been proven correct, and must use hardware
|   which is proven correct.  All of this is impossible right now.

True, but ... it depends what you're trying to accomplish. I agree that if we
demand complete proofs - in the sense that the proof extends all the way to
the bedrock - we can't do it [ever - see [*1]]. If we're happy to prove things
relative to some set of base assumptions (system calls behave this way, and if
they don't it's the system call that's broken, not the program) we can practice
separation of concerns.

|   I'm serious about this challenge.  I hold that these program can not
|   be specified in any kind of useful way, and that the set of programs 

How much specification is enough? How little would help? In the case of the
expert system shell, surely a specification of the inference engine (and a
proof that the code satisfied it) would be useful, even if the I/O was on 
shaky ground. After all, if the system misbehaved, you could concentrate your
efforts away from the engine.

|   which can be proven correct is so small as to be uninteresting.

Before I disagree any more I think we ought to decide just what limits we're
prepared (or not) to place on specifications and proofs. For example, do you
think that specifying a "real world program" is *in principle* impossible; if
so, why? Or just that it cannot be done *now*? Do you think that working on
verification is a complete waste of time, or just that some of its proponents
are hypemasters? Would proofs of components strike you as an improvement over
no proofs at all, or do you want the whole cat in the bag?

-- Footnoties follow ----------------------------------------------------------

[*1] You can't prove hardware correct; its behaviour is physics. You can only
prove it relative to your assumptions about the physics of electrons in
semiconductors (or whatever). 

Similarly, you can't prove that an expert system is "correct", if by that you
mean that it gives genuinely "expert" answers. But you could show that it will
give answers that are derivable (in whatever sense is appropriate) from its
rulebase. Proving that the rulebase has some correspondance with reality is
*not* program verification.







--

Regards, Kers.
"If anything anyone lacks, they'll find it all ready in stacks."