[net.cse] correctness

ladkin@kestrel.ARPA (Peter Ladkin) (03/21/86)

Some comments have recently been made in this group to the 
effect that it's not feasible to *prove* correctness of a
term project program.
Prima facie, this is false. If the program is specified in
terms of its exact output on a small finite number of test
cases, then the program may be *proved* correct by exhaustive
testing. It is indeed the case that an assumption has to be
made about the environment (e.g. the Pascal compiler does
not generate code that will side-effect any processes other
than I/O and memory management,......) but it is standard
to make this assumption for serial processes.
Of course, this might not be an *interesting* program
specification in many cases, but it does have pedagogical
value.
One usually wants to make some extra conditions on the 
program, such as, it must be generalisable (which in this
case would ensure that the program didn't just case-test-
and-print). But this is not a question of correctness
from this specification. It would be in others (*parse
all Pascal programs...*)

Peter Ladkin

deller@vrdxhq.UUCP (Steven Deller) (03/24/86)

In article <6042@kestrel.ARPA>, ladkin@kestrel.ARPA (Peter Ladkin) writes:
> Some comments have recently been made in this group to the 
> effect that it's not feasible to *prove* correctness of a
> term project program.

Being one who stated that it was not feasible to ask for correctness
proofs in my class, I feel compelled to reply.

I stated it was not feasible to ask for proof of correctness only in the 
course I am teaching currently.  This is primarily because the students do not 
have the necessary background courses to perform program proving, and my 
course is supposed to teach Ada programming, not program proving (it is 
debatable whether any programming should be taught before students understand 
proofs of correctness, but that is another issue).

> Prima facie, this is false. If the program is specified in
> terms of its exact output on a small finite number of test
> cases, then the program may be *proved* correct by exhaustive
> testing. 

Exhaustive testing is not a proof of correctness -- the two are equivalent 
only to the extent that a proof of correctness contains no errors and the 
specification being proved properly captures the intention of the program.  
Exhaustive testing on the other hand is a more solid foundation for trust 
in a program, because the set of input/output pairs captures the intended 
program action (or else there is no basis for the adjective "exhaustive").  
If exhaustive testing is performed by a program that generates and checks 
the data, then it too is subject to errors in performance.  

The inescapable conclusion is that exhaustive testing and program proving are
not equivalent.  If forced, I would place my money on exhaustive testing, but
I would much rather have a program proof AND some testing.  Exhaustive testing
is the most desirable (in terms of confidence of correctness of the program),
but there are other high-confidence test methods (such as path testing, branch 
testing, or domain testing) that are applicable to most real programs.
Exhaustive testing is applicable to only a few programs since any program that 
inputs even two 32-bit integers will not permit exhaustive testing (it would 
take over 1000 years to test, even if each test were run in 1 microsecond).

> One usually wants to make some extra conditions on the 
> program, such as, it must be generalisable (which in this
> case would ensure that the program didn't just case-test-
> and-print). But this is not a question of correctness
> from this specification. It would be in others (*parse
> all Pascal programs...*)

I believe this is counter to all good programming methods.   What Peter is
advocating is to "call" the test set the complete program specification, 
while in actuality he envisions some criteria of "generalisability", which 
is intended so that the program will work for other inputs as well, i.e. this
generalisability is intended to make the student write a program for the 
"intended" specification.

I think it is just a case of calling a tail a leg (the old Twain joke:
"If you call a tail a leg, how many legs does a dog have.  Four.  Calling
a tail a leg doesn't make it one.").  Calling the test set the complete
specification doesn't make it so if other "extra conditions" make it not so.

It would be much better to teach the students to work with and prove the
intended specification.  It is only with regret that I reached the decision
that it is not feasible within the course I am teaching.

-- 
Steven Deller      {verdix,seismo,umcp-cs}!vrdxhq!deller
Verdix Corporation (the Ada makers)   (the usual "solely my own opinions"...)