[comp.software-eng] Some Issues Lying Beneath a Recent Exchange of Views

phil@umn-d-ub.D.UMN.EDU (Philosophy Dept) (02/27/90)

hal and others have observed that my exchange with Charles Martin has
become been something less than a model of academic discourse.  I want
to make an attempt to salvage something worthwhile from this "debate"
that might possibly shed some light on the real issues that lie below
the surface here.  In his article 2961, for example, Martin acknowl-
edges the importance of the distinction between pure and applied math-
ematics.  He also endorses the difference between absolute and relative
forms of support (proofs, demonstrations, verifications, what have you).
Whether or not he identifies the sources of these distictions relative
to this debate is less important than that these distinctions be ack-
nowledged as relevant--even fundamental--to the questions considered.

Much of the distress in the program verification community seems to me
to have arisen from a failure to adequately distinguish between vari-
ous different positions that are possible here.  I freely admit that I
was less concerned with drawing distinctions between various positions
at the time than I was to assess those that concerned me.  Let me there-
fore attempt to make a modest contribution toward that end by specify-
ing several different positions that no doubt deserve to be separated,
where some concern logic, others methodology, and others verification:

Positions of LOGIC:

(T1) Formal proofs of program correctness can provide an absolute, con-
clusive guarantee of program performance;

(T2) Formal proofs of program correctness can provide only relative, in-
conclusive evidence concerning program performance.

Positions of METHODOLOGY:

(T3) Formal proofs of program correctness should be the exclusive method-
ology for assessing software reliability;

(T4) Formal proofs of program correcness should be the principal method-
ology for assessing software reliability;

(T5) Formal proofs of program correctness should be one among various
methodologies for assessing software reliability.

Positions on VERIFICATION:

(T6) Program verifications always require formal proofs of correctness;

(T7) Program verifications always require proof sketches of correctness;

(T8) Program verifications always require the use of deductive reasoning.

My original paper was devoted to the distinction between (T1) and (T2) as
matters of logic.  I contend that (T1) is false but that (T2) is true.  In
relation to questions of methodology, I also maintain that (T3) is false,
and at the ACM CSC 90 debate, I argued further that (T4) is also false.  I
have no doubt, however, that (T5) is true.  With respect to verificaiton in
the broad sense, it is my view that (T6) is clearly false and that (T8) is
clearly true, but that the truth-value of (T7) is subject to debate.  Much
might hinge upon whether these proof sketches had to be written down, could
be merely thought-through, or whatever.  In the former case, (T7) becomes
close to (T6), in the latter case, closer to (T8).  However these matters
may finally be decided, these seem to be the principal distinctions which
need to be drawn to get at the issues that lie beneath the surface here.

Let me conclude with one further gesture of agreement with Charles Martin.
He acknowledges that there are important differences between accounting
procedures, for example, and life-critical situations.  On this we could
not more strongly agree.  The greater the seriousness of the consequences
that would ensue from making a mistake, the greater our obligation to in-
sure that it is not made.  This suggests that the role for prototyping and
testing increases dramatically as life-critical tasks come into play.  We
can afford to run a system that has only been formally assessed when the
consequences of mistakes are relatively minor (Nintendo games, for example),
but not when they are serious (radiation therapy, for example).  So it be-
comes a matter of how confident we need to be that a system will perform
as it is intended to perform.  We cannot know without testing the system.

James H. Fetzer

crm@romeo.cs.duke.edu (Charlie Martin) (02/27/90)

I just want to say that this summary Dr Fetzer has posted seems a really
good summary of the substance of the debate; I find I can disagree with
effectively none of it.

In article <3241@umn-d-ub.D.UMN.EDU> phil@umn-d-ub.D.UMN.EDU (Philosophy Dept) writes:
..Positions of LOGIC:
..
..(T1) Formal proofs of program correctness can provide an absolute, con-
..clusive guarantee of program performance;
..
..(T2) Formal proofs of program correctness can provide only relative, in-
..conclusive evidence concerning program performance.
..
..Positions of METHODOLOGY:
..
..(T3) Formal proofs of program correctness should be the exclusive method-
..ology for assessing software reliability;
..
..(T4) Formal proofs of program correcness should be the principal method-
..ology for assessing software reliability;
..
..(T5) Formal proofs of program correctness should be one among various
..methodologies for assessing software reliability.
..
..Positions on VERIFICATION:
..
..(T6) Program verifications always require formal proofs of correctness;
..
..(T7) Program verifications always require proof sketches of correctness;
..
..(T8) Program verifications always require the use of deductive reasoning.

..My original paper was devoted to the distinction between (T1) and (T2) as
..matters of logic.  I contend that (T1) is false but that (T2) is true.  

Agreed.

..In
..relation to questions of methodology, I also maintain that (T3) is false,
..and at the ACM CSC 90 debate, I argued further that (T4) is also false.  I
..have no doubt, however, that (T5) is true.  

Also agreed.
(I'd still like to know, someday, who came up with the thesis statement
you all started with.  I'd have hated to be Ardis and Gries.)

..With respect to verificaiton in
..the broad sense, it is my view that (T6) is clearly false and that (T8) is
..clearly true, but that the truth-value of (T7) is subject to debate.  Much
..might hinge upon whether these proof sketches had to be written down, could
..be merely thought-through, or whatever.  

Absolutely.  My use of hand-proof of programs is somewhere in the middle
of the T6..T8 spectrum, attempting to present a "literate program" that
is comparable with good mathematical prose in level of detail and depth
of argument.  One of the unfortunate things with most mechanical theorem
provers is that the level of proof needed is so much more complex
(deeper, less abstract) that it's kind of hard to follow the proofs at
all.  As I've said, I think this technique seems very promising, but more
experimentation is needed.  (Funding agencies please note!)

(One problem is the question of what "formal proof" means.  If I were
feeling disputatious (:-)) I'd argue that a hand proof checked by
several others *is* a formal proof.  But let's not start that debate
'till next week, okay?)

..
..Let me conclude with one further gesture of agreement with Charles Martin.
..He acknowledges that there are important differences between accounting
..procedures, for example, and life-critical situations.  On this we could
..not more strongly agree.  The greater the seriousness of the consequences
..that would ensue from making a mistake, the greater our obligation to in-
..sure that it is not made.  This suggests that the role for prototyping and
..testing increases dramatically as life-critical tasks come into play.  We
..can afford to run a system that has only been formally assessed when the
..consequences of mistakes are relatively minor (Nintendo games, for example),
..but not when they are serious (radiation therapy, for example).  So it be-
..comes a matter of how confident we need to be that a system will perform
..as it is intended to perform.  We cannot know without testing the system.

I'm with you here, in spades.  Especially since the thing that formal
methods usually give little insight into is the APPROPRIATENESS of the
specifications; does it REALLY do what we want?

For that reason, I question whether only formal assessment is EVER
appropriate.  The Nintendo game analogy is an apt one: while we can
formally get quite good assurance that our program realizes the program
for a game that was formally specified (formally specifying a game
program like trek might make an interesting exercise, at that... hmmm)

anyway, while we can get good assurance of the correspondence between
the formally-specified game and the realization, this formal assurance
tells us little about the "goodness" of the human-machine interface;
however, the user's satisfaction with the game relies heavily on this
very "goodness" as perceived.
Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)