[comp.ai] Semantics & Provability

markh@csd4.csd.uwm.edu (Mark William Hopkins) (01/16/90)

This response, which might be of general interest, has been forwarded to
this newsgroup (also the mail to the person indirectly mentioned in the letter
bounced).

* From lewis@godzilla.eecg.toronto.edu Mon Jan 15 08:09:55 1990
* From: david lewis <lewis@eecg.toronto.edu>
* To: markh@csd4.csd.uwm.edu
* Subject: Re: Program verification is a proven concept
* Date: 	Mon, 15 Jan 90 09:08:59 EST
* 
* You implied that there are algebraic rules for C. That would
* surprise me, due to the incredible aliasing problems. I would
* believe that you could express the semantics of C using denotational
* semantics, but that is not the easiest thing to manipulate. Can
* you point out any description of the semantics of C that can acutally
* be used?
* 
* For my own programs, I never bother to prove anything, since it is
* just too complex for any real program. I'm curious to hear a little
* bit more about your experience.
* 
* David Lewis

Date: Mon, 15 Jan 90 18:08:26 -0600
From: Mark William Hopkins <markh>
To: lewis@godzilla.eecg.toronto.edu, markh
Subject: Semantics & Proof

     A typical application, for me, would be to carry out an algebraic analysis
of a program or program segment either to verify it or to verify that some
assertion holds over it.  The end result can either be to prove that the
program does what it is supposed to do (i.e. demonstrate that the semantics
derived from the code matches the ever-vague "intent" -- with the latter
expressed in natural language or non-linguistically via diagrams), to
optimize the code -- by using algebraic rules to simplify code, or to
translate from one language to another.

     This letter is off the top of my head, so instead of trying to
deal with this in detail (especially with aliasing), I think it might be
more appropriate to forward you a response to another letter I received
(slightly edited).

     I'm assuming you're familiar with the lambda calculus, since it makes
quite a bit of reference to it.

------------------------------------------------------------
* markh@csd4.csd.uwm.edu (Mark William Hopkins) [14 Jan 90 02:21:43 GMT]:
* > Programs in even languages like C, Pascal can be manipulated via a set of
* > (complete) algebraic rules -- which form a foundation for the semantics
* > of the language in question.  I use such rules all the time.
* 
* I'm quite interested in this.  Do you have any examples online that
* you could send me (or do you use a pencil :-); can you send me any
* references?  Like so many others here, I am working in program semantics
* (CCS/lambda calculus), and am always interested in program proofs.

...
...

A simple rule in Pascal would look like this:

while B do S == if B then begin S; while B do S end

where B is any boolean expression, and S any statement, and "==" means
that both statements have precisely the same input-output semantics.

Another, more interesting, set of rules relating to pointers and to
structured types would allow you to prove that the Pascal sequence

A[I] := 3; B := A; B[I] := 2; writeln(A[I]:1)

will write out the value 3 (i.e. it is equivalent to the statement sequence
write("3"); writeln), and the C sequence

A[I] = 3; B = A; B[I] = 2; printf("%d\n", A[I]);

will write out the value 2 (it is equivalent to the statement printf("2\n");)

A key to the whole process is to directly translate the assignment statement
(involving simple variables) into the let statement of the extended
lambda calculus, and to translate assignments involving data structure
components or pointer references by using "updatable" functions.  For
example:

A[I] := 3; ...

would be translated into

let (A I) <- 3 in ...

and

P^ := 3; ...

as

let (VAL P) <- 3 in ... where VAL is a programmer-invisible updatable function
of type (^int -> int)

where

let (f a) <- b in E     means     let (f x) = if (= x a) b (f x) in E

or
	   (lambda f.E) (lambda x.if (= x a) b (f x))

Notice the difference between the results of

     let x = ... in
     ...
     let (f x) <- 3 in 
     let g = f in
     let (g x) <- 2 in (f x)
and
     let x = ... in
     ...
     let (x f) <- 3 in
     let g = f in
     let (x g) <- 2 in (x f)

One returns 3, and the other 2 (that is what underlies the difference between
the Pascal and C examples above).  The lambda calculus can model the
difference between "copying" and "sharing" (aliasing).

This correspondence between imperative and functional languages forms the
theoretical basis for the set of algebraic rules, but is not, itself, used as
a tool in the verification process.   The whole idea is to obviate the
need to have to bring in the lambda calculus by using a set of relatively
easy-to-use algebraic rules in its place.

Using algebraic rules alone, the above Pascal example would reduce like
this:
        A[I] := 3; B := A; B[I] := 2; writeln(A[I]:1)
==
	B := A{I <- 3}; B[I] := 2; writeln( A{I <- 3}[I]:1)
==
	A{I <- 3}[I] := 2; writeln( A{I <- 3}[I]:1)
==
	writeln( A{I <- 3}{I <- 2} {I <- 3}[I]:1)
==
	writeln(3:1)

where A{I <- x} is an extra-Pascal notation used to denote the lambda
expression:
	                 let (A I) <- x in A

One property of it used in the last reduction is:

	          A{I <- x}{I <- y} = A{I <- y}

This one reduction example illustrates that algebraic rules for a programming
language need not produce results that lie exclusively in that language.  It's
often more efficient to develop a set of rules that "preprocess" a program
segment into something outside that language, and subsequentially simplify
further analysis.

markh@csd4.csd.uwm.edu (Mark William Hopkins) (01/16/90)

     This is another forwarded response to a letter, which I feel might be of
general interest (the mail to this user bounced too).

From: gateley@m2.csc.ti.com (John Gateley)
Date: Mon, 15 Jan 90 10:20:15 CST
Subject: Re: Program verification is a proven concept

Hi Mark,

In article <1888@uwm.edu> you write:
>I must've missed the boat: I always prove my programs correct.  Pathologies
>in provability (which are theoretically guaranteed) almost always indicate
>flaws in design.  Therefore undecibeability is actually an AID in program
>verification, not an obstacle.

I am very curious: which language(s) do you program in, and could you
send me an example of a program plus proof?

Thanks,

john
gateley@m2.csc.ti.com

------------------------------------------------------------

The imperative languages I program in are C, and Pascal (and also the
assembly for the Intel 8051 family).

I'll send you a proof of a program segment illustrating some very basic rules:

PROVE:
N = 2; X = 4;
for (I = 0, Y = 1; I < N; I++; Y *= X);
printf("%d", Y);

PRINTS OUT THE VALUE 16.

(1) A for-loop in C is equivalent to a while loop (as specified in the K&R
reference).  Making the transformation specified in the source here gives:

N = 2; X = 4; I = 0; Y = 1;
while (I < N) {I++; Y *= X;}
printf("%d", Y);

(2) Under certain conditions (not to be delineated here) as assignment
                              V op= C
is equivalent to
                             V = V op C
These conditions are met here.  Likewise I++ transforms to I = I + 1 in this
context.  This results in:

N = 2; X = 4; I = 0; Y = 1;
while (I < N) {I = I + 1; Y = Y * X;}
printf("%d", Y);

The conditions, or course, relate to the possibility to side-effects, such as
in
			      V += V++ - --V;

An algebraic rule to handle this would be implementation dependent.  For
example, if a compiler evaluates from left-to right, the assignment above
might be reducible successively to:

       x1 = V++;            x2 = --V;          V += x1 - x2;

   x1 = V; V = V + 1;   V = V - 1; x2 = V;    V =  V + x1 - x2;

	x1 = V;                x2 = V;          V = V + x1 - x2;
(notice the cancellation)

		           V = V + V - V;

			       V = V;

				 ; (the null statement)

where x1, and x2 are variables not occurring elsewhere in the program.

(3) A while-loop is equivalent to an INFINITELY nested sequence of
if-statements expressed recursively by the following rule:

while (B) do S  ==  if (B) {S; while (B) do S}
i.e.
      if (B) {S; if (B) {S; if (B) {S; ...}}}

This rule will be used in the following reductions below.

(4) The following are equivalent:

if (B) S1; else S2; T  ==  if (B) {S1; T} else {S2; T}

applied in the case of the sequence:

while (I < N) {I = I + 1; Y = Y * X;}
printf("%d", Y);

this produces an infinitely-nested series of branches defined recursively by:

xx = if (I < N) {I = I + 1; Y = Y * X; xx} else printf("%d", Y);

(5) A rule analogous to the BETA REDUCTION rule applies for assignment
statements of the form: Simple-Variable = Expression.  When used on the
sequence:

N = 2; X = 4; I = 0; Y = 1; xx

(with xx defined as above), this gives the successive reductions:

N = 2; X = 4; I = 0; Y = 1; xx
->
X = 4; I = 0; Y = 1; xx'
->
I = 0; Y = 1; xx''
where xx', xx'' are defined by

xx' = if (I < 2) {I = I + 1; Y = Y * X; xx'} else printf("%d", Y);
xx'' = if (I < 2) {I = I + 1; Y = Y * 4; xx''} else printf("%d", Y);

I = 0; Y = 1; xx''
->
Y = 1; if (0 < 2) {I = 0 + 1; Y = Y * 4; xx''} else printf("%d", Y);
->
if (TRUE) {I = 1; Y = 1 * 4; xx''} else printf("%d", 1);
->
I = 1; Y = 4; xx''
->
Y = 4; if (1 < 2) {I = 1 + 1; Y = Y * 4; xx''} else printf("%d", Y);
->
if (TRUE) {I = 2; Y = 4 * 4; xx''} else printf("%d", 4);
->
I = 2; Y = 16; xx''
->
Y = 16; if (2 < 2) {I = 2 + 1; Y = Y * 4; xx''} else printf("%d", Y);
->
if (FALSE) {I = 3; Y = 16 * 4; xx''} else printf("%d", 16);
->
printf("%d", 16);
->
printf("16");

These are very fine-grained steps, however.  In their place, and based directly
on the rules used in manipulating the loop, there is a more general theorem
that can be used to generate a system of Diophantine equations from this
loop. 
   More generally, the rules only provide a theoretical foundation, and are
not to be used on a practical level -- so there are more powerful tools of
analysis that can be developed, based directly on them, that can be used in
their stead.
   For example (3), (4) and other identities involving control-constructs can
be used as the basis for transforming and optimizing control-constructs.  I've
used this once to transform a C code segment from:

NameInsert(N, Head, Tail)
struct NameList *N, **Head, **Tail;
{
   if (*Tail == 0) {
      N->Prev = N->Next = 0; *Head = *Tail = N;
   } else {
      (*Head)->Prev = N; N->Prev = 0;
      N->Next = *Head; *Head = N;
   }
}

NameDelete(N, Head, Tail)
struct NameList *N, **Head, **Tail;
{
   if (N->Prev == 0)
      if (N->Next) {
         *Head = N->Next; N->Next->Prev = 0;
      } else *Tail = *Head = 0;
   else if (N->Next == 0)
      if (N->Prev) {
         *Tail = N->Prev; N->Prev->Next = 0;
      } else *Tail = *Head = 0;
   else {
      N->Prev->Next = N->Next; N->Next->Prev = N->Prev;
   }
}

to:

NameInsert(N, Head, Tail)
NameList N, *Head, *Tail;
{
   if (*Tail == 0) *Tail = N; else (*Head)->Prev = N;
   N->Prev = 0; N->Next = *Head;
   *Head = N;
}

NameDelete(N, Head, Tail)
NameList N, *Head, *Tail;
{
   if (*Head == N) *Head = N->Next; else N->Prev->Next = N->Next;
   if (*Tail == N) *Tail = N->Prev; else N->Next->Prev = N->Prev;
}

The software this came from was a 100k teleconferencing program, which I
optimized to under 50k and verified, there were a couple other (valid)
assumptions that were needed to make the transformation above.