markh@csd4.csd.uwm.edu (Mark William Hopkins) (01/16/90)
This response, which might be of general interest, has been forwarded to this newsgroup (also the mail to the person indirectly mentioned in the letter bounced). * From lewis@godzilla.eecg.toronto.edu Mon Jan 15 08:09:55 1990 * From: david lewis <lewis@eecg.toronto.edu> * To: markh@csd4.csd.uwm.edu * Subject: Re: Program verification is a proven concept * Date: Mon, 15 Jan 90 09:08:59 EST * * You implied that there are algebraic rules for C. That would * surprise me, due to the incredible aliasing problems. I would * believe that you could express the semantics of C using denotational * semantics, but that is not the easiest thing to manipulate. Can * you point out any description of the semantics of C that can acutally * be used? * * For my own programs, I never bother to prove anything, since it is * just too complex for any real program. I'm curious to hear a little * bit more about your experience. * * David Lewis Date: Mon, 15 Jan 90 18:08:26 -0600 From: Mark William Hopkins <markh> To: lewis@godzilla.eecg.toronto.edu, markh Subject: Semantics & Proof A typical application, for me, would be to carry out an algebraic analysis of a program or program segment either to verify it or to verify that some assertion holds over it. The end result can either be to prove that the program does what it is supposed to do (i.e. demonstrate that the semantics derived from the code matches the ever-vague "intent" -- with the latter expressed in natural language or non-linguistically via diagrams), to optimize the code -- by using algebraic rules to simplify code, or to translate from one language to another. This letter is off the top of my head, so instead of trying to deal with this in detail (especially with aliasing), I think it might be more appropriate to forward you a response to another letter I received (slightly edited). I'm assuming you're familiar with the lambda calculus, since it makes quite a bit of reference to it. ------------------------------------------------------------ * markh@csd4.csd.uwm.edu (Mark William Hopkins) [14 Jan 90 02:21:43 GMT]: * > Programs in even languages like C, Pascal can be manipulated via a set of * > (complete) algebraic rules -- which form a foundation for the semantics * > of the language in question. I use such rules all the time. * * I'm quite interested in this. Do you have any examples online that * you could send me (or do you use a pencil :-); can you send me any * references? Like so many others here, I am working in program semantics * (CCS/lambda calculus), and am always interested in program proofs. ... ... A simple rule in Pascal would look like this: while B do S == if B then begin S; while B do S end where B is any boolean expression, and S any statement, and "==" means that both statements have precisely the same input-output semantics. Another, more interesting, set of rules relating to pointers and to structured types would allow you to prove that the Pascal sequence A[I] := 3; B := A; B[I] := 2; writeln(A[I]:1) will write out the value 3 (i.e. it is equivalent to the statement sequence write("3"); writeln), and the C sequence A[I] = 3; B = A; B[I] = 2; printf("%d\n", A[I]); will write out the value 2 (it is equivalent to the statement printf("2\n");) A key to the whole process is to directly translate the assignment statement (involving simple variables) into the let statement of the extended lambda calculus, and to translate assignments involving data structure components or pointer references by using "updatable" functions. For example: A[I] := 3; ... would be translated into let (A I) <- 3 in ... and P^ := 3; ... as let (VAL P) <- 3 in ... where VAL is a programmer-invisible updatable function of type (^int -> int) where let (f a) <- b in E means let (f x) = if (= x a) b (f x) in E or (lambda f.E) (lambda x.if (= x a) b (f x)) Notice the difference between the results of let x = ... in ... let (f x) <- 3 in let g = f in let (g x) <- 2 in (f x) and let x = ... in ... let (x f) <- 3 in let g = f in let (x g) <- 2 in (x f) One returns 3, and the other 2 (that is what underlies the difference between the Pascal and C examples above). The lambda calculus can model the difference between "copying" and "sharing" (aliasing). This correspondence between imperative and functional languages forms the theoretical basis for the set of algebraic rules, but is not, itself, used as a tool in the verification process. The whole idea is to obviate the need to have to bring in the lambda calculus by using a set of relatively easy-to-use algebraic rules in its place. Using algebraic rules alone, the above Pascal example would reduce like this: A[I] := 3; B := A; B[I] := 2; writeln(A[I]:1) == B := A{I <- 3}; B[I] := 2; writeln( A{I <- 3}[I]:1) == A{I <- 3}[I] := 2; writeln( A{I <- 3}[I]:1) == writeln( A{I <- 3}{I <- 2} {I <- 3}[I]:1) == writeln(3:1) where A{I <- x} is an extra-Pascal notation used to denote the lambda expression: let (A I) <- x in A One property of it used in the last reduction is: A{I <- x}{I <- y} = A{I <- y} This one reduction example illustrates that algebraic rules for a programming language need not produce results that lie exclusively in that language. It's often more efficient to develop a set of rules that "preprocess" a program segment into something outside that language, and subsequentially simplify further analysis.
markh@csd4.csd.uwm.edu (Mark William Hopkins) (01/16/90)
This is another forwarded response to a letter, which I feel might be of general interest (the mail to this user bounced too). From: gateley@m2.csc.ti.com (John Gateley) Date: Mon, 15 Jan 90 10:20:15 CST Subject: Re: Program verification is a proven concept Hi Mark, In article <1888@uwm.edu> you write: >I must've missed the boat: I always prove my programs correct. Pathologies >in provability (which are theoretically guaranteed) almost always indicate >flaws in design. Therefore undecibeability is actually an AID in program >verification, not an obstacle. I am very curious: which language(s) do you program in, and could you send me an example of a program plus proof? Thanks, john gateley@m2.csc.ti.com ------------------------------------------------------------ The imperative languages I program in are C, and Pascal (and also the assembly for the Intel 8051 family). I'll send you a proof of a program segment illustrating some very basic rules: PROVE: N = 2; X = 4; for (I = 0, Y = 1; I < N; I++; Y *= X); printf("%d", Y); PRINTS OUT THE VALUE 16. (1) A for-loop in C is equivalent to a while loop (as specified in the K&R reference). Making the transformation specified in the source here gives: N = 2; X = 4; I = 0; Y = 1; while (I < N) {I++; Y *= X;} printf("%d", Y); (2) Under certain conditions (not to be delineated here) as assignment V op= C is equivalent to V = V op C These conditions are met here. Likewise I++ transforms to I = I + 1 in this context. This results in: N = 2; X = 4; I = 0; Y = 1; while (I < N) {I = I + 1; Y = Y * X;} printf("%d", Y); The conditions, or course, relate to the possibility to side-effects, such as in V += V++ - --V; An algebraic rule to handle this would be implementation dependent. For example, if a compiler evaluates from left-to right, the assignment above might be reducible successively to: x1 = V++; x2 = --V; V += x1 - x2; x1 = V; V = V + 1; V = V - 1; x2 = V; V = V + x1 - x2; x1 = V; x2 = V; V = V + x1 - x2; (notice the cancellation) V = V + V - V; V = V; ; (the null statement) where x1, and x2 are variables not occurring elsewhere in the program. (3) A while-loop is equivalent to an INFINITELY nested sequence of if-statements expressed recursively by the following rule: while (B) do S == if (B) {S; while (B) do S} i.e. if (B) {S; if (B) {S; if (B) {S; ...}}} This rule will be used in the following reductions below. (4) The following are equivalent: if (B) S1; else S2; T == if (B) {S1; T} else {S2; T} applied in the case of the sequence: while (I < N) {I = I + 1; Y = Y * X;} printf("%d", Y); this produces an infinitely-nested series of branches defined recursively by: xx = if (I < N) {I = I + 1; Y = Y * X; xx} else printf("%d", Y); (5) A rule analogous to the BETA REDUCTION rule applies for assignment statements of the form: Simple-Variable = Expression. When used on the sequence: N = 2; X = 4; I = 0; Y = 1; xx (with xx defined as above), this gives the successive reductions: N = 2; X = 4; I = 0; Y = 1; xx -> X = 4; I = 0; Y = 1; xx' -> I = 0; Y = 1; xx'' where xx', xx'' are defined by xx' = if (I < 2) {I = I + 1; Y = Y * X; xx'} else printf("%d", Y); xx'' = if (I < 2) {I = I + 1; Y = Y * 4; xx''} else printf("%d", Y); I = 0; Y = 1; xx'' -> Y = 1; if (0 < 2) {I = 0 + 1; Y = Y * 4; xx''} else printf("%d", Y); -> if (TRUE) {I = 1; Y = 1 * 4; xx''} else printf("%d", 1); -> I = 1; Y = 4; xx'' -> Y = 4; if (1 < 2) {I = 1 + 1; Y = Y * 4; xx''} else printf("%d", Y); -> if (TRUE) {I = 2; Y = 4 * 4; xx''} else printf("%d", 4); -> I = 2; Y = 16; xx'' -> Y = 16; if (2 < 2) {I = 2 + 1; Y = Y * 4; xx''} else printf("%d", Y); -> if (FALSE) {I = 3; Y = 16 * 4; xx''} else printf("%d", 16); -> printf("%d", 16); -> printf("16"); These are very fine-grained steps, however. In their place, and based directly on the rules used in manipulating the loop, there is a more general theorem that can be used to generate a system of Diophantine equations from this loop. More generally, the rules only provide a theoretical foundation, and are not to be used on a practical level -- so there are more powerful tools of analysis that can be developed, based directly on them, that can be used in their stead. For example (3), (4) and other identities involving control-constructs can be used as the basis for transforming and optimizing control-constructs. I've used this once to transform a C code segment from: NameInsert(N, Head, Tail) struct NameList *N, **Head, **Tail; { if (*Tail == 0) { N->Prev = N->Next = 0; *Head = *Tail = N; } else { (*Head)->Prev = N; N->Prev = 0; N->Next = *Head; *Head = N; } } NameDelete(N, Head, Tail) struct NameList *N, **Head, **Tail; { if (N->Prev == 0) if (N->Next) { *Head = N->Next; N->Next->Prev = 0; } else *Tail = *Head = 0; else if (N->Next == 0) if (N->Prev) { *Tail = N->Prev; N->Prev->Next = 0; } else *Tail = *Head = 0; else { N->Prev->Next = N->Next; N->Next->Prev = N->Prev; } } to: NameInsert(N, Head, Tail) NameList N, *Head, *Tail; { if (*Tail == 0) *Tail = N; else (*Head)->Prev = N; N->Prev = 0; N->Next = *Head; *Head = N; } NameDelete(N, Head, Tail) NameList N, *Head, *Tail; { if (*Head == N) *Head = N->Next; else N->Prev->Next = N->Next; if (*Tail == N) *Tail = N->Prev; else N->Next->Prev = N->Prev; } The software this came from was a 100k teleconferencing program, which I optimized to under 50k and verified, there were a couple other (valid) assumptions that were needed to make the transformation above.