jimad@microsoft.UUCP (Jim ADCOCK) (03/06/90)
For the record, the last time these flames flared up a month or two ago, I asked for any authors with a "proven correct" non-trivial program that they were willing to submit for critique --said program written in C and runnable on a PC-- to submit such for my edification. I received no such "proven correct" programs. I did receive one question about what I meant by "non-trivial." I told the questioner that I would leave that up to him. I ask the program be in C and runnable on a PC so I can test it. Also, this tends to force the issue of portability, which I consider an interesting test of "goodness" and "non-trivial" Again, if you are a fan of "proving," and have "proven" a non-trivial C program that you are willing to submit to public scrutiny, please send such to me. I will attempt to test and critique it for goodness. This will help me, and others, learn about "proving programs" and how applicable it really is. If you originally wrote the program in another language, and only translated it to C after you got it working, that's okay too. I will report back to the net about any such programs received.
ian@oravax.UUCP (Ian Sutherland) (03/06/90)
In article <52044@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes: >I received no such "proven correct" programs. > >I did receive one question about what I meant by "non-trivial." I told >the questioner that I would leave that up to him. I asked the question originally. Consider it asked again. If you're going to report on this then YOU will be the judge of whether it's "nontrivial" or not. -- Ian Sutherland ian%oravax.uucp@cu-arpa.cs.cornell.edu Sans Peur
render@m.cs.uiuc.edu (03/07/90)
Written 1:31 pm Mar 5, 1990 by jimad@microsoft.UUCP: >For the record, the last time these flames flared up a month or two ago, >I asked for any authors with a "proven correct" non-trivial program that >they were willing to submit for critique --said program written in C >and runnable on a PC-- to submit such for my edification. > >I received no such "proven correct" programs. For the record, I told you about the development of a version control system based on formal development techniques. Because I didn't work on the project, I don't have the code, so I can't send it to you. Still, given the article there is a strong likelihood that such a program does exist and was proven correct by your standards. I've included the reference again at the end of this note. You should try to contact the people involved to see what happened to the code. Speaking of "your standards", what is the point that you're trying to make? That no such programs exist? So far as I know they do, although I am not involved in the area and only hear about such work second- and third-hand. Still, even if no non-trivial programs have ever been verified, it would not indicate that program verification research is a waste of time. Indeed, the work would be even more important so that we can determine exactly why programs are hard to verify and see if it is a fundamentally intractable problem. Hand-wavy arguments are not good enough, at least if you want to call yourself a computer scientist or a software engineer. hal. ---- Ian D. Cottam, "The Rigorous Development of a System Version Control Program", _IEEE Transactions on Software Engineering_, vol. SE-10, no. 2, March 1984, pp. 143--154.
djones@megatest.UUCP (Dave Jones) (03/15/90)
[ I suppose that you program-provers, once you have proved you program, are going to prove that the proof is correct, yes? No? ] Anecdote 1: I often think about "invariant relations" and such when I'm programming, but only once in the eleven years that I've been doing software engineering full time have I written out a formal proof of a piece of production code. (Stop me if you've heard this one.) The thing was made-to-order for a formal proof. It was a tricky hash-table algorithm. It was so tricky, that I figured I better prove it, so I did. A fellow worker had checked over my proof with me, verifying that it was correct. A couple of months later a bug manifested itself. No, it wasn't due to a typo in the coding -- and not a compiler-bug, either. The proof contained a mistake corresponding to a mistake in the code. Anecdote 2: I did my master's thesis on an arcane bit of topology called "Dehn's Lemma". (It concerns itself with necessary criteria in order that a loop be knotted within a three or higher dimensional manifold.) Herr Dehn proved it in 1917 in _Annalen Mathematica_. It got rave reviews. I may not have all these dates exactly right, but I think it was sometime around 1929 when somebody finally noticed that the proof had a whopping whole in it. In 1935, a fellow discovered that the proof could be patched up and saved if a certain other proposition was true. In 1957, a third man proved the second proposition, and everybody was happy again. In 1978, I undertook to put it all together, but I found a flaw in the proof of the second proposition! After a few weeks of sixteen-hours-a-day head-scratching, I was able to replace the faulty section with a completely different, and much shorter, argument. Again, _das lemma_ appeared to be truly proved. The others had published their stuff in big math journals, but I just wanted out of school so I could buy some food, so I never submitted mine to a journal. Of course, my advisor checked it carefully, as supposedly did the jury, but I guess there is some possibility that I too overlooked something. It is a very knotty bit of business, that. Now consider that both of those faulty proofs underwent much closer scrutiny than my little program-proof of anecdote-1 did -- years of scrutiny -- more than ANY program-proof is ever likely to undergo, yet each was incorrect. The point is, if you think that doing a formal proof is a good way to budget your system test time, by all means go ahead. Just don't think it proves anything conclusively. You better run lots of carefully selected test-cases too.
joshua@athertn.Atherton.COM (Flame Bait) (03/15/90)
In article <18185@duke.cs.duke.edu> crm@romeo.UUCP (Charlie Martin) writes: >No it isn't the code from Discipline of Programming, and believe me, >when I *say* I wrote it, I mean I WROTE THE DAMNED THING. Sorry. I misread your reference to "Dijkstra calculus" to mean that he had written the software in question. My mistake. Joshua Levy (joshua@atherton.com)
ian@oravax.UUCP (Ian Sutherland) (03/16/90)
In article <2480006@hpcuhc.HP.COM> runyan@hpcuhc.HP.COM (Mark Runyan) writes:
-As an alternative, someone could post a non-trivial program and ask the
-program provers to prove it either correct or incorrect...
I like this idea. Any takers?
--
------
Ian Sutherland ian%oravax.uucp@cu-arpa.cs.cornell.edu
Sans Peur
ian@oravax.UUCP (Ian Sutherland) (03/16/90)
In article <18190@duke.cs.duke.edu> crm@romeo.UUCP (Charlie Martin) writes: >In article <2480006@hpcuhc.HP.COM> runyan@hpcuhc.HP.COM (Mark Runyan) writes: >>As an alternative, someone could post a non-trivial program and ask the >>program provers to prove it either correct or incorrect... >Um, it doesn't really work that way, [...] >What one needs to do in a proven program is construct the program with >provability in mind. [...] Sure, sure, and if somebody posts something I think I have a chance of getting through Ariel, I may modify it to make it easier to verify. If I do this, my intent will be to remain faithful to the original program. I'm willing to take the chance that many people may scream that I changed the program beyond recognition. -- ------ Ian Sutherland ian%oravax.uucp@cu-arpa.cs.cornell.edu Sans Peur
sean@castle.ed.ac.uk (S Matthews) (03/16/90)
In article <1423@oravax.UUCP> ian@oravax.odyssey.UUCP (Ian Sutherland) writes: >In article <2480006@hpcuhc.HP.COM> runyan@hpcuhc.HP.COM (Mark Runyan) writes: >-As an alternative, someone could post a non-trivial program and ask the >-program provers to prove it either correct or incorrect... > >I like this idea. Any takers? >-- >------ >Ian Sutherland ian%oravax.uucp@cu-arpa.cs.cornell.edu Ian should know a *lot* better than this, and severely damages his own credibility by saying it. If he were to look in (and he certainly should have already) `Software development: a rigourous approach (Prentice Hall International, 1980)' which is one of the oldest books on practical programming development using formal methods (VDM), he would finnd that the author (Cliff Jones) describes how he sailed into the the IBM Vienna labs raving about this wonderful new development technique he had. The first response was exactly what we have seen in this news group, followed by an offer by Jones to verify an implimentation of Early's Parser. This attempt failed miserably. Jones then went on to develop, from scratch, a new implimentation, and, on comparing that to the original, discovered some bugs. Anyone who expects to be able to do this after Jones (who, if anybody, knows what he is talking about in formal verification) has admitted that it is not practical, should expect to be doubted in other statements. Formal verification is not a snake oil panacea, but some people make it sound like one and it is no wonder that reasonable people are sceptical of their claims. Sean By the way, if anyone is looking for a convincing example of the advantages of formal/rigorous development techniques, they could do a lot worse than check the Development that Jones describes in chapter 18 of that book.
sean@castle.ed.ac.uk (S Matthews) (03/16/90)
In article <12326@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes: > > [ I suppose that you program-provers, once you have proved you program, > are going to prove that the proof is correct, yes? No? ] > > >Anecdote 1: > >I often think about "invariant relations" and such when I'm programming, but >only once in the eleven years that I've been doing software engineering full >time have I written out a formal proof of a piece of production code. (Stop me >if you've heard this one.) The thing was made-to-order for a formal proof. >It was a tricky hash-table algorithm. It was so tricky, that I figured I >better prove it, so I did. A fellow worker had checked over my proof with me, >verifying that it was correct. A couple of months later a bug manifested >itself. No, it wasn't due to a typo in the coding -- and not a compiler-bug, >either. The proof contained a mistake corresponding to a mistake in the code. > >Anecdote 2: [deleted, makes same point] You are not making a distinction between `formal' and `usual mathematical': a formal proof is a machine checkable structure in the class of deductions in a formal theory. THe option of a mistake is not there (of course your theory may be inconsistent---but I humbly suggest that this is *unlikely*, or the proof checker may be buggy, but proof checkers are very simple programs). >The point is, if you think that doing a formal proof is a good way to budget >your system test time, by all means go ahead. Just don't think it proves >anything conclusively. You better run lots of carefully selected test-cases >too. I agree, but you had also better be careful about what you call a formal proof. Sean P.S. there are good examples of *formal* proof systems around too, and not just for the sort of weenie math that is needed for program verification. Try Cornell's NuPRL system for instance. (A friend of mine produced a proof of the finite version of Ramsey's theorem in it, and I would say that that has been checked a great deal more thoroughly than if Euclid, Archimedes, Newton, Gauss, Euler and Hilbert refereed it together).
crm@romeo.cs.duke.edu (Charlie Martin) (03/17/90)
In article <12326@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes: > > [ I suppose that you program-provers, once you have proved you program, > are going to prove that the proof is correct, yes? No? ] > > >Anecdote 1: > >....A couple of months later a bug manifested >itself. No, it wasn't due to a typo in the coding -- and not a compiler-bug, >either. The proof contained a mistake corresponding to a mistake in the code. > Yup, that happens. Just as in ... >Anecdote 2: > > .... Of course, my advisor checked it carefully, as >supposedly did the jury, but I guess there is some possibility that I too >overlooked something. It is a very knotty bit of business, that. > >Now consider that both of those faulty proofs underwent much closer scrutiny >than my little program-proof of anecdote-1 did -- years of scrutiny -- >more than ANY program-proof is ever likely to undergo, yet each was incorrect. > >The point is, if you think that doing a formal proof is a good way to budget >your system test time, by all means go ahead. Just don't think it proves >anything conclusively. You better run lots of carefully selected test-cases >too. I don't *think* a formal proof is a good way to budget test time, I have evidence from controlled studies and anechdotal evidence that says it is. Even better, I have a strong suspicion that the proof can direct the testing to improve it. But let's just look at the opposite side of this statement: "If you think that a formal proof is a good way to establish a mathematical statement, by all means go ahead. Just don't think it proves anything conclusively. You'd better do lots of carefully-selected examples too." As a mathematician, this ought to immediately strike you as something between "a little off" and "of course; any mathematically literate person knows that you should make sure that a simple example ought to work too." (Although there are those horror stories about the person who one way or another had a counter-example to his main theorem proven during his defense.) The point of program proving (IM philosophical O) is that it establishes the basis on which predictions can be made, against which actual tests can be performed, to raise the certainty of correctness into the realm of "inductive" or "abductive" demonstration. If you want to look into this point further, I can mail you the Canonical Flame War between Jim Fetzer and myself. If that threat doesn't put you off, nothing will..... Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)
bks@alfa.berkeley.edu (Brad Sherman) (03/17/90)
In article <1423@oravax.UUCP> ian@oravax.odyssey.UUCP (Ian Sutherland) writes: >In article <2480006@hpcuhc.HP.COM> runyan@hpcuhc.HP.COM (Mark Runyan) writes: >-As an alternative, someone could post a non-trivial program and ask the >-program provers to prove it either correct or incorrect... >I like this idea. Any takers? >Ian Sutherland ian%oravax.uucp@cu-arpa.cs.cornell.edu Well this is trivial, but I think many of us would like to see some examples of program proofs so let's start out small. Is this correct or not? /* --------------------------------------------------------------------- * RXUTOS * RadiX Unsigned TO String. * Converts an unsigned integer into a string of digits in any * base (radix) in the range {2, ..., 35}. Digits needed in the range * {10, ..., 35} are represented by the characters {A, B, ..., Z}. * * This function returns a pointer to a static buffer which is * overwritten on each call. On error (radix out of range), a NULL * pointer is returned. * * Author: Bradley K. Sherman (bks@alfa.berkeley.edu) * Copyright (C), The Regents of the University of California, 1990 */ char * rxutos( n, radix ) unsigned int n; unsigned int radix; { static char buf[ ( 8 * sizeof(unsigned int) ) + 1 ]; char *s = ( 8 * sizeof(unsigned int) ) + buf; if ( radix < 2 || radix > 35 ) return( NULL ); *s = '\0'; do { *--s = "0123456789ABCDEFGHIJKLMOPQRSTUVWXYZ"[n % radix]; } while ( (n /= radix) > 0 ); return( s ); } /* --------------------- End of function rxutos() --------------------- */
ian@oravax.UUCP (Ian Sutherland) (03/17/90)
In article <2870@castle.ed.ac.uk> sean@castle.ed.ac.uk (S Matthews) writes: >Ian should know a *lot* better than this, and severely damages his own >credibility by saying it. I don't know what you think I was saying. The question is "what is nontrivial?" I'm willing to have people post programs, look at them, and if I think it's reasonable to try to get them through our system (Ariel), possibly with some modifications, I'll try it. If I have some success, I'll report it. I don't guarantee that I'll even try a single one of the suggestions posted. What about this damages my credibility? >Anyone who expects to be able to do this after Jones (who, if anybody, >knows what he is talking about in formal verification) has admitted that >it is not practical, should expect to be doubted in other statements. What is it that I'm expecting to be able to do? -- ------ Ian Sutherland ian%oravax.uucp@cu-arpa.cs.cornell.edu Sans Peur
freek@fwi.uva.nl (Freek Wiedijk) (03/17/90)
In article <1990Mar16.210324.398@agate.berkeley.edu> bks@alfa.berkeley.edu (Brad Sherman) writes: >In article <1423@oravax.UUCP> ian@oravax.odyssey.UUCP (Ian Sutherland) writes: >>In article <2480006@hpcuhc.HP.COM> runyan@hpcuhc.HP.COM (Mark Runyan) writes: >>-As an alternative, someone could post a non-trivial program and ask the >>-program provers to prove it either correct or incorrect... >>I like this idea. Any takers? >>Ian Sutherland ian%oravax.uucp@cu-arpa.cs.cornell.edu > >Well this is trivial, but I think many of us would like to see some >examples of program proofs so let's start out small. Is this correct or >not? No: > *--s = "0123456789ABCDEFGHIJKLMOPQRSTUVWXYZ"[n % radix]; ^^ You forgot the N! (I'm sure, I'll be the Nth to reply with this :-) Other criticisms: You never defined NULL, and your program is _much_ too long, it should have been: :-) :-) /*(c)*/char*rxutos(n,r)unsigned n,r;{static char b[8*sizeof(unsigned) +1];char*s=8*sizeof(unsigned)+b;if(r<2||r>35)return(0);*s=0;do{*--s= "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"[n%r];}while((n/=r));return(s);} Also: what's wrong with radix 36? -- Freek "the Pistol Major" Wiedijk Path: uunet!fwi.uva.nl!freek #P:+/ = #+/P?*+/ = i<<*+/P?*+/ = +/i<<**P?*+/ = +/(i<<*P?)*+/ = +/+/(i<<*P?)**
bks@alfa.berkeley.edu (Brad Sherman) (03/17/90)
In article <534@fwi.uva.nl> freek@fwi.uva.nl (Freek Wiedijk) writes: > [various correct criticisms] and > ...your program is _much_ too long, ... without mentioning that there is some justification for omitting the line: *s = '\0'; I still think that it would be interesting to see a proof so I'll try once more: /* --------------------------------------------------------------------- * RXUTOS * RadiX Unsigned TO String. * Converts an unsigned integer into a string of digits in any * base in the range {2, ..., 36}. Digits needed in the range * {10, ..., 35} are represented by the characters {A, B, ..., Z}. * * This function returns a pointer to a static buffer which is * on each call. On error (radix out of range), a NULL * pointer is returned. * * Author: Bradley K. Sherman (bks@alfa.berkeley.edu) * With thanks to: Freek Wiedijk (freek@fwi.uva.nl) * Copyright (C), The Regents of the University of California, 1990 */ char * rxutos( n, radix ) register unsigned int n; register unsigned int radix; { static char buf[ ( 8 * sizeof(unsigned int) ) + 1 ]; char *s = ( 8 * sizeof(unsigned int) ) + buf; if ( radix < 2 || radix > 36 ) return( 0 ); *s = '\0'; do { --s; *s = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"[n % radix]; n /= radix; } while ( n > 0 ); return( s ); } /* --------------------- End of file rxutos.c --------------------- */
runyan@hpcuhc.HP.COM (Mark Runyan) (03/19/90)
>/ sean@castle.ed.ac.uk (S Matthews) / 2:46 am Mar 16, 1990 / >Anyone who expects to be able to do this after Jones (who, if anybody, >knows what he is talking about in formal verification) has admitted that >it is not practical, should expect to be doubted in other statements. > >Formal verification is not a snake oil panacea, but some people make it >sound like one and it is no wonder that reasonable people are sceptical >of their claims. Your telling me that formal proofs are *only* useful on new development? Considering how much old code is out there, I find this slightly frightening. Still, anything that can be used to improve new code is a plus. But at the moment, I'm trying to fight fires in old code where total replacement is *not* considered an alternative. Mark Runyan
crm@romeo.cs.duke.edu (Charlie Martin) (03/19/90)
In article <2480007@hpcuhc.HP.COM> runyan@hpcuhc.HP.COM (Mark Runyan) writes: >Your telling me that formal proofs are *only* useful on new development? >Considering how much old code is out there, I find this slightly >frightening. Gee, don't be frightened, Mark. It's no more of a boogeyman than it was before. Gee, on second thought, maybe I'll be frightened with you. > >Still, anything that can be used to improve new code is a plus. But at >the moment, I'm trying to fight fires in old code where total replacement >is *not* considered an alternative. The very first experience I had with program proof in real code was in Germany in 1981. I'd just read _The Science of Programming_. The system I was working on (I was the warranty) had a module that had just been unending trouble. It controlled a big n x n switch matrix, and the matrix had a tendency not to switch correctly, but intermittently. The hardware guys swore it was software problems; after some time, in desperation I rewrote and proved the single routine that generated the message to which the switch responded. I reinstalled that module and --- the switch still didn't work. Since I had not only proven the routine but tested the hell out of it (pace Fetzer) I was reasonably confident that THIS module was generating the correct controls. With this basis, I was confident enough to tell the hardware guys that it was absolutely not the software, could we put a logic analyzer on the control lines. When we did, the control signals were as specified. It was the box after all. (Turned out the switch settings were set up through an eprom, which had been xrayed coming through the export process.) The point of this, yet another war story, is that I got good results during maintenance by removing one routine of about 75 lines which was apparently real trouble and then reimplementing it with proof. Since many systems exhibit the 80/20 fault behavior (80 percent of the faults lie in 20 percent of the modules) I suspect that hand-proof can be used during maintenance to fix the buggiest modules first and get a LOT of improvement in reliability. The same could be said of machine-checked proof, but machine-checked proof is higher cost. Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)
hughes@tempest.Berkeley.EDU (Eric Hughes) (03/19/90)
In article <2480006@hpcuhc.HP.COM> runyan@hpcuhc.HP.COM (Mark Runyan) writes: > As an alternative, someone could post a non-trivial program and ask the > program provers to prove it either correct or incorrect... This really isn't the way it's done. The idea is to develop the program and its proof simultaneously. The generation of invariants and assertions is more difficult (at least now, since there is not as much cultural experience) than writing the code. Eric Hughes hughes@ocf.berkeley.edu
hughes@tempest.Berkeley.EDU (Eric Hughes) (03/20/90)
In article <1990Mar17.063012.24979@agate.berkeley.edu> bks@alfa.berkeley.edu (Brad Sherman) writes: > I still think that it would be interesting to see a proof so I'll > try once more: Your original question was whether the included routine was correct or not. The answer, without looking at the code itself, is no, for the simple reason that code, _per se_, is not proven correct. What is proven is the code plus its precondition and postcondition. Since there is no precondition and no postcondition, the code cannot be proven to do anything. Now, here is what these conditions are. This is not the formal definition, but a working description. The notation generally used is {P} S {R}; here P is the precondition, S is the statement (which may be a compound statement or subroutine), and Q is the postcondition. P and Q are logical predicates (a predicate is a sentence that is either true or false) which describe program state. {P} S {R} is said to be correct if, when P is true at the beginning of the execution of S, then R is always true and the end of execution of S. Here is a simple example: {true} a = 1 { a equals 1 } Here the precondition is the true predicate. Every program state satisfies the true predicate. The statement is the assignment of the value one to the variable a. The postcondition asserts that the value of the variable a is 1. The program is correct, we reason informally, since if you assign the value one to a, then the value of a is one. This may seem rather silly, but there is a fundamental principle here: statements and predicates alternate, each statement moving the current assertion about program state forward one step. I think that a proper statement of preconditions and postconditions is one of the best debugging tools. I recently had too maintain a 500 line section of 8086 assemby code which was basically undocumented. The first thing I did was to triple the size by adding assertion, postconditions, and preconditions everywhere. In testing, I was able to fix half a dozen bugs simply by comparing the stated postcondition of a routine to the expected precondition of some statement and observing that the previous postcondition did not imply the precondition. It was truly wonderful. Without going into too much detail, I have written a proof of the given subroutine and augmented the code with it. I would say that this is a different program, myself, even though the individual code statements were unaltered. Don't worry if you don't understand some of the constructions, we can discuss them. For more details, go get a copy of _The Science of Programming_, by David Gries, from which I learned how to do this. The proof is as complete as I think a hand checked proof needs to be; in fact, I think it overly verbose. (Every proof is, by nature, verbose.) It is also not _really_ complete. I left out a calculation to show that the bound function is always greater than zero. This proof took about an hour to write. Eric Hughes hughes@ocf.berkeley.edu /* --------------------------------------------------------------------- * RXUTOS * RadiX Unsigned TO String. * Author: Bradley K. Sherman (bks@alfa.berkeley.edu) * With thanks to: Freek Wiedijk (freek@fwi.uva.nl) * Copyright (C), The Regents of the University of California, 1990 */ /* Precondition: * 2 <= radix <= 36 * n > 0 * the length of the representation of n in base radix is not greater * than RXUTOS_BUFFER_SIZE * Postcondition: * return value points to a ASCIIZ string which is a representation of * the number n in base b. */ /* Define b = radix */ char * rxutos( n, radix ) register unsigned int n; register unsigned int radix; { #define RXUTOS_BUFFER_SIZE 10 static char buf[ RXUTOS_BUFFER_SIZE + 1 ]; char *s = RXUTOS_BUFFER_SIZE + buf; *s = '\0'; /* Assert s is the empty string. */ /* Define: let t be the length of the representation of n in base b * Define: let s' be the number represented by s in base b * Define: let u be a loop counter, starting at zero * Define: n' = n */ /* Assert s' == 0 * Assert s' + n * 1 == n' * Assert n = [ n' / 1 ] */ /* Invariant: * s' + n * b^u== n' * n = [ n' / b^u ] // real division, greatest integer function * s is an ASCIIZ string * buf == s + RXUTOS_BUFFER_SIZE - u * Bound function: * t - u * Guard: * n > 0 */ do { /* Define s'' = s' * Define n'' = n */ --s; /* Assert buf == s + RXUTOX_BUFFER_SIZE - (u + 1) * Assert u <= t - 1 // from the bound function > 0 * Assert u >= 0 * Hence Assert s in buf[0..RXUTOS_BUFFER_SIZE+1) * Note: this is necessary because with a language * that allows pointers to point anywhere there is * a need to explicitly show that a derefenced pointer * assignment is permitted. */ *s = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"[n % radix]; /* Assert s is still an ASCIIZ string * Assert s' = s'' + ( n'' mod b ) * b ^ u * Assert n > 0 */ n /= radix; /* Assert n >= 0 * Assert n == [ n'' / b ] * == [ [ n' / b^u ] / b ] * == [ n' / b^(u+1) ] */ /* Assert s' + n * b^(u+1) * == s'' + (n'' mod b) * b^u + [ n''/b ] * b^(u+1) * == s'' + b^u * ( n'' mod b + b * [ n''/b ] ) * == s'' + b^u * n'' * == n' */ /* Stmt ++u * Assert buf == s + RXUTOX_BUFFER_SIZE - u * Assert n == [ n / b^u ] * Assert n >= 0 * Assert n' == s' + n * b^(u+1) * Assert bound function has decreased by one * Note: I have not shown that the bound function * is greater that zero after the guards. */ } while ( n > 0 ); /* Assert n >= 0 and not n < 0 * Assert n == 0 * Assert n' == s' + 0 * b^u == s' * Assert s is the representation of the input n in base b */ return( s ); } /* --------------------- End of file rxutos.c --------------------- */
joshua@athertn.Atherton.COM (Flame Bait) (03/20/90)
Someone has suggested that we post programs, and let the verification folks try to verify them. Sounds good to me. I would like to see any of the following four programs verified: GnuTar, GCC, GAWK, or G++. All of them are real, and do useful things. All are available in source code. And all have pretty well defined behaviors. GCC uses the ANSI standard, G++ and GAWK are based on books and some follow on technical reports, and I believe that GnuTar is based on the POSIX tar standard. GCC and G++ are both huge, but GAWK and GnuTar should be reasonably sized. Is anyone willing to send them through a verification system? Ian, do you have the time for this? If so please tell us such things as the bugs you find, the time it took, the verification system and methodology you used, the version of the program you verified, etc. Also, send the results to the GNU people, I'm sure they'll be interested in them. If you can not verify one of these progams, please tell us why. Joshua Levy joshua@atherton.com home:(415)968-3718 {decwrl|sun|hpda}!athertn!joshua work:(408)734-9822
joshua@athertn.Atherton.COM (Flame Bait) (03/20/90)
In article <18261@duke.cs.duke.edu> crm@romeo.UUCP (Charlie Martin) writes: >I don't *think* a formal proof is a good way to budget test time, I have >evidence from controlled studies and anechdotal evidence that says it >is. Can you please post 3-6 references to the controlled studies which show that formal proofs are a better way to budget time than debugging? For these studies to be really convincing (to me, anyway) they should involve real world software projects, in real world environments. The projects should be on "normal" hardware (no LISP machines, Butterflies, or connection machines). I should be able to buy or otherwise aquire all the speciallized software used. I should be able to take classes in the methodology used. The studies should compare projects which use good SE methodology (without proofs) to good SE methodology (with proofs). Remember: in the commerical world, something only exists if I can buy it. Obviously, I'd be very interested in studies which do not match all of the above requirements, but the more they match, the better they are. Joshua Levy joshua@atherton.com home:(415)968-3718 {decwrl|sun|hpda}!athertn!joshua work:(408)734-9822
decot@hpisod2.HP.COM (Dave Decot) (03/20/90)
> This really isn't the way it's done. The idea is to develop the > program and its proof simultaneously. The generation of invariants > and assertions is more difficult (at least now, since there is not > as much cultural experience) than writing the code. > > Eric Hughes Righto. For discussion purposes, here's the specification version of the radix conversion code (somebody else will have to come up with the invariants and assertions). Dave ----------- /* This a version of the original program with its initial comments converted into a C-like specification language. The meaning of the language constructs are intended to be obvious. The use of the keyword "return" in this language refers to the return value of the function. */ /* The original specification doesn't mention radix, so the function could have been implemented more easily by returning "6" always. :-) However, assuming that the return value should be a numeral representing the value of n in the indicated radix... */ /* Import precompiled specifications of various functions; assume that they are specified as in ANSI C. */ extern spec int strlen(); extern spec char *strchr(); extern spec long strtoul(); #define digits "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ" spec char *rxutos(n, radix) register unsigned int n, radix; { if (radix >= 2 && radix <= 36) { forall int i (i >= 0 && i <= strlen(return)) { define char *p (strchr(return[i], digits)); guarantee p != 0 && p - return < radix; } guarantee strtoul(return, (char **)0, radix) == n; guarantee strlen(return) < 8*sizeof(unsigned int) + 1; } else guarantee return == 0; } /* for comparison, here's the posted implementation */ char *rxutos(n, radix) register unsigned int n; register unsigned int radix; { static char buf[ (8 * sizeof(unsigned int)) + 1 ]; char *s = (8 * sizeof(unsigned int)) + buf; if ( radix < 2 || radix > 36 ) return( 0 ); *s = '\0'; do { --s; *s = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"[n % radix]; n /= radix; } while ( n > 0 ); return( s ); }
shimeall@cs.nps.navy.mil (Tim Shimeall) (03/20/90)
(sorry for the post -- I tried to reply directly but my mailer bounced your address) In article <19989@joshua.athertn.Atherton.COM> you write: >Can you please post 3-6 references to the controlled studies which show >that formal proofs are a better way to budget time than debugging? I can't post 3-6 references to such studies (in general, programming experiments are so expensive to run that expecting several studies to examine a single issue is VERY unrealistic) but I can point you to one: J.E. Brunelle and D.E. Eckhardt ``Fault Tolerant Software: Experiment with the SIFT Operating System,'' {\it AIAA Computers in Aerospace V Conference}, October 1985, pp. 355--360. This study compared a portion of the verified SIFT kernel against two independently developed versions (using a controlled development procedure) of the kernel (all three developed from the same specification). They ran a large number of randomly-generated data sets through the three versions. The study found several faults in the comercially-developed version, but NO faults in the verified version. If you want to see 3-6 studies on this subject, may I suggest that you (or your organization) sponsor such work. But it isn't inexpensive (you don't get something for nothing), especially if you wish to see controlled studies done using industrial problems. Such studies range in cost from the hundreds of thousands of dollars to the millions of dollars. For comparison, I ran a testing study for my dissertation that involved 50+ people and took about 2 years of real time and slightly over a Sun/3 CPU-decade. That was analyzing a set of 2000-line programs. NO ONE in the current environment is seriously planning to try controlled development studies on large industrial problems -- the funds simply aren't there.
crm@romeo.cs.duke.edu (Charlie Martin) (03/20/90)
The best references I can give you off the top of my head re the Software Cleanroon papers; there is an excellent overview that wil appeared (or has appeared, all I have is the preprint), title "Cleanroom Software Engineering", author Harlan Mills, in _Aerospace Software Engineering_, Anderson, Dorfman and Hall, eds., AIAA 1990. Some of the experimental results are in "Certifying the Reliability of Software," IEEE TSE SE-12, No 1, Jan 1986, pp 3--11. Also see "Cleanroom software development: an empirical evaluation," IEEE TSE vol SE-13, No 9, Sept 1987, I don't have the pages. Of course, even better will be the experiment I'm going to perform when I have the time and funding. Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)
madd@world.std.com (jim frost) (03/20/90)
joshua@athertn.Atherton.COM (Flame Bait) writes: >I would like to see any of the following four programs verified: >GnuTar, GCC, GAWK, or G++. None of these even comes close to passing a good C checker cleanly, much less a formal proof. I'd suggest picking something that wouldn't be so easy to prove incorrect. The argument is moot, anyway. Program proving is wonderful when you absolutely positively must not fail and it's worth the extra time to do the proof. For most applications, however, the time necessary to do the proof would make the application out-of-date before it shipped. Both sides are right depending on the job requirements. jim frost saber software jimf@saber.com
hughes@locusts.Berkeley.EDU (Eric Hughes) (03/21/90)
In article <19988@joshua.athertn.Atherton.COM> joshua@athertn.Atherton.COM (Flame Bait) writes:
I would like to see any of the following four programs verified:
GnuTar, GCC, GAWK, or G++.
OK, I accept. I pick GAWK. If you will stipulate that the program
halts on all inputs, then my proof is finished, because I have
picked the true postcondition.
Now if you write a formal spec of the behavior of GAWK, I will
have another, and different, task at hand. But I think it will
take you longer to write a formal spec than for me to write a
proof.
Maxim: A program is only correct with respect to a precondition
and postcondition.
Furthermore, utilities like GAWK, which are meant to be general
purpose, public utilities, should always have the true precondition,
since the should not hang up on arbitrary input.
Eric Hughes
hughes@ocf.berkeley.edu
cox@stpstn.UUCP (Brad Cox) (03/22/90)
Apparently you haven't heard that C (and its derivatives, including Objective-C and undoubtedly C++ as well) supports assertion checking. It's called assert(). You will undoubtedly find it in /usr/include/assert.h on your system, or thereabouts. You use it like this #include "assert.h" assert(YES) a = 1; assert(a == 1); The macro generates if (!(1)) { fprintf(stderr, "assertion failure in file %s line %d\n", _FILE_, _LINE_); exit(1); } a = 1; if (!(a == 1)) { fprintf(stderr, "assertion failure in file %s line %d\n", _FILE_, _LINE_); exit(1); } But if you set the macro variable, NDEBUG, the assert()s vanish at compile time, so they can be left in the source forever as documentation. By the way, I recommend that the macro be modified to generate as follows, for reasons that should become obvious once you start using it a lot. if (!(a == 1)) assertionFailure(_FILE_, _LINE_); This has proven so useful for C code in general, and for polymorphic object-oriented Objective-C code in particular, that assertion-checking has become a habit, something that we take for granted that any professional programmer would be using as a matter of routine. Wonder why the existence of this macro comes as a surprise to so many. I've even heard assertion checking used as a selling point for Eiffel!
hughes@lightning.Berkeley.EDU (Eric Hughes) (03/22/90)
In article <4389@stpstn.UUCP> cox@stpstn.UUCP (Brad Cox) writes:
Apparently you haven't heard that C (and its derivatives, including
Objective-C and undoubtedly C++ as well) supports assertion checking.
Please don't patronize me. Every single file of C code I write has
"#include <assert.h>" in it. In fact, my own assert.h has a macro
halt() in it to put in places where the code should never get to.
Yes, it's useful. Yes, many people have not heard of it. Yes, it is
my habit, almost a reflex. That's not the point.
There are two big differences between using a runtime check (as in the
assert() macro) and a compiler time check (an inline predicate
asserted to be true). They are both called assertions, an unfortunate
confluence which leads to misunderstanding.
The first difference is entirely practical. There are some runtime
checks which appear naturally in proof which just can't be
realistically checked at runtime. Consider the behavior of the
library routine malloc(). We want to be able to say that the pointer
malloc() gives us points to a block of memory that is exclusively
ours, that is, that there does not exist another pointer which will
modify contents of this block. I do not know of a way to verify
this with a runtime check. I do think it possible to prove a
memory allocator correct with the above property, however.
Here is another example, one where a runtime check cannot in principle
be written. Pick any numerical algorithm that uses floating point
numbers as an approximation to the real numbers, that is, that models
a mathematical object by a computer one. We want to know what the
maximum error is from any calculation, which leads naturally to terms
of the form abs( f - r ), where f is floating point (a subset of the
real numbers) and r is real. You can't evaluate this term on a
computer.
The second difference is much more important. I'll even assume, for
the sake of this section of the argument, that any assertion can be
checked at runtime, even the ones above. Even with this strengthening
assumption, runtime checks do not provide use with the confidence of a
formally correct program. For runtime checks only assure us that each
single execution of the program is valid, not that every such possible
execution will be. (Both these assurances presuppose that the
compiler, operating system, hardware, etc., perform as expected. This
is a different issue than Fetzer's article.)
To say this in other language, runtime checking gives us enumerable
instances of program execution; every time we run the program on a
different input and it satisfies all of its assertions, we have
another example of a correct execution of the program. On the other
hand, when we make a logical proof that a program satisfies its
preconditions and postcondition, we have assured ourselves (again,
insofar as our models are correct) that the program will work
correctly on every possible input; a formal proof gives us decidable
instances of correct programs.
The practical difference between these two is enormous. For a program
which is checked for correctness at runtime is correct _only_ insofar
as it is tested before release, but a program which is proved correct
formally has the _further_ assurance that there is a logical and
formal derivation of precondition and postcondition (and that is a
powerful assurance in practice).
Eric Hughes
hughes@ocf.berkeley.edu