[comp.software-eng] "Program Proving"

jimad@microsoft.UUCP (Jim ADCOCK) (03/06/90)

For the record, the last time these flames flared up a month or two ago,
I asked any authors who had a "proven correct" non-trivial program 
they were willing to submit for critique --said program written in C 
and runnable on a PC-- to send it along for my edification.

I received no such "proven correct" programs.

I did receive one question about what I meant by "non-trivial."  I told
the questioner that I would leave that up to him.

I ask that the program be in C and runnable on a PC so I can test it.  Also,
this tends to force the issue of portability, which I consider an interesting
test of "goodness" and "non-triviality."

Again, if you are a fan of "proving," and have "proven" a non-trivial C
program that you are willing to submit to public scrutiny, please send
such to me.  I will attempt to test and critique it for goodness.  This
will help me, and others, learn about "proving programs" and how applicable
it really is.

If you originally wrote the program in another language, and only translated
it to C after you got it working, that's okay too.

I will report back to the net about any such programs received.

ian@oravax.UUCP (Ian Sutherland) (03/06/90)

In article <52044@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
>I received no such "proven correct" programs.
>
>I did receive one question about what I meant by "non-trivial."  I told
>the questioner that I would leave that up to him.

I asked the question originally.  Consider it asked again.  If you're
going to report on this then YOU will be the judge of whether it's
"nontrivial" or not.
-- 
Ian Sutherland		ian%oravax.uucp@cu-arpa.cs.cornell.edu

Sans Peur

render@m.cs.uiuc.edu (03/07/90)

Written  1:31 pm  Mar  5, 1990 by jimad@microsoft.UUCP:
>For the record, the last time these flames flared up a month or two ago,
>I asked any authors who had a "proven correct" non-trivial program
>they were willing to submit for critique --said program written in C
>and runnable on a PC-- to send it along for my edification.
>
>I received no such "proven correct" programs.

For the record, I told you about the development of a version control
system based on formal development techniques.  Because I didn't work
on the project, I don't have the code, so I can't send it to you.  Still,
given the article there is a strong likelihood that such a program does 
exist and was proven correct by your standards.  I've included the reference
again at the end of this note.  You should try to contact the people involved
to see what happened to the code.

Speaking of "your standards", what is the point that you're trying to make?
That no such programs exist?  So far as I know they do, although I am not
involved in the area and only hear about such work second- and third-hand.
Still, even if no non-trivial programs have ever been verified, it would 
not indicate that program verification research is a waste of time.  Indeed,
the work would be even more important so that we can determine exactly why 
programs are hard to verify and see if it is a fundamentally intractable 
problem.  Hand-wavy arguments are not good enough, at least if you want
to call yourself a computer scientist or a software engineer.

hal.
----

 Ian D. Cottam, "The Rigorous Development of a System Version Control Program",
    _IEEE Transactions on Software Engineering_, vol. SE-10, no. 2, March 1984,
    pp. 143--154.

djones@megatest.UUCP (Dave Jones) (03/15/90)

   [ I suppose that you program-provers, once you have proved your program,
     are going to prove that the proof is correct, yes? No? ]


Anecdote 1:

I often think about "invariant relations" and such when I'm programming, but
only once in the eleven years that I've been doing software engineering full
time have I written out a formal proof of a piece of production code. (Stop me
if you've heard this one.) The thing was made-to-order for a formal proof.
It was a tricky hash-table algorithm. It was so tricky, that I figured I
better prove it, so I did. A fellow worker had checked over my proof with me,
verifying that it was correct. A couple of months later a bug manifested
itself. No, it wasn't due to a typo in the coding -- and not a compiler-bug,
either. The proof contained a mistake corresponding to a mistake in the code.

Anecdote 2:

I did my master's thesis on an arcane bit of topology called "Dehn's Lemma".
(It concerns itself with necessary criteria in order that a loop be knotted
within a three or higher dimensional manifold.) Herr Dehn proved it in 1917
in _Mathematische Annalen_. It got rave reviews.  I may not have all these
dates exactly right, but I think it was sometime around 1929 when somebody
finally noticed that the proof had a whopping hole in it. In 1935, a fellow
discovered that the proof could be patched up and saved if a certain other
proposition was true. In 1957, a third man proved the second proposition, and
everybody was happy again. In 1978, I undertook to put it all together, but I
found a flaw in the proof of the second proposition!  After a few weeks of
sixteen-hours-a-day head-scratching, I was able to replace the faulty section
with a completely different, and much shorter, argument. Again, _das lemma_
appeared to be truly proved. The others had published their stuff in big math
journals, but I just wanted out of school so I could buy some food, so I never
submitted mine to a journal. Of course, my advisor checked it carefully, as
supposedly did the jury, but I guess there is some possibility that I too
overlooked something. It is a very knotty bit of business, that.

Now consider that both of those faulty proofs underwent much closer scrutiny
than my little program-proof of anecdote-1 did -- years of scrutiny --
more than ANY program-proof is ever likely to undergo, yet each was incorrect.

The point is, if you think that doing a formal proof is a good way to budget
your system test time, by all means go ahead. Just don't think it proves
anything conclusively. You better run lots of carefully selected test-cases
too.

joshua@athertn.Atherton.COM (Flame Bait) (03/15/90)

In article <18185@duke.cs.duke.edu> crm@romeo.UUCP (Charlie Martin) writes:
>No it isn't the code from Discipline of Programming, and believe me,
>when I *say* I wrote it, I mean I WROTE THE DAMNED THING.

Sorry.  I misread your reference to "Dijkstra calculus" to mean that he
had written the software in question. 

My mistake.

Joshua Levy  (joshua@atherton.com)

ian@oravax.UUCP (Ian Sutherland) (03/16/90)

In article <2480006@hpcuhc.HP.COM> runyan@hpcuhc.HP.COM (Mark Runyan) writes:
-As an alternative, someone could post a non-trivial program and ask the
-program provers to prove it either correct or incorrect...

I like this idea.  Any takers?
-- 
------
Ian Sutherland		ian%oravax.uucp@cu-arpa.cs.cornell.edu

Sans Peur

ian@oravax.UUCP (Ian Sutherland) (03/16/90)

In article <18190@duke.cs.duke.edu> crm@romeo.UUCP (Charlie Martin) writes:
>In article <2480006@hpcuhc.HP.COM> runyan@hpcuhc.HP.COM (Mark Runyan) writes:
>>As an alternative, someone could post a non-trivial program and ask the
>>program provers to prove it either correct or incorrect...
>Um, it doesn't really work that way,
[...]
>What one needs to do in a proven program is construct the program with
>provability in mind.
[...]

Sure, sure, and if somebody posts something I think I have a chance of
getting through Ariel, I may modify it to make it easier to verify.  If
I do this, my intent will be to remain faithful to the original
program.  I'm willing to take the chance that many people may scream
that I changed the program beyond recognition.
-- 
------
Ian Sutherland		ian%oravax.uucp@cu-arpa.cs.cornell.edu

Sans Peur

sean@castle.ed.ac.uk (S Matthews) (03/16/90)

In article <1423@oravax.UUCP> ian@oravax.odyssey.UUCP (Ian Sutherland) writes:
>In article <2480006@hpcuhc.HP.COM> runyan@hpcuhc.HP.COM (Mark Runyan) writes:
>-As an alternative, someone could post a non-trivial program and ask the
>-program provers to prove it either correct or incorrect...
>
>I like this idea.  Any takers?
>-- 
>------
>Ian Sutherland		ian%oravax.uucp@cu-arpa.cs.cornell.edu

Ian should know a *lot* better than this, and severely damages his own
credibility by saying it. 

If he were to look in (and he certainly should have already) `Software
Development: A Rigorous Approach' (Prentice Hall International, 1980),
which is one of the oldest books on practical program development
using formal methods (VDM), he would find that the author (Cliff Jones)
describes how he sailed into the IBM Vienna labs raving about this
wonderful new development technique he had.  The first response was
exactly what we have seen in this news group, followed by an offer by
Jones to verify an implementation of Earley's parser.  This attempt
failed miserably.  Jones then went on to develop, from scratch, a new
implementation, and, on comparing that to the original, discovered some
bugs. 

Anyone who expects to be able to do this after Jones (who, if anybody,
knows what he is talking about in formal verification) has admitted that
it is not practical, should expect to be doubted in other statements.

Formal verification is not a snake oil panacea, but some people make it
sound like one and it is no wonder that reasonable people are sceptical
of their claims.

Sean

By the way, if anyone is looking for a convincing example of the
advantages of formal/rigorous development techniques, they could do a
lot worse than check the development that Jones describes in chapter 18
of that book.

sean@castle.ed.ac.uk (S Matthews) (03/16/90)

In article <12326@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>
>   [ I suppose that you program-provers, once you have proved your program,
>     are going to prove that the proof is correct, yes? No? ]
>
>
>Anecdote 1:
>
>I often think about "invariant relations" and such when I'm programming, but
>only once in the eleven years that I've been doing software engineering full
>time have I written out a formal proof of a piece of production code. (Stop me
>if you've heard this one.) The thing was made-to-order for a formal proof.
>It was a tricky hash-table algorithm. It was so tricky, that I figured I
>better prove it, so I did. A fellow worker had checked over my proof with me,
>verifying that it was correct. A couple of months later a bug manifested
>itself. No, it wasn't due to a typo in the coding -- and not a compiler-bug,
>either. The proof contained a mistake corresponding to a mistake in the code.
>
>Anecdote 2:
[deleted, makes same point]

You are not making a distinction between `formal' and `usual
mathematical': a formal proof is a machine checkable structure in the
class of deductions in a formal theory.  The possibility of a mistake is not
there (of course your theory may be inconsistent---but I humbly suggest
that this is *unlikely*, or the proof checker may be buggy, but proof
checkers are very simple programs). 

>The point is, if you think that doing a formal proof is a good way to budget
>your system test time, by all means go ahead. Just don't think it proves
>anything conclusively. You better run lots of carefully selected test-cases
>too.

I agree, but you had also better be careful about what you call a formal
proof. 

Sean

P.S.  there are good examples of *formal* proof systems around too, and
not just for the sort of weenie math that is needed for program
verification.  Try Cornell's NuPRL system for instance.  (A friend of
mine produced a proof of the finite version of Ramsey's theorem in it,
and I would say that that has been checked a great deal more thoroughly
than if Euclid, Archimedes, Newton, Gauss, Euler and Hilbert refereed it
together). 

crm@romeo.cs.duke.edu (Charlie Martin) (03/17/90)

In article <12326@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>
>   [ I suppose that you program-provers, once you have proved your program,
>     are going to prove that the proof is correct, yes? No? ]
>
>
>Anecdote 1:
>
>....A couple of months later a bug manifested
>itself. No, it wasn't due to a typo in the coding -- and not a compiler-bug,
>either. The proof contained a mistake corresponding to a mistake in the code.
>
Yup, that happens.  Just as in ...

>Anecdote 2:
>
> .... Of course, my advisor checked it carefully, as
>supposedly did the jury, but I guess there is some possibility that I too
>overlooked something. It is a very knotty bit of business, that.
>

>Now consider that both of those faulty proofs underwent much closer scrutiny
>than my little program-proof of anecdote-1 did -- years of scrutiny --
>more than ANY program-proof is ever likely to undergo, yet each was incorrect.
>
>The point is, if you think that doing a formal proof is a good way to budget
>your system test time, by all means go ahead. Just don't think it proves
>anything conclusively. You better run lots of carefully selected test-cases
>too.

I don't *think* a formal proof is a good way to budget test time; I have
evidence from controlled studies and anecdotal evidence that says it
is.  Even better, I have a strong suspicion that the proof can direct
the testing to improve it.

But let's just look at the opposite side of this statement:  "If you
think that a formal proof is a good way to establish a mathematical
statement, by all means go ahead.  Just don't think it proves anything
conclusively.  You'd better do lots of carefully-selected examples too."

As a mathematician, this ought to immediately strike you as something
between "a little off" and "of course; any mathematically literate
person knows that you should make sure that a simple example works
too."  (Although there are those horror stories about the person
who one way or another had a counter-example to his main theorem proven
during his defense.)  The point of program proving (IM philosophical O)
is that it establishes the basis on which predictions can be made,
against which actual tests can be performed, to raise the certainty of
correctness into the realm of "inductive" or "abductive"
demonstration.

If you want to look into this point further, I can mail you the
Canonical Flame War between Jim Fetzer and myself.  If that threat
doesn't put you off, nothing will.....


Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm) 

bks@alfa.berkeley.edu (Brad Sherman) (03/17/90)

In article <1423@oravax.UUCP> ian@oravax.odyssey.UUCP (Ian Sutherland) writes:
>In article <2480006@hpcuhc.HP.COM> runyan@hpcuhc.HP.COM (Mark Runyan) writes:
>-As an alternative, someone could post a non-trivial program and ask the
>-program provers to prove it either correct or incorrect...
>I like this idea.  Any takers?
>Ian Sutherland		ian%oravax.uucp@cu-arpa.cs.cornell.edu

Well this is trivial, but I think many of us would like to see some
examples of program proofs so let's start out small. Is this correct or
not?

/* ---------------------------------------------------------------------
 *      RXUTOS
 *      RadiX Unsigned TO String.
 *      Converts an unsigned integer into a string of digits in any
 *      base (radix) in the range {2, ..., 35}.  Digits needed in the range
 *      {10, ..., 35} are represented by the characters {A, B, ..., Z}.
 *
 *      This function returns a pointer to a static buffer which is
 *      overwritten on each call. On error (radix out of range), a NULL
 *      pointer is returned.
 *
 *      Author: Bradley K. Sherman (bks@alfa.berkeley.edu)
 *      Copyright (C), The Regents of the University of California, 1990
 */
char *
rxutos( n, radix )
unsigned int   n;
unsigned int   radix;
{
        static char	buf[ ( 8 * sizeof(unsigned int) ) + 1 ];
        char		*s = ( 8 * sizeof(unsigned int) ) + buf;   
 
        if ( radix < 2  ||  radix > 35 ) 
           return( NULL );
 
        *s = '\0'; 
        do {
                *--s = "0123456789ABCDEFGHIJKLMOPQRSTUVWXYZ"[n % radix];
        } while ( (n /= radix) > 0 );
 
        return( s );
}
/* --------------------- End of function rxutos() ---------------------  */

ian@oravax.UUCP (Ian Sutherland) (03/17/90)

In article <2870@castle.ed.ac.uk> sean@castle.ed.ac.uk (S Matthews) writes:
>Ian should know a *lot* better than this, and severely damages his own
>credibility by saying it. 

I don't know what you think I was saying.  The question is "what is
nontrivial?"  I'm willing to have people post programs, look at them,
and if I think it's reasonable to try to get them through our system
(Ariel), possibly with some modifications, I'll try it.  If I have
some success, I'll report it.  I don't guarantee that I'll even try a
single one of the suggestions posted.  What about this damages my
credibility?

>Anyone who expects to be able to do this after Jones (who, if anybody,
>knows what he is talking about in formal verification) has admitted that
>it is not practical, should expect to be doubted in other statements.

What is it that I'm expecting to be able to do?
-- 
------
Ian Sutherland		ian%oravax.uucp@cu-arpa.cs.cornell.edu

Sans Peur

freek@fwi.uva.nl (Freek Wiedijk) (03/17/90)

In article <1990Mar16.210324.398@agate.berkeley.edu> bks@alfa.berkeley.edu (Brad Sherman) writes:
>In article <1423@oravax.UUCP> ian@oravax.odyssey.UUCP (Ian Sutherland) writes:
>>In article <2480006@hpcuhc.HP.COM> runyan@hpcuhc.HP.COM (Mark Runyan) writes:
>>-As an alternative, someone could post a non-trivial program and ask the
>>-program provers to prove it either correct or incorrect...
>>I like this idea.  Any takers?
>>Ian Sutherland		ian%oravax.uucp@cu-arpa.cs.cornell.edu
>
>Well this is trivial, but I think many of us would like to see some
>examples of program proofs so let's start out small. Is this correct or
>not?

No:
>                *--s = "0123456789ABCDEFGHIJKLMOPQRSTUVWXYZ"[n % radix];
                                               ^^
You forgot the N!  (I'm sure, I'll be the Nth to reply with this :-)

Other criticisms: You never defined NULL, and your program is _much_
too long, it should have been: :-) :-)

/*(c)*/char*rxutos(n,r)unsigned n,r;{static char b[8*sizeof(unsigned)
+1];char*s=8*sizeof(unsigned)+b;if(r<2||r>35)return(0);*s=0;do{*--s=
"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"[n%r];}while((n/=r));return(s);}

Also: what's wrong with radix 36?

--
Freek "the Pistol Major" Wiedijk                  Path: uunet!fwi.uva.nl!freek
#P:+/ = #+/P?*+/ = i<<*+/P?*+/ = +/i<<**P?*+/ = +/(i<<*P?)*+/ = +/+/(i<<*P?)**

bks@alfa.berkeley.edu (Brad Sherman) (03/17/90)

In article <534@fwi.uva.nl> freek@fwi.uva.nl (Freek Wiedijk) writes:
> [various correct criticisms] and
> ...your program is _much_ too long, ...
without mentioning that there is some justification for omitting the line:
	*s = '\0';

I still think that it would be interesting to see a proof so I'll
try once more:

/* ---------------------------------------------------------------------
 *      RXUTOS
 *      RadiX Unsigned TO String.
 *      Converts an unsigned integer into a string of digits in any
 *      base in the range {2, ..., 36}.  Digits needed in the range
 *      {10, ..., 35} are represented by the characters {A, B, ..., Z}.
 *
 *      This function returns a pointer to a static buffer which is
 *      overwritten on each call. On error (radix out of range), a NULL
 *      pointer is returned.
 *
 *      Author: Bradley K. Sherman (bks@alfa.berkeley.edu)
 *	With thanks to: Freek Wiedijk (freek@fwi.uva.nl)
 *      Copyright (C), The Regents of the University of California, 1990
 */
char *
rxutos( n, radix )
register unsigned int   n;
register unsigned int   radix;
{
        static char     buf[ ( 8 * sizeof(unsigned int) ) + 1 ];
        char            *s = ( 8 * sizeof(unsigned int) ) + buf;   
 
        if ( radix < 2  ||  radix > 36 ) 
           return( 0 );
 
        *s = '\0';
        do {
		--s;
                *s = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"[n % radix];
		n /= radix;
        } while ( n > 0 );
 
        return( s );
}
/* --------------------- End of file rxutos.c ---------------------  */

runyan@hpcuhc.HP.COM (Mark Runyan) (03/19/90)

>/ sean@castle.ed.ac.uk (S Matthews) /  2:46 am  Mar 16, 1990 /
>Anyone who expects to be able to do this after Jones (who, if anybody,
>knows what he is talking about in formal verification) has admitted that
>it is not practical, should expect to be doubted in other statements.
>
>Formal verification is not a snake oil panacea, but some people make it
>sound like one and it is no wonder that reasonable people are sceptical
>of their claims.

You're telling me that formal proofs are *only* useful on new development?
Considering how much old code is out there, I find this slightly 
frightening.

Still, anything that can be used to improve new code is a plus.  But at
the moment, I'm trying to fight fires in old code where total replacement
is *not* considered an alternative.

Mark Runyan

crm@romeo.cs.duke.edu (Charlie Martin) (03/19/90)

In article <2480007@hpcuhc.HP.COM> runyan@hpcuhc.HP.COM (Mark Runyan) writes:
>You're telling me that formal proofs are *only* useful on new development?
>Considering how much old code is out there, I find this slightly 
>frightening.

Gee, don't be frightened, Mark.  It's no more of a boogeyman than it was
before.

Gee, on second thought, maybe I'll be frightened with you.

>
>Still, anything that can be used to improve new code is a plus.  But at
>the moment, I'm trying to fight fires in old code where total replacement
>is *not* considered an alternative.

The very first experience I had with program proof in real code was in
Germany in 1981.  I'd just read _The Science of Programming_.  The
system I was working on (I was the warranty) had a module that had just
been unending trouble.  It controlled a big n x n switch matrix, and the
matrix had a tendency not to switch correctly, but intermittently.

The hardware guys swore it was software problems; after some time, in
desperation I rewrote and proved the single routine that generated the
message to which the switch responded.  I reinstalled that module and
--- the switch still didn't work.  Since I had not only proven the
routine but tested the hell out of it (pace Fetzer) I was reasonably
confident that THIS module was generating the correct controls.  With
this basis, I was confident enough to tell the hardware guys that it was
absolutely not the software, could we put a logic analyzer on the
control lines.  When we did, the control signals were as specified.  It
was the box after all.  (Turned out the switch settings were set up
through an EPROM, which had been x-rayed coming through the export
process.)

The point of this, yet another war story, is that I got good results
during maintenance by removing one routine of about 75 lines which was
apparently real trouble and then reimplementing it with proof.  Since
many systems exhibit the 80/20 fault behavior (80 percent of the faults
lie in 20 percent of the modules) I suspect that hand-proof can be used
during maintenance to fix the buggiest modules first and get a LOT of
improvement in reliability.  The same could be said of machine-checked
proof, but machine-checked proof is higher cost.


Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm) 

hughes@tempest.Berkeley.EDU (Eric Hughes) (03/19/90)

In article <2480006@hpcuhc.HP.COM> runyan@hpcuhc.HP.COM (Mark Runyan) writes:
>   As an alternative, someone could post a non-trivial program and ask the
>   program provers to prove it either correct or incorrect...

This really isn't the way it's done.  The idea is to develop the
program and its proof simultaneously.  The generation of invariants
and assertions is more difficult (at least now, since there is not
as much cultural experience) than writing the code.

Eric Hughes
hughes@ocf.berkeley.edu

hughes@tempest.Berkeley.EDU (Eric Hughes) (03/20/90)

In article <1990Mar17.063012.24979@agate.berkeley.edu> bks@alfa.berkeley.edu (Brad Sherman) writes:
>   I still think that it would be interesting to see a proof so I'll
>   try once more:

Your original question was whether the included routine was correct or
not.  The answer, without looking at the code itself, is no, for the
simple reason that code, _per se_, is not proven correct.  What is
proven is the code plus its precondition and postcondition.  Since
there is no precondition and no postcondition, the code cannot be
proven to do anything.

Now, here is what these conditions are.  This is not the formal
definition, but a working description.  The notation generally used is
{P} S {Q}; here P is the precondition, S is the statement (which may
be a compound statement or subroutine), and Q is the postcondition.  P
and Q are logical predicates (a predicate is a sentence that is either
true or false) which describe program state.  {P} S {Q} is said to be
correct if, whenever P is true at the beginning of the execution of S,
then Q is always true at the end of the execution of S.

Here is a simple example:

{true}
a = 1
{ a equals 1 }

Here the precondition is the true predicate.  Every program state
satisfies the true predicate.  The statement is the assignment of the
value one to the variable a.  The postcondition asserts that the value
of the variable a is 1.  The program is correct, we reason informally,
since if you assign the value one to a, then the value of a is one.
This may seem rather silly, but there is a fundamental principle here:
statements and predicates alternate, each statement moving the current
assertion about program state forward one step.
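
Here is one more small sketch along the same lines, using the same
notation: a swap of two variables through a temporary, with the
current assertion written between the statements.  The names x0 and y0
simply stand for the initial values of x and y.

{ x == x0  and  y == y0 }
t = x
{ t == x0  and  y == y0 }
x = y
{ t == x0  and  x == y0 }
y = t
{ x == y0  and  y == x0 }

Each assignment carries the assertion forward one step, and the final
predicate is exactly the postcondition one would state for the swap.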

I think that a proper statement of preconditions and postconditions is
one of the best debugging tools.  I recently had to maintain a 500
line section of 8086 assembly code which was basically undocumented.
The first thing I did was to triple the size by adding assertions,
preconditions, and postconditions everywhere.  In testing, I was able
to fix half a dozen bugs simply by comparing the stated postcondition
of a routine to the expected precondition of some statement and
observing that the previous postcondition did not imply the
precondition.  It was truly wonderful.
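
To make that concrete, here is a contrived C fragment in the same
spirit (the names find_slot, table and TABLE_SIZE are invented for the
illustration; they are not from the 8086 code):

        /* Stated postcondition of find_slot: it returns an index i with
         * 0 <= i <= TABLE_SIZE, where i == TABLE_SIZE means "not found". */
        i = find_slot(key);

        /* Precondition assumed by the next statement: 0 <= i < TABLE_SIZE.
         * The stated postcondition does not imply it: the "not found"
         * case slips through, and that gap is exactly the kind of bug
         * the comparison exposes, without ever running the code. */
        table[i].count++;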

Without going into too much detail, I have written a proof of the given
subroutine and augmented the code with it.  I would say that this is
a different program, myself, even though the individual code statements
were unaltered.

Don't worry if you don't understand some of the constructions, we can
discuss them.  For more details, go get a copy of _The Science of
Programming_, by David Gries, from which I learned how to do this.

The proof is as complete as I think a hand checked proof needs to be;
in fact, I think it overly verbose.  (Every proof is, by nature,
verbose.)  It is also not _really_ complete.  I left out a calculation
to show that the bound function is always greater than zero.  This
proof took about an hour to write.

Eric Hughes
hughes@ocf.berkeley.edu

   /* ---------------------------------------------------------------------
    *      RXUTOS
    *      RadiX Unsigned TO String.
    *      Author: Bradley K. Sherman (bks@alfa.berkeley.edu)
    *	With thanks to: Freek Wiedijk (freek@fwi.uva.nl)
    *      Copyright (C), The Regents of the University of California, 1990
    */
/* Precondition: 
 *   2 <= radix <= 36
 *   n > 0
 *   the length of the representation of n in base radix is not greater
 *     than RXUTOS_BUFFER_SIZE
 * Postcondition:
 *   return value points to an ASCIIZ string which is a representation of
 *     the number n in base b. 
 */ 
/* Define b = radix
 */
   char *
   rxutos( n, radix )
   register unsigned int   n;
   register unsigned int   radix;
   {
	#define RXUTOS_BUFFER_SIZE	10
	   static char     buf[ RXUTOS_BUFFER_SIZE + 1 ];
	   char            *s = RXUTOS_BUFFER_SIZE + buf;   

	   *s = '\0';
	/* Assert s is the empty string.
	 */
	/* Define: let t be the length of the representation of n in base b
	 * Define: let s' be the number represented by s in base b
	 * Define: let u be a loop counter, starting at zero
	 * Define: n' = n
	 */
	/* Assert s' == 0
	 * Assert s' + n * 1 == n'
	 * Assert n = [ n' / 1 ]
	 */
	/* Invariant:
	 *   s' + n * b^u== n' 
	 *   n = [ n' / b^u ]       // real division, greatest integer function
	 *   s is an ASCIIZ string
	 *   buf == s + RXUTOS_BUFFER_SIZE - u
	 * Bound function:
	 *   t - u
	 * Guard:
	 *   n > 0
	 */
	   do {
		/* Define s'' = s'
		 * Define n'' = n
		 */
		   --s;
		/* Assert buf == s + RXUTOS_BUFFER_SIZE - (u + 1)
		 * Assert u <= t - 1     // from the bound function > 0
		 * Assert u >= 0
		 * Hence Assert s in buf[0..RXUTOS_BUFFER_SIZE+1)
		 *   Note:  this is necessary because with a language
		 *    that allows pointers to point anywhere there is
		 *    a need to explicitly show that a dereferenced pointer
		 *    assignment is permitted.
		 */
		   *s = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"[n % radix];
		/* Assert s is still an ASCIIZ string
		 * Assert s' = s'' + ( n'' mod b ) * b ^ u
		 * Assert n > 0
		 */
		   n /= radix;
		/* Assert n >= 0
		 * Assert n == [ n'' / b ]
		 *          == [ [ n' / b^u ] / b ]
		 *          == [ n' / b^(u+1) ]
		 */
		/* Assert s' + n * b^(u+1)
		 *         == s'' + (n'' mod b) * b^u + [ n''/b ] * b^(u+1)
		 *         == s'' + b^u * ( n'' mod b + b * [ n''/b ] )
		 *         == s'' + b^u * n''
		 *         == n'
		 */
		/* Stmt ++u
		 * Assert buf == s + RXUTOS_BUFFER_SIZE - u
		 * Assert n == [ n' / b^u ]
		 * Assert n >= 0
		 * Assert n' == s' + n * b^u
		 * Assert bound function has decreased by one
		 * Note: I have not shown that the bound function
		 *   is greater than zero after the guards.
		 */
	   } while ( n > 0 );
	/* Assert n >= 0 and not n > 0
	 * Assert n == 0
	 * Assert n' == s' + 0 * b^u == s'
	 * Assert s is the representation of the input n in base b
	 */

	   return( s );
   }
   /* --------------------- End of file rxutos.c ---------------------  */

joshua@athertn.Atherton.COM (Flame Bait) (03/20/90)

Someone has suggested that we post programs, and let the verification 
folks try to verify them.  Sounds good to me.

I would like to see any of the following four programs verified:
GnuTar, GCC, GAWK, or G++.

All of them are real, and do useful things.  All are available in source
code.  And all have pretty well defined behaviors.  GCC uses the ANSI
standard, G++ and GAWK are based on books and some follow-on technical 
reports, and I believe that GnuTar is based on the POSIX tar standard.  
GCC and G++ are both huge, but GAWK and GnuTar should be reasonably sized.

Is anyone willing to send them through a verification system?  
Ian, do you have the time for this?  If so please tell us such things
as the bugs you find, the time it took, the verification system and
methodology you used, the version of the program you verified, etc.
Also, send the results to the GNU people, I'm sure they'll be interested
in them.  If you cannot verify one of these programs, please tell us
why.

Joshua Levy                          joshua@atherton.com  home:(415)968-3718
                        {decwrl|sun|hpda}!athertn!joshua  work:(408)734-9822 

joshua@athertn.Atherton.COM (Flame Bait) (03/20/90)

In article <18261@duke.cs.duke.edu> crm@romeo.UUCP (Charlie Martin) writes:
>I don't *think* a formal proof is a good way to budget test time; I have
>evidence from controlled studies and anecdotal evidence that says it
>is.  

Can you please post 3-6 references to the controlled studies which show
that formal proofs are a better way to budget time than debugging?

For these studies to be really convincing (to me, anyway) they should
involve real world software projects, in real world environments.  The
projects should be on "normal" hardware (no LISP machines, Butterflies,
or connection machines).  I should be able to buy or otherwise acquire 
all the specialized software used.  I should be able to take classes
in the methodology used.  The studies should compare projects which 
use good SE methodology (without proofs) to good SE methodology (with
proofs).  

Remember: in the commercial world, something only exists if I can buy it.

Obviously, I'd be very interested in studies which do not match all of
the above requirements, but the more they match, the better they are.

Joshua Levy                          joshua@atherton.com  home:(415)968-3718
                        {decwrl|sun|hpda}!athertn!joshua  work:(408)734-9822 

decot@hpisod2.HP.COM (Dave Decot) (03/20/90)

> This really isn't the way it's done.  The idea is to develop the
> program and its proof simultaneously.  The generation of invariants
> and assertions is more difficult (at least now, since there is not
> as much cultural experience) than writing the code.
> 
> Eric Hughes

Righto.  For discussion purposes, here's the specification version
of the radix conversion code (somebody else will have to come up with
the invariants and assertions).

Dave
-----------

/*  This is a version of the original program with its initial
    comments converted into a C-like specification language.
    The meanings of the language constructs are intended to be
    obvious.  The use of the keyword "return" in this language
    refers to the return value of the function.
*/

/* The original specification doesn't mention radix, so the function
   could have been implemented more easily by returning "6" always.  :-)
   However, assuming that the return value should be a numeral
   representing the value of n in the indicated radix...
*/

/* Import precompiled specifications of various functions; assume that
   they are specified as in ANSI C.
*/

extern spec int strlen();
extern spec char *strchr();
extern spec long strtoul();

#define digits "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

spec char *rxutos(n, radix)
register unsigned int n, radix;
{
    if (radix >= 2 && radix <= 36)
    {
	forall int i (i >= 0 && i < strlen(return))
	{
	    define char *p (strchr(digits, return[i]));
	    guarantee p != 0 && p - digits < radix;
	}

        guarantee strtoul(return, (char **)0, radix) == n;
	guarantee strlen(return) < 8*sizeof(unsigned int) + 1;
    }
    else
	guarantee return == 0;
}

/* for comparison, here's the posted implementation */

char *rxutos(n, radix)
register unsigned int n;
register unsigned int radix;
{
        static char buf[ (8 * sizeof(unsigned int)) + 1 ];
        char *s = (8 * sizeof(unsigned int)) + buf;   
 
        if ( radix < 2  ||  radix > 36 ) 
           return( 0 );

        *s = '\0';
        do {
		--s;
                *s = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"[n % radix];
		n /= radix;
        } while ( n > 0 );
 
        return( s );
}

shimeall@cs.nps.navy.mil (Tim Shimeall) (03/20/90)

(sorry for the post -- I tried to reply directly but my mailer bounced
 your address)
In article <19989@joshua.athertn.Atherton.COM> you write:
>Can you please post 3-6 references to the controlled studies which show
>that formal proofs are a better way to budget time than debugging?

I can't post 3-6 references to such studies (in general, programming
experiments are so expensive to run that expecting several studies to
examine a single issue is VERY unrealistic) but I can point you to
one:

J.E. Brunelle and D.E. Eckhardt ``Fault Tolerant Software: Experiment
with the SIFT Operating System,'' {\it AIAA Computers in Aerospace V
Conference}, October 1985, pp. 355--360.

This study compared a portion of the verified SIFT kernel against two
independently developed versions (using a controlled development
procedure) of the kernel (all three developed from the same
specification).  They ran a large number of randomly-generated data
sets through the three versions.  The study found several faults in the
commercially-developed version, but NO faults in the verified version.

If you want to see 3-6 studies on this subject, may I suggest that you
(or your organization) sponsor such work.  But it isn't inexpensive
(you don't get something for nothing), especially if you wish to see
controlled studies done using industrial problems.  Such studies range
in cost from the hundreds of thousands of dollars to the millions of
dollars.  For comparison, I ran a testing study for my dissertation 
that involved 50+ people and took about 2 years of real time and slightly 
over a Sun/3 CPU-decade.  That was analyzing a set of 2000-line programs.  
NO ONE in the current environment is seriously planning to try controlled
development studies on large industrial problems -- the funds simply
aren't there.

crm@romeo.cs.duke.edu (Charlie Martin) (03/20/90)

The best references I can give you off the top of my head are the
Software Cleanroom papers; there is an excellent overview that will
appear (or has appeared; all I have is the preprint), titled "Cleanroom
Software Engineering", by Harlan Mills, in _Aerospace
Software Engineering_, Anderson, Dorfman and Hall, eds., AIAA 1990.
Some of the experimental results are in "Certifying the Reliability of
Software,"  IEEE TSE SE-12, No 1, Jan 1986, pp 3--11.  Also see
"Cleanroom software development: an empirical evaluation," IEEE TSE vol
SE-13, No 9, Sept 1987, I don't have the pages.

Of course, even better will be the experiment I'm going to perform when
I have the time and funding.


Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm) 

madd@world.std.com (jim frost) (03/20/90)

joshua@athertn.Atherton.COM (Flame Bait) writes:
>I would like to see any of the following four programs verified:
>GnuTar, GCC, GAWK, or G++.

None of these even comes close to passing a good C checker cleanly,
much less a formal proof.  I'd suggest picking something that wouldn't
be so easy to prove incorrect.

The argument is moot, anyway.  Program proving is wonderful when you
absolutely positively must not fail and it's worth the extra time to
do the proof.  For most applications, however, the time necessary to
do the proof would make the application out-of-date before it shipped.
Both sides are right depending on the job requirements.

jim frost
saber software
jimf@saber.com

hughes@locusts.Berkeley.EDU (Eric Hughes) (03/21/90)

In article <19988@joshua.athertn.Atherton.COM> joshua@athertn.Atherton.COM (Flame Bait) writes:
   I would like to see any of the following four programs verified:
   GnuTar, GCC, GAWK, or G++.

OK, I accept.  I pick GAWK.  If you will stipulate that the program
halts on all inputs, then my proof is finished, because I have
picked the true postcondition.

Now if you write a formal spec of the behavior of GAWK, I will
have another, and different, task at hand.  But I think it will
take you longer to write a formal spec than for me to write a 
proof.

Maxim:  A program is only correct with respect to a precondition
and postcondition.

Furthermore, utilities like GAWK, which are meant to be general
purpose, public utilities, should always have the true precondition,
since they should not hang up on arbitrary input.

Eric Hughes
hughes@ocf.berkeley.edu

cox@stpstn.UUCP (Brad Cox) (03/22/90)

Apparently you haven't heard that C (and its derivatives, including
Objective-C and undoubtedly C++ as well) supports assertion checking.
It's called assert(). You will undoubtedly find it in /usr/include/assert.h
on your system, or thereabouts.  You use it like this:

	#include "assert.h"
	assert(YES);
	a = 1;
	assert(a == 1);

The macro generates
	if (!(1)) {
		fprintf(stderr, "assertion failure in file %s line %d\n",
			__FILE__, __LINE__);
		exit(1);
	}
	a = 1;
	if (!(a == 1)) {
		fprintf(stderr, "assertion failure in file %s line %d\n",
			__FILE__, __LINE__);
		exit(1);
	}

But if you define the preprocessor symbol NDEBUG, the assert()s vanish at
compile time, so they can be left in the source forever as documentation.

By the way, I recommend that the macro be modified to generate as follows,
for reasons that should become obvious once you start using it a lot.

	if (!(a == 1)) assertionFailure(__FILE__, __LINE__);
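
Here is a sketch of what I mean (the names are mine, and the assert.h
on your system will differ in detail).  I write the macro as an
expression rather than a bare if so that it can be used anywhere the
standard assert() can; the point is simply that every failure goes
through one reporting function:

	#include <stdio.h>
	#include <stdlib.h>

	void	assertionFailure();

	#ifdef NDEBUG
	#define assert(expr)	((void) 0)
	#else
	#define assert(expr)	\
		((expr) ? (void) 0 : assertionFailure(__FILE__, __LINE__))
	#endif

	/* One copy of the reporting code, shared by every assert()
	   in the program. */
	void
	assertionFailure(file, line)
	char	*file;
	int	line;
	{
		fprintf(stderr, "assertion failure in file %s line %d\n",
			file, line);
		exit(1);
	}

One obvious payoff is that a single breakpoint on assertionFailure()
catches every assertion failure in the program.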

This has proven so useful for C code in general, and for polymorphic
object-oriented Objective-C code in particular, that assertion-checking 
has become a habit, something that we take for granted that any professional
programmer would be using as a matter of routine.

Wonder why the existence of this macro comes as a surprise to so many. 
I've even heard assertion checking used as a selling point for Eiffel!

hughes@lightning.Berkeley.EDU (Eric Hughes) (03/22/90)

In article <4389@stpstn.UUCP> cox@stpstn.UUCP (Brad Cox) writes:
   Apparently you haven't heard that C (and its derivatives, including
   Objective-C and undoubtedly C++ as well) supports assertion checking.

Please don't patronize me.  Every single file of C code I write has
"#include <assert.h>" in it.  In fact, my own assert.h has a macro
halt() in it to put in places where the code should never get to.

Yes, it's useful.  Yes, many people have not heard of it.  Yes, it is
my habit, almost a reflex.  That's not the point.

There are two big differences between using a runtime check (as in the
assert() macro) and a compile-time check (an inline predicate
asserted to be true).  They are both called assertions, an unfortunate
confluence which leads to misunderstanding.

The first difference is entirely practical.  There are some runtime
checks which appear naturally in proof which just can't be
realistically checked at runtime.  Consider the behavior of the
library routine malloc().  We want to be able to say that the pointer
malloc() gives us points to a block of memory that is exclusively
ours, that is, that there does not exist another pointer which will
modify contents of this block.  I do not know of a way to verify
this with a runtime check.  I do think it possible to prove a 
memory allocator correct with the above property, however.
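
Stated in the {P} S {Q} style from my earlier post (this is my own
rough wording of the property, not a standard specification), the
claim is roughly:

	/* { size > 0 } */
	p = malloc(size);
	/* { p == NULL, or p points to a block of at least size bytes
	 *   and no other live pointer in the program aliases any byte
	 *   of that block } */

The aliasing clause quantifies over every pointer in the entire
program state, which is why no check inserted at the call site can
establish it; a proof about the allocator's own data structures can.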

Here is another example, one where a runtime check cannot in principle
be written.  Pick any numerical algorithm that uses floating point
numbers as an approximation to the real numbers, that is, that models
a mathematical object by a computer one.  We want to know what the
maximum error is from any calculation, which leads naturally to terms
of the form abs( f - r ), where f is floating point (a subset of the
real numbers) and r is real.  You can't evaluate this term on a
computer.

The second difference is much more important.  I'll even assume, for
the sake of this section of the argument, that any assertion can be
checked at runtime, even the ones above.  Even with this strengthening
assumption, runtime checks do not provide us with the confidence of a
formally correct program.  For runtime checks only assure us that each
single execution of the program is valid, not that every such possible
execution will be.  (Both these assurances presuppose that the
compiler, operating system, hardware, etc., perform as expected.  This
is a different issue than Fetzer's article.)

To say this in other language, runtime checking gives us enumerable
instances of program execution; every time we run the program on a
different input and it satisfies all of its assertions, we have
another example of a correct execution of the program.  On the other
hand, when we make a logical proof that a program satisfies its
preconditions and postcondition, we have assured ourselves (again,
insofar as our models are correct) that the program will work
correctly on every possible input; a formal proof gives us decidable
instances of correct programs.

The practical difference between these two is enormous.  For a program
which is checked for correctness at runtime is correct _only_ insofar
as it is tested before release, but a program which is proved correct
formally has the _further_ assurance that there is a logical and
formal derivation of precondition and postcondition (and that is a
powerful assurance in practice).

Eric Hughes
hughes@ocf.berkeley.edu