[comp.software-eng] Have *YOU* ever written a program which COULDN'T be proven correct?

crm@romeo.cs.duke.edu (Charlie Martin) (02/01/90)

In article <793@arc.UUCP> steve@arc.UUCP (Steve Savitzky) writes:
>>>Actually, all that is needed is for a machine to have *infinitely
>>>expandable* memory, and mine has that -- I can always go out and buy
>>>another disk.  (One of my professors once described a Turing Machine
>>>as having a tape that ran out the door and into a paper factory.)
>>You will always always always run into a limit: the number of slots for
>>disk controllers, the number of disks per controller, the addressing
>>capability of the bus, something.
>
>I was referring to diskettes, not drives -- I never said that manual
>intervention by an operator might not be required.  Yes, you will run
>into limits (including the heat death of the universe); the point is
>that a computer can be treated as a Turing machine FOR ALL PRACTICAL
>PURPOSES, (Oops; I forgot -- verification isn't practical! :-),
>because you can't put a finite upper bound on its number of states.
>

Weren't we carrying on this argument by mail just a day or two ago?  Or
was it someone else?

In any case, you won't see ME asserting that verification isn't
practical.  I regularly write programs with proofs.  I regularly code
them in C.  I regularly write programs which have zero defects during
testing and zero defects in the field in real use.  I'm talking about
multiple-thousand line programs doing hard system-programming tasks.  I
think anyone who doesn't work this way is missing the boat, and I think
anyone who writes life-critical software without proofs -- and all the
other verification and validation tools available -- is negligent.

My point is that I can treat a machine as a Turing machine if it suits
my purposes, e.g., if I can't bound a particular algorithm or I want to
ignore things like word-length restrictions.

But you can't treat a physical realization of a program as having
infinite memory for any practical purpose; you only use infinite memory
in order to gain insight about computation as a mathematical entity.
Even most of the interesting results that come out of computability
theory come about because you can bound the size of the tape required
according to the size of the input, or some such.  (Admittedly, this
isn't true of things like the halting problem.)

>
>I think you missed my point, which is that whether or not the halting
>problem can be solved *in theory* for real programs on real computers,
>it cannot be solved *in practice*.  
>
>In any case, I don't think that any practical proof methods operate by
>treating programs as FSM's.

At least one does, or can be read to: see Chapter 0 of Dijkstra's
Discipline of Programming.  The intuitive model he offers is
specifically bounded to finite state spaces.

Conveniently, most or all proof methods can ignore bounds on the state
space by simply assuming that there is "enough" memory.  One of the
important points about computability theory in terms of a formal basis
for real programming is the computability hierarchy, which establishes
bounds on what can be treated as equivalent between FSA's and TM's.

This means that one can ignore bounds in many ways; once a program is
proven "totally correct" -- which means just that the mapping from
inputs to outputs is as specified and the program always halts -- 
then you are assured that you can bound the size of the machine needed
to run the program in a real execution.

>
>....  The more
>complicated and formal the specification, the greater the likelihood
>that (a) the customer will believe it and (b) the spec will be wrong.

Sure, and that is a real problem.  The solution doesn't appear to have a
formal basis: there is no algebra of cognition to which we can appeal to
"prove" our specifications have read the user's mind.

My opinion is that interactive development and prototyping are the only
techniques which really give much leverage to this problem.  But a
formal specification can be based on the behavior of a prototype.  A
*validation* step can then confirm that the realized program behaves
the same way as the prototype.  

>
>How about dropping the term "verification" in favor of something that
>more accurately describes the state of the art, like "compile-time
>assertion-checking" (or, if appropriate, "coding-time assertion
>checking" ("semantics-directed editing?!") and "design-time
>consistency checking").

How about sending out for pizza and complaining when it doesn't arrive
on Italian bread?  The accepted and standard term for what we're talking
about is "verification."  The accepted and standard term for confirming
the utility of the resulting program and thereby the suitability of the
requirements as originally stated is "validation."  I'm not that hot on
verification as a term myself, but there is an existing community that
uses the terminology standardly, and they aren't going to change for you
or for me.

But this is just like the technical term "object."  The word has many
meanings; if you complained about the term "object-oriented programming"
on the grounds that the definition of "object" requires something that
can be seen or felt, all that would happen is that the people who
already use the terminology would conclude that you don't know the
existing work well enough to have a useful opinion.
Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm) 

steve@arc.UUCP (Steve Savitzky) (02/02/90)

In article <1303@oravax.UUCP> ian@oravax.odyssey.UUCP (Ian Sutherland) writes:
>In article <796@arc.UUCP> oravax!cornell!uw-beaver!rice!cs.utexas.edu!samsung!brutus.cs.uiuc.edu!apple!arc!steve steve@arc.UUCP (Steve Savitzky) writes:
>>In article <1297@oravax.UUCP> ian@oravax.odyssey.UUCP (Ian Sutherland) writes:
>>>I managed a project to build a verification system for C programs
>>>which use floating point arithmetic.  This system makes the explicit
>>>assumption that the state space of the program is finite.  I like to
>>>think our proof methods are getting to the practical stage :-)
>>
>>What about the stack, the heap, and the file system?
>
>Like I said: the system explicitly assumes that the state space of the
>program is finite.  This includes the stack and the heap.  It doesn't
>include the file system because the subset of C/UNIX the system handled
>didn't include system calls for accessing the file system.

I can count the useful programs in this subset on the fingers of one
foot.  (Actually, I presume your system handles things like library
routines, so it could have some use, but I couldn't resist. :-)

>>Could you
>>actually derive bounds on, say, the disk usage of a program?

>Since we didn't have the file system included in the state, the answer
>is of course "no", but as for the stack and the heap, the answer is a
>little more subtle.  The system (which is called Ariel by the way)
>could theoretically derive such bounds, but at present it's probably
                                                             ^^^^^^^^
 (almost certainly)---------------------------------------------^
>not practical to derive explicit bounds except in simple cases.  What
>we do instead is to prove that the program will run correctly (whatever
>that means for a particular program) on a sufficiently large finite
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^
>machine.                                            ^
 ^^^^^^^                                             |______
                                                            |
Good!  Whether or not you know it, you're agreeing with me. +
I can, for example, write a Turing Machine simulator, and you can
prove that it works correctly unless it runs out of memory.  But you
CAN'T determine (in a reasonable amount of time) under what
circumstances it will run out of memory (because the Busy Beaver
function grows faster than any computable function)!

Unfortunately, unless you *can* derive explicit bounds, you can't rule
out the possibility that your program is correct but such a memory hog
as to be useless on anything but a toy problem.

>           This kind of analysis will detect problems like unbounded
>recursion or unbounded use of space on the heap.  For real arithmetic,
>we essentially prove that any desired degree of accuracy in the result
>could be achieved by running the program on a machine with sufficiently
>accurate primitive real arithmetic operations (+, *, etc.).  Of course,
>explicit numerical data would be nicer, but is harder to come by.
>We're working on it ...

Again, you're proving MY point, which is simply that verification has
certain limits which are often forgotten, brushed aside, or blithely
ignored.  (I further argue that IT IS DANGEROUS TO IGNORE THESE
LIMITS.)  

My thinking on this subject has been influenced by my very first
programming assignment, 25 years ago or thereabouts.  Someone had
written a polynomial equation-solver that worked perfectly on some
very nasty, ill-conditioned test cases, but gave hopelessly incorrect
answers on any practical problem.  You can guess why:  it operated by
repeatedly transforming the polynomial to square the roots (ten
times), and any reasonable coefficients would overflow.  I fixed it by
writing a very-extended-precision floating-point package.

If it had gone through the fortran equivalent of your verifier it
would have passed, bug intact.

--
\ Steve Savitzky      \ ADVANsoft Research Corp \ REAL hackers use an AXE!
 \ steve@arc.UUCP      \ 4301 Great America Pkwy \ #include<std_disclaimer.h>
  \ arc!steve@apple.COM \ Santa Clara, CA 95954   \ 408-727-3357
   \__________________________________________________________________________

ian@oravax.UUCP (Ian Sutherland) (02/04/90)

In article <801@arc.UUCP> steve@arc.UUCP (Steve Savitzky) writes:
>In article <1303@oravax.UUCP> ian@oravax.odyssey.UUCP (Ian Sutherland) writes:
>>Like I said: the system explicitly assumes that the state space of the
>>program is finite.  This includes the stack and the heap.  It doesn't
>>include the file system because the subset of C/UNIX the system handled
>>didn't include system calls for accessing the file system.
>
>I can count the useful programs in this subset on the fingers of one
>foot.  (Actually, I presume your system handles things like library
>routines, so it could have some use, but I couldn't resist. :-)

The first version of the system was intended to verify floating point
programs of the sort found in the IMSL library of FORTRAN numerical
subroutines.  There is a large body of people who consider these
programs very useful.  When I write programs using floating point
arithmetic, I try to encapsulate the real numerical computation in
subroutines which don't have any I/O.  This allows me to change the
external interface to the program without having to change the basic
numerical algorithms, which are far less likely to need any changes.
The ability to verify such routines is therefore very useful.

In any case, I/O is being added to the subset, so your comment above
will soon be moot.  I agree that a verification system which cannot
handle I/O is of very limited usefulness.  That's why it's being
extended.  We are also adding things like pipes and forking to the
subset.

>>The system (which is called Ariel by the way)
>>could theoretically derive such bounds, but at present it's probably
>                                                             ^^^^^^^^
> (almost certainly)---------------------------------------------^
>>not practical to derive explicit bounds except in simple cases.

I don't think you know enough about what we're doing to make such a
statement, Mr. Savitzky.  I don't actually think it's as hard to do as I
first thought.  The thing that's harder is deriving bounds on roundoff
error.  Stack and heap size should be easier.  At any rate, figuring
out bounds on stack size was not a real high priority for the Ariel
project because the kinds of routines we were trying to be able to
verify almost never nest function calls to a depth of more than, say,
3.  Many of them are sufficiently small that they don't call any other
functions at all.  When we get to the point where we're working on
programs which are sufficiently large that this might be a problem,
we'll worry about addressing that problem.  The analogous comments hold
for heap size.

>>What
>>we do instead is to prove that the program will run correctly (whatever
>>that means for a particular program) on a sufficiently large finite
>                                           ^^^^^^^^^^^^^^^^^^^^^^^^^
>>machine.                                            ^
> ^^^^^^^                                             |______
>                                                            |
>Good!  Whether or not you know it, you're agreeing with me. +

What you don't seem to realize, Mr. Savitzky, is that I've been agreeing
with most of what you've been saying all along.

>I can, for example, write a Turing Machine simulator, and you can
>prove that it works correctly unless it runs out of memory.

This is NOT what we prove.  Lots of systems can prove that a program
will work correctly if it does not run out of memory.  What we prove
is that the program will work correctly if it does not run out of
memory, AND that there is SOME amount of memory which is sufficient.
As I said in my previous posting, this rules out programs that could
use unbounded amounts of stack or heap.  Perhaps you don't think that
catching bugs of that sort is important.  I do.

>But you
>CAN'T determine (in a reasonable amount of time) under what
>circumstances it will run out of memory (because the Busy Beaver
>function grows faster than any computable function)!

This comment reminds me of the comments in other newsgroups about
verification and the halting problem.  I don't have to be able to
figure out how much memory ANY program will use in order to have a
useful system.  I just need to be able to prove something useful about
useful programs.

In fact, I don't need to be able to figure out ANYTHING about stack
size OR heap size OR disk usage in order to have a USEFUL system.
There are huge numbers of bugs in programs that have nothing to do with
resource exhaustion.  If a system addresses these bugs then it can be
useful.  Of course, I'm not trying to make this argument for our
system because we DO intend to address such problems.

>Unfortunately, unless you *can* derive explicit bounds, you can't
>prove that your program isn't correct but such a memory hog as to be
>useless on anything but a toy problem.

You are quite right, but as I said above, a system does not need to
address all kinds of bugs in order to be useful, and as I said in my
previous posting, we're working on it.

[my explanation that Ariel will catch unbounded stack and heap usage
deleted]

>Again, you're proving MY point, which is simply that verification has
>certain limits which are often forgotten, brushed aside, or blithely
>ignored.  (I further argue that IT IS DANGEROUS TO IGNORE THESE
>LIMITS.)  

Again, I already agree with this.  If this is your major point, then
(a) I think you've made it adequately, and (b) I don't think anyone's
arguing with you.  I'm certain that I'm not, so stop arguing with ME,
OK?
-- 
Ian Sutherland		ian%oravax.uucp@cu-arpa.cs.cornell.edu

Sans Peur

cox@stpstn.UUCP (Brad Cox) (02/05/90)

In article <793@arc.UUCP> you write:
>In article <17143@duke.cs.duke.edu> crm@romeo.UUCP (Charlie Martin) writes:
>>In article <789@arc.UUCP> steve@arc.UUCP (Steve Savitzky) writes:
>>This is a semantic problem; you're using the word "specification" in a
>>different sense than I am.  In my terminology you are saying that you
>>don't know if what you stated in the requirements really meets your
>>needs, and you're right. 
>
>How about dropping the term "verification" in favor of something that
>more accurately describes the state of the art, like "compile-time
>assertion-checking" (or, if appropriate, "coding-time assertion
>checking" ("semantics-directed editing?!") and "design-time
>consistency checking").

This whole debate is like the fable of the blind men arguing about the
elephant.  "It's like a tree," said the one touching a leg.  "It's like a
snake," said the one touching the trunk.  And so forth.

The "computer scientists" view software as a solitary, internal, mental
activity, for which mathematics is a close analogy. The "software engineers"
view software as an organizational, external, tangible activity, for which
plumbing is (in principle; not today in practice; reusability is still more 
a dream than reality) a close analogy.

The intersection between the internal world of the mind and the external
world of the senses has been the subject of long-standing philosophical 
debate. I particularly recommend "Zen and the Art of Motorcycle
Maintenance: An Inquiry into Values", because it makes this very
distinction right in its title: Zen = the mind, Motorcycles = the senses.

None of the blind men understood the elephant from their own point of view,
but they did manage it by putting their disparate points of view together.

I go into all this in more detail in an article scheduled for the November
1990 issue of IEEE Software Magazine, "Software Technologies of the 1990's",
a coffee-table issue meant to have a shelf-life of a decade.  My
contribution is titled "Planning the Software Industrial Revolution: The
Impact of Object-Oriented Technologies".

It is about putting diverse points of view, and diverse technologies,
together.  It is a plea for the software community to stop viewing
specific technologies, a la Ada, C++, or Smalltalk, as panaceas and to
treat them instead as mere tools.  It is a plea to stop arguing over
which weapon is 'better' and to begin planning how to put all available
weapons together to win the *WAR*: the Software Industrial Revolution.

	Brad Cox

jimad@microsoft.UUCP (JAMES ADCOCK) (02/13/90)

In article <4157@stpstn.UUCP> cox@stpstn.UUCP (Brad Cox) writes:
...
>I go into all this in more detail in an article scheduled for the November
>1990 issue of IEEE Software Magazine, "Software Technologies of the 1990's",
>a coffee-table issue meant to have a shelf-life of a decade.  My
>contribution is titled "Planning the Software Industrial Revolution: The
>Impact of Object-Oriented Technologies".
>
>It is about putting diverse points of view, and diverse technologies,
>together.  It is a plea for the software community to stop viewing
>specific technologies, a la Ada, C++, or Smalltalk, as panaceas and to
>treat them instead as mere tools.  It is a plea to stop arguing over
>which weapon is 'better' and to begin planning how to put all available
>weapons together to win the *WAR*: the Software Industrial Revolution.

Designers of Object Oriented Languages and/or compilers for the beasts
can help in this by addressing methods to make OOPLs interoperable.
Once upon a time Basic couldn't call C couldn't call Pascal couldn't
call FORTRAN.  Now we have widely available compilers allowing this
kind of interoperability.

Then came OOP, and no languages can inter-call[message] again.  Mind you,
many OOPLs support calling C or other non-OOP languages-- but allow little
or no support for calling *from* C or other non-OOP language.  And the 
various OOPLs don't consider how to message between each other.  Thus
each OOPL attempts to lock out the others.

Give us ways to [almost]seamlessly message between OOPLs, give us ways
to fire up the various OOPL run-time support systems from a different
language, and give us run-time support systems that can co-exist with
each other.  Then people can mix and match languages and software modules
as best meet their needs.

brnstnd@stealth.acf.nyu.edu (02/14/90)

In article <793@arc.UUCP> steve@arc.UUCP (Steve Savitzky) writes:
> I think you missed my point, which is that whether or not the halting
> problem can be solved *in theory* for real programs on real computers,
> it cannot be solved *in practice*.  

A year ago I posted a short note to comp.theory titled Finite Halting
Problem Considered Solvable. The gist of it is that you can solve the
halting problem for a program using just twice the memory and thrice
the time.

In practice, with language support, you designate certain variables
to be tested for looping. Those variables are stored twice, and the
program runs at a third of the speed. If those variables together enter
a loop, the loop will be detected during its first period. For some
programs this could be very useful.

---Dan