[comp.lang.misc] Will this *thread* ever halt?

mathew@mantis.co.uk (CNEWS MUST DIE!) (05/20/91)

yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:
>In article <TcT422w164w@mantis.co.uk> mathew@mantis.co.uk (CNEWS MUST DIE!) writes:
>>Yes, but do you want to have to be aware of exactly how much memory is
>>available to your programs?
>
>Not always. But, for the compiler to  warn me that 
>"while(1) i = i+1" will either crash or cycle state forever does not require
>me to know this information.

Indeed. There are always simple cases which can be analysed easily. The
compiler I use gives warnings about while loops with constant boolean
conditions.

In general, though, determining whether a program will halt on a real
computer by analysing it as a FSM requires knowledge about exactly how much
memory is available.

>>Consider the following pseudo-code:
>>
>>i = 0
>>repeat
>>   allocate 1 byte of memory
>>   i++
>>until allocation fails
>>if i is prime, while(1);
>>end
>>
>>You can write such a program in ANSI C. In order to predict whether it will
>>enter a non-terminating loop you have to know exactly how much memory is
>>available to the program when it runs, to the byte.
>
>If your point is that one can write ANSI "C" programs that are machine
>specific, sure.

The above program is not machine specific. You can write it so that it will
compile and run on any machine which supports ANSI C. You just can't tell
whether it will terminate or not unless you know exactly how much memory that
machine has.

>            And if your point is that one can write ANSI "C" programs
>that will depend on I/O or operating system behavior, I also agree.
[...]
>The compiler cannot predict such paths, but it can warn us when our
>programs contain i/o dependent paths that can lead to indefinite cycles.

How many non-trivial C programs don't use dynamic memory allocation?  Sure,
you can restrict yourself to fixed-size data structures, but that leads to
even WORSE problems when you try to handle I/O.

>Again, you raise a valid issue, but I'm not clear how it applies to the
>point at hand. If I write a program that can run correctly given a "C"
>compiler that uses 32bits to represent integers, but which will loop
>given an 8 bit "C" compiler, the compilers can flag the error, without
>requiring the programmer to worry about exact representation.

Eh?  The programmer wants his program to work. If it works for 32 bit ints
but not for 16 bit ints, then he has to worry about exact representation.
Having the compiler say "your program doesn't work because this machine has
16 bit ints" does not mean that the programmer doesn't have to worry about
it.

>                                                            We provide
>abstractions in our programming languages, but these are conveniences which
>do not negate the facts that real machines do not have "integers" and
>"real numbers", but only have limited representations of these mathematical
>abstractions.

"Real numbers" are a complete con. I'd rather leave them out of it.
"Integers" _do_ exist in some programming languages; the limitation is the
amount of hardware you give the language to work in. I have commented on this
in another posting.

>         Even if the compiler implements "i" with a variable format
>(e.g. bignums), the programmer will want to know if the program may
>exceed the limits of machine representation.

When working with integers mathematically you have to know whether your
result will become too big to work with on paper.

>                                                 A correctly written program
>will not depend on exactly how much memory is available, and if the compiler
>flags such a dependency, then it has found what is most likely an error.

In that case, there are no correctly-written programs which perform
non-trivial tasks.

>>The restrictions you have to place on a language in order to enable
>>guaranteed termination are such that the language is not useful for general
>>programming.
>
>Demonstrate this? Any demonstration would seem to require unusually
>precise information about the limits of ingenuity.

I was talking in the present tense. If you want to start speculating wildly
about future developments in program proving, then go ahead.

I was also taking as given the assumption that most people require an
imperative language for general programming. Given the relative popularity of
pure functional languages vs C, BASIC, FORTRAN and so on, I think that that is
reasonable.

>>Secondly, I want to know whether my programs will work on my friends'
>>machines.
>
>Try compiling on those machines, or compile with a "portability" flag.

What is this "portability" flag supposed to do?  Not bother reporting about
termination?

>If I had a "C" compiler that would run on, say, an IBM PC with at least
>6 meg of memory and a large disk, which would tell me within a few days
>whether or not an arbitrary "C" program would fail given at most
>X bytes of memory (X < 3 meg), then 
>I believe that at least a few people would be interested
>in using it.

Indeed. But you can't do it. Even knowing the exact value of X doesn't help;
you have to know other things such as the value of the system clock.
(Consider "if seconds = 32 then while (1)").

You might be able to write a compiler which could predict whether or not a C
program would run on a machine with 3104272 bytes of memory, if started at
precisely 3:15:22 PM on Tuesday 21st May with no other programs resident.
Whether it would be worth your while to do so is another matter.


mathew

 

yodaiken@chelm.cs.umass.edu (victor yodaiken) (05/21/91)

In article <gZy9221w164w@mantis.co.uk> mathew@mantis.co.uk (CNEWS MUST DIE!) writes:
>In general, though, determining whether a program will halt on a real
>computer by analysing it as a FSM requires knowledge about exactly how much
>memory is available.
>

The way you cast the problem evades the TM/FSM question.  Yes. Programs
that depend on particularities of machine configuration do depend on
particularities of machine configuration. If we write a program P that
requests specific machine configuration information, and then chooses
a path based on that data, then the operation of the program will not
be the same on all machines. But, this is essentially the same situation
as i/o dependencies. I argue that it is possible, in theory, for a C
compiler to detect all paths that lead to non-terminating loops and to
flag these paths. It is not possible for a "C" compiler to determine,
in advance, the result of checking the current time, asking how much
memory is available, or reading the next character typed on a terminal.
These questions are not undecidable in the technical sense; they are
simply undetermined.

>How many non-trivial C programs don't use dynamic memory allocation?  Sure,
>you can restrict yourself to fixed-size data structures, but that leads to
>even WORSE problems when you try to handle I/O.

If we give X.c to the hypothetical smart compiler and it tells us that
X.c contains a path that exceeds the process memory limits on this
machine, or that X.c contains a path that will cause it to enter a
non-terminating loop if "alloc" fails, then the compiler has solved
the halting problem for this code.

>>Again, you raise a valid issue, but I'm not clear how it applies to the
>>point at hand. If I write a program that can run correctly given a "C"
>>compiler that uses 32bits to represent integers, but which will loop
>>given an 8 bit "C" compiler, the compilers can flag the error, without
>>requiring the programmer to worry about exact representation.
>
>Eh?  The programmer wants his program to work. If it works for 32 bit ints
>but not for 16 bit ints, then he has to worry about exact representation.
>Having the compiler say "your program doesn't work because this machine has
>16 bit ints" does not mean that the programmer doesn't have to worry about
>it.

You've lost me here. The correctness of "C" programs can depend on how
the compiler implements "ints". This is a feature of the "C" language, not
an artificial constraint that I'm inventing. Programmers cannot treat
"C" as if it were more high level than it is, and expect to get away with
it. 


>>                                                 A correctly written program
>>will not depend on exactly how much memory is available, and if the compiler
>>flags such a dependency, then it has found what is most likely an error.
>
>In that case, there are no correctly-written programs which perform
>non-trivial tasks.
>

Can you give me an example of such a program? The "C" compiler, lisp
interpreter, emacs editor, news-reader, operating system, computer
algebra programs and theorem provers we have around here seem to 
be able to tolerate wide variations in available memory.

>
>>If I had a "C" compiler that would run on, say, an IBM PC with at least
>>6 meg of memory and a large disk, which would tell me within a few days
>>whether or not an arbitrary "C" program would fail given at most
>>X bytes of memory (X < 3 meg), then 
>>I believe that at least a few people would be interested
>>in using it.
>
>Indeed. But you can't do it. Even knowing the exact value of X doesn't help;
>you have to know other things such as the value of the system clock.
>(Consider "if seconds = 32 then while (1)").
>

Again, you mistake the nature of the problem. If I claim that P
implements the function f: D -> D', then I am claiming that for any
x in the appropriate domain running P on input x will result in 
the value f(x). If the compiler can tell me that there is an x in the
domain that causes P to enter a non-terminating loop, then the compiler
has detected a bug in my program. You are arguing that the compiler will
not be able to predict x. So what? Generally, we want to know whether the
program will compute the correct value given any input, and we include
such things as system time among the inputs. 

debray@cs.arizona.edu (Saumya K. Debray) (05/22/91)

yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:

> I argue that it is possible, in theory, for a C compiler to detect all
> paths that lead to non-terminating loops and to flag these paths. 
[ ... ]
> If we give X.c to the hypothetical smart compiler and it tells us that
> X.c contains a path that exceeds the process memory limits on this
> machine, or that X.c contains a path that will cause it to enter a
> non-terminating loop if "alloc" fails, then the compiler has solved
> the halting problem for this code.

Good.  Perhaps Mr. Yodaiken would care to convince us infidels once and
for all by posting some source code to do what he has so eloquently
argued can be done?  And if Mr. Yodaiken is unable/unwilling to do so,
then is this "discussion" accomplishing anything more than blowing warm
air back and forth across the networks?

The proof of the pudding, Mr. Yodaiken, is in the eating.  You've argued
that it's possible to decide the halting problem for "real computers".  
So show us.

-- 
Saumya Debray		CS Department, University of Arizona, Tucson

     internet:   debray@cs.arizona.edu
     uucp:       uunet!arizona!debray

gudeman@cs.arizona.edu (David Gudeman) (05/22/91)

In article  <30814@dime.cs.umass.edu> victor yodaiken writes:
]... I argue that it is possible, in theory, for a C
]compiler to detect all paths that lead to non-terminating loops and to
]flag these paths.

I argue that this is impossible in theory.  Infinite-loop detection
algorithms must almost certainly store some information about the
state space of the algorithm in question.  Furthermore, in the general
case this information must almost certainly require space on the order
of the size of the set of all possible states at a given point in the
program (see H1 below).

If so, the amount of information needed for at least some classes of
programs is huge.  So there are programs that would compile without
this analysis that cannot be compiled with this analysis.
Furthermore, due to the magnitude of the information that the
algorithm has to keep around, I'll bet that _most_ nontrivial programs
could not be analysed in this way.

Of course you can always reduce the amount of information you keep
around by compacting many states into a single approximating state.
If you do this you can greatly increase the set of programs you can
analyse, but your analysis is only approximate.  You can make the
approximation a safe one or a complete one: A safe approximation will
recognize all programs that definitely will _not_ halt.  A complete
approximation will recognize all programs that definitely _will_ halt.
I expect that these approximations can be made good enough to be very
useful.

Hypothesis H1:
  any algorithm, HALT(p) that solves the halting problem for p must in
general, for each point in p, encode at one time all the possible
states that might occur at that point in p.

Argument:
  suppose HALT(p1) returns TRUE.  Create program p2 by adding
somewhere the statement 

  if (F()) exit(); else for(;;);

where F is a function whose output depends on most of the current
state.  Then HALT must be able to determine whether F can ever
possibly answer 0 in order to tell if p2 halts.  To do this, in
general HALT must be able to encode all possible states that might
occur at this point.  It is possible to construct sets of states that
cannot be encoded by anything much smaller than the entire set of
states.

Any HALT algorithm that attempts to avoid this by only doing one state
at a time must in general deal with the possibility that HALT might
not terminate, since it is simulating the program being analysed.  The
same considerations apply to a program that works with subsets of the
total set of possible states.

You can use the trick of running p in two different virtual machines,
one going twice as fast as the other, and comparing states.  The
storage overhead of this is equal to twice the size of the states of
p.  In this case, there are still programs that will compile without
HALT that will not compile with HALT -- so _theoretically_ this
doesn't solve the problem.  And it isn't practical either, since if
there is an infinite loop, it may take gigayears to find it.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

yodaiken@chelm.cs.umass.edu (victor yodaiken) (05/22/91)

In article <3409@optima.cs.arizona.edu> debray@cs.arizona.edu (Saumya K. Debray) writes:
>yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:
>
>> I argue that it is possible, in theory, for a C compiler to detect all
>> paths that lead to non-terminating loops and to flag these paths. 
>[ ... ]
>> If we give X.c to the hypothetical smart compiler and it tells us that
>> X.c contains a path that exceeds the process memory limits on this
>> machine, or that X.c contains a path that will cause it to enter a
>> non-terminating loop if "alloc" fails, then the compiler has solved
>> the halting problem for this code.
>
>Good.  Perhaps Mr. Yodaiken would care to convince us infidels once and
>for all by posting some source code to do what he has so eloquently
>argued can be done?


I'm flattered that you have so much confidence in my abilities: if I can't
come up with an algorithm, then clearly, such an algorithm cannot exist.
There is an obvious, and obviously impractical algorithm, however. Let's
first consider the case of programs that just compute a function of 
their arguments: i.e. programs that perform no i/o but  can be written
as subroutines of the form prog( type: x, type: y .... ). 
For example, suppose we have a C function  "int f(x,y,z) int x,y,z ..."
Now we look at how "ints" are implemented on the target machine, suppose
that each int is a 32 bit value. We then emulate f(x,y,z) for each of
the mere 2^96 possible assignments. The emulation consists of single
stepping the computation of "f" either on the target computer or within
a program emulating this computer. At each step, we make a copy
of the entire memory space of the emulation. The program is sure to
either repeat state (a non-terminating loop) or exit within a bounded
amount of time because it can only reference a bounded amount of memory.

[BTW, this is not as impractical as it seems. H. Massalin has written
an ingenious assembly language optimizer that, given a short
assembly language routine that computes a simple result, will find the
shortest possible sequence of assembly statements that computes the
same function. Massalin's "Super-optimizer" essentially makes an
exhaustive search of the set of all 1 line assembly programs, then all
2 line programs, ..... With some smart heuristics the optimizer comes up with
some incredible sequences of code (e.g., a min function that has no
branch). There is a paper on the optimizer in some recent ASPLOS
proceedings. So, enough of this whining about what we can't do. ]

Now we return to the impractical algorithm and extend it to programs that
have system calls. There are a couple of strategies here, but we'll take
the dumbest, since it requires the least effort  from me. Each system
call, i/o driven or not, changes program state in only a bounded number
of ways. For example, a call "read(filedescriptor,bufferaddr,n)"
will change the values of the "n" bytes of memory beginning at
"bufferaddr" in some way. Whenever we encounter such a call in the
"C" program, we simply try all 2^(n*8) possibilities. Similarly, with
calls to "alloc" which either add some non-determined data to the program
data space (up to a known limit) or return a fail code. We just test
all branches. Pretty simple, hey? 

This algorithm will work. And in some cases it might even be practical.
In hardware design it is already common to build special purpose machines
that do brute force emulations of chip behavior.  Is there a smarter
algorithm, a better programming language, a well defined o.s. interface
that will make this type of analysis practical without special purpose
hardware? No idea. But, there is emphatically no mathematical theorem
that eliminates the possibility of such an algorithm.

>The proof of the pudding, Mr. Yodaiken, is in the eating.  You've argued
>that it's possible to decide the halting problem for "real computers".  
>So show us.

yodaiken@chelm.cs.umass.edu (victor yodaiken) (05/22/91)

In article <3410@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>In article  <30814@dime.cs.umass.edu> victor yodaiken writes:
>]... I argue that it is possible, in theory, for a C
>]compiler to detect all paths that lead to non-terminating loops and to
>]flag these paths.
>
>I argue that this is impossible in theory.  Infinite-loop detection
>algorithms must almost certainly store some information about the
>state space of the algorithm in question.  Furthermore, in the general
>case this information must almost certainly require space on the order
>of the size of the set of all possible states at a given point in the
>program (see H1 below).
>
>If so, the amount of information needed for at least some classes of
>programs is huge.  So there are programs that would compile without
>this analysis that cannot be compiled with this analysis.
>Furthermore, due to the magnitude of the information that the
>algorithm has to keep around, I'll bet that _most_ nontrivial programs
>could not be analysed in this way.
>

Your argument is that the task is difficult, and would require
huge resources, possibly impractical resources, for most non-trivial
programs. No disagreement here. Impractical != impossible. You must
understand "in theory" in a nonstandard way. 


>Of course you can always reduce the amount of information you keep
>around by compacting many states into a single approximating state.
>If you do this you can greatly increase the set of programs you can
>analyse, but your analysis is only approximate.  You can make the
>approximation a safe one or a complete one: A safe approximation will
>recognize all programs that definitely will _not_ halt.  A complete
>approximation will recognize all programs that definitely _will_ halt.
>I expect that these approximations can be made good enough to be very
>useful.
>

I suspect that you are right. 

>Hypothesis H1:
>  any algorithm, HALT(p) that solves the halting problem for p must in
>general, for each point in p, encode at one time all the possible
>states that might occur at that point in p.
>
>Argument:
>  suppose HALT(p1) returns TRUE.  Create program p2 by adding
>somewhere the statement 
>
>  if (F()) exit(); else for(;;);
>
>where F is a function whose output depends on most of the current
>state.  Then HALT must be able to determine whether F can ever
>possibly answer 0 in order to tell if p2 halts.  To do this, in
>general HALT must be able to encode all possible states that might
>occur at this point.  It is possible to construct sets of states that
>cannot be encoded by anything much smaller than the entire set of
>states.
>
That's an interesting conjecture. For many cases it would be easy, even
if "F" looks at all of memory. For example, if F computes a checksum on
all of memory, then the possible state set of F would be huge, but it
would be easy to see that F does return 0 sometimes. Essentially, we
want to know if there is a path through F that terminates at some
"return 0" state, but we don't want to have to build the whole damn
state set of F.  People in hardware boolean circuit optimization and
in formal verification have been looking at this problem. I'll dig
up some references if anyone wants. Clarke and Burch recently reported that
in verification of bit-slice controllers, the state set of 10^20 elements
did not have to be completely unrolled, and that only 60,000 ( ?) or
so states had to be visited. But, your mileage may vary.


>Any HALT algorithm that attempts to avoid this by only doing one state
>at a time must in general deal with the possibility that HALT might
>not terminate, since it is simulating the program being analysed.  The
>same considerations apply to a program that works with subsets of the
>total set of possible states.

Yes. The HALT might not detect a loop, because it has thrown away the 
loop start state.  Is that your point? 

gudeman@cs.arizona.edu (David Gudeman) (05/22/91)

In article  <30835@dime.cs.umass.edu> victor yodaiken writes:
]In article <3410@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
]>In article  <30814@dime.cs.umass.edu> victor yodaiken writes:
]>]... I argue that it is possible, in theory, for a C
]>]compiler to detect all paths that lead to non-terminating loops and to
]>]flag these paths.
]>I argue that this is impossible in theory.
]Your argument is that the task is difficult...
].. Impractical != impossible. You must understand "in theory" in a
]nonstandard way.

No, it's just that you seem to be trying to have it both ways: "Since
a computer has a finite number of states, it is possible
in principle to solve the halting problem for it, since you can always
just traverse all possible states."  The first part of the sentence
implies that you are modeling a computer as an FSM.  The second part
of the sentence implies that you are modeling a computer as a
non-finite state machine, since you are assuming that you can traverse
all possible states.

Pick your model.  No matter which one you pick, you cannot solve the
halting problem for all programs on a given machine with a program on
that machine.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

mathew@mantis.co.uk (CNEWS MUST DIE!) (05/22/91)

yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:
> In article <gZy9221w164w@mantis.co.uk> mathew@mantis.co.uk (CNEWS MUST DIE!) writes:
> >In general, though, determining whether a program will halt on a real
> >computer by analysing it as a FSM requires knowledge about exactly how much
> >memory is available.
>                                           If we write a program P that
> requests specific machine configuration information, and then chooses
> a path based on that data, then the operation of the program will not
> be the same on all machines.

OK. Give me an example of a C program which performs a major useful function,
and does not use any features which would make halting analysis of that
program dependent upon knowing exact details of the machine hardware.

Just one.

>                      I argue that it is possible, in theory, for a C
> compiler to detect all paths that lead to non-terminating loops and to
> flag these paths.

No. It is possible in theory for a C compiler to detect all paths that MIGHT
lead to non-terminating loops. Anything more requires exact detailed
knowledge about the hardware.

And if you opt to flag all paths which MIGHT lead to non-terminating loops or
system crashes, you'll end up with most of the program being flagged.

>                   It is not possible for a "C" compiler to determine,
> in advance, the result of checking the current time, asking how much
> memory is available, or reading the next character typed on a terminal.
> These questions are not undecidable in the technical sense; they are
> simply undetermined.

Right. But the former two are properties of the machine environment the
program will be running in, and can seriously affect whether the program
halts. Look at all the programs which crashed when the system date on UNIX
went past the size of an int.

You can, in certain languages, make halting predictions; you can say "If this
program is fed a file of finite size, it will terminate". C is not, in
general, one of those languages.

> If we give X.c to the hypothetical smart compiler and it tells us that
> X.c contains a path that exceeds the process memory limits on this
> machine, or that X.c contains a path that will cause it to enter a
> non-terminating loop if "alloc" fails, then the compiler has solved
> the halting problem for this code.

...on this particular machine, running at this particular time, with this
particular amount of processor stack, and so on.

To get any useful GENERAL halting results would require massive restrictions
on what you allowed in your "Halting ANSI C". Such severe restrictions that I
doubt whether the language would be useful.

> >Eh?  The programmer wants his program to work. If it works for 32 bit ints
> >but not for 16 bit ints, then he has to worry about exact representation.
> >Having the compiler say "your program doesn't work because this machine has
> >16 bit ints" does not mean that the programmer doesn't have to worry about
> >it.
> 
> You've lost me here. The correctness of "C" programs can depend on how
> the compiler implements "ints". This is a feature of the "C" language, not
> an artificial constraint that I'm inventing. Programmers cannot treat
> "C" as if it were more high level than it is, and expect to get away with
> it. 

Right. And whether a C program halts or not can be critically dependent upon
the size of integers used in that particular implementation of C.

So to perform your magic halting analysis, you need to know exactly how big
integers are. And if your analysis is to be useful, this information needs to
be directly or indirectly communicated to the programmer. Therefore the
programmer needs to worry about exact implementation specifics.

> >>                                                 A correctly written program
> >>will not depend on exactly how much memory is available,
> >
> >In that case, there are no correctly-written programs which perform
> >non-trivial tasks.
> 
> Can you give me an example of such a program? The "C" compiler, lisp
> interpreter, emacs editor, news-reader, operating system, computer
> algebra programs and theorem provers we have around here seem to 
> be able to tolerate wide variations in available memory.

Really?  Try running them in 1KB.

There will be some lower memory threshold x beneath which the programs will
not run. If you have x-1 bytes of memory, the programs will not run. It is
therefore important that the exact amount of memory you have is greater than
x. This is a condition on exactly how much memory must be available. Equality
isn't the only condition which requires you to know exact amounts.

> >>If I had a "C" compiler that would run on, say, an IBM PC with at least 6
> >>meg of memory and a large disk, which would tell me within a few days
> >>whether or not an arbitrary "C" program would fail given at most X bytes
> >>of memory
[...]
> >Indeed. But you can't do it. Even knowing the exact value of X doesn't help;
> >you have to know other things such as the value of the system clock.
[...]
> Again, you mistake the nature of the problem. If I claim that P
> implements the function f: D -> D', then I am claiming that for any
> x in the appropriate domain running P on input x will result in 
> the value f(x). If the compiler can tell me that there is an x in the
> domain that causes P to enter a non-terminating loop, then the compiler
> has detected a bug in my program. You are arguing that the compiler will
> not be able to predict x. So what? Generally, we want to know whether the
> program will compute the correct value given any input, and we include
> such things as system time among the inputs. 

You said "an arbitrary C program". Not "an arbitrary C program which is not
allowed to look at the system time".

If your C code is not allowed to look at the system time, not allowed to use
malloc (which effectively involves looking at available memory), not allowed
to look at the file system (for temporary files) and so on, then yes, you
probably can find out whether it will compute the correct value given any
input.

Now, how many useful programs do you know which satisfy those criteria?

To look at it another way: Most C programs will end up with a list of inputs
which includes system memory, clock, stack size, file system size, and so on.
You will need to know the values of these inputs before you can determine
whether the program will halt or not. This is not very useful.

I'm about ready to give up arguing with you. You seem to be just as clueless
as when you started, and I really have better things to do than argue about
whether you can solve the halting problem for general ANSI C programs.

If you still believe you can determine whether an arbitrary non-trivial ANSI
C program will halt or not, and have that information generally applicable
enough to be useful to the programmer, then go out there and do it and earn
yourself millions. While you're at it, there are some guys in sci.skeptic who
have a design for a perpetual motion machine which needs development.


mathew

 

yodaiken@chelm.cs.umass.edu (victor yodaiken) (05/22/91)

In article <3430@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>No, it's just that you seem to be trying to have it both ways: "Since
>a computer has a finite number of states, it is possible
>in principle to solve the halting problem for it, since you can always
>just traverse all possible states."  The first part of the sentence
>implies that you are modeling a computer as an FSM.  The second part
>of the sentence implies that you are modeling a computer as a
>non-finite state machine, since you are assuming that you can traverse
>all possible states.
>

You are defining the problem incorrectly. I do not argue that for any
FSM M, we can use M to decide its own halting problem. Instead, I argue
that there is a deterministic  *algorithm* with time/space bounded by the
number of states of M, and that this algorithm can decide the halting
problem for M. If we want, we can compute this algorithm on a FSM with
sufficient state space (probably quite a bit larger than M).

You are arguing that there are programs which can be executed on
any computer which use enough of the state space of the computer as to
leave no room for a program that would emulate them. But, of course. 
However these programs *can* be emulated on a bigger, but still finite,
physical or virtual machine.  Note that there is no analog for this
procedure with TMs. A TM does not get any more powerful
when we add additional tapes or heads, but FSMs limited to n states
are strictly less powerful than FSMs limited to n+1 states.  
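Victor's claim can be made concrete. Given a deterministic step function over a finite state space, an analyzer can decide halting by running two copies of the machine at different speeds (Floyd's cycle detection): either a halt state is reached, or a state repeats, which for a deterministic machine means it cycles forever. A minimal sketch in C; the toy 8-bit machine here is a made-up stand-in for M:

```c
#include <stdint.h>

/* Toy deterministic machine M: an 8-bit state, halted when state == 0.
   This step function is a hypothetical stand-in for a real machine's
   state-transition function. */
static uint8_t step(uint8_t s)
{
    if (s < 4)
        return s ? (uint8_t)(s - 1) : 0;  /* counts down to the halt state */
    return (uint8_t)(s ^ 1);              /* states >= 4 cycle in pairs */
}

static int halted(uint8_t s) { return s == 0; }

/* Decide halting for M started in state s0.  Uses only two copies of
   M's state and no table, so space is bounded by M's state size.
   Returns 1 if M halts, 0 if it revisits a state (cycles forever). */
int halts(uint8_t s0)
{
    uint8_t slow = s0, fast = s0;
    for (;;) {
        if (halted(fast)) return 1;
        fast = step(fast);
        if (halted(fast)) return 1;
        fast = step(fast);
        slow = step(slow);
        if (slow == fast) return 0;   /* repeated state: deterministic loop */
    }
}
```

With realistic memory sizes the fast copy may need astronomically many steps, which is precisely the feasibility objection raised elsewhere in the thread; the point is only that the procedure is finite.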

mlc%gva.decnet@CONSRT.ROCKWELL.COM ("GVA::MLC") (05/23/91)

Newsgroups: comp.lang.misc
From: mlc@aten.cca.rok.com (Michael L. Cook)
Subject: Re: Will this *thread* ever halt?
Sender: news@zoot.avgrp.cr.rok.com
Reply-To: MLC%GVA.DECNET@CONSRT.ROK.COM
Organization: Rockwell Collins, Cedar Rapids, IA

A general solution to the halting problem does not exist,
just as it is impossible to predict when this thread will halt!!

--

Michael Cook
Internet: mlc%gva.decnet@consrt.rok.com
"A little infrastructure goes a long way."  --  Stuart Shipman

rockwell@socrates.umd.edu (Raul Rockwell) (05/23/91)

victor yodaiken:
   I argue that there is a deterministic *algorithm* with time/space
   bounded by the number of states of M, and that this algorithm can
   decide the halting problem for M. If we want, we can compute this
   algorithm on a FSM with sufficient state space (probably quite a
   bit larger than M).

Well, yes.  If you're dealing with an arbitrary machine with 16k of ram
(8-bit bytes), you have to have a state space of over 10^(10^6) bits to
store halting information (halts/doesn't) for every possible state.
And this doubles every time you add a bit to the machine -- for a unix
system with a gigabyte of disk space, your state space is over
10^(8.26x10^10) bits (a number greater than 1 followed by over 82
billion zeros (decimal)).

On the other hand, many algorithms can easily be shown to halt.  (Any
algorithm which is "primitive recursive" in the number-theoretic
sense, which includes just about every computer programming technique
except arbitrary loops and arbitrary recursion.)

So I can see why a person might want to say that the halting problem
is solvable for a practical machine, but I wouldn't say that using
state space is a useful solution.  The solution I see is to limit
yourself to a class of problems which are primitive recursive.
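Raul's restriction can be illustrated in C terms: a primitive-recursive-style computation uses only loops whose trip count is fixed before the loop begins, so termination is evident by inspection, with no halting analysis needed. A small sketch:

```c
/* Bounded iteration in the primitive-recursive style: the trip count n
   is known before the loop starts, so the loop provably terminates.
   Contrast with while (p(x)) x = f(x);, whose bound is not evident. */
unsigned long factorial(unsigned n)
{
    unsigned long acc = 1;
    unsigned i;
    for (i = 1; i <= n; i++)    /* exactly n iterations, then stop */
        acc *= i;
    return acc;
}
```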

From personal experience, I can say that you're still going to have to
resort to state machine style techniques when you deal with i/o or
secondary storage (secondary storage being defined as any storage not
managed transparently by your target language).

And I'm sure that any systems implementor can tell you horror stories
about lurking race conditions and other potential bogosities in the os...

Raul Rockwell

kwalker@cs.arizona.edu (Kenneth Walker) (05/23/91)

In article <30863@dime.cs.umass.edu>, yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:
> 
> You are defining the problem incorrectly. I do not argue that for any
> FSM M, we can use M to decide its own halting problem. Instead, I argue
> that there is a deterministic  *algorithm* with time/space bounded by the
> number of states of M, and that this algorithm can decide the halting
> problem for M. If we want, we can compute this algorithm on a FSM with
> sufficient state space (probably quite a bit larger than M).

The original problem, as I recall, was whether a compiler can solve
the halting problem for real programs. I insist on being able to
run your compiler on the same machine I run my program on. (After
all, this is a real-world question and not some "silly" theoretical
exercise, right? :-)

  Ken Walker / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721
  +1 602 621-4252  kwalker@cs.arizona.edu {uunet|allegra|noao}!arizona!kwalker

yodaiken@chelm.cs.umass.edu (victor yodaiken) (05/24/91)

In article <NqHc34w164w@mantis.co.uk> mathew@mantis.co.uk (CNEWS MUST DIE!) writes:
>yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:
>> In article <gZy9221w164w@mantis.co.uk> mathew@mantis.co.uk (CNEWS MUST DIE!) 
>> >In general, though, determining whether a program will halt on a real
>> >computer by analysing it as a FSM requires knowledge about exactly how much
>> >memory is available.

I'm really mystified by this discussion. Part of the problem may arise from
the odd idea of "exact" that you seem to hold.
You write:

>There will be some lower memory threshold x beneath which the programs will
>not run. If you have x-1 bytes of memory, the programs will not run. It is
>therefore important that the exact amount of memory you have is greater than
>x. This is a condition on exactly how much memory must be available. Equality
>isn't the only condition which requires you to know exact amounts.
>

Forgive me if I mistake your meaning, but this seems like complete nonsense.
The expression x > 10 does not constrain x to an "exact amount".
If emacs requires at least 4 meg of memory in order to run correctly,
then knowing that x > 4 meg is certainly good enough.

A second problem arises from your misuse of the term "halting problem".
The halting problem involves deterministic machines. It does not
involve predicting the time of day at which you will choose to run
a program or the next word which I will type into a keyboard.
A computer program can be considered to be a deterministic machine
computing a function on its input. You want to say that program behavior
is not deterministic, because the input is not deterministic.  Generally,
however, we are less interested in predicting system input than in making
sure that the program can cope correctly with *every* possible input.
When I say that we can, in theory, decide whether or not a "C"
program contains a non-terminating loop, I do not mean to suggest
that we can decide in advance whether or not it will ever enter that
loop. The program "read c; if c='\n' while(1); exit" contains
a reachable non-terminating loop. You may be content
to run this program without ever hitting a carriage return. I'm not,
however, interested in predicting your behavior.
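The inline pseudocode above can be rendered as a compilable sketch. Modeling the program as a function of its single input character makes the point explicit: a checker can report that the loop is reachable (for input '\n') without predicting whether that input will ever occur. The function names here are invented for illustration:

```c
/* "read c; if c='\n' while(1); exit", as a function of its one input.
   Returns 0 when the program would exit normally; never returns when
   it would spin in the loop. */
int run(int c)
{
    if (c == '\n')
        while (1)               /* reachable non-terminating loop */
            ;
    return 0;
}

/* What a static checker could report: the set of inputs that reach
   the non-terminating loop. */
int loop_reachable_on(int c)
{
    return c == '\n';
}
```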

I wrote:
>> Again, you mistake the nature of the problem. If I claim that P
>> implements the function f: D -> D', then I am claiming that for any
>> x in the appropriate domain running P on input x will result in 
>> the value f(x). If the compiler can tell me that there is an x in the
>> domain that causes P to enter a non-terminating loop, then the compiler
>> has detected a bug in my program. You are arguing that the compiler will
>> not be able to predict x. So what? Generally, we want to know whether the
>> program will compute the correct value given any input, and we include
>> such things as system time among the inputs. 
>
You reply:
>You said "an arbitrary C program". Not "an arbitrary C program which is not
>allowed to look at the system time".
>

Here, I'm considering the program as a function with input varying over
its domain. In this case, consider the domain to be the set of possible
time values. We don't want to attempt to predict which elements of this
domain will be encountered: i.e., we don't want to try to predict the
time of day when the program will be run. We just want to make sure that
the program behaves "correctly" for any time value. 

>I'm about ready to give up arguing with you. You seem to be just as clueless
>as when you started, and I really have better things to do than argue about
>whether you can solve the halting problem for general ANSI C programs.


Go to 'em.

) (05/28/91)

yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:
> Forgive me if I mistake your meaning, but this seems like complete nonsense.
> The expression x > 10 does not constrain x to an "exact amount".

No. It does, however, constrain x to be greater than some exact amount. As
opposed to being greater than some fuzzily-defined amount which is
approximately 10.

> If emacs requires at least 4 meg of memory in order to run correctly,
> then knowing that x>4meg is certainly good enough.

It depends. It might require significantly more memory than that if you're
editing large files. The amount of memory required by Emacs is not an exact
amount. It depends what you're doing with the program.

Part of the job of determining whether an ANSI C program will halt will
involve placing exact bounds on how much memory it might use in running, and
hence producing exact specifications of input values/conditions for which it
might not terminate. And remember that a single byte might make a difference;
so if it's December, your calendar program might fail to work properly
whereas it worked fine in May, because somewhere the program is allocating
space for the name of the month and this is resulting in its not having
enough space to complete some other task.

> The halting problem involves deterministic machines. It does not
> involve predicting the time of day at which you will choose to run
> a program or the next word which I will type into a keyboard.

I didn't mean to imply that it did.

> A computer program can be considered to be a deterministic machine
> computing a function on its input. You want to say that program behavior
> is not deterministic, because the input is not deterministic.  Generally,
> however, we are less interested in predicting system input than in making
> sure that the program can cope correctly with *every* possible input.

Right. So in addition, your Terminating ANSI C has to check whether all
possible return values from standard library calls will result in termination
or not. For example, if the program uses time.h, we must check to see that it
terminates for all possible start times.

Basically, it adds another dimension of complexity to the problem.

> >You said "an arbitrary C program". Not "an arbitrary C program which is not
> >allowed to look at the system time".
> 
> Here, I'm considering the program as a function with input varying over
> its domain. In this case, consider the domain to be the set of possible
> time values. We don't want to attempt to predict which elements of this
> domain will be encountered: i.e., we don't want to try to predict the
> time of day when the program will be run. We just want to make sure that
> the program behaves "correctly" for any time value. 

Right. That's fine.

Basically, for any moderately complicated ANSI C program the dimensions of
the input domain and the range of each dimension will be so large that trying
all combinations will be infeasible on even the largest computers. I'll leave
it to someone else to do some sums to demonstrate this.

In short, determining the halting status of ANSI C programs is not feasible
for real programs; and it's not possible for general ANSI C programs, because
general ANSI C programs might have arbitrarily large input spaces making
analysis require impossible amounts of memory and CPU time.

If you want to write programs and be told whether they will halt or not, you
will have to use a different language; one which is restricted so that you
don't have to rely on brute-force methods for determining halting status. For
example, a functional language which uses map operations (iteration over a
structure) rather than while loops and which limits general recursion.
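The restriction described above can be sketched even in C: iteration expressed as a map over a fixed-size structure terminates by construction, because each step consumes one element. A hedged illustration (a functional language would express this without the explicit index):

```c
/* Map over a fixed-size structure: the loop bound is the structure's
   length, so termination is built into the iteration form itself. */
void map_inplace(int *xs, int n, int (*f)(int))
{
    int i;
    for (i = 0; i < n; i++)     /* one step per element, then stop */
        xs[i] = f(xs[i]);
}

/* Example element operation. */
int twice(int x) { return 2 * x; }
```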


mathew

 

doug@netcom.COM (Doug Merritt) (05/29/91)

In article <k3ZN38w164w@mantis.co.uk> mathew@mantis.co.uk (CNEWS MUST DIE!) writes:
>In short, determining the halting status of ANSI C programs is not feasible
>for real programs; and it's not possible for general ANSI C programs, because
>general ANSI C programs might have arbitrarily large input spaces making
>analysis require impossible amounts of memory and CPU time.

I don't see why all that stuff is even being discussed; like I pointed out
the other day, even determining halting for self-contained C programs (no
malloc, no clock dependency, etc) is impossible, so why bring in the other
junk?

>If you want to write programs and be told whether they will halt or not, you
>will have to use a different language; one which is restricted so that you
>don't have to rely on brute-force methods for determining halting status. For
>example, a functional language which uses map operations (iteration over a
>structure) rather than while loops and which limits general recursion.

If you are aware of work that shows that the halting problem is soluble
for functional languages, I would dearly love to see the reference. Or
even that it *might* be soluble. (In other words, I don't believe it for
a minute; functional languages are universal computational models and
as such can be mapped one-to-one with Turing machines and the lambda calculus
etc.)

Also, you and your conversational opponent are starting to sound like you're
in violent agreement. No doubt that's just because I've lost track of your
actual base positions.
	Doug
-- 
Doug Merritt				doug@netcom.com (apple!netcom!doug)
	-OR- sun.com!jfrank!doug	-OR- doug@eris.berkeley.edu
Professional Wild-eyed Visionary	Member, Crusaders for a Better Tomorrow

yodaiken@chelm.cs.umass.edu (victor yodaiken) (05/29/91)

In article <k3ZN38w164w@mantis.co.uk> mathew@mantis.co.uk (CNEWS MUST DIE!) writes:
>Basically, for any moderately complicated ANSI C program the dimensions of
>the input domain and the range of each dimension will be so large that trying
>all combinations will be infeasible on even the largest computers. I'll leave
>it to someone else to do some sums to demonstrate this.
>
>In short, determining the halting status of ANSI C programs is not feasible
>for real programs; and it's not possible for general ANSI C programs, because
>general ANSI C programs might have arbitrarily large input spaces making
>analysis require impossible amounts of memory and CPU time.

This discussion has passed into a terminally dull zone. So, I'm going
to give up, with some final words. First, I'd like to return to my
original point: computer scientists overuse and misuse the notion
of undecidable problems and in doing so may be shutting off some interesting
research avenues. It is not clear that there are any interesting problems
in compiler design and program verification which are undecidable. This
is simply because we are seldom interested in knowing whether computations
can be done in unbounded time and space; rather, we generally want to know
if a problem can be solved using some bounded level of resources, e.g.,
within 10 nanoseconds, within polynomial/exponential time,
within n time units where n is a parameter, within 100 million steps
times the size of the input, using no more than MAXPROC
space etc. etc.   Second, I want to reiterate that it is important to
precisely distinguish between "hard" and "impossible". There are some
problems that we know are impossible to solve algorithmically, and there
are other problems which seem intractable. But, just because the
readers of this newsgroup are incapable of thinking up an algorithm
or theoretical result on the fly, does not imply that there is no
such algorithm, or no result that will melt  a currently intractable
problem. Finally, a quote:
 
\begin{quote}
Turing machines are widely considered to be the abstract prototype of
digital computers; workers in the field, however, have felt more and
more that the notion of a Turing machine is too general to serve as
an accurate model of actual computers. It is well known that for even
a simple calculation it is impossible to give an {\it a priori} upper
bound on the amount of tape a Turing machine will need for any given
computation. It is precisely this feature which renders Turing's
concept unrealistic. 
In the last few years the idea of a {\em finite automata} has appeared
in the literature. These are machines having only a finite number
of internal states that can be used for memory and computation. The
restriction of finiteness appears to give a better approximation to
the idea of a physical machine. Of course, such machines cannot do as
much as Turing machines, but the advantage of being able to compute a
general recursive function is questionable, since very few of these
functions come up in practical applications.
\end{quote}

author = {M. O. Rabin and Dana Scott},
title = {Finite automata and their decision problems},
journal = {IBM Journal of Research and Development},
volume = {3},
number = {2},
month = {April},
year = {1959},
pages = {114--125}

rockwell@socrates.umd.edu (Raul Rockwell) (05/30/91)

Victor Yodaiken:
   \begin{quote}
   Turing machines are widely considered to be the abstract prototype
   of digital computers; workers in the field, however, have felt more
   and more that the notion of a Turing machine is too general to
   serve as an accurate model of actual computers. It is well known
   that for even a simple calculation it is impossible to give an 
   {\it a priori} upper bound on the amount of tape a Turing machine
   will need for any given computation. It is precisely this feature
   which renders Turing's concept unrealistic.
   \end{quote}

   author="Rabin M.O. and Dana Scott",
[etc...]

While this quote may be literally accurate (workers in the field have
felt...) it misses what I feel is a rather major point:

 The reason it is impossible to give an upper bound on tape for a
 Turing machine computation is that these computations
 are traditionally expressed for *arbitrarily large objects*.  If any
 object can be arbitrarily large then it doesn't matter WHAT the
 computation is, it will take an arbitrarily large amount of tape to
 represent the computation on that object.

Note that if you limit your objects to some finite size (like,
integers whose absolute magnitude is less than 2 billion), you can
represent (for example) multiplication of two of these numbers using
only two tape cells.  Of course, then you have to deal with overflow
conditions, but that's a problem with any finite number
representation.
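Raul's bounded-multiplication point, sketched in C rather than as a Turing machine: once operands are confined to a finite range, the operation is finite-state, but an explicit overflow case appears, just as he notes. The helper name is invented; `long long` is a post-ANSI-C89 convenience used only for the check:

```c
#include <limits.h>

/* Multiply two ints confined to a finite range.  The wide intermediate
   detects the overflow case that any finite representation forces us
   to handle. */
int checked_mul(int a, int b, int *overflow)
{
    long long p = (long long)a * (long long)b;
    if (p > INT_MAX || p < INT_MIN) {
        *overflow = 1;
        return 0;               /* value unusable on overflow */
    }
    *overflow = 0;
    return (int)p;
}
```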

Note that you do *not* want to write out each of the quadruples by
hand for this kind of Turing machine--so don't.  Just use
parameterized descriptions.

Raul Rockwell

) (05/30/91)

doug@netcom.COM (Doug Merritt) writes:
> If you are aware of work that shows that the halting problem is soluble
> for functional languages, I would dearly love to see the reference. Or
> even that it *might* be soluble.

Sorry, this bit of the thread was talking not about solving the halting
problem in general, but about getting useful (but limited) information about
halting conditions for certain classes of program. I don't believe that
Victor Yodaiken thinks that *general* C programs are analyzable for halting
status.

My contention is that even this limited sort of halting information is
impractical to work out for ANSI C; but I understand that some functional
languages might allow you to gain useful information about the likelihood of
some useful programs halting.

I'm not really an expert in functional programming, so I'll leave it at that.


mathew

 

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/12/91)

In article <1991May28.204434.16396@netcom.COM> doug@netcom.COM (Doug Merritt) writes:
> I don't see why all that stuff is even being discussed; like I pointed out
> the other day, even determining halting for self-contained C programs (no
> malloc, no clock dependency, etc) is impossible,

There does not exist a single self-contained C program which will, all
by itself, determine the halting status of *every* self-contained C
program.

However, for any given *size* (including program size and memory use) of
self-contained C program, there exists a self-contained C program which
will determine the halting status of every self-contained C program
under that size. Therefore ``determining halting for self-contained C
programs'' is quite possible.

In fact, it's rather easy to construct the program with approximately
twice the given size, plus a fixed amount of housekeeping. So
``determining halting for self-contained C programs'' is not only
possible, it's practical.

In fact, it's rather easy.

Now, Doug, would you like to explain on what basis in fact or fiction
you decided to call this ``impossible''?

---Dan

smryan@clipper.ingr.com (Steven Ryan) (06/13/91)

In article <22012:Jun1205:25:0491@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:

>However, for any given *size* (including program size and memory use) of
>self-contained C program, there exists a self-contained C program which
>will determine the halting status of every self-contained C program
>under that size. Therefore ``determining halting for self-contained C
>programs'' is quite possible.

Not all programs have a priori space/time bounds. Your conclusion
therefore only applies to bounded programs. You have no proof that unbounded
programs do not exist.

Even if you wish to assume that the universe has only finite space and time,
therefore any 'practical' program must be bounded, your argument is not
proved. If the universe is bounded and if there exists a program which
requires exactly all the space of the universe, the program to analyze it
requires double the size of universe and is thus 'practically' impossible.

>In fact, it's rather easy.

Perhaps you should spend a little time studying the math you so easily dismiss.

>Now, Doug, would you like to explain on what basis in fact or fiction
>you decided to call this ``impossible''?

Not that you'd pay attention to this.
-- 
|-In compliance with DoC,-----------------------Steven Ryan--------------------|
| this posting is devoid         |              2400 Geng Road, Palo Alto, CA  |
| of all information.            |       [clipper![wyrmham!]] smryan@ingr.com  |
|-And Oliver North married William Secord/and gave birth to a little Teheran.--|

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/14/91)

In article <1991Jun12.194212.21395@clipper.ingr.com> smryan@clipper.ingr.com (Steven Ryan) writes:
> In article <22012:Jun1205:25:0491@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> >However, for any given *size* (including program size and memory use) of
> >self-contained C program, there exists a self-contained C program which
> >will determine the halting status of every self-contained C program
> >under that size. Therefore ``determining halting for self-contained C
> >programs'' is quite possible.
> Not all programs have an a priori space/time bounds.

Since you obviously weren't paying attention: ``Self-contained C
program'' was defined by Doug to mean one that used neither malloc()
nor an unpredictable external input.

> Your conclusion
> therefore only applies to bounded programs.

No shit. That's what ``for any given *size*'' means: it's a *bound*.
(``Welcome to Mr. Rogers' Neighborhood. Can you say `bound'? B-O-U-N-D.
Bound. Very good, boys and girls. This is a bound.'')

> You have no proof that unbounded
> programs do not exist.

Sure I do. Computers have finite memory. To determine halting for an IBM
PC in some state, I need merely hook up another IBM PC and a bit of
extra hardware, and run both machines a bit slower than usual. To
determine halting for a given language environment, the interpreter or
compiler need only duplicate its memory and do a bit of housekeeping.
This is eminently practical.

And, by the way, here's a bit of elementary logic: Even if unbounded
programs ``exist'' in one sense or another (certainly not in the real
world), that would not affect the validity of my statements, since I
quantified them only over the set of self-contained C programs, all of
which are bounded.

So, Stevie, what does your claim have to do with what I said? As your
mother told you: If you aren't contributing to the discussion at hand,
shut the fuck up.

> >In fact, it's rather easy.
> Perhaps you should spend a little time studying the math you so easily dismiss.

Ahahahaha. That's cute. For someone who's just made a fool out of
himself, Stevie, you've got a good sense of humor, you know?

---Dan

billaud@geocub.UUCP (Michel BILLAUD) (06/17/91)

In article <10452.Jun1317.26.4791@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <1991Jun12.194212.21395@clipper.ingr.com> smryan@clipper.ingr.com (Steven Ryan) writes:
>> In article <22012:Jun1205:25:0491@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>> Your conclusion
>> therefore only applies to bounded programs.
>
>No shit. That's what ``for any given *size*'' means: it's a *bound*.
>(``Welcome to Mr. Rogers' Neighborhood. Can you say `bound'? B-O-U-N-D.
>Bound. Very good, boys and girls. This is a bound.'')
>
>> You have no proof that unbounded
>> programs do not exist.
>
>Sure I do. Computers have finite memory. To determine halting for an IBM
>PC in some state, I need merely hook up another IBM PC and a bit of
>extra hardware, and run both machines a bit slower than usual. To
>determine halting for a given language environment, the interpreter or
>compiler need only duplicate its memory and do a bit of housekeeping.
>This is eminently practical.

	Another point is that there is only a 
	*FINITE*NUMBER* of computers in the universe.
	As "Computers have finite memory", there
	exists a computer CM which has more memory
	than others.
	"Practically", there will be no computer
	to run your program to determine halting
	on that CM.
	You could object that you may "hook up" several
	CM with additional hardware : but there is a finite
	number of networks, and so on.

Also we mortals have a limited amount of time, so ...

M Billaud

	

-- 
Michel BILLAUD                 :  billaud@geocub.greco-prog.fr
Departement d'Informatique     :  
IUT "A", Universite Bordeaux I :  phone W: 56.84.57.92  // 56.84.69.22
33405 Talence  (FRANCE)        :        H: 56.36.65.90 (answering mach.)

smryan@clipper.ingr.com (Steven Ryan) (06/18/91)

In article <10452.Jun1317.26.4791@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>So, Stevie, what does your claim have to do with what I said? As your
>mother told you: If you aren't contributing to the discussion at hand,
>shut the fuck up.
>
>> >In fact, it's rather easy.
>> Perhaps you should spend a little time studying the math you so easily dismiss.
>
>Ahahahaha. That's cute. For someone who's just made a fool out of
>himself, Stevie, you've got a good sense of humor, you know?
>
>---Dan

Ooooooooo! Touche! A delightfully well thought out and logically incisive
argument.

You're sexy when you're mathematical.
-- 
|-In compliance with DoC,-----------------------Steven Ryan--------------------|
| this posting is devoid         |              2400 Geng Road, Palo Alto, CA  |
| of all information.            |       [clipper![wyrmham!]] smryan@ingr.com  |
|-And Oliver North married William Secord/and gave birth to a little Teheran.--|

smryan@clipper.ingr.com (Steven Ryan) (06/18/91)

>Since you obviously weren't paying attention: ``Self-contained C
>program'' was defined by Doug to mean one that used neither malloc()
>nor an unpredictable external input.
>
>> Your conclusion
>> therefore only applies to bounded programs.
>
>No shit. That's what ``for any given *size*'' means: it's a *bound*.
>(``Welcome to Mr. Rogers' Neighborhood. Can you say `bound'? B-O-U-N-D.
>Bound. Very good, boys and girls. This is a bound.'')

And of course I missed where you put a bound on all external storage, including
temporary files, discs, mag tapes, and whatever else.

You're just so sexy.
-- 
|-In compliance with DoC,-----------------------Steven Ryan--------------------|
| this posting is devoid         |              2400 Geng Road, Palo Alto, CA  |
| of all information.            |       [clipper![wyrmham!]] smryan@ingr.com  |
|-And Oliver North married William Secord/and gave birth to a little Teheran.--|

smryan@clipper.ingr.com (Steven Ryan) (06/19/91)

OOOOOO Danny you're so rough. I guess you like playing public school master.
All us bad little boys better bare our bottoms and let you flail away.

>Since you obviously weren't paying attention: ``Self-contained C
>program'' was defined by Doug to mean one that used neither malloc()
>nor an unpredictable external input.

(1) Is recursion also prohibited? It's simple to write a pda which uses no
heap. Does this mean
		pda = fsm	?

(2) Now that you've demonstrated all kinds of wonderful things about
`self-contained C programs' can you give some examples of this class?
cat? cp? Not wc unless you're going to restrict the size of the input.
Or does predictable input mean bounded size? I guess no exabyte tape
drives this semester.

>Sure I do. Computers have finite memory. To determine halting for an IBM. . .
>. . .This is eminently practical.

(3) Of course, how stupid of me, this is soooo obvious. Let's assume your
analyzer uses twice as much space (speak dirty to me: say bytes) as the
analyzee. But does the analyzer halt? I know, I'll build an analyzer for
it. The analyzer analyzer only uses four times as much space as the
analyzee, but does it halt? I know, I'll build an analyzer for it. The
analyzer analyzer analyzer only uses eight times as much space as the
analyzee, but does it halt? I know, I'll build an analyzer for it. The
. . . analyzer analyzer analyzer analyzer analyzer analyzer analyzer . . .

Dan, you sexy programmer, you. I guess you're all goosebumpy just thinking about
how you've solved the halting problem. And you only need infinite space.
-- 
|-In compliance with DoC,-----------------------Steven Ryan--------------------|
| this posting is devoid         |              2400 Geng Road, Palo Alto, CA  |
| of all information.            |       [clipper![wyrmham!]] smryan@ingr.com  |
|-And Oliver North married William Secord/and gave birth to a little Teheran.--|

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/21/91)

In article <1991Jun18.172115.7185@clipper.ingr.com> smryan@clipper.ingr.com (Steven Ryan) writes:
> >Since you obviously weren't paying attention: ``Self-contained C
> >program'' was defined by Doug to mean one that used neither malloc()
> >nor an unpredictable external input.
> (1) Is recursion also prohibited?

In any reasonable computational model with a reasonable definition of
``size'', unbounded non-tail recursion implies unbounded space.

> (2) Now that you've demonstrated all kinds of wonderful things about
> `self-contained C programs' can you give some examples of this class?

Sure: A program which integrates a fluid flow field over a 10000x10000
grid. A graph manipulation program. A program which tests a given number
for primality. A program which finds the next billion zeros of the
Riemann zeta function. Have you forgotten the derivation of the word
``computer''?

> cat? cp? Not wc unless you're going to restrict the size of the input.
> Or does predictable input mean bounded size? I guess no exabyte tape
> drives this semester.

Predictable input means that you can replace all input statements with a
bounded set of coroutines which supply that input. It does not imply
``finite input'', which more precisely might mean ``the n'th read() will
return 0 for some n''; an infinite stream of x's is perfectly
predictable. Figure out the rest for yourself.

> >Sure I do. Computers have finite memory. To determine halting for an IBM. . .
> >. . .This is eminently practical.
> (3) Of course, how stupid of me, this is soooo obvious. Let's assume your
> analyzer uses twice as much space (speak dirty to me: say bytes) as the
> analyzee. But does the analyzer halt? I know, I'll build an analyzer for
> it.

Don't be stupid. The analyzer will always halt, because the original
computer either enters a loop (in which case the analyzer detects the
loop) or itself halts (in which case the analyzer halts too).

> Dan, you sexy programmer, you. I guess you're all goosebumpy just thinking about
> how you've solved the halting problem. And you only need infinite space.

Are you babbling because you were lost somewhere in the evolutionary
tree, or because you simply don't know what you're talking about?

---Dan

silver@xrtll (Hi Ho Silver) (06/21/91)

Sayeth smryan@clipper.ingr.com (Steven Ryan):
$And of course I missed where you put a bound on all external storage, including
$temporary files, discs, mag tapes, and whatever else.

   This is even more effective if your girlfriend is bound as well.

$You're just so sexy.

   Particularly since I managed to bind myself to my girlfriend and
we're both STARK NAKED in the back of our BLACK WINDOWLESS VAN trying
to find a way to get out so that we can run across the parking lot and
into the 7-11 with our PRIVATE PARTS BOUNCING ALL OVER THE PLACE where
EVERYONE CAN SEE THEM.

   It wouldn't have been so bad if she'd finished shaving my PUBIC HAIR
because then I wouldn't have gotten any of them caught in the ropes.
-- 
.--------------------------------------.nexus.yorku.edu!xrtll!silver
|Silver, perpetually searching for SNTF|----------------------------
`--------------------------------------'a vaguely phallic .signature

smryan@clipper.ingr.com (Steven Ryan) (06/22/91)

In article <4457.Jun2021.35.1991@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <1991Jun18.172115.7185@clipper.ingr.com> smryan@clipper.ingr.com (Steven Ryan) writes:
>> >Since you obviously weren't paying attention: ``Self-contained C
>> >program'' was defined by Doug to mean one that used neither malloc()
>> >nor an unpredictable external input.
>> (1) Is recursion also prohibited?
>
>In any reasonable computational model with a reasonable definition of
>``size'', unbounded non-tail recursion implies unbounded space.

Golly. I would never have thought of that by myself. Thanks.

Do you permit recursion and temporary files in your self-contained C
program?

>> (2) Now that you've demonstrated all kinds of wonderful things about
>> `self-contained C programs' can you give some examples of this class?
>
>Sure: A program which integrates a fluid flow field over a 10000x10000
>grid. A graph manipulation program. A program which tests a given number
>for primality. A program which finds the next billion zeros of the
>Riemann zeta function. Have you forgotten the derivation of the word
>``computer''?

Oh, I get it now. Any program with fixed size input is equivalent to a FSM.
Wow. That's a really important result. Are you going to write a paper
on this?

What about an n*n grid where n is not known when the program is coded?
What about an n+e graph manipulator when n and e are not known when the program....
A program that tests any number for primality?
The next m zeros?

Computer? Compter? I reckon so. (Inside joke for our Dutch and German friends.)

>Predictable input means that you can replace all input statements with a
>bounded set of coroutines which supply that input. It does not imply
>``finite input'', which more precisely might mean ``the n'th read() will
>return 0 for some n''; an infinite stream of x's is perfectly
>predictable. Figure out the rest for yourself.

Well, the coroutines have to be fixed size to compile. If you're serious about
bounded recursion and space, that means the only unbounded inputs are
type 3 languages. Wow, another important result. It only takes a FSM to
recognise a type 3 language.

You are indeed a tower of intellect. I bet you got all kinds of net.groupies.

>Don't be stupid. The analyzer will always halt, because the original
>computer either enters a loop (in which case the analyzer detects the
>loop) or itself halts (in which case the analyzer halts too).

Of course! Ah-ha! It's so obvious. If you say it's true, it must be.

You don't happen to have a proof, do you?

>Are you babbling because you were lost somewhere in the evolutionary
>tree, or because you simply don't know what you're talking about?

Oh, Dan, I am the dust beneath your feet. I grovel for enlightenment. Please
answer the query, which is too much for my feeble brain.

I'm running my program on a machine #1, maxing out its memory. You run your
analyzer on a slightly larger machine #2. Well, I'm maxed and need an
upgrade--I use machine #2. I guess you need a slightly larger machine #3.
Oh, but I'll take that one instead. Now you need a slightly larger machine #4.

Silly me, but it seems that I can consume machines faster than you can produce
them. If, as you claim, resources are inherently limited, it seems I'll
be using some machine #n and there won't be a machine #n+1 for your analyzer.
Is my program still a self-contained C program? If so, how are you going to
run your analyzer?
-- 
|-In compliance with DoC,---------------------Steven Ryan----------------------|
| this posting is devoid         |            2400 Geng Road, Palo Alto, CA    |
| of all information.            |            smryan@wyrmham.clipper.ingr.com  |
|-And Oliver North married William Secord/and gave birth to a little Teheran.--|

rockwell@socrates.umd.edu (Raul Rockwell) (06/22/91)

Dan Bernstein:
   Don't be stupid. The analyzer will always halt, because the
   original computer either enters a loop (in which case the analyzer
   detects the loop) or itself halts (in which case the analyzer halts
   too).

Detecting loops on a machine with n states requires O(2^n) states in
the analyzer.  Except for trivial cases, this is far more than n+n
states.

-- 
Raul <rockwell@socrates.umd.edu>

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/23/91)

In article <ROCKWELL.91Jun22010647@socrates.umd.edu> rockwell@socrates.umd.edu (Raul Rockwell) writes:
> Dan Bernstein:
>    Don't be stupid. The analyzer will always halt, because the
>    original computer either enters a loop (in which case the analyzer
>    detects the loop) or itself halts (in which case the analyzer halts
>    too).
> Detecting loops on a machine with n states requires O(2^n) states in
> the analyzer.

This is the equivalent in computer ``science'' of an old wives' tale.

Would you like to prove it, Raul? Let me guess: You were taught in
computer science class that the halting problem could be solved for
n-state machines with a 2^n-state machine that simply kept track of
which states had been seen so far. *Therefore*, said your professor, the
halting problem was solvable for FSMs in theory, but not in practice.
How's that for a guess? ``This method for solving the problem takes
exponential time. Therefore the problem is not in P.''

Anyone who reads Knuth (which, I gather, has fallen out of favor among
computer scientists, as their students no longer know enough math to
understand the first two pages) will find out that to detect loops in a
sequence X_{n+1} = f(X_n), you do NOT need to store all previous X_k and
compare X_{n+1} to each in turn. All you need to do is generate X_n and
X_{2n} simultaneously, then compare the two. Can you prove this, Raul?
Can you figure out how to adapt it to make a practical solution to the
halting problem? Congratulations. Now what do you think of the statement
``Detecting loops on a machine with n states requires O(2^n) states in
the analyzer''?

---Dan

mhcoffin@tolstoy.waterloo.edu (Michael Coffin) (06/23/91)

In article <29254.Jun2219.22.4491@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>
>Anyone who reads Knuth (which, I gather, has fallen out of favor among
>computer scientists, as their students no longer know enough math to
>understand the first two pages) will find out that to detect loops in a
>sequence X_{n+1} = f(X_n), you do NOT need to store all previous X_k and
>compare X_{n+1} to each in turn. All you need to do is generate X_n and
>X_{2n} simultaneously, then compare the two. Can you prove this, Raul?
>Can you figure out how to adapt it to make a practical solution to the
>halting problem? Congratulations. Now what do you think of the statement
>``Detecting loops on a machine with n states requires O(2^n) states in
>the analyzer''?

Lighten up, Mr. Bernstein.  (As the inimitable Jimmy Buffett says, "Don't
ever forget---you might end up being wrong.")  The technique you
describe is well known by most computer scientists, and great for
finding cycles in graphs.  It doesn't make a "practical solution" to
the halting problem because it can take a long, long time to run.  Here's
a quick exercise---maybe a 10 in Knuth's rating system.  How many
steps can the above algorithm run, as a function of the number of
bits of state?  Estimate how long the algorithm would take
to decide that the following program indeed halts:

    for i := 1 to 2^31 do
    	for j := 1 to 2^31 do
    	    for k := 1 to 2^31 do
    	    	;

Can a solution to the halting problem be said to be "practical" if it
takes millennia to terminate when applied to a simple program that uses
less than 12 bytes of state?

The loop detection algorithm might actually be handy once in a while,
to detect programs that get into tight loops that don't change the
state.  But don't you think that calling it a "practical solution to
the halting problem" is a tad overblown?

-mike

rockwell@socrates.umd.edu (Raul Rockwell) (06/23/91)

Dan Bernstein:
   Anyone who reads Knuth (which, I gather, has fallen out of favor
   among computer scientists, as their students no longer know enough
   math to understand the first two pages) will find out that to
   detect loops in a sequence X_{n+1} = f(X_n), you do NOT need to
   store all previous X_k and compare X_{n+1} to each in turn. All you
   need to do is generate X_n and X_{2n} simultaneously, then compare
   the two. Can you prove this, Raul?  Can you figure out how to adapt
   it to make a practical solution to the halting problem?
   Congratulations. Now what do you think of the statement ``Detecting
   loops on a machine with n states requires O(2^n) states in the
   analyzer''?

Ok, so I need to buy myself a copy of Knuth...  (again)

There's still the minor problem that a machine with n bits has 2^n
states...

-- 
Raul <rockwell@socrates.umd.edu>

oz@ursa.ccs.yorku.ca (Ozan Yigit) (06/24/91)

Dan Bernstein writes:

   Anyone who reads Knuth (which, I gather, has fallen out of favor among
   computer scientists, as their students no longer know enough math to
   understand the first two pages) will find out ...

Those volumes are not out of favor, but they are out-of-date by a
decade and a half, not to mention the use of a prehistoric notation to
illustrate (and I use the term loosely) some of the algorithms. Actually,
exhaustive math and detail are the things that make the boring "mix"
scraps almost tolerable.

oz
---
Often it is means that justify ends: Goals    | email: oz@nexus.yorku.ca
advance technique and technique survives even | phone: 416-736-5257 x 33976
when goal structures crumble. -- A. J. Perlis | other: oz@ursa.ccs.yorku.ca

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/24/91)

In article <OZ.91Jun23145702@ursa.ccs.yorku.ca> oz@ursa.ccs.yorku.ca (Ozan Yigit) writes:
  [ on Knuth's Art of Computer Programming ]
> Those volumes are not out of favor, but they are out-of-date by a
> decade and a half,

Really? Within the subject matter covered by volumes 1 through 3, can
you name any important algorithm (i.e., algorithm used by a significant
percentage of today's programmers) not mentioned by Knuth? Can you name
an important field---newer than 1973, or 1981 for the second volume---
that Knuth doesn't cover but that would logically fit into his books?
(Keep in mind that compilers, for example, will be covered in a later
volume if Knuth ever stops working on TeX.) I'm listening.

---Dan

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/24/91)

In article <1991Jun22.213750.18954@watserv1.waterloo.edu> mhcoffin@tolstoy.waterloo.edu (Michael Coffin) writes:
> The technique you
> describe is well known by most computer scientists,

Some, at least. So how come there are so many published statements that
to detect loops in an n-bit computer requires 2^n bits? I first saw this
statement in a DDJ article---surely DDJ's editors don't all suffer the
same delusion!

> It doesn't make a "practical solution" to
> the halting problem because can take a long, long time to run.

But in *practice* most loops cycle through a very small set of states.
This is true of nearly all the infinite loops I've seen, both in my code
and in code I've had to maintain.

> Can a solution to the halting problem be said to be "practical" if it
> takes millenia to terminate when applied to a simple program that uses
> less than 12 bytes of state?

There will always exist programs of length n that run for 2^2^{kn}
steps before stopping. In *practice* very few programs exhibit such
behavior.

---Dan

plyon@ut-emx.uucp (Paul Lyon) (06/24/91)

In article <12358.Jun2400.03.5891@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <OZ.91Jun23145702@ursa.ccs.yorku.ca> oz@ursa.ccs.yorku.ca (Ozan Yigit) writes:
>  [ on Knuth's Art of Computer Programming ]
>> Those volumes are not out of favor, but they are out-of-date by a
>> decade and a half,
>
>Really? Within the subject matter covered by volumes 1 through 3, can
>you name any important algorithm (i.e., algorithm used by a significant
>percentage of today's programmers) not mentioned by Knuth? 

It strikes me that there are some obvious examples here, some of which
Knuth himself had a hand in discovering. For example: the Knuth-Morris-
Pratt string searching algorithm and the Knuth-Bendix completion 
algorithm for term-rewriting systems (the basis for a good deal of
the activity in "Logic Programming" relating to equations). Other
obvious examples are the Boyer-Moore string searching algorithm
and the Simplex algorithm. Then there is Robinson's Unification 
Algorithm, the alpha-beta minimax algorithm, and so it goes..
For, bear in mind that besides the projected Volume 7 on compilers,
there were (are?) meant to be Volume 4, on Combinatorial Algorithms
(which, judging by the handful of references to NP-Completeness in
Volumes 1 - 3, was where complexity theory would be discussed),
and Volume 5, on Lexical Scanning and Parsing (see the outline
on pages vii-viii of the Preface to Volume 1). Perhaps the claim
would be correct if the remaining volumes were published, but
since this hasn't happened yet ....

rockwell@socrates.umd.edu (Raul Rockwell) (06/24/91)

Dan Bernstein:
   There will always exist programs of length n that run for 2^2^{kn}
   steps before stopping. In *practice* very few programs exhibit such
   behavior.

So?  I can burn hours, possibly days of cpu time with a single command
line.  And there's not even an infinite loop to detect.

It's true that in practice I try to avoid such constructs, but I
don't see it as a meaningful, qualitative statement that such things
don't exist.

-- 
Raul <rockwell@socrates.umd.edu>

sw@smds.UUCP (Stephen E. Witham) (06/24/91)

In article <29254.Jun2219.22.4491@kramden.acf.nyu.edu>, brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> In article <ROCKWELL.91Jun22010647@socrates.umd.edu> rockwell@socrates.umd.edu (Raul Rockwell) writes:
> > Detecting loops on a machine with n states requires O(2^n) states in
> > the analyzer.
> 
> This is the equivalent in computer ``science'' of an old wives' tale.

There is a difference of assumptions here that has been going on through
this whole thread.

If you know all the inputs to your program, then it's equivalent to an FSM
starting in a certain state, and you can find out whether it halts with
two copies of the FSM--twice the memory, O(n^2) states.

If you don't know all the inputs in advance, then you're trying to
find out whether a whole class of FSMs halt.  For n bits of input, that's
2^n different problems.

--Steve

macrakis@osf.org (Stavros Macrakis) (06/25/91)

In article <12358.Jun2400.03.5891@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:

   Within the subject matter covered by [Knuth] volumes 1 through 3,
   can you name any important algorithm (i.e., algorithm used by a
   significant percentage of today's programmers) not mentioned by
   Knuth?

Since someone has already answered the question, let me question your
question.  Since when is an algorithm's ``importance'' measured by the
percentage of programmers using it?  By that metric, probably 80% of
volume 1 is ``unimportant'', about 99% of volume 2, and perhaps 90% of
volume 3.  Maybe a little more in some very high-tech environments.

Knuth is full of important algorithms, but many important algorithms
have been published since.  Alas, neither old nor new algorithms seem
to be widely understood and used.

Unfortunate, no doubt, but that's life.

	-s

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/25/91)

In article <MACRAKIS.91Jun24135614@next15.osf.org> macrakis@osf.org (Stavros Macrakis) writes:
> Since when is an algorithm's ``importance'' measured by the
> percentage of programmers using it?

You're right. A better definition would be an algorithm or set of
algorithms worthy of a section in volumes 1 through 3, or an algorithm
that renders some of the information in volumes 1 through 3 obsolete. If
Knuth's books are ``out of date'' then surely such algorithms exist, no?

---Dan

oz@ursa.ccs.yorku.ca (Ozan Yigit) (06/25/91)

Dan Bernstein writes:

   > Those volumes are not out of favor, but they are out-of-date by a
   > decade and a half,

   Really?

Yes, really.

	Within the subject matter covered by volumes 1 through 3, can
   you name any important algorithm (i.e., algorithm used by a significant
   percentage of today's programmers) not mentioned by Knuth?

If your silly response is any indication, you would still be staring
at my finger after I point to some of those algorithms, with no hope
of you ever discontinuing your semi-literate ramblings.

oz

smryan@clipper.ingr.com (Steven Ryan) (06/26/91)

>But in *practice* most loops cycle through a very small set of states.
>This is true of nearly all the infinite loops I've seen, both in my code
>and in code I've had to maintain.
>
>There will always exist programs of length n that run for 2^2^{kn}
>steps before stopping. In *practice* very few programs exhibit such
>behavior.

Oh, how faux profound. You can write an analyzer for any practical program.
And what is a practical program? One for which you can write an analyzer.
-- 
|-In compliance with DoC,---------------------Steven Ryan----------------------|
| this posting is devoid         |            2400 Geng Road, Palo Alto, CA    |
| of all information.            |            smryan@wyrmham.clipper.ingr.com  |
|-And Oliver North married William Secord/and gave birth to a little Teheran.--|

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/26/91)

In article <OZ.91Jun25114153@ursa.ccs.yorku.ca> oz@ursa.ccs.yorku.ca (Ozan Yigit) writes:
>    > Those volumes are not out of favor, but they are out-of-date by a
>    > decade and a half,
>    Really?
> Yes, really.

So why don't you back up your claim?

Here, let me give you a head start. Knuth does spend quite a bit of time
on asymptotically fast matrix multiplication. Today we not only know
methods that are asymptotically even faster, but we have a somewhat
better understanding of *why* the methods work. That portion of that
section is clearly out of date, albeit by only a few years.

I can name a few other examples, each deserving an exercise or two and
perhaps a mention in the text. None of them are important, the way that
each new sorting algorithm has found its niche, or that the spectral
test caught on, or that so many memory allocators depend on the buddy
system. None of them could be discussed at Knuth's level for more than
half a page.

C'mon, Oz. You said the books are out of date by a decade and a half.
That means that there have been discoveries throughout the last 15 years
which, had they been known at the time, would have been included in the
books. What are they? I'm perfectly willing to believe that they exist,
but you have to prove it by example rather than rhetoric.

---Dan

mhcoffin@tolstoy.waterloo.edu (Michael Coffin) (06/26/91)

The first thing that occurs to me that has appeared since Knuth wrote
the ACP, and ought to be in any programmer's toolbox, is skip lists.
Robin Hood hashing would probably be mentioned as well.  (That took me
2 minutes.)

-mike

schwartz@groucho.cs.psu.edu (Scott Schwartz) (06/26/91)

mhcoffin@tolstoy.waterloo.edu (Michael Coffin) writes:
   ...Skip lists.   Robin Hood hashing...

Better, splay trees and amortised analysis.

gudeman@cs.arizona.edu (David Gudeman) (06/26/91)

In article  <5979.Jun2519.38.4991@kramden.acf.nyu.edu> Dan Bernstein writes:
[about Knuth books being out of date]
]
]C'mon, Oz. You said the books are out of date by a decade and a half.
]That means that there have been discoveries throughout the last 15 years
]which, had they been known at the time, would have been included in the
]books.

No, Dan, that is not what "out of date" means.  "Out of date" in this
context means that (1) there are much better ways to express
algorithms than the opaque, archaic notations in ACP, and (2) there
are new ways to approach the teaching of algorithms which many people
feel are better than the approach in ACP (your opinion on this is
irrelevant).

Besides, you just want to play this game where people suggest
algorithms that should have been in the books and aren't, then you get
to chortle and say "that doesn't fit the subject matter of the books",
or "that was going to be in the fourth book", or "that isn't an
important algorithm".  (I've had this discussion with Dan before...)

Let me answer each objection once, to save net bandwidth:

Objection: "that doesn't fit the subject matter of the books"
Answer: then the subject matter of the books is outdated as material
for teaching algorithms.  There are different ways of teaching
algorithms today (as before, your opinion on whether these new methods
are better is without relevance).

Objection: "that was going to be in the fourth book"
Answer: if the fourth book ever gets written, then it presumably won't
be out of date when it is first published.

Objection: "that isn't an important algorithm"
Answer: Is so.

Basically, Dan, this is a matter of opinion.  Most CS people think
that the books are out of date -- for teaching at least.  The fact
that you have a different opinion is neither interesting nor
enlightening.  So why do you insist on afflicting us with your
argumentative opinions?  Surely you don't think you are going to
change anyone's mind.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman