mathew@mantis.co.uk (CNEWS MUST DIE!) (05/20/91)
yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:
>In article <TcT422w164w@mantis.co.uk> mathew@mantis.co.uk (CNEWS MUST DIE!) writes:
>>Yes, but do you want to have to be aware of exactly how much memory is
>>available to your programs?
>
>Not always. But, for the compiler to warn me that "while(1)i=i+1" will
>either crash or cycle state forever does not require me to know this
>information.

Indeed. There are always simple cases which can be analysed easily. The
compiler I use gives warnings about while loops with constant boolean
conditions. In general, though, determining whether a program will halt on a
real computer by analysing it as an FSM requires knowledge of exactly how
much memory is available.

>>Consider the following pseudo-code:
>>
>>i = 0
>>repeat
>>  allocate 1 byte of memory
>>  i++
>>until allocation fails
>>if i is prime, while(1);
>>end
>>
>>You can write such a program in ANSI C. In order to predict whether it
>>will enter a non-terminating loop you have to know exactly how much
>>memory is available to the program when it runs, to the byte.
>
>If your point is that one can write ANSI "C" programs that are machine
>specific, sure.

The above program is not machine specific. You can write it so that it will
compile and run on any machine which supports ANSI C. You just can't tell
whether it will terminate or not unless you know exactly how much memory
that machine has.

>And if your point is that one can write ANSI "C" programs
>that will depend on I/O or operating system behavior, I also agree. [...]
>The compiler cannot predict such paths, but it can warn us when our
>programs contain i/o dependent paths that can lead to indefinite cycles.

How many non-trivial C programs don't use dynamic memory allocation? Sure,
you can restrict yourself to fixed-size data structures, but that leads to
even WORSE problems when you try to handle I/O.

>Again, you raise a valid issue, but I'm not clear how it applies to the
>point at hand. If I write a program that can run correctly given a "C"
>compiler that uses 32 bits to represent integers, but which will loop
>given an 8 bit "C" compiler, the compilers can flag the error, without
>requiring the programmer to worry about exact representation.

Eh? The programmer wants his program to work. If it works for 32 bit ints
but not for 16 bit ints, then he has to worry about exact representation.
Having the compiler say "your program doesn't work because this machine has
16 bit ints" does not mean that the programmer doesn't have to worry about
it.

>We provide abstractions in our programming languages, but these are
>conveniences which do not negate the facts that real machines do not have
>"integers" and "real numbers", but only have limited representations of
>these mathematical abstractions.

"Real numbers" are a complete con. I'd rather leave them out of it.
"Integers" _do_ exist in some programming languages; the limitation is the
amount of hardware you give the language to work in. I have commented on
this in another posting.

>Even if the compiler implements "i" with a variable format (e.g. bignums),
>the programmer will want to know if the program may exceed the limits of
>machine representation.

When working with integers mathematically you have to know whether your
result will become too big to work with on paper.

>A correctly written program will not depend on exactly how much memory is
>available, and if the compiler flags such a dependency, then it has found
>what is most likely an error.

In that case, there are no correctly-written programs which perform
non-trivial tasks.

>>The restrictions you have to place on a language in order to enable
>>guaranteed termination are such that the language is not useful for
>>general programming.
>
>Demonstrate this? Any demonstration would seem to require unusually
>precise information about the limits of ingenuity.

I was talking in the present tense. If you want to start speculating wildly
about future developments in program proving, then go ahead. I was also
taking as given the assumption that most people require an imperative
language for general programming. Given the relative popularity of pure
functional languages vs C, BASIC, FORTRAN and so on, I think that that is
reasonable.

>>Secondly, I want to know whether my programs will work on my friends'
>>machines.
>
>Try compiling on those machines, or compile with a "portability" flag.

What is this "portability" flag supposed to do? Not bother reporting about
termination?

>If I had a "C" compiler that would run on, say, an IBM PC with at least
>6 meg of memory and a large disk, which would tell me within a few days
>whether or not an arbitrary "C" program would fail given at most
>X bytes of memory (X < 3 meg), then I believe that at least a few people
>would be interested in using it.

Indeed. But you can't do it. Even knowing the exact value of X doesn't
help; you have to know other things, such as the value of the system clock.
(Consider "if seconds = 32 then while(1)".) You might be able to write a
compiler which could predict whether or not a C program would run on a
machine with 3104272 bytes of memory, if started at precisely 3:15:22 PM on
Tuesday 21st May with no other programs resident. Whether it would be worth
your while to do so is another matter.


mathew
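[Editorial sketch: mathew's pseudo-code written out as ANSI C. The helper
is_prime() and the exact shape of the loop are illustrative additions, not
from the posting. The point stands in code form: this compiles on any ANSI
C system, yet predicting whether it terminates requires knowing, to the
byte, how much memory malloc() will hand out on the machine that runs it.]

```c
#include <stdlib.h>

/* trial-division primality test; any correct test would do here */
static int is_prime(unsigned long n)
{
    unsigned long d;
    if (n < 2) return 0;
    for (d = 2; d * d <= n; d++)
        if (n % d == 0) return 0;
    return 1;
}

/* the body of mathew's program, as a function (name is illustrative) */
void mathews_program(void)
{
    unsigned long i = 0;
    while (malloc(1) != NULL)   /* allocate 1 byte until allocation fails */
        i++;
    if (is_prime(i))            /* non-terminating iff the byte count is prime */
        for (;;)
            ;
}
```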
yodaiken@chelm.cs.umass.edu (victor yodaiken) (05/21/91)
In article <gZy9221w164w@mantis.co.uk> mathew@mantis.co.uk (CNEWS MUST DIE!) writes:
>In general, though, determining whether a program will halt on a real
>computer by analysing it as a FSM requires knowledge about exactly how
>much memory is available.

The way you cast the problem evades the TM/FSM question. Yes. Programs that
depend on particularities of machine configuration do depend on
particularities of machine configuration. If we write a program P that
requests specific machine configuration information, and then chooses a
path based on that data, then the operation of the program will not be the
same on all machines. But this is essentially the same situation as i/o
dependencies. I argue that it is possible, in theory, for a C compiler to
detect all paths that lead to non-terminating loops and to flag these
paths. It is not possible for a "C" compiler to determine, in advance, the
result of checking the current time, asking how much memory is available,
or reading the next character typed on a terminal. These questions are not
undecidable in the technical sense; they are simply undetermined.

>How many non-trivial C programs don't use dynamic memory allocation? Sure,
>you can restrict yourself to fixed-size data structures, but that leads to
>even WORSE problems when you try to handle I/O.

If we give X.c to the hypothetical smart compiler and it tells us that X.c
contains a path that exceeds the process memory limits on this machine, or
that X.c contains a path that will cause it to enter a non-terminating loop
if "alloc" fails, then the compiler has solved the halting problem for this
code.

>>Again, you raise a valid issue, but I'm not clear how it applies to the
>>point at hand. If I write a program that can run correctly given a "C"
>>compiler that uses 32 bits to represent integers, but which will loop
>>given an 8 bit "C" compiler, the compilers can flag the error, without
>>requiring the programmer to worry about exact representation.
>
>Eh? The programmer wants his program to work. If it works for 32 bit ints
>but not for 16 bit ints, then he has to worry about exact representation.
>Having the compiler say "your program doesn't work because this machine
>has 16 bit ints" does not mean that the programmer doesn't have to worry
>about it.

You've lost me here. The correctness of "C" programs can depend on how the
compiler implements "ints". This is a feature of the "C" language, not an
artificial constraint that I'm inventing. Programmers cannot treat "C" as
if it were more high level than it is, and expect to get away with it.

>>A correctly written program will not depend on exactly how much memory is
>>available, and if the compiler flags such a dependency, then it has found
>>what is most likely an error.
>
>In that case, there are no correctly-written programs which perform
>non-trivial tasks.

Can you give me an example of such a program? The "C" compiler, lisp
interpreter, emacs editor, news-reader, operating system, computer algebra
programs and theorem provers we have around here seem to be able to
tolerate wide variations in available memory.

>>If I had a "C" compiler that would run on, say, an IBM PC with at least
>>6 meg of memory and a large disk, which would tell me within a few days
>>whether or not an arbitrary "C" program would fail given at most
>>X bytes of memory (X < 3 meg), then I believe that at least a few people
>>would be interested in using it.
>
>Indeed. But you can't do it. Even knowing the exact value of X doesn't
>help; you have to know other things such as the value of the system clock.
>(Consider "if seconds = 32 then while (1)").

Again, you mistake the nature of the problem. If I claim that P implements
the function f: D -> D', then I am claiming that for any x in the
appropriate domain, running P on input x will result in the value f(x). If
the compiler can tell me that there is an x in the domain that causes P to
enter a non-terminating loop, then the compiler has detected a bug in my
program. You are arguing that the compiler will not be able to predict x.
So what? Generally, we want to know whether the program will compute the
correct value given any input, and we include such things as system time
among the inputs.
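[Editorial sketch of the position above: treat the system clock as just
another input and check the whole domain. hits_infinite_loop() models
mathew's "if seconds = 32 then while(1)" program, and has_bad_path() plays
the compiler, enumerating all 60 possible "seconds" values to find the x
that triggers the bug. Both names are illustrative, not from the thread.]

```c
/* models "if (seconds == 32) for(;;);" -- returns 1 on the bad path */
static int hits_infinite_loop(int seconds)
{
    return seconds == 32;
}

/* the "compiler": does any value of the clock input reach the loop? */
static int has_bad_path(void)
{
    int s;
    for (s = 0; s < 60; s++)    /* the entire input domain */
        if (hits_infinite_loop(s))
            return 1;           /* flag the program */
    return 0;
}
```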
debray@cs.arizona.edu (Saumya K. Debray) (05/22/91)
yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:
> I argue that it is possible, in theory, for a C compiler to detect all
> paths that lead to non-terminating loops and to flag these paths.
[ ... ]
> If we give X.c to the hypothetical smart compiler and it tells us that
> X.c contains a path that exceeds the process memory limits on this
> machine, or that X.c contains a path that will cause it to enter a
> non-terminating loop if "alloc" fails, then the compiler has solved
> the halting problem for this code.

Good. Perhaps Mr. Yodaiken would care to convince us infidels once and for
all by posting some source code to do what he has so eloquently argued can
be done? And if Mr. Yodaiken is unable or unwilling to do so, then is this
"discussion" accomplishing anything more than blowing warm air back and
forth across the networks?

The proof of the pudding, Mr. Yodaiken, is in the eating. You've argued
that it's possible to decide the halting problem for "real computers". So
show us.
-- 
Saumya Debray	 CS Department, University of Arizona, Tucson
     internet:   debray@cs.arizona.edu
     uucp:       uunet!arizona!debray
gudeman@cs.arizona.edu (David Gudeman) (05/22/91)
In article <30814@dime.cs.umass.edu> victor yodaiken writes:
]... I argue that it is possible, in theory, for a C
]compiler to detect all paths that lead to non-terminating loops and to
]flag these paths.
I argue that this is impossible in theory. Infinite-loop detection
algorithms must almost certainly store some information about the
state space of the algorithm in question. Furthermore, in the general
case this information must almost certainly require space on the order
of the size of the set of all possible states at a given point in the
program (see H1 below).
If so, the amount of information needed for at least some classes of
programs is huge. So there are programs that would compile without
this analysis that cannot be compiled with this analysis.
Furthermore, due to the magnitude of the information that the
algorithm has to keep around, I'll bet that _most_ nontrivial programs
could not be analysed in this way.
Of course you can always reduce the amount of information you keep
around by compacting many states into a single approximating state.
If you do this you can greatly increase the set of programs you can
analyse, but your analysis is only approximate. You can make the
approximation a safe one or a complete one: A safe approximation will
recognize all programs that definitely will _not_ halt. A complete
approximation will recognize all programs that definitely _will_ halt.
I expect that these approximations can be made good enough to be very
useful.
Hypothesis H1:
any algorithm, HALT(p) that solves the halting problem for p must in
general, for each point in p, encode at one time all the possible
states that might occur at that point in p.
Argument:
suppose HALT(p1) returns TRUE. Create program p2 by adding
somewhere the statement
if (F()) exit(); else for(;;);
where F is a function whose output depends on most of the current
state. Then HALT must be able to determine whether F can ever
possibly answer 0 in order to tell if p2 halts. To do this, in
general HALT must be able to encode all possible states that might
occur at this point. It is possible to construct sets of states that
cannot be encoded by anything much smaller than the entire set of
states.
Any HALT algorithm that attempts to avoid this by only doing one state
at a time must in general deal with the possibility that HALT might
not terminate, since it is simulating the program being analysed. The
same considerations apply to a program that works with subsets of the
total set of possible states.
You can use the trick of running p in two different virtual machines,
one going twice as fast as the other, and comparing states. The
storage overhead of this is equal to twice the size of the states of
p. In this case, there are still programs that will compile without
HALT that will not compile with HALT -- so _theoretically_ this
doesn't solve the problem. And it isn't practical either, since if
there is an infinite loop, it may take gigayears to find it.
--
David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman
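[Editorial sketch: the "two virtual machines, one going twice as fast"
trick above is Floyd's cycle-finding algorithm. Here it runs on a toy
machine whose state is one unsigned value, where step() is one instruction
and state 1 is the absorbing halt state; step() and the machine are
illustrative, not from the post. Note the storage overhead is two copies
of the state, exactly as described.]

```c
typedef unsigned long state_t;

/* one instruction of the toy program under analysis */
static state_t step(state_t s)
{
    if (s <= 1) return 1;       /* halted: state 1 maps to itself */
    if (s % 3 == 0) return s;   /* multiples of 3 spin in place   */
    return s - 1;               /* everything else counts down    */
}

/* 1 = the run from s0 halts, 0 = it is in an infinite loop */
static int halts(state_t s0)
{
    state_t slow = s0, fast = s0, p;

    /* advance until the double-speed copy meets the single-speed one;
     * on a finite state space this must eventually happen */
    do {
        slow = step(slow);
        fast = step(step(fast));
    } while (slow != fast);

    /* the meeting point lies on the eventual cycle; the run halts
     * iff that cycle is the halt state's self-loop */
    p = slow;
    do {
        if (p == 1) return 1;
        p = step(p);
    } while (p != slow);
    return 0;
}
```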
yodaiken@chelm.cs.umass.edu (victor yodaiken) (05/22/91)
In article <3409@optima.cs.arizona.edu> debray@cs.arizona.edu (Saumya K. Debray) writes:
>yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:
>
>> I argue that it is possible, in theory, for a C compiler to detect all
>> paths that lead to non-terminating loops and to flag these paths.
>[ ... ]
>> If we give X.c to the hypothetical smart compiler and it tells us that
>> X.c contains a path that exceeds the process memory limits on this
>> machine, or that X.c contains a path that will cause it to enter a
>> non-terminating loop if "alloc" fails, then the compiler has solved
>> the halting problem for this code.
>
>Good. Perhaps Mr. Yodaiken would care to convince us infidels once and
>for all by posting some source code to do what he has so eloquently
>argued can be done?

I'm flattered that you have so much confidence in my abilities: if I can't
come up with an algorithm, then clearly, such an algorithm cannot exist.
There is an obvious, and obviously impractical, algorithm, however.

Let's first consider the case of programs that just compute a function of
their arguments: i.e. programs that perform no i/o but can be written as
subroutines of the form prog( type: x, type: y .... ). For example, suppose
we have a C function "int f(x,y,z) int x,y,z ...". Now we look at how
"ints" are implemented on the target machine; suppose that each int is a
32 bit value. We then emulate f(x,y,z) for each of the mere 2^96 possible
assignments. The emulation consists of single stepping the computation of
"f" either on the target computer or within a program emulating this
computer. At each step, we make a copy of the entire memory space of the
emulation. The program is sure to either repeat a state (a non-terminating
loop) or exit within a bounded amount of time, because it can only
reference a bounded amount of memory.

[BTW, this is not as impractical as it seems. H. Massalin has written an
ingenious assembly language optimizer that, given a short assembly language
routine that computes a simple result, will find the shortest possible
sequence of assembly statements that computes the same function.
Massalin's "Superoptimizer" essentially makes an exhaustive search of the
set of all 1 line assembly programs, then all 2 line programs, ..... With
some smart heuristics the optimizer comes up with some incredible
sequences of code (e.g., a min function that has no branch). There is a
paper on the optimizer in a recent ASPLOS proceedings. So, enough of this
whining about what we can't do.]

Now we return to the impractical algorithm and extend it to programs that
make system calls. There are a couple of strategies here, but we'll take
the dumbest, since it requires the least effort from me. Each system call,
i/o driven or not, changes program state in only a bounded number of ways.
For example, a call "read(filedescriptor,bufferaddr,n)" will change the
values of the "n" bytes of memory beginning at "bufferaddr" in some way.
Whenever we encounter such a call in the "C" program, we simply try all
2^(n*8) possibilities. Similarly with calls to "alloc", which either add
some non-determined data to the program data space (up to a known limit)
or return a fail code. We just test all branches. Pretty simple, hey?

This algorithm will work. And in some cases it might even be practical. In
hardware design it is already common to build special purpose machines
that do brute force emulations of chip behavior. Is there a smarter
algorithm, a better programming language, a well defined o.s. interface
that will make this type of analysis practical without special purpose
hardware? No idea. But there is emphatically no mathematical theorem that
eliminates the possibility of such an algorithm.

>The proof of the pudding, Mr. Yodaiken, is in the eating. You've argued
>that it's possible to decide the halting problem for "real computers".
>So show us.
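[Editorial sketch: a scaled-down version of the brute-force procedure
above. The toy "machine" has 256 possible states, so the single-stepping
emulation must either exit or repeat a state within 256 steps. The decider
then plays the compiler's role by trying every possible initial state
(for a real 32-bit f(x,y,z) this outer loop would run 2^96 times, as the
posting says). step(), HALTED and the state encoding are all illustrative.]

```c
#include <string.h>

enum { NSTATES = 256, HALTED = -1 };

/* one instruction of the toy program */
static int step(int s)
{
    if (s == 0) return HALTED;  /* state 0 exits                 */
    if (s == 5) return 5;       /* a deliberate tight loop       */
    return s - 1;               /* otherwise count down toward 0 */
}

/* single-step emulation with state snapshots: 1 = exits, 0 = loops */
static int halts_from(int s0)
{
    unsigned char seen[NSTATES];
    int s = s0;
    memset(seen, 0, sizeof seen);
    for (;;) {
        if (seen[s]) return 0;  /* state repeated: non-terminating */
        seen[s] = 1;
        s = step(s);
        if (s == HALTED) return 1;
    }
}

/* the "compiler": flag the program if any input reaches a loop */
static int program_flagged(void)
{
    int s;
    for (s = 0; s < NSTATES; s++)
        if (!halts_from(s)) return 1;
    return 0;
}
```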
yodaiken@chelm.cs.umass.edu (victor yodaiken) (05/22/91)
In article <3410@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>In article <30814@dime.cs.umass.edu> victor yodaiken writes:
>]... I argue that it is possible, in theory, for a C
>]compiler to detect all paths that lead to non-terminating loops and to
>]flag these paths.
>
>I argue that this is impossible in theory. Infinite-loop detection
>algorithms must almost certainly store some information about the
>state space of the algorithm in question. Furthermore, in the general
>case this information must almost certainly require space on the order
>of the size of the set of all possible states at a given point in the
>program (see H1 below).
>
>If so, the amount of information needed for at least some classes of
>programs is huge. So there are programs that would compile without
>this analysis that cannot be compiled with this analysis.
>Furthermore, due to the magnitude of the information that the
>algorithm has to keep around, I'll bet that _most_ nontrivial programs
>could not be analysed in this way.

Your argument is that the task is difficult, and would require huge
resources, possibly impractical resources, for most non-trivial programs.
No disagreement here. Impractical != impossible. You must understand "in
theory" in a nonstandard way.

>Of course you can always reduce the amount of information you keep
>around by compacting many states into a single approximating state.
>If you do this you can greatly increase the set of programs you can
>analyse, but your analysis is only approximate. You can make the
>approximation a safe one or a complete one: A safe approximation will
>recognize all programs that definitely will _not_ halt. A complete
>approximation will recognize all programs that definitely _will_ halt.
>I expect that these approximations can be made good enough to be very
>useful.

I suspect that you are right.

>Hypothesis H1:
>  any algorithm HALT(p) that solves the halting problem for p must in
>general, for each point in p, encode at one time all the possible
>states that might occur at that point in p.
>
>Argument:
>  suppose HALT(p1) returns TRUE. Create program p2 by adding
>somewhere the statement
>
>    if (F()) exit(); else for(;;);
>
>where F is a function whose output depends on most of the current
>state. Then HALT must be able to determine whether F can ever
>possibly answer 0 in order to tell if p2 halts. To do this, in
>general HALT must be able to encode all possible states that might
>occur at this point. It is possible to construct sets of states that
>cannot be encoded by anything much smaller than the entire set of
>states.

That's an interesting conjecture. For many cases it would be easy, even if
"F" looks at all of memory. For example, if F computes a checksum on all
of memory, then the possible state set of F would be huge, but it would be
easy to see that F does return 0 sometimes. Essentially, we want to know
whether there is a path through F that terminates at some "return 0"
state, but we don't want to have to build the whole damn state set of F.
People in hardware boolean circuit optimization and in formal verification
have been looking at this problem. I'll dig up some references if anyone
wants. Clarke and Burch recently reported that in verification of
bit-slice controllers, a state set of 10^20 elements did not have to be
completely unrolled, and that only 60,000 (?) or so states had to be
visited. But your mileage may vary.

>Any HALT algorithm that attempts to avoid this by only doing one state
>at a time must in general deal with the possibility that HALT might
>not terminate, since it is simulating the program being analysed. The
>same considerations apply to a program that works with subsets of the
>total set of possible states.

Yes. The HALT might not detect a loop, because it has thrown away the loop
start state. Is that your point?
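[Editorial sketch: the checksum example above, spelled out (the name and
signature are illustrative). F's value depends on every byte of memory, so
its state set is astronomical, yet no state enumeration is needed to see
that it can return 0: the sum wraps mod 256, so all-zero memory, among
many other configurations, gives checksum 0.]

```c
/* byte-wise checksum over a memory image; unsigned char arithmetic
 * wraps mod 256 */
static unsigned char checksum(const unsigned char *mem, unsigned long n)
{
    unsigned char sum = 0;
    unsigned long i;
    for (i = 0; i < n; i++)
        sum += mem[i];
    return sum;
}
```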
gudeman@cs.arizona.edu (David Gudeman) (05/22/91)
In article <30835@dime.cs.umass.edu> victor yodaiken writes:
]In article <3410@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
]>In article <30814@dime.cs.umass.edu> victor yodaiken writes:
]>]... I argue that it is possible, in theory, for a C
]>]compiler to detect all paths that lead to non-terminating loops and to
]>]flag these paths.
]>I argue that this is impossible in theory.
]Your argument is that the task is difficult...
]... Impractical != impossible. You must understand "in theory" in a
]nonstandard way.

No, it's just that you seem to be trying to have it both ways: "Since a
computer has a finite number of states, it is possible in principle to
solve the halting problem for it, since you can always just traverse all
possible states." The first part of the sentence implies that you are
modeling a computer as an FSM. The second part implies that you are
modeling a computer as a non-finite state machine, since you are assuming
that you can traverse all possible states.

Pick your model. No matter which one you pick, you cannot solve the
halting problem for all programs on a given machine with a program on that
machine.
-- 
David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman
mathew@mantis.co.uk (CNEWS MUST DIE!) (05/22/91)
yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:
> In article <gZy9221w164w@mantis.co.uk> mathew@mantis.co.uk (CNEWS MUST DIE!) writes:
> >In general, though, determining whether a program will halt on a real
> >computer by analysing it as a FSM requires knowledge about exactly how
> >much memory is available.
>
> If we write a program P that requests specific machine configuration
> information, and then chooses a path based on that data, then the
> operation of the program will not be the same on all machines.

OK. Give me an example of a C program which performs a major useful
function, and does not use any features which would make halting analysis
of that program dependent upon knowing exact details of the machine
hardware. Just one.

> I argue that it is possible, in theory, for a C compiler to detect all
> paths that lead to non-terminating loops and to flag these paths.

No. It is possible in theory for a C compiler to detect all paths that
MIGHT lead to non-terminating loops. Anything more requires exact detailed
knowledge about the hardware. And if you opt to flag all paths which MIGHT
lead to non-terminating loops or system crashes, you'll end up with most
of the program being flagged.

> It is not possible for a "C" compiler to determine, in advance, the
> result of checking the current time, asking how much memory is
> available, or reading the next character typed on a terminal. These
> questions are not undecidable in the technical sense, they are simply
> undetermined.

Right. But the former two are properties of the machine environment the
program will be running in, and they can seriously affect whether the
program halts. Look at all the programs which crashed when the system date
on UNIX went past the size of an int.

You can, in certain languages, make halting predictions; you can say "If
this program is fed a file of finite size, it will terminate". C is not,
in general, one of those languages.

> If we give X.c to the hypothetical smart compiler and it tells us that
> X.c contains a path that exceeds the process memory limits on this
> machine, or that X.c contains a path that will cause it to enter a
> non-terminating loop if "alloc" fails, then the compiler has solved
> the halting problem for this code.

...on this particular machine, running at this particular time, with this
particular amount of processor stack, and so on. To get any useful GENERAL
halting results would require massive restrictions on what you allowed in
your "Halting ANSI C". Such severe restrictions that I doubt whether the
language would be useful.

> >Eh? The programmer wants his program to work. If it works for 32 bit
> >ints but not for 16 bit ints, then he has to worry about exact
> >representation. Having the compiler say "your program doesn't work
> >because this machine has 16 bit ints" does not mean that the programmer
> >doesn't have to worry about it.
>
> You've lost me here. The correctness of "C" programs can depend on how
> the compiler implements "ints". This is a feature of the "C" language,
> not an artificial constraint that I'm inventing. Programmers cannot
> treat "C" as if it were more high level than it is, and expect to get
> away with it.

Right. And whether a C program halts or not can be critically dependent
upon the size of integers used in that particular implementation of C. So
to perform your magic halting analysis, you need to know exactly how big
integers are. And if your analysis is to be useful, this information needs
to be directly or indirectly communicated to the programmer. Therefore the
programmer needs to worry about exact implementation specifics.

> >>A correctly written program will not depend on exactly how much memory
> >>is available,
> >
> >In that case, there are no correctly-written programs which perform
> >non-trivial tasks.
>
> Can you give me an example of such a program? The "C" compiler, lisp
> interpreter, emacs editor, news-reader, operating system, computer
> algebra programs and theorem provers we have around here seem to be
> able to tolerate wide variations in available memory.

Really? Try running them in 1KB. There will be some lower memory threshold
x beneath which the programs will not run. If you have x-1 bytes of
memory, the programs will not run. It is therefore important that the
exact amount of memory you have is greater than x. This is a condition on
exactly how much memory must be available. Equality isn't the only
condition which requires you to know exact amounts.

> >>If I had a "C" compiler that would run on, say, an IBM PC with at
> >>least 6 meg of memory and a large disk, which would tell me within a
> >>few days whether or not an arbitrary "C" program would fail given at
> >>most X bytes of memory [...]
> >
> >Indeed. But you can't do it. Even knowing the exact value of X doesn't
> >help; you have to know other things such as the value of the system
> >clock. [...]
>
> Again, you mistake the nature of the problem. If I claim that P
> implements the function f: D -> D', then I am claiming that for any
> x in the appropriate domain running P on input x will result in
> the value f(x). If the compiler can tell me that there is an x in the
> domain that causes P to enter a non-terminating loop, then the compiler
> has detected a bug in my program. You are arguing that the compiler will
> not be able to predict x. So what? Generally, we want to know whether
> the program will compute the correct value given any input, and we
> include such things as system time among the inputs.

You said "an arbitrary C program". Not "an arbitrary C program which is
not allowed to look at the system time".

If your C code is not allowed to look at the system time, not allowed to
use malloc (which effectively involves looking at available memory), not
allowed to look at the file system (for temporary files) and so on, then
yes, you probably can find out whether it will compute the correct value
given any input. Now, how many useful programs do you know which satisfy
those criteria?

To look at it another way: most C programs will end up with a list of
inputs which includes system memory, clock, stack size, file system size,
and so on. You will need to know the values of these inputs before you can
determine whether the program will halt or not. This is not very useful.

I'm about ready to give up arguing with you. You seem to be just as
clueless as when you started, and I really have better things to do than
argue about whether you can solve the halting problem for general ANSI C
programs. If you still believe you can determine whether an arbitrary
non-trivial ANSI C program will halt or not, and have that information
generally applicable enough to be useful to the programmer, then go out
there and do it and earn yourself millions. While you're at it, there are
some guys in sci.skeptic who have a design for a perpetual motion machine
which needs development.


mathew
yodaiken@chelm.cs.umass.edu (victor yodaiken) (05/22/91)
In article <3430@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>No, it's just that you seem to be trying to have it both ways: "Since a
>computer has a finite number of states, it is possible in principle to
>solve the halting problem for it, since you can always just traverse all
>possible states." The first part of the sentence implies that you are
>modeling a computer as an FSM. The second part implies that you are
>modeling a computer as a non-finite state machine, since you are assuming
>that you can traverse all possible states.

You are defining the problem incorrectly. I do not argue that for any FSM
M, we can use M to decide its own halting problem. Instead, I argue that
there is a deterministic *algorithm* with time/space bounded by the number
of states of M, and that this algorithm can decide the halting problem for
M. If we want, we can compute this algorithm on an FSM with sufficient
state space (probably quite a bit larger than M).

You are arguing that there are programs which can be executed on any
computer which use enough of the state space of the computer as to leave
no room for a program that would emulate them. But, of course. However,
these programs *can* be emulated on a bigger, but still finite, physical
or virtual machine. Note that there is no analog of this procedure for
TMs. A TM does not get any more powerful when we add additional tapes or
heads, but FSMs limited to n states are strictly less powerful than FSMs
limited to n+1 states.
mlc%gva.decnet@CONSRT.ROCKWELL.COM ("GVA::MLC") (05/23/91)
Newsgroups: comp.lang.misc
From: mlc@aten.cca.rok.com (Michael L. Cook)
Subject: Re: Will this *thread* ever halt?
Sender: news@zoot.avgrp.cr.rok.com
Reply-To: MLC%GVA.DECNET@CONSRT.ROK.COM
Organization: Rockwell Collins, Cedar Rapids, IA

A general solution to the halting problem does not exist, just as it is
impossible to predict when this thread will halt!!
-- 
Michael Cook      Internet: mlc%gva.decnet@consrt.rok.com
"A little infrastructure goes a long way." -- Stuart Shipman
rockwell@socrates.umd.edu (Raul Rockwell) (05/23/91)
victor yodaiken: I argue that there is a deterministic *algorithm* with time/space bounded by the number of states of M, and that this algorithm can decide the halting problem for M. If we want, we can compute this algorithm on an FSM with sufficient state space (probably quite a bit larger than M). Well, yes. If you're dealing with an arbitrary machine, with 16k ram (8 bit bytes) you have to have a state space of over 10^(10^6) bits to store halting information (halts/doesn't) for every possible state. And this doubles every time you add a bit to the machine -- for a unix system with a gigabyte of disk space, your state space is over 10^(8.26x10^10) bits (a number greater than 1 followed by over 82 billion zeros (decimal)). On the other hand, many algorithms can be easily shown to halt. (Any algorithm which is "primitive recursive" in the number-theoretic sense, which includes just about every computer programming technique except arbitrary loops and arbitrary recursion). So I can see why a person might want to say that the halting problem is solvable for a practical machine, but I wouldn't say that using state space is a useful solution. The solution I see is to limit yourself to a class of problems which are primitive recursive. From personal experience, I can say that you're still going to have to resort to state machine style techniques when you deal with i/o or secondary storage (secondary storage being defined as any storage not managed transparently by your target language). And I'm sure that any systems implementor can tell you horror stories about lurking race conditions and other potential bogosities in the os... Raul Rockwell
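The "easily shown to halt" class can be illustrated with a loop in the primitive-recursive style (a hypothetical example, not from the post): the iteration bound is fixed before the loop starts, so termination is obvious by inspection, with no state-space search at all.

```c
/* A primitive-recursive-style routine: the loop bound n is fixed on
 * entry and cannot change inside the loop, so the loop runs at most
 * n-1 times and the function always terminates. */
unsigned long pr_factorial(unsigned n)
{
    unsigned long acc = 1;
    unsigned i;
    for (i = 2; i <= n; i++)    /* bound fixed before the loop starts */
        acc *= i;
    return acc;
}
```

Contrast this with a `while` loop whose condition depends on values computed inside the loop, where no such syntactic argument is available.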
kwalker@cs.arizona.edu (Kenneth Walker) (05/23/91)
In article <30863@dime.cs.umass.edu>, yodaiken@chelm.cs.umass.edu (victor yodaiken) writes: > > You are defining the problem incorrectly. I do not argue that for any > FSM M, we can use M to decide its own halting problem. Instead, I argue > that there is a deterministic *algorithm* with time/space bounded by the > number of states of M, and that this algorithm can decide the halting > problem for M. If we want, we can compute this algorithm on a FSM with > sufficient state space (probably quite a bit larger than M). The original problem, as I recall, was whether a compiler can solve the halting problem for real programs. I insist on being able to run your compiler on the same machine I run my program on. (After all, this is a real-world question and not some "silly" theoretical exercise, right? :-) Ken Walker / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721 +1 602 621-4252 kwalker@cs.arizona.edu {uunet|allegra|noao}!arizona!kwalker
yodaiken@chelm.cs.umass.edu (victor yodaiken) (05/24/91)
In article <NqHc34w164w@mantis.co.uk> mathew@mantis.co.uk (CNEWS MUST DIE!) writes: >yodaiken@chelm.cs.umass.edu (victor yodaiken) writes: >> In article <gZy9221w164w@mantis.co.uk> mathew@mantis.co.uk (CNEWS MUST DIE!) >> >In general, though, determining whether a program will halt on a real >> >computer by analysing it as an FSM requires knowledge about exactly how much >> >memory is available. I'm really mystified by this discussion. Part of the problem may arise from the odd idea of "exact" that you seem to hold. You write: >There will be some lower memory threshold x beneath which the programs will >not run. If you have x-1 bytes of memory, the programs will not run. It is >therefore important that the exact amount of memory you have is greater than >x. This is a condition on exactly how much memory must be available. Equality >isn't the only condition which requires you to know exact amounts. > Forgive me if I mistake your meaning, but this seems like complete nonsense. The expression x > 10 does not constrain x to an "exact amount". If emacs requires at least 4 meg of memory in order to run correctly, then knowing that x>4meg is certainly good enough. A second problem arises from your misuse of the term "halting problem". The halting problem involves deterministic machines. It does not involve predicting the time of day at which you will choose to run a program or the next word which I will type into a keyboard. A computer program can be considered to be a deterministic machine computing a function on its input. You want to say that program behavior is not deterministic, because the input is not deterministic. Generally, however, we are less interested in predicting system input than in making sure that the program can cope correctly with *every* possible input. When I say that we can, in theory, decide whether or not a "C" program contains a non-terminating loop, I do not mean to suggest that we can decide in advance whether or not it will ever enter that loop. 
The program "read c; if c='\n' while(1); exit" contains a reachable non-terminating loop. You may be content to run this program without ever hitting a carriage return. I'm not, however, interested in predicting your behavior. I wrote: >> Again, you mistake the nature of the problem. If I claim that P >> implements the function f: D -> D', then I am claiming that for any >> x in the appropriate domain running P on input x will result in >> the value f(x). If the compiler can tell me that there is an x in the >> domain that causes P to enter a non-terminating loop, then the compiler >> has detected a bug in my program. You are arguing that the compiler will >> not be able to predict x. So what? Generally, we want to know whether the >> program will compute the correct value given any input, and we include >> such things as system time among the inputs. > You reply: >You said "an arbitrary C program". Not "an arbitrary C program which is not >allowed to look at the system time". > Here, I'm considering the program as a function with input varying over its domain. In this case, consider the domain to be the set of possible time values. We don't want to attempt to predict which elements of this domain will be encountered: i.e., we don't want to try to predict the time of day when the program will be run. We just want to make sure that the program behaves "correctly" for any time value. >I'm about ready to give up arguing with you. You seem to be just as clueless >as when you started, and I really have better things to do than argue about >whether you can solve the halting problem for general ANSI C programs. Go to 'em.
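The quoted fragment, written out as a C function of its single input character (a hypothetical rendering, not from the post), together with the predicate a compiler could report statically:

```c
/* The quoted program: it contains a reachable non-terminating loop.
 * Whether the loop is *entered* depends on the input, but a static
 * analysis can flag its existence without predicting the input. */
int program(int c)
{
    if (c == '\n')
        while (1)
            ;                /* loops forever on a carriage return */
    return 0;                /* halts on every other input */
}

/* What the compiler can report: the condition on the input under
 * which the program enters its non-terminating loop. */
int hits_infinite_loop(int c)
{
    return c == '\n';
}
```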
mathew@mantis.co.uk (CNEWS MUST DIE!) (05/28/91)
yodaiken@chelm.cs.umass.edu (victor yodaiken) writes: > Forgive me if I mistake your meaning, but this seems like complete nonsense. > The expression x> 10 does not constrain x to an "exact amount". No. It does, however, constrain x to be greater than some exact amount. As opposed to being greater than some fuzzily-defined amount which is approximately 10. > If emacs requires at least 4 meg of memory in order to run correctly, > then knowing that x>4meg is certainly good enough. It depends. It might require significantly more memory than that if you're editing large files. The amount of memory required by Emacs is not an exact amount. It depends what you're doing with the program. Part of the job of determining whether an ANSI C program will halt will involve placing exact bounds on how much memory it might use in running, and hence producing exact specifications of input values/conditions for which it might not terminate. And remember that a single byte might make a difference; so if it's December, your calendar program might fail to work properly whereas it worked fine in May, because somewhere the program is allocating space for the name of the month and this is resulting in its not having enough space to complete some other task. > The halting problem involves deterministic machines. It does not > involve predicting the time of day which you will chose to run > a program or the next word which I will type into a keyboard. I didn't mean to imply that it did. > A computer program can be considered to be a deterministic machine > computing a function on its input. You want to say that program behavior > is not deterministic, because the input is not deterministic. Generally, > however, we are less interested in predicting system input than in making > sure that the program can cope correctly with *every* possible input. Right. 
So in addition, your Terminating ANSI C has to check whether all possible return values from standard library calls will result in termination or not. For example, if the program uses time.h, we must check to see that it terminates for all possible start times. Basically, it adds another dimension of complexity to the problem. > >You said "an arbitrary C program". Not "an arbitrary C program which is not > >allowed to look at the system time". > > Here, I'm considering the program as a function with input varying over > its domain. In this case, consider the domain to be the set of possible > time values. We don't want to attempt to predict which elements of this > domain will be encountered: i.e., we don't want to try to predict the > time of day when the program will be run. We just want to make sure that > the program behaves "correctly" for any time value. Right. That's fine. Basically, for any moderately complicated ANSI C program the dimensions of the input domain and the range of each dimension will be so large that trying all combinations will be infeasible on even the largest computers. I'll leave it to someone else to do some sums to demonstrate this. In short, determining the halting status of ANSI C programs is not feasible for real programs; and it's not possible for general ANSI C programs, because general ANSI C programs might have arbitrarily large input spaces making analysis require impossible amounts of memory and CPU time. If you want to write programs and be told whether they will halt or not, you will have to use a different language; one which is restricted so that you don't have to rely on brute-force methods for determining halting status. For example, a functional language which uses map operations (iteration over a structure) rather than while loops and which limits general recursion. mathew
doug@netcom.COM (Doug Merritt) (05/29/91)
In article <k3ZN38w164w@mantis.co.uk> mathew@mantis.co.uk (CNEWS MUST DIE!) writes: >In short, determining the halting status of ANSI C programs is not feasible >for real programs; and it's not possible for general ANSI C programs, because >general ANSI C programs might have arbitrarily large input spaces making >analysis require impossible amounts of memory and CPU time. I don't see why all that stuff is even being discussed; like I pointed out the other day, even determining halting for self-contained C programs (no malloc, no clock dependency, etc) is impossible, so why bring in the other junk? >If you want to write programs and be told whether they will halt or not, you >will have to use a different language; one which is restricted so that you >don't have to rely on brute-force methods for determining halting status. For >example, a functional language which uses map operations (iteration over a >structure) rather than while loops and which limits general recursion. If you are aware of work that shows that the halting problem is soluble for functional languages, I would dearly love to see the reference. Or even that it *might* be soluble. (In other words, I don't believe it for a minute; functional languages are universal computational models and as such can be mapped one-to-one with Turing machines and the lambda calculus etc.) Also, you and your conversational opponent are starting to sound like you're in violent agreement. No doubt that's just because I've lost track of your actual base positions. Doug -- Doug Merritt doug@netcom.com (apple!netcom!doug) -OR- sun.com!jfrank!doug -OR- doug@eris.berkeley.edu Professional Wild-eyed Visionary Member, Crusaders for a Better Tomorrow
yodaiken@chelm.cs.umass.edu (victor yodaiken) (05/29/91)
In article <k3ZN38w164w@mantis.co.uk> mathew@mantis.co.uk (CNEWS MUST DIE!) writes: >Basically, for any moderately complicated ANSI C program the dimensions of >the input domain and the range of each dimension will be so large that trying >all combinations will be infeasible on even the largest computers. I'll leave >it to someone else to do some sums to demonstrate this. > >In short, determining the halting status of ANSI C programs is not feasible >for real programs; and it's not possible for general ANSI C programs, because >general ANSI C programs might have arbitrarily large input spaces making >analysis require impossible amounts of memory and CPU time. This discussion has passed into a terminally dull zone. So, I'm going to give up, with some final words. First, I'd like to return to my original point: computer scientists overuse and misuse the notion of undecidable problems and in doing so may be shutting off some interesting research avenues. It is not clear that there are any interesting problems in compiler design and program verification which are undecidable. This is simply because we are seldom interested in knowing whether computations can be done in unbounded time and space; rather, we generally want to know if a problem can be solved using some bounded level of resources, e.g., within 10 nanoseconds, within polynomial/exponential time, within n time units where n is a parameter, within 100 million steps times the size of the input, using no more than MAXPROC space etc. etc. Second, I want to reiterate that it is important to precisely distinguish between "hard" and "impossible". There are some problems that we know are impossible to solve algorithmically, and there are other problems which seem intractable. But, just because the readers of this newsgroup are incapable of thinking up an algorithm or theoretical result on the fly, does not imply that there is no such algorithm, or no result that will melt a currently intractable problem. 
Finally, a quote: \begin{quote} Turing machines are widely considered to be the abstract prototype of digital computers; workers in the field, however, have felt more and more that the notion of a Turing machine is too general to serve as an accurate model of actual computers. It is well known that for even a simple calculation it is impossible to give an {\it a priori} upper bound on the amount of tape a Turing machine will need for any given computation. It is precisely this feature which renders Turing's concept unrealistic. In the last few years the idea of a {\em finite automaton} has appeared in the literature. These are machines having only a finite number of internal states that can be used for memory and computation. The restriction of finiteness appears to give a better approximation to the idea of a physical machine. Of course, such machines cannot do as much as Turing machines, but the advantage of being able to compute a general recursive function is questionable, since very few of these functions come up in practical applications. \end{quote} Rabin, M. O. and Dana Scott, "Finite automata and their decision problems", IBM Journal of Research and Development, 3(2), April 1959, pp. 114-125.
rockwell@socrates.umd.edu (Raul Rockwell) (05/30/91)
Victor Yodaiken: \begin{quote} Turing machines are widely considered to be the abstract prototype of digital computers; workers in the field, however, have felt more and more that the notion of a Turing machine is too general to serve as an accurate model of actual computers. It is well known that for even a simple calculation it is impossible to give an {\it a priori} upper bound on the amount of tape a Turing machine will need for any given computation. It is precisely this feature which renders Turing's concept unrealistic. \end{quote} author="Rabin M.O. and Dana Scott", [etc...] While this quote may be literally accurate (workers in the field have felt...) it misses what I feel is a rather major point: The reason it is impossible to give an upper bound on the amount of tape a Turing machine needs for a given computation is that these computations are traditionally expressed for *arbitrarily large objects*. If any object can be arbitrarily large then it doesn't matter WHAT the computation is, it will take an arbitrarily large amount of tape to represent the computation on that object. Note that if you limit your objects to some finite size (like, integers whose absolute magnitude is less than 2 billion), you can represent (for example) multiplication of two of these numbers using only two tape cells. Of course, then you have to deal with overflow conditions, but that's a problem with any finite number representation. Note that you do *not* want to write out each of the quadruples by hand for this kind of Turing machine--so don't. Just use parameterized descriptions. Raul Rockwell
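The bounded-multiplication point can be made concrete in C rather than tape cells (a hypothetical helper, not from the post): with integers confined to 32 bits, multiplication is a finite operation, and "overflow" is exactly the case the representation cannot hold.

```c
#include <stdint.h>

/* Returns 1 if a*b is not representable as a 32-bit signed integer.
 * The product is computed exactly in 64 bits, then compared against
 * the 32-bit range -- the finite-representation overflow check the
 * post alludes to. */
int mul_overflows_int32(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;   /* exact in 64 bits */
    return wide > INT32_MAX || wide < INT32_MIN;
}
```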
mathew@mantis.co.uk (CNEWS MUST DIE!) (05/30/91)
doug@netcom.COM (Doug Merritt) writes: > If you are aware of work that shows that the halting problem is soluble > for functional languages, I would dearly love to see the reference. Or > even that it *might* be soluble. Sorry, this bit of the thread was talking not about solving the halting problem in general, but about getting useful (but limited) information about halting conditions for certain classes of program. I don't believe that Victor Yodaiken thinks that *general* C programs are analyzable for halting status. My contention is that even this limited sort of halting information is impractical to work out for ANSI C, but that I understand that some functional languages might allow you to gain useful information about the likelihood of some useful programs halting. I'm not really an expert in functional programming, so I'll leave it at that. mathew
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/12/91)
In article <1991May28.204434.16396@netcom.COM> doug@netcom.COM (Doug Merritt) writes: > I don't see why all that stuff is even being discussed; like I pointed out > the other day, even determining halting for self-contained C programs (no > malloc, no clock dependency, etc) is impossible, There does not exist a single self-contained C program which will, all by itself, determine the halting status of *every* self-contained C program. However, for any given *size* (including program size and memory use) of self-contained C program, there exists a self-contained C program which will determine the halting status of every self-contained C program under that size. Therefore ``determining halting for self-contained C programs'' is quite possible. In fact, it's rather easy to construct the program with approximately twice the given size, plus a fixed amount of housekeeping. So ``determining halting for self-contained C programs'' is not only possible, it's practical. In fact, it's rather easy. Now, Doug, would you like to explain on what basis in fact or fiction you decided to call this ``impossible''? ---Dan
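The "twice the given size, plus a fixed amount of housekeeping" construction amounts to two-copy cycle detection: run one copy of the machine at full speed and one at half speed; if the fast copy ever catches the slow one, the machine is in a cycle and will never halt. A C sketch, under the assumption that the machine is given as a pure transition function on its state (hypothetical names):

```c
typedef unsigned (*step_fn)(unsigned);

/* Two copies of the machine's state -- about twice its size -- plus
 * a constant amount of bookkeeping.  Always terminates on a finite
 * machine: either the fast copy reaches the halt state, or the two
 * copies meet inside the cycle the machine has entered. */
int halts(step_fn step, unsigned start, unsigned halt_state)
{
    unsigned slow = start, fast = start;
    for (;;) {
        if (fast == halt_state) return 1;
        fast = step(fast);
        if (fast == halt_state) return 1;
        fast = step(fast);
        slow = step(slow);
        if (slow == fast) return 0;   /* cycle detected: never halts */
    }
}

/* Example machine: states 0..7, s -> (s+2) mod 8, halt state 5.
 * Odd starts reach 5; even starts cycle among even states forever. */
unsigned demo_step(unsigned s) { return (s + 2) & 7u; }
```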
smryan@clipper.ingr.com (Steven Ryan) (06/13/91)
In article <22012:Jun1205:25:0491@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >However, for any given *size* (including program size and memory use) of >self-contained C program, there exists a self-contained C program which >will determine the halting status of every self-contained C program >under that size. Therefore ``determining halting for self-contained C >programs'' is quite possible. Not all programs have an a priori space/time bounds. Your conclusion therefore only applies to bounded programs. You have no proof that unbounded programs do not exist. Even if you wish to assume that the universe has only finite space and time, therefore any 'practical' program must be bounded, your argument is not proved. If the universe is bounded and if there exists a program which requires exactly all the space of the universe, the program to analyze it requires double the size of universe and is thus 'practically' impossible. >In fact, it's rather easy. Perhaps you should spend a little time study the math you so easily dismiss. >Now, Doug, would you like to explain on what basis in fact or fiction >you decided to call this ``impossible''? Not that you'd pay attention to this. -- |-In compliance with DoC,-----------------------Steven Ryan--------------------| | this posting is devoid | 2400 Geng Road, Palo Alto, CA | | of all information. | [clipper![wyrmham!]] smryan@ingr.com | |-And Oliver North married William Secord/and gave birth to a little Teheran.--|
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/14/91)
In article <1991Jun12.194212.21395@clipper.ingr.com> smryan@clipper.ingr.com (Steven Ryan) writes: > In article <22012:Jun1205:25:0491@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: > >However, for any given *size* (including program size and memory use) of > >self-contained C program, there exists a self-contained C program which > >will determine the halting status of every self-contained C program > >under that size. Therefore ``determining halting for self-contained C > >programs'' is quite possible. > Not all programs have an a priori space/time bounds. Since you obviously weren't paying attention: ``Self-contained C program'' was defined by Doug to mean one that used neither malloc() nor an unpredictable external input. > Your conclusion > therefore only applies to bounded programs. No shit. That's what ``for any given *size*'' means: it's a *bound*. (``Welcome to Mr. Rogers' Neighborhood. Can you say `bound'? B-O-U-N-D. Bound. Very good, boys and girls. This is a bound.'') > You have no proof that unbounded > programs do not exist. Sure I do. Computers have finite memory. To determine halting for an IBM PC in some state, I need merely hook up another IBM PC and a bit of extra hardware, and run both machines a bit slower than usual. To determine halting for a given language environment, the interpreter or compiler need only duplicate its memory and do a bit of housekeeping. This is eminently practical. And, by the way, here's a bit of elementary logic: Even if unbounded programs ``exist'' in one sense or another (certainly not in the real world), that would not affect the validity of my statements, since I quantified them only over the set of self-contained C programs, all of which are bounded. So, Stevie, what does your claim have to do with what I said? As your mother told you: If you aren't contributing to the discussion at hand, shut the fuck up. > >In fact, it's rather easy. 
> Perhaps you should spend a little time studying the math you so easily dismiss. Ahahahaha. That's cute. For someone who's just made a fool out of himself, Stevie, you've got a good sense of humor, you know? ---Dan
billaud@geocub.UUCP (Michel BILLAUD) (06/17/91)
In article <10452.Jun1317.26.4791@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >In article <1991Jun12.194212.21395@clipper.ingr.com> smryan@clipper.ingr.com (Steven Ryan) writes: >> In article <22012:Jun1205:25:0491@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >> Your conclusion >> therefore only applies to bounded programs. > >No shit. That's what ``for any given *size*'' means: it's a *bound*. >(``Welcome to Mr. Rogers' Neighborhood. Can you say `bound'? B-O-U-N-D. >Bound. Very good, boys and girls. This is a bound.'') > >> You have no proof that unbounded >> programs do not exist. > >Sure I do. Computers have finite memory. To determine halting for an IBM >PC in some state, I need merely hook up another IBM PC and a bit of >extra hardware, and run both machines a bit slower than usual. To >determine halting for a given language environment, the interpreter or >compiler need only duplicate its memory and do a bit of housekeeping. >This is eminently practical. Another point is that there is only a *FINITE* *NUMBER* of computers in the universe. As "Computers have finite memory", there exists a computer CM which has more memory than any other. "Practically", there will be no computer to run your program to determine halting on that CM. You could object that you may "hook up" several CMs with additional hardware : but there is a finite number of networks, and so on. Also we mortals have a limited amount of time, so ... M Billaud -- Michel BILLAUD : billaud@geocub.greco-prog.fr Departement d'Informatique : IUT "A", Universite Bordeaux I : phone W: 56.84.57.92 // 56.84.69.22 33405 Talence (FRANCE) : H: 56.36.65.90 (answering mach.)
smryan@clipper.ingr.com (Steven Ryan) (06/18/91)
In article <10452.Jun1317.26.4791@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >So, Stevie, what does your claim have to do with what I said? As your >mother told you: If you aren't contributing to the discussion at hand, >shut the fuck up. > >> >In fact, it's rather easy. >> Perhaps you should spend a little time study the math you so easily dismiss. > >Ahahahaha. That's cute. For someone who's just made a fool out of >himself, Stevie, you've got a good sense of humor, you know? > >---Dan Ooooooooo! Touche! A delightfully well thought out and logically incisive argument. You're sexy when you're mathematical. -- |-In compliance with DoC,-----------------------Steven Ryan--------------------| | this posting is devoid | 2400 Geng Road, Palo Alto, CA | | of all information. | [clipper![wyrmham!]] smryan@ingr.com | |-And Oliver North married William Secord/and gave birth to a little Teheran.--|
smryan@clipper.ingr.com (Steven Ryan) (06/18/91)
>Since you obviously weren't paying attention: ``Self-contained C >program'' was defined by Doug to mean one that used neither malloc() >nor an unpredictable external input. > >> Your conclusion >> therefore only applies to bounded programs. > >No shit. That's what ``for any given *size*'' means: it's a *bound*. >(``Welcome to Mr. Rogers' Neighborhood. Can you say `bound'? B-O-U-N-D. >Bound. Very good, boys and girls. This is a bound.'') And of course I missed where you put a bound on all external storage, including temporary files, discs, mag tapes, and whatever else. You're just so sexy. -- |-In compliance with DoC,-----------------------Steven Ryan--------------------| | this posting is devoid | 2400 Geng Road, Palo Alto, CA | | of all information. | [clipper![wyrmham!]] smryan@ingr.com | |-And Oliver North married William Secord/and gave birth to a little Teheran.--|
smryan@clipper.ingr.com (Steven Ryan) (06/19/91)
OOOOOO Danny you're so rough. I guess you like playing public school master. All us bad little boys better bare our bottoms and let you flail away. >Since you obviously weren't paying attention: ``Self-contained C >program'' was defined by Doug to mean one that used neither malloc() >nor an unpredictable external input. (1) Is recursion also prohibited? It's simple to write a pda which uses no heap. Does this mean pda = fsm ? (2) Now that you've demonstrated all kinds of wonderful things about `self-contained C programs' can you give some examples of this class? cat? cp? Not wc unless you're going to restrict the size of the input. Or does predictable input mean bounded size? I guess no exabyte tape drives this semester. >Sure I do. Computers have finite memory. To determine halting for an IBM. . . >. . .This is eminently practical. (3) Of course, how stupid of me, this is soooo obvious. Let's assume your analyzer uses twice as much space (speak dirty to me: say bytes) as the analyzee. But does the analyzer halt? I know, I'll build an analyzer for it. The analyzer analyzer only uses four times as much space as the analyzee, but does it halt? I know, I'll build an analyzer for it. The analyzer analyzer analyzer only uses eight times as much space as the analyzee, but does it halt? I know, I'll build an analyzer for it. The . . . analyzer analyzer analyzer analyzer analyzer analyzer analyzer . . . Dan, you sexy programmer, you. I guess I'm all goosebumpy just thinking about how you've solved the halting problem. And you only need infinite space. -- |-In compliance with DoC,-----------------------Steven Ryan--------------------| | this posting is devoid | 2400 Geng Road, Palo Alto, CA | | of all information. | [clipper![wyrmham!]] smryan@ingr.com | |-And Oliver North married William Secord/and gave birth to a little Teheran.--|
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/21/91)
In article <1991Jun18.172115.7185@clipper.ingr.com> smryan@clipper.ingr.com (Steven Ryan) writes: > >Since you obviously weren't paying attention: ``Self-contained C > >program'' was defined by Doug to mean one that used neither malloc() > >nor an unpredictable external input. > (1) Is recursion also prohibited? In any reasonable computational model with a reasonable definition of ``size'', unbounded non-tail recursion implies unbounded space. > (2) Now that you've demonstrated all kinds of wonderful things about > `self-contained C programs' can you give some examples of this class? Sure: A program which integrates a fluid flow field over a 10000x10000 grid. A graph manipulation program. A program which tests a given number for primality. A program which finds the next billion zeros of the Riemann zeta function. Have you forgotten the derivation of the word ``computer''? > cat? cp? Not wc unless you're going to restrict the size of the input. > Or does predictable input mean bounded size? I guess no exabyte tape > drives this semester. Predictable input means that you can replace all input statements with a bounded set of coroutines which supply that input. It does not imply ``finite input'', which more precisely might mean ``the n'th read() will return 0 for some n''; an infinite stream of x's is perfectly predictable. Figure out the rest for yourself. > >Sure I do. Computers have finite memory. To determine halting for an IBM. . . > >. . .This is eminently practical. > (3) Of course, how stupid of me, this is soooo obvious. Let's assume your > analyzer uses twice as much space (speak dirty to me: say bytes) as the > analyzee. But does the analyzer halt? I know, I'll build an analyzer for > it. Don't be stupid. The analyzer will always halt, because the original computer either enters a loop (in which case the analyzer detects the loop) or itself halts (in which case the analyzer halts too). > Dan, you sexy programmer, you. 
> I guess I'm all goosebumpy just thinking about how you've solved the > halting problem. And you only need infinite space. Are you babbling because you were lost somewhere in the evolutionary tree, or because you simply don't know what you're talking about? ---Dan
silver@xrtll (Hi Ho Silver) (06/21/91)
Sayeth smryan@clipper.ingr.com (Steven Ryan):
$And of course I missed where you put a bound on all external storage, including
$temporary files, discs, mag tapes, and whatever else.
This is even more effective if your girlfriend is bound as well.
$You're just so sexy.
Particularly since I managed to bind myself to my girlfriend and
we're both STARK NAKED in the back of our BLACK WINDOWLESS VAN trying
to find a way to get out so that we can run across the parking lot and
into the 7-11 with our PRIVATE PARTS BOUNCING ALL OVER THE PLACE where
EVERYONE CAN SEE THEM.
It wouldn't have been so bad if she'd finished shaving my PUBIC HAIR
because then I wouldn't have gotten any of them caught in the ropes.
--
.--------------------------------------.nexus.yorku.edu!xrtll!silver
|Silver, perpetually searching for SNTF|----------------------------
`--------------------------------------'a vaguely phallic .signature
smryan@clipper.ingr.com (Steven Ryan) (06/22/91)
In article <4457.Jun2021.35.1991@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >In article <1991Jun18.172115.7185@clipper.ingr.com> smryan@clipper.ingr.com (Steven Ryan) writes: >> >Since you obviously weren't paying attention: ``Self-contained C >> >program'' was defined by Doug to mean one that used neither malloc() >> >nor an unpredictable external input. >> (1) Is recursion also prohibited? > >In any reasonable computational model with a reasonable definition of >``size'', unbounded non-tail recursion implies unbounded space. Golly. I would have never thought of that by myself. Thanks. Do you permit recursion and temporary files in your self-contained C program? >> (2) Now that you've demonstrated all kinds of wonderful things about >> `self-contained C programs' can you give some examples of this class? > >Sure: A program which integrates a fluid flow field over a 10000x10000 >grid. A graph manipulation program. A program which tests a given number >for primality. A program which finds the next billion zeros of the >Riemann zeta function. Have you forgotten the derivation of the word >``computer''? Oh, I get it now. Any program with fixed size input is equivalent to an FSM. Wow. That's a really important result. Are you going to write a paper on this? What about an n*n grid where n is not known when the program is coded? What about an n+e graph manipulator when n and e are not known when the program.... A program that tests any number for primality? The next m zeros? Computer? Compter? I reckon so. (Inside joke for our Dutch and German friends.) >Predictable input means that you can replace all input statements with a >bounded set of coroutines which supply that input. It does not imply >``finite input'', which more precisely might mean ``the n'th read() will >return 0 for some n''; an infinite stream of x's is perfectly >predictable. Figure out the rest for yourself. Well, the coroutines have to be fixed size to compile. 
If you're serious about bounded recursion and space, that means the only unbounded inputs are type 3 languages. Wow, another important result. It only takes a FSM to recognise a type 3 language. You are indeed a tower of intellect. I bet you get all kinds of net.groupies.

>Don't be stupid. The analyzer will always halt, because the original
>computer either enters a loop (in which case the analyzer detects the
>loop) or itself halts (in which case the analyzer halts too).

Of course! Ah-ha! It's so obvious. If you say it's true, it must be. You don't happen to have a proof, do you?

>Are you babbling because you were lost somewhere in the evolutionary
>tree, or because you simply don't know what you're talking about?

Oh, Dan, I am the dust beneath your feet. I grovel for enlightenment. Please answer the query, which is too much for my feeble brain.

I'm running my program on machine #1, maxing out its memory. You run your analyzer on a slightly larger machine #2. Well, I'm maxed out and need an upgrade--I use machine #2. I guess you need a slightly larger machine #3. Oh, but I'll take that one instead. Now you need a slightly larger machine #4. Silly me, but it seems that I can consume machines faster than you can produce them. If, as you claim, resources are inherently limited, it seems I'll be using some machine #n and there won't be a machine #n+1 for your analyzer. Is my program still a self-contained C program? If so, how are you going to run your analyzer?

--
|-In compliance with DoC,---------------------Steven Ryan----------------------|
| this posting is devoid | 2400 Geng Road, Palo Alto, CA |
| of all information. | smryan@wyrmham.clipper.ingr.com |
|-And Oliver North married William Secord/and gave birth to a little Teheran.--|
rockwell@socrates.umd.edu (Raul Rockwell) (06/22/91)
Dan Bernstein:
   Don't be stupid. The analyzer will always halt, because the original
   computer either enters a loop (in which case the analyzer detects
   the loop) or itself halts (in which case the analyzer halts too).

Detecting loops on a machine with n states requires O(2^n) states in the analyzer. Except for trivial cases, this is far more than n+n states.

-- Raul <rockwell@socrates.umd.edu>
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/23/91)
In article <ROCKWELL.91Jun22010647@socrates.umd.edu> rockwell@socrates.umd.edu (Raul Rockwell) writes:
> Dan Bernstein:
>    Don't be stupid. The analyzer will always halt, because the
>    original computer either enters a loop (in which case the analyzer
>    detects the loop) or itself halts (in which case the analyzer halts
>    too).
> Detecting loops on a machine with n states requires O(2^n) states in
> the analyzer.

This is the equivalent in computer ``science'' of an old wives' tale. Would you like to prove it, Raul?

Let me guess: You were taught in computer science class that the halting problem could be solved for n-state machines with a 2^n-state machine that simply kept track of which states had been seen so far. *Therefore*, said your professor, the halting problem was solvable for FSMs in theory, but not in practice. How's that for a guess? ``This method for solving the problem takes exponential time. Therefore the problem is not in P.''

Anyone who reads Knuth (which, I gather, has fallen out of favor among computer scientists, as their students no longer know enough math to understand the first two pages) will find out that to detect loops in a sequence X_{n+1} = f(X_n), you do NOT need to store all previous X_k and compare X_{n+1} to each in turn. All you need to do is generate X_n and X_{2n} simultaneously, then compare the two. Can you prove this, Raul? Can you figure out how to adapt it to make a practical solution to the halting problem? Congratulations. Now what do you think of the statement ``Detecting loops on a machine with n states requires O(2^n) states in the analyzer''?

---Dan
mhcoffin@tolstoy.waterloo.edu (Michael Coffin) (06/23/91)
In article <29254.Jun2219.22.4491@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>
>Anyone who reads Knuth (which, I gather, has fallen out of favor among
>computer scientists, as their students no longer know enough math to
>understand the first two pages) will find out that to detect loops in a
>sequence X_{n+1} = f(X_n), you do NOT need to store all previous X_k and
>compare X_{n+1} to each in turn. All you need to do is generate X_n and
>X_{2n} simultaneously, then compare the two. Can you prove this, Raul?
>Can you figure out how to adapt it to make a practical solution to the
>halting problem? Congratulations. Now what do you think of the statement
>``Detecting loops on a machine with n states requires O(2^n) states in
>the analyzer''?

Lighten up, Mr. Bernstein. (As the inimitable J. Buffett says, "Don't ever forget---you might end up being wrong.")

The technique you describe is well known by most computer scientists, and great for finding cycles in graphs. It doesn't make a "practical solution" to the halting problem because it can take a long, long time to run. Here's a quick exercise---maybe a 10 in Knuth's rating system. How many steps can the above algorithm run, as a function of the number of bits of state? Estimate how long the algorithm would take to decide that the following program indeed halts:

   for i := 1 to 2^31 do
      for j := 1 to 2^31 do
         for k := 1 to 2^31 do
            ;

Can a solution to the halting problem be said to be "practical" if it takes millennia to terminate when applied to a simple program that uses less than 12 bytes of state?

The loop detection algorithm might actually be handy once in a while, to detect programs that get into tight loops that don't change the state. But don't you think that calling it a "practical solution to the halting problem" is a tad overblown?

-mike
rockwell@socrates.umd.edu (Raul Rockwell) (06/23/91)
Dan Bernstein:
   Anyone who reads Knuth (which, I gather, has fallen out of favor
   among computer scientists, as their students no longer know enough
   math to understand the first two pages) will find out that to detect
   loops in a sequence X_{n+1} = f(X_n), you do NOT need to store all
   previous X_k and compare X_{n+1} to each in turn. All you need to do
   is generate X_n and X_{2n} simultaneously, then compare the two. Can
   you prove this, Raul? Can you figure out how to adapt it to make a
   practical solution to the halting problem? Congratulations. Now what
   do you think of the statement ``Detecting loops on a machine with n
   states requires O(2^n) states in the analyzer''?

Ok, so I need to buy myself a copy of Knuth... (again)

There's still the minor problem that a machine with n bits has 2^n states...

-- Raul <rockwell@socrates.umd.edu>
oz@ursa.ccs.yorku.ca (Ozan Yigit) (06/24/91)
Dan Bernstein writes:
Anyone who reads Knuth (which, I gather, has fallen out of favor among
computer scientists, as their students no longer know enough math to
understand the first two pages) will find out ...
Those volumes are not out of favor, but they are out-of-date by a
decade and a half, not to mention their use of a prehistoric notation
to illustrate (and I use the term loosely) some of the algorithms.
Actually, exhaustive math and detail are the things that make the
boring MIX scraps almost tolerable.
oz
---
Often it is means that justify ends: Goals | email: oz@nexus.yorku.ca
advance technique and technique survives even | phone: 416-736-5257 x 33976
when goal structures crumble. -- A. J. Perlis | other: oz@ursa.ccs.yorku.ca
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/24/91)
In article <OZ.91Jun23145702@ursa.ccs.yorku.ca> oz@ursa.ccs.yorku.ca (Ozan Yigit) writes:
 [ on Knuth's Art of Computer Programming ]
> Those volumes are not out of favor, but they are out-of-date by a
> decade and a half,

Really? Within the subject matter covered by volumes 1 through 3, can you name any important algorithm (i.e., algorithm used by a significant percentage of today's programmers) not mentioned by Knuth? Can you name an important field---newer than 1973, or 1981 for the second volume---that Knuth doesn't cover but that would logically fit into his books? (Keep in mind that compilers, for example, will be covered in a later volume if Knuth ever stops working on TeX.)

I'm listening.

---Dan
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/24/91)
In article <1991Jun22.213750.18954@watserv1.waterloo.edu> mhcoffin@tolstoy.waterloo.edu (Michael Coffin) writes:
> The technique you
> describe is well known by most computer scientists,

Some, at least. So how come there are so many published statements that to detect loops in an n-bit computer requires 2^n bits? I first saw this statement in a DDJ article---surely DDJ's editors don't all suffer the same delusion!

> It doesn't make a "practical solution" to
> the halting problem because it can take a long, long time to run.

But in *practice* most loops cycle through a very small set of states. This is true of nearly all the infinite loops I've seen, both in my code and in code I've had to maintain.

> Can a solution to the halting problem be said to be "practical" if it
> takes millennia to terminate when applied to a simple program that uses
> less than 12 bytes of state?

There will always exist programs of length n that run for 2^2^{kn} steps before stopping. In *practice* very few programs exhibit such behavior.

---Dan
plyon@ut-emx.uucp (Paul Lyon) (06/24/91)
In article <12358.Jun2400.03.5891@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <OZ.91Jun23145702@ursa.ccs.yorku.ca> oz@ursa.ccs.yorku.ca (Ozan Yigit) writes:
> [ on Knuth's Art of Computer Programming ]
>> Those volumes are not out of favor, but they are out-of-date by a
>> decade and a half,
>
>Really? Within the subject matter covered by volumes 1 through 3, can
>you name any important algorithm (i.e., algorithm used by a significant
>percentage of today's programmers) not mentioned by Knuth?

It strikes me that there are some obvious examples here, some of which Knuth himself had a hand in discovering. For example: the Knuth-Morris-Pratt string searching algorithm and the Knuth-Bendix completion algorithm for term-rewriting systems (the basis for a good deal of the activity in "Logic Programming" relating to equations). Other obvious examples are the Boyer-Moore string searching algorithm and the Simplex algorithm. Then there is Robinson's Unification Algorithm, the alpha-beta minimax algorithm, and so it goes...

For, bear in mind that besides the projected Volume 7 on compilers, there were (are?) meant to be Volume 4, on Combinatorial Algorithms (which, judging by the handful of references to NP-Completeness in Volumes 1 - 3, was where complexity theory would be discussed), and Volume 5, on Lexical Scanning and Parsing (see the outline on pages vii-viii of the Preface to Volume 1). Perhaps the claim would be correct if the remaining volumes were published, but since this hasn't happened yet ....
rockwell@socrates.umd.edu (Raul Rockwell) (06/24/91)
Dan Bernstein:
   There will always exist programs of length n that run for 2^2^{kn}
   steps before stopping. In *practice* very few programs exhibit such
   behavior.

So? I can burn hours, possibly days of cpu time with a single command line. And there's not even an infinite loop to detect. It's true that in practice I try to avoid such constructs, but I don't see it as a meaningful, qualitative statement that such things don't exist.

-- Raul <rockwell@socrates.umd.edu>
sw@smds.UUCP (Stephen E. Witham) (06/24/91)
In article <29254.Jun2219.22.4491@kramden.acf.nyu.edu>, brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> In article <ROCKWELL.91Jun22010647@socrates.umd.edu> rockwell@socrates.umd.edu (Raul Rockwell) writes:
> > Detecting loops on a machine with n states requires O(2^n) states in
> > the analyzer.
>
> This is the equivalent in computer ``science'' of an old wives' tale.

There is a difference of assumptions here that has been going on through this whole thread. If you know all the inputs to your program, then it's equivalent to an FSM starting in a certain state, and you can find out whether it halts with two copies of the FSM--twice the memory, O(n^2) states. If you don't know all the inputs in advance, then you're trying to find out whether a whole class of FSMs halts. For n bits of input, that's 2^n different problems.

--Steve
macrakis@osf.org (Stavros Macrakis) (06/25/91)
In article <12358.Jun2400.03.5891@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
Within the subject matter covered by [Knuth] volumes 1 through 3,
can you name any important algorithm (i.e., algorithm used by a
significant percentage of today's programmers) not mentioned by
Knuth?
Since someone has already answered the question, let me question your
question. Since when is an algorithm's ``importance'' measured by the
percentage of programmers using it? By that metric, probably 80% of
volume 1 is ``unimportant'', about 99% of volume 2, and perhaps 90% of
volume 3. Maybe a little more in some very high-tech environments.
Knuth is full of important algorithms, but many important algorithms
have been published since. Alas, neither old nor new algorithms seem
to be widely understood and used.
Unfortunate, no doubt, but that's life.
-s
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/25/91)
In article <MACRAKIS.91Jun24135614@next15.osf.org> macrakis@osf.org (Stavros Macrakis) writes:
> Since when is an algorithm's ``importance'' measured by the
> percentage of programmers using it?

You're right. A better definition would be an algorithm or set of algorithms worthy of a section in volumes 1 through 3, or an algorithm that renders some of the information in volumes 1 through 3 obsolete. If Knuth's books are ``out of date'' then surely such algorithms exist, no?

---Dan
oz@ursa.ccs.yorku.ca (Ozan Yigit) (06/25/91)
Dan Bernstein writes:
> > Those volumes are not out of favor, but they are out-of-date by a
> > decade and a half,
> Really?

Yes, really.

> Within the subject matter covered by volumes 1 through 3, can you name
> any important algorithm (i.e., algorithm used by a significant
> percentage of today's programmers) not mentioned by Knuth?

If your silly response is any indication, you would still be staring at my finger after I point to some of those algorithms, with no hope of you ever discontinuing your semi-literate ramblings.

oz
smryan@clipper.ingr.com (Steven Ryan) (06/26/91)
>But in *practice* most loops cycle through a very small set of states.
>This is true of nearly all the infinite loops I've seen, both in my code
>and in code I've had to maintain.
>
>There will always exist programs of length n that run for 2^2^{kn}
>steps before stopping. In *practice* very few programs exhibit such
>behavior.

Oh, how faux profond. You can write an analyzer for any practical program. And what is a practical program? One for which you can write an analyzer.

--
|-In compliance with DoC,---------------------Steven Ryan----------------------|
| this posting is devoid | 2400 Geng Road, Palo Alto, CA |
| of all information. | smryan@wyrmham.clipper.ingr.com |
|-And Oliver North married William Secord/and gave birth to a little Teheran.--|
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/26/91)
In article <OZ.91Jun25114153@ursa.ccs.yorku.ca> oz@ursa.ccs.yorku.ca (Ozan Yigit) writes:
> > Those volumes are not out of favor, but they are out-of-date by a
> > decade and a half,
> Really?
> Yes, really.

So why don't you back up your claim? Here, let me give you a head start. Knuth does spend quite a bit of time on asymptotically fast matrix multiplication. Today we not only know methods that are asymptotically even faster, but we have a somewhat better understanding of *why* the methods work. That portion of that section is clearly out of date, albeit by only a few years.

I can name a few other examples, each deserving an exercise or two and perhaps a mention in the text. None of them are important, the way that each new sorting algorithm has found its niche, or that the spectral test caught on, or that so many memory allocators depend on the buddy system. None of them could be discussed at Knuth's level for more than half a page.

C'mon, Oz. You said the books are out of date by a decade and a half. That means that there have been discoveries throughout the last 15 years which, had they been known at the time, would have been included in the books. What are they? I'm perfectly willing to believe that they exist, but you have to prove it by example rather than rhetoric.

---Dan
mhcoffin@tolstoy.waterloo.edu (Michael Coffin) (06/26/91)
The first thing that occurs to me that has come along since Knuth wrote The ACP, and ought to be in any programmer's toolbox, is skip lists. Robin Hood hashing would probably be mentioned as well. (That took me 2 minutes.)

-mike
schwartz@groucho.cs.psu.edu (Scott Schwartz) (06/26/91)
mhcoffin@tolstoy.waterloo.edu (Michael Coffin) writes:
...Skip lists. Robin Hood hashing...
Better, splay trees and amortised analysis.
gudeman@cs.arizona.edu (David Gudeman) (06/26/91)
In article <5979.Jun2519.38.4991@kramden.acf.nyu.edu> Dan Bernstein writes:
[about Knuth books being out of date]
]
]C'mon, Oz. You said the books are out of date by a decade and a half.
]That means that there have been discoveries throughout the last 15 years
]which, had they been known at the time, would have been included in the
]books.
No, Dan, that is not what "out of date" means. "Out of date" in this
context means that (1) there are much better ways to express
algorithms than the opaque, archaic notations in ACP, and (2) there
are new ways to approach the teaching of algorithms which many people
feel are better than the approach in ACP (your opinion on this is
irrelevant).
Besides, you just want to play this game where people suggest
algorithms that should have been in the books and aren't, then you get
to chortle and say "that doesn't fit the subject matter of the books",
or "that was going to be in the fourth book", or "that isn't an
important algorithm". (I've had this discussion with Dan before...)
Let me answer each objection once, to save net bandwidth:
Objection: "that doesn't fit the subject matter of the books"
Answer: then the subject matter of the books is outdated as material
for teaching algorithms. There are different ways of teaching
algorithms today (as before, your opinion on whether these new methods
are better is without relevance).
Objection: "that was going to be in the fourth book"
Answer: if the fourth book ever gets written, then it presumably won't
be out of date when it is first published.
Objection: "that isn't an important algorithm"
Answer: Is so.
Basically, Dan, this is a matter of opinion. Most CS people think
that the books are out of date -- for teaching at least. The fact
that you have a different opinion is neither interesting nor
enlightening. So why do you insist on afflicting us with your
argumentative opinions? Surely you don't think you are going to
change anyone's mind.
--
David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman