chris@umcp-cs.UUCP (Chris Torek) (07/09/86)
In article <12221138454.23.PATTIS@WASHINGTON.ARPA> PATTIS@WASHINGTON.ARPA (Richard Pattis) writes: >For introductory programming classes, I try to emphasize the teaching >of critical thinking. I attempt teach techniques for trying to >prove that code is incorrect (such an activity may be labeled as >aggressive debugging - even in the absence of manifest bugs). This can be done `on line' as well: it is good practice when testing code to try to provoke any `boundary conditions' or unusual code paths, as such bugs tend to show up less often, but more dramatically (or drastically), simply because the program is `time-tested' and now `can be relied upon'. >Mathematically proving and disproving code are identical, A nit: not really. Disproving can be done with a single counterexample, but a lack of counterexamples does not constitute a proof. >but in training future programmers, I would much prefer that my >students say, "I haven't found anything wrong with my code yet," >to the all too frequently heard, "It's correct." A proof of correctness does not mean that a piece of code works, for the proof can be wrong or for some reason inapplicable, so I think this is indeed a good thing. It is, however, easy to fall into the trap of thinking `no one has found anything wrong with it yet, so it must be right'. A few embarrassing errors can do wonders for one's humility: >Of course, it's easier to spot errors in someone else's code, so >one good way to teach this skill is to ask students to analyze code >written by others (known to be wrong) and find any errors. You might also have students analyse each other's code. I wonder, though, if it is indeed easier to spot errors in another's code, or whether there is just more incentive for that than for finding one's own errors. On occasion I will look at some of my own old code and note `obvious' mistakes. If everyone did this we might have much better code available, but on the other hand we would probably have much less code as well, though on the third hand (first foot?) that might not be so bad after all :-). >On formalism: I believe that formalism is useful only when it can >be applied to extend what we can do, or accurately check what we >have done. Formalism for its own sake is pedantry. For example, >I advocate the use of simple control structures such as Pascal's, >which subscribe to the 1-in/1-out rule. Yet in Modula-2 we may >violate this rule by using LOOP with EXIT and subroutines that use >RETURN. I use these features to simply implement simple algorithms: >often such code is easier to understand, because it omits the >introduction and use of extraneous state variables and other >circuitous control flow. Indeed. What I cannot understand is why anyone thinks that EXIT and RETURN statements are bad. It is trivial to come up with an EXIT axiom, even if it requires the introduction of a pseudo-variable in the formal proof. (Simply mechanically insert a boolean in the loop, and an `if' around the part of the loop that follows the EXIT. The code constructs are equivalent, so the formal constructs are as well.) >Some comments on these solutions: Search1 uses two RETURN statements, >one in the FOR loop. Many programmers consider the premature RETURN >from the FOR loop to be bad programming practice. Again, I ask: *Why*? An argument in favour of RETURN, by _reductio ad absurdum_: Integer variables are bad because the actual variable and the formal description do not match exactly, for computer integers, unlike formals, can overflow and become negative or cause a runtime exception. Because computer integers are hard to describe formally, using integers is bad programming practice. [Incidentally, the introductory programming course at the University of Maryland actually *does* avoid integers. For their first year, students use something called `CF Pascal', in which the only legal data types are characters and files. The students are supposed to be able to prove their programs correct, and having few data types (one) and `Turing-tape-like' stores (files) helps keep the proof complexity down. I do not believe that these restrictions are sensible beyond the stage of learning to write and prove algorithms, and I am not even certain they are sensible there. It does provoke novel solutions to old problems, however.] >... now remove the ability to perform short-circuit boolean >evaluation (SCBE) and rewrite this function in Pascal. Some >programmers may argue that SCBE should be disallowed for similar >reasons of formalilty, because it doesn't support the intuitive >semantics for boolean expressions (commutivity); in the example >above, we must very carefully select our post-loop IF test. Why is commutivity `intuitive' here? Having used SCBE's for years, I find it very intuitive; indeed, at least one of the last few Pascal programs I wrote broke because of code like the following: if (n <> 0) and ((x mod n) = 0) then ... I had forgotten that Pascal semantics implied commutivity, and had taken advantage of a nonexistent feature. The error was easily corrected, but this does show that a commutative mindset is by no means universal. And again, it is not hard to formally express a short-circuit evaluation, though in conventional formalism the proof itself will `suggest' a series of IFs. > PROCEDURE Search2 (VAR A : ARRAY OF INTEGER; (* Efficiency *) > Key : INTEGER; > Default : INTEGER) : INTEGER; > VAR I,AnswerIndex : INTEGER; > BEGIN > AnswerIndex:= Default; > I:= 0; > WHILE (I <= HIGH(A)) AND (AnswerIndex = Default) DO > IF A[I] = Key > THEN AnswerIndex:= I END; (* Found first I: A[I] = Key *) > I:= I+1 > END; > RETURN AnswerIndex > END Search2; Curious: if the Default answer is a legal subscript for A, and the desired value appears in A at that point and also at a later point, this routine returns the later point. For example, if A contains the values (4, 7, 2, 7, 3), Key is 7, and Default is 1, this function returns 3. Perhaps this is the `subtlety' you mention (I thought it fairly obvious). [N.B.: The following points were quoted for the purpose of refutation; I am *agreeing* with Pattis below.] >Here again I have abstracted some quotations from Sale's book >(page 401). ... > > "(1) Program fragments containing EXIT statements or unusually > placed RETURN statements which are not at the termination points > of their procedures (call these 'early returns') have a flow and > execution which is not easy to understand.... I think you have provided sufficient examples of cases in which the opposite is true. > (2) Proof techniques or constructively correct design procedures > are not easily applied to these constructs because they have complex > semantic definitions.... I refer again to my _reductio ad absurdum_ argument. > (3) Reading programs with EXIT statements and early returns can be > difficult, especially if long procedure bodies are involved. Indeed. It can be almost as difficult as reading procedures with `extra' booleans that effect the same control flow. > Finding all the relevant embedded statements and in the case of EXITs > determining which loop they belong to will require visual parsing of the > text.... Understanding the code requires a parse and semantic analysis anyway. To quote a certain old lady, `Where's the beef?' > (4) Other reasons for the constructs usually put forward are either > convenience or efficiency. Both arguments are at odds with the primary > aims of Modula-2, which are clarity, abstraction, and correctness.... See the answer to (1). > (5) Most programmers who are taught programming in accordance with > modern computer science concepts have thought patterns which never > evoke them." This is good? >In summary I think that formalism, when correctly applied, is >wonderful. But formalism works best when applied to simple code. >Before jumping into a formal analysis of code, I like to try to >ensure that the code is written as simply and as easy to understand >as possible. This precondition increases the chances that further >formal analysis will supply insight. Or to put it another way: why spend hours proving clumsy code when you can prove a rewrite in a few minutes? -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516) UUCP: seismo!umcp-cs!chris CSNet: chris@umcp-cs ARPA: chris@mimsy.umd.edu
smd@umcp-cs.UUCP (Stanley Dunn) (07/09/86)
In article <2308@umcp-cs.UUCP> chris@maryland.UUCP (Chris Torek) writes: > >[Incidentally, the introductory programming course at the University >of Maryland actually *does* avoid integers. For their first year, >students use something called `CF Pascal', in which the only legal >data types are characters and files. For the first semester course (CMSC 112) the students use CF Pascal. In the second semester course (CMSC 113) the students are exposed tothe full language in a disciplined fashion. > The students are supposed to >be able to prove their programs correct, and having few data types >(one) and `Turing-tape-like' stores (files) helps keep the proof >complexity down. They are able to prove their programs correct. But more importantly, they have learned how to solve problems rigorously. > I do not believe that these restrictions are >sensible beyond the stage of learning to write and prove algorithms, >and I am not even certain they are sensible there. It does provoke >novel solutions to old problems, however.] > I am sure that many others agree, however this is exactly the point of view that people have to change. Many students feel that CF Pascal is a "crutch" and that they should be allowed to use the full language. At least the students have an excuse in that they are young and inexperienced. The fact of the matter is that the opposite is true - everything beyond CF Pascal are merely shorthand notations for cumbersome operations in CF Pascal, and that if the programmer has mastered the concept of Procedural Abstraction there should be no conceptual difference. Learning a science is not learning all the details about the tools you are working with - that sort of training makes you a technician. To be a scientist or an engineer requires that you have a deeper level of understanding of the properties of the tools you are using to solve problems. Computer Science should be no different, and CF Pascal helps development the thought processes of the students (with appropriate exercises, of course). The projects you use have to be geared for CF Pascal without doing any arithmetic or random access manipulation. Two examples of projects I gave my students to write were an interpreter and a program for symbolic evaluaton of boolean expressions. There are many problems that are hard to solve because the concepts are difficult, and not because of CF Pascal. In fact, many of the projects I used were just character and string processing. This approach to teaching should not be condemmed without a thorough understanding of the material, after which you can give a mature, educated evaluation. Next week a short course is being offered here (Dept. of Computer Science, UMCP) for those interested in teaching Computer Science with the Mills' text and this structured approach (CF / D / O Pascal). Harlan Mills, Vic Basili, John Gannon, and myself will be going over the text, and relating personal experience from teaching the two courses CMSC 112 and CMSC 113 here.
chris@umcp-cs.UUCP (Chris Torek) (07/09/86)
>In article <2308@umcp-cs.UUCP> chris@maryland.UUCP (Chris Torek) writes: >>Incidentally, the introductory programming course at the University >>of Maryland actually *does* avoid integers. For their first year, >>students use something called `CF Pascal', in which the only legal >>data types are characters and files. In article <2313@umcp-cs.UUCP> smd@maryland.UUCP (Stanley Dunn) writes: >For the first semester course (CMSC 112) the students use CF Pascal. >In the second semester course (CMSC 113) the students are exposed >to the full language in a disciplined fashion. All right, one semester; that sounds a bit better to me, although I do not know just what length of time is `best'. Less than a semester is likely to be too short; more than a year is likely to be too long. >> The students are supposed to >>be able to prove their programs correct, and having few data types >>(one) and `Turing-tape-like' stores (files) helps keep the proof >>complexity down. >They are able to prove their programs correct. You mean those who make it through the course are able. There will always be those who do not make it---thus the words `supposed to be'. >But more importantly, they have learned how to solve >problems rigorously. This is true, but was not relevant to the original context to which the quoted text was `incidental' (see the first quoted word!). >> I do not believe that these restrictions are >>sensible beyond the stage of learning to write and prove algorithms, >>and I am not even certain they are sensible there. It does provoke >>novel solutions to old problems, however. >I am sure that many others agree, however this is exactly >the point of view that people have to change. Many others agree with what? (Hold on to that thought for a moment.) >Many students feel that CF Pascal is a "crutch" and that they should >be allowed to use the full language. At least the students have an >excuse in that they are young and inexperienced. The fact >of the matter is that the opposite is true - everything beyond >CF Pascal are merely shorthand notations for cumbersome operations >in CF Pascal, and that if the programmer has mastered the >concept of Procedural Abstraction there should be no conceptual >difference. Back to the first paragraph: agree with what? That the restrictions are not sensible once a programmer understands algorithms? Are you arguing that they are? I do not believe you are. Rather, you seem to think that I said that the CF Pascal restrictions are not sensible in an introductory programming course. I DID NOT SAY THAT. I said that I am not positive that they are sensible. I *think* they are, but I cannot be certain: especially not certain, given my limited experience with the results. I *am* certain that `shorthand ... for cumbersome operations' *is* necessary in order to get things done. It is conceptually equivalent for me to type in strings of ones and zeros now rather than ASCII characters, but I would never get anything done that way. I could write a Turing machine simulator and write all my code in that: it is conceptually equivalent to (say) C. I would never get any useful work done though. A proper foundation is essential, but I am not going to live in an unroofed basement. >Learning a science is not learning all the details about >the tools you are working with - that sort of training >makes you a technician. To be a scientist or an engineer >requires that you have a deeper level of understanding >of the properties of the tools you are using to solve problems. Agreed. >Computer Science should be no different, and CF Pascal >helps development the thought processes of the >students (with appropriate exercises, of course). You seem very confident---but then you have taught the course and seen the results. All I have seen is `from the outside', where it looks good, but I am not willing to make a final decision. >This approach to teaching should not be condemmed without >a thorough understanding of the material, after which you >can give a mature, educated evaluation. I was not, I *am* not, condemning it! The original context to which my remarks were `incidental' concerned program proofs; I wanted to mention CF Pascal because it is such a provable language. It may also be a good language for teaching computer science; I *think* so, but I will not say `this is incontrovertible fact'. I repeat: I am not *certain* that the restrictions in CF Pascal are sensible in a first programming course. If you are, fine; but you have more information than I. (Sorry to go on for so long, but I hate being misinterpreted.) >Next week a short course is being offered here (Dept. of Computer >Science, UMCP) for those interested in teaching Computer Science with >the Mills' text and this structured approach (CF / D / O Pascal). How does one find out about these things? I have participated in a few short courses on Unix and C; but I heard of them only by being asked to help teach! -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516) UUCP: seismo!umcp-cs!chris CSNet: chris@umcp-cs ARPA: chris@mimsy.umd.edu