chris@mimsy.UUCP (Chris Torek) (10/06/88)
In article <993@esunix.UUCP> bpendlet@esunix.UUCP (Bob Pendleton) writes:
>... I'm in favor of defining the
>behavior of every operator in a language on all of its operand set.

Why?

>Since NULL can be stored in a pointer, the actions of all pointer
>operators when applied to NULL should, in my opinion, be defined.

I disagree.

In particular, with regard to the more general statement, if we can
improve the performance of a language on existing architectures by
explicitly leaving certain semantics undefined, should we do so?  The
argumentam [? ablative case anyway] pro is simple: programs run X%
faster.  The argumentam con appears to be that programmers will use
the construct anyway.  So what?  Those programs are then by definition
not portable and may be considered just so many random bits: worthless.
It is up to the programmer, and the buyer of programs, to make sure
that programs in this language do not depend on undefined semantics.

Just because I can use a knife as a screwdriver does not mean that all
knives should also be screwdrivers. . . .
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu    Path: uunet!mimsy!chris
steveb@cs.utexas.edu (Steve Benz) (10/06/88)
In article <13889@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>In article <993@esunix.UUCP> bpendlet@esunix.UUCP (Bob Pendleton) writes:
>...
>>Since NULL can be stored in a pointer, the actions of all pointer
>>operators when applied to NULL should, in my opinion, be defined.
>...
>It is up to the programmer, and the buyer of programs, to make sure
>that programs in this language do not depend on undefined semantics.

First off -- not all operations *can* be "defined" -- for instance,
square root of a real number cannot be defined over all real numbers.
But that's mostly due to the definition of "define" which seems to be
in vogue in this discussion.  In fact, square root of a negative number
is defined -- to an error.

The difficulty here is that not all systems recover or recognize errors
in the same way.  HOWEVER!  That should not be considered a factor in
this discussion.  Vendors that sell software with semantic errors are
selling software with bugs.  Requiring every machine to detect and
recover from semantic errors in the same way only helps in that bugs
will have the same symptoms on every machine.

I would simply recommend that when vendors test their software, they
turn on compiler switches which enable strict semantic checking
(i.e. turn on stray pointer checking, index checking, 0-divide checking,
negative root checking....)

				- Steve Benz
lee@uhccux.uhcc.hawaii.edu (Greg Lee) (10/07/88)
From article <13889@mimsy.UUCP>, by chris@mimsy.UUCP (Chris Torek):
" In article <993@esunix.UUCP> bpendlet@esunix.UUCP (Bob Pendleton) writes:
" >... I'm in favor of defining the
" >behavior of every operator in a language on all of its operand set.
" ...
" I disagree.
"
" In particular, with regard to the more general statement, if we can
" improve the performance of a language on existing architectures by
" explicitly leaving certain semantics undefined, should we do so? The
" argumentam [? ablative case anyway] pro is simple: programs run X%
" faster. ...
Why would programs run any faster in a language with some undefined
semantics? To make a comparison, the same programs would have to
run in both versions of the language, and so could not make any
use of the semantics undefined in one of the versions. So why
then would the definition of semantics cause a program which does
not use it to run slower? I suppose you could arrange things that
way, but why would you need to, or want to?
" The argumentam con appears to be that programmers will use
" the construct anyway. So what? Those programs are then by definition not
" portable and may be considered just so many random bits: worthless.
" It is up to the programmer, and the buyer of programs, to make sure
" that programs in this language do not depend on undefined semantics.
This appears to assume your conclusion -- that a proper version
of the language may leave syntactically correct constructions undefined.
Greg, lee@uhccux.uhcc.hawaii.edu
chris@mimsy.UUCP (Chris Torek) (10/07/88)
>In article <13889@mimsy.UUCP> I wrote:
>>... if we can improve the performance of a language on existing
>>architectures by explicitly leaving certain semantics undefined,
>>should we do so?

In article <2472@uhccux.uhcc.hawaii.edu> lee@uhccux.uhcc.hawaii.edu
(Greg Lee) argues with my arguments.  Let me take the last first:

>This appears to assume your conclusion -- that a proper version
>of the language may leave syntactically correct constructions undefined.

All nontrivial languages have this property.  What is 1/0?  0/0?  What
sort of rounding modes does the hardware use for floating point?  Ones
complement or two?

>Why would programs run any faster in a language with some undefined
>semantics?

Put it this way: if the language defines 0/0 as `runtime error, code =
INDETERMINATE' while `1/0' is `runtime error, code = DIV BY ZERO', but
some hardware has imprecise faults and makes all divide by zero errors
a single code, then to implement the language correctly on this
hardware, we must check every divide before dividing unless we can
prove that neither the numerator nor the denominator are zero.

>To make a comparison, the same programs would have to
>run in both versions of the language, and so could not make any
>use of the semantics undefined in one of the versions.

This is the thrust behind the `argumentam con'.  Recall that this
whole discussion came up when someone said that the action of
*(type *)NULL in C should be defined: it should do the same thing on
every machine.  The only reason to define it is because people use it.
These people, and their programs, are wrong; but they do exist.  And
if you do not know that a program depends on strcmp((char *)0,"f(")==0,
and sell this program without testing it on something other than a 3B
first, then those who buy it and try to run it on a Sun or a Vax will
lose out.  (Assuming *(char *)0 == "f(" is a real example from a real
program!)

>So why then would the definition of semantics cause a program which does
>not use it to run slower?

It requires two assertions to hold.  First, the machine's `native'
semantics must not match---easily demonstrated for *(type *)NULL and
for division by zero and for uninitialised local variables and so
forth.  Second, and more importantly, it must be sufficiently hard to
predict whether a program does in fact use the `nonnative' semantics.
Again, this is easily demonstrated for *(type *)NULL and for division
by zero and so forth:

	f() {
		float a, b = g() ? 1. : 0.;
		a = 1./b;
	}

Does this program depend on the semantics of 1.0/0.0?  If the language
defines it and your machine does not match the language's definition,
will you have to insert runtime checks?  And if you must insert runtime
checks, will not the program run slower?
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu    Path: uunet!mimsy!chris
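To make the cost of such checks concrete, here is a minimal sketch,
assuming a C-like source language; the error routine `runtime_error'
and the error-code strings are invented for illustration, not taken
from any real implementation:

	#include <stdio.h>
	#include <stdlib.h>

	/* Hypothetical error routine; name and codes are made up. */
	static void runtime_error(const char *code)
	{
		fprintf(stderr, "runtime error: %s\n", code);
		exit(EXIT_FAILURE);
	}

	/* What `a = 1./b;' might have to become if the language pins
	 * down distinct error codes but the hardware lumps all divide
	 * faults together under one imprecise fault. */
	double checked_div(double num, double den)
	{
		if (den == 0.0)
			runtime_error(num == 0.0 ? "INDETERMINATE"
						 : "DIV BY ZERO");
		return num / den;
	}

On hardware whose divide fault already distinguishes the two cases,
none of this test code is needed; on hardware with a single imprecise
fault, every division the compiler cannot prove safe pays for it.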
lamaster@ames.arc.nasa.gov (Hugh LaMaster) (10/07/88)
In article <3481@cs.utexas.edu> steveb@cs.utexas.edu (Steve Benz) writes:
>
>I would simply recommend that when vendors test their software, they
>turn on compiler switches which enable strict semantic checking
>(i.e. turn on stray pointer checking, index checking, 0-divide checking,
>negative root checking....)

An excellent practice.  There is a wide disparity among different
vendors' compilers with respect to the number of such switches.  Such
switches are very useful.

Another switch that I would like to see added to many compilers is one
found on Cray compilers: truncate.  This switch affects the code
generation phase and masks off the least significant bits of floating
point expressions to any desired significance.  You would be surprised
by how many errors this can catch; it is a very fast way for an
engineer or scientist to check that some change to a code hasn't
affected the stability of the results.

  Hugh LaMaster, m/s 233-9,   UUCP ames!lamaster
  NASA Ames Research Center   ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035     Phone: (415)694-6117
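A rough sketch of the effect of such a truncate switch, written in
modern C and assuming IEEE-style 64-bit doubles; the function name is
invented, and the real Cray switch works in the code generator rather
than through a library call like this:

	#include <stdint.h>
	#include <string.h>

	/* Zero the low `drop' mantissa bits of x (0 <= drop < 52
	 * assumed).  Running a computation once unmodified and once
	 * with, say, drop = 16, then comparing the results, gives a
	 * quick check of numerical stability. */
	static double truncate_bits(double x, int drop)
	{
		uint64_t bits;

		memcpy(&bits, &x, sizeof bits);
		bits &= ~((UINT64_C(1) << drop) - 1);
		memcpy(&x, &bits, sizeof x);
		return x;
	}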
lee@uhccux.uhcc.hawaii.edu (Greg Lee) (10/07/88)
From article <13898@mimsy.UUCP>, by chris@mimsy.UUCP (Chris Torek):
"...
" In article <2472@uhccux.uhcc.hawaii.edu> lee@uhccux.uhcc.hawaii.edu
"...
I see.  I was thinking only of cases where definedness is predictable
from the form of an expression -- (non-)orthogonality of instruction
sets.  All the same, I still say the second argument was circular.
		Greg, lee@uhccux.uhcc.hawaii.edu
chris@mimsy.UUCP (Chris Torek) (10/09/88)
In article <16603@shemp.CS.UCLA.EDU> casey@admin.cognet.ucla.edu
(Casey Leedom) writes:
>  As Steve pointed out, what we're probably dealing with is a
>misunderstanding about the usage of the word "define".

Probably true.  Someone wanted a Machine Independent Intermediate
Language to include assertions that would pin down the exact action of
various `undefined' actions (1/0, *(type*)NULL, etc).  I claim that
this is, or should be, unnecessary (regardless of the feasibility of
MIILs).  That these actions are `defined as undefined' is not self
contradictory, but it certainly does make for misunderstandings!

>... whether you get floating point truncation or rounding (and if
>this isn't one of the ANSI C machparam.h defined machine constants,
>it should be (or is that a POSIX standard?)),

This one is in <float.h>; others are in <limits.h>.  The name
<sys/machparam.h> is not in the ANSI standard (wrong directory, and
more important, too many characters---the magic limit is six, not
including the `.h' part).

>  I think that all Steve is asking is that we don't leave any loop
>holes.

Standards have very particular language for this; the key words are
`defined', `implementation defined', and `undefined'.  Defined
behaviour is simply that: integer multiplication of 2 and 3 always
gives 6.  Implementation-defined means that the standard itself does
not say what you get, but that the system must tell you what happens:
-1>>1 might be -1; it might be 32767; it might be something else; but
-1>>1 will always be something predictable (and usually chosen from a
limited set of possibilities).  `Undefined' means that anything can
happen: to borrow one of my own silly examples, maybe the machine
sometimes explodes, sending shrapnel through the computer room.

When I say `leave some semantics undefined', I mean in this last
sense.  We will tell you that *(type *)NULL is unpredictable; we will
not even limit it to (type)0 or `segmentation fault, core dumped'.  If
you use it, it is your error.  By making it undefined, we need not pin
down the semantics of *(type *)NULL in a MIIL for that language.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu    Path: uunet!mimsy!chris
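A small example of the `implementation defined' case, assuming nothing
beyond a hosted C compiler; each implementation must document which
result it gives:

	#include <stdio.h>

	int main(void)
	{
		int x = -1;

		/* Right-shifting a negative value is implementation-
		 * defined: an arithmetic shift gives -1, a logical
		 * shift gives a large positive value (32767 with
		 * 16-bit int), but whichever an implementation does,
		 * it must document it and do it consistently. */
		printf("-1 >> 1 = %d\n", x >> 1);
		return 0;
	}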
henry@utzoo.uucp (Henry Spencer) (10/09/88)
In article <2472@uhccux.uhcc.hawaii.edu> lee@uhccux.uhcc.hawaii.edu (Greg Lee) writes:
>Why would programs run any faster in a language with some undefined
>semantics?  To make a comparison, the same programs would have to
>run in both versions of the language, and so could not make any
>use of the semantics undefined in one of the versions...

But the compiler doesn't necessarily know this, so it may have to take
precautions that in fact are unnecessary but nevertheless slow down the
code.  Checking all pointer dereferences for NULL pointers, for example.
The reason for leaving some semantics undefined is to avoid penalizing
*all* programs for the sake of predictable behavior of the few that
bend the rules.
-- 
The meek can have the Earth;     | Henry Spencer at U of Toronto Zoology
the rest of us have other plans. | uunet!attcan!utzoo!henry  henry@zoo.toronto.edu
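To make that penalty concrete, here is a minimal sketch, assuming a
language rule that required one specific, defined behavior for
dereferencing NULL on a machine that does not trap loads from address
zero; the helper name is invented for illustration:

	#include <stdio.h>
	#include <stdlib.h>

	/* Hypothetical routine the compiler would arrange to call;
	 * the name and behavior are invented. */
	static void null_deref_error(void)
	{
		fprintf(stderr, "runtime error: dereference of NULL\n");
		exit(EXIT_FAILURE);
	}

	int deref(const int *p)
	{
		/* The programmer wrote only `return *p;'.  The
		 * compiler must add the test below to every
		 * dereference it cannot prove non-NULL, whether or
		 * not the program ever passes a NULL pointer. */
		if (p == NULL)
			null_deref_error();
		return *p;
	}

Every such test costs something on every dereference, which is exactly
the penalty paid by programs that never bend the rules.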
bpendlet@esunix.UUCP (Bob Pendleton) (10/10/88)
From article <13889@mimsy.UUCP>, by chris@mimsy.UUCP (Chris Torek):
> In article <993@esunix.UUCP> bpendlet@esunix.UUCP (Bob Pendleton) writes:
>>... I'm in favor of defining the
>>behavior of every operator in a language on all of its operand set.
>
> Why?

So that I can know what the program is going to do without actually
having to run it.  It is simply not possible to test a program on
every machine that it might some day be run on.  How can I honestly
sell a program in MIIL or even in source code if I can't give some
assurance that the program will run correctly?

Defining the behavior of an operator on a specific subrange of
operands as a run time error is acceptable, and useful.  Specifically,
1/0, 0/0, *NULL, and sqrt(-1) should all, in my humble? opinion, cause
specific run time errors.

> In particular, with regard to the more general statement, if we can
> improve the performance of a language on existing architectures by
> explicitly leaving certain semantics undefined, should we do so?

NO!

> The
> argumentam [? ablative case anyway] pro is simple: programs run X%
> faster.  The argumentam con appears to be that programmers will use
> the construct anyway.  So what?  Those programs are then by definition not
> portable and may be considered just so many random bits: worthless.
> It is up to the programmer, and the buyer of programs, to make sure
> that programs in this language do not depend on undefined semantics.

Trying to avoid the religious argument, let's look only at economics.
A year of software engineer time costs ~$100,000.  Your mileage may
vary.  I've heard numbers ranging from $70K to $150K.

If I'm paying to develop code, I can't take the chance that because of
a poor language definition my multimillion dollar program will turn
out to be just random worthless bits on the next generation of
computers.  Or, even this generation's computers from some other
vendor.

For the cost of one year of porting effort I can afford to pay for an
awful lot of extra machine cycles.  Not to mention that every year
that passes, the cost of the cycles drops and the cost of engineering
time increases.  So the argument gets stronger every day.

This argument fails when you start talking about programs that are
only barely possible on existing computers.  But, that is the
exception, not the rule.
-- 
              Bob Pendleton, speaking only for myself.
An average hammer is better for driving nails than a superior wrench.
When your only tool is a hammer, everything starts looking like a nail.

UUCP Address:  decwrl!esunix!bpendlet or utah-cs!esunix!bpendlet
bpendlet@esunix.UUCP (Bob Pendleton) (10/10/88)
From article <13914@mimsy.UUCP>, by chris@mimsy.UUCP (Chris Torek):
> In article <16603@shemp.CS.UCLA.EDU> casey@admin.cognet.ucla.edu
> (Casey Leedom) writes:
>> As Steve pointed out, what we're probably dealing with is a
>>misunderstanding about the usage of the word "define".
>
> Probably true.  Someone wanted a Machine Independent Intermediate
                  ^^^^^^^
Fame is so short lived.  Leave the discussion for 48 hours and you
become just an anonymous "someone."

> Language to include assertions that would pin down the exact action of
> various `undefined' actions (1/0, *(type*)NULL, etc).  I claim that
> this is, or should be, unnecessary (regardless of the feasibility of
> MIILs).  That these actions are `defined as undefined' is not self
> contradictory, but it certainly does make for misunderstandings!

Before I go on, let me say that I've realized that I've been using the
term portable to mean machine independent.  Portable should mean
something like "can be moved to a new machine with some effort," and
machine independent should mean something like "can be moved to a new
machine with at most a recompilation."

Defined as undefined is a perfectly valid way to define the actions of
an operator on particular subranges, like * on NULL and square root on
negative numbers.  But, from the point of view of trying to make
programs machine independent, it is a useless definition.  This kind of
definition forces me to test a program on every existing machine
architecture, using every existing compiler for each architecture, and
even on every new release of each compiler, before I can claim that
the program is machine independent.

Using a language that has features that are defined as undefined, and
having only limited resources, I can have no assurance that the program
is machine independent.  I can only hope that it is portable.

Clearly, I hope, there is a gap between the goals of a MIIL (Machine
Independent Intermediate Language) and a language that is defined in
such a way as to be machine dependent.  This gap requires that the
machine dependent parts of such a language must be bound to MIIL
dependent definitions before the language can be used as a source
language for compilation to MIIL.
-- 
              Bob Pendleton, speaking only for myself.
An average hammer is better for driving nails than a superior wrench.
When your only tool is a hammer, everything starts looking like a nail.

UUCP Address:  decwrl!esunix!bpendlet or utah-cs!esunix!bpendlet
nevin1@ihlpb.ATT.COM (Liber) (10/12/88)
In article <2472@uhccux.uhcc.hawaii.edu> lee@uhccux.uhcc.hawaii.edu (Greg Lee) writes:
>Why would programs run any faster in a language with some undefined
>semantics?  To make a comparison, the same programs would have to
>run in both versions of the language, and so could not make any
>use of the semantics undefined in one of the versions.  So why
>then would the definition of semantics cause a program which does
>not use it to run slower?

The things that usually (never make blanket all-encompassing
statements; uh, I mean, *almost* never ... :-)) need defining are
exceptions to a rule (eg: dereferencing NULL is an exception to the
dereferencing operation).  If exceptions are to be defined, they are
usually defined so that they are consistent with the rest of the
rules.  In mathematics, for example, 1 is defined to not be prime, 0!
is defined to be 1, x to the 0 power (x<>0) is defined to be 1, etc.,
because the rest of the rules need not make exceptions for these cases
(the rules for what a prime is, what factorial means, what
exponentiation means, etc., however, have these exceptions).

So, given that *NULL is an exception to dereferencing, what would be a
good way to define it?  One way is to allow it to be dereferenced just
like any other pointer.  What does this definition give you?  With this
definition, you never (at the compiler level) have to check to see
whether or not the value of a pointer is NULL.  Now, what does this
definition mean in terms of the semantics of the high-level language?
Since pointers and memory locations are OS/machine/compiler dependent,
the semantics of *NULL are implementation-dependent, and are considered
undefined in terms of the semantics of the language.

Most (if not all) of the other definitions of *NULL on the high-level
(semantics of C) cause more exceptions (which translates into slower
and larger code) to be required on the lower-level (implementation
level).
-- 
 _   __    NEVIN J. LIBER  ..!att!ihlpb!nevin1  (312) 979-4751  IH 4F-410
' )  )     "I catch him with a left hook.  He eels over.  It was a fluke, but there
 /  / _ , __o  ____   he was, lying on the deck, flat as a mackerel - kelpless!"
/  (_</_\/ <__/ / <_  As far as I know, these are NOT the opinions of AT&T.
lee@uhccux.uhcc.hawaii.edu (Greg Lee) (10/13/88)
From article <8916@ihlpb.ATT.COM>, by nevin1@ihlpb.ATT.COM (Liber):
" In article <2472@uhccux.uhcc.hawaii.edu> lee@uhccux.uhcc.hawaii.edu (Greg Lee) writes:
"
" >Why would programs run any faster in a language with some undefined
" >semantics? To make a comparison, the same programs would have to
" ...
" like any other pointer. What does this definition give you? With this
" definition, you never (at the compiler level) have to check to see
" whether or not the value of a pointer is NULL. Now, what does this
Hmmm. I thought the main point here was that you can't check this at
compile time, except occasionally for constant expressions, and so
it would have to be checked at run time, slowing up the program.
Greg, lee@uhccux.uhcc.hawaii.edu
news@ism780c.isc.com (News system) (10/13/88)
In article <8916@ihlpb.ATT.COM> nevin1@ihlpb.UUCP (55528-Liber,N.J.) writes:
>
>So, given that *NULL is an exception to dereferencing, what would be a
>good way to define it?

Might I suggest the definition used in the Pascal Standard.  First,
the term 'error' means an error in the source program.  The standard
says:

    "It is an error if the pointer-variable of an identified-variable
    denotes a nil-value."

Or translated into English, it is an error to dereference a pointer to
no object.  The standard also says:

    "A complying processor is required to provide documentation
    concerning its treatment of errors."

So the way in which a processor treats an error is up to the
implementor; the only requirement is that the implementor must inform
the user.  A user who wants erroneous programs to behave the same way
on all implementations is just plain out of luck.

I hope the proposed C standard says approximately the same thing about
erroneous source programs.

    Marv Rubinstein
news@ism780c.isc.com (News system) (10/14/88)
In article <1000@esunix.UUCP> bpendlet@esunix.UUCP (Bob Pendleton) writes:
>Defined as undefined is a perfectly valid way to define the actions of
>an operator on particular subranges, like * on NULL and square root on
>negative numbers.  But, from the point of view of trying to make
>programs machine independent it is a useless definition.  This kind of
>definition forces me to test a program on every existing machine
>architecture, using every existing compiler for each architecture, and
>even on every new release of each compiler, before I can claim that
>the program is machine independent.

Defining it won't make it so.  Even if I wrote programs in a language
where the semantics for every legal syntactical construct were
*EXACTLY* specified, I would still test my program on every combination
of machine and compiler for which I would like to guarantee that my
program runs correctly.  For although I never make mistakes :-), I
cannot say the same for the people who write operating systems and
compilers.  (I have even encountered hardware that does not work as
advertised.)  Would anyone guarantee a program distributed in a form
that required that the customer compile it and link it before using it?

    Marv Rubinstein