jrv@siemens.UUCP (04/07/86)
>If R and L are longs, and I is an int, then the expression > > L = R + 2 + i; > >is non-portable. Lint, however, does not complain about this. I must be missing something. Why is this non-portable? Jim Vallino Siemens Research and Technology Lab. Princeton, NJ {allegra,ihnp4,seismo,philabs}!princeton!siemens!jrv
tim@ism780c.UUCP (Tim Smith) (04/15/86)
In article <32700003@siemens.UUCP> jrv@siemens.UUCP writes: >>If R and L are longs, and i is an int, then the expression >> >> L = R + 2 + i; >> >>is non-portable. Lint, however, does not complain about this. > >I must be missing something. Why is this non-portable? > Since + is associative and commutative, the compiler is free to re-order the expression. If you are on a machine where and int is shorter than a long, and if the compiler decided to do the 2+i first, you may get an overflow, whereas if the compiler does the R+2 first there is no problem. To make this expression portable, it needs to be L = R + 2 + (int)i; or L = R + 2L + i; -- Tim Smith sdcrdcf!ism780c!tim || ima!ism780!tim || ihnp4!cithep!tim
gwyn@BRL.ARPA (VLD/VMB) (04/19/86)
No, the compiler is not free to associate right-to-left in an expression long + int + int. Addition associates left-to-right, and the integral widening conventions apply, so the middle term must be converted to long and added to the leftmost term before the rightmost term is added. The compiler is allowed to reorder this only if it guarantees the same answer as would be obtained by following the rules. On machines where integer overflow is ignored, the opportunity arises more frequently than on those where it traps. However, in this example, information could be lost by overflow if the compiler treated the expression as long + (int + int) so it is not allowed to do that.
tim@ism780c.UUCP (Tim Smith) (04/22/86)
When I ask various people about what compilers can do to LONG + INT1 + INT2 I get two different answers. The first is that the compiler can play games with associativity and commutativity before it changes the ints to longs. The second is that it must change INT2 to a long because of the left associative nature of addition. Then it can use associativity and commutativity to change things. Reading K&R, I can convince myself of either answer. If the first case is correct, then this is a non-portable expression, and I think lint should warn about it. In the second case, there is no problem with that expression, but how about this one: INT1 + INT2 + LONG I should have used this one in my original posting. On a machine with different sizes for ints and longs, this expression can have problems. Lint says nothing about this. Is this a problem with lint, or am I (again) putting my foot in my mouth? -- Tim Smith sdcrdcf!ism780c!tim || ima!ism780!tim || ihnp4!cithep!tim
greg@utcsri.UUCP (Gregory Smith) (04/22/86)
In article <130@brl-smoke.ARPA> gwyn@BRL.ARPA (VLD/VMB) writes: >No, the compiler is not free to associate right-to-left in an >expression > long + int + int. >Addition associates left-to-right, and the integral widening >conventions apply, so the middle term must be converted to long >and added to the leftmost term before the rightmost term is >added. The compiler is allowed to reorder this only if it >guarantees the same answer as would be obtained by following >the rules. On machines where integer overflow is ignored, If this is strictly true, then the widening semantics are applied to the expression tree before any reassociation is done: long L; int I1,I2; L+I1+I2, parses as (L+I1)+I2, after widening is (L+(long)I1)+(long)I2 The compiler can still do this as L + ( (long)I1+(long)I2), though, since everything is now the same width and so the result won't be affected. >the opportunity arises more frequently than on those where >it traps. However, in this example, information could be >lost by overflow if the compiler treated the expression as > long + (int + int) >so it is not allowed to do that. What if you *write* L+(I1+I2)? If the widening semantics *are* applied before any reassociation is done, the result is L+(long)(I1 + I2), using an integer add for I1+I2, which presumably may not be re-ordered because the result would be affected - besides it is no longer of the form a+(b+c). However, I tried this ( using 16-bit short ints for I1,I2 ) on our native vax compiler and on our 68K compiler. Both produced identical code for (L+I1)+I2 and L+(I1+I2) and L+I1+I2: specifically, ((long)I1+L)+(long)I2. I guess the a+b+c is recognized as a special case, and all three are widened to a common width regardless of the original associativity. Interestingly, writing the expression as L+(short)(I1+I2) had no effect on the vax code, but forced the 68K compiler to do a 16-bit add of I1 and I2 before adding L. The 68K code is what I would expect - although deliberate use of casts as %65536 operators is pretty shaky stuff. A final question - It seems that L+I1+I2 will work regardless of if and where you put the ()'s, since a 'chained' addition is treated as a special case. But is this a feature of the language, or of the compiler? Any 'portability' which is based on a non-essential compiler feature is not really true portability, is it...? -- "If you aren't making any mistakes, you aren't doing anything". ---------------------------------------------------------------------- Greg Smith University of Toronto UUCP: ..utzoo!utcsri!greg
henry@utzoo.UUCP (Henry Spencer) (04/23/86)
> No, the compiler is not free to associate right-to-left in an > expression > long + int + int. > Addition associates left-to-right, and the integral widening > conventions apply, so the middle term must be converted to long > and added to the leftmost term before the rightmost term is > added. The compiler is allowed to reorder this only if it > guarantees the same answer as would be obtained by following > the rules... Sorry, Doug, not so. Not in the draft standard, anyway, unless serious changes have been made since my last copy (November). Yes, there is a statement early on to the effect that "the compiler can do things any way it pleases if the results are the same". However, somewhat later, in C.3, we also find: An expression involving more than one occurrence of the same commutative and associative binary operator (*, +, &, ^, |) may be regrouped arbitrarily... provided the types of the operands or of the results are not changed by this regrouping. Note that no mention is made of the types of intermediate values, unless that ambiguous reference to "results" instead of "result" is meant to imply it. And there is no constraint that the value of the expression be unchanged. -- Support the International League For The Derision Henry Spencer @ U of Toronto Zoology Of User-Friendliness! {allegra,ihnp4,decvax,pyramid}!utzoo!henry
gwyn@BRL.ARPA (04/23/86)
The reason "lint" doesn't warn about possible overflow on INT + INT + LONG is that ANY addition/subtraction can potentially overflow. "lint" warns when a (long) is being converted implicitly to a shorter integral type, because this is often a source of inadvertent bugs. But if it were to warn about every addition, people would stop using "lint".
ARPA@brl-smoke (04/28/86)
Re: Long + (Int1 + Int2) This really does call for Int1 and Int2 to be added first, without widening to (long), then the sum widened and added to Long. You have to watch out when trying to test this on a system where sizeof(int)==sizeof(long), though: the "integral widening conventions" will cause (short)s to be widened to (int) before adding if you're trying to test this by making Int1 and Int2 (short)s. The Ritchie PDP-11 C compiler does correctly distinguish between Long + (Int1 + Int2) and Long + Int1 + Int2 which result in different machine instruction sequences. An additional note: I've had a lot of trouble with code semantics getting screwed up by PCC's derived from Berkeley's (which in turn seems to have been borrowed from USG 3.0). Types don't propagate correctly in expressions, casts are ignored, etc. A real mess. I tried using the 4.2BSD PCC to generate code for my UNIX System V emulation, but finally out of disgust adapted the SVR2 PCC instead. I know Donn Seeley has been working on the Berkeley PCC, so perhaps the 4.3BSD version will be better. Why Berkeley insists on not using any AT&T software released after 1979 I don't know; their "licensing" argument doesn't hold water. I would urge all UNIX system vendors to start with SGS2/PCC2, or at least the latest AT&T "old PCC", rather than the 4.2BSD PCC that some seem to have used.
ARPA@brl-smoke (04/28/86)
"provided the types of the operands ... are not changed by this regrouping" to me covers the case of a change in "widening" also.
bright@dataioDat (04/29/86)
In article <2609@utcsri.UUCP> greg@utcsri.UUCP (Gregory Smith) writes: >However, I tried this ( using 16-bit short ints for I1,I2 ) on our >native vax compiler and on our 68K compiler. Both produced identical code >for (L+I1)+I2 and L+(I1+I2) and L+I1+I2: specifically, ((long)I1+L)+(long)I2. >I guess the a+b+c is recognized as a special case, and all three are widened >to a common width regardless of the original associativityy Shorts in C are always converted to ints before they are used. On the vax and most 68k compilers, ints are the same as longs. Therefore, this isn't any 'special case' recognized by the compiler. Different code for (L+I1)+I2 and L+(I1+I2) will only be generated if ints are smaller than longs.
greg@utcsri.UUCP (Gregory Smith) (05/05/86)
In article <972@dataioDataio.UUCP> bright@dataio.UUCP (Walter Bright writes: >In article <2609@utcsri.UUCP> greg@utcsri.UUCP (Gregory Smith) writes: >>However, I tried this ( using 16-bit short ints for I1,I2 ) on our >>native vax compiler and on our 68K compiler. Both produced identical code >>for (L+I1)+I2 and L+(I1+I2) and L+I1+I2: specifically, ((long)I1+L)+(long)I2. >>I guess the a+b+c is recognized as a special case, and all three are widened >>to a common width regardless of the original associativityy > >Shorts in C are always converted to ints before they are used. On the >vax and most 68k compilers, ints are the same as longs. Therefore, this >isn't any 'special case' recognized by the compiler. > >Different code for (L+I1)+I2 and L+(I1+I2) will only be generated if >ints are smaller than longs. You're quite right about the vax and 68K - since there's nothing wider than an int, you can't get in trouble. This hadn't occurred to me. Maybe someone with a PDP11 compiler could play with this and tell us what happens. About the 'special case', though, let me quote 'A Tour through the UNIX C Compiler', Dennis Ritchie ( about the PDP-11 compiler ): The *acommute* routine, called for associative and commutative operators, discovers clusters of the same operator at the top level of the current tree, and arranges them in a list: for `a+((b+c)+(d+f))' the list would be `a,b,c,d,e,f' [sic]. After each subtree is optimized, the list is stored in increasing difficulty of computation [...] the code generation algorithm works best when left operands are the difficult ones. [...] a constant is considered simpler than the address of a static or external, which is simpler than reference to a variable. This makes it easy to fold all the constants together, and also to merge together the sum of a constant and the address of a static or external [...]. A special routine is invoked to handle sums of products. *Distrib* is based on the fact that it is better to compute `c1*c2*x + c2*y' as `c1*(c2*x+y)' and makes the divisibility tests required to assure the correctness of the transformation. This transformation is rarely possible with code directly written by the user, but it invariably occurs as a result of the implementation of multidimensional arrays. Finally, *acommute* reconstructs a tree from the list of expressions which result. Given this device, it wouldn't be hard, on a 16-bit machine, to detect the presence of a long in a chained add, and to widen everything in the list to long to avoid overflow problems. Apparently there is no *Distrib* in the vax compiler: The following code is generated for x=p[i]; int x,i,p[10]; movl _i,r0 ; get i in register movl _p[r0],_x ; x = mem( &p + i*4 ) The following code is generated for x=a[i][j]: int x,i,j,a[10][20] ( with -O option ): movl _j,r0 ; r0 = j mull3 $80,_i,r1 ; r1 = 80*i addl2 $_a,r1 ; r1 = &a + 80*i ashl $2,r0,r0 ; r0 = 4*j addl2 r0,r1 ; r1 = &a + 80*i + 4*j movl (r1),_x ; do the move this *could* have been done as mull3 $20,_i,r1 ; r1 = 20*i addl2 _j,r1 ; r1 = 20*i + j movl _a[r1],_x if the compiler had rearranged &a+80*i+4*j as &a+4*(20*i+j). Why wouldn't the vax compiler be at least as smart as the PDP compiler for things like this? Especially in the presence of a '_a[r1]' addressing mode with implicit scaling. -- "Canabee be said2b or not2b anin tire b, if half thabee isnotabee, due2 somain chunt injury?" - Eric's Dilemma ---------------------------------------------------------------------- Greg Smith University of Toronto UUCP: ..utzoo!utcsri!greg
flaps@utcs.uucp (Alan J Rosenthal) (05/09/86)
In article <972@dataioDataio.UUCP> bright@dataio.UUCP (Walter Bright) writes: >Different code for (L+I1)+I2 and L+(I1+I2) will only be generated if >ints are smaller than longs. This is strictly true, but it is not true that different code for (L+I1)+I2 and L+(I1+I2) will necessarily be generated if ints are smaller than longs. (Besides which, it is unwise to assume this; see K&R p34...) Anyway, the official K&R quote here is from page 185 which says: ...[besides precedence rules] the order of evaluation of expressions is undefined. ... Expressions involving a commutative and associative operator (*, +, &, |, ^) may be rearranged arbitrarily, even in the presence of parentheses; to force a particular order of evaluation an explicit temporary must be used. So you see that the expressions (L+I1)+I2 and L+(I1+I2) are not necessarily distinct (though not necessarily identical either). Alan J Rosenthal {linus|decvax}!utzoo!utcs!flaps, {ihnp4|allegra}!cbosgd!utcs!flaps