peter@ficc.uu.net (Peter da Silva) (04/08/90)
In article <14309@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) misses the point and goes off on a tangent: > Well, as has already been pointed out, the pA=A assignment is actually > _illegal_ in C. It's not illegal in C. It's an implicit type-cast on assignment. To avoid a warning you must now use: pA = (t *)A; > As for "known at compile time" - well, N and M aren't if they were > passed as parameters. Sure, because A was passed as a 3-tuple. N and M aren't known explicitly unless declared, but N*M is known. And that's all you need to bound-check this baby. The point I'm making isn't that array indexing is or is not better than pointers, but that for the purpose at hand neither array indexing or pointers is better or worse at bounds checking. Just think of *A as a shorthand for A[0]. -- _--_|\ `-_-' Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>. / \ 'U` \_.--._/ v
jlg@lambda.UUCP (Jim Giles) (04/09/90)
From article <ZZR2:M4xds13@ficc.uu.net>, by peter@ficc.uu.net (Peter da Silva): > [...] > Just think of *A as a shorthand for A[0]. Very good! Now we've got grounds for some sort of agreement. If you will review my preceeding articles on this subject, you will find that I have said almost exactly this. In fact, I will repeat my earlier statement: except for aliasing and bounds checking, pointer arithmetic is semantically IDENTICAL to one dimensional array indexing. Under your proposed 3-tuple idea, the distinction between pointers and arrays is further narrowed. In fact, because of the way you've constructed it, the two concepts are now nearly identical since the aliasing can only be within the _same_ bounded object. But this also can be done with arrays (ie. A[i] and A[j] for unknown i and j are assumed aliased). So, since there is now _NO_ remaining difference between the two concepts, why have both in the language? Arrays are the more flexible (since they can be multidimensional). Further, the array syntax tends to be more legible. So, I will just think of A[0] as a legible (therefore preferable) substitute for *A. Note that the above argument only applies to the issue of pointer _arithmetic_. For recursive data structures, either pointers or some other mechanism is still required. (Actually, even recursive data structures may be implemented with arrays, but it is less elegant in the general case.) I prefer a more direct syntax for recursive data structures than to go through user-visable pointers - but that's another issue. The point of this discussion is that pointers are not, and should not be treated, as a useful method of implementing arrays. J. Giles
peter@ficc.uu.net (Peter da Silva) (04/09/90)
> Under your proposed 3-tuple > idea, the distinction between pointers and arrays is further narrowed. Not at all. The *semantics* of pointers as addresses, and pointers as 3-tuples, are identical. A legal ANSI C program should operate identically under either implementation. It's just a safety net... not a new concept. [ And it's not my concept, either... much as I'd like to claim authorship, I didn't originate it. ] > In > fact, because of the way you've constructed it, the two concepts are now > nearly identical since the aliasing can only be within the _same_ bounded > object. That was always true, it was just unchecked. > For recursive data structures, either pointers or some other mechanism is > still required. And this is an important point. C can be implemented with a certain set of minimal constructs. It's really a very low-level language. And at this level pointers and arrays are the same... and the internals of recursive data structures are visible. If you don't like that, use a higher-level language. Treat C as a fancy assembler. > (Actually, even recursive data structures may be implemented > with arrays, but it is less elegant in the general case.) Exactly. Pointers are more general than arrays. > The point of this discussion is > that pointers are not, and should not be treated, as a useful method of > implementing arrays. Not at all. Since pointers and arrays are isomorphic, that pointers are a useful method of implementing arrays becomes a tautology. -- _--_|\ `-_-' Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>. / \ 'U` \_.--._/ v
jlg@lambda.UUCP (Jim Giles) (04/10/90)
From article <1BT2FU7ggpc2@ficc.uu.net>, by peter@ficc.uu.net (Peter da Silva): > [...] >> (Actually, even recursive data structures may be implemented >> with arrays, but it is less elegant in the general case.) > > Exactly. Pointers are more general than arrays. Pointers are more general than arrays in that they are free from bounds checking and allow arbitrary aliasing. Arrays are more general than pointers in that they can be multiply dimensioned. Arrays also tend to be more legible and easier to optimize. Neither should be used to perform tasks better suited to the other. > [...] >> The point of this discussion is >> that pointers are not, and should not be treated, as a useful method of >> implementing arrays. > > Not at all. Since pointers and arrays are isomorphic, that pointers are a > useful method of implementing arrays becomes a tautology. Pointers are only isomorphic to _one_dimensional_arrays_, and only if their aliasing and freedom from bounds are both curtailed. Having now removed the only features that made pointers even remotely interesting (as an array-like mechanism) I see no reason for them to remain. J. Giles
diamond@tkou02.enet.dec.com (diamond@tkovoa) (04/10/90)
In article <1BT2FU7ggpc2@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes: >And this is an important point. C can be implemented with a certain set of >minimal constructs. It's really a very low-level language. And at this level >pointers and arrays are the same... and the internals of recursive data >structures are visible. If you don't like that, use a higher-level language. >Treat C as a fancy assembler. Hear, hear! Now, if only we could teach all these programmers not to abuse this assembler, that they should do most of their programming in high-level languages, and use this fancy assembler only for rare exceptional cases such as memory allocators and device drivers..... -- Norman Diamond, Nihon DEC diamond@tkou02.enet.dec.com This_blank_intentionally_left_underlined________________________________________
peter@ficc.uu.net (Peter da Silva) (04/10/90)
In article <14316@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: > Pointers are more general than arrays in that they are free from bounds > checking and allow arbitrary aliasing. The first statement here is false, as I have demonstrated. By making a pointer a 3-tuple (low, current, high) it is subject to the same bounds checking. The second statement is irrelevant, because arrays can also be arbitrarily aliased. Fortran sidesteps this problem by declaring that this aliasing is illegal, but really doesn't provide a mechanism for ensuring that no illegal aliasing is going on. Other languages make no such guarantees. > Arrays are more general than > pointers in that they can be multiply dimensioned. The C semantics of multiply dimensioned arrays as arrays of arrays demonstrates the opposite. > Pointers are only isomorphic to _one_dimensional_arrays_, and only if > their aliasing and freedom from bounds are both curtailed. The first statement is false, because C semantics treats pointers as multiply dimensioned arrays (given a proper declaration), and the second statement is only true for programs that are broken in any case... just like Fortran programs where you do something like: INTEGER A(10) CALL PROC(A,A) -- _--_|\ `-_-' Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>. / \ 'U` \_.--._/ v
preston@titan.rice.edu (Preston Briggs) (04/11/90)
In article <S4U2APBggpc2@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes: >In article <14316@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: >> Pointers are more general than arrays in that they are free from bounds >> checking and allow arbitrary aliasing. > >The first statement here is false, as I have demonstrated. Ah. Then pointers are not more general than arrays :-) > By making a >pointer a 3-tuple (low, current, high) it is subject to the same bounds >checking. And why not be practical? Implementing pointers with triples!? Going to be hard to keep in a register. Going to be hard to cast into a long int and back. With enough work by the compiler, the penalty for array bounds checking can be reduced to less than 10% on systems code (hearsay, from Peter Markstein) and much less on scientific code (Optimization of Range Checking, Vicky Markstein, John Cocke, Peter Markstein). Care to speculate on the cost of a 3-tuple scheme for pointers? If not, can we dispose of them as a topic? >The second statement is irrelevant, because arrays can also be >arbitrarily aliased. Fortran sidesteps this problem by declaring that this >aliasing is illegal, but really doesn't provide a mechanism for ensuring >that no illegal aliasing is going on. Arrays in fortran can be aliased at procedure calls (illegally) and in equivalence statements (legally). Equivalence statements are easy. Calls are more difficult but can be handled correctly, even allowing for separate compilation. That is, it's possible to detect the aliases at compile-time and either complain or handle them correctly (but with less efficient code). Aliasing due to pointers can be introduced at any assignment. It's possible to detect aliasing, but the analysis is either very imprecise or very expensive or both. By imprecise, I mean overly conservative -- pointers will appear to point to many places. -- Preston Briggs looking for the great leap forward preston@titan.rice.edu
flee@shire.cs.psu.edu (Felix Lee) (04/11/90)
I just had a nasty thought. If pointers were implemented as 3-tuples, then the compiler could generate run-time checks for aliasing, and then switch to either a fast- or slow-coded loop. This is probably doable to some extent without 3-tuple pointers. -- Felix Lee flee@shire.cs.psu.edu *!psuvax1!flee
preston@titan.rice.edu (Preston Briggs) (04/11/90)
In article <Ekg#.d5@cs.psu.edu> flee@shire.cs.psu.edu (Felix Lee) writes: >If pointers were implemented as 3-tuples, >then the compiler could generate run-time checks for aliasing, and >then switch to either a fast- or slow-coded loop. Jim Giles has suggested this in terms of arrays and fortran (presumably extended to allow aliasing). Ershov (in 1967) suggested compiling 2 versions of a procedure, one assuming aliasing and the other assuming no aliasing, and determining the appropriate one to call (at each call site) via compile-time analysis. Cooper and Kennedy (POPL, 1989) have shown how to do the necessary compile-time interprocedural alias analysis. Cooper (in '83) showed a practical approach to interprocedural analysis in the presence of separate compilation. None of these ideas, including 3-tuples, are adequate for analyzing the effects of pointers. They might be adequate to analyze "pointers used as arrays" if that were the only use of pointers. Unfortunately, pointers are very general and are used in many ways (after all, that's why they're popular). In particular, it's very difficult to determine (at compile time) what memory locations might be modified by assignments like *(root->left->right) = *(root->right) -- Preston Briggs looking for the great leap forward preston@titan.rice.edu
jlg@lambda.UUCP (Jim Giles) (04/11/90)
From article <6531@brazos.Rice.edu>, by preston@titan.rice.edu (Preston Briggs): > In article <Ekg#.d5@cs.psu.edu> flee@shire.cs.psu.edu (Felix Lee) writes: >>If pointers were implemented as 3-tuples, >>then the compiler could generate run-time checks for aliasing, and >>then switch to either a fast- or slow-coded loop. > > Jim Giles has suggested this in terms of arrays and fortran > (presumably extended to allow aliasing). Actually, I recommended it as something _NOT_ to do. The only way to make it genuinely useful is to have special versions of the code code each combination of aliased/non-aliased arguments. Since this increases combinatorially (by the definition of combinatorial :-) with the number of _prtentially_ aliased items, this leads to an unacceptably large number of alternatives. You can't efficiently implement this as a run-time test. Hence (as I've said before), the only solutions are to outlaw aliasing or to insist on sophisticated implementations which do interprocedural dataflow analysis. J. Giles
jlg@lambda.UUCP (Jim Giles) (04/11/90)
In article <S4U2APBggpc2@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes: > [...] >The second statement is irrelevant, because arrays can also be >arbitrarily aliased. Fortran sidesteps this problem by declaring that this >aliasing is illegal, but really doesn't provide a mechanism for ensuring >that no illegal aliasing is going on. Neither the Fortran standard NOR the C standard provide mechanisms for implementing or insuring ANYTHING. The mechanisms are the responsibility of the implementor. As it happens, the implementaion of a test for illegal aliasing is fairly simple. (It consists of having the optimizer tag the data flow graph every time it took advantage of the fact that aliasing was illegal. The entry sequence for the procedure would then check all the flagged dependencies and issue an error if any of them _really_ were dependencies. This entry sequence test would be as fast as the 3-tuple test you proposed as a method in C for determining which version of a loop to take.) The fact that few (if any) implementations have an option to turn on run-time alias detection is not because it is hard to do (it isn't) but because there has been no demand for such a feature. The error of aliasing things illegally rarely occurs. The ability to detect aliasing at compile-time is hampered by the need to do interprocedural analysis. This the compiler can't do in the presence of separate compilation. If, on the other hand, the loader did code generation, it could detect the aliasing and generate code accordingly. If this technology were widely available and acceptably cheap, I suspect that the Fortran committee would remove the constraint against aliasing in procedure calls. C generates slow code by default in order to cater to a rarely needed ability to alias arguments through the procedure call. Fortran 'sidesteps' the problem by always generating better code. The test for aliasing (which I agree should be available) is possible, but few compilers have it because few users run into the problem. J. Giles
peter@ficc.uu.net (Peter da Silva) (04/11/90)
In article <6529@brazos.Rice.edu> preston@titan.rice.edu (Preston Briggs) writes: > Ah. Then pointers are not more general than arrays :-) Not if you can allocate new arrays at will, and reference them somehow in a consistent manner. > > By making a > >pointer a 3-tuple (low, current, high) it is subject to the same bounds > >checking. > And why not be practical? Implementing pointers with triples!? > Going to be hard to keep in a register. Going to be hard to cast > into a long int and back. Doctor! Doctor! It hurts when I do this! So don't do this. Casting pointers to long ints and back is inherently dangerous. There is no guarantee that you can put a pointer in a register (you can't on an 80{,2}86 in large model), nor that you can put it in a long int (you can't on most word-addressed machines), so don't do it. The world is not a VAX. > With enough work by the compiler, the penalty for array bounds checking > can be reduced to less than 10% on systems code (hearsay, from > Peter Markstein) and much less on scientific code (Optimization of > Range Checking, Vicky Markstein, John Cocke, Peter Markstein). > Care to speculate on the cost of a 3-tuple scheme for pointers? About the same as the costs of bounds-checking on arrays passed to functions. About the same as the costs of bounds-checking on locally declared and allocated arrays. The only place it's going to cost more is when you use pointers for things arrays just can't be used for, such as lists. If you're doing scientific stuff, 3-tuple pointers and arrays are isomorphic. And if your code is worth a damn you're checking those pointers anyway. [ if(p == NULL) return FAILURE; ] Doesn't it make sense to let the compiler do this check where it can optimise it more effectively? -- _--_|\ `-_-' Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>. / \ 'U` \_.--._/ v
njk@diku.dk (Niels J|rgen Kruse) (04/11/90)
peter@ficc.uu.net (Peter da Silva) writes:
->Using the syntax #A as the address of A (since &a now refers to a 3-tuple
->(#a, #a, #a)...
->> t A [N] [M]; /* 2d array of type t */
->A looks like a tuple (#A[0][0], #A[0][0], #A[N][0]) of type (t (*)[M]).
Agreed.
->> t *pA;
->> ...
->> pA = A;
->pA is the same tuple, but now looks like (#(t*)A[0], #(t*)A[0], #(t*)A[M*N])
->and has the type (t *).
------------------------------------------------------------------------^^^^^
Should be N, otherwise agreed.
->> Now, what bounds are associated with pA under your scheme? N clearly
->> doesn't make sense. But should the bound be M or N*M? Whichever you
->> pick, I'll (sooner or later) want it to be the other.
->It's N*M. If you want it to be M, then you would have said:
-> pA = A[0];
->Since that's a tuple (#A[0][0], #A[0][0], #A[0][M]) with type (t[]).
I disagree. It is my impression (i may be wrong), that the bounds
of a pointer to a subobject derive from the enclosing object,
not from the subobject itself.
To test your model of bounds-checking in C, what are the bounds of
foo->buf
after
struct v { int bar; char buf[1]; } *foo;
foo = malloc (sizeof (struct v) + 2);
?
(ignoring possible allocation failure)
Is foo->buf[2] within bounds according to your model?
For your convenience, i quote the relevant passage of a summary
of X3J11's first meeting in the interpretations phase, which
Doug Gwyn was so considerate to post to comp.std.c.
(various disclaimers in all caps (not official etc.) deleted)
+ There are no technical problems with using malloc()ed objects
I.e. my dirent implementation is portable: bounds checking is NOT
allowed in cases like the common struct record_header {...buffer[1];
/*more than 1 actually allocated*/} technique where the right size
is allocated for the buffer in the object via malloc() (as per
Karl Heuer's argument).
Considering the disclaimers, you might choose to ignore this.
->--
-> _--_|\ `-_-' Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
->/ \ 'U`
->\_.--._/
-> v
BTW, why to you have a map of Australia in your .sig? (Just curious).
--
Niels J|rgen Kruse DIKU Graduate njk@diku.dk
jlg@lambda.UUCP (Jim Giles) (04/11/90)
From article <S4U2APBggpc2@ficc.uu.net>, by peter@ficc.uu.net (Peter da Silva): > In article <14316@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: >> Pointers are more general than arrays in that they are free from bounds >> checking and allow arbitrary aliasing. > > The first statement here is false, as I have demonstrated. [...] You have 'demonstrated' no such thing. You have proposed an implementation which would _make_ my first statement false. You have asserted (perhaps correctly) that the new C standard actually supports your interpretation. However, you have _NOT_ demonstrated that pointers which obey bounds are more general that arrays (they aren't). You have _NOT_ demonstrated that pointers _should_ obey bounds. In fact, what you are recommending is a constraint on pointers which removes one of the _only_ features that I thought pointers were useful for. I already had a pretty low opinion about the value of pointers and you are busy destroying what respect I had left. If it is your intent to make pointers seem _less_ useful than they might be, you are succeeding. > [...] >> Arrays are more general than >> pointers in that they can be multiply dimensioned. > > The C semantics of multiply dimensioned arrays as arrays of arrays > demonstrates the opposite. How does a misinterpretation of the semantic properties of multi- dimensional arrays demonstrate anything but that C has not been well designed for array manipulation? Multidimensional arrays are NOT arrays-of-arrays. The subscripts of a multidimensional array are independent and carry _equal_ weight. There are no 'master' dimensions to which the others are 'slaves'. The user of multidimensional arrays should seldom have to know whether arrays are stored row- or column-wise (he should _care_ about this difference even less often). C forces such irrelevant distinctions into every context where multidimensional arrays are used: as args to procedures, the user must provide for the array indexing calculation itself by manual means. > [...] >> Pointers are only isomorphic to _one_dimensional_arrays_, and only if >> their aliasing and freedom from bounds are both curtailed. > > The first statement is false, because C semantics treats pointers as > multiply dimensioned arrays (given a proper declaration), [...] Oh! Gee, I was unaware of that. Please enlighten me as to the proper declaration for the following case: main(){ int A[10][20]; ... sub1(A); ... } sub1(pa) ... /* somewhere here there should be a declaration which allows me to treat pa as a multiply dimensioned array. I hope you can show it to me. Everyone else claims that I have to declare a macro to do the subscript calculation (or, shudder, do the calculation by hand). */ {...} Actually, of course, C has no way of doing this. The procedure can't find out the dimensions of the argument. And even if I were to pass the values 10 and 20 as separate arguments, the C declaration rules would not permit me to declare 'pa' as "int pa[bnd1][bnd2]". No, 'pa' is isomorphic to a one-dimensional array with bound 10*20==200. You can move 'pa' around linearly by adding and subtracting to it, but you can _never_ double subscript it. > [...] and the second > statement is only true for programs that are broken in any case... just > like Fortran programs where you do something like: > > INTEGER A(10) > > CALL PROC(A,A) Not necessarily broken. Only if PROC modifies one or more of its arguments. Many C programs which do this sort of thing also turn out to be buggy. That's why you don't lose much by making it illegal. J. Giles
gudeman@cs.arizona.edu (David Gudeman) (04/11/90)
[about the 3-tuple method for bounds checking...] >...You have _NOT_ demonstrated that pointers _should_ obey bounds... > >In fact, what you are recommending is a constraint on pointers which >removes one of the _only_ features that I thought pointers were useful >for. I already had a pretty low opinion about the value of pointers >and you are busy destroying what respect I had left. If it is your >intent to make pointers seem _less_ useful than they might be, you >are succeeding. OK, I'm confused. I thought one of your objections to pointers was that they can point to undefined values, and that there was no way to do run-time error checking for this as is done for array indexes. I showed a way to do it, and now you say that I've removed the only advantage they have. How can you criticize pointers for having a feature and also say that that feature is their only advantage? The only thing I've done is provide a way to ensure that the undefined operations are detected, causing a meaningful error message rather than the endearing "bus error: core dumped" or just peculiar (undefined) behavior. My proposal puts no restrictions on the well-defined pointer operations at all. As to whether pointers _should_ obey bounds, it would be hard to justify any other interpretation. Notice that I am _not_ restricting what objects pointers can point to, only ensuring that they point to real objects. Any other interpretation would be either non-portable or force compilers on some machines to produce extremely poor code. Are you suggesting that it would be an _advantage_ if pointers could be used like this? int foo() { int *p, i,j,k,l,m,n; for (p = &i; p <= &n; p++) do_something(p); } This sort of thing can be useful I suppose, but it is not defined in the C language, and all I'm doing is checking that it doesn't happen. -- David Gudeman Department of Computer Science The University of Arizona gudeman@cs.arizona.edu Tucson, AZ 85721 noao!arizona!gudeman
gudeman@cs.arizona.edu (David Gudeman) (04/11/90)
In article <14324@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: [about compiling different versions of procedures for different aliasing patterns...] >Actually, I recommended it as something _NOT_ to do. The only way >to make it genuinely useful is to have special versions of the code >code each combination of aliased/non-aliased arguments. Since this >increases combinatorially (by the definition of combinatorial :-) >with the number of _prtentially_ aliased items, this leads to an >unacceptably large number of alternatives. Actually, it increases exponentially with the number of aliasing patterns in the arguments of the functions. For a function of 3 array arguments, the maximum number of versions needed is 2^3 = 8. Just how many array arguments do your functions have? And I wouldn't even consider trying to compile a seperate function for _every_ possibility. Just 2 or so, maybe a worst case and an average case. -- David Gudeman Department of Computer Science The University of Arizona gudeman@cs.arizona.edu Tucson, AZ 85721 noao!arizona!gudeman
gudeman@cs.arizona.edu (David Gudeman) (04/11/90)
In article <6529@brazos.Rice.edu> preston@titan.rice.edu (Preston Briggs) writes: >And why not be practical? Implementing pointers with triples!? >Going to be hard to keep in a register. Going to be hard to cast >into a long int and back. Obviously the triple method is only intended to be used for code developement, the finished product would be compiled without bounds checks of any kind. The casting is going to cause some trouble, but then anyone doing that kind of thing is on his own. (Actually, I think I could handle that case also, but I'm not sure it would be worth the effort.) >With enough work by the compiler, the penalty for array bounds checking >can be reduced to less than 10% on systems code. > >Care to speculate on the cost of a 3-tuple scheme for pointers? >If not, can we dispose of them as a topic? Dispose of them as a topic because they are inapropriate for one particular application (finished products)? Hardly sounds reasonable. I usually don't keep the "asserts" in my finished products either, does that make them useless? >Aliasing due to pointers can be introduced at any assignment. >It's possible to detect aliasing, but the analysis is either >very imprecise or very expensive or both. By imprecise, I mean >overly conservative -- pointers will appear to point to many places. True enough for general pointer use. However, the statement made by Jim Giles -- that array indexes can be optimized more than pointer arithmetic used for the same purpose -- is incorrect. When a pointer is being used with pointer arithmetic in such a way that it could be easily replaced by array indexing, then the structure of the operations is such that it can be optimized just like the indexing could (in most cases). -- David Gudeman Department of Computer Science The University of Arizona gudeman@cs.arizona.edu Tucson, AZ 85721 noao!arizona!gudeman
brnstnd@stealth.acf.nyu.edu (04/11/90)
In article <14309@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: [ about the dimensions of an array ] > As for "known at compile time" - well, N and M aren't if they were > passed as parameters. Oh, but that's illegal in C too. I've never > heard a rational reason for it to be illegal, but that's how things > go. Efficiency and simplicity, as is obvious from the machine language. I and many others agree that variable dimensions are useful enough to offset the angst they cause the compiler. That's why some available compilers, and probably future standards, support them. In the meantime you'll just have to settle for manual indexing. ---Dan
brnstnd@stealth.acf.nyu.edu (04/11/90)
In article <14324@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
[ basically, it doesn't work to compile once for aliasing and once ]
[ for non-aliasing, because you have to do this for each potentially ]
[ aliased parameter, and there's a combinatorial explosion, and it ]
[ can't be done efficiently, blah blah blah ]
Jim, you're being silly. As you delight in pointing out, most parameters
ever passed in real programs aren't aliased to each other. Hence it's
enough to compile one version where no parameters are aliased and one
version where all parameters are aliased if at all possible. (If it's
likely that some parameters may be aliased, the compiler can introduce
an intermediate level where just those parameters are aliased, provided
that the programmer or a global analysis points them out.) Say ``oops.''
---Dan
brnstnd@stealth.acf.nyu.edu (04/11/90)
In article <14327@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: > From article <S4U2APBggpc2@ficc.uu.net>, by peter@ficc.uu.net (Peter da Silva): [ Peter once again reminds Jim that pointers can be implemented to ] [ reliably detect what ANSI C calls illegal pointer operations ] > In fact, what you are recommending is a constraint on pointers which > removes one of the _only_ features that I thought pointers were useful > for. Say what? Jim, are you saying that you thought pointers were only useful for operations that ANSI C calls illegal? Once again, the ``constraint'' embodied by, say, pointer 3-tuples changes absolutely none of the ANSI C pointer semantics---because that constraint *is* the semantics. I must admit I've never found any use for going outside those semantics, except in device drivers. > Multidimensional arrays > are NOT arrays-of-arrays. The subscripts of a multidimensional > array are independent and carry _equal_ weight. Both assembly code and mathematics say otherwise: at some level of specification (the opposite of abstraction), a multidimensional array really is an array of arrays. I agree, the user often doesn't think of it that way; but it's still the truth. > > INTEGER A(10) > > CALL PROC(A,A) > Not necessarily broken. Only if PROC modifies one or more of its > arguments. Note that Fortran has no way to express that concept explicitly, while C can declare const int *A. Hmmm, shall we start flogging this horse too? ---Dan
jlg@lambda.UUCP (Jim Giles) (04/11/90)
From article <20079@megaron.cs.arizona.edu>, by gudeman@cs.arizona.edu (David Gudeman): > [...] > OK, I'm confused. I thought one of your objections to pointers was > that they can point to undefined values, and that there was no way to > do run-time error checking for this as is done for array indexes. I never said so. My objection to pointers is (and always has been) the use of them when arrays are more appropriate. Pointers _should_ be used for things that arrays are _not_ suited to. One instance of this is to implement dynamic memory. Here, it is the _purpose_ of the code to return a reference to undefined values (undefined until allocation is complete anyway). Arrays are already bounded. Why have two features with the same functionality? > [...] > As to whether pointers _should_ obey bounds, it would be hard to > justify any other interpretation. [...] > [...] Any other interpretation would be either non-portable > or force compilers on some machines to produce extremely poor code. On any machine (even segmented ones) the memory model presented to the user code is a contiguous memory address space. It is within this linear address space that pointers should be free to roam. (On virtual memory systems, the system may actually map the user's memory allocation to discontiguous hardware memory locations. But, this is all invisable to the user code - or _should_ be.) > [...] > Are you suggesting that it would be an _advantage_ if pointers could > be used like this? > int foo() > { int *p, i,j,k,l,m,n; > for (p = &i; p <= &n; p++) do_something(p); > } No. I would not recommend such a thing. Nor do I suggest using pointers as a poor cousin of an array. I want pointers to behave as addresses within the linear address space presented to the user by the system. J. Giles
jlg@lambda.UUCP (Jim Giles) (04/11/90)
From article <20080@megaron.cs.arizona.edu>, by gudeman@cs.arizona.edu (David Gudeman): > [...] > Actually, it increases exponentially with the number of aliasing > patterns in the arguments of the functions. For a function of 3 > array arguments, the maximum number of versions needed is 2^3 = 8. No, the number increases combinatorially. Aliasing occurs pairwise between arguments (an/or global data). So, the number of potentially aliased pairs is N choose 2, ie. a binomial coefficient. Here, N is the number of variables in the pool of potentially aliased objects. > Just how many array arguments do your functions have? And I wouldn't > even consider trying to compile a seperate function for _every_ > possibility. Just 2 or so, maybe a worst case and an average case. Actually, the procedures which are required to be fast may have lots of parameters (like the RKF differential equation solver). You give me the choice of alias free performance or very _worst_ case. I'll take the first and skip the second by not allowing _any_ aliasing (at least until the tools required for interprocedural analysis are cheap and widely available). J. Giles
jlg@lambda.UUCP (Jim Giles) (04/11/90)
From article <296:Apr1105:06:4490@stealth.acf.nyu.edu>, by brnstnd@stealth.acf.nyu.edu: > In article <14327@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: > [...] >> Multidimensional arrays >> are NOT arrays-of-arrays. The subscripts of a multidimensional >> array are independent and carry _equal_ weight. > Both assembly code and mathematics say otherwise: at some level of > specification (the opposite of abstraction), a multidimensional array > really is an array of arrays. I agree, the user often doesn't think of > it that way; but it's still the truth. Who cares about what assembly code does? It is the purpose of a high level language is to present an environment that is relatively free from such concerns. Mathematically, what you say is not true. The ordering of subscripts is an arbitrary decision by the mathematician who lays out the calculation. Mathematically, such operations as TRANSPOSE are regarded as 'feebies'. The level of specification you are refering to is _below_ that at which mathematics works. Certainly such concerns as row- vs. column-wise storage are (rightly) regarded as irrelevant to the user. The only time such concerns become important is when efficiency forces them into the limelight. Even in this case, an implementation which _efficiently_ handles such details is preferable to one which requires user intervention. > [...] >> > INTEGER A(10) >> > CALL PROC(A,A) >> Not necessarily broken. Only if PROC modifies one or more of its >> arguments. > Note that Fortran has no way to express that concept explicitly, while C > can declare const int *A. Hmmm, shall we start flogging this horse too? Why not? I _support_ the addition of an explicit way of declaring which procedure arguments are changed and which are not. I am NOT and Fortran fanatic. I don't maintain that it is perfect or even nearly so. In this case, I would rather deal with Fortran's shortcomings than with C's though. J. Giles
peter@ficc.uu.net (Peter da Silva) (04/11/90)
In article <14327@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: > However, you have _NOT_ demonstrated that pointers which obey bounds are > more general that arrays (they aren't). Fine. How do you implement dynamic storage with arrays? > In fact, what you are recommending is a constraint on pointers which > removes one of the _only_ features that I thought pointers were useful > for. It would help if you would explain what this feature is, rather than just telling me I've recommended a constraint on pointers (I've done no such thing) that removes it (whatever it is). > main(){ > int A[10][20]; > ... > sub1(A); > ... > } > sub1(pa) > ... /* somewhere here there should be a declaration > which allows me to treat pa as a multiply dimensioned > array. I hope you can show it to me. Everyone > else claims that I have to declare a macro to do the > subscript calculation (or, shudder, do the calculation > by hand). */ int (*pa)[10][20]; (*pa)[4][5]; > No, 'pa' is isomorphic to a one-dimensional array with bound 10*20==200. > You can move 'pa' around linearly by adding and subtracting to it, but > you can _never_ double subscript it. (*pa)[4][5]; Nope. Never. -- _--_|\ `-_-' Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>. / \ 'U` \_.--._/ v
peter@ficc.uu.net (Peter da Silva) (04/11/90)
In article <1990Apr10.215213.18124@diku.dk> njk@diku.dk (Niels J|rgen Kruse) writes: > I disagree. It is my impression (i may be wrong), that the bounds > of a pointer to a subobject derive from the enclosing object, > not from the subobject itself. I think that is probably a good idea > To test your model of bounds-checking in C, what are the bounds of > foo->buf > after > struct v { int bar; char buf[1]; } *foo; > foo = malloc (sizeof (struct v) + 2); > ? > (ignoring possible allocation failure) > Is foo->buf[2] within bounds according to your model? My immediate reaction would be to say "no", but since X3J11 has seen fit to allow this special case I think it would be OK for the compiler to treat structures and arrays differently here, because the use of substructures to refer to the whole of an extended structure has been widespread. For another example: typedef struct link { struct link *next; struct link *prev; }; struct foo { link foo_header; ... }; addlist(struct link *list, struct link *element); struct link *gethead(struct link *list); struct foo *bar; addlist(list, bar); On the other hand, you *do* need to expand bounds going the other way: bar = gethead(list); > Considering the disclaimers, you might choose to ignore this. It's a tough call. I'd have been happy to go back and fix my code that used the foo[1] convention if they'd gone the other way, I think that just special-casing this convention would be acceptable. > BTW, why to you have a map of Australia in your .sig? (Just curious). I'm Australian. -- _--_|\ `-_-' Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>. / \ 'U` \_.--._/ v
mjs@hpfcso.HP.COM (Marc Sabatella) (04/11/90)
>In fact, what you are recommending is a constraint on pointers which >removes one of the _only_ features that I thought pointers were useful >for. I already had a pretty low opinion about the value of pointers >and you are busy destroying what respect I had left. If it is your >intent to make pointers seem _less_ useful than they might be, you >are succeeding. I must come from a different background than you do. Where I come from, pointers are used for things like linked lists, binary trees, etc. Arrays can be used for this only if you know a priori the ultimate size of the data structure in question, or if your language allows dynamically reallocatable arrays.
brnstnd@stealth.acf.nyu.edu (04/12/90)
In article <14333@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: > From article <296:Apr1105:06:4490@stealth.acf.nyu.edu>, by brnstnd@stealth.acf.nyu.edu: > > In article <14327@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: [ the following overgeneralization: ] > >> Multidimensional arrays > >> are NOT arrays-of-arrays. The subscripts of a multidimensional > >> array are independent and carry _equal_ weight. > > Both assembly code and mathematics say otherwise: at some level of > > specification (the opposite of abstraction), a multidimensional array > > really is an array of arrays. I agree, the user often doesn't think of > > it that way; but it's still the truth. > Who cares about what assembly code does? It is the purpose of a high > level language is to present an environment that is relatively free > from such concerns. Relatively but not completely. Any program will show a speed hit if you bounce all around memory rather than going through it sequentially; as the user can notice this and (in C but not in Fortran!) write code to prevent it, your overgeneralization is inaccurate. > Mathematically, what you say is not true. The ordering of subscripts > is an arbitrary decision by the mathematician who lays out the calculation. Not at a certain level. The ordered pair (a,b) is usually defined as the set {a,{a,b}}, which is not symmetric---though of course you knew that. Then again, the isomorphism between ((a,b),M(a,b)) and (a,(b,M(a,b))) makes it obvious that a matrix M of n dimensions is a one-dimensional array consisting of matrices M_i of n - 1 dimensions. In other words, a multidimensional array is an array of arrays. Yeah, that's it, lay on the heavy terminology and he'll crumple. :-) > The only > time such concerns become important [ but your overgeneralization implies they're nonexistent! ] > is when efficiency forces them > into the limelight. Golly, I never worry about uh-fish-in-sea. Duh, Pee-tuh, do you ever worry about uh-fish-in-sea? 1/2 :-) > >> > INTEGER A(10) > >> > CALL PROC(A,A) > >> Not necessarily broken. Only if PROC modifies one or more of its > >> arguments. > > Note that Fortran has no way to express that concept explicitly, while C > > can declare const int *A. Hmmm, shall we start flogging this horse too? > Why not? I _support_ the addition of an explicit way of declaring which > procedure arguments are changed and which are not. Aha! I _support_ the addition of an explicit way of using parameters as array dimensions (and in fact even more generally---but that's a start). Let's make a pact: You shut up about C's failure to provide parametrized dimensions and I'll shut up about Fortran's failure to detect errors in changing what are meant to be constant values. Agreed, at least for this discussion? ---Dan
gudeman@cs.arizona.edu (David Gudeman) (04/12/90)
In article <14332@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
]From article <20080@megaron.cs.arizona.edu>, by gudeman@cs.arizona.edu (David Gudeman):
]> [...]
]> Actually, it increases exponentially with the number of aliasing
]> patterns in the arguments of the functions. For a function of 3
]> array arguments, the maximum number of versions needed is 2^3 = 8.
]
]No, the number increases combinatorially. Aliasing occurs pairwise
]between arguments (an/or global data). So, the number of potentially
]aliased pairs is N choose 2, ie. a binomial coefficient. Here, N is the
]number of variables in the pool of potentially aliased objects.
Oops. I was simplifying to the assumption that each argument was
either potentially aliased or it was not. I suppose that the
combinatorial analysis would allow better code generation in some
cases, but no one want to compile more than two versions anyway.
--
David Gudeman
Department of Computer Science
The University of Arizona gudeman@cs.arizona.edu
Tucson, AZ 85721 noao!arizona!gudeman
gudeman@cs.arizona.edu (David Gudeman) (04/12/90)
(given the number of articles I'm writting, I must have a lot of spare time these days...) In article <14330@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: >Here, it is the _purpose_ of the code to >return a reference to undefined values (undefined until allocation is >complete anyway). There is a difference between an undefined _operation_ on a pointer and pointing to an undefined _value_. I'm sorry Jim, but it should be obvious that the 3-tuple scheme does not prevent malloc() from working, and it really looks like you are resorting to desperate rhetorical ploys in an attempt to support your position. >> Are you suggesting that it would be an _advantage_ if pointers could >> be used like this? >> int foo() >> { int *p, i,j,k,l,m,n; >> for (p = &i; p <= &n; p++) do_something(p); >> } >No. I would not recommend such a thing. Nor do I suggest using pointers >as a poor cousin of an array. I want pointers to behave as addresses >within the linear address space presented to the user by the system. And the 3-tuple system does exactly that. It only prevents the user from doing things that have no defined meaning. As to "poor cousins", I wish you would stop pretending that there is something objective about your personal distaste for the way pointers are used. There are a lot of people who think that for (p = A; p < A + A_SIZE; p++) foo(*p); is more elegant than for (i = 0; i < A_SIZE; i++) foo(A[i]); it is just a matter of taste. You sound like an art critic who thinks that his personal tastes are universal constants, and that others who don't like the same things are "wrong". -- David Gudeman Department of Computer Science The University of Arizona gudeman@cs.arizona.edu Tucson, AZ 85721 noao!arizona!gudeman
jlg@lambda.UUCP (Jim Giles) (04/12/90)
From article <20145@megaron.cs.arizona.edu>, by gudeman@cs.arizona.edu (David Gudeman): > [...] I'm sorry Jim, but it should be > obvious that the 3-tuple scheme does not prevent malloc() from > working, and it really looks like you are resorting to desperate > rhetorical ploys in an attempt to support your position. No. And I'll test your claim right here. Using C, with you 3-tuple idea, write the following routines: brk(), sbrk(), malloc(), and free(). You won't succeed. This is because _at_least_one_ of these routines has to have direct explicit access to the internal representations of your 3-tuples (something has to be responsible for setting them). Since I maintain that the best use of explicitly visable pointers is in the implementation of routines like those mentioned above, and your 3-tuple prevents me from doing so with C's pointers, I claim that you have removed one of the only interesting properties of pointers. > [...] There are > a lot of people who think that > for (p = A; p < A + A_SIZE; p++) foo(*p); > is more elegant than > for (i = 0; i < A_SIZE; i++) foo(A[i]); And I guess it's just a matter of taste that I consider: for (i = 1; i < N-1; i++ ) for (j = 1; j < M-1; j++ ) A[i][j] = 0; is better than: for (p = A; ???; ???) /* Oh what the hell. I don't feel like working it out. Why don't _you_ tell me how to complete this code to zero the interior of a two-dimensional array. I can't be bothered to work out something that the _compiler_ should be responsible for (not at this time of night anyway). */ What ever you come up with will either _not_ be as clear as the first above, or it will use some macro (whose definition won't be clear). Either way, I think my preference is _more_ than just personal taste. (And, if it _is_ just personal taste, I think that those who share it will outnumber those who don't - and very probably those who share it will out-program those who don't.) J. Giles
jlg@lambda.UUCP (Jim Giles) (04/12/90)
From article <8960015@hpfcso.HP.COM>, by mjs@hpfcso.HP.COM (Marc Sabatella): > [...] > I must come from a different background than you do. Where I come from, > pointers are used for things like linked lists, binary trees, etc. Where I come from, all those things are called recursive data structures. These are _sometimes_ implemented using user-visable pointers (but only if the language doesn't provide them directly). If the language provides recursive structures directly, even then, the implementation _may_ use pointers internally (or it may not). Pointers are related to recursive data structures only in that pointers are _one_ of the possible implementation strategies for them. > Arrays can be used for this only if you know a priori the ultimate size of > the data structure in question, or if your language allows dynamically > reallocatable arrays. The second choice is something I wholeheartedly support. J. Giles
jlg@lambda.UUCP (Jim Giles) (04/12/90)
From article <7=U2P58xds13@ficc.uu.net>, by peter@ficc.uu.net (Peter da Silva): > In article <14327@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: >> However, you have _NOT_ demonstrated that pointers which obey bounds are >> more general that arrays (they aren't). > > Fine. How do you implement dynamic storage with arrays? That's EXACTLY MY POINT!!! Pointers are for dynamic memory implementation (but, not necessarily visable to the user of the allocation mechanism). At least, that's the one application that I'm really certain that pointers should be used for. I can't think of many others. Pointers which obey bounds are worthless as dynamic memory implementors. The dynamic memory allocator is what _sets_ the bounds - be pretty silly to force it to obey them. > [...] >> main(){ >> int A[10][20]; >> ... >> sub1(A); >> ... >> } >> sub1(pa) >> ... /* somewhere here there should be a declaration >> which allows me to treat pa as a multiply dimensioned >> array. I hope you can show it to me. Everyone >> else claims that I have to declare a macro to do the >> subscript calculation (or, shudder, do the calculation >> by hand). */ > > int (*pa)[10][20]; > > (*pa)[4][5]; Nope, Nope, Nope, .... how does the guy who wrote 'sub1' know that the array that will be passed has bounds 10 and 20? Answer: he doesn't. In fact, the routine might be called many times - with a differently bounded array each time. Keep looking! You may find the answer yet! J. Giles
mph@lion.inmos.co.uk (Mike Harrison) (04/12/90)
In article <14341@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: >From article <8960015@hpfcso.HP.COM>, by mjs@hpfcso.HP.COM (Marc Sabatella): >> [...] >> Arrays can be used for this only if you know a priori the ultimate size of >> the data structure in question, or if your language allows dynamically >> reallocatable arrays. > >The second choice is something I wholeheartedly support. Sounds like another vote for Algol 68 :-) Mike, Michael P. Harrison - Software Group - Inmos Ltd. UK. ----------------------------------------------------------- UK : mph@inmos.co.uk with STANDARD_DISCLAIMERS; US : mph@inmos.com use STANDARD_DISCLAIMERS;
peter@ficc.uu.net (Peter da Silva) (04/12/90)
> Pointers which obey bounds are worthless as dynamic memory implementors. > The dynamic memory allocator is what _sets_ the bounds - be pretty silly > to force it to obey them. Since you can't implement a dynamic memory allocator portably (of course not: it has to know what pointers look like and how to request blocks from the O/S) who cares? But suppose you want to implement a pool of fixed-sized memory between malloc and the user program (say, you want to allocate 27-byte and 192-byte chunks all the time, and rarely other sizes)... with arrays? > Nope, Nope, Nope, .... how does the guy who wrote 'sub1' know that the > array that will be passed has bounds 10 and 20? Answer: he doesn't. But that's not what you asked. > In fact, the routine might be called many times - with a differently > bounded array each time. Keep looking! You may find the answer yet! Why? So this particular thing is easier in Fortran than in C (or, I might add, in a bunch of other languages)... so what? I'd rather put up with a minor inconvenience (and it is minor... the macro to do dynamic sizing is TRIVIAL) than have to put up with all the things you just *can't* do in Fortran. Then again, I could work in some less-common language and find it quite impossible to even *get* compilers for most of the machines I use, let alone *compatible* ones. Fortran and C are about the only widely available languages with reasonably complete portable subsets. -- _--_|\ `-_-' Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>. / \ 'U` \_.--._/ v
brnstnd@stealth.acf.nyu.edu (04/13/90)
In article <14330@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: > I want pointers to behave as addresses > within the linear address space presented to the user by the system. Ugh. I want pointers to behave as addresses within particular objects, like ANSI says they do. I don't care about the arrangement of those objects within memory; why do you? (Leave device drivers aside for this discussion; they're neither portable nor meant to be portable.) ---Dan
machaffi@fred.cs.washington.edu (Scott MacHaffie) (04/13/90)
In article <20083@megaron.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes: >Obviously the triple method is only intended to be used for code >developement, the finished product would be compiled without bounds >checks of any kind. So, you suggest that people ship code which they haven't tested? If people are testing code with bounds checking turned on and sell it with bounds checking turned off, they're going to have problems. Scott MacHaffie
gudeman@cs.arizona.edu (David Gudeman) (04/13/90)
In article <14340@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: >... Using C, with you 3-tuple >idea, write the following routines: brk(), sbrk(), malloc(), and >free(). You won't succeed. This is because _at_least_one_ of these >routines has to have direct explicit access to the internal representations >of your 3-tuples (something has to be responsible for setting them). The problem has not changed at all. I wouldn't know how to write those functions portably with "normal" pointers either unless (1) there is a system function of some sort to tell me where to allocate from (usually brk() or sbrk()) or (2) I declare my own array to allocate from. Either way, the code is identical for "normal" pointers as for bounds-checked pointers. Any code that has well-defined behavior with normal pointers has exactly the same well-defined behavior with bounds-checked pointers. If the semantics has changed at all, it is only in the sense that some behavior that was previously undefined is now defined to produce an error result. >And I guess it's just a matter of taste that I consider: > > for (i = 1; i < N-1; i++ ) > for (j = 1; j < M-1; j++ ) > A[i][j] = 0; > >is better than: > > for (p = A; ???; ???) > /* Oh what the hell. I don't feel like working it out. Why do these multi-dimensional arrays keep popping up? I don't think anyone has said that pointers make a good indexing method for multi-dimensioned arrays. OK, I agree, pointers are not usually a very good way to traverse multi-dimensional arrays. That observation, however has nothing to do with the larger question of whether it is useful to have pointers (and pointer arithmetic) in a language. Incidentally, you chose one of the few examples where I wouldn't mind using pointers to traverse a multi-dimensional array: for (p = (t *)A; p < A + M * N; p++) *p = 0; Of course, you should expect to loose a few points for "obscure code" on your grade... -- David Gudeman Department of Computer Science The University of Arizona gudeman@cs.arizona.edu Tucson, AZ 85721 noao!arizona!gudeman
ske@pkmab.se (Kristoffer Eriksson) (04/13/90)
In article <14327@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: >You have 'demonstrated' no such thing. You have proposed an implementation >which would _make_ my first statement false. You have asserted (perhaps >correctly) that the new C standard actually supports your interpretation. >However, you have _NOT_ demonstrated that pointers which obey bounds are >more general that arrays (they aren't). Pointers can still be assigned to, and thus point to or into different arrays or structures at different times. Arrays can't. Just because you have bounds in the pointers, doesn't mean that a particular pointer variable has to contain the same bounds at every time. > You have _NOT_ demonstrated that pointers _should_ obey bounds. Of course they "should". At least the C standard explicitely leaves room for bounds checking. You are not required not to check bounds. Thus, you may choose freely whether to do it or not. If you like bounds checking in general, naturally you want it on pointers too. >In fact, what you are recommending is a constraint on pointers which >removes one of the _only_ features that I thought pointers were useful >for. I don't know what you are thinking of here. If it is possible for pointer variables to point at different objects at different times, and there is a mechanism for obtaining pointers with smaller bounds from pointers with larger bounds, for instance some kind of casting, I can't imagine what more you could posible desire. > I already had a pretty low opinion about the value of pointers >and you are busy destroying what respect I had left. If it is your >intent to make pointers seem _less_ useful than they might be, you >are succeeding. It seems to me that here, you are fooling yourself somehow. Either you had unrealistic views of how pointer could be used, to start with (but I haven't seen any other evidence for that), or you are somehow misinterpreting the effects of putting bounds into the pointers. >main(){ > int A[10][20]; > ... > sub1(A); > ... >} > sub1(pa) >Actually, of course, C has no way of doing this. The procedure can't >find out the dimensions of the argument. And even if I were to pass >the values 10 and 20 as separate arguments, the C declaration rules >would not permit me to declare 'pa' as "int pa[bnd1][bnd2]". Doesn't GNU C allow that kind of variable declarations? Ok, they're not allowed by the C standard, but there's no inherent difficulty in adding them. >No, 'pa' is isomorphic to a one-dimensional array with bound 10*20==200. >You can move 'pa' around linearly by adding and subtracting to it, but >you can _never_ double subscript it. As has been pointed out, it's never been a problem to double subscript function arguments, if you declare them like arrays with constant sizes, and expanding the language to allow variable sizes wouldn't change the pointer concept of C in the least, thus this particular point is not inherent to pointers. Maybe we should try not to confuse the general usefulness of pointers and what particular features for using pointers are included in the C-standard at one particular moment in time. Also, when you for instance say that C should be changed to include real arrays rather than bounds-checking pointers, don't you rather mean _non-aliased_ and constant references (whether arrays or pointers acting as arrays) ? Just because aliased arrays are disallowed in Fortran, does that make non-aliasing an inherent feature of arrays? Isn't this, again, confusing arrays with rules that apply to arrays in one (or several) particular language? Why don't you simply say that you'd like C to include non-aliased arrays, to be able to optimize the particular kind of applications that you are involved in, and leave it at that, rather than saying that pointers are bad and C is bad and Fortran or anything else is better. Can't we just agree that C and Fortran have radically different historical backgrounds? I'd say that on many points you are technically right in what you say, except, sometimes, about C details, but your _perspective_ is what upsets most people. The views about what is important in a particular language, what is useful, what should be required by every language regardless of background, and so on. -- Kristoffer Eriksson, Peridot Konsult AB, Hagagatan 6, S-703 40 Oerebro, Sweden Phone: +46 19-13 03 60 ! e-mail: ske@pkmab.se Fax: +46 19-11 51 03 ! or ...!{uunet,mcvax}!sunic.sunet.se!kullmar!pkmab!ske
seanf@sco.COM (Sean Fagan) (04/13/90)
In article <14340@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: >No. And I'll test your claim right here. Using C, with you 3-tuple >idea, write the following routines: brk(), sbrk(), malloc(), and >free(). You won't succeed. This is because _at_least_one_ of these >routines has to have direct explicit access to the internal representations >of your 3-tuples (something has to be responsible for setting them). Even in current implementations, such things either a) are written in assembly (i.e., brk is a system call in SysV), or b) call some other routine which is written in assembly (e.g., malloc calls sbrk calls brk, which is written in assembly). Tell me, is all of the FORTRAN stuff you're so fond of going to be written in FORTRAN? Somehow, I doubt it. >your >3-tuple prevents me from doing so with C's pointers, You *can't* write brk in C only, as far as I know! You can get awfully close, but it's still kinda difficult. -- -----------------+ Sean Eric Fagan | "It's a pity the universe doesn't use [a] segmented seanf@sco.COM | architecture with a protected mode." uunet!sco!seanf | -- Rich Cook, _Wizard's Bane_ (408) 458-1422 | Any opinions expressed are my own, not my employers'.
new@udel.EDU (Darren New) (04/13/90)
In article <20197@megaron.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes: >The problem has not changed at all. I wouldn't know how to write >those functions portably with "normal" pointers either unless (1) >there is a system function of some sort to tell me where to allocate >from (usually brk() or sbrk()) or (2) I declare my own array to >allocate from. Either way, the code is identical for "normal" >pointers as for bounds-checked pointers. Except that you lose the bounds checking in the array case. I.e., if you malloc(50) and then access element 52, as long as it is within your static array you won't get any violation. Also the case with sbrk() uless you can somehow set the bounds. >> for (i = 1; i < N-1; i++ ) >> for (j = 1; j < M-1; j++ ) >> A[i][j] = 0; > for (p = (t *)A; p < A + M * N; p++) *p = 0; These two are not identical. Note that the array-case does not clear the "outside edge" (0 or N) of the array (assuming that is what was meant). -- Darren
gudeman@cs.arizona.edu (David Gudeman) (04/14/90)
In article <11423@june.cs.washington.edu> machaffi@fred.cs.washington.edu (Scott MacHaffie) writes: >In article <20083@megaron.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes: >>Obviously the triple method is only intended to be used for code >>developement, the finished product would be compiled without bounds >>checks of any kind. > >So, you suggest that people ship code which they haven't tested? Please explain how you derived your conclusion from my statement. I see no relationship. -- David Gudeman Department of Computer Science The University of Arizona gudeman@cs.arizona.edu Tucson, AZ 85721 noao!arizona!gudeman
gudeman@cs.arizona.edu (David Gudeman) (04/14/90)
In article <16777@snow-white.udel.EDU> new@udel.EDU (Darren New) writes: >In article <20197@megaron.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes: >>[implementing malloc() with 3-tuple pointers] > >Except that you lose the bounds checking in the array case. I.e., if >you malloc(50) and then access element 52, as long as it is within your >static array you won't get any violation. Also the case with sbrk() >uless you can somehow set the bounds. Lets try to gain some perspective here. The whole 3-tuple thing started because Jim Giles was stomping on C's pointer arithmetic and one of his alleged criticisms was that pointers that access arrays could not be bounds checked, and were therefore inherently less "safe" than indexes. The 3-tuple technique was proposed to show that this argument is wrong. Now, since that has been disposed of, Jim has retreated to the position that 3-tuple pointers are not as general as regular pointers. How this relates to the use of pointers-instead-of-indexes is beyond me, since the bounds checking was only intended to answer that single criticism. Even so however, the assertion was wrong, and I pointed it out. Now someone points out that a malloc() implemented in a C with bounds checked pointers will not return bounds checked pointers (except possibly that pointers will be bounded to the whole region). What does this have to do with either (1) pointers-instead-of-indexes or (2) whether the semantics of bounds-checked pointers is compatible with the semantics of unchecked pointers? What can I say except (1) try to implement malloc() with arrays and indexes to get bounds-checked pointers (or indexes) and (2) the malloc() implemented with 3-tuple pointers has the same semantics as the malloc() implemented with unchecked pointers. If you want malloc() to return bounds-checked pointers, you obviously can't implement it entirely in C, whether or not the C pointers are bounds checked. >>> for (i = 1; i < N-1; i++ ) >>> for (j = 1; j < M-1; j++ ) >>> A[i][j] = 0; >> for (p = (t *)A; p < A + M * N; p++) *p = 0; > >These two are not identical. Note that the array-case does not clear >the "outside edge" (0 or N) of the array (assuming that is what was meant). I didn't notice the array bounds. So I wouldn't use a pointer in this case, I would use array indexes. As I've said before, there are some jobs that are better done with arrays and some jobs that are better done with pointers. -- David Gudeman Department of Computer Science The University of Arizona gudeman@cs.arizona.edu Tucson, AZ 85721 noao!arizona!gudeman
barmar@think.com (Barry Margolin) (04/14/90)
In article <14330@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: >On any machine (even segmented ones) the memory model presented to >the user code is a contiguous memory address space. Not on the machine I use, a Symbolics Lisp Machine. The memory model presented to user code is object-oriented. It's possible to get actual numeric memory addresses for objects, but these are almost useless outside low-level implementation routines as objects may move around in memory as a result of garbage collection. In fact, this whole 3-tuple thing is not just an academic discussion; it is actually used in the Symbolics C implementation. A C pointer is implemented (in software) as a pair <lisp-array, offset>, and lisp-array objects are implemented (by hardware and microcode) as tuples <type, base, dimensions, other-stuff>. The array reference instructions perform bounds checking in parallel with the memory reference so there's very little overhead. They *could* implement bounds checking of pointer arithmetic, but they haven't; it will detect most attempts to indirect through pointers that exceed their bounds (the one case where it doesn't is with pointers to automatic data -- all the automatic variables in an activation record are stored in one lisp-array, so it will only detect attempts to reference outside the stack frame to which the pointer points, and thus the yucky int *p, a, b, c; for (p=&a; p++; p<=&c) ; will probably "work"). -- Barry Margolin, Thinking Machines Corp. barmar@think.com {uunet,harvard}!think!barmar
brnstnd@stealth.acf.nyu.edu (04/15/90)
In article <1990Apr10.215213.18124@diku.dk> njk@diku.dk (Niels J|rgen Kruse) writes: > > To test your model of bounds-checking in C, what are the bounds of > > foo->buf > > after > > struct v { int bar; char buf[1]; } *foo; > > foo = malloc (sizeof (struct v) + 2); malloc() returns a char pointer bounded between the bottom and the top of the allocated memory. foo then gets a struct v pointer with the same bounds. buf is an end-of-structure array (hooray for this special case) and hence is (when used as a pointer) a char pointer with lower bound sizeof(int) + (char *) foo and upper bound the end of foo. This works perfectly. ANSI's pointer semantics could be defined solely in terms of (pointer,lower,upper). ---Dan
brnstnd@stealth.acf.nyu.edu (04/15/90)
In article <14340@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: > Using C, with you 3-tuple > idea, write the following routines: brk(), sbrk(), malloc(), and > free(). malloc() and free() are provided as library routines, so under a 3-tuple implementation of pointers they're defined to use the correct bounds. (Of course, 3-tuples are invisible to compliant programs.) They don't need to be implemented in portable C, as long as they act correctly. But I understand what you're aiming at, and I'll answer on that level. At least one of brk() and sbrk() must be provided by the OS for malloc() to get any memory in the first place. Now let's assume malloc(n) has set up its data structures, found a character array, and made a char pointer x that points to the n requested characters. The bounds of x are the bounds of the memory array malloc() took from sbrk(); the problem is to change its bounds into x lower, x + n upper. C doesn't provide any explicit casts to do the job; is that your point? In a more powerful C-like language, malloc() would finish off with return (*(char [n] at x)) x; ``char [n]'' is the type for character arrays of size n; ``at x'' is the array type qualifier for arrays starting at x; and so malloc() returns a pointer to a character array of size n---i.e., the 3-tuple (x,x,x+n). > > for (p = A; p < A + A_SIZE; p++) foo(*p); > > is more elegant than > > for (i = 0; i < A_SIZE; i++) foo(A[i]); > And I guess it's just a matter of taste that I consider: > for (i = 1; i < N-1; i++ ) > for (j = 1; j < M-1; j++ ) > A[i][j] = 0; > is better than: > for (p = A; ???; ???) [ proceeds to complain that he doesn't understand what to do ] Come on, Jim. foo A[N][M]; foo (*p)[M]; foo *q; for (p = A; p < A + N; p++) for (q = &(*p)[0]; q < &(*p)[0] + M; q++) *q = 0; I'm assuming array bounds of (0,0) through (N - 1,M - 1); you seem to have taken (1,1) through (N - 2,M - 2). Whatever. > What ever you come up with will either _not_ be as clear as the first > above, or it will use some macro (whose definition won't be clear). Care to repeat that, now that you've seen how simple the solution is? Ada types probably wouldn't mind using macro FIRST(p) for (&((*p)[0])); that definition is perfectly clear, and for (p = A; p < A + N; p++) for (q = FIRST(p); q < FIRST(p) + M; q++) *q = 0; is very readable. It also provides an advantage that you still haven't admitted: namely, that p and q can be bound-checked the moment they're assigned values. ---Dan