brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (02/03/91)
C supports several elementary data structuring mechanisms: pointers, structs, unions, arrays. It's possible to implement any of the usual higher-level mechanisms---lists, generalized Lists, sets, relations, whatever---in terms of the lower-level mechanisms. But you shouldn't forget that you started with the higher-level mechanism, because you can often optimize computation or register use more effectively with that knowledge than if you simply regarded the program as a big pile of pointer, struct, and array manipulations. So: What kinds of assertions should the compiler understand in order to effectively optimize higher-level data structures expressed in C? And what particular assertions are useful for specific operations on specific data structures? For example, in doubly-linked list insertion it's useful to tell the compiler that a node is not the same as either of its neighbors. For this it's enough to have invariants like ``p != p->next''. Note that I am asking these questions as a language designer: I want constructive suggestions that I can use in Q, rather than complaints about how difficult it is to handle assertions like this in C or Ada or whatever language. I'm a little worried about bringing this issue up after the last set of pointer-versus-array flame wars, but I figured there was at least a hope of some useful technical discussion that time, and it's worth trying again. I hope that we can avoid Jim Giles's confusion between syntax and semantics by asking purely semantic questions like ``What assertions help optimization?'' One last meta-comment: Jim Giles implied in the last discussion that a language that ``understands'' some high-level data structures will be more efficient than C plus assertions. I find this position quite silly, as all optimizations must eventually be expressed in terms of machine language, which has only pointers and numeric types. I can't believe that there are any optimizations for machine-language target code which can't be expressed just as effectively as assertions in a lower-level language. But I also realize that arguments about issues like this are pointless. If I'm right, then I don't have to prove it with some theoretical argument; all that matters is that every piece of code Jim writes in his high-level language, the rest of the world can write just as efficiently in an easy-to-use, low-level language. And that's why I'm restarting this discussion. ---Dan