fouts@lemming.nas.nasa.gov.nas.nasa.gov (Marty Fouts) (09/21/88)
From the point of view of a language designer, one way to think of arrays is that they provide a notation for naming a block of contiguous memory and a mapping references to single elements of that block. Consider the Fortran array A(2,2) (using 1 based indexing.) This array names 4 consecutive locations: Some_Address + 0 A(1,1) Some_Address + 1 A(2,1) Some_Address + 2 A(1,2) Some_Address + 3 A(2,2) This approach generalizes to multiple dimensions, structures don't have to fit into a single memory word, and indices can range from any minimum to any maximum. In this model, an Array is precisely a rectangular array with adjacent elements. This is exactly equivalent to using pointer arithmetic to accomplish the naming, in which the arithematic is given using the classic offset formulae: Pointer = Some_Address Address_Of(A(i,j)) = Pointer + (i-1) + ( (j-1) * 2 ) It doesn't matter if your language allows you to explicitly express the latter form or not, if you implement arrays as blocks of memory, the equivalence exists. [It is in fact, how the compiler does array reference calculations without optimization in most languages. . .] You can notice that (j-1) * 2 requires a multiplication. (Bit shift in this case, but multiplication if the array dimension isn't a power of two.) This can be avoided by building a table of precomputed addresses, called a dope vector. Some compilers do this as an optimization for multiple dimension arrays, to avoid a lot of runtime calculations. A 'Trick' many Fortran programmers use to avoid array index calculations when marching through a multiple dimension array is to use Fortran to "flatten" the array: PROGRAM XMPL PARAMETER *** Specify values for N, M, and L DIMENSION X(N,M,L) CALL INIT(X,N*M*L) *** etc *** END SUBROUTINE INIT(X,ISIZE) DIMENSION X(ISIZE) DO 10 I = 1, ISIZE X(I) = *** initialization value 10 CONTINUE RETURN END Of course, if your language has explicit address manipulation, you can avoid the subroutine call via P = &X(1,1,1) DO 10 I = 0, ISIZE - 1 *(P+I) = *** initialization value 10 CONTINUE You can also do your own 'dope vector' optimization: DO 30 I = 1, L DO 20 J = 1, M DO 10 K = 1,N *** Calculate using X(I,J,K) 10 CONTINUE 20 CONTINUE 30 CONTINUE can also be expressed as INDEX = 0 DO 30 I = 1, L DO 20 J = 1, M DO 10 K = 1,N *** Calculate using *(P+INDEX) INDEX = INDEX + 1 10 CONTINUE 20 CONTINUE 30 CONTINUE an optimization one can argue that the compiler should make, but not all compilers do. Also, you can write your own function for mapping names to physical address and implement sparse matrices, triangular matrices, etc, in a way you can control. You can also generalize all of these things. Current Fortran doesn't give you the kind of control to do this, because it has no mechanism for explicit control of name to address mapping and aliasing at this level. Current C does some of this, although not all of it well. In addition, current C adds more general addres to name aliasing functionality. Contrary to some of the arguments in this list, from the point of view of a language designer, pointers/arrays serve very much the same purpose. contiguous arrays are a compact special purpose mechanism for annotating a mapping from names to addresses. Pointers also provide such a mechanism, plus give additional semantics. C recognizes the existing equivalance by allowing the two syntactic structures which express the same semantics to be used interchangably. This is a notational convenience which can make algorithms easier to read. (Also harder, it is still possible to write unreadable code in any language.) To see the value of multiple syntax for the same semantics, consider the result (proven long ago) that the only structures needed to express control in sequential programs are sequence, branching, and loop-while-true. Most languages offer several variations on branching and looping, for the notational use of the programmer, and few people would argue that we should do away with all control structures except goto and while-condition-do-block. A language with pointers has more expressive capabilty (although, of course, it can't compute anything one without them can) than one which doesn't. This gives the programmer more power in expressing computation, at the expensive of providing more ways to make mistakes. Some programmers would prefer not to have access to the expressive power, as a way of avoiding making bugs. Others, myself included, are willing to take the chance. Marty +-+-+-+ I don't know who I am, why should you? +-+-+-+ | fouts@lemming.nas.nasa.gov | | ...!ames!orville!fouts | | Never attribute to malice what can be | +-+-+-+ explained by incompetence. +-+-+-+
lamaster@ames.arc.nasa.gov (Hugh LaMaster) (09/21/88)
In article <1023@amelia.nas.nasa.gov> fouts@lemming.nas.nasa.gov.nas.nasa.gov (Marty Fouts) writes: > >Contrary to some of the arguments in this list, from the point of view >of a language designer, pointers/arrays serve very much the same >purpose. contiguous arrays are a compact special purpose mechanism >for annotating a mapping from names to addresses. Pointers also You lost me here. Arrays are a particular type of data structure. They are no different from any other type of data structure. So, if I generalize your argument, you could say "any data structure is a compact mechanism for annotating a mapping from names to addresses." Taken to an extreme, I don't need data structures at all, just pointers and memory cells. Which is true, but... ALMOST everyone thinks data structures are much clearer and less error prone than pointers and raw memory, and Portability always suffers when you drag in raw memory. Contrary to popular belief, not all machines are byte addressable, and not all byte addressable machines allow arbitrary alignment. In fact, not all machines have a single contiguous address space. For all of these reasons, old Fortran programmers prefer to use pointers ONLY for passing the address of a data structure around, and not for other things. >Some programmers would prefer not to have access to the expressive >power, as a way of avoiding making bugs. Others, myself included, are >willing to take the chance. I have not seen anyone in this group suggest abolishing pointers. In fact, it turns out that many Fortrans have pointers, as some people were boasting. The argument is whether pointers should be used for anything fancier than as a pointer to a data structure. I suggest that, from the point of view of project management, other uses of pointers should be treated like assembly language code: OK, if you have to, but, confine it to a small number of replaceable modules. -- Hugh LaMaster, m/s 233-9, UUCP ames!lamaster NASA Ames Research Center ARPA lamaster@ames.arc.nasa.gov Moffett Field, CA 94035 Phone: (415)694-6117
fouts@lemming.nas.nasa.gov.nas.nasa.gov (Marty Fouts) (09/22/88)
In article <15269@ames.arc.nasa.gov> lamaster@ames.arc.nasa.gov.UUCP (Hugh LaMaster) writes: > >You lost me here. Arrays are a particular type of data structure. They >are no different from any other type of data structure. So, if I >generalize your argument, you could say "any data structure is a >compact mechanism for annotating a mapping from names to addresses." >Taken to an extreme, I don't need data structures at all, just pointers >and memory cells. Which is true, but... > I haven't lost you. That's exactly my point. Now add to it the notational convenience of using pointers when they make the code clearer to read then array references. All programming can be done with a Turing machine, but we go for notational convenience >Portability always suffers when you drag in raw memory. Contrary to >popular belief, not all machines are byte addressable, and not all >byte addressable machines allow arbitrary alignment. In fact, not all >machines have a single contiguous address space. For all of these >reasons, old Fortran programmers prefer to use pointers ONLY for passing >the address of a data structure around, and not for other things. > Yes, you can write bad code with pointers. That's not my point. I have never said to use *raw memory addresses*. I haven't mentioned byte addressing either. Nor have I mentioned 'a single contiguous address space.' *NONE* of these things are relevant to the correct use of pointers in addressing arrays. Pointer arithmetic is done in "thing" sized units, and it doesn't matter if the thing is 1/8 of a word, 1 word, or N words long. If you don't misuse pointers, you don't see the underlying addressing mode. Also, in this context, pointers are used to address that part of memory which is a contiguous address space --- After all, we are talking about pointers into Fortran style arrays, which are contiguous. Not a single such address space, but many, one for each array. Don't generalize about old Fortran programmers. I am an old Fortran programmer, having written my first "High Level Language" code for an IBM 1620 and having written > 100K lines of Fortran; some of which was ported to many machines. I use pointers for all kinds of name aliasing, all of the time. >I have not seen anyone in this group suggest abolishing pointers. In fact, The context of 'expressive power' in the above was not abolishing pointers, it was abolishing the expression of pointer arithmetic/array naming equivalence in a programming language, which has been espoused in this list. >it turns out that many Fortrans have pointers, as some people were >boasting. The argument is whether pointers should be used for anything >fancier than as a pointer to a data structure. I suggest that, from the >point of view of project management, other uses of pointers should be >treated like assembly language code: OK, if you have to, but, confine it >to a small number of replaceable modules. > You are thinking of how to misuse pointers. I am thinking of how to correctly use them. Even Wirth, that bastion of safe-code, thought that pointers were tame enough for Pascal and Modula-2. I can write unmaintainable code using any facility (remember the GOTO debate and how the first round was finally ended by Knuth's well reasoned when to/ when not to paper?) or I can write elegant code, using any facility. +-+-+-+ I don't know who I am, why should you? +-+-+-+ | fouts@lemming.nas.nasa.gov | | ...!ames!orville!fouts | | Never attribute to malice what can be | +-+-+-+ explained by incompetence. +-+-+-+
jlg@lanl.gov (Jim Giles) (09/22/88)
From article <1030@amelia.nas.nasa.gov>, by fouts@lemming.nas.nasa.gov.nas.nasa.gov (Marty Fouts): > [...] Now add to it the > notational convenience of using pointers when they make the code > clearer to read then array references. [...] That is - _NEARLY_NEVER_!! A pointer is a way of placing the structural template defined by a data type over an arbitrary memory location. ^ PERIOD!! That's all a pointer is for. If I don't need that functionality, I also don't want any pointers! You're right in your argument: the purpose of a programming language is to provide convenient and readable access to the functionality of the machine. You're wrong in your conclusion: pointers (especially the C syntax) usually obscure the meaning of the code because they are used to implement functionality that's inappropriate to them. By the way, the use of pointers in the above way _is_ sometimes useful and should be provided in the language. But it shouldn't be used to implement other things for which it is _not_ appropriate. J. Giles Los Alamos
fouts@lemming.nas.nasa.gov.nas.nasa.gov (Marty Fouts) (09/22/88)
In article <3968@lanl.gov> jlg@lanl.gov (Jim Giles) writes: >From article <1030@amelia.nas.nasa.gov>, by fouts@lemming.nas.nasa.gov.nas.nasa.gov (Marty Fouts): >> [...] Now add to it the >> notational convenience of using pointers when they make the code >> clearer to read then array references. [...] > >That is - _NEARLY_NEVER_!! A pointer is a way of placing the structural >template defined by a data type over an arbitrary memory location. > ^ >PERIOD!! That's all a pointer is for. NO. NO. NO. (Gee, I can capshout too ;-) A pointer is also a mechanism by which I introduce an arbitrary data structure into my language, allowing me to easily express data structures that the original language designer didn't think to provide. As Wirth pointed out in the Pascal design, structure+new/dispose+pointer-to-structure allows construction of arbitrary data structures. Unless, of course, you believe that you can correctly identify and supply all of the data structures that I'm going to need, and you want to burden all programers with being aware of all these data structures. Consider the case of FIFOs. I can fairly easily implement a fixed length FIFO using an array, but I grew up on data structures texts where it was always done with a linked list and to me such an approach produces more readable code. (Your milage may vary, but such is the notation I'm use to.) Marty +-+-+-+ I don't know who I am, why should you? +-+-+-+ | fouts@lemming.nas.nasa.gov | | ...!ames!orville!fouts | | Never attribute to malice what can be | +-+-+-+ explained by incompetence. +-+-+-+
bga@raspail.UUCP (Bruce Albrecht) (09/23/88)
In article <1030@amelia.nas.nasa.gov>, fouts@lemming.nas.nasa.gov.nas.nasa.gov (Marty Fouts) writes: > You are thinking of how to misuse pointers. I am thinking of how to > correctly use them. Even Wirth, that bastion of safe-code, thought > that pointers were tame enough for Pascal and Modula-2. I can write > unmaintainable code using any facility (remember the GOTO debate and > how the first round was finally ended by Knuth's well reasoned when > to/ when not to paper?) or I can write elegant code, using any > facility. You're only partially right here. Very few strongly typed languages (and neither Pascal nor Modula-2 are in this category) have any pointer operations other than assignment and comparison. The pointer increment and decrement operations are the major reason why many people are against the way C uses pointers as a way to index through arrays. It's interesting that you compare the use of pointers to the use of GOTO. The argument against the GOTO is that unrestricted use of them tended to result in unreadable, unmaintainable code. I think that using pointer increment/decrement tends to produce the same result. It's more difficult to provide bounds checking in C if you are using pointers to traverse the array elements, and I suspect code is more obscure if the code is not accessing each element sequentially. Nearly any random access to an array is going to be more clearly represented using indices than by using pointers.
ok@quintus.uucp (Richard A. O'Keefe) (09/23/88)
In article <3968@lanl.gov> jlg@lanl.gov (Jim Giles) writes: >That is - _NEARLY_NEVER_!! A pointer is a way of placing the structural >template defined by a data type over an arbitrary memory location. > ^ >PERIOD!! That's all a pointer is for. That's a very strange way of thinking about pointers. It is possible to have pointers without having pointer arithmetic and without having any way of obtaining the address of an existing location. For example, in one of my favourite languages, I would write REF [,] REAL two d array = HEAP [1:m,1:n] REAL; which allocates a new m x n array and binds the reference two d array to it. [Thanks to "dereferencing", the fact that two d array is a reference can be ignored in the code that uses it. This is not an unqualified good.] A pointer is the way of referring to a dynamically allocated object, ... ^ COMMA!! A pointer is for lots of other things too. But "placing a structural template" sounds more like EQUIVALENCE (:-)...
jlg@lanl.gov (Jim Giles) (09/24/88)
From article <364@raspail.UUCP>, by bga@raspail.UUCP (Bruce Albrecht): > [...] > It's interesting that you compare the use of pointers to the use of GOTO. It is indeed. Hoare pointed out that when you compare control structures to data structures, the GOTO is isomorphic to the pointer. I don't think this observation is original to Hoare, but it's true whoever originated it. J. Giles Los Alamos> The argument against the GOTO is that unrestricted use of them tended to > result in unreadable, unmaintainable code. I think that using pointer > increment/decrement tends to produce the same result. It's more difficult > to provide bounds checking in C if you are using pointers to traverse the > array elements, and I suspect code is more obscure if the code is not > accessing each element sequentially. Nearly any random access to an array > is going to be more clearly represented using indices than by using pointers.
jlg@lanl.gov (Jim Giles) (09/24/88)
From article <463@quintus.UUCP>, by ok@quintus.uucp (Richard A. O'Keefe): > [...] > A pointer is the way of referring to a dynamically allocated object, ... > ^ > COMMA!! A pointer is for lots of other things too. > But "placing a structural template" sounds more like EQUIVALENCE (:-)... But there are _better_ ways of doing all those other things - ie. with a separate feature for each thing instead of overloading all those completely different features onto the same construct. I can do all numerical work with just integers, but I don't - I _want_ the language to also support floating point, bit, complex, etc.. In the same way, I _want_ the dynamic memory feature to be different from the linked-list (recursive data type) feature to be different from.... A pointer isn't even the _best_ way of referring to a dynamically allocated object (I'd rather just refer to it as a 'whatevertype' variable, not as a 'pointer_to_whatevertype'). The bottom line is that a user visable pointer is just another source of possible error - something I can do without. J. Giles Los Alamos