jlg@lanl.gov (Jim Giles) (09/23/88)
From article <1032@amelia.nas.nasa.gov>, by fouts@lemming.nas.nasa.gov.nas.nasa.gov (Marty Fouts): > [...] > NO. NO. NO. (Gee, I can capshout too ;-) A pointer is also a > mechanism by which I introduce an arbitrary data structure into my > language, allowing me to easily express data structures that the > original language designer didn't think to provide. [...] I have never seen a data structure that couldn't be described without pointers. The first book I read on data structuring was "Structured Programming" by Dahl, Dijkstra, and Hoare [1972 - I read it for the first time that very year]. Hoare's chapter on data structuring contains a complete listing of the mechanisms needed to define any structure that I've ever seen anyone use. The language should have all of these in addition to the basic types of integer and float that it decides to support. 1) Unstructured Data Types: This was Hoare's name for what Pascal calls an enumerated data type. Example: type suit = (club,diamond,heart,spade) 2) The Cartesian Product: Hoare's name for what Pascal calls records and C calls structures. Example: type cardface = (s:suit, r:rank) 3) The Discriminated Union: Hoare's name for Pascal's variant records. I believe even C has this concept too :-). Example: type pokercard = (normal:cardface, wild:(joker1, joker2)) 4) The Array: Yes, array is a separate data type. It's not a structure built up from lists - it _should_ be a distinct type. (Multi-d arrays should also be allowed and not implemented as array-of-array-....) 5) The Powerset: Hoare's name for Pascal's sets. Example: type hand = powerset of cardface A 'hand' may contain from zero up to all 52 cards. 6) The Sequence: Strings, sequential files, etc.. A sequence is a data structure made up of an ordered list of data items all of the same type. Sequences have a property of length, which can be explicitly kept with the sequence or the sequence may have a distinct member which signals the end. Example: type string = sequence of character 7) Recussive Data Structures: Linked lists, etc.. These _are_ defined and used in some high-level languages _without_ user visable pointers. Example: type list = (value:integer, link:list) A variable of any recursive data type _may_ have no value at certain times during the execution of a code - it should be possible to check for this condition. _NONE_ of these require user accessable pointers not even the linked list! Items of type 'list' have values of type 'list' - _not_ values of type 'address_of_list'. In no case may the user of any of these data type alter the memory location of a data object (without copying the data). With the addition of an allocate/deallocate mechanism (which is needed for the recursive data structures anyway - might as well make it available to all data types), the set is capable of representing _any_ data structure that I've ever seen. Now, indeed, the _internal_ representations of these data structures may ABOUND with pointers. It is neither the business nor the desire of most users to concern themselves with the internal implementation of these things (any more than most users care what type of semiconductor logic is used in the machine's hardware). Note: Hoare's chapter goes on to describe how to use the above structures in combination in order to implement sparse versions of many of them - also without explicit pointer usage. So, back to my original statement, pointers are needed _only_ for the purpose of placing a data type over an arbitrary memory location. This is a perfectly useful function. I might, for example, want to treat a 27 word SEQUENCE as if it were a 3x3x3 ARRAY. I _could_ copy it to an array - but that's time consuming. I _could_ manipulate the sequence directly and do the index calculations explicitly - but that's hard to read and maintain. It is more readable - and faster - to define a pointer to a 3x3x3 array and set it to the address of the sequence. This use of pointers for 'structured type coersion' is perfectly acceptable and is the _only_ reason I can see for including pointers in a programming language (of course, a suitably constructed run-time EQUIVALENCE would serve the same function). J. Giles Los Alamos
fouts@lemming.nas.nasa.gov.nas.nasa.gov (Marty Fouts) (09/23/88)
In article <4026@lanl.gov> jlg@lanl.gov (Jim Giles) writes: >From article <1032@amelia.nas.nasa.gov>, by fouts@lemming.nas.nasa.gov.nas.nasa.gov (Marty Fouts): >> [...] >> NO. NO. NO. (Gee, I can capshout too ;-) A pointer is also a >> mechanism by which I introduce an arbitrary data structure into my >> language, allowing me to easily express data structures that the >> original language designer didn't think to provide. [...] > >I have never seen a data structure that couldn't be described without >pointers. Read the note you are responding to. I'm not claiming a structure that can't be described without structures. I'm claiming that many data structures and their use can be expressed in a very readable way by using pointers, and that I like that readability. I use that readability. Take as an example, a fifo (a structure I use sometimes in discrete event modeling.) I can implement a fifo using an array and two integers (to keep track of where I am). Using the convention that the head number records the first used entry in the queue and that the tail number records the first unused entry, except when they both have the same value, in which case the fifo is empty, I can write the two Fortran routines: SUBROUTINE ADDTO(FIFO,ISIZE,IHEAD,ITAIL,VALUE) DIMENSION FIFO(ISIZE) IF ((ITAIL .EQ. IHEAD) .AND. (IHEAD .NE. 0)) THEN C C FIFO FULL ERROR HANDLING C RETURN ENDIF ITAIL = ITAIL + 1 IF (ITAIL .GT. ISIZE) ITAIL = 1 FIFO(ITAIL) = VALUE IF (IHEAD .EQ. 0) IHEAD = 1 RETURN END FUNCTION REMOVE(FIFO,ISIZE,IHEAD,ITAIL,VALUE) DIMENSION FIFO(ISIZE) IF (ITAIL .EQ. IHEAD) THEN C C FIFO EMPTY ERROR HANDLING C RETURN ENDIF REMOVE = FIFO(IHEAD) IHEAD = IHEAD + 1 IF (IHEAD .GT. ISIZE) IHEAD = 1 IF (IHEAD .EQ. ITAIL) THEN IHEAD = 0 ITAIL = 0 ENDIF RETURN END I can also do the same thing using pointers (from C): typedef struct fifo_entry { real value; struct fifo_entry *next; } FIFO_ENTRY, *FIFO_ENTRY_PTR; typedef struct fifo { FIFO_ENTRY_PTR head, tail; } FIFO; void addto(fifo,new_value) FIFO fifo; real new_value; { FIFO_ENTRY_PTR new_entry; new_entry = (FIFO_ENTRY_PTR)malloc(sizeof(FIFO_ENTRY)); if (new_entry == 0) { /* MEMORY ERROR EXIT */ return; } new_entry->value = new_value; if (fifo.tail != 0) (fifo.tail)->next = new_entry; fifo.tail = new_entry; if (fifo.head == 0) fifo.head = new_entry; return; } real remove(fifo) FIFO fifo; { real return_value; FIFO_ENTRY_PTR old_entry; if (fifo.head == 0) { /* EMPTY FIFO ERROR EXIT */ return } old_entry = fifo.head; return_value = old_entry->value; fifo.head = fifo.head->next; free((char *)old_entry); } Neither of these is a good substitute for (Not available in any language I usually use) Declare fifo to be a fifo of real; and using 'fifo = x' to add x to the end of fifo and 'x = fifo' to remove x from the begining of the fifo. In the example, the two forms are (to me, because I've had a lot of experience with both) about equally readable. The Fortran form is somewhat more restricted because it implements a fixed length fifo, but that doesn't matter in the applications I run. I would prefer the third form, but if I have to chose between the first two, I would prefer the second, because it more closely maps to what I'm thinking about when I think of fifos. The code gets even messier with more difficult structures, like balanced binary trees. On the other hand, if you implement recursive structures using Hoare's notation, you have to think about recursive structures, and I just haven't been trained to do that. Try representing a dynamic length fifo using Hoare's structure and answering the questions: 1) What happens when I add to a fifo? 2) How do I represent this structure? 3) How do I keep track of the head/tail? 4) How do I detect an empty fifo? BTW, recursive structures are a very clumsy way to think about bags, which differ from sets by being allowed to contain duplicate entries, and about ordered sets/bags, which require the set to be maintained in a sequence which is externally enforced. In summary: Yes, you can have data types without pointers, either by forcing them on top of existing languages, or by using Hoare's recursive structure technique. I never said otherwise. I said that if I don't have the data types and have to build my own abstractions, I prefer to use pointers to do it. +-+-+-+ I don't know who I am, why should you? +-+-+-+ | fouts@lemming.nas.nasa.gov | | ...!ames!orville!fouts | | Never attribute to malice what can be | +-+-+-+ explained by incompetence. +-+-+-+
peter@ficc.uu.net (Peter da Silva) (09/23/88)
In article <4026@lanl.gov>, jlg@lanl.gov (Jim Giles) writes: > _NONE_ of these require user accessable pointers not even the linked list! > Items of type 'list' have values of type 'list' - _not_ values of type > 'address_of_list'.... This is an interesting arrangement. How do you propose to implement insertion into a doubly linked list without pointers: struct { struct list *prev, *next; data; } *p, *q; ... q->next = p; p->prev->next = q; q->prev = p->prev; p->prev = q; Whatever syntax would you use that would adequately hide the pointers and still let you express this? I am genuinely curious. -- Peter da Silva `-_-' Ferranti International Controls Corporation. "Have you hugged U your wolf today?" peter@ficc.uu.net
jlg@lanl.gov (Jim Giles) (09/24/88)
From article <1580@ficc.uu.net>, by peter@ficc.uu.net (Peter da Silva): > struct { > struct list *prev, *next; > data; > } *p, *q; > ... > q->next = p; > p->prev->next = q; > q->prev = p->prev; > p->prev = q; How about (in Fortran-like syntax): type dlink integer data dlink :: next, prev !recursive data declaration end type dlink dlink :: p, q ... q%next = p !Fortran 8x uses % for the structure ref p%prev%next = q !I know! I don't like % either q%prev = p%prev p%prev = q No pointers! Only variables of type 'dlink'. The advantage is that type checking will prevent me from assigning any address other that another variable of type 'dlink' to any of the links. With pointers, type checking is not performed (if it is, then some of the really useful things that pointers can do are lost) - I am stuck with the possibility of a link in my list pointing off into left field. I only want _that_ to be possible if I really might want to point into left field. J. Giles Los Alamos
peter@ficc.uu.net (Peter da Silva) (09/24/88)
In article <4096@lanl.gov>, jlg@lanl.gov (Jim Giles) writes: > From article <1580@ficc.uu.net>, by peter@ficc.uu.net (Peter da Silva): > > struct { > > struct list *prev, *next; > > data; > > } *p, *q; > > ... > > q->next = p; > > p->prev->next = q; > > q->prev = p->prev; > > p->prev = q; > type dlink > integer data > dlink :: next, prev !recursive data declaration > end type dlink > dlink :: p, q > ... > q%next = p !Fortran 8x uses % for the structure ref > p%prev%next = q !I know! I don't like % either > q%prev = p%prev > p%prev = q > No pointers! Only variables of type 'dlink'. So, how do you differentiate between copying pointers and copying structures? In 'C' I can say: extern struct doubly_linked_list empty_list; /* initialised to empty in runtime initialisation, say */ struct doubly_linked_list header; header = empty_list; /* initialise header by copying data in */ Now in Fortranoid: dlink :: emptylist common /initlist/ emptylist ! initialised to an empty list ! in runtime initialistion. dlink :: header header = emptylist ! Is this a pointer copy or a structure copy? Say what you mean and mean what you say. Before you ask if I ever use such things, the answer is 'yes'. In fact, I frequently copy list elements. Here is an example from a program I'm using on a real-time operating system (Browser on AmigaDOS): struct MessagePort *messageport; /* messageport contains a doubly linked list */ struct Message *msg, copymsg; /* message is a list element */ msg = GetMsg(messageport); if(I'm going to be using this message for a while) { copymsg = *msg; /* copies contents of msg */ ReplyMsg(msg); /* releases it for reuse */ msg = copymsg; /* keep the code tidy */ } use msg, without caring if it's local or not. if(msg != copymsg) ReplyMsg(msg); How would you do this without pointers, and without confusing the hell out of the syntax of your linked lists? And without making a subroutine call (in-line copies can be much faster... efficiency remember). -- Peter da Silva `-_-' Ferranti International Controls Corporation. "Have you hugged U your wolf today?" peter@ficc.uu.net
seanf@sco.COM (Sean Fagan) (09/26/88)
In article <4096@lanl.gov> jlg@lanl.gov (Jim Giles) writes: >How about (in Fortran-like syntax): > type dlink > integer data > dlink :: next, prev !recursive data declaration > end type dlink > dlink :: p, q > q%next = p !Fortran 8x uses % for the structure ref > p%prev%next = q !I know! I don't like % either > q%prev = p%prev > p%prev = q >No pointers! Only variables of type 'dlink'. The advantage is that >type checking will prevent me from assigning any address other that >another variable of type 'dlink' to any of the links. With pointers, >type checking is not performed (if it is, then some of the really useful >things that pointers can do are lost) - I am stuck with the possibility >of a link in my list pointing off into left field. I only want _that_ >to be possible if I really might want to point into left field. Um, to paraphrase, "a pointer by any other name would still point as sweet" (I'm a programmer, not a muse). Just because it doesn't look like a pointer doesn't mean it's *not* a pointer (and if it isn't, it is not useful in the above example). Pascal won't let you do anything other than what you're doing above. In fact, you could, probably, get rid of the '^' operator in Pascal, and you would have something like the above FORTRAN 8x code (except you would use '.' instead of '%'). You're right, over typechecked pointers are not as useful as non-typechecked pointers. Witness Pascal vs. C. However, most C compilers will warn about pointers being assigned to or from arbitrary objects (MSC will, for example, complain about malloc() being assinge to a struct X * without a type-cast). Also, in an earier posting, Jim complained about a lack of a standard for C. There is one, and has been one for quite a while: the White Book (aka the Bible, the Old Testament, K&R, and, of course, "The C Programming Language"). As far as I've been able to tell, any code that conforms to what they describe is portable to all C compilers anywhere (8086's, Pr0mes, Cybers, 3B's, Suns, etc.). It's just not as useful as going outside of the standard is. And, sad to say, FORTRAN has that same problem. In how many ways does VMS FORTRAN deviate from the standard F77? How about CDC NOS FORTRAN 5? Cray CFT? It's very easy to write portable code, and it's very easy to write useful code. It's just difficult, at times, to make the two intersect. -- Sean Eric Fagan | "ld runs like a turtle out of antartica" seanf@sco.UUCP | Steven Ryan (smryan@garth.UUCP) (408) 458-1422 | Any opinions expressed are my own, not my employers'.
ka@june.cs.washington.edu (Kenneth Almquist) (09/29/88)
jlg@lanl.gov (Jim Giles) writes: > I have never seen a data structure that couldn't be described without > pointers. > > [...] > > _NONE_ of these require user accessable pointers not even the linked list! > Items of type 'list' have values of type 'list' - _not_ values of type > 'address_of_list'. In no case may the user of any of these data type > alter the memory location of a data object (without copying the data). When *I* talk about a linked list, I mean (among other things) a data structure in which it is possible to insert an element at an arbitrary location in the list in time O(1). Jim doesn't say how the data structures he talks about being able to describe can be manipulated, but the last comment quoted above certainly suggests that inserting an element in a list using Hoare's mechanism requires copying the entire list up to the point of an insertion, so an insertion at an arbitrary point in the list requires time O(n). In article <4108@lanl.gov>, Jim gives an implementation of a fifo queue using an array. The standard implementation of a queue using a linked list requires time O(1) to add an element, while Jim's code has a worst case time of O(n) to add an element, where n is the number of elements in the queue. The whole point of chosing a data structure is to find one which will handle the operations you want to perform on it efficiently. If Hoare's method of describing data structures does not let us describe linked lists and queues with access times of order O(1), then it is only useful in contexts where efficiency doesn't matter. To quote again from article <4108@lanl.gov>: > Fifo's _can_ be implemented as recursive data structures, but arrays > with indices are easier and more efficient. But if your language supports pointers, implementing a queue using a linked list is easier than using an array. I'm pretty sure I could get a linked list implementation right the first time, whereas getting the array implementation right is considerably harder. Jim's code contains a number of bugs (e.g. queue%empty and queue%full are not set correctly, growing the array is done incorrectly). Fixing the bugs will increase the complexity of the code significantly, and it will *still* suffer from an O(n) worst case time. > Use the structure which best fits the problem. This means: rarely use > pointers In my view, pointers quite often fit the problem. I do see the value of abstract data types, which allow the user to define a queue data type once and then use it where ever a queue is needed, but abstract data types don't eliminate the need for pointers to allow queues to be implemented easily in the first place. Kenneth Almquist