jeff@aiva.ed.ac.uk (Jeff Dalton, JANET: J.Dalton@uk.ac.ed ) (01/01/70)
The discussion of EQ vs. EQUAL seems to be assuming that

 -- EQ on lists is just an efficiency hack.
 -- EQ is unreliable and implementation-dependent.

So EQ gets classed with goto as something that we'd be better off without. However, EQ is not just a fast-but-unreliable EQUAL. It is right to say that EQ is implementation-dependent, but it still has certain properties that allow it to be used in portable ways. Whether it is still useful to use EQ depends on other properties of the Lisp systems in question.

In the "usual" Lisps, there is an ordering of equality predicates. In Common Lisp, the order from most to least strict is: EQ, EQL, EQUAL, EQUALP. (I'm leaving out type-specific tests like = and CHAR=.) If a more strict predicate returns true for some arguments, all less strict predicates will also return true. So, if EQ returns true, this result is no more unreliable or implementation-dependent than one from EQUAL, because EQUAL will always return true as well. But, one might argue, EQUAL calls EQ first: if you trust EQ only when it returns true, you might as well use EQUAL instead. Here I agree.

So the question is: is it ever useful to make the kind of distinction made by EQ? The answer to this question will depend on what other operations are available in your Lisp. In a sense, EQ tests whether two objects are the same identical object. But how, apart from using EQ (or EQL), would one know whether objects were identical? One method would be to modify one of the objects and see whether the other changes too. If your Lisp includes RPLACA or RPLACD, two EQUAL conses can be distinguished by this means, and so it is possible to define a CONS-EQ that is stricter than EQUAL. CONS-EQ would be a useful test for "operational equivalence", not a fast-but-unreliable EQUAL.
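[Editor's sketch: the "modify one and see whether the other changes" test above translates into any language with object identity and mutation. Here is a Python analogy, where `is` plays the role of EQ and `==` the role of EQUAL; the helper name is illustrative, not from the post.]

```python
# Python analogy of EQ vs. EQUAL (a sketch, not Lisp):
# `is` compares object identity, like EQ; `==` compares structure, like EQUAL.
a = [1, 2, 3]
b = [1, 2, 3]   # structurally EQUAL, but a fresh object
c = a           # the same identical object

print(a == b, a is b)   # True False
print(a == c, a is c)   # True True

def operationally_identical(x, y):
    """The CONS-EQ idea: distinguish identity by destructive modification
    (like RPLACA).  Assumes non-empty lists."""
    saved = x[0]
    x[0] = object()          # install a unique marker
    same = y[0] is x[0]      # did y see the change?
    x[0] = saved             # undo the modification
    return same

print(operationally_identical(a, b))  # False: EQUAL but distinct objects
print(operationally_identical(a, c))  # True: one and the same object
```

Note that the test works without inspecting any implementation internals, exactly as the post argues: it relies only on the guaranteed behavior of mutation and identity.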
In the "usual" Lisps RPLACA and RPLACD *are* available, EQ returns true for conses that are the same under destructive modification (and false for conses that are different), and CONS always returns a new, EQ-unique, object each time it's called. Using EQ in this way works in all of these Lisps. You do not need to understand the internals of the implementation, only that EQ, CONS, etc. have certain properties.

Now, there are some who would like to get rid of RPLACA and RPLACD as well as EQ. And if you do so, it is no longer necessary for CONS to create a new object each time. It might ("hashed cons") quickly look to see if the right (EQUAL) thing already existed. And if CONS always did this, EQUAL for conses could just be a pointer comparison, as EQ is in more standard Lisps.

So the remaining question is to ask why anyone would want to keep the operations for destructive modification. One answer is that having objects that can change over time is an effective technique for modeling real-world objects that change over time. For example, if we represent a person as a Lisp object, and the person's address changes, we can follow the change by changing the Lisp object instead of making a new object with a different address (but otherwise the same) and then doing everything else required to give the new object the same role in the database that the old one had.

-- 
Jeff Dalton,                    JANET: J.Dalton@uk.ac.ed
AI Applications Institute,       ARPA: J.Dalton%uk.ac.ed@cs.ucl.ac.uk
Edinburgh University.            UUCP: ...!ukc!ed.ac.uk!J.Dalton
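[Editor's sketch: the modeling argument at the end of the post above, in Python terms. One mutable object shared by several data structures follows a real-world change without any rebuilding; the names (`Person`, the directory, the roster) are illustrative.]

```python
# A mutable object modeling a real-world object that changes over time.
class Person:
    def __init__(self, name, address):
        self.name = name
        self.address = address

fred = Person("Fred", "12 Elm St")
directory = {"Fred": fred}     # two structures sharing the same object
club_roster = [fred]

fred.address = "9 Oak Ave"     # one destructive update...

# ...and every structure that shares the object sees the change, with no
# copying and no work to give a new object the old object's role.
print(directory["Fred"].address)   # 9 Oak Ave
print(club_roster[0].address)      # 9 Oak Ave
```

The alternative the post describes, building a fresh object and re-threading it into every structure, is exactly what sharing plus mutation avoids.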
bart@reed.UUCP (Bart Massey) (07/19/87)
In article <3927@spool.WISC.EDU> neves@ai.WISC.EDU (David M. Neves) writes:
> In article <6601@reed.UUCP> bart@reed.UUCP (Bart Massey, me) writes:
> >At Reed college we are currently doing almost all of our lisp
> >programming in an in-house dialect which supports neither destructive
> >list functions nor dotted pairs.
> ...
> >In case anyone is interested, the following major changes in the
> >language were also implemented, and have proven useful:
> >
> >- Elimination of eq: Using a properly written version of equal, eq is
> >the first thing to happen anyhow, and we've never encountered a
> >situation where knowing that two lists were non-eq is "good enough".

I've gotten several replies already to this statement, so I thought I'd try to give a little more background and detail about what I meant...

There are three explicit Reed Lisp design goals that are relevant to this question. These are:

-- Lisp should be easy for non-lisp-programmers, and non-programmers, to learn and to read.

-- The internal operation of a lisp interpreter/compiler should, as much as possible, be hidden from the user, for the convenience of the implementor as well as the user himself.

-- Efficiency is unimportant up to a constant. I'm not sure how to put this clearly... If the language is designed in such a way as to force certain algorithms to be higher "order" or to take more than an "order of magnitude" longer to run, the design should be reconsidered. If, on the other hand, a language design change would allow algorithms to run a "small constant" amount faster, it should not be implemented unless there are other compelling reasons for it. This still isn't very well said, but I think you can appreciate what I mean...

> I can construct such a situation. Imagine a list of lists. You want
> to find out if a list you have a pointer to is contained within that
> list (i.e. MEMBER).
> You want to use EQ as the comparison function for
> efficiency -- especially if the first few characters of the non eq
> lists are the same as the one you are looking for.
>
> e.g.
> look for (a b c) object in
> List: ( (a b z) (a b a) (a b c d) (a b q) (a b r) (a b c) )

Yep, this is the classic case. Let's look at a slightly different one, just for the sake of clarity:

    look for (a b c) in target ( (a b c d) (a b c) (a b c d) (a b c) )

OK, here's what I'm thinking when I see this: Do I know that both instances of (a b c) in target are the same pointer? If not, is this a problem? (In other words, do I really want to look for (a b c) in target, or for a *specific* (a b c)?) There are two cases:

Case I) I want *any* (a b c). This is by *far* the most common case in my (admittedly somewhat limited) experience. In this case, I can still use eq, but I have several problems: I must be very careful that I don't insert (a b c) into target in any way except by inserting the appropriate pointer. Anyone trying to read or debug my code must also be very careful that I have done so. This is generally non-trivial. I must understand the internals of my implementation *very well* to assure myself that I am always putting the proper pointer in. Such details actually vary a great deal from implementation to implementation, so I must write for a specific interpreter.

Case II) I am only interested in finding my pointer. I don't really care that it happens to be pointing to the list (a b c). This is the case which I have never seen occur in practice, and have a hard time thinking of an example of. Cases that *look* like this generally look that way because the programmer has spent time trying to figure out where his (a b c)s are.

With respect to the three design goals stated above, I suspect it is now clear why we didn't implement eq. It makes lisp harder to understand, makes the code harder to read, and is implementation-dependent.
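[Editor's sketch: Bart's two cases, in Python terms. `==` stands in for equal (Case I, search by value) and `is` for eq (Case II, search for one specific pointer); the variable names follow the example above.]

```python
# The target list of lists from the example above.
target = [["a", "b", "c", "d"], ["a", "b", "c"],
          ["a", "b", "c", "d"], ["a", "b", "c"]]
wanted = target[3]          # a specific pointer into target

# Case I: *any* (a b c) -- a value search, like MEMBER with equal.
by_value = [x for x in target if x == ["a", "b", "c"]]
print(len(by_value))        # 2: both structurally equal copies match

# Case II: only *this* pointer -- an identity search, like MEMQ with eq.
by_identity = [x for x in target if x is wanted]
print(len(by_identity))     # 1: only the one identical object matches
```

The sketch also shows Bart's worry concretely: a value search cannot tell the two copies of (a b c) apart, so an eq-based program only works if every insertion into target used the "appropriate pointer".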
Finally, note what kind of efficiency increase you get from its use in the above example. In compiled lisp, you get effectively a factor of 5 during this one line of code, because you are doing 5 comparisons (the list and four of its elements) in the equal version, versus the single test (the list) in the eq version. But note that these 5 comparisons are approximately 20 machine instructions (5 compares, 5 cdrs, and a (machine-level) recursion), versus the single machine instruction of eq. The overhead of executing compiled lisp code effectively swamps this gain in all but the largest of cases, and thus you're not likely to be able to measure the speedup in a piece of real code.

The interpreted case is even worse. Ours is a fast lisp interpreter -- approximately 2-10 times as fast as Franz. In our lisp, an equal on even a 1000-element deeply recursive list, with all but the last element succeeding, is completely swamped by the overhead of the eval associated with executing the equal, in spite of the fact that much of our speed advantage is due to a very fast eval.

I hope this clarifies why we left eq out of our lisp interpreter. My friends and I are awfully fond of the concept of "fast enough", and of "The Law Of Virtual Efficiency: A 'Real Programmer' will spend a week to save the user 10 seconds over the life of the program." But then again, what do we know??? Precious little! Please send replies to me, I will summarize.

BTW, thanks much for all the flattering and interested responses I've received so far! This was the first net article I'd posted in a long time that I didn't receive *a single flame* about!! Some useful criticisms, but no flames! How very pleasant!...

Bart the Long-Winded :-)
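[Editor's sketch: Bart's "swamped by overhead" claim can be probed with a crude measurement. This Python sketch only shows how one might measure; the absolute numbers are machine-dependent, and Python's call overhead stands in for the interpreter overhead Bart describes.]

```python
# Time an identity comparison against a structural comparison on a small
# list.  Raw instruction counts suggest a large ratio; measured times are
# dominated by call/dispatch overhead, so the observed gap is much smaller.
import timeit

a = list(range(5))
b = list(range(5))          # structurally equal, but not identical

t_is = timeit.timeit(lambda: a is b, number=1_000_000)
t_eq = timeit.timeit(lambda: a == b, number=1_000_000)
print(f"identity: {t_is:.3f}s   structural: {t_eq:.3f}s")
```

On any given machine the two timings tend to be the same order of magnitude, which is the shape of Bart's point, even though a bare pointer test is far cheaper than a five-element walk in isolation.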
weltyc@b.nyser.net (Christopher A. Welty) (07/20/87)
In article <6638@reed.UUCP> bart@reed.UUCP (Bart Massey) writes:
> ...
> Case I) I want *any* (a b c). This is by *far* the most common
> case in my (admittedly somewhat limited) experience. In this case, I

It is true that this is more common.

> I must understand the internals of my implementation
> *very well* to insure to myself that I am always putting the proper
> pointer in. Such details actually vary a great deal from
> implementation to implementation, so I must write for a specific
> interpreter.

Well, this is not really all true. If you just understand the way LISP is supposed to work, and you are using a more standard form of LISP, there are certain things that are guaranteed to work in a certain way. Of course you are changing this guarantee. I have written portable LISP code using complex pointer manipulations.

> Case II) I am only interested in finding my pointer. I don't
> really care that it happens to be pointing to the list (a b c). This
> is the case which I have never seen occur in practice, and have a hard
> time thinking of an example of.

There are applications of this in practice. They are not examples that are easy to think of, nor are they simple enough to explain briefly, but they come up when doing very large real applications in LISP. Being able to get to your pointers is very important. Why do you think C is so popular? Because pointers are very useful and allow you to do flexible things. Sure, some people view them as "too low level" for a language that is supposed to support high-level abstractions. And using this approach can sometimes make *code* less readable. [Of course, if programmers were taught earlier on how to correctly DOCUMENT code as they are writing it, this wouldn't be a problem, and we wouldn't have to worry about making computer languages that read like English (COBOL was an attempt at this).] Hardware, however, has not gotten fast enough that programmers can afford to write LISP code that is not efficient...
Don't get me wrong, this is not a flame. I use LISP quite a lot, I teach it, and I think it's a great language with a lot of great ideas. I think it will be around for a long time. But there are some SERIOUS problems with using it for real work, and you have proposed good ideas for helping it past some of its major problems; this is good! But don't eliminate important elements of the language because you can't think of an example of how they could be used. If it's there, SOMEONE must be using it!

Christopher Welty - Asst. Director, RPI CS Labs
weltyc@cs.rpi.edu        ...!seismo!rpics!weltyc
wagle@iucs.cs.indiana.edu (07/21/87)
I view two objects that are otherwise EQUAL, but not EQ, as having different "timestamps". The two objects look the same, but they are different because two different place/times created them. Hash tables on symbols and the like add some texture to this model by allowing two different creation place/times to point to the same object (this is the case for symbols in particular, and often for subsets of the numbers, etc.), which is useful only if done in a predictable (easy to understand) way. Conceptually, these pointers often act as keys for various searches (MEMQ, ASSQ, or homebrew tree searches).

Also, when one is destructively modifying a tree, one's invariants have a lot of EQ's of subtrees in them. (Copying entire trees to modify one piece can sometimes (1) be prohibitively time-expensive, (2) eat cons cells and cause the garbage collector to be called more often, and (3) fail to be any more understandable, often requiring more code to support.) Having a close correspondence between one's invariants and the actual code being written makes life more pleasant and easier, and the code easier to read and understand.

While appealing intuitively, your concept of efficiency is horrifying. Roughly, your MEMQ search is order N, while the equivalent MEMBER search (for the example given) is order N squared (roughly, I said). For the small N in your examples, N squared minus N isn't very large and IS negligible. But what about large N? How about N = 1000? 1000000? Just eight times slower means what could have taken 1 hour now takes 8 hours! Try using a 300 baud terminal and a 9600 baud terminal to print a long file. What takes a minute on the 9600 baud terminal takes over half an hour on the 300 baud terminal. Even small factors can add up. What you've done by removing EQ is to remove an opportunity for writing more efficient programs, WITHOUT REPLACING IT WITH SOMETHING EQUIVALENT OR BETTER!
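[Editor's sketch: the order-of-growth point above can be made concrete by counting comparisons. The cost model is a simplification (each structural comparison is charged a full walk of the shorter list), and all function names are illustrative.]

```python
# Counting comparisons: an identity (MEMQ-style) search does one pointer
# test per element; a structural (MEMBER-style) search on a list of lists
# may walk each candidate's whole spine.
def memq_cost(item, lst):
    """Pointer tests done by an EQ-style search (found at position i+1)."""
    return next((i + 1 for i, x in enumerate(lst) if x is item), len(lst))

def member_cost(item, lst):
    """Element comparisons done by an EQUAL-style search (simplified:
    each candidate is charged min(len) element tests)."""
    cost = 0
    for x in lst:
        cost += min(len(x), len(item))
        if x == item:
            return cost
    return cost

n, k = 1000, 50
# n lists that share a length-(k-1) prefix and differ only at the end,
# like the (a b z) (a b a) ... (a b c) example earlier in the thread.
data = [list(range(k - 1)) + [i] for i in range(n)]
needle = data[-1]
print(memq_cost(needle, data))    # n pointer tests: 1000
print(member_cost(needle, data))  # about n*k element tests: 50000
```

With shared prefixes, the ratio between the two searches grows with the element length k, which is exactly the "what about large N?" complaint.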
*THIS* is why people mutter and flame about your LISP being a "toy" language or "fundamentally brain damaged". You've shut out whole classes of large programs without providing any alternative.

Now, I don't have an alternative. I think the equivalence classes introduced on objects by creation times and by hash tables are an essential aspect of LISP-like languages. My programming style is such that correctness is natural, obvious, and automatic once you understand the paradigm -- I have never had problems with objects not being EQ when they should be, or being EQ when they shouldn't be (nor have I had to think about it much). I don't even think I've used EQUAL in a long time. Your challenge is to change my mind by providing a faster, or an equal but more elegant, alternative. You can't change my mind by taking it away, for I simply won't use your language.

By the way, you're stuck with von Neumann architectures for the next decade at least. I realize my pointer semantics are ugly and unnecessary on certain non-standard architectures where efficient alternatives exist. In the meantime, my pointer semantics are explicit in my programs because I haven't yet built optimizing compilers that understand them well enough to allow them to be implicit (and possibly non-existent on appropriate architectures). I'm trying, though.

- - - - - - - - - - - - - - - - - - - - - - - - - -

If you want to bend your mind thinking about other things, consider the idea of HASH-CONS. HASH-CONS on a particular pair of values always returns the same cons cell. What does this allow you to do? What problems does it solve? What problems does it create? Should you be able to RPLACA and RPLACD a cons cell produced by HASH-CONS? But that's trivia (and possibly bogus).

If you are really concerned about good language design for LISP, look into SCHEME, where you have lexical scoping for identifiers, first-class functions (which can be passed as arguments as well as returned as values), and many other goodies.
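[Editor's sketch: the HASH-CONS puzzle above, as a minimal interning table in Python. Cells are immutable tuples here, which hints at the answer to the RPLACA question: mutating an interned cell would silently break the table's invariant. All names are illustrative.]

```python
# Hash-consing: intern each (car, cdr) pair so that structurally equal
# conses are one and the same object, making EQUAL a pointer comparison.
_interned = {}

def hash_cons(car, cdr):
    """Return the unique cell for this (car, cdr) pair."""
    cell = (car, cdr)
    return _interned.setdefault(cell, cell)   # reuse an existing equal cell

def hash_list(*items):
    """Build a hash-consed list, consing from the tail forward."""
    cell = None
    for x in reversed(items):
        cell = hash_cons(x, cell)
    return cell

x = hash_list("a", "b", "c")
y = hash_list("a", "b", "c")   # built independently...
print(x == y, x is y)          # True True: equal structure IS identical
```

Tails are shared too: the ("c" . NIL) cell inside x and y is the single interned copy, so structure sharing is maximal, at the price of a global table and of forbidding destructive modification.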
Other functional languages (like ML, f'instance) are worth looking into. Here you are talking about a major overhaul of the entire language, not nit-picking certain aspects. Like the man said, it's good that you are thinking about these things, but don't get bogged down. Your goal as a language designer should be to allow the programmer to do what he wants, and to give him the best, cleanest, and simplest tools for doing that. You might change HOW he does what he wants, but never WHAT he wants. If you start trying to dictate what he wants to do, you've just murdered your language, because the programmer is just going to get pissed off and go to a more pleasant language at his first opportunity.
roberts@cognos.uucp (Robert Stanley) (07/30/87)
In article <290@nysernic> weltyc@nic.nyser.net (Christopher A. Welty) writes:
> ...Why do you think C is so popular? Because pointers are
> very useful and allow you to do flexible things. Sure, some people
> view them as "too low level" for a language that is supposed to
> support high level abstractions....

Until we get much more powerful computing systems out in the field, remembering that the lag to plain old boring commercial DP in the field is about 15 years, all persons writing 'real' applications will at some point be obsessed by efficiency. Further, human nature tends to leave us doing things the way we've always done them. How long has it taken for the concept of GOTO-less programs to reach grunt level? Has it? It will take a long time to wean programmers from pointers, even if there is an adequate alternative. Adequate, of course, is a mutable measurement, depending on local context.

> ...But don't eliminate important elements of the language because
> you can't think of an example of how it could be used. If it's there,
> SOMEONE must be using it!

Not so at all. Nearly everything in large-scale software is prey to creeping featurism and the 'it seemed like a good idea at the time' syndrome. Tools, among which I include all programming languages, are no exception. I can name at least one heavy-duty bug that remained undetected for years in each of several different language implementations. Undetected for years means that the feature was essentially unused. We are suffering from the legacy of solutions in search of a problem, and I welcome initiatives that offer a departure from this sorry state of affairs.

The key issue is not to remove a feature from the language without providing a way of simulating it. If there are large cries from users that the only way to do something is too slow/clumsy/impractical/etc., then is the time to take note and introduce release two.
-- 
Robert Stanley            Compuserve: 76174,3024
Cognos Incorporated       uucp: decvax!utzoo!dciem!nrcaer!cognos!roberts
3755 Riverside Drive          or ...nrcaer!uottawa!robs
Ottawa, Ontario           Voice: (613) 738-1440 - Tuesdays only (don't ask)
CANADA K1G 3N3