[comp.lang.lisp] Are "destructive" functions really destructive?

folta@tove.umd.edu (Wayne Folta) (11/29/89)

I take it from the Common LISP Reference that functions marked "(destructive)"
are destructive in the sense that they *may be* destructive in some way.
Is it thus true that I cannot count on destructive behavior when I want it?

For example, if I want to delete something from a large datastructure, I
assume that I would use "delete":

   (delete item huge-list :key #'my-key)

Or do I need to do:

   (setq huge-list (delete item huge-list :key #'my-key))

In the Reference, it says, "..., or sequence may be unchanged and another
sequence returned.", which would seem to make "delete" different from "nconc"
in this respect.  So destructiveness is an option for the convenience of LISP
implementation creators, not users?


Wayne Folta          (folta@cs.umd.edu  128.8.128.8)

moore%cdr.utah.edu@cs.utah.edu (Tim Moore) (11/29/89)

In article <1989Nov29.173139.26165@Neon.Stanford.EDU> michaelg@Neon.Stanford.EDU (Michael Greenwald) writes:
>folta@tove.umd.edu (Wayne Folta) writes:
>
>>I take it from the Common LISP Reference that functions marked "(destructive)"
>>are destructive in the sense that they *may be* destructive in some way.
>>Is it thus true that I cannot count on destructive behavior when I want it?
>
>Yes. 

>>In the Reference, it says, "..., or sequence may be unchanged and another
>>sequence returned.", which would seem to make "delete" different from "nconc"
>>in this respect. So destructiveness is an option for the convenience of LISP
>>implementation creators, not users?
>
>No, this behavior is dictated by the semantics of delete.  To wit, 
>
>(let ((x (copy-list '(a b c d e f))))
>  (setq y (delete 'a x)))
>
>There is no reasonable way to destructively "modify" x so that delete
>will work and you maitain EQness, since in this case DELETE == CDR.

There are unreasonable ways to do this, but michaelg correctly points
out that it's far more efficient to return the cdr of the original
list as the new sequence.

But consider the case of 

(let ((x (copy-list '(a a a))))
  (setq x (delete 'a x)))

Here, there's no way that (delete 'a x) could by itself modify the
variable x to be NIL because delete gets a copy of x's value, not of
x's location. 

If delete's behavior really bothers you, you can construct a deletef
macro like:

(defmacro deletef (item place)
  `(setf ,place (delete ,item ,place)))

Obviously this isn't completely equivalent to delete; caveat emptor
and all that. 

Tim Moore                     moore@cs.utah.edu {bellcore,hplabs}!utah-cs!moore
"Ah, youth. Ah, statute of limitations."
		-John Waters

bouma@cs.purdue.EDU (William J. Bouma) (11/30/89)

In article <20991@mimsy.umd.edu> folta@tove.umd.edu (Wayne Folta) writes:
>Is it thus true that I cannot count on destructive behavior when I want it?
>
>In the Reference, it says, "..., or sequence may be unchanged and another
>sequence returned.", which would seem to make "delete" different from "nconc"
>in this respect.  So destructiveness is an option for the convenience of LISP
>implementation creators, not users?

    Destructive operations are not really any easier to implement. Rather
    they are faster and consume less memory than the identical but copying
    operations. Thus they benefit the user.

    Which reference are you quoting? It may be messed up! What good are
    destructive operations if they aren't truely destructive? Even in
    CLtL the wording is strange for this, "The argument SEQUENCE may be
    destroyed and used to construct the result;" (pg. 254, delete). Why
    does it say 'may' instead of 'will'? It could be just to account for
    the case when no element is deleted.

    Note that it is good practice to wrap your destructive operation
    in a 'setf' anyway. It usually isn't too difficult and can keep you
    from getting nailed by special cases you have overlooked. For instance
    what if the element you are deleting is the first one in the list?
    > (setq *b* '(a b c d))
    (A B C D)
    > (delete 'a *b*)
    (B C D)
    > *b*
    (A B C D)
    In this case, the resulting sequence is not EQ to the argument.
-- 
Bill <bouma@cs.purdue.edu>  ||  ...!purdue!bouma

michaelg@Neon.Stanford.EDU (Michael Greenwald) (11/30/89)

folta@tove.umd.edu (Wayne Folta) writes:

>I take it from the Common LISP Reference that functions marked "(destructive)"
>are destructive in the sense that they *may be* destructive in some way.
>Is it thus true that I cannot count on destructive behavior when I want it?

Yes.  (I assume you mean by "count on" .. "when I want it" that
identity of the argument would be preserved.  It would be hard to
"count on" some internal detail not specified by the language
specification.)

>For example, if I want to delete something from a large datastructure, I
>assume that I would use "delete":

>   (delete item huge-list :key #'my-key)

No.

>Or do I need to do:

>   (setq huge-list (delete item huge-list :key #'my-key))

Yes.

>In the Reference, it says, "..., or sequence may be unchanged and another
>sequence returned.", which would seem to make "delete" different from "nconc"
>in this respect. So destructiveness is an option for the convenience of LISP
>implementation creators, not users?

No, this behavior is dictated by the semantics of delete.  To wit, 

(let ((x (copy-list '(a b c d e f))))
  (setq y (delete 'a x)))

There is no reasonable way to destructively "modify" x so that delete
will work and you maitain EQness, since in this case DELETE == CDR.
In other words, you lose the first cons cell, and the rest of the list
is untouched.

(I'm ignoring the pathological implementation of altering the CAR of
every sublist, and knocking off the last cons.)

>Wayne Folta          (folta@cs.umd.edu  128.8.128.8)

shiffman%basselope@Sun.COM (Hank Shiffman) (11/30/89)

In article <20991@mimsy.umd.edu> folta@tove.umd.edu (Wayne Folta) writes:
>For example, if I want to delete something from a large datastructure, I
>assume that I would use "delete":
>
>   (delete item huge-list :key #'my-key)
>
>Or do I need to do:
>
>   (setq huge-list (delete item huge-list :key #'my-key))

Either one of these will work AS LONG AS THE DELETED ITEM WASN'T THE
FIRST ELEMENT OF THE LIST!  You need to use the second version to deal
with the fact that using DELETE to remove the first element leaves the
rest of the list unchanged.  Since the deleted element will not be
modified, it will still point to the rest of the list.

DELETE is a function.  By the time it's called, the HUGE-LIST argument
has been evaluated to the first CONS in the list.  DELETE can't do the
SETQ for you, since it doesn't know what variable to SETQ.

>
>So destructiveness is an option for the convenience of LISP
>implementation creators, not users?
>

On the contrary.  Destructiveness saves the user a lot of extra
consing by reusing the same list.  By and large, ease of
implementation was *not* a major consideration in the design of Common
Lisp.  And it shows.

-- 
Hank Shiffman                                     (415) 336-4658
Marketing Technical Specialist
Software Engineering Technologies               ...!sun!shiffman
Sun Microsystems, Inc.                          shiffman@Sun.com

Zippy sez:
  Do you have exactly what I want in a plaid poindexter bar bat??

gaynor@busboys.rutgers.edu (Silver) (11/30/89)

folta@tove.umd.edu writes:
> [Concerning delete, is it destructive or what?]  In the Reference, it says,
> "..., or sequence may be unchanged and another sequence returned.",

Although I'm not familiar with your reference, the problem may be a simple
misunderstanding.  What can be said about value returned by delete after
removing 1 from the following list?

  L --> (1 -2 3 -5 8 -13)

In this case, it appears reasonable for delete to simply return the cdr of L.
So, L (that is, the list referenced by L) is unchanged, and a list that is not
L (namely, L's cdr) is being returned.

Awkward?  Naw.  Ok, well, maaaybe.  Happy Hunting, [Ag]

koomen@cs.rochester.edu (Hans Koomen) (11/30/89)

Michael Greenwald <michaelg@Neon.Stanford.EDU> writes:
...
>(let ((x (copy-list '(a b c d e f))))
>  (setq y (delete 'a x)))
>
>There is no reasonable way to destructively "modify" x so that delete
>will work and you maitain EQness, since in this case DELETE == CDR.
>In other words, you lose the first cons cell, and the rest of the list
>is untouched.
>
>(I'm ignoring the pathological implementation of altering the CAR of
>every sublist, and knocking off the last cons.)
>

You don't need to walk down the entire list to knock off the last cons.  When
deleting the first element of a 2+ element list, the Interlisp equivalent of
Common Lisp delete, dremove, effectively does
	(progn (rplaca x (cadr x)) (rplacd x (cddr x)))
i.e., splices out the second cons after stuffing value of the second car into
the first car.  No pathology required here to maintain eq-ness.

The real problem lies in removing the first element of a 1-element list, e.g.
if x is bound to '(a) then (delete 'a x) returns nil, but there's no way for
the delete function to know that its second argument, (a), was the binding of
the variable x, and that x should be set to nil!
-- 

- - -	/-/ a n s

schwamb@ics.uci.edu (Karl Schwamb) (11/30/89)

I'm well aware of this problem with "delete" (that you need to setq a
variable upon returning in case its the first element of a list) in
using Lisp.  However, I don't understand why "delete" was not
implemented as a macro to make the semantics of its destructive
operation consistent.  To have a function which applies to different
sequence types but performs differently for one implementation seems 
bogus to me:

> (setq string "123")
"123"
> (setq list '(1 2 3))
'(1 2 3)
> (delete #\1 string)
"23"
> string
"23"
> (delete 1 list)
'(2 3)
> list
'(1 2 3)

It seems cleaner to have (delete item sequence) expand to something
like (no, this doesn't do all the proper checking):

	`(if (listp ,sequence)
	     (setf ,sequence
		   (CL-delete ,item ,sequence)))

Where "CL-delete" would be the current CL delete function.  Of course,
if the types are known at compile-time then we can even remove the
check.  Is this a case where upward compatibility has ruined the
elegance of the language standard?


Karl B. Schwamb                                  schwamb@ics.uci.edu
Information and Computer Science                 
University of California, Irvine 92715

barmar@think.com (11/30/89)

In article <20991@mimsy.umd.edu> folta@tove.umd.edu (Wayne Folta) writes:
>I take it from the Common LISP Reference that functions marked "(destructive)"
>are destructive in the sense that they *may be* destructive in some way.
>Is it thus true that I cannot count on destructive behavior when I want it?

ANSI CL is going to be a bit more clear about which destructive operations
can be "counted on" and which can't.  For instance, NCONC will be specified
pretty fully (it RPLACD's the last cons in each non-null argument except
the last), as will NSUBSTITUTE (it RPLACA's or SETF-AREF's as appropriate).
The specifications of some others, such as DELETE and NREVERSE, will say
"may reuse any part of the sequence."  We addressed these on a case-by-case
basis, trying to balance implementation flexibility against user needs and
expectations.  NCONC is specified explicitly because it has a long history
and programmers have come to expect and rely upon its obvious
implementation; NSUBSTITUTE is specified explicitly because we couldn't
imagine an alternate implementation that would violate the requirement.
DELETE is flexible because its definition prevents it from always reusing
the input argument (e.g. if it has to delete all the elements of a list);
NREVERSE is flexible because there are a number of known implementations of
lists that have different optimal algorithms for reversing in place
(traditional lists are typically reversed by reordering the CDR chain,
cdr-coded lists are best reversed by replacing the CARs).

>So destructiveness is an option for the convenience of LISP
>implementation creators, not users?

In the more flexible cases, it is a way for the programmer to declare his
intentions (he doesn't care about the old structure), and thus permit the
implementation to try to optimize the operation.
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

hoey@ai.etl.army.mil (Dan Hoey) (11/30/89)

In article <1989Nov29.114923.1843@hellgate.utah.edu> moore%cdr.utah.edu@cs.utah.edu (Tim Moore) writes:
...
>If delete's behavior really bothers you, you can construct a deletef
>macro like:
>
>(defmacro deletef (item place)
>  `(setf ,place (delete ,item ,place)))
>
>Obviously this isn't completely equivalent to delete; caveat emptor

Perhaps something more like

(defmacro deletef (item place &rest delete-options &aux (itemv (gensym)))
  (multiple-value-bind (vars vals store store-form access-form)
      (get-setf-method place)
    `(let* ((,itemv ,item)
            ,.(mapcar #'list vars vals)
            (,(car store) (delete ,itemv ,access-form ,@delete-options)))
       ,store-form
       ,(car store))))

Dan

michaelg@Neon.Stanford.EDU (Michael Greenwald) (11/30/89)

schwamb@ics.uci.edu (Karl Schwamb) writes:

>....              However, I don't understand why "delete" was not
>implemented as a macro to make the semantics of its destructive
>operation consistent.  To have a function which applies to different
>sequence types but performs differently for one implementation seems 
>bogus to me:

If you always use delete for value, rather than for effect, then it
always behaves consistently.

>It seems cleaner to have (delete item sequence) expand to something
>like (no, this doesn't do all the proper checking):

>	`(if (listp ,sequence)
>	     (setf ,sequence
>		   (CL-delete ,item ,sequence)))

>Where "CL-delete" would be the current CL delete function.  

By the way, your implementation doesn't work when sequence is the
result of function application.  It is illegal to "apply" (if such
terminology is appropriate for a macro) SETF to something that isn't a
"place".  Or is this one of the things you were referring to when you
said "this doesn't do all the proper checking"?

>Of course,
>if the types are known at compile-time then we can even remove the
>check.  Is this a case where upward compatibility has ruined the
>elegance of the language standard?

I don't think so.

CL-delete doesn't side-effect the environment, while your macroized
delete does.

Yes, CL-delete does side-effect sequence (and if sequence is shared,
and isn't just an intermediate result, then the side-effect is visible
in the "environment"), but if delete is always used "functionally"
(for value) you can argue convincingly that cl-delete is cleaner than
your delete (if you subscribe to the notion, as many do, that
side-effects aren't particularly clean, and therefore should be
minimized).

Reasonable people might differ about whether CL-delete is cleaner than
your macroized delete (I strongly prefer CL-delete), but I don't think
this is "a case where upward compatibility has ruined the elegance of
the language standard"

michaelg@Neon.Stanford.EDU (Michael Greenwald) (11/30/89)

koomen@cs.rochester.edu (Hans Koomen) writes:

>You don't need to walk down the entire list to knock off the last cons.  When
>deleting the first element of a 2+ element list, the Interlisp equivalent of
>Common Lisp delete, dremove, effectively does
>	(progn (rplaca x (cadr x)) (rplacd x (cddr x)))
>i.e., splices out the second cons after stuffing value of the second car into
>the first car.  No pathology required here to maintain eq-ness.

Some of the time.

The reason it's pathological was not because it has to walk down the
entire list (it has to do that anyway, to delete multiple copies of
item), but because the idea of implementing delete that way would only
make sense if you are trying to maintain EQness.  But >in general<
that can't work, because you won't win (as you mention) if the result
of an application of delete is NIL.  Therefore it's pathological to
try to make it work in only a subset of the cases.

It's also pathological because if you try to make delete "consistent"
about side-effecting, then I suppose one could argue that some
definition about the side-effect on sublists should be specified. But
I'm hard pressed to come up with one that would work.  (The Interlisp
definition you quote above certainly won't work, if you have two
people sharing list structure of x and (cdr x)).  Once properties like
that don't work, you might as well require people to only depend on
the value of delete, and not on side-effect.  To go to some lengths to
make a certain property hold (in only some of the cases) seems
pathological to me.  By calling delete instead of remove, you are
stating that you don't care to depend on side-effects (or the lack
thereof) to sequence.

(I must confess that I don't have a strict or well-defined definition
of "pathological", so if you are trying to make some point about, or
that depends upon, a precise definition of pathological, then I'm
probably wrong.)

In general, I think we both probably agree about the larger point:
DELETE can't preserve EQness by simply side-effecting SEQUENCE, and
you should always use the value of (DELETE ITEM SEQUENCE), rather than
depending on details of the implementation, or of your data, to
side-effect your SEQUENCE properly.

>The real problem lies in removing the first element of a 1-element list, 

Or all N elements of an N element list, '(a a a a a a a a).

>e.g.
>if x is bound to '(a) then (delete 'a x) returns nil, but there's no way for
>the delete function to know that its second argument, (a), was the binding of
>the variable x, and that x should be set to nil!
>--

barmar@think.com (11/30/89)

In article <25742E50.28813@paris.ics.uci.edu> schwamb@ics.uci.edu (Karl Schwamb) writes:
>It seems cleaner to have (delete item sequence) expand to something
>like (no, this doesn't do all the proper checking):
>	`(if (listp ,sequence)
>	     (setf ,sequence
>		   (CL-delete ,item ,sequence)))

Lisp is a *functional* programming language, and the above is contrary to
that philosophy.

For instance, what happens when sequence isn't a place that can be SETFed?
For instance, one might write code like:

	(setq result (delete item (append foo bar baz)))

Or what if you didn't really need to store the result back into the
variable the list came from, e.g.

	(let (other-list
	      (*list* (compute-the-list)))
          (declare (special *list*))
	  <do stuff ...>
          (setq other-list (delete item *list*))
	  <do more stuff ...>
	  )

If nothing in <do more stuff ...> references *list* then there is no need
to store the result back there, since the value will bt thrown away when
the LET returns.  If *list* were a lexical variable then the compiler could
discover that the store is unnecessary using live variable analysis, but
since it's a special variable this is not always feasible.

If you want a DELETEF macro, create one.  This would be just like the
distinction between 1+ and INCF.  If you think DELETEF should have been
included in the standard language as INCF was, that's a completely
different issue.  I suspect it wasn't because then the language would have
had to have SETF-like versions of all the destructive sequence operations.
Maclisp and Zetalisp, the primary influences on Common Lisp, had INCF, so
it was kept, but the CL designers apparently didn't want to throw in a
zillion new macros of that type.
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

Schwamb@ics.UCI.EDU ("Karl B. Schwamb") (11/30/89)

michaelg@Neon.Stanford.EDU (Michael Greenwald) writes:

>By the way, your implementation doesn't work when sequence is the
>result of function application.  It is illegal to "apply" (if such
>terminology is appropriate for a macro) SETF to something that isn't a
>"place".  Or is this one of the things you were referring to when you
>said "this doesn't do all the proper checking"?

Good point.  I guess special forms wouldn't help either, sigh.

>CL-delete doesn't side-effect the environment, while your macroized
>delete does.

I'm confused here.  The semantics of delete side-effects the environment
by changing the value of the sequence.  The macro form may side-effect 
a variable through a setf but I don't see any real difference between
that and altering its structure.

>Yes, CL-delete does side-effect sequence (and if sequence is shared,
>and isn't just an intermediate result, then the side-effect is visible
>in the "environment"), but if delete is always used "functionally"
>(for value) you can argue convincingly that cl-delete is cleaner than
>your delete (if you subscribe to the notion, as many do, that
>side-effects aren't particularly clean, and therefore should be
>minimized).

If you advocate a functional approach then it would seem that "remove"
should be used rather than "delete".  Since delete is designed to
be destructive, its only advantage seems to be speed since functional
style would advocate "equal" checks in subsequent code rather than 
"eq".  When functional style has to be abandoned in large systems
requiring efficiency or small systems requiring space, then it becomes
a question of how you'd like your destructive operations to work.
I agree that side-effects should be minimized, but I also think
they're rarely unavoidable in large systems.

barmar@think.com writes:

>For instance, what happens when sequence isn't a place that can be SETFed?

I would think this could be checked during code generation.

>Or what if you didn't really need to store the result back into the
>variable the list came from, e.g.

I think a better implementation would check if a list were being
used and the first element had been deleted.  Only in that special
case would a "setf" occur.  This seems a small price to pay to
get a consistent delete function.

Karl B. Schwamb                                  schwamb@ics.uci.edu
Information and Computer Science                 
University of California, Irvine 92715

andy@Gang-of-Four.Stanford.EDU (Andy Freeman) (12/01/89)

In article <31807@news.Think.COM> barmar@think.com writes:
>In the more flexible cases, it is a way for the programmer to declare his
>intentions (he doesn't care about the old structure), and thus permit the
>implementation to try to optimize the operation.

I note that it is okay for an implementation to have "destructive"
operations that actually don't modify their arguments's value.  For
example, an implementation may have a concurrent garbage collector CPU
and have made other design choices such that destructive operations
are actually more expensive than copying the appropriate data.  (Some
implementations of CDR-coding are almost like this.)

Strings (and general vectors as well, although I don't see the point
for them) can be implemented in many ways.  In some cases, destructive
deletions can be made that return a new string object, but a pointer
to the original object (which is not eq to the new one) sees some of
the modifications.

The purpose of "destructive" operations is to allow the programmer to
say "do this fast, reusing whatever of the argument you want, and give
me the result" to the implementation.  If that's not what you want to
do, don't use them.  In particular, if you want a particular type of
reuse, and you want it to be portable, you have to write it yourself.

-andy
--
UUCP:    {arpa gateways, sun, decwrl, uunet, rutgers}!neon.stanford.edu!andy
ARPA:    andy@neon.stanford.edu
BELLNET: (415) 723-3088

faustus@yew.Berkeley.EDU (Wayne A. Christopher) (12/02/89)

I think there are other reasons to use destructive operations besides
efficiency.  If you have a large and complex data structure, sometimes
you have to walk down a long path of pointers to get to the item you want,
and it's a lot easier to change it in place than to write a long string
of setf's to make sure everything points to the right place.  In a case
like that, if you can guarantee that you will never change the first item
of a list, you really want delete, et al to work "the right way".

	Wayne

jeff@aiai.ed.ac.uk (Jeff Dalton) (01/09/90)

In article <8911291447.aa13784@PARIS.ICS.UCI.EDU> Schwamb@ics.UCI.EDU ("Karl B. Schwamb") writes:
>
>michaelg@Neon.Stanford.EDU (Michael Greenwald) writes:
>>CL-delete doesn't side-effect the environment, while your macroized
>>delete does.
>
>I'm confused here.  The semantics of delete side-effects the environment
>by changing the value of the sequence.  The macro form may side-effect 
>a variable through a setf but I don't see any real difference between
>that and altering its structure.

Well, let's start by just saying there's a difference: in one case,
a variable is given a new value; in the other the variable keeps the
same value, but the value is modified.  "Side-effects the environment"
is just another way to say "assigns to a variable".  "Environment" is
being used in a special sense here.

This difference can matter; but changing the structure is usually
considered to be more dangerous than assigning to a variable, rather
than the other way around, because other variables might point to
the same structure.

>If you advocate a functional approach then it would seem that "remove"
>should be used rather than "delete". 

Just so.

>Since delete is designed to be destructive, its only advantage seems
>to be speed since functional style would advocate "equal" checks in
>subsequent code rather than "eq".

Destructive operations aren't just a matter of efficiency.  There's
also a conceptual difference.  For example, one way to model
real-world objects that change over time is to use Lisp objects
that change (via destructive operations) over time.

>barmar@think.com writes:
>
>>For instance, what happens when sequence isn't a place that can be SETFed?
>
>I would think this could be checked during code generation.

True, but what if you still want to perform a delete in one of the
cases where the SETF was impossible?  The DELETE function would still
be useful in this case.  So I wouldn't want to replace that function
by the macro.

>>Or what if you didn't really need to store the result back into the
>>variable the list came from, e.g.
>
>I think a better implementation would check if a list were being
>used and the first element had been deleted.  Only in that special
>case would a "setf" occur.

But what if you don't want to store the result back in the place
the list came from?  What if you want to put it somewhere else,
for example, as in

   (setq b (delete 'this a))

>This seems a small price to pay to get a consistent delete function.

Consistency depends on how you look at it.  In Lisp, one often thinks
of data objects as independent of variables.  You get your consistency
only in cases where a list and a setf-able place are considered
together.  It still isn't a consistent operation on an object standing
alone.  For that, you'd need to use a headed list or something of that
sort.

However, there's nothing wrong with defining a macro to delete-in-a-
place.  There are already similar operations in Common Lisp, such as
PUSH and POP.

-- Jeff