[comp.lang.scheme] Why memq vs memv vs member, etc.?

barmar@think.com (Barry Margolin) (05/11/91)

Given Scheme's orientation in favor of procedural arguments, I'm
curious about why it uses the old MacLisp style of list searching
procedures, where the equivalence predicate is implicit in the choice of
the procedure, rather than an explicit procedure argument to a single
searching procedure.  This seems to be an area where Common Lisp does what
I'd expect Scheme to do; CL has a single MEMBER function with a :TEST
option, while Scheme has MEMQ, MEMV, and MEMBER.  Why not (member <obj>
<list> eq?) instead of (memq <obj> <list>)?  The same goes for all the
other functions that come in three varieties like this.
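
Concretely, the single searcher need be no more than the following
sketch (an illustration only, not proposed report text; the predicate
goes last, as in my example above):

  ;; MEMBER with an explicit equivalence predicate (a sketch).
  ;; Returns the first tail of LST whose car matches OBJ under PRED,
  ;; or #f if there is none.
  (define (member obj lst pred)
    (cond ((null? lst) #f)
          ((pred obj (car lst)) lst)
          (else (member obj (cdr lst) pred))))

  ;; (member 'b '(a b c) eq?)       => (b c), i.e. what memq returns
  ;; (member "b" '("a" "b") equal?) => ("b"), like today's MEMBER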
-- 
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

ted@nmsu.edu (Ted Dunning) (05/11/91)

In article <1991May10.230057.15591@Think.COM> barmar@think.com (Barry Margolin) writes:

   Why not (member <obj> <list> eq?) instead of (memq <obj> <list>)?


well, while we are at it, why have member at all?

why not just


  (some (lambda (x) (eq? <obj> x)) <list>)
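
some is the common lisp name, of course; scheme has no standard
equivalent, but it is only a three-liner (a sketch, not a proposal):

  ;; a sketch of SOME for scheme (nonstandard; the name is borrowed
  ;; from common lisp).  returns the first true value produced by
  ;; pred, or #f.
  (define (some pred lst)
    (cond ((null? lst) #f)
          ((pred (car lst)))
          (else (some pred (cdr lst)))))

note that this returns the predicate's value rather than the matching
tail, so code that uses the tail that memq returns still wants member.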

--

if feeling bad is a justification for not living, 
    then living is a justification for feeling good.

chip@soi.UUCP (Chip Morris) (05/13/91)

barmar@think.com (Barry Margolin) writes:

>....  This seems to be an area where Common Lisp does what
>I'd expect Scheme to do; CL has a single MEMBER function with a :TEST
>option, while Scheme has MEMQ, MEMV, and MEMBER.  Why not (member <obj>
><list> eq?) instead of (memq <obj> <list>)?  

I agree with Mr. Margolin on the style and generality of his proposal.
However, I just finished speeding up a large Common Lisp application
in which the difference between using MEMQ and using MEMBER
(with a :TEST of #'EQ) was quite significant.  The overhead of the
additional parameters, especially keyword parameters, was alone enough
to make a difference.  

Interestingly, the MEMQ provided as an extension by the implementation
(Lucid) ran at about the same speed as a naively programmed MEMQ.  So
the presence of a good compiler suggests keeping language constructs
to a minimum, more a Scheme idea than a Common Lisp idea.
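
For calibration, by "a naively programmed MEMQ" I mean nothing
smarter than the obvious loop; in Scheme spelling (the Lucid code
was of course Common Lisp, but it is the same loop):

  ;; The naive MEMQ: the obvious recursion, no tricks.
  (define (memq obj lst)
    (cond ((null? lst) #f)
          ((eq? obj (car lst)) lst)
          (else (memq obj (cdr lst)))))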
-- 
Chip Morris, Senior Engineer
Software Options, Inc., 22 Hilliard St., Cambridge MA 02138  (617) 497-5054
chip@soi.com

gateley@rice.edu (John Gateley) (05/15/91)

In article <chip.674143084@soi> chip@soi.UUCP (Chip Morris) writes:

   barmar@think.com (Barry Margolin) writes:
   >....  This seems to be an area where Common Lisp does what
   >I'd expect Scheme to do; CL has a single MEMBER function with a :TEST
   >option, while Scheme has MEMQ, MEMV, and MEMBER.  Why not (member <obj>
   ><list> eq?) instead of (memq <obj> <list>)?  

   I agree with Mr. Margolin on the style and generality of his proposal.
   However, I just finished speeding up a large Common Lisp application
   in which the difference between using MEMQ and using MEMBER
   (with a :TEST of #'EQ) was quite significant.  The overhead of the
   additional parameters, especially keyword parameters, was alone enough
   to make a difference.  

I find this surprising - did you try playing with the speed, safety,
and other declarations to get high optimization levels? The compiler should
be able to replace the call (member foo bar :test #'eq) with the
call (memq foo bar) fairly easily. Disclaimer - I haven't used Lucid
before.

On a more general note - how many compilers actually perform
optimizations on optional, rest, and keyword arguments?

John
gateley@rice.edu
--
"I've thought the thoughts of little children and the thoughts of men
 I've thought the thoughts of stupid people who have never been
 so much in love as they should be and got confused too easily
 to fall in love again." The Residents and Renaldo and the Loaf

barmar@think.com (Barry Margolin) (05/16/91)

In article <chip.674143084@soi> chip@soi.UUCP (Chip Morris) writes:
>I agree with Mr. Margolin on the style and generality of his proposal.
>However, I just finished speeding up a large Common Lisp application
>in which the difference between using MEMQ and using MEMBER
>(with a :TEST of #'EQ) was quite significant.  The overhead of the
>additional parameters, especially keyword parameters, was alone enough
>to make a difference.  

I considered that before making my post.  However, my impression is that in
Scheme, performance is not as high a priority as style.

Also, Scheme designers expect the compiler to optimize away much of the
overhead.  In your test with Lucid, I think you may have forgotten to
enable some of the optimization capabilities, as Lucid knows how to
open-code calls to MEMBER when the :TEST option is #'EQ (I just tried it,
with optimization settings (speed 3), (safety 0), and (space 0); I looked
at the disassembly of (member x y :test #'eq) and it generated a simple
loop).  The Symbolics compiler also translates calls to many keyworded
functions into calls to internal versions that take positional arguments,
so that the keyword lookup is only done at compile time; however, this
isn't relevant to the kind of optimizations that a Scheme version would
need, since it doesn't have keyword parameters.

I guess the only problem with optimizing out the explicit passing of the eq
argument is that Scheme doesn't prohibit user programs from redefining
standard procedure names as Common Lisp does.  The compiler could turn off
the optimization if there were a lambda binding of eq, but it wouldn't be
able to detect whether a (set! eq ...) had been done.  Then again, this
problem exists with all open coding, and I assume most Scheme compilers do
open-code calls to many built-in procedures (car, cdr, arithmetic, etc.).
And Schemers don't even have a valid excuse of "oh, the function position
is special", because one of the fundamental tenets of Scheme is that the
function position of a form is evaluated just like other positions.
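
For example, both of the following are legal Scheme (illustrative
only); a compiler can see the second, but not, in general, the first:

  ;; A top-level assignment to a standard procedure name.  Static
  ;; analysis of a separately compiled file cannot rule this out.
  (set! eq? (lambda (a b) (equal? a b)))

  ;; A lambda binding of the same name, which the compiler *can* see
  ;; and use to disable open-coding within its scope.
  (let ((eq? equal?))
    (eq? '(a) '(a)))                  ; => #t under this binding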


-- 
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

jonl%kuwait@lucid.COM (Jon L White) (05/16/91)

re: Interestingly, the MEMQ provided as an extension by the implementation
    (Lucid) ran at about the same speed as a naively programmed MEMQ.  So
    the presence of a good compiler suggests keeping language constructs
    to a minimum, more a Scheme idea than a Common Lisp idea.

In some later releases of Lucid's lisp, MEMQ is tweaked more than 
simply "a naively programmed MEMQ"; but the benefit over the more
naive versions really isn't very much because, as you point out, the 
"Production Quality" compiler does a pretty good job with the naive 
code [assuming, of course, that you haven't confused the issues of
the two compilers.]

Also, later releases of the "Production Quality" compiler do carry out 
more of the static analysis needed to convert MEMBER into MEMQ when possible. 
This is neither a difficult nor a "deep" code transformation; instead, it 
is one imposed by the style of the end-user community.  With a language as 
large as Common Lisp, *many* optimizations are possible; the question
remains: which ones are practical, useful, and actually used?

Interestingly, the presence of MEMQ in the Lucid extensions set has 
little if anything at all to do with speed.  The thread of Common Lisp's
history (from PDP10 MacLisp in particular) and the desire not to see 
spurious variants on this common concept were prominent factors.


-- JonL --

chip@soi.UUCP (Chip Morris) (05/17/91)

barmar@think.com (Barry Margolin) writes:

>In article <chip.674143084@soi> chip@soi.UUCP (Chip Morris) writes:
>>I agree with Mr. Margolin on the style and generality of his proposal.
>>However, I just finished speeding up a large Common Lisp application
>>in which the difference between using MEMQ and using MEMBER
>>(with a :TEST of #'EQ) was quite significant.  The overhead of the
>>additional parameters, especially keyword parameters, was alone enough
>>to make a difference.  

Gag.  What I MEANT was with the default :TEST, which is, of course,
#'EQL.  Yes, Lucid (and others) do detect and optimize out such
idioms.  My point was not my (uninspired) optimization, which was
merely a preface to the next paragraph, but the small issue it raises
concerning the tradeoff between style, generality, and efficiency.  I
meant to cite Lucid to show that in the presence of a good compiler
one can write a recursive program and still get the kind of
efficiency one would want.

>Scheme designers expect the compiler to optimize away much of the
>overhead.  

And so they should.  The question is, can a language supply us with a
suitably general, regular style and still permit the kind of analysis
that frees the programmer from worrying overmuch about efficiency in
the small, when, in the last stages, it becomes important?

>I guess the only problem with optimizing out the explicit passing of
>the eq argument is that Scheme doesn't prohibit user programs from
>redefining standard procedure names as Common Lisp does.  

Perhaps one must grudgingly admit an analysis advantage to Common
Lisp's otherwise distressingly asymmetrical treatment of the function
position.

>And Schemers don't even have a valid excuse of "oh, the function position
>is special", because one of the fundamental tenets of Scheme is that the
>function position of a form is evaluated just like other positions.

I think that a Schemer's valid excuse should be "oh, my program
doesn't change the meaning of EQ, CAR, etc."  And maybe s/he should be
able to rely on analysis tools to assert this to a compiler in the
bulk of cases where it can be deduced.

A laudable goal for a language design is that "simple" programs
can be analyzed as such and efficiently compiled.  To what extent does
this conflict with conceptual regularity and generality?

To consider another example, isn't Scheme's somewhat unbridled
treatment of first-class continuations something of an analysis
nightmare?

-- 
Chip Morris, Senior Engineer
Software Options, Inc., 22 Hilliard St., Cambridge MA 02138  (617) 497-5054
chip@soi.com

barmar@think.com (Barry Margolin) (05/18/91)

In article <chip.674492719@soi> chip@soi.UUCP (Chip Morris) writes:
>barmar@think.com (Barry Margolin) writes:
>>I guess the only problem with optimizing out the explicit passing of
>>the eq argument is that Scheme doesn't prohibit user programs from
>>redefining standard procedure names as Common Lisp does.  
>Perhaps one must grudgingly admit an analysis advantage to Common
>Lisp's otherwise distressingly assymetrical treatment of the function
>position.

Actually, in this case it has nothing to do with the special treatment of
the function position.  Users are prohibited from defining any symbol in
the COMMON-LISP package as a function *or* as a global special variable
(there's a list of about 25 things you aren't allowed to do to symbols in
the COMMON-LISP package, so that implementations can call them internally
and depend upon any implementation-specific behaviors of their own
versions, and also to prevent applications from conflicting with each other
by accidentally using the same symbols).

>>And Schemers don't even have a valid excuse of "oh, the function position
>>is special", because one of the fundamental tenets of Scheme is that the
>>function position of a form is evaluated just like other positions.
>
>I think that a Schemer's valid excuse should be "oh, my program
>doesn't change the meaning of EQ, CAR, etc."  And maybe s/he should be
>able to rely on analysis tools to assert this to a compiler in the
>bulk of cases where it can be deduced.

But it frequently can't be deduced statically.  Any call to load (or eval,
in implementations that have it) can potentially redefine any procedure
name.  The compiler has no way of knowing whether such a file will be
loaded into the same environment as the file currently being compiled.

>To consider another example, isn't Scheme's somewhat unbridled
>treatment of first-class continuations something of an analysis
>nightmare?

Yes, but it is possible to analyze many of these cases statically.  A
static analyzer can detect when a procedure *is* redefined, but it rarely
can determine that a procedure *won't* be redefined.
-- 
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/18/91)

In article <chip.674492719@soi>, chip@soi.UUCP (Chip Morris) writes:
> >I guess the only problem with optimizing out the explicit passing of
> >the eq argument is that Scheme doesn't prohibit user programs from
> >redefining standard procedure names as Common Lisp does.  
> 
> Perhaps one must grudgingly admit an analysis advantage to Common
> Lisp's otherwise distressingly assymetrical treatment of the function
> position.

Looking at CLtL2, section 5.3.1, all I can see that bears on this is
the final paragraph:
	It is permissible to use DEFUN to redefine a function,
	to install a corrected version of an incorrect definition,
	for example.
	It is permissible to redefine a macro as a function.
	It is an error to attempt to redefine the name of a ***special form***
	(see table 5-1) as a function.
I have not been able to find the alleged prohibition in the indexed pages
describing DEFUN, FLET, or LABELS.  Given that MEMBER is not a special form,
what section of CLtL2 prohibits redefinition?

Also, the analysis advantage, if it does exist, would only apply to
predefined function names.  For user-defined functions, we have
    ;;  Lisp          Scheme
        DEFUN    :    SET!              ; functions can be redefined
    ::  FLET     :    LET               ; or locally rebound
    ::  LABELS   :    LETREC            ; in _both_ languages.
CL _does_ have a way for users to declare things that can't be changed,
but because of CL's "two name-spaces" approach, DEFCONSTANT _can't_ be
used to define non-redefinable functions.  Those Scheme dialects having
define-constant and/or define-integrable _do_ allow users to write
function definitions which can be optimised in the way that we'd like
a generic MEMBER to be optimised.
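
For instance (a sketch only; the spelling and exact semantics of
define-integrable vary from dialect to dialect):

  ;; Declaring the generic searcher integrable licenses the compiler
  ;; to inline calls to it and specialize the predicate at each site.
  (define-integrable (member obj lst pred)
    (if (null? lst)
        #f
        (if (pred obj (car lst)) lst (member obj (cdr lst) pred))))

  ;; After inlining, (member x l eq?) can compile to the same loop as
  ;; a hand-written memq.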

-- 
There is no such thing as a balanced ecology; ecosystems are chaotic.

barmar@think.com (Barry Margolin) (05/19/91)

[I wrote]
>> >I guess the only problem with optimizing out the explicit passing of
>> >the eq argument is that Scheme doesn't prohibit user programs from
>> >redefining standard procedure names as Common Lisp does.  
In article <5816@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>Looking at CLtL2, section 5.3.1, all I can see that bears on this is
>the final paragraph:
>	It is permissible to use DEFUN to redefine a function,
>	to install a corrected version of an incorrect definition,
>	for example.
>	It is permissible to redefine a macro as a function.
>	It is an error to attempt to redefine the name of a ***special form***
>	(see table 5-1) as a function.
>I have not been able to find the alleged prohibition in the indexed pages
>describing DEFUN, FLET, or LABELS.  Given that MEMBER is not a special form,
>what section of CLtL2 prohibits redefinition?

It's in an obscure place, p.260.  It is a list of restrictions on what an
implementation or program may do to symbols in the COMMON-LISP package,
which is why it's in the Packages chapter.  It specifically prohibits
portable programs from defining or binding such symbols as functions or
macros (this is our only concession to "macro hygiene").

>Also, the analysis advantage, if it does exist, would only apply to
>predefined function names.  

That's what I meant by "standard procedure names".
-- 
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

chip@soi.UUCP (Chip Morris) (05/20/91)

In reply to my remark

>>I think that a Schemer's valid excuse should be "oh, my program
>>doesn't change the meaning of EQ, CAR, etc."  And maybe s/he should be
>>able to rely on analysis tools to assert this to a compiler in the
>>bulk of cases where it can be deduced.


barmar@think.com (Barry Margolin) writes:

>But it frequently can't be deduced statically.  Any call to load (or eval,

Of course, all bets are off after calls to LOAD or EVAL.  However, I
found it not too hard to write a rather large program without ever
mentioning either.  I sure would like to have reasonable analyzers.

Indeed, in his reply to my point

>>To consider another example, isn't Scheme's somewhat unbridled
>>treatment of first-class continuations something of an analysis
>>nightmare?

Mr. Margolin says

>Yes, but it is possible to analyze many of these cases statically.  A
>static analyzer can detect when a procedure *is* redefined, but it rarely
>can determine that a procedure *won't* be redefined.

Part of the problem is that people seem to advocate an all-or-nothing
solution.  The existence of LOAD or EVAL is first used to pooh-pooh
the idea of analysis tools to guarantee certain invariants about a
program.  But maybe I'm willing to write to coding standards
(statically verifiable?) that permit such tools to work.  Shouldn't we
permit, if not encourage, such practice?  Surely one could defeat the
tools used to analyze continuations, too.  But why hold up the
wagon train for obscure usages when a huge amount of useful work could
be done on behalf of (analytically) simple programs that could benefit
from analysis and attendant optimization?

-- 
Chip Morris, Senior Engineer
Software Options, Inc., 22 Hilliard St., Cambridge MA 02138  (617) 497-5054
chip@soi.com

barmar@think.com (Barry Margolin) (05/21/91)

In article <chip.674746246@soi> chip@soi.UUCP (Chip Morris) writes:
>Of course, all bets are off after calls to LOAD or EVAL.  However, I
>found it not too hard to write a rather large program without ever
>mentioning either.  I sure would like to have reasonable analyzers.

But the static analyzer has no way of knowing what will be done prior to
loading the file it is analyzing.  This leads into your later point....

>Part of the problem is that people seem to advocate an all-or-nothing
>solution.  The existence of LOAD or EVAL is first used to pooh-pooh
>the idea of analysis tools to guarantee certain invariants about a
>program.  But maybe I'm willing to write to coding standards
>(statically verifiable?) that permit such tools to work.  Shouldn't we
>permit, if not encourage, such practice?

Certainly we should.  In the ANSI CL standard, we're including explicit
specifications that permit compilers to make enough assumptions that a
reasonable level of static analysis can be done.  We prohibit users from
redefining symbols specified in the standard.  We define a set of
assumptions the compiler is permitted to make about the relationship
between the compilation environment and the runtime environment, which
permit a certain amount of intrafile optimization (e.g. many
implementations' "block compilation" facilities).

Making these things explicit causes the standard to be more complex, but we
think it's worth it.


-- 
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar