[comp.lang.prolog] Concerning standards.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/11/91)

There has been some discussion in both comp.lang.prolog and
comp.lang.scheme recently concerning standards for the respective
languages.  I think that the Scheme standard has a great deal to
teach people concerned with the Prolog standard (to the point
that I was flabbergasted to discover that some people concerned
with the Prolog standard with whom I corresponded not only hadn't
seen the Scheme standard, not only didn't know one was under way,
but had barely even heard of Scheme).

Let me start right off by saying that when I saw a draft of the
Scheme standard, my immediate reaction was "at LAST Algol 60 has
a successor!"  I still feel that way.  Scheme is a beautiful
language.  It's almost exactly the right "size".  The Scheme
standard is a superb piece of work.  50-odd pages _including_ a
formal semantics (which transliterates nicely into Scheme.  The
people who wrote that standard really understood the language
they were standardising.

I think it would be an excellent thing if more of the people
concerned with the Prolog standard understood the language they
were standardising.  In particular, one would expect the
formal specification of Prolog to be _about_ the same size as
the formal specification of Scheme.

But a standard is not an exercise in academic symbol-pushing.
The development of a standard should not be treated as an excuse
for CS self-stimulation.  A standard is a practical instrument,
and the important thing about a standard is the practical effect
it actually has.  This is of particular interest because for
many years an explicit goal (indeed, almost the only explicit
goal) of the BSI committee was to develop a "minimal" standard,
and the Scheme standard is just such a "minimal" standard.  So a
very important lesson to be learned from the Scheme standard is
"what is the practical effect of a small standard like this".

I regret to inform you that the answer appears to be "not much".

Let me offer you two examples.

1.  Adding and Subtracting 1.

    Many Lisp systems have functions for adding and subtracting 1.
    Over the years, the names have varied a lot, but Common Lisp
    has finally settled on the names
	(1+ expr)		; add 1 to expr
	(1- expr)		; subtract 1 from expr.
    Now, the last name is a little misleading.   It suggests
    1 - expr, rather than expr - 1.  So the language described in
    "Structure and Interpretation of Computer Programs", namely
    "Scheme", included
	(1+ expr)		; add 1 to expr
	(-1+ expr)		; add -1 to expr == subtract 1

    The Scheme standard, however, includes neither of these
    operations.  They can easily be replaced by the entirely
    equivalent
	(+ expr 1)		; add 1 to expr
	(- expr 1)		; subtract 1 from expr
    and a smart compiler will generate the same code for those
    expressions that it would have generated for 1+ and -1+.
    Indeed, there is nothing in the Scheme standard that requires
    a Scheme system to accept 1+ and -1+ as valid identifiers!

    What is the effect of this?
    a) Some Scheme systems provide 1+, 1-, and -1+
    b) Some Scheme systems provide 1+ and 1-
    c) Some Scheme systems provide 1+ and -1+
    d) Some Scheme systems provide none of them.
    And people using a) or c) systems look at SICP and think they
    are writing portable Scheme code, while people using b) systems
    look at CLtL and think "Scheme is a Lisp, so 1- is portable".

    Frankly, I am getting just a little bit tired of having to go
    through Scheme code fixing references to 1+, 1-, and -1+.
    The identifiers + and - are already exceptional; would adding
    1+ and -1+ to the list of exceptional identifiers been _such_
    a big deal?  And would defining 1+ and -1+ really have made
    the standard too big?

2.  Sorting.

    Sorting is indispensable in Prolog.  It is difficult to write
    a non-trivial efficient Prolog program without using it.  As
    Scheme lacks hash tables, sorting is also very useful in Scheme.
    However, sorting is not part of the Scheme standard or R^(3.99)RS.
    The result is that sorting is often provided, _but_ the only
    thing you can rely on is that
	if there is a non-destructive sort, it is called `sort'
	if there is a destructive sort, it is called `sort!'.
    Apart from that, the variations I have encountered are
 	a) both are provided
 	b) only `sort' is provided
 	c) only `sort!' is provided
 	d) neither is provided
        --
        e) The interface is (sort sequence less?)
        f) The interface is (sort sequence &optional (less? <))
        g) The interface is (sort less? sequence)
        --
        h) Sorting only works on lists
        i) Sorting only works on vectors
        j) `sort' works on lists only, `sort!' on vectors only
        k) Sorting works on both lists and vectors
        --
        l) Sorting is stable
        m) Sorting is not stable
        n) It isn't specified whether sorting is stable or not.
    All of this variation is really rather pointless, and does nothing
    to help people who want to write non-trivial portable code.

    i)   It is just as easy to provide _both_ sort and sort! as it
	 is to provide either.
    ii)  The interface (sort sequence less?) is both consistent with
	 Common Lisp and the majority of Schemes known to me.
    iii) Given vector->list and list->vector, there is no reason not
	 to make sort and sort! work on both lists and vectors.
    iv)  merge sort is not only stable, it is considerably faster than
	 the quicksort chosen by people who haven't read the fine print.

    I have my own implementations of `sort' and `sort!' (based on David
    H. D. Warren's keysort/2 in DEC-10 Prolog) and they are faster than
    the sorts provided in any of the Schemes where I've yet been able to
    make measurements.  But why should _everyone_ have to carry around
    their own sorting functions?  (Their own _different_, incompatible,
    and (if they succumb to the temptation to use quicksort, inefficient)
    sorting functions.)

That's looking at the effect of a `minimal' standard.
Mind you, some of the things the standard _does_ say haven't received
much in the way of conformance, so perhaps it doesn't really matter
that these things _were_ left out of the standard.

The point I want to make is that there is a ``right'' size for a
standardised language, and that many of the things which could have
been defined using the kernel of the language should be included in
a standard library.  `sort' and `sort!' can be implemented well using
only standard operations, and so could `1+' and `-1+' be if they were
legal identifiers.  Perhaps the answer is to have a kernel standard
and a free source-code `specification-quality' library, like ML.  But
Scheme seems to be a successor to Algol 60 in more ways than one, and
I'd hate to see Prolog go the same way.
-- 
Q:  What should I know about quicksort?   A:  That it is *slow*.
Q:  When should I use it?  A:  When you have only 256 words of main storage.

hassan@prl.dec.com (Hassan Ait-Kaci) (06/12/91)

In article <6209@goanna.cs.rmit.oz.au>, ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:

> 
>     I have my own implementations of `sort' and `sort!' (based on David
>     H. D. Warren's keysort/2 in DEC-10 Prolog) and they are faster than
>     the sorts provided in any of the Schemes where I've yet been able to
>     make measurements.

Is there a reference available on this keysort algorithm? I'd appreciate it.
Thanks.

-hak

kend@data.UUCP (Ken Dickey) (06/14/91)

In comp.lang.prolog goanna.cs.rmit.oz.au writes:

>There has been some discussion in both comp.lang.prolog and
>comp.lang.scheme recently concerning standards for the respective
>languages. ...

>1.  Adding and Subtracting 1.

-1 is a negative number, so cannot be a procedure: (- 0 1) -> -1

This was done as part of syntax cleanup for numbers (partly because
people found the procedure names 1+, -1+, etc. confusing; partly because
of parsing complex number syntax: e.g. "-1+3i").


>    Frankly, I am getting just a little bit tired of having to go
>    through Scheme code fixing references to 1+, 1-, and -1+.

This looks like a global-replace+educate to me.  (Note: given the
growth in the Scheme community, most Scheme code had yet to be
written).


>    However, sorting is not part of the Scheme standard or R^(3.99)RS.

This is because we were specifying the language, not a standard
library.  One of the joys of Scheme is the ability to add new data
types (e.g. Windows) without changing the language.


>That's looking at the effect of a `minimal' standard.
>Mind you, some of the things the standard _does_ say haven't received
>much in the way of conformance, so perhaps it doesn't really matter
>that these things _were_ left out of the standard.

As the IEEE standard was just published this month, this should not be
suprising.  Note that authors of virtually all Scheme compilers I know
of were represented on the standards committee (MIT, Chez, MacScheme,
Gambit, Scheme84, Scheme->C, PCScheme, Scheme XEROX, etc.--even a
Japanese group).  I expect all live compilers will be complient within
a few months.


-Ken Dickey				kend@data.uucp

=========================================================================
PS: for those interested...
=========================================================================
The IEEE Standard for the Scheme Programming Language, IEEE Std 1178-1990,
is available from

    IEEE Service Center
    445 Hoes Lane
    P.O. Box 1331
    Piscataway, NJ 08865-1331

or by calling 1-800-678-IEEE.  The order number is SH14209 and the price is
$40, with a 30% discount for IEEE members.  Credit cards and checks are 
accepted.  

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/18/91)

In article <504@data.UUCP>, kend@data.UUCP (Ken Dickey) writes:
> >1.  Adding and Subtracting 1.

> -1 is a negative number, so cannot be a procedure: (- 0 1) -> -1

Irrelevant.  The identifiers in question (used in SICP!) were 1+
and -1+, neither of which is a number.

> This was done as part of syntax cleanup for numbers (partly because
> people found the procedure names 1+, -1+, etc. confusing; partly because
> of parsing complex number syntax: e.g. "-1+3i").

Bogus.  Lisp tokenisers have typically read the whole token into a buffer,
checked to see whether it could be parsed as a number, and if not have
taken it to be a symbol.  Not only do Common Lisp systems _still_ have to
work that way, every actual Scheme implementation where I have been able
to look at the code DOES work that way.  Doesn't prove they all _do_, but
does prove they _could_ without otherwise violating the standard.  And
the argument falls flat on its face in a steaming pile of you-name-it-but-
it-ain't-roses when you remember that my complaint was not that the
standard says you CAN'T use 1+ or -1+ as identifiers (it just doesn't say
you CAN) and that in consequence most Schemes *DO* provide these operations,
it just that you can't _quite_ rely on them, because there are a couple
that don't.

> >    Frankly, I am getting just a little bit tired of having to go
> >    through Scheme code fixing references to 1+, 1-, and -1+.

> This looks like a global-replace+educate to me.

Sorry mate, you've forgotten that Scheme can do things with functions
other than just call them.  Global replace doesn't cut it.  Variables
are sometimes initialised to 1+ and -1+, e.g.
	(define add1 1+)
	(define sub1 -1+)
At this point, add1 and sub1 HAVE to be given a suitable definition.
Once you've got it, why not use it?  Of course, if you're using something
like TRANSOR or SEdit your global-search-and-replace MAY be able to
handle
	((if (< x 0) 1+ -1+) x)
but mine (even in Emacs) isn't.  (Things can, and do, get worse.)

There really wasn't anything to gain by breaking so much of SICP.
(Deleting atom? seemed rather petty, too, and has had the same effect.)

> (Note: given the
> growth in the Scheme community, most Scheme code had yet to be
> written).

Sorry, but that's not an answer.  The TEXTBOOKS had ALREADY been written.
Dybvig, for example, uses 1+.  Nobody learning Scheme from that book
would suspect that it was not in fact part of Scheme.  Nor from SICP.
And one of the Scheme manuals I have, whose authors took great care to
flag everything that isn't standard, _don't_ flag 1+ because when they
wrote the manual it *was* in the R^nRS.

I note that John Stobo's book on Prolog has met the same fate,
although since his book described the new language that the committee
he was on were busy designing, the language he described is one which
_never_ came into existence.  How very sad for him.  And if he had
stuck to describing say, C Prolog or DEC-10 Prolog, some of his readers
would have been able to use the book and would still be able to use it.


> >    However, sorting is not part of the Scheme standard or R^(3.99)RS.

> This is because we were specifying the language, not a standard
> library.

Sorry, that isn't true.  The last draft of the standard I saw (we've
got an order for the actual thing in the pipe-line somewhere) included
assq, assv, assoc, memq, memv, member.  Those operations can be defined
entirely in Scheme.  (Book 1, Lesson 3.)  What are they, if not
"standard library"?  In DEC-10 Prolog, member/2, and append/3 were not
built in; they were part of the library.  (length/2 could have been, as
well.)  Let's face it, (length -), (member - -), (append - -...) are
all library functions that _could_ have been left out of the standard.
(substring - - -) could have been left out (as (subvector - - -) has,
to my annoyance).  That's 9 so far.  In fact, I have my own implementations
of (write obj [port]) and (display obj [port]), because I wanted versions
with depth bounds, length bounds, and output base control.  The core takes
about a page of code, because everything you need is there in the language.
(This is not true for Prolog, because of variables.  write/1 is _not_
definable in `kernel' Prolog.)  Now we're up to 11; and (write - -) and
(display - -) are comparable in complexity to (sort - -).

There's no way out of it:  the Scheme standard DOES include quite a few
operations that _could_ have been put in a PD library instead.  The only
question is HOW BIG a library you are going to include.  Common Lisp's
designers have said "We are interested in helping people be more
productive; the less they have to write again the more of their _own_
work they can do".  The problem is that you can get a system which
overwhelms you with things you don't want.

Ok, one answer is to HAVE an explicit library.  To say, we'll leave this
out of the kernel of the standard.  If you want it, you have to ask for
it to be loaded.  But it will be _there_ in a form that can be loaded;
you don't have to write it.  (That's the DEC-10/Quintus library approach.)

> >That's looking at the effect of a `minimal' standard.
> >Mind you, some of the things the standard _does_ say haven't received
> >much in the way of conformance, so perhaps it doesn't really matter
> >that these things _were_ left out of the standard.
> 
> As the IEEE standard was just published this month, this should not be
> suprising.

Yes it should.  It has been easy to get drafts of the standard.  I have
in fact encountered a product that claims conformance, but lies.

> I expect all live compilers will be complient within a few months.

How much are you willing to bet?
-- 
Q:  What should I know about quicksort?   A:  That it is *slow*.
Q:  When should I use it?  A:  When you have only 256 words of main storage.