[comp.lang.scheme] "unspecified" and SET!

dorai@titan.rice.edu (Dorai Sitaram) (06/19/89)

The recent discussion of "unspecified" values reveals that Scheme is
not as syntactically uniform as it is claimed to be. Some Scheme
syntactic forms return meaningful values, others don't: the former are
"expressions" and the latter "commands". "Commands" are usually used
to represent side-effects, e.g., assignment in the form of set!.  In
an expression-oriented language, a "command" is awkwardly "expressed"
by an expression returning an arbitrary "unspecified" value.

Perhaps, SET! is not the ideal side-effecting construct for an
expression-oriented language. Felleisen and Friedman [1] propose a
side-effecting construct called a SIGMA-capability. This looks just
like a LAMBDA-expression, but whereas LAMBDA introduces new bound
variables, SIGMA modifies the binding of existing lexical variables.
In other words, on applying a SIGMA-capability to arguments, the
variables corresponding to the SIGMA-parameters are side-effected.

Problems about "unspecified" return values disappear. The application
of a SIGMA-capability side-effects the SIGMA-parameters and returns
the result of evaluating the SIGMA-body (the parallel being the
application of a LAMBDA-expression). In addition, SIGMA gives a very
natural method for parallel assignment, since the arguments to a
SIGMA-capability are all evaluated before the application performs the
side-effects.

We can retrieve the old SET! from SIGMA as follows:
	(set! x a) == ((sigma (x) <what-now>) a)
where <what-now> is the "unspecified" value of your choice. Thus,
there is no dependence on a system-chosen unspecified value.

Even better, we need not fall back on a command with its necessarily
arbitrary choice of unspecified return value. We can define a
_generalized_ SET! in terms of SIGMA that is more along the lines of
LET in terms of LAMBDA:
	(set! ([x a] ...) e ...) == ((sigma (x ...) e ...) a ...)

The upshot of all this is that while we have side-effects, we still
have a syntactically uniform language.

--dorai

[1] Felleisen & Friedman, A calculus for assigments ..., POPL 1987,
314 - 325.

ps: In terms of SET!, SIGMA's effect is as follows:
	(sigma (x ...) e ...) == (lambda (x^ ...) (set! x x^) ... e ...)
where x^ ... are new variables, used only to show what's going on. A
native implementation of SIGMA is simply got by mimicking LAMBDA,
except for the following: where LAMBDA has extend-environment, SIGMA
has update-environment.
-------------------------------------------------------------------------------
It may be that the gulfs will wash us down;
It may be we shall touch the Happy Isles.
-------------------------------------------------------------------------------

Alan@REAGAN.AI.MIT.EDU (Alan Bawden) (06/20/89)

    Date: 18 Jun 89 23:17:50 GMT
    From: Dorai Sitaram <titan!dorai@rice.edu>
    ...
    Perhaps, SET! is not the ideal side-effecting construct for an
    expression-oriented language. Felleisen and Friedman [1] propose a
    side-effecting construct called a SIGMA-capability. This looks just
    like a LAMBDA-expression, but whereas LAMBDA introduces new bound
    variables, SIGMA modifies the binding of existing lexical variables.
    In other words, on applying a SIGMA-capability to arguments, the
    variables corresponding to the SIGMA-parameters are side-effected.

    Problems about "unspecified" return values disappear....

I'm having a hard time applying this idea to the NEWLINE procedure -- which
is specified in R3RS to return an unspecified value.  Is the idea to
introduce a side-effecting construct that looks like LAMBDA, but performs a
newline when applied to an output port?

  ((NULINE () <body>)
   (CURRENT-OUTPUT-PORT))

dorai@titan.rice.edu (Dorai Sitaram) (06/20/89)

In response to my posting suggesting SIGMA and SIGMA-based SET! (a la
LAMBDA and LET) instead of traditional SET!, I received email which
basically states that SIGMA violates the orthogonality of Scheme
constructs, viz., SIGMA does "two" things: it "side-effects the
environment" _and_ "returns a value".

In other words, it is felt that any side-effecting construct should
just perform the job it was made for, viz., side-effecting, and not do
something else too, like returning values. SIGMA allegedly "fails" in
the latter detail.

But then, we should remember that our goal was to preserve the
syntactic uniformity of Scheme. Scheme is (and very deliberately too)
an expression-oriented language. In spite of our best efforts,
traditional SET! does return a value. Given that whichever primitive
side-effecting construct we pick will inevitably return a value, which
construct should we choose? I would prefer one where the returned
value is not only meaningful but _necessary_.

The value that SET! returns is totally arbitrary. What SIGMA returns
is crucial to its correctness. Recall that SIGMA does not perform a
side-effect immediately; it returns a closure, which needs to be
applied to value(s) before any side-effecting can take place.

(Incidentally, LAMBDA, the cornerstone of Scheme, also has "two"
purposes _if_ we take "returning a value" as a distinguishing property
in assessing the orthogonality of Scheme. LAMBDA introduces fresh
lexical variables _and_ returns a closure. I don't think anyone
seriously wants to reject LAMBDA for lack of orthogonality!)

--dorai

ps: Other questions:

* Why was SIGMA created? Felleisen & Friedman (POPL 1987) devised a
  calculus for a lambda-calculus-like language with assignments. SIGMA
  kind of falls out naturally. 

* Why is it "superior" to SET!? "Superior" is a tough word. However, I
  did mention the naturalness of parallel assignment in my orig.
  posting. Using SIGMA-capabilities as arguments to control-reifiers
  like call/cc is also delicious.

* There is also another reason why the SIGMA-model of assignment, viz.,
  (sigma (x ...) e ...) or (set! ([x a] ...) e ...), makes sense. In any
  program, one always performs some operation _after_ the
  side-effecting (of variable-bindings(see + below)) has been done. If
  the side-effecting were the last thing done in the program, there is
  no way it could be detected, hence no way it could be meaningful.
  Thus the application of a SIGMA-capability returns the value of the
  body of the SIGMA-expression, something (and such a something
  invariably exists) that is evaluated _after_ the side-effecting has
  taken place.

+{PRINTF is the only side-effect which can meaningfully be the last
  thing in a program.}

* [Alan Bawden's question about NEWLINE...]
  How do you deal with the other "commands" of Scheme? (E.g., READ,
  SET-CAR!, PRINTF.) Easier than SET!. SIGMA-like counterparts need
  _not_ be made for these, as that would lead to a plethora of special
  forms (as opposed to 'procedures'). All of these "command"
  procedures now take an additional argument for "the body" (thunk
  actually) whose value is returned. 

{* SIGMA of course is necessarily a special form (i.e., it cannot be
   substituted by a procedure as for READ, SET-CAR!, PRINTF above).
   The reason is variable names can't be passed as procedure arguments.}

--dorai
-------------------------------------------------------------------------------
It may be that the gulfs will wash us down;
It may be we shall touch the Happy Isles.
-------------------------------------------------------------------------------

manis@grads.cs.ubc.ca (Vincent Manis) (06/20/89)

I've bitten my tongue on this, but I really can't see the point of
#!unspecified. Perhaps it would be nice in theory (especially if 

(define loop  
  (lambda (x)
      (loop (1+ x))))

returned #!unspecified, which would then be equivalent to `bottom' in
the Scott-Strachey world), but it seems pointless.

One can argue, as somebody has, that the existing behaviour
distinguishes between expressions and commands; however, Scheme already
has a strong notion of side effects (unlike, say, Miranda), and
whether a side-effecting procedure yields (), #f, an unspecified
result, or #!unspecified really makes no difference to anybody, except
in one place, which I'll mention in a moment. 

I regard `returns an unspecified result' as an injunction not to place
a call to a procedure in a value-returning position, e.g., the last
step in a lambda or begin. If I wrote a Scheme compiler, fascist that
I am, I would probably generate an error message for such a usage.
Going to the trouble to develop a calculus of unspecified values
strikes me as almost as silly as some of the things I did when I was
involved in writing an Algol 68 compiler.

The one place where #!unspecified makes some sense is in
Read-Eval-Print loops. One can use #!unspecified as a trigger that the
result is not to be printed. But then one does not need to change the
language: PC Scheme, for example, has *the-non-printing-object*. 

If it ain't broke, don't fix it.
____________  Vincent Manis                    | manis@cs.ubc.ca
___ \  _____  The Invisible City of Kitezh     | manis@cs.ubc.cdn
____ \  ____  Department of Computer Science   | manis%cs.ubc@relay.cs.net
___  /\  ___  University of British Columbia   | uunet!ubc-cs!manis
__  /  \  __  Vancouver, BC, Canada V6T 1W5    | (604) 228-2394
_  / __ \  _  "Theoretical computer science helps me convince people that
____________  my indecisiveness is really Nondeterminism, which sounds like
              a much more positive characteristic." -- a student

pierce@lanai.cs.ucla.edu (Brad Pierce) (06/20/89)

In article <2281@ubc-cs.UUCP> manis@grads.cs.ubc.ca (Vincent Manis) writes:
  << Stuff omitted. >>
>I regard `returns an unspecified result' as an injunction not to place
>a call to a procedure in a value-returning position, e.g., the last
>step in a lambda or begin. If I wrote a Scheme compiler, fascist that
>I am, I would probably generate an error message for such a usage.
  << Stuff omitted. >>

This is really not a criticsm, only a question. How does one go about
building a compiler that can detect when a procedure is going to 
return an unspecified result, especially those which sometime return
unspecified values and sometimes return specified values? For example,
in an object-oriented program most objects will respond to some
messages with actual information and to some messages with unspecified
values. On the face of it, it sounds very difficult to determine such
things syntactically. And wouldn't runtime checking amount to 
introducing the unspecified object?

It doesn't seem right to me that a user should have to know the 
full specifications of an object before being able to confidently
send messages to that object.  It would seem preferable to me to
be able to check whether the result returned has any information
content.

Also in this discussion I haven't heard any mention of my suggestion
to see the unspecified object as being analogous to NaN in the IEEE
numeric standard. Is that a bad analogy? Is NaN a bad idea? Just
wondering.

-- Brad

ALAN@ML.AI.MIT.EDU (Alan Bawden) (06/21/89)

    Date: 19 Jun 89 23:24:37 GMT
    From: Dorai Sitaram <titan!dorai@rice.edu>
      How do you deal with the other "commands" of Scheme? (E.g., READ,
      SET-CAR!, PRINTF.) Easier than SET!. SIGMA-like counterparts need
      _not_ be made for these, as that would lead to a plethora of special
      forms (as opposed to 'procedures'). All of these "command"
      procedures now take an additional argument for "the body" (thunk
      actually) whose value is returned. 

So we would write

  (NEWLINE (CURRENT-OUTPUT-PORT)
    (LAMBDA ()
      (WRITE X (CURRENT-OUTPUT-PORT)
	(LAMBDA () X))))

to newline, and then output and return the current value of X.  (Or perhaps
the continuations are passed as the first argument -- it doesn't matter in
the current discussion.)

Now it would seem to me that the most uniform way to include assignment in
this scheme, is to have SET! take a continuation as well.  So

  (SET! X (+ X 1)
    (LAMBDA () X))

would increment the value of X and return the new value.

SET! is still a keyword, so this isn't -exactly- parallel with the
side-effecting procedures.  But I'll bet that once people start writing
programs using this scheme, the thing that will impress them the most is
that whenever they perform a side-effect they have to create an explicit
continuation to wait for the side-effect to happen.  And I'll bet they will
be most comfortable if they have to make exactly the same kind of
continuation for SET!.

I'm currently working on a language where users really -must- control
sequencing by explicitly using thunks, so I'm very interested in linguistic
constructs that support this style.  Actually, in my case its worse than in
plain Scheme, because users must also exert control over when variables are
-read- as well as written.  Thus users have to write something like

  (LOOK! X
    (LAMBDA (VALUE)
      (SET! X (+ 1 VALUE)
	(LAMBDA ()
	  <whatever>
	  ))))

to increment the value of X, because if they wrote

    (SET! X (+ 1 X)
      (LAMBDA ()
	<whatever>
	))

the occurrence of X in (+ 1 X) might be taken to refer to the value of X
-after- the SET! happens (so that the value of X would become the fixed
point of (lambda (n) (+ 1 n))!).

mkatz@SESAME.STANFORD.EDU (Morris Katz) (06/22/89)

At the R*RS meeting in Snowbird we discussed the possibility of adding some
form of multiple values to Scheme and allowing continuations to accept multiple
values.  I wonder if the solution to the problem of what should be returned by
side effecting procedures like SET! is to just have them return no values (i.e.
a zero arity multiple value).  As long as continuations can handle multiple
values appropriately, it seems to me that this solution would have many, if not
all, of the advantages of #!UNSPECIFIED without introducing many of the
undesirable side effects.

					Morry Katz
					katz@polya.stanford.edu

P.S.  After the meeting at Snowbird, Guy Steele and I came to an agreement as
      to the correct of the two conflicting semantics for multiple values to
      continuations that we proposed.  At that time, we believed that this
      solution would probably be acceptable to the great majority of the
      community.  Is there interest in having the proposal written up for
      future consideration?  I would be willing to strain my memory in an
      attempt to creat such a proposal if there is genuine interest.

chaynes@iuvax.cs.indiana.edu (Chris Haynes) (06/22/89)

   From: mkatz@SESAME.STANFORD.EDU (Morris Katz)
   Newsgroups: comp.lang.scheme
   Date: 21 Jun 89 21:17:54 GMT
   ...
   P.S.  After the meeting at Snowbird, Guy Steele and I came to an agreement as
	 to the correct of the two conflicting semantics for multiple values to
	 continuations that we proposed.  At that time, we believed that this
	 solution would probably be acceptable to the great majority of the
	 community.  Is there interest in having the proposal written up for
	 future consideration?  I would be willing to strain my memory in an
	 attempt to creat such a proposal if there is genuine interest.

I'm interested.

gateley@m2.csc.ti.com (John Gateley) (06/22/89)

In article <2281@ubc-cs.UUCP> manis@grads.cs.ubc.ca (Vincent Manis) writes:
>Going to the trouble to develop a calculus of unspecified values
>strikes me as almost as silly as some of the things I did when I was
>involved in writing an Algol 68 compiler.
 ...
>____________  Vincent Manis                    | manis@cs.ubc.ca
 ...

Do you mean Matthias Felleisen's calculus for Scheme? It is neither a 
calculus of unspecified values nor silly. It is a better theoretical
basis for Scheme than the lambda value calculus: it includes both
set! style side effects and continuations in its equations.

John
gateley@m2.csc.ti.com

manis@grads.cs.ubc.ca (Vincent Manis) (06/23/89)

In article <81872@ti-csl.csc.ti.com> gateley@m2.UUCP (John Gateley) writes:
>Do you mean Matthias Felleisen's calculus for Scheme? It is neither a 
>calculus of unspecified values nor silly. It is a better theoretical
>basis for Scheme than the lambda value calculus: it includes both
>set! style side effects and continuations in its equations.

To clarify, let me say emphatically that I *don't* mean Felleisen's
calculus. As John says, Felleisen's work is not at all silly. I was
instead referring to the proposals for #!unspecified, which, is either
the same as `bottom' in denotational semantics or it is not. In the
former case, #!unspecified should be returned, also, in both a
division by 0 and a non-terminating recursion. In the latter case,
#!unspecified is somewhat pointless (set! could just as easily return
`()), unless one has some reason for distinguishing the two.

While I do not care for William of Ockham's metaphysics, there is much
to say for his Razor. 

____________  Vincent Manis                    | manis@cs.ubc.ca
___ \  _____  The Invisible City of Kitezh     | manis@cs.ubc.cdn
____ \  ____  Department of Computer Science   | manis%cs.ubc@relay.cs.net
___  /\  ___  University of British Columbia   | uunet!ubc-cs!manis
__  /  \  __  Vancouver, BC, Canada V6T 1W5    | (604) 228-2394
_  / __ \  _  "Theoretical computer science helps me convince people that
____________  my indecisiveness is really Nondeterminism, which sounds like
              a much more positive characteristic." -- a student

dorai@titan.rice.edu (Dorai Sitaram) (06/26/89)

In article <2281@ubc-cs.UUCP> manis@grads.cs.ubc.ca (Vincent Manis) writes:
>Going to the trouble to develop a calculus of unspecified values
>strikes me as almost as silly as some of the things I did when I was
>involved in writing an Algol 68 compiler.

If this is in response to my posting(s) entitled "unspecified and
set!"  (and I'm not sure it is -- Vincent doesn't say), I request
permission to shriek, "No! This is not what I meant!" ;-)

I was drawing attention to side-effecting constructs in the literature
(that they appeared in a calculus-definition is of secondary
importance) that give all the "power" of current Schemes WITHOUT
having to mess with (returning) unspecified values at all. I do, of
course, concur with Vincent that having an unspecified object like
#!unspecified is close to being the all-time great oxymoron of our
troubled times (though he might not use the selfsame words ;-]).

Second, the comparison that Vincent draws between "(#!)unspecified"
and "bottom" is not correct. "Bottom" stands for divergence
(non-termination) and occasionally, errors; "(#!)unspecified" is a
stopgap value concocted to stand for a meaningless value returned in a
TERMINATING computation.

--dorai
-------------------------------------------------------------------------------
It may be that the gulfs will wash us down;
It may be we shall touch the Happy Isles.
-------------------------------------------------------------------------------