[comp.lang.misc] Expression Based Language

pierson@mist (Dan Pierson) (12/29/88)

In article <9282@ihlpb.ATT.COM>, nevin1@ihlpb (Liber) writes:
>This really has nothing to do with ":=" vs "=".  C is an
>expression-based language; Pascal is not (by expression-based I mean
>that every operator returns a value that can be used as one of the
>parameters for another operator).

OK, you get the grumble instead of Peter :-)

Since you're defining your own term, I can't flatly say you're wrong,
BUT this definition of "expression based" contradicts every other
definition I'm familiar with.  C and Pascal are both statement based
languages, however C has a larger set of legal expressions.  BLISS and
Lisp are both expression based languages, i.e. everything is an
expression and returns a value.  

If C was an expression based language, the ?: expression would be
totally redundant with a normal if statement (of course it would
still be more compact).  If C was an expression based language the
following would be legal:

   foo = switch (bar) {
            case 'a': ...
            case 'b': ...
            default: ...
         };

This is not fantasy, the equivalent expressions exist and are useful
in BLISS and Lisp.
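
A minimal sketch of the distinction in standard C, for comparison. The
function names are made up for illustration. The first version needs a
temporary because `if' is a statement and yields nothing; the second uses
?:, the one conditional form that C does treat as an expression.

```c
/* Hypothetical helpers; the names are illustrative only. */

/* Statement form: `if` produces no value, so a temporary is needed. */
int sign_stmt(int n)
{
    int result;

    if (n < 0)
        result = -1;
    else if (n > 0)
        result = 1;
    else
        result = 0;
    return result;
}

/* Expression form: ?: yields a value directly. */
int sign_expr(int n)
{
    return n < 0 ? -1 : (n > 0 ? 1 : 0);
}
```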
-- 
                                            dan

In real life: Dan Pierson, Encore Computer Corporation, Research
UUCP: {talcott,linus,necis,decvax}!encore!pierson
Internet: pierson@multimax.encore.com

bga@raspail.UUCP (Bruce Albrecht) (12/30/88)

Another expression based language is Algol 68.  Not only is

a := case month
       in 31, if year mod 4 = 0 then 29 else 28 fi, 31, 30, 31, 30, 31, 31,
       30, 31, 30, 31 esac;
legal, but so is

if a=b then a else c fi := d;

and they can be abbreviated to
a := (month|31, (year mod 4 = 0|29|28), 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);

and
(a=b|a|c) := d;
respectively.  (And who said C could be terse and unreadable?)
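
For comparison, a rough C rendering of the two Algol 68 examples, as a
sketch only. The function names and the lookup table are made up here. C
has no case expression, so the first example becomes an array lookup, and
the conditional on the left of an assignment has to go through pointers,
because the result of ?: is not an lvalue in C.

```c
#include <assert.h>

/* Days in a month (1..12), using the same simplified leap-year rule
   (year mod 4 = 0) as the Algol 68 example. */
int days_in_month(int month, int year)
{
    int days[12] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };

    assert(month >= 1 && month <= 12);
    days[1] = (year % 4 == 0) ? 29 : 28;
    return days[month - 1];
}

/* Rough equivalent of (a=b|a|c) := d;
   store d into a when a equals b, otherwise into c. */
void cond_assign(int *a, int b, int *c, int d)
{
    *(*a == b ? a : c) = d;
}
```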

peter@ficc.uu.net (Peter da Silva) (12/30/88)

In article <4505@xenna.Encore.COM>, pierson@mist (Dan Pierson) writes:
> In article <9282@ihlpb.ATT.COM>, nevin1@ihlpb (Liber) writes:
> >This really has nothing to do with ":=" vs "=".  C is an
> >expression-based language; Pascal is not (by expression-based I mean
> >that every operator returns a value that can be used as one of the
> >parameters for another operator).
> 
> OK, you get the grumble instead of Peter :-)

Grumble away.

> Since you're defining your own term, I can't flatly say you're wrong,
> BUT this definition of "expression based" contradicts every other
> definition I'm familiar with.

You're right. However I once modified a version of the small-C compiler
to make 'C' a completely expression-based language. The changes are really
very minor (at least for that case... I don't know quite how full 'C'
would take it).

> If C was an expression based language, the ?: expression would be
> totally redundant with a normal if statement (of course it would
> still be more compact).

The small-C I was working with didn't have '?:', one reason I went to
the trouble. The other reason was I'd just read the BCPL book by Colin
Whitby-Strevens. While I was making anonymous integer arrays, I went
over the edge into algol territory.

> If C was an expression based language the following would be legal:
  [ foo = switch(bar) {...} ]

Well, if I'd had a switch statement it would have been.

> This is not fantasy, the equivalent expressions exist and are useful
> in BLISS and Lisp.

And, I believe, Algol.

Anyway, it was loads of fun.
-- 
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.
Work: uunet.uu.net!ficc!peter, peter@ficc.uu.net, +1 713 274 5180.   `-_-'
Home: bigtex!texbell!sugar!peter, peter@sugar.uu.net.                 'U`
Opinions may not represent the policies of FICC or the Xenix Support group.

nevin1@ihlpb.ATT.COM (Liber) (12/31/88)

In article <2583@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>In article <4505@xenna.Encore.COM>, pierson@mist (Dan Pierson) writes:

>I once modified a version of the small-C compiler
>to make 'C' a completely expression-based language. The changes are really
>very minor (at least for that case... I don't know quite how full 'C'
>would take it).

This won't work too well in (full) C.  The problem is that pure
expression-based languages (like LISP) tend to be typeless as well.
Having to return a specific data type severely limits the usefulness of making
control-flow constructs (if, switch, for, etc.) expression-oriented.

>> If C was an expression based language, the ?: expression would be
>> totally redundant with a normal if statement (of course it would
>> still be more compact).

That is why the ?: operator is limited to returning values of the same
type (or coercible type), while the if statement is not.

>> This is not fantasy, the equivalent expressions exist and are useful
>> in BLISS and Lisp.

I agree.

Side note:  In LISP, I tend to think of the side-effect of an expression
as what it does, while its main purpose is to return a value (eg: the
side effect of setq is to assign a value to a variable, while its main
effect is to return its value or the variable; I can't remember which).
In Icon, I tend to reverse these definitions.  Is it just me, or is there
something about the different paradigms that switch these definitions?

Happy New Year!
-- 
NEVIN ":-)" LIBER  AT&T Bell Laboratories  nevin1@ihlpb.ATT.COM  (312) 979-4751

rej@ukc.ac.uk (R.E.Jones) (01/03/89)

In article <9310@ihlpb.ATT.COM> nevin1@ihlpb.UUCP (55528-Liber,N.J.) writes:
>This won't work too well in (full) C.  The problem is that pure
>expression-based languages (like LISP) tend to be typeless as well.

Modern functional languages tend to be strongly typed.

>Having to return a specific data type severely limits the usefulness of making
>control-flow constructs (if, switch, for, etc.) expression-oriented.

Quite right ... IF you have to return a specific data type. However, modern
functional languages like Miranda, employ a polymorphic type discipline.
One therefore gets all the benefits of more powerful combining forms without
having to rewrite essentially the same code for each specific data type.

Richard Jones.

miller@lll-crg.llnl.gov (Patrick Miller) (01/04/89)

In article <9310@ihlpb.ATT.COM> nevin1@ihlpb.UUCP (55528-Liber,N.J.) writes:

>This (using expression forms) won't work too well in (full) C.
>The problem is that pure
>expression-based languages (like LISP) tend to be typeless as well.
>Having to return a specific data type severely limits the usefulness of making
>control-flow constructs (if, switch, for, etc.) expression-oriented.

Strongly typed functional languages like SISAL support this very concept.
Constructs like:

	IF (test 1) THEN
	    exp 1
	ELSIF (test 2) THEN
	    exp 2
	ELSE
	    exp 3
	ENDIF

are required where exp 1, exp 2, exp 3 are all of the same type.  This is
not as restrictive as one might think.

						Patrick Miller

Patrick J. Miller		miller@lll-crg.llnl.gov
uucp:				{gatech,pyramid,rutgers}!lll-crg!miller
other things to try:		miller%lll-crg.llnl.gov@relay.cs.net

eric@snark.UUCP (Eric S. Raymond) (01/05/89)

In article <9310@ihlpb.ATT.COM> nevin1@ihlpb.UUCP (55528-Liber,N.J.) writes:
>This (using expression forms) won't work too well in (full) C. The problem is
>that pure expression-based languages (like LISP) tend to be typeless as well.
>Having to return a specific data type severely limits the usefulness of making
>control-flow constructs (if, switch, for, etc.) expression-oriented.

To see that this is false, try a simple thought experiment. Imagine that you
have modified your C compiler so that the language now has two new properties:

1. Every block construct { <st1>; <st2>; .... <stn> ;} returns, as an assignable
   rvalue, the value of the nth statement <stn>.

2. The for, while, do, and case constructs return the value of the last block
   executed before termination; that is, the value of the last *statement*
   executed before termination.

Voila! Expression-oriented C. Trivial to implement, and *incredibly* useful
(if you've ever written LISP you know why). Look at what we could *drop* from
the language -- the sequential-execution comma and ?: operators, for starters.
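
A small sketch of what standard C offers today in place of property 1: the
comma operator sequences expressions and yields the last one, and ?:
selects between two values. The function names are invented for
illustration. A value-returning block would cover both cases.

```c
/* Comma operator: evaluates left to right and yields the last operand,
   much as a value-returning block { a = 3; b = a + 1; a * b; } would. */
int last_of_sequence(void)
{
    int a, b;

    return (a = 3, b = a + 1, a * b);
}

/* ?: plays the role an if-expression would play. */
int larger(int x, int y)
{
    return x > y ? x : y;
}
```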

If you're willing to break some old code, try thinking of function-argument
comma as an aggregation operator...then consider what foo = {4, 5, 2} might
mean for foo of type, say, (int [3]). First class vector arithmetic, anyone?
Now think about the possibilities for natural expression of parallelism in
expressions like {foofunc(), 66, p + q}.

These are *my* impossible dreams for the future of C. Hey, Bjarne -- any
chance you might want to include 1 and 2 above in some future C++ ;-)?
-- 
      Eric S. Raymond                     (the mad mastermind of TMN-Netnews)
      Email: eric@snark.uu.net                       CompuServe: [72037,2306]
      Post: 22 S. Warren Avenue, Malvern, PA 19355      Phone: (215)-296-5718

karl@haddock.ima.isc.com (Karl Heuer) (01/06/89)

(I've added comp.lang.c to the distribution, since a lot of D-designers hang
out there, but I've redirected followups back to comp.lang.misc, since that's
where the discussion seems to belong.  c.l.c readers should subscribe to c.l.m
if they want to continue this thread.)

In article <9310@ihlpb.ATT.COM> nevin1@ihlpb.UUCP (55528-Liber,N.J.) writes:
>In article <2583@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>>I once modified a version of the small-C compiler to make 'C' a completely
>>expression-based language. The changes are really very minor (at least for
>>that case... I don't know quite how full 'C' would take it).
>
>This won't work too well in (full) C.  The problem is that pure
>expression-based languages (like LISP) tend to be typeless as well.
>Having to return a specific data type severely limits the usefulness of making
>control-flow constructs (if, switch, for, etc.) expression-oriented.

Let's try to define EC, an extension of C with the property that every
statement is an expression, yet retaining the property that every expression
has a type.  All constructs with no reasonable value definition become void
expressions.  This includes else-less if, default-less switch, any if- or
switch-statement with branches of incompatible type, and any top-testing loop.
Do-while can be defined to return the value of the last iteration of the loop
body.

The restriction against test-at-top is annoying, but the language must be
prepared to have a value for a loop that executes zero times.  This suggests
that one ought to be able to attach an `else' clause to such a loop.  (I once
had a student who kept trying to do this in Pascal...)  Of course, since
`while (A) B;' is equivalent to `if (A) do B; while (A);', this functionality
is already available by explicitly writing `if (A) {do B; while (A);} else C;',
but a simple `while (A) B; else (C);' would be a more compact way to write it.
(Though it would break C compatibility.)
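
The rewriting described above can be sketched in standard C. This is only
an illustration with made-up names: the loop scans a zero-terminated
array, the value of the loop is the last element visited, and the else
branch supplies the value for the zero-iteration case.

```c
/* Emulates  result = while (A) B; else C;  using the equivalent
   if (A) { do B; while (A); } else C;  form.
   `fallback` is the hypothetical else-value for an empty list. */
int last_element(const int *p, int fallback)
{
    int result;

    if (*p != 0) {
        do {
            result = *p;    /* B: the value of the loop body */
            p++;
        } while (*p != 0);
    } else {
        result = fallback;  /* C: the loop ran zero times */
    }
    return result;
}
```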

What about break statements?  We'd probably be better off with a syntax
change, writing `break (EXPR);' to specify the value to be returned by the
enclosing loop or switch.  If we use the current syntax, then in order for a
statement `switch (A) { case 0: B; break; default: C; break; }' to have the
obvious value (A==0?B:C), the rule must be that the value of a switch
statement is that of the statement *preceding* the break.  An analogous rule
could be applied to a break inside a do-while.  (As always, when the rules do
not apply or do not yield a compatible type, the entire expression would be
void-valued for all paths.)

I'm not sure what to do about continue.  If the `else' extension is added,
then a continue which causes the loop to exit (because the retested condition
is now false) could trigger the `else' clause.  (Which would mean that even a
do-while would have use for an else.)  Alternately, a continue could obtain a
value in the same way that a break does, but use it only if the loop test
fails.

Passing values through a goto doesn't seem to work well, since a label by
itself is not a valid statement.  Well, I guess we could say that a labeled
null statement has the value of the statement preceding the goto, but that's
stretching the idea a bit too far, I think.  (I did once suggest this for a
dialect of TECO: I wanted `5 Ofoo$ ... !foo! UX' to assign 5 to register X,
which TECO-10 didn't do.)

(Given that so many statements end up being void-valued expressions, is there
any point to this language extension?  Yes.  There are contexts where a
statement is not legal, but an expression is, even though its value is not
used.)

Of course, a simpler approach would be to add valof...resultis from BCPL, or
inline functions from C++, but it's not really the same as a true expression
language.  On the other hand, a language with expression/statement equivalence
probably shouldn't be based on C in the first place, since several constructs
become redundant (braces vs parens, semicolon vs the comma operator, if-then
vs the ternary operator).

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

w-colinp@microsoft.UUCP (Colin Plumb) (01/06/89)

Actually, defining the semantics of such things is easy.
A friend of mine did it in a language design where {} created first-class
procedures.  I.e. you didn't call

while(cond) {
	body...
}

you called

while ({cond}, {
	body
})

and while was defined as (in a C-like syntax, where "int foo()" means a
*procedure* not a pointer to one):

void while(bool test(), void body())
{
	if (test()) {
		body();
		while(test, body);
	}
}

which was, of course, really written more like

void while (bool test(), void body()) =
{
	ifthen(test, {body; while(test, body)})
}

There was, of course, syntactic sugar to make this more usable, but
while and relatives were in fact not part of the language, merely part
of the standard library.
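
Something similar can be approximated in ordinary C with function
pointers. This is a sketch under assumptions, not the language described
above: C has no first-class blocks, so an explicit context pointer stands
in for whatever the procedures would have captured, and all the names
here are hypothetical.

```c
#include <stdbool.h>

/* `while` is a reserved word in C, so the hypothetical library routine
   is called while_loop here. ctx carries the state that a first-class
   procedure would otherwise capture. */
void while_loop(bool (*test)(void *ctx), void (*body)(void *ctx), void *ctx)
{
    if (test(ctx)) {
        body(ctx);
        while_loop(test, body, ctx);   /* recursion stands in for iteration */
    }
}

/* Example condition and body: count an int up to 5. */
static bool below_five(void *ctx) { return *(int *)ctx < 5; }
static void increment(void *ctx)  { ++*(int *)ctx; }

int count_to_five(void)
{
    int n = 0;

    while_loop(below_five, increment, &n);
    return n;
}
```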
-- 
	-Colin (uunet!microsof!w-colinp)

peter@ficc.uu.net (Peter da Silva) (01/07/89)

Three comments: First,

	a = for(s1;e;s2)

	Is this legal? If e fails, is the value s1?

also,

	[BCPL resultis...]

	This is equivalent to your 'break(n)', above. I like this very
	much. I didn't do anything like 'resultis', and had break return
	the value of the preceding expression.

and finally,

	For expressions like if-without-else, the value if no statement
	was executed was zero, NULL, whatever. This was in analogy to
	short-circuit AND (which wasn't in the language).

	That is, "a = if(b) then 2" was equivalent to "a = b && 2".

	This also obviated the need for flow analysis, and let me just use
	the value in BC as the result of a statement :->.
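
In standard C terms, the rule described above amounts to supplying zero as
the missing else-value. A sketch, with made-up function names. The
short-circuit AND analogy holds where AND yields its right operand; C's
own && yields only 0 or 1, so it is shown for contrast.

```c
/* Value of the hypothetical  if (b) then 2  with no else branch:
   2 when b is true, and the default of zero otherwise. */
int if_without_else(int b)
{
    return b ? 2 : 0;
}

/* For contrast: C's && normalizes its result to 0 or 1. */
int c_logical_and(int b)
{
    return b && 2;
}
```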

...

OK. I think the name "E" is better than "EC". (I called mine "Small-P :->").

How about making break(n) work in any block? Hmmm. No. Hmmm. Maybe resultis
would be better. Simplify it, make it 'result'.

I think the value of a statement with varying types is soluble by looking
at the coercion needed to use it in context. If function prototypes are
mandated, the only problem is printf(). And I think the promotion rules
will work here.

anw@nott-cs.UUCP (01/09/89)

In article <11359@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer)
writes:
	[a thoughtful and interesting article about how to extend C into a
	 full expression language.]

>[...].  On the other hand, a language with expression/statement equivalence
>probably shouldn't be based on C in the first place, since several constructs
>become redundant (braces vs parens, semicolon vs the comma operator, if-then
>vs the ternary operator).

	On the third hand [:-),*], this redundancy can be very helpful to
humans.  Eg, if parens/braces/brackets were everywhere interchangeable, then
a measure of "elegant variation" makes multi-bracketed expressions more
readable and may help the compiler to produce better error messages (by
localising bracket mis-matches).  Similarly, having both "if-then" and "?:"
gives the human some syntactic sugar with which to clarify meaning.

-------
* On a clock, the third hand is presumably the second hand.

-- 
Andy Walker, Maths Dept., Nott'm Univ., UK.
anw@maths.nott.ac.uk

karl@haddock.ima.isc.com (Karl Heuer) (01/12/89)

In article <2659@ficc.uu.net> you write:
>Three comments: First,
>
>	a = for(s1;e;s2)
>
>	Is this legal? If e fails, is the value s1?

I hadn't thought about that, but it's a reasonable definition -- provided the
type of s1 matches the type of the body of the for statement.  It does require
saving the value of s1 before executing e, which is somewhat counterintuitive.

>	For expressions like if-without-else, the value if no statement
>	was executed was zero, NULL, whatever. This was in analogy to
>	short-circuit AND (which wasn't in the language).

This would be self-consistent, but I'm not sure how useful it would be.  In a
while-statement, for example, if the value at the bottom of the loop is some
useful value, the default (zero-execution) value would probably be some sort
of error indication.  Zero is not always an out-of-band value.

On the other hand, a void value is even less useful, and the existence of else
(for if-statements), default (for switch-statements), and the above trick of
using `for' instead of `while' means that the user has complete control over
the default value in any case.  Given this, it seems acceptable to use zero as
the `default default', when no explicit default is specified.  It's also
within the spirit of C, I think.

>I think the value of a statement with varying types is soluble by looking
>at the coercion needed to use it in context.

Context outside of the expression itself?  I think that's dangerous ground.
Can you give an example of how one might use this?

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

peter@ficc.uu.net (Peter da Silva) (01/14/89)

  In article <2659@ficc.uu.net> I write:
> >Three comments: First,

> >	a = for(s1;e;s2)

> >	Is this legal? If e fails, is the value s1?

In article <11398@haddock.ima.isc.com>, karl@haddock.ima.isc.com (Karl Heuer) writes:
> I hadn't thought about that, but it's a reasonable definition -- provided the
> type of s1 matches the type of the body of the for statement.

I don't see why. After all, if you have an expression in current 'C' like:

	a = b ? c : d;

The types of c and d don't have to match. Why add an additional constraint
in the extended language?

> It does require
> saving the value of s1 before executing e, which is somewhat counterintuitive.

True. This might be a bit of a problem. I wonder if there's any way around
that...

> >I think the value of a statement with varying types is soluble by looking
> >at the coercion needed to use it in context.

> Context outside of the expression itself?  I think that's dangerous ground.

That's the way it currently works for ?:.

> Can you give an example of how one might use this?

int a;

	a = switch(nextchar()) {
		case '0': case '1': case '2': case '3':
		case '4': case '5': case '6': case '7':
			result 'O';
		case '8': case '9':
			result 'D';
		case 'A': case 'B': case 'C':
		case 'D': case 'E': case 'F':
			result 'X';
		default:
			result -1;
	}
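
In standard C the closest equivalent wraps the switch in a function and
lets `return' do the job of `result'. A sketch, with the function name and
its parameter invented here (the example above reads its input from
nextchar() instead):

```c
/* Classify a digit character: 'O' for octal digits, 'D' for the
   remaining decimal digits, 'X' for upper-case hex letters,
   and -1 for anything else. */
int digit_class(int ch)
{
    switch (ch) {
    case '0': case '1': case '2': case '3':
    case '4': case '5': case '6': case '7':
        return 'O';
    case '8': case '9':
        return 'D';
    case 'A': case 'B': case 'C':
    case 'D': case 'E': case 'F':
        return 'X';
    default:
        return -1;
    }
}
```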

chris@mimsy.UUCP (Chris Torek) (01/15/89)

[re type-matching in multipart expression values]
>>Context outside of the expression itself?  I think that's dangerous ground.

In article <2734@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>That's the way it currently works for ?: [in C].

No: in `a ? b : c', the type of the result is based only on the
types of `b' and `c', much as though b and c were to be added.
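
A short sketch that makes this visible, with made-up function names. The
operands are brought to a common type first, and sizeof reports that
type's size whatever the result is later assigned to.

```c
#include <stddef.h>

/* The operands 10.0 (double) and 'a' (int) are brought to a common
   type, double; the target of any later assignment plays no part. */
size_t conditional_result_size(void)
{
    return sizeof(1 ? 10.0 : 'a');
}

/* Assigning to an int converts the double result afterwards. */
int pick(int flag)
{
    int a = flag ? 10.0 : 'a';   /* 'a' goes int -> double -> int */

    return a;
}
```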
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

peter@ficc.uu.net (Peter da Silva) (01/16/89)

In article <15472@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> [re type-matching in multipart expression values]
> >>Context outside of the expression itself?  I think that's dangerous ground.

> In article <2734@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
> >That's the way it currently works for ?: [in C].

> No: in `a ? b : c', the type of the result is based only on the
> types of `b' and `c', much as though b and c were to be added.

OK, the bottom line for expression-C remains the same... there is no
reason you shouldn't be able to say:

	a = if (this) then 10.0 else 'a';

...and have all the coercions work the same way they do in real-C for:

	a = (this) ? 10.0 : 'a';

Personally I wish that this sort of implicit type coercion in 'C'
were deferred as late as possible, to take advantage of surrounding
context. That way you wouldn't be burned by assuming this makes sense:

	long_var = (unsigned short)65535 + (signed short)32767;

This calculation seems, to the 'C' novice, like it should equal 98302,
but with 16-bit ints it actually ends up 32766.
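
The arithmetic can be reproduced on any machine with the fixed-width types
from <stdint.h> (a later addition to C, used here only for illustration).
This sketch assumes the 16-bit int implementation that those figures
imply; the function names are made up.

```c
#include <stdint.h>

/* Emulates the addition on an implementation with 16-bit int, where
   both operands end up as 16-bit unsigned values and the sum wraps
   modulo 65536 before it is widened for the assignment. */
long sum_with_16bit_int(uint16_t a, int16_t b)
{
    uint16_t wrapped = (uint16_t)((uint32_t)a + (uint16_t)b);

    return (long)wrapped;
}

/* What the novice expects: widen first, then add. */
long sum_widened_first(uint16_t a, int16_t b)
{
    return (long)a + (long)b;
}
```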

karl@haddock.ima.isc.com (Karl Heuer) (01/17/89)

>>>	a = for(s1;e;s2)
>>>	Is this legal? If e fails, is the value s1?
>
>>[reasonable, but s1 and the for-body would have to have matching types.]
>
>I don't see why.

I meant that they have to be compatible in the same sense as the ?: operands,
i.e. it must be possible to bring them to a common type.  It would not be
legal to write
	a = for(int_expr; e; s2) string_expr;

>>>I think the value of a statement with varying types is soluble by looking
>>>at the coercion needed to use it in context.
>
>>Context outside of the expression itself?  I think that's dangerous ground.
>
>That's the way it currently works for ?:.

We must be talking about different things.  I agree that it is possible to
assign a type to any statement whose component types are sufficiently
compatible, in the same way that ?: does%.  I believe this is what I said in
my first posting on the subject, in fact.  In the case of incompatible types,
such as the example above, I think we'd be treading on dangerous ground if we
tried to resolve it to anything more meaningful than a void expression.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
________
% As Chris notes, this does not involve any context beyond the expression
being typed.  In particular, on a 16-bit implementation,
	ulongvar = ushortvar * ushortvar
does a 16-bit multiply, not a 32-bit multiply; this is a classic gotcha.
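
A sketch of that gotcha using the fixed-width types from <stdint.h>, so
that it behaves the same on machines with wider ints. The function names
are hypothetical, and the truncation is written out explicitly to emulate
the 16-bit implementation.

```c
#include <stdint.h>

/* Emulates ushortvar * ushortvar on a 16-bit implementation: only the
   low 16 bits of the product survive, whatever the type of the target. */
uint32_t product_16bit(uint16_t a, uint16_t b)
{
    uint32_t full = (uint32_t)a * (uint32_t)b;

    return (uint32_t)(uint16_t)full;
}

/* The usual fix: widen one operand before multiplying. */
uint32_t product_widened(uint16_t a, uint16_t b)
{
    return (uint32_t)a * b;
}
```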