[comp.lang.misc] Third public review of X3J11 C

cik@l.cc.purdue.edu (Herman Rubin) (09/01/88)

In article <8660@ihlpb.ATT.COM>, nevin1@ihlpb.ATT.COM (Liber) writes:
> In article <897@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
> |In article <4203@adobe.COM>, burgett@steel.COM (Michael Burgett) writes:

 
> |That a badly designed language was rejected is irrelevant.

> But the part that isn't irrelevant is *why* did PL/1 turn out, in your
> words, to be a badly designed language?  We don't want to go around
> repeating the mistakes of the past if we don't have to.

A language should be easy to read and as easy to write as possible.  The
kludge made in PL/1 to let the user get at the properties of the machine
was to fall back on the common assembler notation, which, while precise,
is difficult to read and write.

HLLs such as C make heavy use of overloaded operators and infix notation for 
operators.  There are only a few assemblers which use infix notation, and I
know of none which use overloaded operators and weak typing.  In addition,
HLLs allow multiple operations in a single statement, array handling, and
similar goodies.

The makers of PL/1, when they came to letting the user at the low-level
procedures, required the clumsy assembler notation or even worse.  I believe
that a flexible HLL which comes close to accomplishing what both C and
FORTRAN accomplish, and a lot more, can be produced.  It might be necessary
to require explicit operator precedence instead of implicit, at least in
some cases (it has been said that precedence is one of the biggest problems
in writing a compiler; APL has dropped it completely), and possibly to
remove some of the implicit constructs introduced in some of the languages.
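The case for explicit precedence can be made inside C itself, where the
relative precedence of & and == is a standing trap; a sketch (the helper
names below are mine, not from the thread):

```c
#include <assert.h>

/* C gives == higher precedence than &, so "flags & mask == mask"
   parses as "flags & (mask == mask)" -- almost never what was meant. */
int has_mask_implicit(unsigned flags, unsigned mask)
{
    return flags & mask == mask;    /* parses as flags & (mask == mask) */
}

/* Writing the precedence explicitly expresses the actual intent. */
int has_mask_explicit(unsigned flags, unsigned mask)
{
    return (flags & mask) == mask;
}
```

With flags = 0x6 and mask = 0x2, the explicit form yields 1 while the
implicit form yields 0x6 & 1 = 0, which is the kind of silent surprise
that makes people ask for explicit precedence.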

If there is a movement to produce a flexible HLL, I would be willing to
participate.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette, IN 47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)

ok@quintus.uucp (Richard A. O'Keefe) (09/02/88)

In article <908@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>In article <8660@ihlpb.ATT.COM>, nevin1@ihlpb.ATT.COM (Liber) writes:
>> But the part that isn't irrelevant is *why* did PL/1 turn out, in your
>> words, to be a badly designed language?  We don't want to go around
>> repeating the mistakes of the past if we don't have to.
>
>A language should be easy to read and as easy to write as possible.  The
>kludges made in PL/1 to allow the use of the properties of the machine were
>to use the common assembler notation, which while it is precise, is difficult
>to read and write.
>
I used to use PL/I (yes, that's an "I" not a "1"), and I'm afraid I don't
quite know what Herman Rubin is getting at here.  PL/I syntax, for those
who are fortunate enough not to know it, is full of things like
	PUT FILE (OUTFILE) EDIT (THIS,THAT,THE_OTHER) (A(10)) PAGE;
Roughly,
	<main keyword> {<sub keyword> [(<argument list>)]}... ;
For another example,
	CALL PROCEDURE(ARG1, ..., ARGN);

For an example of arithmetic operations, to add two MxN arrays of
floating point numbers:
	DECLARE (A,B) BINARY FLOAT DIMENSION (1:M, 1:N);
	A = A+B;

I have no desire to praise PL/I, but I honestly don't see any resemblance
to any of the assembly languages I've ever used.

As for infix notation, I wish someone would come up with a standard
notation for sequence concatenation: I've seen "+" (which mathematical
convention reserves for commutative operations), "&" (which looks like
"and"), "*" (which makes the most sense, but is rare), and the theory
papers tend to use a sign which is a bit like ^ and a bit like the
intersection sign, and needless to say isn't in the ISO 8859/1 character set.
In the absence of an agreed notation for such a fundamental operation,
the use of functional notation has a lot to commend it.
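C is a case in point for the functional route: it has no infix operator
for concatenation at all.  A sketch of the operation in functional
notation (the name concat is mine; the standard library spells the
in-place form strcat):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sequence concatenation as a plain function call:
   concat("ab", "cd") yields a freshly allocated "abcd".
   The caller is responsible for free()ing the result. */
char *concat(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    char *r = malloc(la + lb + 1);
    if (r == NULL)
        return NULL;
    memcpy(r, a, la);
    memcpy(r + la, b, lb + 1);   /* the +1 copies the terminating NUL */
    return r;
}
```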

cik@l.cc.purdue.edu (Herman Rubin) (09/02/88)

In article <340@quintus.UUCP>, ok@quintus.uucp (Richard A. O'Keefe) writes:
> In article <908@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
> >In article <8660@ihlpb.ATT.COM>, nevin1@ihlpb.ATT.COM (Liber) writes:
> >> But the part that isn't irrelevant is *why* did PL/1 turn out, in your
> >> words, to be a badly designed language?  We don't want to go around
> >> repeating the mistakes of the past if we don't have to.
> >
> >A language should be easy to read and as easy to write as possible.  The
> >kludges made in PL/1 to allow the use of the properties of the machine were
> >to use the common assembler notation, which while it is precise, is difficult
> >to read and write.
> >
> I used to use PL/I (yes, that's an "I" not a "1"), and I'm afraid I don't
> quite know what Herman Rubin is getting at here.

It is not surprising that we do not always notice the same things.  The first
thing I notice about a computer is what its instructions can do, and how
convenient it is to juggle the quantities (to me a quantity is anything, not
just a number) around.  When I see a language, the corresponding question is
whether the designers have provided a way for me to use the hardware
instructions, and how easy and convenient it is for me to do that.

Consequently, the first thing to strike me about PL/I is that any use of
hardware instructions which the gurus, in their infinite wisdom, did not
consider worthy of inclusion in the language is forced into the same clumsy
form that discourages the use of assembler language in the first place.
That language designers do not consider this problem important is, I
believe, the major obstacle to producing efficient semi-portable software.

>						 PL/I syntax, for those
> who are fortunate enough not to know it, is full of things like
> 	PUT FILE (OUTFILE) EDIT (THIS,THAT,THE_OTHER) (A(10)) PAGE;
> Roughly,
> 	<main keyword> {<sub keyword> [(<argument list>)]}... ;
> For another example,
> 	CALL PROCEDURE(ARG1, ..., ARGN);

This type of syntax can easily be expressed through calls in just about any
HLL (C has gotten rid of the word CALL), except that the <sub keyword> would
have to be handled in some other way.  I do not find that notation much
different from the usual.

> For an example of arithmetic operations, to add two MxN arrays of
> floating point numbers:
> 	DECLARE (A,B) BINARY FLOAT DIMENSION (1:M, 1:N);

If C had true multidimensional arrays, and did not require indices to start
at 0 (PASCAL and recent versions of FORTRAN permit arbitrary lower bounds),
this would be nothing new to C.  The wording would be slightly different.

> 	A = A+B;

Overloaded operators to the front!

> I have no desire to praise PL/I, but I honestly don't see any resemblance
> to any of the assembly languages I've ever used.
 
I repeat: what about the multitudinous operations which the gurus have not
included?

> As for infix notation, I wish someone would come up with a standard
> notation for sequence concatenation: I've seen "+" (which mathematical
> convention reserves for commutative operations), "&" (which looks like
> "and"), "*" (which makes the most sense, but is rare), and the theory
> papers tend to use a sign which is a bit like ^ and a bit like the
> intersection sign, and needless to say isn't in the ISO 8859/1 character set.

The only assembly language in which I have used this operation used ^ for
it.  I knew of no use for ^ other than concatenation and power (or
superscript) until the C gurus came up with the idea of saying, "Nobody
uses this, so we will use it for exclusive or."  This is similar to the use
of \ for an escape character.  Language designers should have more respect
for even moderately established notation in mathematics and computational
theory.
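The clash complained about here is easy to demonstrate in C, where ^ is
exclusive-or; a small sketch (the function name is mine):

```c
#include <assert.h>

/* In mathematics "2^10" reads as 1024; in C, ^ is bitwise exclusive-or,
   so 2 ^ 10 is 0b0010 XOR 0b1010 = 0b1000 = 8.  Exponentiation in C is
   spelled pow(), a library function, not an operator. */
int xor_not_power(void)
{
    return 2 ^ 10;   /* 8, not 1024 */
}
```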

> In the absence of an agreed notation for such a fundamental operation,
> the use of functional notation has a lot to commend it.

This is a major bone of contention and interpretation.  I consider that
a binary operator function is normally written with an infix symbol.  It
may be necessary to invent a corresponding symbol, which need not be a
single character, for the purposes of a program if one is not available.
I see the problem with assembler language as being that it uses functional
notation rather than overloaded operators.  What do you think the reaction
of HLL users would be if you required them to write addint(x,y) for x+y
when they wanted to add the integers x and y?

As an example of how a language can and should be extended if the need arises,
I have a function (which I will not describe here, nor will I give the
algorithm), which requires the following:

long long integers (at least two more bits than the exponent in a double)
Boolean operations and additions on such
packing and unpacking of doubles into (exponent, mantissa), the mantissa
	being a long long integer

in addition to the usual operations.  I maintain that any acceptable language
should allow such additions, including the temporary addition of infix
functions, if necessary.
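For what it is worth, modern C (long after this thread) acquired the
pieces asked for here: long long integers, and frexp/ldexp for splitting a
double into an exponent and a mantissa.  A sketch of one possible
packing/unpacking, not Rubin's own algorithm (the function names are mine):

```c
#include <assert.h>
#include <math.h>

/* Unpack a double into (exponent, integer mantissa) such that
   x == mantissa * 2^exponent.  frexp yields a fraction in [0.5, 1);
   scaling by 2^53 turns it into a 53-bit integer mantissa. */
void unpack(double x, int *exponent, long long *mantissa)
{
    int e;
    double frac = frexp(x, &e);             /* x == frac * 2^e */
    *mantissa = (long long)ldexp(frac, 53); /* 53-bit integer mantissa */
    *exponent = e - 53;                     /* x == *mantissa * 2^*exponent */
}

/* Repack: the inverse operation. */
double pack(int exponent, long long mantissa)
{
    return ldexp((double)mantissa, exponent);
}
```

Boolean operations and additions on the long long mantissa then come for
free from the ordinary integer operators.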

One point concerning the above.  Standard mathematical operations can produce
more than one result.  A simple example is division, with quotient and
remainder.  Another example is that one frequently wants both the sine and
cosine; it is only about 30% more expensive to produce both simultaneously
than to produce one alone.  I can give other examples.  Thus one should be
able to put a list of results before the = sign in a replacement statement.
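The quotient-and-remainder case at least has a standard C answer: div()
returns both results in one struct.  A minimal sketch (the wrapper name is
mine):

```c
#include <assert.h>
#include <stdlib.h>

/* div() returns a div_t struct carrying both results of one division:
   the quotient in .quot and the remainder in .rem.  A compiler is free
   to compute both with a single machine divide instruction. */
div_t split(int dividend, int divisor)
{
    return div(dividend, divisor);
}
```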
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette, IN 47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)

ark@alice.UUCP (Andrew Koenig) (09/02/88)

In article <908@l.cc.purdue.edu>, cik@l.cc.purdue.edu (Herman Rubin) writes:
 
> A language should be easy to read and as easy to write as possible.  The
> kludges made in PL/1 to allow the use of the properties of the machine were
> to use the common assembler notation, which while it is precise, is difficult
> to read and write.

Would you mind explaining this a little more?

I don't understand what you're trying to say.
-- 
				--Andrew Koenig
				  ark@europa.att.com

ok@quintus.uucp (Richard A. O'Keefe) (09/03/88)

In article <910@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>It is not surprising that we do not always notice the same things.  The first
>thing I notice about a computer is what its instructions can do, and how
>convenient it is to juggle the quantities (to me a quantity is anything, not
>just a number) around.  When I see a language, the corresponding question is
>whether the designers have provided a way for me to use the hardware
>instructions, and how easy and convenient it is for me to do that.
>
>Consequently, the first thing to strike me about PL/I is that the insertion of
>the hardware instructions the gurus did not consider in their infinite wisdom
>worthy of including in the language is forced to be in the same clumsy form
>which discourages the use of assembler language.  That the language designers
>do not consider this problem important is, I believe, the major problem in
>producing efficient semi-portable software.

I'm sorry, I still don't understand.  PL/I doesn't include ANY hardware
instructions in the language (PL/I runs on 370s, B6700s, and Multics,
which don't share a word-length, still less an instruction, plus quite
a few other machines).

It is worth distinguishing between a language and its implementation.
The PL/I *language* permits the declaration of integers with any
(fixed) number of bits via
	DECLARE MY_INT BINARY FIXED PRECISION(NBR_OF_BITS);
So, if you need 70-bit integers, do
	DECLARE (I,J,K) FIXED(69);	/* + 1 sign bit */
A particular PL/I *implementation* may restrict this, in which case
blame the *implementation*, not the *language*.  The PL/I *language*
also provides the BIT(N) family of data types, and operations on them,
and conversion between FIXED(N) and BIT(N) is straightforward.
So the operations Herman Rubin requests:

>long long integers (at least two more bits than the exponent in a double)
>Boolean operations and additions on such

are already supported by the PL/I *language* (though not, perhaps, in a
particular implementation).  The other operations:

>packing and unpacking of doubles into (exponent, mantissa), the mantissa
>	being a long long integer

are not directly supported (converting a FLOAT to BIT(N) is done by first
converting the FLOAT to FIXED), but the UNSPEC function can be used for
that.  For example, suppose you have a 64-bit IEEE float declared as
	DECLARE IEEE64 BINARY FLOAT PRECISION(52);
and want to get at the sign, exponent, and significand.
	DECLARE S BIT(1), E BIT(11), M BIT(52), HACK BIT(64);

	HACK = BIT(IEEE64);		/* get at the raw bits */
	S = SUBSTR(HACK, 1, 1);		/* sign */
	E = SUBSTR(HACK, 2, 11);	/* biased exponent */
	M = SUBSTR(HACK, 12, 52);	/* significand */

I repeat, a particular implementation might not allow this (shame), and
if it is allowed, it may be woefully inefficient, but the *LANGUAGE*
permits pretty direct expression of each of Rubin's examples.
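For comparison, the same field extraction can be sketched in modern C,
assuming an IEEE 754 64-bit double; memcpy into an integer is the portable
way to "get at the raw bits" (the function name is mine):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Split an IEEE 754 double into sign (1 bit), biased exponent (11 bits),
   and significand (52 bits), mirroring the PL/I SUBSTR example above. */
void ieee_fields(double x, unsigned *sign, unsigned *exponent,
                 uint64_t *significand)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);         /* get at the raw bits */
    *sign        = (unsigned)(bits >> 63);
    *exponent    = (unsigned)((bits >> 52) & 0x7FF);
    *significand = bits & ((UINT64_C(1) << 52) - 1);
}
```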

A brief look through an MC68k manual suggests to me that the only 68k
instructions which couldn't be accessed quite directly from a smart
PL/I compiler are
	exchange, swap register halves, test and set, rotates,
and	simultaneous quotient and remainder from a divide.
I haven't got an up-to-date PL/I manual, so perhaps it has got a
ROTATE() function these days.  As for simultaneous quotient and
remainder, a peephole optimiser which can't convert
	Q = DEND / DOR;
	R = MOD(DEND, DOR);
into the appropriate instruction just isn't trying.  (I have to admit
that the NS32k has two sets of integer divide instructions, and I do
not see how to get at the other set.  If PL/I were extended with the
Common Lisp division functions...)

I repeat from my previous posting that I have no desire to defend PL/I;
it is an *awful* language.  But it is less vulnerable to the charge of
building in a limited set of instructions than most.  (It can even get
at the VAX EPOLY instruction, though the language predates the VAX.)

>What do you think the reaction of HLL users would be if you required them
>to write addint(x,y) for x+y if one wanted to add the integers x and y?

Lisp users have written (PLUS X Y) for years, and you're talking to
someone whose preferred notation is plus(Addend, Augend, Sum) (:-).
_My_ definition of a HLL is one where I don't have to worry about storage
management.

>One point concerning the above.  Standard mathematical operations can produce
>more than one result.  A simple example is division, with quotient and
>remainder.  Another example is that one frequently wants both the sine and
>cosine; it is only about 30% more expensive to produce both of them
>simultaneously than to produce one.  I can give other examples.

Have you met POP-2?  (Or its current incarnation, Poplog.)  The Pop view
is that an infix operator is just syntactic sugar for a function call, so
	x+y;	x,y,+;	+(x,y);	x, y.+;
are merely notational variants.  You can declare your own operators.
It has a function which yields quotient and remainder; I can never
remember which result is which, but let's say it's
	dividend // divisor -> quotient -> remainder;
Mesa will let you do something like
	[quo,rem] := divide[dend,dor];
*Real* high-level-languages (that is, languages which, unlike C, Pascal,
Fortran, Modula-2, &c, are not very thinly disguised assembler) let you
do things like
	let (quo,rem) = div(dend,dor) in
	    <something using quo and rem>

Why functions with multiple results haven't caught on in programming
languages generally I don't know (unless it is the idea that function
results have something to do with registers: it's not that long since
C acquired the ability to return records).

db@lfcs.ed.ac.uk (Dave Berry) (09/10/88)

In article <340@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>As for infix notation, I wish someone would come up with a standard
>notation for sequence concatenation: I've seen "+" (which mathematical
>convention reserves for commutative operations), "&" (which looks like
>"and"), "*" (which makes the most sense, but is rare), and the theory
>papers tend to use a sign which is a bit like ^ and a bit like the
>intersection sign, and needless to say isn't in the ISO 8859/1 character set.
>In the absence of an agreed notation for such a fundamental operation,
>the use of functional notation has a lot to commend it.

I've seen "+" used for non-commutative operations, e.g. the definition
of Standard ML uses it for environment modification.  ML itself uses
"^" for strings and "@" for lists.  "@" sort of stands for "@ppend".

I've seen concatenation used too (in Icon?).  Of course, this clashes
with multiplication and function application.

But easily the best I've seen is Apple BASIC.  What could be clearer
than "a$(len(a$)+1) = b$" ?      :-O

 Dave Berry.		   A detailed economic analysis has revealed that
 db@lfcs.ed.ac.uk	   Britain's economy is changing from a manufacturing
 			   economy to an ass-licking economy.

ok@quintus.uucp (Richard A. O'Keefe) (09/12/88)

In article <769@etive.ed.ac.uk> db@lfcs.ed.ac.uk (Dave Berry) writes:
>In article <340@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>>As for infix notation, I wish someone would come up with a standard
>>notation for sequence concatenation: I've seen "+" (which mathematical
							    ============
>>convention reserves for commutative operations) ....

>I've seen "+" used for non-commutative operations, e.g. the definition
>of Standard ML uses it for environment modification.

ML is not mathematics.  It would be *so* sensible to use a "product"
symbol "x" or even "*" for string concatenation, because it is
associative with an identity, and exponentiation is exactly the
right operation for iterated concatenation ("ab"**3 = "ababab").

>I've seen concatenation used too (in Icon?).

Not Icon.  Icon uses "||", just like PL/I.