[comp.lang.misc] Multi-compilers

cik@l.cc.purdue.edu (Herman Rubin) (09/18/90)

In article <9009110403.AA03158@csd4.csd.uwm.edu>, markh@csd4.csd.uwm.edu (Mark William Hopkins) writes:
>
>   Recently, an interesting idea has come to mind for a new kind of compiler:
>a Multi-Compiler.  What makes it different from your typical compiler is that
>it accepts code from more than one source language: many source languages in
>fact.
> 
>   However, it's an idea that is easier said than conceived.  What would it
>look like?  The whole issue seems to revolve around this concept (which I
>borrow from linguistics) of 'code-switching'.
>
>   Code-switching is where a multi-lingual speaker switches from one language
>to another, often in mid-sentence.  For instance, while waiting for a
>departure from an airport in Budapest, I got in a conversation with an East
>German traveller.  However, my German was weak, his English was non-existent,
>and our Hungarian was not very strong.  So we found it necessary to literally
>sprinkle our conversations with almost random switching between German and
>Hungarian.  Each language offered something which compensated for something
>lacking in (our knowledge of) the other.

But you were not using specialized languages.  Any one of the languages would
have been quite adequate for the task.  It was knowledge in one language which
compensated for lack of knowledge of the other.  If the two of you both knew
Swahili, you would undoubtedly have used that for conversation.  This is not
the problem in computer languages.

>    A good programmer will also face the same kind of dilemma.  Different
>languages are designed to do different things better.  An extreme example is
>the case of writing a truly practical AI control program which would ideally
>handle all the intelligent rule-based tasks in Prolog, and all the
>event-driven tasks in assembly and C, and maybe even the recognition and
>learning tasks in the assembly of a special purpose neural net chip.

I disagree.  Apart from machine language, there is no language designed to
use the power of the computer.  Mathematicians, physicists, chemists, etc.,
have specialized vocabularies added to their "normal" languages, and would
not think of writing an article exclusively in their jargon.  Also, they
have had to interact with other jargons, and there is no attempt to preclude
constructs, as the present computer languages do.

Even Basic English, an attempt to produce an easily learned limited version 
of English, was not really that limited.  The main limitation was to replace
verbs and tenses by compound expressions using participles and auxiliaries.
But nothing was really left out.  Algol failed in its goal as an algorithmic
language by omitting even well-known algorithmic ideas, the most notable being
the existence of a list of outputs from a replacement statement.  This has
been discussed somewhat; a list is not a struct, or a class in an object-
oriented language, but a temporary labeling.

We really need the approach taken by the users of natural language, namely,
if it isn't there, put it in!  If it is clumsy, do it differently, although
this can be very hard to achieve.  Why can we not have an overloaded, weakly
typed, "natural" notation assembler, with a macro processor which can expand
things in the user's fashion?

>   The question, naturally, is: when are you allowed to code-switch?
>Depending on how you answer this, you either got a closely integrated set of
>*distinct* compilers (like the Quick series marketed by MicroSoft), or a
>truly integrated programmer's utility.

Only the latter will do.  Having part of the program produced by one 
language and part by another requires combining them by someone knowing
the syntaxes of both, and even the machine representations of both.  In
some cases, this can be done by a super-linker, but not by a linker.  But
does the super-linker even have the information?  We would have to augment
the symbol table by having the parameter structure for each call listed,
and also for each return.  The code would likely be clumsy and slow.

The language can be produced, and presumably compilers can be written.
But it must take the approach that everything is possible, and that 
utility is more important than esthetics.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{ima | spdcc | world}!esegue.  Meta-mail to compilers-request@esegue.

wendt@ives.cs.colostate.edu (alan l wendt) (09/19/90)

In article <2576@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:

=The language can be produced, and presumably compilers can be written.
=But it must take the approach that everything is possible, and that 
=utility is more important than esthetics.

Sounds like PL/I to me.

Alan Wendt

chip@tct.uucp (Chip Salzenberg) (09/20/90)

According to cik@l.cc.purdue.edu (Herman Rubin):
>Why can we not have an overloaded, weakly typed, "natural" notation
>assembler, with a macro processor which can expand things in the
>user's fashion?

For the last time (I wish), it's because:

  1.  You haven't published specifications for such a thing.
  2.  No one else wants it badly enough to write it.
  3.  Apparently, neither do you.

You're too busy to do it?  Fine.  So is everyone else.
Please put up or shut up.
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>

ath@prosys.se (Anders Thulin) (09/21/90)

According to cik@l.cc.purdue.edu (Herman Rubin):
>Why can we not have an overloaded, weakly typed, "natural" notation
>assembler, with a macro processor which can expand things in the
>user's fashion?

Sounds oddly like BLISS ...


-- 
Anders Thulin       ath@prosys.se   {uunet,mcsun}!sunic!prosys!ath
Telesoft Europe AB, Teknikringen 2B, S-583 30 Linkoping, Sweden

cik@l.cc.purdue.edu (Herman Rubin) (09/21/90)

In article <9206@ccncsu.ColoState.EDU>, wendt@ives.cs.colostate.edu (alan l wendt) writes:
> In article <2576@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
> 
> =The language can be produced, and presumably compilers can be written.
> =But it must take the approach that everything is possible, and that 
> =utility is more important than esthetics.
> 
> Sounds like PL/I to me.

The problem with PL/I and assemblers is the poor notation.  This even
occurs with the present languages, to a greater extent than many would
recognize.  One of the important advantages of languages is the use of
overloaded operators and relations, and convenient use of standard
mathematical notation.

Why is there an absolute value function rather than an absolute value
operator?  Only because when Fortran was written, they were limited to
a 48 character set (IBM punched cards, at the time).  This is also why
** was used for exponentiation, and even * for multiplication.  I have
used a dialect of Fortran which overloaded some of the "standard" functions,
so that absolute value, logarithm, mod, etc., did not have to have any modifier
indicating the type(s) of the arguments.  The compiler did the appropriate
analysis.
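
Purely as an illustration of that analysis (in a Scheme-like notation, with
made-up routine names rather than the actual Fortran specific functions),
the resolution is little more than a lookup on the generic name and the
argument type:

    ;; Illustrative only: a table from a generic name and argument type to a
    ;; specific routine, roughly what a compiler's generic resolution does.
    (define generic-table
      '(((abs integer) . int-abs)
        ((abs real)    . real-abs)
        ((log real)    . real-log)
        ((log double)  . double-log)))

    (define (resolve name arg-type)
      (cond ((assoc (list name arg-type) generic-table) => cdr)
            (else (error "no specific routine for" name arg-type))))

    ;; (resolve 'abs 'integer) => int-abs
    ;; (resolve 'log 'double)  => double-log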

A language is a set of macros to be interpreted by the compiler into machine
procedures.  There is no good reason why the user should not be allowed to
add macros in convenient notation to this list, nor why many of the restric-
tions on the current macros need be there.  If the machine can do something,
the user should be able to invent a notation to do it, if necessary.  For
example, I find the general vector operation on the CYBER 205, with its dozen
optional fields, easy to explain in a notation easy to read and write.
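
For instance, a user-added macro for such an operation might look something
like this (a Scheme-like sketch with invented names and fields -- not the
actual CYBER 205 operand list):

    ;; Illustrative only: a readable user-defined notation for a hypothetical
    ;; vector-add operation.  vadd, the field keywords, and emit are invented.
    (define (emit . parts)              ; stand-in for the code generator
      (display parts) (newline))

    (define-syntax vadd
      (syntax-rules (length stride)
        ((_ dst src1 src2 (length n) (stride s))
         (emit 'vector-add 'dst 'src1 'src2 n s))
        ((_ dst src1 src2 (length n))           ; stride defaults to 1
         (vadd dst src1 src2 (length n) (stride 1)))))

    ;; (vadd z x y (length 100))
    ;; expands, in effect, to (emit 'vector-add 'z 'x 'y 100 1)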
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)

norman@d.cs.okstate.edu (Norman Graham) (09/22/90)

From article <2581@l.cc.purdue.edu>, by cik@l.cc.purdue.edu (Herman Rubin):
> A language is a set of macros to be interpreted by the compiler into machine
> procedures.  There is no good reason why the user should not be allowed to
> add macros in convenient notation to this list, nor why many of the restric-
> tions on the current macros need be there.

This view explains why you're constantly at odds with the rest of 
the netnews community. 

First, as anyone with a basic knowledge of languages can tell you,
a language is a set of strings (of finite but arbitrary length)
over some alphabet. In addition, computer languages are notations
for expressing computations; they are not notations for expressing
the machine instructions you want a compiler to generate.

As a general rule, a computer language exists apart from any particular
compiler or computer architecture, although issues of compilation
and computation model may influence the design of a language.
(e.g., the design of Pascal's syntax makes it easy to parse
using a particular method. Also, the constructs offered by Pascal
reflect the choice of the von Neumann machine as the underlying
computational model.) The point is that a program has a meaning 
that is independent of any particular sequence of machine instructions.

Second, it is simply naive to describe what a compiler does as
macro expansion. Otherwise, you would have had your super-macro
super-assembler decades ago and we wouldn't have bothered to 
expend the great effort to design so many different languages 
and compilers.

> If the machine can do something,
> the user should be able to invent a notation to do it, if necessary.  For
> example, I find the general vector operation on the CYBER 205, with its dozen
> optional fields, easy to explain in a notation easy to read and write.

And in a previous post, cik@l.cc.purdue.edu (Herman Rubin) writes:
> [...] Apart from machine language, there is no language designed to
> use the power of the computer.  

This is written as if you believe the 'power of the computer' resides
in its instruction set. I disagree with this idea. In fact, the whole
RISC movement is founded on the idea that to build a faster computer 
you must _remove_ instructions from the instruction set. Likewise,
adding an instruction to a modern computer will not increase its
computational power.

I believe the 'power of the computer' resides in its ability to be
used as an abstraction device on which we can build higher and higher
levels of reality. Personally, I chose to live in a reality where
lambda calculus is the underlying computational model. This implies 
that not only do I give up the ability to control the machine instructions
used by my programs, I also give up any notion of the sequencing of 
those instructions (control flow) and the notion of updatable memory
locations (ie. updatable variables, machine state). But by living in
this level of reality I gain the ability to write extremely clear and
straight-forward programs that are 10, 20, and sometimes even 30 times
_shorter_ than the equivalent programs written in a typical high-level
language. 
-- 
Norman Graham   <norman@a.cs.okstate.edu>   {cbosgd,rutgers}!okstate!norman
The opinions expressed herein do not necessarily reflect the views of
the state of Oklahoma, Oklahoma State University, OSU's Department of
Computer Science, or of the writer himself.

misha@ai.mit.edu (Mike Bolotski) (09/23/90)

In article <1990Sep22.061027.9223@d.cs.okstate.edu>, norman@d.cs.okstate.edu
(Norman Graham) writes:
|> 
|> Second, it is simply naive to describe what a compiler does as
|> macro expansion. [ ... ] 

And then:

|> levels of reality. Personally, I chose to live in a reality where
|> lambda calculus is the underlying computational model. This implies 

Lambda calculus can be implemented with little more than textual substitution.
In that sense, macro expansion is almost sufficient as a computational 
model.  
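
A deliberately naive sketch of that point, in Scheme (a list representation
of terms, with no capture avoidance -- exactly the sense in which it is
"just" substitution):

    ;; Terms are a symbol, (lambda (x) body), or an application list.
    (define (subst term var val)
      (cond ((symbol? term) (if (eq? term var) val term))
            ((eq? (car term) 'lambda)
             (let ((formal (car (cadr term))) (body (caddr term)))
               (if (eq? formal var)
                   term                                   ; var is shadowed
                   (list 'lambda (list formal) (subst body var val)))))
            (else (map (lambda (t) (subst t var val)) term))))

    (define (beta-step term)            ; term = ((lambda (x) body) arg)
      (subst (caddr (car term)) (car (cadr (car term))) (cadr term)))

    ;; (beta-step '((lambda (x) (f x x)) y))  =>  (f y y)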

|> in its instruction set. I disagree with this idea. In fact, the whole
|> RISC movement is founded on the idea that to build a faster computer 
|> you must _remove_ instructions from the instruction set. Likewise,

Not at all.  Refer to the endless "What is RISC" discussion on comp.arch.
 
|> adding an instruction to a modern computer will not increase its
|> computational power.

Sure it will.  Integer multiplication is sped up considerably by 
a hardware multiplier.  Again, see the interminable 16x16=32 etc etc
discussion on comp.arch. 
 
|> locations (ie. updatable variables, machine state). But by living in
|> this level of reality I gain the ability to write extremely clear and
|> straight-forward programs that are 10, 20, and sometimes even 30 times
|> _shorter_ than the equivalent programs written in a typical high-level
|> language. 

And consequently, 10, 20, or 30 times *slower* than typical high-level
languages.  Acceptable for some applications, not acceptable for others.

---
Mike

norman@d.cs.okstate.edu (Norman Graham) (09/23/90)

From article <1990Sep22.224055@ai.mit.edu>, by misha@ai.mit.edu (Mike Bolotski):
> In article <1990Sep22.061027.9223@d.cs.okstate.edu>, norman@d.cs.okstate.edu
> (Norman Graham) writes:
> |>
> |> Second, it is simply naive to describe what a compiler does as
> |> macro expansion. [ ... ]
>
> And then:
>
> |> levels of reality. Personally, I chose to live in a reality where
> |> lambda calculus is the underlying computational model. This implies
>
> Lambda calculus can be implemented with little more than textual substitution.
> In that sense, macro expansion is almost sufficient as a computational
> model.

Yes. But no reasonable lambda calculus compiler would choose macro
expansion as its compilation technique. Thus my point stands.

> |> adding an instruction to a modern computer will not increase its
> |> computational power.
> 
> Sure it will.  Integer multiplication is sped up considerably by 
> a hardware multiplier.  Again, see the interminable 16x16=32 etc etc
> discussion on comp.arch. 

Tsk, tsk. I didn't say adding instructions/hardware wouldn't increase
the speed of a computer. I was talking about the class of problems the
computer can solve. I mean, once you have Turing completeness... [OK, so
we don't really have that; but we can pretend right?] I thought my
discussion of the computer as an abstraction machine made that clear.

> |> locations (ie. updatable variables, machine state). But by living in
> |> this level of reality I gain the ability to write extremely clear and
> |> straight-forward programs that are 10, 20, and sometimes even 30 times
> |> _shorter_ than the equivalent programs written in a typical high-level
> |> language. 
> 
> And consequently, 10, 20, or 30 times *slower* than typical high-level
> languages.  Acceptable for some applications, not acceptable for others.

You've not looked at the recent compilation technology for functional
languages--we're closing the gap. Within the year, I expect to see 
functional systems produce code that is only 2 or 3 times slower
than the code produced by the typical high-level language compiler.
[Much of that overhead will be eaten up by the garbage collector, etc.]

There are many programs that can live with that kind of performance
hit, especially given the wins on the points of readability, 
writability, and maintainability that functional languages offer.

But I agree with you: Functional languages are not the appropriate
choice for every problem--I never implied that they were.

--Norm
-- 
Norman Graham   <norman@a.cs.okstate.edu>   {cbosgd,rutgers}!okstate!norman
The opinions expressed herein do not necessarily reflect the views of
the state of Oklahoma, Oklahoma State University, OSU's Department of
Computer Science, or of the writer himself.

trier@cwlim.CWRU.EDU (Stephen C. Trier) (09/23/90)

This discussion of programming language design brings to mind an
excellent book I came across in the library, entitled "Programming
Languages: A Grand Tour".  It was published, I believe, by the ACM,
but I do not have any other publication information.

This book gathers together many historically important papers on
programming language design and on the programming languages
themselves.  To some extent, it serves as an introduction to the
philosophic issues that dictate the design of a particular
programming language.

Herman Rubin may find this book to be quite eye-opening, as I did.
(It also put me on an Algol-60 kick that I haven't yet managed to
overcome...)

-- 
Stephen Trier                              Case Western Reserve University
Work: trier@cwlim.ins.cwru.edu             Information Network Services
Home: sct@seldon.clv.oh.us

pcg@cs.aber.ac.uk (Piercarlo Grandi) (09/24/90)

On 22 Sep 90 06:10:27 GMT, norman@d.cs.okstate.edu (Norman Graham) said:

norman> From article <2581@l.cc.purdue.edu>, by cik@l.cc.purdue.edu
norman> (Herman Rubin):

cik> A language is a set of macros to be interpreted by the compiler
cik> into machine procedures.

This is a bit crude -- a less crude, though still not entirely satisfactory,
view is that a language is a set of macros for a virtual machine, and the
compiler must both expand the macros and map the virtual machine onto the
real machine.

cik> There is no good reason why the user should not be allowed to add
cik> macros in convenient notation to this list, nor why many of the
cik> restrictions on the current macros need be there.

The concept of a supercompiler -- you would be pleased with Lisp/Scheme,
which do not have any such restrictions, and with the many Lisp/Scheme
compilers that use a distributed compiler approach: each relevant function
is tagged with two (or more, if you want) properties that define its
behaviour with respect to 'COMPILE and 'INTERPRET.

	e.g. the symbol PLUS is associated, under the COMPILE property,
	with code that _generates_ "load op1, load op2, sum op1,op2,op3"
	while under the INTERPRET property it will execute the same
	sequence. A Smalltalk/Self style compiling interpreter is one
	that generates code incrementally and then passes control to the
	chunks of code so generated on the fly. Not difficult to do
	in Lisp/Scheme either.
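
A rough sketch of the two-property idea (the rule table and the emitted
instruction names here are hypothetical, not any real Lisp system's
interface):

    (define rules '())      ; each entry: (symbol compile-proc . interpret-proc)

    (define (define-rule sym compile-proc interp-proc)
      (set! rules (cons (cons sym (cons compile-proc interp-proc)) rules)))

    (define (rule-of sym which)
      (let ((entry (assq sym rules)))
        (if (eq? which 'compile) (cadr entry) (cddr entry))))

    (define-rule 'plus
      (lambda (a b) (list (list 'load a) (list 'load b) '(add)))  ; generates code
      (lambda (a b) (+ a b)))                                     ; executes directly

    ;; ((rule-of 'plus 'compile) 'op1 'op2)  =>  ((load op1) (load op2) (add))
    ;; ((rule-of 'plus 'interpret) 2 3)      =>  5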

The compiler and the interpreter are then just symbolic reduction
engines that fire the appropriate rules for each symbol they see in the
source. In one case the reduction produces code to calculate results,
in the other case it produces the results directly.  Actually, if one is
clever it is possible to realize that first one wants to reduce things at
compile time as much as possible, and then compile the rest.

	What you want to do is to interpret a program as much as you can,
	and then compile what remains, that will depend on free variables.

I think that Forth is similar here -- two (conceptual) dictionaries and
two distinct but similar reduction engines.

norman> This view explains why you're constantly at odds with the rest of 
norman> the netnews community.

Precisely: many in the netnews community cannot conceive of
anything different from the C or Fortran they are accustomed to using,
and some others cannot quite see Rubin's motivation, as follows.

norman> First, as anyone with a basic knowledge of languages can tell you,

I am somehow under the impression that Rubin understands languages
better than somebody who gives much importance to syntax. A recent book
on language implementation maps many disparate languages into a simple
Lisp-like syntax, with no great loss.

norman> a language is a set of strings (of finite but arbitrary length)
norman> over some alphabet.

Well, that is the syntax of the language -- essentially irrelevant,
except for pragmatics. If this were literally true, yacc would be all we
need for writing compilers. But:

norman> In addition, computer languages are notations for expressing
norman> computations; they are not notations for expressing the machine
norman> instructions you want a compiler to generate.

Uhhhmmmm. Here we seem to be describing some branch of mathematics
instead -- a fairly common idea.  Computer languages exist for pragmatic
reasons. Otherwise we'd be using lambda calculus or Turing machines. I
used to think, erroneously I now understand, that computer languages are
not there to describe computations; they are there for supporting an
engineering like activity, in which expressing computations is a fairly
small part of the whole picture. Analogies are always inaccurate, but it
would be like saying that the reason for a jet engine is the
demonstration of one of Newton's laws, or of the effects of heating a
gas. Maybe civil engineering is only really about differential equations
as well.

norman> As a general rule, a computer language exists apart from any
norman> particular compiler or computer architecture, although issues of
norman> compilation and computation model may influence the design of a
norman> language.

I would dearly hope they do. Actually I would hope they are *prominent*.
I would venture as far as saying that the machine is the reason for
which the language exists. The idea that a language *definition* should
imply a virtual rather than a real machine is there to aid portability,
hopefully without hurting its applicability to real machines too much;
for languages for which portability is not important, a virtual-machine
based definition is often a burden.

norman> Second, it is simply naive to describe what a compiler does as
norman> macro expansion.

Maybe you are thinking of macros as they are found in languages like C,
those Rubin is quite unhappy about.

I am quite sure that a compiler can be implemented in Lisp/Scheme
macros. I am using "macro" here as 'procedure that is evaluated at
compile time, and whose function is to transform a piece of program into
something else'.  It need not be a mere syntax macro, say like in cpp.
If you are familiar with Lisp/Scheme, just consider the source and
effects of some common 'loop' macros -- to my poor tired eyes they look
like pieces of a compiler.
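
For instance, a tiny user-level 'while' macro (a generic sketch, not any
particular system's loop package) already performs the kind of rewriting a
compiler front end does:

    ;; Turns a loop into a tail-recursive named let.
    (define-syntax while
      (syntax-rules ()
        ((_ test body ...)
         (let loop ()
           (when test
             body ...
             (loop))))))

    ;; (define i 0)
    ;; (while (< i 3) (display i) (set! i (+ i 1)))   ; prints 012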

cik> If the machine can do something, the user should be able to invent
cik> a notation to do it, if necessary.  For example, I find the general
cik> vector operation on the CYBER 205, with its dozen optional fields,
cik> easy to explain in a notation easy to read and write.

Maybe Rubin is a bit rough in expression, but I think that he is trying
to say that he would dearly love to have extensible compilers, i.e.
compilers where the set of compile time procedures can be extended by the
user, i.e. the symbolic reduction engine is programmable. The "if
necessary" above does not refer to power, I guess, but to cost.

Well, maybe he should seriously consider Lisp/Scheme. It is well known
that they are excellently suited to numerical computations, and they
provide unparalleled access to the underlying implementation. Maybe a
look at T? Forth is another nice choice. Note that I am not joking here
-- I am very serious. I am considering writing myself an OS kernel in
Scheme or something similar rather than in C/C++. Scheme can be viewed
as a fairly good SIL, even if most Scheme systems do not implement it as
one (especially as far as memory management is concerned -- you don't
really have the option of doing manual memory reclamation).

norman> This is written as if you believe the 'power of the computer' resides
norman> in its instruction set. I disagree with this idea.

Interesting disagreement. So Peano arithmetic (zero, successor, test
against zero) is enough and everything else is redundant? Why are we not
using Turing machines? They are as powerful as anything else, and have
a pretty small instruction set.

norman> In fact, the whole RISC movement is founded on the idea that to
norman> build a faster computer you must _remove_ instructions from the
norman> instruction set.

Uhmmm. _must_? I don't think they see it as a necessity -- my own,
probably seriously wrong idea of the subject, is that the RISC movement
is about improving price/performance, not *power* (once you have got
Peano, you have got it all), by observing that it is _often_ a bad
investment to allocate your silicon budget to seldom used, complicated
instructions, but instead it gives better price/performance to allocate
it to speeding up the simpler, most frequently executed ones.

I don't think that this excludes that a RISC machine can have
complicated instructions (vide floating point), if the payoff is
sufficient.  So far the best known payoff for doing matrix computing is
in vector instructions, ergo they are RISCy :-). Or maybe I am no longer
current on this subject and I should agree that:

norman> Likewise, adding an instruction to a modern computer will not
norman> increase its computational power.

Or maybe you should reconsider -- try to remove vector instructions from
a Cray, if they let you. Indeed its power will not be diminished, as
long as Peano instructions are left in, but its price/performance ratio
may be another story, I guess. Or maybe you should reread Hennessy and
Patterson's book to get a better idea of what the RISC idea is about
(the title gives away almost the whole story).

norman> I believe the 'power of the computer' resides in its ability to be
norman> used as an abstraction device on which we can build higher and higher
norman> levels of reality.

Here we fully agree. The reasons for which computers are so important
are that they are general purpose automata (abstraction engines), and
that they can be cost effective too. Leaving out the second aspect
brings us back to Babbage, I am afraid.

norman> Personally, I chose to live in a reality where
norman> lambda calculus is the underlying computational model.

We couldn't have guessed! :-). Most of us out here are instead in a
reality where they must pay for their machines, and are keenly
interested in price/performance ratios.

Rubin too is unfortunately locked in this sad and regrettable reality,
and also in a segment of it where absolute prices are so huge that
price/performance ratio of a machine/language combination seems not
easily dismissable.

He is quite annoyed that he cannot get the benefits of high level
languages without forgoing those of special purpose hw facilities that
substantially enhance the cost effectiveness of his work, and that also
carried a fairly large price tag (only a few million dollars, I guess,
that's the difference between the price of a 205 and that of a mainframe
of equivalent scalar power). He is even more annoyed as he suspects that
the problem could be rectified with just a little more effort on the
part of language designers and implementors.

norman> But by living in this level of reality I gain the ability to
norman> write extremely clear and straight-forward programs that are 10,
norman> 20, and sometimes even 30 times _shorter_ than the equivalent
norman> programs written in a typical high-level language.

Are you sure that your "programs" are equivalent? A lot of people
(including regrettably Prof. Dijkstra and Prof. Hoare) seem to have lost
sight of the distinction between what is a program and what is an
algorithm written in a program-like notation. A sort program, e.g.  UNIX
sort(1), is quite a different thing from a sort algorithm, e.g.
qsort(3).

My probably warped view of the programming scene is that the best known
general purpose language technology, applied under favourable
circumstances, seems (hand waving very prominently here) no more than
2-3 times more productive than mainstream language technology (Fortran,
Cobol, C, PL/1); claims to the contrary make me a bit skeptical. Now I
admit that even 200-300% is a lot, in economic terms, but it pales by
comparison with the different productivity levels of *people*.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

norman@d.cs.okstate.edu (Norman Graham) (09/24/90)

From article <PCG.90Sep23181145@teachc.cs.aber.ac.uk>:

Here are the players:
pcg>     Piercarlo Grandi  <pcg@cs.aber.ac.uk>
cik>     Herman Rubin      <cik@l.cc.purdue.edu>
norman>  Norman Graham     <norman@d.cs.okstate.edu>

norman> First, as anyone with a basic knowledge of languages can tell you,
 
pcg> I am somehow under the impression that Rubin understands languages
pcg> better than somebody who gives much importance to syntax. A recent book
pcg> on language implementation maps many disparate languages into a simple
pcg> Lisp-like syntax, with no great loss.

This sounds like denotational semantics to me. So programs written in
one language can be translated into another language; what's your point?
 
norman> a language is a set of strings (of finite but arbitrary length)
norman> over some alphabet.
 
pcg> Well, that is the syntax of the language -- essentially irrelevant,
pcg> except for pragmatics. If this were literally true, yacc would be all we
pcg> need for writing compilers. But:

While syntax is certainly arbitrary (for the most part), it is not
irrelevant: Syntax defines which strings of symbols are valid
programs (i.e. syntax defines the language, and the language is a set
of strings of symbols). Without a syntactic definition, how can our
compilers know the difference between a correct program and an errant
one? [Aside: I'm sure you're aware that Yacc can't handle the syntax
of many programming languages.]

On the other hand, I agree that syntactic issues are, as a rule, boring.
And, of course, syntax says nothing about what a program means.
 
norman> In addition, computer languages are notations for expressing
norman> computations; they are not notations for expressing the machine
norman> instructions you want a compiler to generate.
 
pcg> Uhhhmmmm. Here we seem to be describing some branch of mathematics
pcg> instead -- a fairly common idea.  Computer languages exist for pragmatic
pcg> reasons. Otherwise we'd be using lambda calculus or Turing machines. I
pcg> used to think, erroneously I now understand, that computer languages are
pcg> not there to describe computations; they are there for supporting an
pcg> engineering like activity, in which expressing computations is a fairly
pcg> small part of the whole picture. [ ... ]

Well, you've confused me here: Do you agree or disagree with me?
I believe you meant to disagree, but an extra 'not' got in the way. :-)

I will not disagree with your assertion that computer languages exist
for pragmatic reasons. However, I will stand by my original statement:
Programs are abstract specifications of computation--not abstract
specifications of sequences of machine instructions. [Here, I'm 
writing of the general case, not of special or assembly languages.]

Of course, describing computations is only a small part of what a
programmer/software engineer does, but this is beside the point.
 
norman> As a general rule, a computer language exists apart from any
norman> particular compiler or computer architecture, although issues of
norman> compilation and computation model may influence the design of a
norman> language.
 
pcg> I would dearly hope they do. Actually I would hope they are *prominent*.
pcg> I would venture as far as saying that the machine is the reason for
pcg> which the language exists. [ ... ]

I'm afraid you mistook my intent here. I was speaking of the way
the von Neumann machine influences the imperative languages, or the
way the lambda calculus influences the functional languages, etc. 
The underlying computational model influences language design and 
thus programming style. [And yes, there is some hope that novel
languages will influence machine design.]

If we had to rely solely on the physical machine for our ideas of
language design, I'm afraid we would be stuck with increasingly 
baroque or specialized languages.
 
norman> Second, it is simply naive to describe what a compiler does as
norman> macro expansion.

pcg> Maybe you are thinking of macros as they are found in languages like C,
pcg> those Rubin is quite unhappy about.
pcg> 
pcg> I am quite sure that a compiler can be implemented in Lisp/Scheme
pcg> macros. I am using "macro" here as 'procedure that is evaluated at
pcg> compile time, and whose function is to transform a piece of program into
pcg> something else'.  It need not be a mere syntax macro, say like in cpp.
pcg> If you are familiar with Lisp/Scheme, just consider the source and
pcg> effects of some common 'loop' macros -- to my poor tired eyes they look
pcg> like pieces of a compiler.

Actually, I had forgotten about Lisp/Scheme macros. Yes you can
implement a translator from a (user defined) language to Lisp/Scheme
with macros. But this does not mean that compilers, in general,
are macro expanders, which was Rubin's original assertion.
Certainly languages are not collections of macros, as Rubin 
also asserted. [But, of course, a translator may be a collection
of macros.]
 
cik> If the machine can do something, the user should be able to invent
cik> a notation to do it, if necessary.  For example, I find the general
cik> vector operation on the CYBER 205, with its dozen optional fields,
cik> easy to explain in a notation easy to read and write.
 
pcg> Maybe Rubin is a bit rough in expression, but I think that he is trying
pcg> to say that he would dearly love to have extensible compilers, i.e.
pcg> compilers where the set of compile time procedures can be extended by the
pcg> user, i.e. the symbolic reduction engine is programmable. The "if
pcg> necessary" above does not refer to power, I guess, but to cost.
pcg> 
pcg> Well, maybe he should seriously consider Lisp/Scheme. It is well known
pcg> that they are excellently suited to numerical computations, and they
pcg> provide unparalleled access to the underlying implementation. Maybe a
pcg> look at T? Forth is another nice choice. Note that I am not joking here
pcg> -- I am very serious. I am considering writing myself an OS kernel in
pcg> Scheme or something similar rather than in C/C++. Scheme can be viewed
pcg> as a fairly good SIL, even if most Scheme systems do not implement it as
pcg> one (especially as far as memory management is concerned -- you don't
pcg> really have the option of doing manual memory reclamation).

So, what you're saying is that Rubin should invent his own language
and write translators for that language in Lisp/Scheme macros. Yes
this is possible--I've heard of a macro package for Common Lisp
that implements an Algol-like language.
 
norman> This is written as if you believe the 'power of the computer' resides
norman> in its instruction set. I disagree with this idea.
 
pcg> Interesting disagreement. So Peano arithmetic (zero, successor, test
pcg> against zero) is enough and everything else is redundant? Why are we not
pcg> using Turing machines? They are as powerful as anything else, and have
pcg> a pretty small instruction set.

Well, I'm sure you know we can't build Turing machines. :-)

Let me clarify my position here: While the 'power' of the computer
does not reside in its instruction set, the efficiency and practicality
of the computer do.
 
norman> In fact, the whole RISC movement is founded on the idea that to
norman> build a faster computer you must _remove_ instructions from the
norman> instruction set.
 
pcg> Uhmmm. _must_? I don't think they see it as a necessity -- my own,
pcg> probably seriously wrong idea of the subject, is that the RISC movement
pcg> is about improving price/performance, not *power* (once you have got
pcg> Peano, you have got it all), by observing that it is _often_ a bad
pcg> investment to allocate your silicon budget to seldom used, complicated
pcg> instructions, but instead it gives better price/performance to allocate
pcg> it to speeding up the simpler, most frequently executed ones.
pcg> 
pcg> I don't think that this excludes that a RISC machine can have
pcg> complicated instructions (vide floating point), if the payoff is
pcg> sufficient.  So far the best known payoff for doing matrix computing is
pcg> in vector instructions, ergo they are RISCy :-). [ ... ]

Quite correct. I didn't mean to imply otherwise.

norman> I believe the 'power of the computer' resides in its ability to be
norman> used as an abstraction device on which we can build higher and higher
norman> levels of reality.
 
pcg> Here we fully agree. The reasons for which computers are so important
pcg> are that they are general purpose automata (abstraction engines), and
pcg> that they can be cost effective too. Leaving out the second aspect
pcg> brings us back to Babbage, I am afraid.

norman> Personally, I chose to live in a reality where
norman> lambda calculus is the underlying computational model.

pcg> We couldn't have guessed! :-).

Oops, my rhetoric gave me away :-)

pcg> Most of us out here are instead in a
pcg> reality where they must pay for their machines, and are keenly
pcg> interested in price/performance ratios.
pcg> 
pcg> Rubin too is unfortunately locked in this sad and regrettable reality,
pcg> and also in a segment of it where absolute prices are so huge that
pcg> price/performance ratio of a machine/language combination seems not
pcg> easily dismissable.
pcg> 
pcg> He is quite annoyed that he cannot get the benefits of high level
pcg> languages without forgoing those of special purpose hw facilities that
pcg> substantially enhance the cost effectiveness of his work, and that also
pcg> carried a fairly large price tag (only a few million dollars, I guess,
pcg> that's the difference between the price of a 205 and that of a mainframe
pcg> of equivalent scalar power). He is even more annoyed as he suspects that
pcg> the problem could be rectified with just a little more effort on the
pcg> part of language designers and implementors.

And I can empathize with his plight.

norman> But by living in this level of reality I gain the ability to
norman> write extremely clear and straight-forward programs that are 10,
norman> 20, and sometimes even 30 times _shorter_ than the equivalent
norman> programs written in a typical high-level language.
 
pcg> Are you sure that your "programs" are equivalent? [...]

Right. I should have written "comparable" or "similar" rather than
"equivalent".
-- 
Norman Graham   <norman@a.cs.okstate.edu>   {cbosgd,rutgers}!okstate!norman
The opinions expressed herein do not necessarily reflect the views of
the state of Oklahoma, Oklahoma State University, OSU's Department of
Computer Science, or of the writer himself.

stephen@estragon.uchicago.edu (Stephen P Spackman) (09/24/90)

It turns out that you can get pretty generally extensible syntax in a
programming language without using macro expansion at all, and
probably without impairing efficiency significantly. Everything has
clean formal semantics - my current notation is both semantically and
syntactically like a very much cleaned up Algol68.

I'm supposed to be writing a paper about it at this very moment, but I
seem to be goofing off :-). When I saw this thread I couldn't help
commenting.

(No, I'm not going to post a full description because then I *surely*
won't finish writing this stupid thing up. I'm already bored to tears
:-).

stephen p spackman  stephen@estragon.uchicago.edu  312.702.3982

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (09/24/90)

In article <2581@l.cc.purdue.edu>, cik@l.cc.purdue.edu (Herman Rubin) writes:
> Why is there an absolute value function rather than an absolute value
> operator?

In APL it's |x and in JOSS it's |x|.  abs was an operator in Algol 68 too;
how you entered it was of course system-dependent.

> I have used a dialect of Fortran which overloaded some of the "standard" functions,
> so that absolute value, logarithm, mod, etc., did not have to have any modifier
> indicating the type(s) of the arguments.  The compiler did the appropriate
> analysis.

Sounds like Fortran 77.  PL/I "GENERIC" functions provided this.  Algol 68
provided overloading for operators.  Fortran 90 provides user-definable
overloading.  It's in Ada.  Haskell overloads.

> A language is a set of macros to be interpreted by the compiler into machine
> procedures.  There is no good reason why the user should not be allowed to
> add macros in convenient notation to this list, nor why many of the restric-
> tions on the current macros need be there.

I strongly suspect that Pop-11 may have just what you want.  (Except that
for portability reasons it compiles to a common "Pop Virtual Machine".)
I suggest that you look in comp.lang.lisp for articles posted by Aaron
Sloman; he has recently pointed out that Pop-11 "syntax forms" let you
plant the code of your choice.

-- 
Heuer's Law:  Any feature is a bug unless it can be turned off.

mroussel@alchemy.chem.utoronto.ca (Marc Roussel) (09/25/90)

In article <2581@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>Why is there an absolute value function rather than an absolute value
>operator?  Only because when Fortran was written, they were limited to
>a 48 character set (IBM punched cards, at the time).  This is also why
>** was used for exponentiation, and even * for multiplication.  I have
>used a dialect of Fortran which overloaded some of the "standard" functions,
>so that absolute value, logarithm, mod, etc., did not have to have any modifier
>indicating the type(s) of the arguments.  The compiler did the appropriate
>analysis.

Uhhh... Isn't this just what the FORTRAN 77 standard dictates?  (Yes, I
know about the exceptions.  I mean in the usual case where I say, for
instance, a = sin(b) and the types of both a and b are well-defined.)

                                Just asking...

				Marc R. Roussel
                                mroussel@alchemy.chem.utoronto.ca

lgm@cbnewsc.att.com (lawrence.g.mayka) (09/25/90)

In article <PCG.90Sep23181145@teachc.cs.aber.ac.uk>, pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
> My probably warped view of the programming scene is that the best known
> general purpose language technology, applied under favourable
> circumstances, seems (hand waving very prominently here) no more than
> 2-3 times more productive than mainstream language technology (Fortran,
> Cobol, C, PL/1); claims to the contrary make me a bit skeptical. Now I
> admit that even 200-300% is a lot, in economic terms, but it pales by
> comparison with the different productivity levels of *people*.

I disagree with your estimate, but even if 3.0 were the correct
multiplier to apply to individuals working alone, the nature of large
software system development can greatly exaggerate the effect.  30
mediocre programmers who are constantly in each other's way and who
have only a fair understanding of project requirements and system
architecture may be an order of magnitude less productive than 3
top-notch programmers who know every inch of their system and their
project, and who can work together as a harmonious team.


	Lawrence G. Mayka
	AT&T Bell Laboratories
	lgm@iexist.att.com

Standard disclaimer.

carroll@udel.edu (Mark Carroll <MC>) (09/26/90)

In article <2581@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>
>The problem with PL/I and assemblers is the poor notation.  This even
>occurs with the present languages, to a greater extent than many would
>recognize.  One of the important advantages of languages is the use of
>overloaded operators and relations, and convenient use of standard
>mathematical notation.
>

>Why is there an absolute value function rather than an absolute value
>operator?  Only because when Fortran was written, they were limited to
>a 48 character set (IBM punched cards, at the time).  This is also why
>** was used for exponentiation, and even * for multiplication.  I have
>used a dialect of Fortran which overloaded some of the "standard" functions,
>so that absolute value, logarithm, mod, etc., did not have to have any modifier
>indicating the type(s) of the arguments.  The compiler did the appropriate
>analysis.
>

First of all: functions versus operators is nothing more than a foolish
complaint on the basis of syntax. You don't like typing ABS(x)? Just
write yourself a quick filter that converts |x to ABS(X). It's probably
a whole two lines in awk or perl.

Second, for overloading: it exists. C++, or any object-oriented language,
allows full overloading. It will resolve things to the proper function
for the types involved. And it's even user extensible.

>A language is a set of macros to be interpreted by the compiler into machine
>procedures.  

This is where you really go off the wall.

A language is nothing but a set of Macros? Maybe I'll stop laughing and take
you seriously if you can show me any way of describing Smalltalk as  a set
of macros.

>There is no good reason why the user should not be allowed to
>add macros in convenient notation to this list, nor why many of the restric-
>tions on the current macros need be there.  If the machine can do something,
>the user should be able to invent a notation to do it, if necessary.  For
>example, I find the general vector operation on the CYBER 205, with its dozen
>optional fields, easy to explain in a notation easy to read and write.

I'm terribly happy for you.

Listen, if you're so hyped on the idea of writing your own super-macro
assembler, why don't you sit down and write a specification of the damned
thing. If you write the specification, I'll write the compiler. OK?

	<MC>

--
| Mark Craig Carroll: <MC> |"We the people want it straight for a change;
| CIS Graduate Student at  | cos we the people are getting tired of your games;
| University of Delaware   | If you insult us with cheap propaganda; 
| carroll@udel.edu         | We'll elect a precedent to a state of mind" -Fish