[comp.compilers] Multi-compilers -- The ``Ideal'' Programming language ??

md89mch@cc.brunel.ac.uk (Martin Howe) (09/25/90)

In the middle of my periodical musings about what should or should not
go into an ``ideal'' programming language, and how people tend to resist
such ideas on grounds of implementation difficulty (among others), I came
across article <9009110403.AA03158@csd4.csd.uwm.edu> in which Mark William
Hopkins <markh@csd4.csd.uwm.edu> writes:

>   Recently, an interesting idea has come to mind for a new kind of compiler:
>a Multi-Compiler.  What makes it different from your typical compiler is that
>it accepts code from more than one source language: many source languages in
>fact.

In fact, reading the Byte 15 Year Anniversary issue, it seems Jensen & Partners
have come up with just that - the TopSpeed system. I tell you,
                          FROM stdio IMPORT printf;
looked pretty odd at first sight.

> What would it look like ? The whole issue seems to revolve around
> this concept (which I borrow from linguistics) of 'code-switching'.

In fact, TopSpeed isn't, as far as I can make out, a true ``multicompiler'';
JPI seem to do it around libraries. One uses one language as a top-level
shell and calls library routines from whichever languages have been installed
with your system.

However, I have felt for some time that multicompilers, when they arrive,
will not solve the problem very much better than mixed-language compilation
and linkable object modules do now. The __real__ problem, in my estimation,
is that of deciding exactly *what* should go in an all-embracing language.

As Mark Hopkins says:

> Different languages are designed to do different things better.

I would go further: different programming paradigms do things better. This
is obvious; but the solution, while equally obvious, doesn't seem to have been
tried [except Trilogy ?] and multicompilers only sidestep it.

There are, at the moment, four well-known programming paradigms: imperative,
functional, logic and object-oriented. There may be others, but these
are the main four at the moment. People often ignore the fact that real-world
problems often require more than one language type to solve them and for this
reason, I have suggested in the past and will continue to suggest, that a
``multi-language'' which covers all four is, rather than an ``ideal
impossibility'' or ``too difficult to implement'' or a ``bloated compiler''
[substitute whinge of your choice], an ABSOLUTE NECESSITY if anything even
remotely like an ``ideal'' programming language is ever to be designed.

I suggest that while we can never create the ideal <anything>, we can come
pretty close, and I offer the following possible solution for discussion:

For each type of language (four at the moment), extract a minimal language
that fulfills the requirements. For example, bare-bones Modula-2 for
the imperative requirements. Design a lexicon and grammar that cover all
four and are as natural-language-like as possible without being imprecise.
If you have to go to LL(2) or have a two-level parser, so be it; MIPS are
cheap these days (hey, I'm a VLSI designer, I should know :-) and human
time isn't. Let the library (i.e., object class) writers extend as necessary.

This is another focal point. It is stupid to say ``Oh, but the user can
write routines to extend the language.'' Oh yeah ? Then tell me which of the
following is more readable, given a library of complex arithmetic functions:

sin := (e**(i*z) - e**(-i*z)) / 2i  (* note the lack of garbage like FLOAT *)

sin := CompDiv(CompSub(CompExp(CompMul(CompAssign(0,1),z)),CompExp(CompMul(CompAssign(0,-1),z))),CompAssign(0,2));

It gets worse if you can't return user-defined non-cardinal types
(ie pointers to them) on the stack. This is another flaw in some languages
today. If I code

VAR
   meow : ARRAY[0..262143] OF BYTE;

and later on in a procedure

RETURN meow;

I **know** the compiler isn't going to return a 256kByte array on the stack;
it'll use a pointer. But I, the programmer DON'T NEED TO KNOW THIS ! There can
be no excuse these days for not allowing ANYTHING to be returned from a
procedure, but even Modula-2 Rev. 4 doesn't do this. Pfft!
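
For comparison, here is a minimal sketch in Python (not one of the languages
discussed here; it is used purely for illustration) of a routine that returns
the whole array while the pointer handling stays hidden from the programmer.
The names MEOW_SIZE and get_meow are hypothetical.

```python
# Hypothetical stand-in for the 256 kByte array `meow` declared above.
MEOW_SIZE = 256 * 1024

meow = bytearray(MEOW_SIZE)


def get_meow() -> bytearray:
    """Return the whole array; the caller never handles a pointer explicitly."""
    return meow


result = get_meow()

# The runtime hands back a reference, so the array itself is not copied...
print(result is meow)
# ...yet the caller treats the result as an ordinary array value.
print(len(result))
```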


Furthermore, make it easy to define not only your own operators, but also
__your own textual forms for literals__. I would rather write

CONST
    zin = z2 / (5+3i)
than
CONST
    zin = CompDiv(z2,CompAssign(5,3))

for example. Again, at this point, people usually start to whine, but
I would say that there is almost certainly a crossover point past which,
as languages get more natural-looking, the designer can think in higher level
terms, and express higher level ideas more succinctly, and therefore
__LESS BUGGILY__. (Who cares about EOL & EOT ? WHILE (<>) looks fine to me).
(Of course they can express higher-level algorithm flaws more succinctly :-)
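
The readability argument can be tried out in a language that already has
complex operators and a complex literal form. The sketch below uses Python,
where the literal suffix is spelled `j` rather than `i` and is built in, not
user-definable, so it illustrates only the notation, not the extensibility
asked for here. It assumes the standard identity
sin z = (e^(iz) - e^(-iz)) / 2i, and the `operator` functions stand in for
hypothetical CompDiv/CompSub-style library calls.

```python
import cmath
import operator

z = 0.5 + 0.25j   # arbitrary sample value

# Operator notation with a literal for 2i.
sin_ops = (cmath.exp(1j * z) - cmath.exp(-1j * z)) / 2j

# The same computation with every operator and literal spelled as a call.
sin_calls = operator.truediv(
    operator.sub(
        cmath.exp(operator.mul(complex(0, 1), z)),
        cmath.exp(operator.mul(complex(0, -1), z)),
    ),
    complex(0, 2),
)

z2 = 10 + 4j                                      # arbitrary sample value
zin = z2 / (5 + 3j)                               # literal form
zin_calls = operator.truediv(z2, complex(5, 3))   # constructor-call form

# Both spellings agree with each other and with the library sine.
print(cmath.isclose(sin_ops, cmath.sin(z)))
print(cmath.isclose(sin_ops, sin_calls))
print(cmath.isclose(zin, zin_calls))
```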

Of course, it must be remembered that someone who must have been very
clever once remarked: "Enable programmers to program in English, and you
will find that they can't". This is true up to a point. Our language
must be limited, or it will lose all precision. I am also saying that we need
a __lot__ of extra syntactic freedom in saying what you _can_ say in the
language, and current languages just don't provide it.

For example, is it really so difficult to parse out the noise words in

  z2 := the 53rd 130th root of z1;

given a procedure CompRoot(complex,integer,integer,complex) ? Perhaps with
objects available, we can provide self.parser as a routine with each
declared type [recursive compilation anyone ?].
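
A rough sketch of how little machinery the noise-word case needs, written in
Python for illustration. Everything here is an assumption: the noise-word
list, the reading of "the Kth Nth root" as the K-th (counting from 1, starting
at the principal root and going counter-clockwise) of the N complex N-th
roots, and the names comp_root and evaluate. It handles only this one
statement shape and is nowhere near a general parser.

```python
import cmath
import re

NOISE_WORDS = {"the", "of"}                      # assumed noise-word list
ORDINAL = re.compile(r"^(\d+)(st|nd|rd|th)$")    # matches 53rd, 130th, ...


def comp_root(z: complex, k: int, n: int) -> complex:
    """Return the k-th (1-based) of the n complex n-th roots of z."""
    r, theta = cmath.polar(z)
    return cmath.rect(r ** (1.0 / n), (theta + 2 * cmath.pi * (k - 1)) / n)


def evaluate(statement: str, env: dict) -> None:
    """Execute one statement of the form 'x := the Kth Nth root of y;'."""
    target, expr = statement.rstrip(";").split(":=")
    words = [w for w in expr.split() if w.lower() not in NOISE_WORDS]
    # After noise removal we expect: <ordinal> <ordinal> root <name>
    k_tok, n_tok, verb, source = words
    if verb != "root":
        raise ValueError(f"unknown operation: {verb}")
    k = int(ORDINAL.match(k_tok).group(1))
    n = int(ORDINAL.match(n_tok).group(1))
    env[target.strip()] = comp_root(env[source], k, n)


env = {"z1": 3 + 4j}
evaluate("z2 := the 53rd 130th root of z1;", env)
print(env["z2"])
```

Raising env["z2"] to the 130th power should give back z1 up to rounding
error, which is a quick sanity check on the dispatch.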

Oh, and one more thing - MACROS !
If I am putting together a library of IO routines based on a library
that comes with the compiler, I don't want a function call overhead,
whenever I use any of those routines verbatim. For example, if I rewrite
sin() and cos(), but leave exp() alone, I take a performance hit when I say

PROCEDURE MyExp(number: REAL): REAL;
BEGIN
     RETURN maths.exp(number)
END MyExp;

since MyExp is a real function, not a macro. Furthermore, I frequently
want to be able to dump a copy of a routine inline without doing it as
a function call, eg., for reasons of speed; but keeping only one main
definition of that function.

How about
BEGIN
     ...
     EXEC (some_horribly_complicated_test())

(*rather than *)

     some_horribly_complicated_test();
     ...
END;

For that matter, INC(x) looks like a procedure, but it'd damn well
better be a SINGLE assembler instruction in practice, or else.
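
For the MyExp re-export case, some languages let you avoid the forwarding
call by binding a second name to the existing routine. The Python sketch below
shows that idea only; Python has no macros, so it says nothing about the
inline-expansion (EXEC) case. The names my_exp and my_exp_wrapper are
hypothetical.

```python
import math


def my_exp_wrapper(number: float) -> float:
    """Forwarding wrapper: every call pays for one extra function call."""
    return math.exp(number)


# Re-export by binding a second name to the same function object.
# No new function is created, so there is no forwarding call at all.
my_exp = math.exp

print(my_exp is math.exp)                   # do both names denote one routine?
print(my_exp(1.0) == my_exp_wrapper(1.0))   # do both spellings agree?
```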

    ------------------------------------------------------------------------

Well I've got that lot off my chest after so many years, so let's
clean up the loose ends.

Mark continues:
>people I talked to about this seem to arrive at as a first idea, then you
>have nothing more than a series of disjoint compilers integrated by a common
>object code format and single linker.

BTW, JPI use a common p-code and object code generator.

> Syntax is not an issue.
Here  I must disagree. See above.

> We're not talking about actually merging the syntaxes of the source languages
I am (sort of).
>would be an interesting problem to solve.
You bet !

> When you want your compiler to do C, you issue a #in c directive. When you
> want it to switch to Pascal, you likewise issue a #in pascal directive, and
> so on...
I have thought of this before, but I'm not sure I'd like it. 

> With this latter strategy (more than one language per file), the issue of
> what language you issue external declarations in becomes moot: since it's all
> "going down the same stomach" anyhow, it doesn't matter.
I couldn't agree more, but I still feel the #C #pascal idea would look too
odd. Still, it's a matter of taste.

> The best strategy to pursue to minimize these problems seems to be to
> simultaneously develop extensions of each language that are upwardly
> compatible with the latest standard and which make these languages as much
> alike as possible.  This means adding C/Pascal-like data structures and
> control structures to the likes of FORTRAN or BASIC, for instance.

I'll go along with that in the meantime, despite the people who laugh when
I say it. Believe me, many people I have talked to find such ideas anathema.

>   It seems to me, though, that the huge investment in this effort would be
>very much worth it, since no matter where I talk and who I talk to about
>this, the idea goes over extremely well: it seems that we're talking about
>the ultimate programmer's workbench with this kind of utility.
Agreed.

>   But there's this one nagging issue: what would this give us that using a
>series of compilers, like MicroSoft's Quick series, with a good linker won't
>already give you?

A completely integrated and normalised language, tailored to fit the majority
of real-world problems (at least those we know how to do at the moment) with
as few _extraneous_ ways of doing the same thing as possible.

Oh well. I can dream...

Regards,
Martin.
(I leave Brunel University at the end of next week, but I'll happily
discuss this (if anyone's interested) until then).
--
Martin Howe, Microelectronics System Design MSc, Brunel U.
[A J Perlis often commented that attempts to combine dissimilar language
types produced "dumbbell shaped languages," i.e. the pieces didn't fit
together very well.  I'd also like a language that lets me say anything I
want to say very concisely, but I'm not convinced that I can define
something that combines all sorts of different stuff and doesn't end up
looking totally ugly.  More specific proposals could be persuasive.  Also,
there has been a long thread on this topic in comp.lang.misc. -John]
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{ima | spdcc | world}!esegue.  Meta-mail to compilers-request@esegue.

markh@csd4.csd.uwm.edu (Mark William Hopkins) (09/27/90)

In article <1990Sep25.025517.25446@esegue.segue.boston.ma.us> md89mch@cc.brunel.ac.uk (Martin Howe) writes:
>> Different languages are designed to do different things better.
>
>I would go further: different programming paradigms do things better. This
>is obvious; but the solution, while equally obvious, doesn't seem to have been
>tried [except Trilogy ?] ...

This is precisely what I meant, and gave the Prolog/C example to stress
this.  The particular multi-compiler I'm developing integrates programming
languages from all the 4 programming language paradigms you mention: C++ --
an object oriented language; C, Pascal, BASIC, and FORTRAN -- imperative
languages; LISP, and Miranda -- functional (and quasi-functional) languages;
and Prolog (a logic programming language).

You've mentioned in the subsequent text the ideal you'd like to see where a
language becomes almost flexible enough to allow user-definable syntax.
Prolog already allows for this to a significant degree, though it is grossly
underutilized, judging by the number of virtually unreadable Prolog programs
I've been able to take and very nearly convert into English, with
prepositions, verbs, and so on.  C++ has this feature to a smaller degree;
Haskell (and probably Miranda) goes almost as far as Prolog.

pardo@cs.washington.edu (David Keppel) (09/30/90)

markh@csd4.csd.uwm.edu (Mark William Hopkins) writes:
>[Prolog allows for user-definable syntax to some degree]

So do (did) many LISPs, where it was possible to redefine the
read-eval-print loop to use Your Favorite Syntax.

	;-D on  ( No taxing those sins! )  Pardo
-- 
		    pardo@cs.washington.edu
    {rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo