[comp.compilers] Need input on designing a new language

ramsey@ncoast.ORG (Cedric Ramsey) (05/31/90)

	I have a question to pose to you programmers. I am thinking about
developing a new language, but first I would like your input on a problem
that I have. Firstly, I must say that the main goal of many languages should
be simplicity: simple to solve problems with and simple to implement in a
compiler. That's the heart of my problem: simple to implement. It would be
easier for me if I forced the user, the programmer, to declare all the
procedures before any procedure body occurs. That is:

Declare Procedure A
 ... Local and Global variables ...
End Declare
Declare Procedure B
 ... Local and Global variables ...
End Declare
     .
     .
     .
Declare Procedure X
 ... Local and Global variables ...
End Declare


Body for procedure A
End Body

Body for procedure B
End Body
    .
    .
    .
Body for procedure X
End Body

	Otherwise I could allow C-flavored declarations, that is:
Declare Procedure A
 ... Local and Global variables ...
End Declare
Body for procedure A
End Body

If you programmers think it is a good or a bad idea to force the
programmer to declare all procedures before use, please send me your
pros and/or cons. After all, the language will be for you guys, so I feel
it necessary for you to give me guidelines for creating a heafty
language. Please leave replies via e-mail: ramsey@ncoast.ORG.
Thank you for your time and consideration.
[What's a heafty language? -John]
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{spdcc | ima | lotus}!esegue.  Meta-mail to compilers-request@esegue.
Please send responses to the author of the message, not the poster.

henry@zoo.toronto.edu (06/02/90)

>That's the heart of my problem, simple to implement. It would be easier for
>me if I forced the user, programmer, to declare all the procedures before the
>function body occurs...

You have to decide where your priorities lie:  simpler implementation, or
simpler use.  Implementation is definitely simpler if you know the details
about a procedure before you have to compile a call to it.  However, my
experience is that programmers overwhelmingly prefer to write the entire
procedure -- declarations and body -- in one piece, and see no reason why
they should have to contort their code for the compiler's convenience.
They put up with it in most current languages, but they don't like it.

Compilers do random-access lookups much better than humans do.  If the
compiler needs information from line 245 to compile line 34, it should
go get it, not require the programmer to do so.
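
As a rough illustration of a compiler "going to get" the information itself, here is a minimal two-pass sketch. The toy program representation (tuples of name, parameter count, and calls) and all function names are assumptions made up for this example; they do not come from any real compiler.

```python
# Toy program: each item is (procedure name, parameter count, calls in its body).
# Each call is (callee name, argument count). This representation is hypothetical.
PROGRAM = [
    ("main", 0, [("helper", 2)]),      # calls helper before helper is defined
    ("helper", 2, [("main", 0)]),
]

def collect_signatures(program):
    """Pass 1: record every procedure's parameter count, ignoring bodies."""
    return {name: nparams for name, nparams, _calls in program}

def check_calls(program, signatures):
    """Pass 2: check each call against the now-complete signature table."""
    errors = []
    for name, _nparams, calls in program:
        for callee, nargs in calls:
            if callee not in signatures:
                errors.append(f"{name}: call to undefined procedure {callee}")
            elif signatures[callee] != nargs:
                errors.append(
                    f"{name}: {callee} expects {signatures[callee]} argument(s), got {nargs}"
                )
    return errors

def compile_two_pass(program):
    """Run both passes; the order of definitions in the source does not matter."""
    return check_calls(program, collect_signatures(program))
```

Because the first pass sees every definition before any call is checked, the programmer can write procedures in whatever order reads best.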

                                    Henry Spencer at U of Toronto Zoology
                                uunet!attcan!utzoo!henry henry@zoo.toronto.edu

klefstad@opera.ICS.UCI.EDU (Ray Klefstad II) (06/04/90)

>>That's the heart of my problem, simple to implement. It would be easier for
>>me if I forced the user, programmer, to declare all the procedures before
>>the function body occurs...
>
>You have to decide where your priorities lie: ...

It is really more complex than that.  There are (at least) four issues one
must consider when designing a general purpose language:
    1) programming ease
    2) compiler speed
    3) ease of compiler implementation
    4) language support for good software engineering

Let's look at three alternatives for subprogram declaration:

A) requiring specification before declaration of every subprogram
B) requiring specification before use of every subprogram
C) allowing arbitrary declaration and use order (use before declaration
   is ok)

This is how I would grade each approach on each issue (scale: excellent,
good, fair, poor, fail):
    Approach A:
        1: poor (requires duplicate typing of specs and bodies)
        2: poor (one pass, but must process duplicate definitions)
        3: good (almost excellent, but B is just as easy)
        4: (possibly) good (if you allow separate module specs and bodies
            like Ada or Modula 2 and separate compilation with strong
            semantic analysis across modules)
    Approach B:
        1: good (usually only a few forward declarations are necessary,
           but this style may force a backwards order on declaration,
           i.e., main subprograms must be placed after their auxiliary
           subprograms)
        2: excellent (allows one pass with minimal duplicate definition)
        3: excellent (specs are always known before a use - even in one pass)
        4: (possibly) excellent (same as Approach A.4 above)
    Approach C:
        1: excellent (what could be easier)
        2: poor (This will require two passes in the front-end.  I can
           think of one way to do it in one, but it will require lots of
           space and time overhead.)
        3: fair (again, requires two passes or some other complicated scheme)
        4: poor (How do you get semantic checking across modules?
           An optional pass by a lint-like program?  Yuck!)
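
One conceivable single-pass scheme for approach C (not necessarily the one alluded to above) is to defer any call whose callee has not been seen yet and resolve the deferred calls at the end of the input. The sketch below assumes a made-up program representation: tuples of procedure name, parameter count, and a list of (callee, argument count) calls. The `pending` list is where the extra space goes.

```python
def check_one_pass(program):
    """Check calls in one pass, deferring those whose callee is not yet known."""
    signatures = {}   # name -> parameter count, filled in as definitions are read
    pending = []      # (caller, callee, nargs) for calls seen before their callee
    errors = []

    def check(caller, callee, nargs):
        if callee not in signatures:
            errors.append(f"{caller}: call to undefined procedure {callee}")
        elif signatures[callee] != nargs:
            errors.append(
                f"{caller}: {callee} expects {signatures[callee]} argument(s), got {nargs}"
            )

    for name, nparams, calls in program:
        signatures[name] = nparams
        for callee, nargs in calls:
            if callee in signatures:
                check(name, callee, nargs)
            else:
                pending.append((name, callee, nargs))  # resolve at end of input

    # Every definition has now been seen, so the deferred calls can be checked.
    for caller, callee, nargs in pending:
        check(caller, callee, nargs)
    return errors
```

A real compiler would also have to hold back or patch the code generated for each deferred call, which is presumably where most of the overhead would come from.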

I say approach B is the winner.  This is what you see in current
languages such as Pascal, C, and even Ada.  I think the design of Ada
packages with separate specs and bodies is a real winner on this issue.
The subprogram specs in the package spec act as forward declarations
for the subprogram bodies in the package body and as forward declarations
for all modules that use (via `with') the subprograms defined for use by
that package.  In addition, you get excellent support for the often
overlooked point 4.
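
To make approach B concrete, here is a small sketch of a one-pass checker in which every call must name a procedure that has already been declared or defined. The item format (`"forward"` and `"define"` tuples) and the function name are hypothetical, chosen only for this illustration.

```python
def check_declare_before_use(items):
    """Single pass: a call is legal only if its callee is already known."""
    known = {}    # name -> parameter count, filled in as the pass proceeds
    errors = []
    for item in items:
        kind, name, nparams = item[0], item[1], item[2]
        if name in known and known[name] != nparams:
            errors.append(f"{name}: redeclared with a different parameter count")
        known[name] = nparams
        if kind == "define":
            for callee, nargs in item[3]:
                if callee not in known:
                    errors.append(f"{name}: {callee} used before declaration")
                elif known[callee] != nargs:
                    errors.append(
                        f"{name}: {callee} expects {known[callee]} argument(s), got {nargs}"
                    )
    return errors

# Mutual recursion needs exactly one forward declaration under this rule.
MUTUAL = [
    ("forward", "is_odd", 1),
    ("define", "is_even", 1, [("is_odd", 1)]),
    ("define", "is_odd", 1, [("is_even", 1)]),
]
```

Dropping the `forward` item makes the first call to `is_odd` an error, which is the "few forward declarations" cost mentioned in the grading of approach B.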

You asked for opinions about forward declarations and I gave mine.  You
didn't ask for opinions about your decision to invent and implement a
new programming language, but I will give mine anyway.

If your motivation is to learn something about programming language
implementation, then it's a good idea.  If your motivation is to invent
yet-another-programming-language, then I predict NO ONE will ever use
your language.  You will probably not use it either - even if it
incorporates a few new and interesting ideas!  Look at the
history of languages such as Alphard, Euclid, (even Smalltalk to some
extent) - the list goes on - and these are languages that had significant
interesting ideas.  Look what languages are successful: FORTRAN, COBOL,
BASIC, C, Ada, and (based on initial reaction) C++.

Ada was successful (if you can call it that) due to tremendous financial
and political support from the Dept. of Defense.  C gained support (over
assembly, FORTRAN, etc) because it is easy for programmers to write
efficient code (even with a non-optimizing compiler which is very easy
to write), it is easy to do machine-oriented operations, and it is easy
to implement (or port) a C compiler.  C++ is riding on the success wave
of C.  Few programmers want to expend the effort to learn a new language
(such as yours), but they might learn a few additions to their favorite
language.  While I do not favor programming in Ada, I have to admit
it has the best design of all the languages with which I am familiar,
but I doubt Ada would have succeeded without the support of the U.S. DOD.

A better use of your time would be to make a small improvement to an
existing language (such as that done by adding classes to C in C++).
Perhaps you can figure out a new multi-tasking model or an exception
handling model that works well with object-oriented languages.  This
could be added to implementations of C++, Objective-C, Smalltalk, and
others (perhaps even Ada).  When you do so, give it a nice syntax that
blends in with the conventions of the language you are extending.

One more thing to consider is the purpose of your language.  Who will
use it?  What will be its strengths?  Will it be the language to replace
all other languages?  You must decide, because you must make numerous
design decisions (tradeoffs considered) between various ways to handle
features of programming languages.  Are you going to build a debugger
and tools to support software development in your language?  Are you
going to write language primers to teach programmers to use your
language?  If not, then who will?

I don't mean to squelch your enthusiasm, but you should know what you
are in for.  I certainly don't want to discourage experimentation and
learning, but be aware of what is in store if you are expecting others
to use your language.  I can tell you, I already don't like the syntax
you have shown in your examples, but that is probably just personal
preference.  Many did not like Ada when it first came out.  It's always
easier to learn a language if it is consistent with a language you
already know; then you can learn just the differences.

If you are interested in designing languages, I suggest you first read
"Programming Language Concepts" by Carlo Ghezzi and Mehdi Jazayeri
or some other appropriate text covering language issues and design.

So do I design new language features?  Of course I do.  It is loads of
fun.  I am lucky to teach the compiler courses here at U.C. Irvine and
each term I must come up with a new language for my students to
compile.  I often try out new ideas in this course, but I usually do so
by starting with a subset of Ada and adding one new interesting idea.

henry@zoo.toronto.edu (06/05/90)

>    4) language support for good software engineering	...
>C) allowing arbitrary declaration and use order (use before declaration
>   is ok)	...
>    Approach C:	...
>        4: poor (How do you get semantic checking across modules?
>           An optional pass by a lint-like program?  Yuck!)

Semantic checking across modules is a necessity, and there is only one way
to do it:  gather information from all the modules, and compare.  The
question is whether that is done automatically (either by the compiler
or by a separate program), or whether the programmer is forced to do some
of the work by contorting his program structure to facilitate checking.
I take the strange and unpopular :-) position that such mechanical busywork
ought to be done by software, not by me.

If you insist on seeing it in terms of header files that have to be
included for compilation, consider having some part of the implementation
generate them automatically, rather than demanding that the user do it.
This also avoids the standard problem of header files, to wit that they
get out of step with the code they claim to describe.
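
A minimal sketch of what "generate them automatically" could look like, assuming a made-up source syntax in which each procedure starts with a line of the form `proc name(params)`. The syntax, the `declare` output format, and the function name are all assumptions for illustration only.

```python
import re

# Matches a hypothetical procedure header such as "proc area(width, height)".
PROC_RE = re.compile(r"^proc\s+(\w+)\s*\(([^)]*)\)")

def generate_interface(source):
    """Return one 'declare' line per procedure defined in the source text."""
    lines = []
    for line in source.splitlines():
        match = PROC_RE.match(line.strip())
        if match:
            name = match.group(1)
            params = [p.strip() for p in match.group(2).split(",") if p.strip()]
            lines.append(f"declare {name}({', '.join(params)})")
    return "\n".join(lines)

MODULE = """\
proc area(width, height)
    return width * height
end
proc unit()
    return area(1, 1)
end
"""
```

Since the interface is derived from the definitions each time, it cannot drift out of step with the code the way a hand-maintained header can.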

The extra processing time is irrelevant; if minimizing the CPU time needed
to compile my programs was all-important, I would write them in hexadecimal.
I have no liking for slow compilers, mind you, but if that CPU time is
being spent doing something useful -- like performing tasks that would take
much longer if done by hand -- I'm all for it.

Incidentally, Ray's article is a classical example of what's wrong with
header files, since understanding his assessments of the different methods
needed constant flipping back and forth between the assessments and his
definitions of aspects 1-4 and methods A-C.

On a broader topic, I'm in agreement with him that defining a new
general-purpose language is likely to be a futile exercise.  To gain any
popularity at all, a language either has to fill some specialized niche
that existing languages address poorly, or be an order of magnitude
better than they are at what they already do.  Small improvements are
not sufficient, because they do not justify the pain of conversion and
incompatibility.

                                    Henry Spencer at U of Toronto Zoology
                                henry@zoo.toronto.edu uunet!attcan!utzoo!henry

itcp@relay.EU.net (Tom Parke) (06/06/90)

klefstad@opera.ICS.UCI.EDU (Ray Klefstad II) writes:

>>>That's the heart of my problem, simple to implement. It would be easier for
>>>me if I forced the user, programmer, to declare all the procedures before
>>>the function body occurs...
>>
>>You have to decide where your priorities lie: ...

>It is really more complex than that.  There are (at least) four issues one
>must consider when designing a general purpose language:
>    1) programming ease
>    2) compiler speed
>    3) ease of compiler implementation
>    4) language support for good software engineering
	- how well is the language specified
	- what degree of consistency checking is available (e.g. type checking)
	- what support for modularity and re-use is there
    5) compiler portability
    6) source (of programs in the new language) portability
    7) reliability of compiler (similar to 3 but I like the extra emphasis)
    8) efficiency of compiled programs
	- degree of possible optimisation
	- how close to the hardware (inverse of portability)?
    9) how extensible is the language?
   10) how dynamic is the language?

I make it at least 10, any others?

For subprogram declaration, 8 is also significant: the more you know
about a procedure before you call it, the greater the scope for
procedure call optimisation. For some RISC architectures this can be
very significant.
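
One hedged example of such an optimisation: if the compiler knows which registers a callee overwrites, the caller only has to save the live registers in that set; if it knows nothing, it must assume the worst. The register names and the function below are hypothetical and not tied to any particular architecture.

```python
# Hypothetical set of registers a callee is allowed to overwrite.
CALLER_SAVED = {"r0", "r1", "r2", "r3"}

def registers_to_save(live_across_call, callee_clobbers=None):
    """Registers the caller must save around a call.

    callee_clobbers is None when nothing is known about the callee, in which
    case every caller-saved register is assumed to be overwritten.
    """
    clobbered = CALLER_SAVED if callee_clobbers is None else set(callee_clobbers)
    return sorted(set(live_across_call) & clobbered)
```

With `r1`, `r2` and `r3` live across the call, an unknown callee forces all three to be saved, whereas a callee known to touch only `r0` needs none saved.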

>You asked for opinions about forward declarations and I gave it.  You
>didn't ask for opinions about your decision to invent and implement a
>new programming language, but I will give mine anyway.

Me too. Frankly, if you're worried about whether procedures need
declaring before use I'd suggest you've either finished the design or
started in the wrong place :-)

>If your motivation is to learn something about programming language
>implementation, then it's a good idea.  If your motivation is to invent
>yet-another-programming-language, then I predict NO ONE will ever use it.

If you want to learn about programming language *design*, then it's a
good idea. But lay your hands on every language spec you can find
first... If you wish to learn about programming language
implementation, try writing a compiler for a subset of your favourite
language and then adding bits - see what's easy and what isn't!

>One more thing to consider is the purpose of your language.  Who will
>use it? [and more relevant points]

He's right. Every programmer and his dog thinks he can design a better
programming language; somehow it's only natural. But all the problems
in the languages we know should make us pause and think. Having one or
two better ideas isn't the hard part - it's avoiding the mistakes :-)

	Tom

(My opinions and spelling are my own) [bah, I fixed the latter -John]