md89mch@cc.brunel.ac.uk (Martin Howe) (09/25/90)
In the middle of my periodic musings about what should or should not go into an ``ideal'' programming language, and how people tend to resist such ideas on grounds of implementation difficulty (among others), I came across article <9009110403.AA03158@csd4.csd.uwm.edu> in which Mark William Hopkins <markh@csd4.csd.uwm.edu> writes:

> Recently, an interesting idea has come to mind for a new kind of compiler:
> a Multi-Compiler. What makes it different from your typical compiler is that
> it accepts code from more than one source language: many source languages in
> fact.

In fact, reading the Byte 15 Year Anniversary issue, it seems Jensen & Partners have come up with just that - the TopSpeed system. I tell you,

    FROM stdio IMPORT printf;

looked pretty odd at first sight.

> What would it look like ? The whole issue seems to revolve around
> this concept (which I borrow from linguistics) of 'code-switching'.

In fact, TopSpeed isn't, as far as I can make out, a true ``multicompiler''; JPI seem to do it around libraries. One uses one language as a top-level shell and calls library routines from whichever languages have been installed with your system.

However, I have felt for some time that multicompilers, when they arrive, will not solve the problem very much better than mixed-language compilation and linkable object modules do now. The __real__ problem, in my estimation, is that of deciding exactly *what* should go in an all-embracing language. As Mark Hopkins says:

> Different languages are designed to do different things better.

I would go further: different programming paradigms do things better. This is obvious; but the solution, while equally obvious, doesn't seem to have been tried [except Trilogy ?] and multicompilers only sidestep it.

There are at the moment four well-known programming paradigms: imperative, functional, logic and object-oriented. There may be others, but these are the main four at the moment.
People often ignore the fact that real-world problems often require more than one language type to solve them, and for this reason I have suggested in the past, and will continue to suggest, that a ``multi-language'' which covers all four is, rather than an ``ideal impossibility'' or ``too difficult to implement'' or a ``bloated compiler'' [substitute whinge of your choice], an ABSOLUTE NECESSITY if anything even remotely like an ``ideal'' programming language is ever to be designed.

I suggest that while we can never create the ideal <anything>, we can come pretty close, and I offer the following possible solution for discussion:

For each type of language (four at the moment), extract a minimal language that fulfills the requirements. For example, bare-bones Modula-2 for the imperative requirements.

Design a lexicon and grammar that cover all four and are as natural-language-like as possible without being imprecise. If you have to go to LL(2) or have a two-level parser, so be it; MIPS are cheap these days (hey, I'm a VLSI designer, I should know :-); and human time isn't.

Let the library (i.e., object class) writers extend as necessary. This is another focal point. It is stupid to say ``Oh, but the user can write routines to extend the language.'' Oh yeah ? Then tell me which of the following is more readable, given a library of complex arithmetic functions:

    sin := (e**z - e**-z) / 2i        (* note the lack of garbage like FLOAT *)
    sin := CompDiv(CompSub(CompExp(z),CompExp(-z)),CompAssign(0,2));

It gets worse if you can't return user-defined non-cardinal types (i.e., pointers to them) on the stack. This is another flaw in some languages today. If I code

    VAR meow : ARRAY[0..262143] of byte;

and later on in a procedure

    RETURN meow;

I **know** the compiler isn't going to return a 256kByte array on the stack; it'll use a pointer. But I, the programmer, DON'T NEED TO KNOW THIS !
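To make the readability comparison of the two complex-sine forms above concrete, here is a minimal sketch in Python, chosen only because it has complex literals and operators built in; the post itself names no such language. The helpers comp_assign, comp_mul, comp_sub, comp_div and comp_exp are hypothetical stand-ins for CompExp-style library routines. The sketch assumes the standard identity sin z = (e^(iz) - e^(-iz)) / 2i, so its exponents carry a factor of i that the shorthand in the post leaves out.

```python
import cmath

# Hypothetical stand-ins for a procedural complex-arithmetic library
# (the CompExp/CompSub/CompDiv/CompAssign style of call).
def comp_assign(re, im):
    return complex(re, im)

def comp_mul(a, b):
    return a * b

def comp_sub(a, b):
    return a - b

def comp_div(a, b):
    return a / b

def comp_exp(z):
    return cmath.exp(z)

def sin_with_operators(z):
    # Operator notation with a complex literal (2j is Python's spelling of 2i)
    return (cmath.e ** (1j * z) - cmath.e ** (-1j * z)) / 2j

def sin_with_calls(z):
    # The same formula spelled out through nested library calls
    i = comp_assign(0, 1)
    return comp_div(
        comp_sub(comp_exp(comp_mul(i, z)), comp_exp(comp_mul(-i, z))),
        comp_assign(0, 2),
    )

z = complex(0.5, 0.25)
print(sin_with_operators(z))
print(sin_with_calls(z))
print(cmath.sin(z))  # library reference value for comparison
```

Both functions compute the same value up to rounding; the only difference is how much of the formula survives in the source text.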
There can be no excuse these days for not allowing ANYTHING to be returned from a procedure, but even Modula-2 Rev. 4 doesn't do this. Pfft!

Furthermore, make it easy to define not only your own operators, but also __your own textual forms for literals__. I would rather write

    CONST zin = z2 / (5+3i)

than

    CONST zin = CompDiv(z2,CompAssign(5,3))

for example.

Again, at this point, people usually start to whine, but I would say that there is almost certainly a crossover point past which, as languages get more natural-looking, the designer can think in higher-level terms and express higher-level ideas more succinctly, and therefore __LESS BUGGILY__. (Who cares about EOL & EOT ? WHILE (<>) looks fine to me). (Of course they can express higher-level algorithm flaws more succinctly too :-)

Of course, it must be remembered that someone who must have been very clever once remarked: "Enable programmers to program in English, and you will find that they can't". This is true up to a point. Our language must be limited, or it will lose any preciseness. I am also saying that we need a __lot__ of extra syntactic freedom in saying what you _can_ say in the language, and current languages just don't provide it. For example, is it really so difficult to parse out the noise words in

    z2 := the 53rd 130th root of z1;

given a procedure CompRoot(complex,integer,integer,complex) ? Perhaps with objects available, we can provide self.parser as a routine with each declared type [recursive compilation anyone ?].

Oh, and one more thing - MACROS ! If I am putting together a library of IO routines based on a library that comes with the compiler, I don't want a function call overhead whenever I use any of those routines verbatim. For example, if I rewrite sin() and cos(), but leave exp() alone, I take a performance hit when I say

    PROCEDURE MyExp(number: REAL): REAL;
    BEGIN RETURN maths.exp(number) END MyExp;

since MyExp is a real function, not a macro.
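As a rough sketch of the noise-word question raised above, the following Python fragment drops a small, assumed set of noise words from the right-hand side of the statement and hands what is left to a root routine. The names NOISE_WORDS, parse_root_statement, execute and comp_root are hypothetical, as is the reading of "the 53rd 130th root" as "the 53rd of the 130 distinct 130th roots"; the post does not spell out what CompRoot's arguments mean.

```python
import cmath
import re

NOISE_WORDS = {"the", "of"}  # assumed; a real design needs a policy for these
ORDINAL = re.compile(r"(\d+)(?:st|nd|rd|th)")

def comp_root(z, k, n):
    """Return the k-th (1-based) of the n distinct n-th roots of z."""
    r, theta = cmath.polar(z)
    return cmath.rect(r ** (1.0 / n), (theta + 2 * cmath.pi * (k - 1)) / n)

def parse_root_statement(text):
    """Turn '<target> := the <k>th <n>th root of <source>;' into a tuple."""
    target, assign, phrase = text.partition(":=")
    if not assign:
        raise SyntaxError("expected ':='")
    words = [w for w in phrase.strip().rstrip(";").split()
             if w.lower() not in NOISE_WORDS]
    # With the noise words gone we expect: <ordinal> <ordinal> root <source>
    if len(words) != 4 or words[2].lower() != "root":
        raise SyntaxError(f"cannot parse {phrase.strip()!r}")
    ordinals = [ORDINAL.fullmatch(w) for w in words[:2]]
    if not all(ordinals):
        raise SyntaxError("expected two ordinals such as '53rd 130th'")
    k, n = (int(m.group(1)) for m in ordinals)
    return target.strip(), k, n, words[3]

def execute(text, env):
    """Evaluate one root statement against a dict of complex variables."""
    target, k, n, source = parse_root_statement(text)
    env[target] = comp_root(env[source], k, n)
    return env

env = execute("z2 := the 53rd 130th root of z1;", {"z1": complex(1, 1)})
print(env["z2"])
```

Stripping the noise words is the easy part; the harder design question, which this sketch sidesteps, is deciding which words count as noise without making the language ambiguous.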
Furthermore, I frequently want to be able to dump a copy of a routine inline without doing it as a function call, e.g., for reasons of speed, while keeping only one main definition of that function. How about

    BEGIN
      ...
      EXEC (some_horribly_complicated_test())
      (* rather than *) some_horribly_complicated_test();
      ...
    END;

For that matter, INC(x) looks like a procedure, but it'd damn well better be a SINGLE assembler instruction in practice, or else.

------------------------------------------------------------------------

Well, I've got that lot off my chest after so many years, so let's clean up the loose ends. Mark continues:

> people I talked to about this seem to arrive at as a first idea, then you
> have nothing more than a series of disjoint compilers integrated by a common
> object code format and single linker.

BTW, JPI use a common p-code and object code generator.

> Syntax is not an issue.

Here I must disagree. See above.

> We're not talking about actually merging the syntaxes of the source languages

I am (sort of).

> would be an interesting problem to solve.

You bet !

> When you want your compiler to do C, you issue a #in c directive. When you
> want it to switch to Pascal, you likewise issue a #in pascal directive, and
> so on...

I have thought of this before, but I'm not sure I'd like it.

> With this latter strategy (more than one language per file), the issue of
> what language you issue external declarations becomes moot: since it's all
> "going down the same stomach" anyhow, it doesn't matter.

I couldn't agree more, but I still feel the #C #pascal idea would look too odd. Still, it's a matter of taste.

> The best strategy to pursue to minimize these problems seems to be to
> simultaneously develop extensions of each language that are upwardly
> compatible with the latest standard and which make these languages as much
> alike as possible. This means adding C/Pascal-like data structures and
> control structures to the likes of FORTRAN or BASIC, for instance.
I'll go along with that in the meantime, despite the people who laugh when I say it. Believe me, many people I have talked to find such ideas anathema.

> It seems to me, though, that the huge investment in this effort would be
> very much worth it, since no matter where I talk and who I talk to about
> this, the idea goes over extremely well: it seems that we're talking about
> the ultimate programmer's workbench with this kind of utility.

Agreed.

> But there's this one nagging issue: what would this give us that using a
> series of compilers, like MicroSoft's Quick series, with a good linker won't
> already give you?

A completely integrated and normalised language, tailored to fit the majority of real-world problems (at least those we know how to do at the moment) with as few _extraneous_ ways of doing the same thing as possible.

Oh well. I can dream...

Regards, Martin.

(I leave Brunel University at the end of next week, but I'll happily discuss this (if anyone's interested) until then).
--
Martin Howe, Microelectronics System Design MSc, Brunel U.

[A J Perlis often commented that attempts to combine dissimilar language types produced "dumbbell shaped languages," i.e. the pieces didn't fit together very well. I'd also like a language that lets me say anything I want to say very concisely, but I'm not convinced that I can define something that combines all sorts of different stuff and doesn't end up looking totally ugly. More specific proposals could be persuasive. Also, there has been a long thread on this topic in comp.lang.misc. -John]
--
Send compilers articles to compilers@esegue.segue.boston.ma.us
{ima | spdcc | world}!esegue. Meta-mail to compilers-request@esegue.
markh@csd4.csd.uwm.edu (Mark William Hopkins) (09/27/90)
In article <1990Sep25.025517.25446@esegue.segue.boston.ma.us> md89mch@cc.brunel.ac.uk (Martin Howe) writes:

>> Different languages are designed to do different things better.
>
> I would go further: different programming paradigms do things better. This
> is obvious; but the solution, while equally obvious, doesn't seem to have been
> tried [except Trilogy ?] ...

This is precisely what I meant, and I gave the Prolog/C example to stress this.

The particular multi-compiler I'm developing integrates programming languages from all 4 of the programming language paradigms you mention: C++ -- an object-oriented language; C, Pascal, BASIC, and FORTRAN -- imperative languages; LISP and Miranda -- functional (and quasi-functional) languages; and Prolog -- a logic programming language.

You've mentioned in the subsequent text the ideal you'd like to see, where a language becomes almost flexible enough to allow user-definable syntax. Prolog already allows for this to a significant degree, though it is grossly underutilized, judging by the number of virtually unreadable Prolog programs I've been able to take and very nearly convert into English, with prepositions, verbs, and so on. C++ has this feature to a smaller degree; Haskell (and probably Miranda) goes almost as far as Prolog.
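For readers unfamiliar with the kind of user-definable syntax being discussed, the sketch below shows the general idea in Python: an infix-operator table that a program can extend at run time, driving a small precedence-climbing parser. It is loosely inspired by Prolog's op/3 but is not a model of it. The names OPERATORS, define_op and parse are hypothetical, and binding powers here run the opposite way from Prolog priorities (a higher number binds tighter).

```python
import re

# Operator table: name -> (binding power, right_associative).
# Assumption for this sketch: higher binding power binds tighter.
OPERATORS = {"+": (10, False), "*": (20, False)}

def define_op(name, power, right_assoc=False):
    """Register a new infix operator, word or symbol, at run time."""
    OPERATORS[name] = (power, right_assoc)

def tokenize(text):
    # Words/numbers, single parentheses, or runs of symbol characters
    return re.findall(r"\w+|[()]|[^\w\s()]+", text)

def parse(text):
    """Parse text into nested (operator, left, right) tuples."""
    tree, rest = _parse_expr(tokenize(text), 0)
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return tree

def _parse_expr(tokens, min_power):
    left, tokens = _parse_atom(tokens)
    while tokens and tokens[0] in OPERATORS:
        op = tokens[0]
        power, right_assoc = OPERATORS[op]
        if power < min_power:
            break
        # A left-associative operator must not absorb itself on the right
        next_min = power if right_assoc else power + 1
        right, tokens = _parse_expr(tokens[1:], next_min)
        left = (op, left, right)
    return left, tokens

def _parse_atom(tokens):
    if not tokens:
        raise SyntaxError("unexpected end of input")
    head, rest = tokens[0], tokens[1:]
    if head == "(":
        tree, rest = _parse_expr(rest, 0)
        if not rest or rest[0] != ")":
            raise SyntaxError("expected ')'")
        return tree, rest[1:]
    if head == ")" or head in OPERATORS:
        raise SyntaxError(f"expected an operand, found {head!r}")
    return head, rest

# A user adds English-looking operators without touching the parser.
define_op("likes", 5)
define_op("**", 30, right_assoc=True)
print(parse("mary likes wine"))
print(parse("1 + 2 * 3"))
print(parse("2 ** 3 ** 2"))
```

The point of the design is that the grammar lives in a table the user can edit, which is what lets a program read "with prepositions, verbs, and so on" while still parsing unambiguously.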
pardo@cs.washington.edu (David Keppel) (09/30/90)
markh@csd4.csd.uwm.edu (Mark William Hopkins) writes:
> [Prolog allows for user-definable syntax to some degree]

So do (did) many LISPs, where it was possible to redefine the read-eval-print loop to use Your Favorite Syntax.

    ;-D on ( No taxing those sins! ) Pardo
--
pardo@cs.washington.edu
{rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo