[comp.arch] Machine-independent intermediate languages

eric@snark.UUCP (Eric S. Raymond) (09/29/88)

The comp.arch discussion thread "Re: Software distribution" seems to me to
have drifted off into a lot of pointless theologizing. Let's try for a reality
check.

Let's start by asking the question:

	1) What properties distinguish a MLL from a HLL?

That is: how do I look at the semantics, performance, and portability of
a set of languages and sort the MIILs from the HLLs? Next:

	2) Are the portability goals for which MIILs are designed achievable
	   at all, given the diversity of today's architectures?

and, finally 

	3) If the answer to 2 is 'yes', *can those goals be achieved with
	   lower complexity and cost than an HLL compiler?*

If the answer to 3 is 'no', as I suspect, then I submit that we already have
as good an MIIL as we're ever going to get.

It's called 'C'.
-- 
      Eric S. Raymond                     (the mad mastermind of TMN-Netnews)
      UUCP: ...!{uunet,att,rutgers}!snark!eric = eric@snark.UUCP
      Post: 22 S. Warren Avenue, Malvern, PA 19355      Phone: (215)-296-5718

yba@arrow.bellcore.com (Mark Levine) (10/01/88)

You have built some questionable assumptions into your reality
check, and need to define your acronyms:

	MLL -- is this a machine level language?
	MIIL -- is this a machine independent intermediate language,
		and if so, at what level of expressive power, and is
		it any different than the MLL?  How?

I believe in such a thing as a MOL, a machine oriented language, and
in a high level machine oriented language which is portable.  It is
possible to do MUCH better than C.

Eleazor bar Shimon, once and future Carolingian
yba@sabre.bellcore.com

stodghil@svax.cs.cornell.edu (Paul Stodghill) (10/01/88)

Talk about esoteric. Geez.

I have been interested in Machine Independent Intermediate Languages for
2 years now, and I have yet to see any literature of substance. Not to say that
it doesn't exist, just that I haven't been able to find it. Does anyone
have any pointers to papers on the subject? Please email me directly.

						- Paul Stodghill
						  stodghil@cs.cornell.edu

eric@snark.UUCP (Eric S. Raymond) (10/02/88)

In article <898@sword.bellcore.com>, yba@arrow.UUCP (Mark Levine) writes:
> You have built some questionable assumptions into your reality
> check, and need to define your acronyms:

Sorry, I typoed.

> 	MLL -- is this a machine level language?

That 'MLL' should have read 'MIIL'.

> 	MIIL -- is this a machine independent intermediate language,
> 		and if so, at what level of expressive power, and is
> 		it any different than the MLL?  How?

Good question -- in fact, it's precisely the question I was trying to ask
with respect to HLLs. I think more acronym confusion can be best avoided if
I re-pose my three questions with this correction. They are:

	1) What properties distinguish a MIIL from a HLL?

(That is: how do I look at the semantics, performance, and portability of
a set of languages and sort the MIILs from the HLLs?)

	2) Are the portability goals for which MIILs are designed achievable
	   at all, given the diversity of today's architectures?

	3) If the answer to 2 is 'yes', *can those goals be achieved with
	   lower complexity and cost than an HLL compiler?*

The whole debate so far has been about 2). I am trying to suggest that the
critical question is actually 3), that the answer to 3) appears to be 'no',
and that the notion of MIIL is therefore fundamentally rather pointless,
because it distracts us from the *important* questions about designing
portability into HLLs.

> I believe in such a thing as a MOL, a machine oriented language, and
> in a high level machine oriented language which is portable.  It is
> possible to do MUCH better than C.

Fine. I don't so believe (I've seen too many bizarre architectures) but I have
an open mind. Show me!
-- 
      Eric S. Raymond                     (the mad mastermind of TMN-Netnews)
      UUCP: ...!{uunet,att,rutgers}!snark!eric = eric@snark.UUCP
      Post: 22 S. Warren Avenue, Malvern, PA 19355      Phone: (215)-296-5718

djones@megatest.UUCP (Dave Jones) (10/04/88)

From article <e2uEl#2ORJzO=eric@snark.UUCP>, by eric@snark.UUCP (Eric S. Raymond):

> 
> 	1) What properties distinguish a MIIL from a HLL?
> 
        The HLL should be easy for humans to write.

        The MIIL should be in the form most easily generated by compiler
        frontends, which is to say, it should be Polish form, not 
        infix.  Probably postfix-Polish is best.

        If you are going to have different compilers for HLL work and
        MIIL work, you might as well have two different languages, so...

        Now, I'll answer a question you did not ask: Namely, how should
        an MIIL compiler operate differently from an HLL compiler?

           a.  The HIIL compiler may assume that its input is correct, 
           provided that a check-sum is correct (or on very dependable 
           systems, providing that a "magic number" is correct). It need
           not spend valuable cpu cycles checking on correctness.

           b. The HIIL compiler should not do any "optimization" which
           could possibly affect program operation in any way.  In
           particular, it should always evaluate expressions in the
           order in which it is told to evaluate them.  That leaves
           many C compilers right out.  It should do no global optimizations.
           (Global optimizations are machine-independent, and should
           be performed on the MIIL form or on an earlier form, before
           the MIIL compiler sees it.)
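
The postfix form suggested above can be sketched as follows. This is only
an illustration, assuming a hypothetical expression tree made of nested
tuples; the names `to_postfix` and `tree` are not from any real MIIL. It
shows why postfix is easy for a front end to emit: one recursive walk,
operands first, operator last, which also fixes the evaluation order.

```python
# Hypothetical expression tree: a leaf is a name or number (str/int),
# an interior node is a tuple (operator, left, right).

def to_postfix(node):
    """Emit operands first, then the operator (postfix-Polish order)."""
    if isinstance(node, tuple):
        op, left, right = node
        # The left operand is always emitted before the right one, so the
        # evaluation order is fixed by the front end, not by the back end.
        return to_postfix(left) + to_postfix(right) + [op]
    return [node]

# a + b * c
tree = ("+", "a", ("*", "b", "c"))
print(" ".join(str(t) for t in to_postfix(tree)))   # prints: a b c * +
```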
           
> 
> 	2) Are the portability goals for which MIILs are designed achievable
> 	   at all, given the diversity of today's architectures?
> 
        I know from experience that a completely general solution is tough.
        Most of the problems come from screwy pointer-formats. Still, you
        can come up with something that works for "reasonable" 
        architectures.

> 	3) If the answer to 2 is 'yes', *can those goals be achieved with
> 	   lower complexity and cost than an HLL compiler?*
> 
        I think probably so.  Certainly one can design an MIIL which
        is easier to generate code for (in the front-end) than currently
        popular HLL's.

> The whole debate so far has been about 2). I am trying to suggest that the
> critical question is actually 3), that the answer to 3) appears to be 'no',
> and that the notion of MIIL is therefore fundamentally rather pointless,
> because it distracts us from the *important* questions about designing
> portability into HLLs.

As a practical matter, I can't agree.  As we speak, I am writing an 
HLL compiler. I wish I had a good MIIL and some cross-compilers for it.  
Current implementations of C will not do because, 

   1) Generating C code is a pain in the butt (cf. Cfront).
   2) C cross compilers do not exist for the machines in question.
   3) Current C compilers for the target machine are too slow to 
      use interactively in a read-compile-load-run loop ("executing data").

Admittedly, the last two reasons are not due to the nature of C itself, 
but rather to the nature of current compilers.  I would rather see
more work go into new MIIL's, and indeed, new HLL's rather than seeing
yet another batch of C compilers.  Nobody is going to do new C compilers
tuned to be used as MIIL compilers.  For marketing reasons, the new
compilers will all do ANSI prototypes and lots of optimization.

Even if it is possible to define a language which will serve as both
an HLL and an MIIL, (an "MIHLL"), it sure would be nice to have the MIIL 
to boot strap it with.  Perhaps the MIIL would turn out to be a 
subset of the HLL with "checking turned off", I dunno.

itcp@ist.CO.UK (News reading a/c for itcp) (10/04/88)

From article <e2uEl#2ORJzO=eric@snark.UUCP>, by eric@snark.UUCP (Eric S. Raymond):
> 	2) Are the portability goals for which MIILs are designed achievable
> 	   at all, given the diversity of today's architectures?
> 
> 	3) If the answer to 2 is 'yes', *can those goals be achieved with
> 	   lower complexity and cost than an HLL compiler?*
> 
> The whole debate so far has been about 2). I am trying to suggest that the
> critical question is actually 3), that the answer to 3) appears to be 'no',
> and that the notion of MIIL is therefore fundamentally rather pointless,
> because it distracts us from the *important* questions about designing
> portability into HLLs.

There are two independent goals for a MIIL, the original one that started this
discussion, namely a `single universal distribution medium' that didn't entail
giving away source. As this was being discussed in what is essentially a UNIX
environment the assumption was that to all intents and purposes the MIIL only
had to serve for C programs. Here compiler costs are not really a
consideration - but I feel it remains to be shown that obfuscated C source
could not serve this purpose.

I have another interest in MIIL and that is as a Language independent
intermediate code to promote the design and dissemination of new programming
languages. Clearly it would be acceptable for the MIIL implementation
cost to exceed the cost of a single HLL compiler, so long as it was cheaper
than two HLL compilers. If it were more expensive than that I would seriously
doubt its reliability and maintainability.

[Usual disclaimer: this represents only my hastily assembled opinion and
		   spelling, and not necessarily anyonelse's]

	Tom (itcp@uk.co.ist)

yba@arrow.bellcore.com (Mark Levine) (10/04/88)

In article <853@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>From article <e2uEl#2ORJzO=eric@snark.UUCP>, by eric@snark.UUCP (Eric S. Raymond):
>Even if it is possible to define a language which will serve as both
>an HLL and an MIIL, (an "MIHLL"), it sure would be nice to have the MIIL 
>to boot strap it with.  Perhaps the MIIL would turn out to be a 
>subset of the HLL with "checking turned off", I dunno.

An old bootstrapping technique I have seen used well is to have a
"virtual machine", say a single stack machine, which you implement on
each target using whatever the best tools on the target allow.  The
bootstrap compiler produces output in the virtual machine's instruction
set, and you write your code generator in the HLL of the compiler
itself, followed by compiling the compiler.  If you happen to produce
RISC machines and put your virtual machine into real hardware, you get
to skip some steps....  Cross-compilers are replaced by implementing
the virtual machine (which is kept deliberately simple).
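
As a rough sketch of how small such a virtual machine can be, here is a
single-stack interpreter. The instruction names ("push", "add", "sub",
"mul") and the tuple encoding are assumptions made for illustration; they
are not the instruction set of any system mentioned in this thread.

```python
# Hypothetical instruction set for a single-stack virtual machine.
# Each instruction is a tuple: ("push", n), ("add",), ("sub",), ("mul",).

def run(program):
    """Interpret a list of instructions and return the final stack."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(instr[1])
        elif op in ("add", "sub", "mul"):
            right = stack.pop()      # operands come off in reverse order
            left = stack.pop()
            if op == "add":
                stack.append(left + right)
            elif op == "sub":
                stack.append(left - right)
            else:
                stack.append(left * right)
        else:
            raise ValueError("unknown instruction: %r" % (op,))
    return stack

# 2 + 3 * 4, as a bootstrap compiler might emit it
program = [("push", 2), ("push", 3), ("push", 4), ("mul",), ("add",)]
print(run(program))   # prints: [14]
```

Porting then means rewriting `run` (or a translator from these
instructions to native code) on each target, not the compiler itself.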

Is there resistance to the idea of writing the language first, then seeing
what it takes to execute it efficiently, and _then_ designing the hardware?
It seems designers are still fond of assembler, and providing many ways to
do operations for a compiler writer who only needs one fast way.

On the other hand, the LISP machines and the SCHEME chip don't seem to be
setting the world on fire.  What does the current folklore hold?  I have
never really talked with a RISC designer about languages.  I have this
fear they would all want to do an ADA machine first :-).

Eleazor bar Shimon, once and future Carolingian
yba@sabre.bellcore.com

pardo@june.cs.washington.edu (David Keppel) (10/04/88)

djones@goofy.megatest.UUCP (Dave Jones) writes:
>[ the HIIL compiler should(n't)... ]

Uh, great, now we have: HLL, HIIL, MIIL.  What's HIIL?

>[ global optimizations are machine-independent ]

I think that there are probably a *lot* of global optimizations that
*are* machine dependent.  Proof by authority: William Wulf said so.
Proof by trivialization: certain global variables may have their
concrete type assigned based on machine dependencies, and these in turn
will affect local computation; the concrete type assignment may in
some cases be available only after certain kinds of global
analysis[*].

Eventually you take a hit.

[*] Consider a language that supports two types of integers, a
    hardware-supported type and an arbitrary-precision type.  The
    variable may be declared with values outside the hardware type for
    some machines, inside the hardware type for other machines.  Even
    knowing whether the *declaration* fits may not be enough.  If the
    declaration doesn't fit the machine type and the *usage* is always
    within the machine type (which may be determined in at least some
    cases by looking at every assignment to the variable), then
    failure to do this (machine-dependent) global optimization will
    cause the compiler to allocate the arbitrary-precision type,
    which will generally be far less efficient.
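
The footnote's decision can be sketched like this. It is a simplification
under loud assumptions: the value ranges of every assignment are taken as
already known (real compilers would need range analysis to get them), the
hardware type is a signed two's-complement word, and the names `fits` and
`concrete_type` are hypothetical.

```python
# Hypothetical model: a variable is described by its declared range and by
# the ranges of every value assigned to it anywhere in the program.

def fits(lo, hi, word_bits):
    """True if [lo, hi] fits a signed two's-complement word."""
    return -(2 ** (word_bits - 1)) <= lo and hi <= 2 ** (word_bits - 1) - 1

def concrete_type(declared, assigned, word_bits):
    """Pick 'hardware' or 'bignum' for one variable on one target."""
    if fits(declared[0], declared[1], word_bits):
        return "hardware"          # the declaration alone settles it
    # Declaration is too wide for this machine; look at every assignment.
    # This is the global, machine-dependent step.
    if all(fits(lo, hi, word_bits) for lo, hi in assigned):
        return "hardware"
    return "bignum"

declared = (0, 100000)               # declared range of the variable
assigned = [(0, 500), (10, 2000)]    # ranges actually assigned
for bits in (16, 32):
    print(bits, concrete_type(declared, assigned, bits))
```

On the assumed 16-bit target the declaration does not fit, so only the
whole-program look at `assigned` keeps the variable in a hardware word;
the answer depends on `word_bits`, which is the machine dependency.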

	;-D on  ( Suboptimal reality )  Pardo
-- 
		    pardo@cs.washington.edu
    {rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo

pardo@june.cs.washington.edu (David Keppel) (10/05/88)

yba@sabre.bellcore.com (Mark Levine) writes:
>[ virtual machine ]

I'd encourage everybody to go look at the Smalltalk Virtual Machine
(STVM).  Yes, even those of you who think that Smalltalk sucks dead
goats.  The reason is that there are a lot of very clever things that have
been done with Smalltalk and the STVM and I believe that they are all
(well, nearly all :-) reasonable things to think about for language
and virtual machine design.  (Some of them reflect things that were
perhaps not such good ideas and should be avoided, others are good
things that should perhaps be duplicated.)  Anything that calls itself
Smalltalk must act like it runs on a STVM, even if the Smalltalk
source is compiled to native code.

What other machines use virtual machine specifications?  I know Prolog
has an abstract machine, but I don't think that it is in any way
required for a Prolog implementation.

    ;-D on
      ( 25 cents is 2 bits, so flipping a quarter gives a quandary result )
									Pardo
-- 
		    pardo@cs.washington.edu
    {rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo

chase@Ozona.orc.olivetti.com (David Chase) (10/05/88)

In article <5933@june.cs.washington.edu> pardo@uw-june.UUCP (David Keppel) writes:
>What other machines use virtual machine specifications?  I know Prolog
>has an abstract machine, but I don't think that it is in any way
>required for a Prolog implementation.

"BCPL has a simple semantic structure which is based on an idealized machine."
(page 1, _BCPL -- the language and its compiler_, by Martin Richards.)

The "machine language" usually used for this virtual machine (the
compiler's intermediate code) was called OCODE.  Another virtual
machine was used to aid in bootstrapping systems; the machine language
for this machine was called INTCODE.

In case you are wondering, the virtual machine chosen for BCPL can present
problems to efficient implementation on some machines.  There was some
paper in SP&E within the last decade or so describing a port of BCPL to
a Burroughs machine, and the task was Not Pretty.

David

will@uoregon.uoregon.edu (William Clinger) (10/05/88)

In article <905@sword.bellcore.com> yba@sabre.bellcore.com (Mark Levine) writes:
>An old bootstrapping technique I have seen used well is to have a
>"virtual machine", say a single stack machine, which you implement on
>each target using whatever the best tools on the target allow....

This works, but there are usually a few inefficiencies caused by
mismatches between the virtual and actual target machines.  For
example, if the virtual machine has general registers but the
target machine has separate address and data registers, then it is
likely that extra register-register move instructions will be
generated that could have been avoided by a custom code generator.
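
A toy version of that mismatch, for illustration only. The virtual code,
the register names, and the rule that a load needs its base in an address
register are all assumptions; no real machine or code generator is being
modelled.

```python
# Hypothetical virtual code: ("load", dst, base) means dst = memory[base];
# ("add", dst, src) means dst = dst + src.  Virtual registers are "r0", ...
# Assumed target: a load needs its base in an address register ("a..."),
# everything else lives in data registers ("d...").

def naive_translate(code):
    """Map every virtual register to a data register; patch loads."""
    out = []
    for instr in code:
        if instr[0] == "load":
            _, dst, base = instr
            out.append(("move", "a_tmp", "d" + base[1:]))   # extra move
            out.append(("load", "d" + dst[1:], "a_tmp"))
        else:
            out.append(("add", "d" + instr[1][1:], "d" + instr[2][1:]))
    return out

def custom_translate(code):
    """Keep registers used only as load bases in address registers."""
    bases = {i[2] for i in code if i[0] == "load"}
    others = {i[1] for i in code} | {i[2] for i in code if i[0] == "add"}
    addr_only = bases - others

    def reg(r):
        return ("a" if r in addr_only else "d") + r[1:]

    out = []
    for instr in code:
        if instr[0] == "load" and instr[2] not in addr_only:
            out.append(("move", "a_tmp", reg(instr[2])))    # still needed
            out.append(("load", reg(instr[1]), "a_tmp"))
        else:
            out.append((instr[0], reg(instr[1]), reg(instr[2])))
    return out

def count_moves(code):
    return sum(1 for instr in code if instr[0] == "move")

code = [("load", "r0", "r1"), ("load", "r2", "r3"), ("add", "r0", "r2")]
print(count_moves(naive_translate(code)), count_moves(custom_translate(code)))
# prints: 2 0
```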

>Is there resistance to the idea of writing the language first, then seeing
>what it takes to execute it efficiently, and _then_ designing the hardware?

Yes, because people have the impression that hardware designed that way
will run only one language (or language family) efficiently.  Few people
are willing to put up with that unless the language is Fortran or C.

>On the other hand, the LISP machines and the SCHEME chip don't seem to be
>setting the world on fire.  What does the current folklore hold?

Current folklore holds that general purpose processors, coupled with
good compilers, deliver nearly the same performance and cost no more
than special purpose processors for these languages.  LISP machine
users counter that their machines offer performance advantages other
than raw speed: the programming environment, safety (dynamic error
checking), no need to waste programmer time on type declarations, and
so on.

The SCHEME chips were one-of-a-kind by-products of academic research
into VLSI design, and deliberately used an interpreter-oriented and
rather low-performance architecture, so they were never intended to
set the world on fire.

William Clinger

yba@arrow.bellcore.com (Mark Levine) (10/05/88)

In article <e2uEl#2ORJzO=eric@snark.UUCP> eric@snark.UUCP (Eric S. Raymond) writes:
>In article <898@sword.bellcore.com>, yba@arrow.UUCP (Mark Levine) writes:
>
>	2) Are the portability goals for which MIILs are designed achievable
>	   at all, given the diversity of today's architectures?
>
>	3) If the answer to 2 is 'yes', *can those goals be achieved with
>	   lower complexity and cost than an HLL compiler?*
>
>The whole debate so far has been about 2). I am trying to suggest that the
>critical question is actually 3), that the answer to 3) appears to be 'no',
>and that the notion of MIIL is therefore fundamentally rather pointless,
>because it distracts us from the *important* questions about designing
>portability into HLLs.

[Sorry I seemed to ignore this -- we lost our active file a few days ago
and I missed your response.]  I believe we are in agreement here.

>> I believe in such a thing as a MOL, a machine oriented language, and
>> in a high level machine oriented language which is portable.  It is
>> possible to do MUCH better than C.
>
>Fine. I don't so believe (I've seen too many bizarre architectures) but I have
>an open mind. Show me!

I can point at MARY again.  (Or maybe LIS).  We probably need to define what
"better" means here -- I mean to have the most expressive power available to
the program writer as well as to generate fast and tight code.  I do not mean
to limit yourself to expressing semantics which can be efficiently compiled on
all architectures, which may indeed be what MIIL language proponents are looking
for.  If my highly optimized vax program runs half as fast on your sun, I still
think it is portable; if you can change an implementation prelude or a generic
sort package from my vax library to your sun library and it runs just as fast,
I think we are winning.

A machine oriented language, or high level assembler if you prefer, is not
predicated on a portable intermediate representation.  I think all it needs
guarantee is you _can_ get at the machine language ops if you want to, and
that you can supply (presumably non-optimal) replacement semantics for
other architectures.  In other words, you write a code generator and some
implementation specific code for each target.  This is the same as doing an
HLL, yes?  (If you _have_ optimal replacement semantics, you would of course
use them.)  An asm escape does _not_ fit this need.  The big difference is
being able to add high level constructs (language expansions if not language
extensions) with user defined semantics, in the source language, which can go
all the way to the machine language.

C is not all that portable, although it is popular to think of it as both
machine close and easily portable, because the source code writer must
explicitly handle his portability constructs (#ifdef the bytes go _this_
way, except #ifdef I am on an IBM RT unless #ifdef this is a 16-bit machine...).
It lacks most of the features I really want for large systems and toolkits
(give me operator overloads, full typing, generic instantiation, a function
value language that goes left to right, implicit operator definitions and
explicit modules with separate compilation -- not to exclude dynamic linking).

Without a position to protect (at RPI they used to say: "An open mind has but
one disadvantage: it collects dirt") I will go as far as to say if you took
the restrictions (built in for verifiers we will never build) out of ADA, you
could do better in ADA than in C.  I am also willing to learn; show me why not?

Let me go further in agreeing with you and say that getting the kind of
portability I described above ("winning") _must_ be achieved in the HLL.
By the time you get to an IL, you have lost the most important information
of all -- what the writer was trying to do!  The ability to optimize the
operation "make-symbol-table" or "cubic-spline" is much more useful than
the ability to optimize move-indirect-through-autoincrement-register.

Eleazor bar Shimon, once and future Carolingian
yba@sabre.bellcore.com

eric@snark.UUCP (Eric S. Raymond) (10/06/88)

Aha! Finally, an apologia for the MIIL concept that makes sense...

In article <853@goofy.megatest.uucp>, djones@megatest.UUCP (Dave Jones) writes:
> From article <e2uEl#2ORJzO=eric@snark.UUCP>, by me:
> > 	1) What properties distinguish a MIIL from a HLL?
> > 
>         The HLL should be easy for humans to write.
>         The MIIL should be in the form most easily generated by compiler
>         frontends, which is to say, it should be Polish form, not 
>         infix.  Probably postfix-Polish is best.

I consider this a detail. Front-ending is *not* the hard part in HLL
compilation; front ends are easy to write, and easy to port. Code generation
is the hard part. Eliminating the front end, by itself, doesn't pare away
enough complexity and cost to justify the MIIL concept.

>         Now, I'll answer a question you did not ask: Namely, how should
>         an MIIL compiler operate differently from an HLL compiler?
> 
>            a.  The HIIL compiler may assume that its input is correct, 

Again, this is a *front end* issue, and front ends are the *easy* part.
Code generators (which are, typically, IL-to-native-code translators) don't
typically error-check their input stream, either!

Here's a gedankenexperiment for you. Let's suppose we *had* a "universal" MIIL;
now, I write a HLL front end in it, and I ship that with uMIIL. Do I have a
universal HLL compiler? Given that assumption, yes I do.

What this demonstrates is that, aside from an essentially fixed *one-time* cost
to write an HLL front-end, the cost and complexity of writing a uHLL is the
*same* as that of writing a uMIIL!

(BTW, please forgive the proliferation of jargon but I am *not* going to wear
out my typing finger on "universal machine-independent intermediate language"
again)

Now, what's wrong with this picture?

Again, *we left out the hard part* -- which is designing the uMIIL/code
generator and then implementing it on the billyuns and billyuns of bizarre
boxes out there.

>            b. The HIIL compiler should not do any "optimization" which
>            could possibly affect program operation in any way.

So how's this different from an HLL with the optimization turned off?

> > 	2) Are the portability goals for which MIILs are designed achievable
> > 	   at all, given the diversity of today's architectures?
> > 
>         I know from experience that a completely general solution is tough.
>         Most of the problems come from screwy pointer-formats. Still, you
>         can come up with something that works for "reasonable" 
>         architectures.

Certainly. It's called 'C'.

No flames about C's problems, please. I know it's not perfect. But the *fact*
is that it is now filling the niche in the computer science ecology that you're
describing -- and you haven't advanced any compelling reasons to abandon its
HLLness in favor of a search for a hypothetical uMIIL.

> > 	3) If the answer to 2 is 'yes', *can those goals be achieved with
> > 	   lower complexity and cost than an HLL compiler?*
> > 
>         I think probably so.  Certainly one can design an MIIL which
>         is easier to generate code for (in the front-end) than currently
>         popular HLL's.

Let's assume you can. So what? Front-ending (I guess I have to repeat this
again) is *not the real problem*! As long as you have to write a nontrivial
code generator anyhow (and you will, for any hardware that doesn't closely
match the virtual machine defined by the uMIIL) the complexity and cost of
that back end is going to completely swamp the cost of hauling along and using
an HLL front end.
 
> As a practical matter, I can't agree.  As we speak, I am writing an 
> HLL compiler. I wish I had a good MIIL and some cross-compilers for it.  
> Current implementations of C will not do because, 
> 
>    1) Generating C code is a pain in the butt (cf. Cfront).

Maybe your translation methods need work. Have you tried treating your
new-HLL parse tree as data and doing rewrites on subtree pattern matches until
you get legal C? That way C gets to do the grotty parts like symbol table and
name space maintenance. Use LISP for this if you can, then compile the LISP.
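
A minimal sketch of that rewrite-until-legal-C idea, written in Python
here rather than LISP. The tags "unless" and "square" stand in for
constructs of some hypothetical new HLL; the rules and the tiny C emitter
are assumptions for illustration, not a description of Cfront or any real
translator.

```python
# Hypothetical parse tree: tuples of (tag, child, ...); leaves are strings.
# The tags "unless" and "square" stand in for constructs C lacks.

def rewrite(node):
    """One bottom-up pass; each rule yields only C-level tags."""
    if not isinstance(node, tuple):
        return node
    node = (node[0],) + tuple(rewrite(child) for child in node[1:])
    if node[0] == "unless":                 # unless c: s  =>  if (!c) s
        return ("if", ("!", node[1]), node[2])
    if node[0] == "square":                 # square(x)    =>  x * x
        # Duplicates x, so this rule is only safe if x has no side effects.
        return ("*", node[1], node[1])
    return node

def emit_c(node):
    """Print a tree that contains only C-level tags as C source text."""
    if not isinstance(node, tuple):
        return node
    tag = node[0]
    if tag == "if":
        return "if (%s) %s" % (emit_c(node[1]), emit_c(node[2]))
    if tag == "!":
        return "!(%s)" % emit_c(node[1])
    if tag == "=":
        return "%s = %s;" % (emit_c(node[1]), emit_c(node[2]))
    if tag in ("*", "+", "<"):
        return "(%s %s %s)" % (emit_c(node[1]), tag, emit_c(node[2]))
    raise ValueError("not legal C yet: %r" % (tag,))

tree = ("unless", ("<", "x", "0"), ("=", "y", ("square", "x")))
print(emit_c(rewrite(tree)))   # prints: if (!((x < 0))) y = (x * x);
```

The emitted C is ugly but legal, and declarations, scoping, and register
allocation are left to the C compiler.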

>    2) C cross compilers do not exist for the machines in question.

Given what I've shown above, why do you consider uMIIL 'cross-compilers'
to be significantly easier/more feasible? (and do you see why this is the right
question now?)

>    3) Current C compilers for the target machine are too slow to 
>       use interactively in a read-compile-load-run loop ("executing data").

An implementation detail, as you yourself observed.

> Even if it is possible to define a language which will serve as both
> an HLL and an MIIL, (an "MIHLL"), it sure would be nice to have the MIIL 
> to boot strap it with.  Perhaps the MIIL would turn out to be a 
> subset of the HLL with "checking turned off", I dunno.

Ah. Now *that's*, in my opinion, the right lesson to draw from thinking about
MIILs -- that we need more don't-try-to-second-guess-me switches on our HLL
compilers.
-- 
      Eric S. Raymond                     (the mad mastermind of TMN-Netnews)
      UUCP: ...!{uunet,att,rutgers}!snark!eric = eric@snark.UUCP
      Post: 22 S. Warren Avenue, Malvern, PA 19355      Phone: (215)-296-5718

lamaster@ames.arc.nasa.gov (Hugh LaMaster) (10/06/88)

In article <e4ITv#4cfCcm=eric@snark.UUCP> eric@snark.UUCP (Eric S. Raymond) writes:

>I consider this a detail. Front-ending is *not* the hard part in HLL
>compilation; front ends are easy to write, and easy to port. Code generation
>is the hard part. Eliminating the front end, by itself, doesn't pare away
>enough complexity and cost to justify the MIIL concept.
>
Well, certain steps like vectorization and certain other optimizations
logically fall before production of the MIIL and are language dependent.
These parts of the front end must be hard, judging by the number of bugs
associated with production compilers in this area.  So, in some cases 
there is a significant amount of work to writing a language dependent
front end.  Therefore, there is a significant potential benefit to
using a MIIL if you can make it work.



-- 
  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035     
  Phone:  (415)694-6117       

rminnich@super.ORG (Ronald G Minnich) (10/07/88)

In article <30151@oliveb.olivetti.com> chase@Ozona.UUCP (David Chase) writes:
>In case you are wondering, the virtual machine chosen for BCPL can present
>problems to efficient implementation on some machines.  There was some
>paper in SP&E within the last decade or so describing a port of BCPL to
>a Burroughs machine, and the task was Not Pretty.
you betcha. For example, the Burroughs machine supported arrays
and such via segments, but since BCPL really wants a flat address space
they decided to take one BIG segment and use that for a program's
entire address space, thus throwing away all the nice part of that
architecture and paying the performance price for segmentation they
did not use. Lots of other problems but the memory model was 
definitely the worst. I went through this same go-round too when 
we looked at putting C on that architecture. 
   Let's see now, if i change BCPL to C and Burroughs to 386 
in the above, i think that describes what C on the 386 does- each 
process uses one BIG segment for its entire address space, thus not 
using the 386 segments. Most (if not all) of the machines that run
Unix now share certain assumptions that we all pretty much take 
for granted. But if you work on a machine that breaks those assumptions
you hit them pretty fast.
ron

peter@ficc.uu.net (Peter da Silva) (10/10/88)

In article <815@super.ORG>, rminnich@super.ORG (Ronald G Minnich) writes:
>    Let's see now, if i change BCPL to C and Burroughs to 386 
> in the above, i think that describes what C on the 386 does- each 
> process uses one BIG segment for its entire address space, thus not 
> using the 386 segments.

This is a UNIX assumption, not a 'C' one. 'C' works just fine with non-
contiguous segments *if* the segments can be made large enough for any
given memory object. 'C' has definite problems with memory objects bigger
than a segment. Many languages do, actually... most just hide them
better.
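
To make the distinction concrete, here is a toy model of a segmented
heap. Everything in it is assumed for illustration: the 64K limit, the
class name, and the (segment, offset) pointer shape are not taken from
the 386, the Burroughs machines, or any real C implementation.

```python
SEGMENT_LIMIT = 65536   # assumed maximum segment size, in bytes

class SegmentedHeap:
    """Each object gets its own segment; a pointer is (segment, offset)."""

    def __init__(self):
        self.segments = []

    def alloc(self, size):
        # An object bigger than one segment is where the trouble starts.
        if size > SEGMENT_LIMIT:
            raise MemoryError("object larger than one segment")
        self.segments.append(bytearray(size))
        return (len(self.segments) - 1, 0)

    def store(self, ptr, value):
        seg, off = ptr
        self.segments[seg][off] = value   # IndexError past the segment end

    def load(self, ptr):
        seg, off = ptr
        return self.segments[seg][off]

heap = SegmentedHeap()
p = heap.alloc(10)
heap.store((p[0], 3), 42)
print(heap.load((p[0], 3)))   # prints: 42
```

Nothing here needs the segments to be adjacent, which is the point: the
contiguity requirement comes from interfaces like sbrk(), not from the
pointer model itself.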

UNIX, however, likes each address space to be contiguous. Look at the
behaviour of sbrk(), for example. If you can get your programs all using
a higher level construct (say, malloc) this system call can be removed
and the 386 can be more effectively utilised. Unfortunately, there are
programs (/bin/sh, for example) that depend on sbrk. (actually, what
/bin/sh does can't be explained in polite company).
-- 
Peter da Silva  `-_-'  Ferranti International Controls Corporation.
"Have you hugged  U  your wolf today?"            peter@ficc.uu.net

yba@arrow.bellcore.com (Mark Levine) (10/10/88)

In article <358@istop.ist.CO.UK> itcp@ist.CO.UK (News reading a/c for itcp) writes:
>I have another interest in MIIL and that is as a Language independent
>intermediate code to promote the design and dissemination of new programming
>languages. Clearly it would be acceptable for the MIIL implementation
>cost to exceed the cost of a single HLL compiler, so long as it was cheaper
>than two HLL compilers. If it were more expensive than that I would seriously
>doubt its reliability and maintainability.
>
Earlier in this discussion I mentioned MARY.  After recently talking to the
fellow who taught it to me, I should add that he is working on the third
generation of the language, and that to make this "new programming language"
available, the target for the compiler is C.  Given that most new machines
these days get C as part of the initial language suite, even though it is
not all things to all of us, perhaps (I was wrong and) we already have a
_de facto_ MIIL.

If the machine model under a particular C compiler is doing the "Wrong" thing
(ala the discussion of Burroughs segments), you would have to have assembler
escapes or your own code generator to get around it.  This still sounds like
a cheaper way to get started than porting a virtual machine.  Perhaps the better
topic is how much it costs to do better than C, rather than whether one can.
In the context of the quoted article, I would consider C not to be an HLL.
In the same context, let me agree with earlier postings and ask what are the
objections to calling C the MIIL in question and saying it is already done
and has zero new costs?  Certainly an interesting way to distribute reliability
and maintainability costs.  Perhaps also a good reason to be locked into a
"standard" C.

Eleazor bar Shimon, once and future Carolingian
yba@sabre.bellcore.com

db@lfcs.ed.ac.uk (Dave Berry (LFCS)) (10/10/88)

In article <e4ITv#4cfCcm=eric@snark.UUCP> eric@snark.UUCP (Eric S. Raymond) writes:
>Certainly. It's called 'C'.
>
>No flames about C's problems, please. I know it's not perfect. But the *fact*
>is that it is now filling the niche in the computer science ecology that you're
>describing -- and you haven't advanced any compelling reasons to abandon its
>HLLness in favor of a search for a hypothetical uMIIL.

C isn't the only language filling this niche.  LISP is another.  C is
presumably better for languages where efficiency is a prime concern,
and LISP for those requiring garbage collection, etc.   The choice
is also affected by the availability of implementations for the desired
hardware; LISP would presumably be a better choice for a LISP machine.

As an aside, I've heard both disparagingly described as "portable assemblers"
(and I've heard their proponents take that as a compliment).
Dave Berry,	Laboratory for Foundations of Computer Science, Edinburgh.
		db%lfcs.ed.ac.uk@nss.cs.ucl.ac.uk
		<Atlantic Ocean>!mcvax!ukc!lfcs!db

db@lfcs.ed.ac.uk (Dave Berry) (10/11/88)

In article <5933@june.cs.washington.edu> pardo@uw-june.UUCP (David Keppel) writes:
>
>What other machines use virtual machine specifications?  I know Prolog
>has an abstract machine, but I don't think that it is in any way
>required for a Prolog implementation.

Some formal specification techniques use abstract machines.  These can
often be implemented.  By definition, any implementation of the language
must mirror the behaviour of an implementation of the formal semantics
on a "semantic abstract machine".

One example of such a machine is the Typol system written by Gilles Kahn
and his group at INRIA.  This converts natural semantics specifications
to Prolog and runs them in an integrated environment.  (The environment
is called Centaur; Typol is the semantic specification language).
They've specified the Standard ML core language, among others.
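
As a rough illustration of what it means to run a semantic specification,
the sketch below writes a few natural-semantics-style rules for an invented
expression language as a Python evaluator, one branch per rule. It is not
Typol and does not go through Prolog; the rule set, the tuple encoding and
the name evaluate are assumptions made for the example.

```python
# Hypothetical rules for a toy language; each branch mirrors one inference rule
# of the form  env |- expr => value.

def evaluate(env, expr):
    """Return v such that  env |- expr => v  under the toy rules below."""
    tag = expr[0]
    if tag == "num":                     # env |- n => n
        return expr[1]
    if tag == "var":                     # env(x) = v  gives  env |- x => v
        return env[expr[1]]
    if tag == "add":                     # both premises must evaluate first
        return evaluate(env, expr[1]) + evaluate(env, expr[2])
    if tag == "let":                     # env |- e1 => v1 ; env[x := v1] |- e2 => v
        _, name, bound, body = expr
        inner = dict(env)                # extend a copy, leave env untouched
        inner[name] = evaluate(env, bound)
        return evaluate(inner, body)
    raise ValueError("no rule applies to %r" % (expr,))

program = ("let", "x", ("num", 2), ("add", ("var", "x"), ("num", 40)))
result = evaluate({}, program)
print(result)
```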

Dave Berry,	Laboratory for Foundations of Computer Science, Edinburgh.
		db%lfcs.ed.ac.uk@nss.cs.ucl.ac.uk
		<Atlantic Ocean>!mcvax!ukc!lfcs!db

cik@l.cc.purdue.edu (Herman Rubin) (10/12/88)

In article <831@etive.ed.ac.uk>, db@lfcs.ed.ac.uk (Dave Berry (LFCS)) writes:
> In article <e4ITv#4cfCcm=eric@snark.UUCP> eric@snark.UUCP (Eric S. Raymond) writes:

			.....................

> C isn't the only language filling this niche.  LISP is another.  C is
> presumably better for languages where efficiency is a prime concern,
> and LISP for those requiring garbage collection, etc.   The choice
> is also affected by the availability of implementations for the desired
> hardware; LISP would presumably be a better choice for a LISP machine.
> 
> As an aside, I've heard both disparagingly described as "portable assemblers"
> (and I've heard their proponents take that as a compliment).

Neither C nor LISP can be described correctly as a portable assembler.  A 
portable assembler should have the property that anything which the machine
can do can be relatively easily described in the language in such a way that
the resulting object code does what the programmer, understanding the machine
instructions, timing, and limitations, wants it to do, and how he wants it
done.

Thus a portable assembler must be able to produce efficient versatile code.
It should also be easy to write, with any construct that the  programmer 
feels useful relatively easy to insert.  I would find such a gadget very
useful.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)

eric@snark.UUCP (Eric S. Raymond) (10/12/88)

In <912@sword.bellcore.com>, yba@sabre.bellcore.com (Mark Levine) writes:
>      [...] the target for the compiler is C.  Given that most new machines
> these days get C as part of the initial language suite, even though it is
> not all things to all of us, perhaps (I was wrong and) we already have a
> _de facto_ MIIL.                    [...]                Perhaps the better
> topic is how much it costs to do better than C, rather than whether one can.

Precisely the point I have been trying to make.

And, on a related topic:

Peter ("Have you hugged your wolf today?") da Silva seems to think the point
of a uMIIL is to provide a medium for selling software, a way for it to be
distributed in machine-independent form that nosy hackers can't read and
modify.

Excuse me, but I thought the security problem in for-sale software was to guard
it from unauthorized *copying* and *use*, not unauthorized *understanding*! A
uMIIL does nothing for the real problem, since by definition it has to be easy
to copy and run on lots of machines.
-- 
      Eric S. Raymond                     (the mad mastermind of TMN-Netnews)
      UUCP: ...!{uunet,att,rutgers}!snark!eric = eric@snark.UUCP
      Post: 22 S. Warren Avenue, Malvern, PA 19355      Phone: (215)-296-5718

gwyn@smoke.ARPA (Doug Gwyn ) (10/13/88)

In article <970@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
-A portable assembler should have the property that anything which the machine
-can do can be relatively easily described in the language in such a way that
-the resulting object code does what the programmer, understanding the machine
-instructions, timing, and limitations, wants it to do, and how he wants it
-done.

I defy you to come up with one that can be used for all of:
IBM System/38, Burroughs B6700, Motorola MC68000, Intel iAPX432.
Yet C and LISP are (barely) doable for all these architectures.

diamond@csl.sony.JUNET (Norman Diamond) (10/14/88)

In article <831@etive.ed.ac.uk>, db@lfcs.ed.ac.uk (Dave Berry (LFCS)) writes:

(About C and Lisp...)
> As an aside, I've heard both disparagingly described as "portable assemblers"
> (and I've heard their proponents take that as a compliment).

Both languages' inventors created them expressly to be assemblers with
a portable syntax.  The description is not disparaging at all.  Critics
who intend it to be disparaging in fact only reveal their own gross
ignorance.

The phrase "portable assembler" is unfortunately ambiguous.  This has
led users to expect C PROGRAMS to be as portable as the language's
SYNTAX.  Since their demands have been listened to, C is losing its
original capabilities.  Anyone who wants to write portable PROGRAMS
should use another language.  Lisp pretty well fits the bill of
machine-independence, despite its original purpose of assisting the
coding of machine-language (not assembler-language) programs.
-- 
-------------------------------------------------------------------------------
  The above opinions are my own.   |   Norman Diamond
  If they're also your opinions,   |   Sony Computer Science Laboratory, Inc.
  you're infringing my copyright.  |   diamond%csl.sony.jp@relay.cs.net

gwyn@smoke.BRL.MIL (Doug Gwyn ) (10/28/88)

In article <10037@socslgw.csl.sony.JUNET> diamond@csl.sony.JUNET (Norman Diamond) writes:
>The phrase "portable assembler" is unfortunately ambiguous.  This has
>led users to expect C PROGRAMS to be as portable as the language's
>SYNTAX.  Since their demands have been listened to, C is losing its
>original capabilities.

I'm not aware of any capabilities that have been lost.

>Anyone who wants to write portable PROGRAMS should use another language.

I have to take strong exception to this.  C probably offers more support
for writing significant portable programs than any other language.  It
has a nice balance of standardization and flexible accommodation of
variant machine architectures and environments.

guy@auspex.UUCP (Guy Harris) (10/29/88)

>Both languages' inventors created them expressly to be assemblers with
>a portable syntax.  ...
>
>The phrase "portable assembler" is unfortunately ambiguous.  This has
>led users to expect C PROGRAMS to be as portable as the language's
>SYNTAX.

You mean users like Dennis Ritchie, Steve Johnson, etc.?  Those foolish
people; had they known that C programs were really assembler-language
programs, they would never have tried to make them work on multiple
machines....

To quote from Johnson's "C Program Portability":

	  As soon as C compilers were available on other machines, a
	number of programs, some of them quite substantial, were moved
	from UNIX to the new environments.  In general, we were quite
	pleased with the ease with which programs could be transferred
	between machines.

It goes on to say that the difficulties they ran into in porting were:

	1) As the language evolved, compilers changed so there were
	   incompatibilities between the compilers due to features
	   that had made it into one compiler but not into another yet.

	2) The machines ran different operating systems.

The latter was described as the most serious difficulty, and led them to
note "gee, UNIX is written in C, how about porting *it* to other
machines" - a decision whose ramifications most of us can testify to....

>Since their demands have been listened to, C is losing its
>original capabilities.

For example?

>Anyone who wants to write portable PROGRAMS should use another language.

You may believe this, but there are a vast number of people whose
experience indicates that it is simply not true.  Given that, you may
want to re-evaluate your belief....  (Remember, BTW, that "is-portable" is
not a Boolean predicate; not all programs may be portable to completely
arbitrary architectures, although the range of architectures that
support C and UNIX is fairly impressive.)

aarons@cvaxa.sussex.ac.uk (Aaron Sloman) (11/10/88)

If you want to design a machine-independent intermediate language  (or
target  virtual  machine  for  compilers)   you  are  pulled  in   two
directions:-

a. Make it a  high level VM  so that compiling  to it from  high level
   programming languages is relatively easy.

b. Make it a low level VM so that translating to machine code is
   relatively easy, making porting to new machines easy etc.

Poplog  solves  the  dilemma  by  providing  both,  with  a   machine-
independent, language-independent optimising  compiler going from  the
high level to the low level.

So adding a new language (producing  a new "front end") is  relatively
easy using tools provided for compiling to the high level. And porting
to a new machine (producing a new "back end") is relatively easy.

This architecture, designed and implemented  mostly by John Gibson  at
Sussex University,  has been  used to  implement portable  INCREMENTAL
compilers for a collection of interactive languages, which can then be
mixed  if  necessary  for  solving  problems  that  require  different
languages for different sub problems.

           {POP-11, COMMON LISP, PROLOG, ML, SYSPOP}
                               |             (used to implement Poplog)
                (Machine independent) Compile to
                               |
                               V
                    [High level VM (PVM)]   \
                     (extended for SYSPOP)   | machine and language
                               |             | independent
                     Optimise & compile to   |
                               |             |
                               V             |
                    [Low level VM (PIM)]     |
                     (modified for SYSPOP)   /
                               |
             (Language independent) Compile (translate) to
                               |
                               V
                 [Native machine instructions]
                  [or assembler - for SYSPOP]


So Poplog has two machine independent virtual machines (or if you like
intermediate languages). The high  level Poplog Virtual Machine  (PVM)
is suitable as a target for (incremental) compilers and has been  used
for Prolog, Common Lisp, Pop-11 (a Lisp-like language with a Pascal-like
syntax), Standard ML (version 1) and other simpler special purpose
languages. Syspop  is  an extended  dialect  of Pop-11  enhanced  with
C-like facilities for  pointer manipulation,  etc. Syspop  is used  to
implement  the  core  of  Poplog.   E.g.  the  garbage  collector   is
implemented in Syspop.

The low  level Poplog  Implementation Machine  (PIM) is  a  convenient
virtual architecture with instructions that translate without too much
trouble to instructions  for a typical  general purpose computer.  The
level is about the same as that of VAX.

There is a machine  independent and language independent  intermediate
compiler which compiles from the high level PVM to the low level  PIM,
optimising on the way. A machine-specific back-end then translates the
low-level  VM  to  native  machine   code,  except  when  porting   or
re-building the  system.  In  the  latter  case  the  final  stage  is
translation to assembly language. (See diagram above.)
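
The three stages described above can be sketched as a toy pipeline. This
is only an illustration of the shape of the architecture, not Poplog's
actual instruction sets: the stack instructions (PUSH_CONST, PUSH_VAR,
APPLY), the register form, the "add"/"mul" mnemonics and the function
names are all invented for the example, and the only optimisation shown
is constant folding.

```python
# Hypothetical two-level pipeline; every instruction name here is invented.

def front_end(expr):
    """Language-specific stage: nested tuples -> high-level stack code."""
    if isinstance(expr, int):
        return [("PUSH_CONST", expr)]
    if isinstance(expr, str):
        return [("PUSH_VAR", expr)]
    op, a, b = expr
    return front_end(a) + front_end(b) + [("APPLY", op)]

def lower(hl_code):
    """Shared middle stage: stack code -> register code, folding constants."""
    stack, out, next_reg = [], [], 0
    for instr, arg in hl_code:
        if instr == "PUSH_CONST":
            stack.append(("const", arg))
        elif instr == "PUSH_VAR":
            stack.append(("var", arg))
        else:                                    # APPLY of a binary operator
            if arg not in ("+", "*"):
                raise ValueError("unsupported operator: %r" % (arg,))
            b, a = stack.pop(), stack.pop()
            if a[0] == "const" and b[0] == "const":
                folded = a[1] + b[1] if arg == "+" else a[1] * b[1]
                stack.append(("const", folded))  # no instruction emitted
            else:
                reg = "r%d" % next_reg
                next_reg += 1
                out.append((arg, reg, a, b))
                stack.append(("reg", reg))
    return out, stack.pop()

def back_end(ll_code):
    """Machine-specific stage for an invented machine: emit assembler text."""
    mnemonic = {"+": "add", "*": "mul"}
    def operand(o):
        kind, value = o
        return "#%d" % value if kind == "const" else str(value)
    return ["%s %s, %s, %s" % (mnemonic[op], dst, operand(a), operand(b))
            for op, dst, a, b in ll_code]

hl = front_end(("+", "x", ("*", 2, 3)))
ll, result_location = lower(hl)
asm = back_end(ll)
print(asm)
```

In this sketch a new language only needs a new front_end and a new
machine only needs a new back_end; lower is shared by both, which is the
property the two-level design is after.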

Poplog was originally implemented  on a VAX  running VMS(tm), but  has
since been ported to a range of computers with versions of Unix(tm), e.g.
VAX + Berkeley  Unix, Sun-2, Sun-3,  Sun-4, Sun-386i, Hewlett  Packard
M680?0 + Unix  workstations, Apollo +  Unix, GEC-63, Sequent  Symmetry
(multiple 80386 + Unix), Orion 1/05 (Clipper processor + Unix).

To illustrate,  porting to  Sun-4 took  two people,  not working  full
time, about  6  weeks.  A  programmer  who  had  never  ported  Poplog
previously, nor worked  with Intel 80386  processors previously,  took
about four and a half months to  port it to Sequent Symmetry (much  of
the time without a Sequent computer to check things out). Then porting
to Sun386i took under two weeks.

If a user implements a "front-end" compiler from a new language to the
PVM  then  that  language  automatically  runs  on  all  the  machines
supporting Poplog, and inherits an integrated editor, rich development
environment, indefinite precision arithmetic, window manager, etc. One
user claimed  it took  him about  three  weeks in  his spare  time  to
produce a  Poplog  compiler  for  Scheme. Once  he  had  Scheme  being
translated to the PVM, he did not  need to bother about getting it  to
translate to machine code on the machines running Poplog.

NOTE
 The compilers in  Poplog are incremental  in that you  can compile  or
recompile individual  procedures in  the process  in which  previously
compiled procedures are  or have been  running. New procedure  records
are created in the heap, and linked  in to the rest of the system  via
identifier records, without  having to go  through a separate  linking
process. Interactive commands given in the language are compiled, then
run, and the compiled procedure is then discarded.

This use  of incremental  compilation is  fairly common  with Lisp  or
Prolog systems.  It  gives almost  as  much flexibility  as  using  an
interpreter, but  programs  generally run  much  faster, as  they  are
compiled to machine code.
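
The linking-through-identifier-records idea can be sketched as follows.
This is an assumption-laden toy, not Poplog's mechanism: the Identifier
class and the define and call helpers are invented names, and Python's
built-in compile() produces bytecode rather than machine code. What it
does show is that a procedure can be recompiled in a running process and
existing callers pick up the new version without a separate link step.

```python
# Hypothetical sketch: callers reach procedures through identifier records.

class Identifier:
    """A named cell; callers go through it rather than holding the code."""
    def __init__(self, name):
        self.name = name
        self.value = None

identifiers = {}

def define(name, source):
    """Compile one procedure from source and (re)bind its identifier record."""
    namespace = {"call": call}
    exec(compile(source, "<%s>" % name, "exec"), namespace)
    identifiers.setdefault(name, Identifier(name)).value = namespace[name]

def call(name, *args):
    """Look the procedure up at call time, so recompiled code takes effect."""
    return identifiers[name].value(*args)

define("area", "def area(r):\n    return 3 * r * r\n")
define("report", "def report(r):\n    return 'area=%d' % call('area', r)\n")
before = call("report", 2)          # uses the first version of area

define("area", "def area(r):\n    return 4 * r * r\n")   # recompile one procedure
after = call("report", 2)           # report itself was not recompiled
print(before, after)
```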

POPLOG provides  an  extensive  kit  of tools  that  can  be  used  by
languages that need  them including a  lightweight process  mechanism,
extendable record  and vector  classes, hash-tables,  ratios,  complex
numbers,  floating  point  numbers,   arrays,  an  extensive  set   of
procedures for  interacting  with  the  operating  system,  an  object
oriented library, list  processing, a  pattern matcher,  etc. Some  of
these are built in whilst others are autoloaded on demand. Not all the
functionality is provided  directly by  virtual machine  (intermediate
language)  instructions.  Instead  utilities  are  provided  that  are
themselves defined (e.g. in Syspop) and then compiled through the  two
intermediate stages.

We have not (yet) tried using Poplog to implement a compiler for a
non-interactive language like C or Pascal. In principle it should be
possible, at a cost in efficiency. Instead Poplog allows procedures
written in these languages to be linked in for use as subroutines.


Aaron Sloman,
School of Cognitive and Computing Sciences,
Univ of Sussex, Brighton, BN1 9QN, England
    ARPANET : aarons%uk.ac.sussex.cvaxa@nss.cs.ucl.ac.uk
              aarons%uk.ac.sussex.cvaxa%nss.cs.ucl.ac.uk@relay.cs.net
    JANET     aarons@cvaxa.sussex.ac.uk
    BITNET:   aarons%uk.ac.sussex.cvaxa@uk.ac
        or    aarons%uk.ac.sussex.cvaxa%ukacrl.bitnet@cunyvm.cuny.edu
    UUCP:     ...mcvax!ukc!cvaxa!aarons
            or aarons@cvaxa.uucp