[comp.arch] C machine

stevem@auscso.UUCP (12/04/87)

A year or so ago I read about a machine called the Lilith (sp?) that
was developed in Switzerland.  From what I read it seemed that this
machine had been developed with Modula 2 applications in mind.  I was
also led to understand that, because of this design, the machine could
execute compiled Modula programs much faster than any general purpose CPU 
in the same word-size/clock-speed class.

If this is true, I would like to know if anyone out there is working
on a similar machine designed to work more efficiently with C programs.
It seems that this would be the ultimate UNIX machine, eliminating the
need for assembly patches to the kernel.  It could run a completely 
portable version of UNIX and be very quick about it.

If there are problems inherent in the design of C that make such a machine
impossible I would like to know what these are.  If there are no such
problems then why in the world haven't I heard about it?  I read all of 
the UNIX literature I can get my hands on and it seems that someone would
have noticed the potential of such a machine by now.


p.s. I've noticed a lot of people putting trademark notices in their
     notes whenever they use the word UNIX.  I hope I won't be taken
     away in the middle of the night by the secret phone police for 
     not doing so myself. 8-O

mash@mips.UUCP (John Mashey) (12/06/87)

In article <759@auscso.UUCP> stevem@auscso.UUCP (Steven G. Madere) writes:
>A year or so ago I read about a machine called the Lilith (sp?) that
>was developed in Switzerland.  From what I read it seemed that this
>machine had been developed with Modula 2 applications in mind....

>If this is true, I would like to know if anyone out there is working
>on a similar machine designed to work more efficiently with C programs....

>If there are problems inherent in the design of C that make such a machine
>impossible I would like to know what these are.  If there are no such
>problems then why in the world haven't I heard about it?  I read all of 
>the UNIX literature I can get my hands on and it seems that someone would
>have noticed the potential of such a machine by now.

What does it mean to develop a machine that works efficiently with C?
It means using the statistics of real C-compiler generated code,
and real C programs to help drive the design of the architecture.

At least the following, to some extent or other, were so done (with
recent references, which have many references to earlier work).
(Others are invited to post references).
*'d machines are commercially available today.

Bell Labs "C" machines and CRISP
	Ditzel, McLellan, and Berenbaum, "Design Tradeoffs to Support
	the C Programming Language in the CRISP Microprocessor",
	Proc ASPLOS II, ACM SIGARCH 15, 5 (Oct 87) 158-163.

*HP Precision
	Magenheimer, Peters, Pettis, Zuras, "Integer Multiplication and
	Division on the HP Precision Architecture.", Proc ASPLOS II, 90-99.

*MIPS R2000
	Chow, Correll, Himelstein, Killian, Weber, "How Many Addressing
	Modes Are Enough?", Proc. ASPLOS II, 117-121.

*Sun SPARC (via some heritage from Berkeley RISC, at least)
	I don't have an immediate reference to a published paper
	that has statistical analyses; maybe someone from Sun can point at one.

I'm sure there are more, but at least these machines or their predecessors
had considerable modeling work designed to make C good on them.

Now, the more serious issue is whether or not it's a good idea to build
a UNIX machine that's optimized ONLY for C, or in fact for any single
language in particular.  Contrary to popular belief, many UNIX machines
run FORTRAN also, or even (gasp!) COBOL, BASIC, PL/I, Ada, LISP, Prolog, Modula,
etc.  Whether or not you care about these things depends on what markets you
think you're in.  For many of them [RISC bias on], you do fine if you just
make sure loads, stores, branches, and function calls go fast.  For others,
you may have to do more work. [RISC bias off]

Of the machines above, as far as I can tell [originators, please correct me
if I'm wrong], tunings were mostly driven as follows:

C machine & CRISP: C

HP Precision: C, COBOL (there are 1-2 instructions that help
decimal arithmetic in an appropriate RISC fashion); FORTRAN

MIPS R2000: C+FORTRAN, PASCAL; not particularly COBOL, but turns out OK.

SPARC: C, LISP.

In general, unless botched badly, most reasonably modern (CISC or RISC)
designs are at least reasonable for running C.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

drew@wolf.UUCP (Drew Dean) (12/07/87)

I apologize: as a non-vi hacker, I can't get this message to contain a copy
of the one it refers to.  If someone could send me a little description of vi,
I'd be very grateful .....
Anyways, the Lilith (you did spell it correctly) was / is a machine built by
Dr. Niklaus Wirth to run Modula-2 as its SOLE language.  That is, the OS,
compiler, debugger, and everything else are written in Modula.  The machine
has a 16 bit architecture, and has 4 AMD 2900 series bit slice chips.  (That
is 2900, NOT 29000; work on the Lilith started in the late 1970's.)  The 4
board processor has 256 instructions, all chosen to help in the writing of
the Modula-2 compiler.  For example, building a stack frame on procedure
entry is ONE instruction.
Due to the need to support a (at that time) hi-res display (768 by 594, mono.
and interlaced), the system uses a 64 bit wide read data bus, and a 16 bit write
data bus.  With the average instruction length ~ 10 bits, each instruction fetch
got about 6 instructions.  The Lilith was a very CISCy design (process switching
is also < 5 instructions), and was completely stack based.  All math operations
received operands on top of the stack, and pushed the result back on.  If anyone
wants further detail, send me email ....
At any rate, the Lilith was a 1979 technology 16 bit machine running at 6 MHz.
It was blindingly fast; the 5 (yes 5) pass Modula compiler was quicker than a
lot of recent things, like Microsoft C 4.0.  It had a barrel shifter for fast
graphics, and the OS had several interesting features, and came in SOURCE code.

I remember hearing a few years ago about something called the BBN C machine, 
which did essentially the same thing for C.  Can someone supply further details
about it?  Also, the Novix NC4000 and upcoming Buffalo processors do the same
thing for Forth.  The Buffalo is supposed to run at 100 MHz, and require .8
clock cycles/instruction.  That's only rumor, but if it's anything close to that
it will be FAST....

Drew Dean
FROM Disclaimers IMPORT StandardDisclaimer;
UUCP: {ihnp4, sdcsvax}!jack!wolf!drew

jkh@violet.berkeley.edu (Jordan K. Hubbard) (12/07/87)

In article <1061@winchester.UUCP> mash@winchester.UUCP (John Mashey) writes:
>In article <759@auscso.UUCP> stevem@auscso.UUCP (Steven G. Madere) writes:
>>A year or so ago I read about a machine called the Lilith (sp?) that
>>was developed in Switzerland.  From what I read it seemed that this
>>machine had been developed with Modula 2 applications in mind....
>
>>If this is true, I would like to know if anyone out there is working
>>on a similar machine designed to work more efficiently with C programs....
>At least the following, to some extent or other, were so done (with
>recent references, which have many references to earlier work).
>(Others are invited to post references).
>*'d machines are commercially available today.

I think BBN developed a series of machines a few years back that
were supposedly optimized for C. They were available in various
models, though all I've ever seen is the C/60. It's been a long
time since I looked at the specs, so I really couldn't say how
they were "optimized" or what sort of performance they got out of
them.  I *do* know that they didn't exactly sell like hotcakes
and people seem to be using them as IMPs on the Internet.  Does
anyone know if this is because there are favorable reasons for
using them this way?  I suspect that BBN pushes them on people that
need Internet access..

				Jordan Hubbard
				jkh@violet.berkeley.edu


DISCLAIMER: "I don't know what the hell you're talking about.."

henry@utzoo.UUCP (Henry Spencer) (12/08/87)

> ... because of this design, the machine could
> execute compiled Modula programs much faster than any general purpose CPU 
> in the same word-size/clock-speed class.

The real question, though, is how it compares in speed to general-purpose
CPUs that are in the same cost/delivery-time class.  The usual answer is
"not well", unless the language needs some odd feature that conventional
hardware does not do well at.  Even then, cleverness can often substitute
for special hardware.  Worse, even if the thing is competitive when first
built, remember that there are enormous resources pushing the development
of faster and better general-purpose CPUs.  The Lisp machines, for example,
are visibly dying as the combination of clever implementations and steadily
climbing general-purpose performance leaves them behind.

> If this is true, I would like to know if anyone out there is working
> on a similar machine designed to work more efficiently with C programs.

Since C is designed with efficiency on conventional machines in mind, it
is not clear why anyone would bother.  Actually, there was one such, the
BBN C/70, but I don't think it went anywhere.  About the only thing of
real note for running C well is an efficient procedure call, but after
the bad example of the VAX, most everybody does that very carefully anyway.

> It seems that this would be the ultimate UNIX machine, eliminating the
> need for assembly patches to the kernel.  It could run a completely 
> portable version of UNIX and be very quick about it.

Sorry, no.  There are still things that are difficult or impossible to do
in C unless your compiler essentially includes a primitive form of
assembler inside it.  Besides, why bother?  The assembler parts of the
kernel are manageably small already.  "Complete portability" is an illusion;
most of the work in a Unix port is in compilers and in things like memory
management and hardware handling anyway.

Also of significance is that many Unix machines are sold for applications
work.  For example, running big Fortran programs.  One reason for Unix's
appeal is precisely that it is *not* committed to a single language.

Oh, if you did it carefully you might be able to get some small win without
losing too much in other areas, but why bother?  You will get better results
investing that effort in speeding up the 68020, I'm afraid.
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry

johnl@ima.ISC.COM (John R. Levine) (12/09/87)

In article <6203@jade.BERKELEY.EDU> jkh@violet.berkeley.edu (Jordan K. Hubbard) writes:
>I think BBN developed a series of machines a few years back that
>were supposedly optimized for C. ...

The BBN C/70 was an interesting historical freak that really should never have
escaped from the lab. They had a microprogrammable engine originally designed
to emulate the DDP 516's used in Arpanet IMPs. That configuration, the C/30,
has sold reasonably well. As an experiment, somebody wrote a set of microcode
to implement something close to the reverse polish intermediate language of
one of the Unix C compilers, brought up Unix on the box, and the C machine was
born. On the way, they noticed that the 16 bit address space was inadequate
for a modern Unix system, so they stretched it, much the same way that Boeing
or Douglas stretches an airframe, so it ended up with a 20 bit address space
and ten-bit bytes, making it the world's first fully metric-compatible
computer. The original machine was the C/70, and my understanding was that the
C/60 was the same machine in a smaller configuration.

As you might imagine, the 99% of Unix software that assumes that bytes are
eight bits broke in funny ways when confronted with ten-bit bytes, but they
did get 7th edition Unix running reasonably well and used it extensively for
network control on the Arpanet and the many private packet-switched nets that
BBN has sold.
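
(A hypothetical example of the kind of assumption that broke -- masking
"one byte" with an eight-bit constant:

	#include <stdio.h>

	main()
	{
		int c;

		c = getchar() & 0377;	/* "mask to one byte" -- silently
					   drops the top two bits of a
					   ten-bit byte */
		putchar(c);
	}

On an eight-bit-byte machine the mask is a no-op; on the C/70 it quietly
mangles data.)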

They tried with minimal success to sell it as a general Unix engine, but it
failed there for many reasons:

  -- programs broke on ten bit bytes
  -- twenty-bit addresses aren't big enough
  -- heavily microcoded architecture caused lousy performance
  -- it wasn't designed for volume production, so it was expensive to build
     and thus expensive to sell
  -- nonstandard I/O interfaces meant that there weren't a lot of peripherals
     available.

I suppose the lesson here is that when you try to turn a horse into a zebra,
you end up with a camel.
-- 
John R. Levine, IECC, PO Box 349, Cambridge MA 02238-0349, +1 617 492 3869
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.something
The Iran-Contra affair:  None of this would have happened if Ronald Reagan
were still alive.

dmr@alice.UUCP (12/14/87)

djsalomon@watdragon.waterloo.edu, in common with lots of others,
thinks that C was designed to be optimal on the PDP-11, in particular
because of the ++ and -- operators.  Actually, this is much less
so than usually believed.

In particular, ++ and -- were inherited from B, which was invented
before the PDP-11 existed.  Now, the PDP-7 on which B was designed had a
primitive form of autoincrement cell, as did other machines of its era
and before, and this, no doubt, suggested the operators.  However, the
PDP-7 autoincrement was of no use in implementing them; B was
interpreted, not compiled, on that machine.  (Incidentally, what
the PDP-7 provided was the operation

    *(int *)X++

where X was an absolute address between 8 and 15, I believe.)

There is in fact rather little that is PDP-11 specific in C.
Aside from things that are nearly universal these days, it prefers

	1) byte addressed memory
	2) ability to do simple arithmetic on pointers
	3) recursive calls (i.e. ability to run a stack)
	4) ability to use variadic functions

and these, though not universal, are common.  Point 4 can be an annoyance
on pure stack machines where the caller and callee are strongly encouraged
to agree statically on the number of arguments.
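
(A minimal sketch of point 4, using the dpANS <stdarg.h> interface; the
callee discovers its argument count at run time, which is exactly what a
machine that fixes argument counts statically resists:

	#include <stdarg.h>

	int sum(int n, ...)		/* add up n trailing int arguments */
	{
		va_list ap;
		int total = 0;

		va_start(ap, n);
		while (n-- > 0)
			total += va_arg(ap, int);
		va_end(ap);
		return total;
	}
)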

One thing in current (pre-ANSI) C that was clearly influenced
by peculiarities of the PDP-11 was in the rules for the float->double
conversions; even here, I found independent justification, perhaps
not conclusive, for writing the rules as I did.

Another was the notion of signed characters.  Other than these, I can't
think of much.


	Dennis Ritchie

henry@utzoo.uucp (Henry Spencer) (12/16/87)

> I have always thought of C as a language designed to be optimal on the
> PDP/11.  The ++ and -- operators were designed to take advantage of the
> PDP/11's autoincrement and autodecrement features...

Sorry, not so.  Admittedly this one fooled me too, but Dennis set me
straight:  ++ and -- existed even in B, back on the PDP7.  Support for
such operations has been common in DEC hardware since long before the
11, and some did exist on the 7, which may have influenced the design
a bit.

> C also uses null
> terminated strings, which is OK for the PDP/11, but not especially
> efficient for machines with block move instructions.

It depends on what kind of block-move instructions you have.  And one
can argue that NUL-terminated strings are perhaps the right decision
regardless.  Few programs have their running times dominated by string
copying, after all, and NUL termination is convenient in various small
ways.
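
One of those small conveniences: the classic copy loop needs no length
bookkeeping at all.  A sketch, in old-style C to match the era:

	char *copy(dst, src)		/* copy src to dst, return dst */
	char *dst, *src;
	{
		char *d = dst;

		while (*d++ = *src++)	/* the NUL stops the loop and
			;		   gets copied in the bargain */
		return dst;
	}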
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry

crick@bnr-rsc.UUCP (Bill Crick) (12/16/87)

OOPS! I typed my article in the summary! Well I'll retype it here.

Drew Dean mentioned the"Buffalo machine". Does anyone know where I
can get some info on it? I've never heard of it before.
   Thanks   Bill Crick



Computo, Ergo Sum! (V Pbzchgr, Gurersber, V Nz!)

davidsen@steinmetz.steinmetz.UUCP (William E. Davidsen Jr) (12/17/87)

One of the interesting things about B was that there were no types.
There were only objects. Therefore all objects had to be large enough to
handle a pointer. On machines which didn't support hardware byte
addressing there was a function (I *think* called char) which performed
the accesses.

Arrays were allocated as pointers and vectors. If I declared a[10], a
variable a was created, and a vector of length ten.  This is why in C the
name of an array behaves like an address. In addition, in B A[B] is the
same as B[A], since both are evaluated as *(A+B).
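
The same identity survives in C, where a[i] is defined as *(a+i); this
fragment (legal, if unreadable) stores into the same cell twice:

	f()
	{
		int a[10];

		a[2] = 1;	/* *(a + 2) */
		2[a] = 1;	/* *(2 + a) -- the very same element */
	}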

The original B compiler I saw was a total of 17 pages of code, and
produced assembler source.  The output of pass one was pseudo code, and
could be interpreted if desired.  As I recall there was only a false
jump, not a true one, so code resulting in a branch on true, such as
"if (a < b) break;" would generate a jump around an unconditional jump.

In about 1972 I developed a language called IMP, based on the ideas of
B.  I also developed a peephole optimizer which operated on the pseudo
code (pass 1.5).  The compiler was implemented on both GECOS (as it was
spelled then) and CP/M-80, and would cross-compile in either direction
(but only assembler source went from CP/M to GECOS).

This was my first compiler, and in keeping with being typeless but
needing floating point, it used separate floating operators instead.  As
I recall the assignment operator was changed to ":=" a la Pascal, equality
was just "=", and inequality was "<>" like BASIC.  This made the code
somewhat easier to read if you didn't know the language.
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

mac3n@babbage.acc.virginia.edu (Alex Colvin) (12/17/87)

Autoincrement & decrement are probably in C for the same reason they are
in the 11 (& 10 & 8 & GE635 &c.).  They're useful.

I find myself beating on iNtel instructions these days, and have noticed
several things not amenable to C.

	several widths of registers.  this means that the implicit
	widening of char to int actually generates code.  ditto
	float to double.  (a sketch follows this list.)

	nonlinear addressing.  this means that arithmetic on pointers
	is dangerous.  particularly equality.  however, most
	reasonable uses work OK.
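
A sketch of where that widening shows up; both of these innocent-looking
additions cost a real conversion instruction on such a machine:

	widen()
	{
		char c = 'x';
		float f = 1.0;

		int i = c + 1;		/* c is widened to int before the add */
		double d = f + 0.5;	/* f is widened to double before the add */
	}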

On the DPS8 (descendant of GE635) I note that C also assumes byte
addressability and not bit addressability.  You'll see this on most
new designs.

Finally, I like to think of C as the apotheosis of Algol 68.

ok@quintus.UUCP (Richard A. O'Keefe) (12/18/87)

In article <133@babbage.acc.virginia.edu>, mac3n@babbage.acc.virginia.edu (Alex Colvin) writes:
> Finally, I like to think of C as the apotheosis of Algol 68.
"apotheosis", n, deification, act of raising any person or thing
	      to the status of a god.
Algol 68 was a very nice language which let you do all sorts of
things that are important for clear and correct coding (like
dynamically sized arrays, heap allocation that didn't force you
to kick type-checking good-bye) including a number of things
that ANSI C is finally reinventing (prototypes, several sizes
of float).  There is nothing in C (other perhaps than the keywords
'int', 'void', 'struct', and the *word* "cast" -- which means something
utterly different in Algol 68) resembling Algol 68.  It would be
more accurate to describe C as the ANTITHESIS of Algol 68.

Relevance to this discussion:  there was at least one machine
which ran Algol 68 as its "native" language.  I think it was built at
RRE in the UK and was called FLEX.

collinge@uvicctr.UUCP (Doug Collinge) (12/19/87)

In article <133@babbage.acc.virginia.edu> mac3n@babbage.acc.virginia.edu (Alex Colvin) writes:
>
>Finally, I like to think of C as the apotheosis of Algol 68.

You mean that there is someone in North America who has heard of Algol 68?!

I had an Algol 68 phase a while back.  I even imported two compilers from
the UK: Cambridge, and Algol 68S.  I still like A68 - too bad it didn't
catch on here...

-- 
		Doug Collinge
		School of Music, University of Victoria,
		PO Box 1700, Victoria, B.C.,
		Canada,  V8W 2Y2  
		collinge@uvunix.BITNET
		decvax!uw-beaver!uvicctr!collinge
		ubc-vision!uvicctr!collinge

dave@sdeggo.UUCP (David L. Smith) (12/20/87)

In article <261@ivory.SanDiego.NCR.COM>, jan@ivory.SanDiego.NCR.COM (Jan Stubbs) writes:
> Personally, I can't imagine any convenience a null terminated string would 
> have over a string preceded by its length. 

Well, there is no limit imposed on the length of the string.  It also takes
up one less byte per string in overhead (unless you wanted to limit your
string length to 255 characters) which isn't very important today, but
probably was when C was first defined.

Besides, there's nothing that says you can't write a string package which
has the string preceded by the length.  If this were still NUL-terminated,
it would still work with existing functions (by passing the actual beginning
of the string to them).


-- 
David L. Smith
{sdcsvax!man,ihnp4!jack!man, hp-sdd!crash, pyramid}!sdeggo!dave
man!sdeggo!dave@sdcsvax.ucsd.edu 
"rm -r / : A power toy is not a tool."

roy@phri.UUCP (Roy Smith) (12/20/87)

In article <164@sdeggo.UUCP> dave@sdeggo.UUCP (David L. Smith) writes:
> Besides, there's nothing that says you can't write a string package which
> has the string preceded by the length.

	Sure you could, but the problem is that the C compiler only
supports null-terminated string constants (of the form "I am a string").
Either you have to learn to live without string constants, or put up with
having to initialize everything with 'cvt_null2count ("string constant")'
at run time.
-- 
Roy Smith, {allegra,cmcl2,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016

lindsay@K.GP.CS.CMU.EDU (Donald Lindsay) (12/21/87)

There isn't any "right" string representation. But, there are things 
worth saying.

Nul terminated strings: useful, simple.

Length count at front: overhead, but some operations (like concatenation or
block transfer) become easier.

Dope vector: (i.e. a pointer to the string, with length count attached to
the pointer, not to the string): more overhead, but some more operations
are eased. (Substring can be done in-place, with no moved characters.)

For example, suppose you want to read an entire text file to memory, and
then treat it as strings. It makes no sense to copy it piecemeal. So, build
dope vectors that point into the text region. Since you haven't written into
the stuff, you can easily put big chunks out to another file.
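
A minimal sketch of that scheme (the struct and names are mine, not from
any particular system); the length rides with the pointer, so taking a
substring moves no characters at all:

	struct dope {
		char	*base;		/* points into the text region */
		int	len;
	};

	struct dope substr(struct dope s, int off, int n)
	{
		struct dope r;

		r.base = s.base + off;	/* share the original bytes */
		r.len = n;
		return r;
	}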

"All problems in computer science can be solved by adding layers of
indirection" - old joke.

We knew all this stuff years ago, so please, no rebuttals.
-- 
	Don		lindsay@k.gp.cs.cmu.edu    CMU Computer Science

chris@mimsy.UUCP (Chris Torek) (12/21/87)

[I am moving this to comp.lang.c]
>In article <164@sdeggo.UUCP> dave@sdeggo.UUCP (David L. Smith) writes:
>>Besides, there's nothing that says you can't write a string package which
>>has the string preceded by the length.

It is not hard, but it is annoying:

In article <3078@phri.UUCP>, roy@phri.UUCP (Roy Smith) writes:
>... the problem is that the C compiler only supports null-terminated
>string constants (of the form "I am a string").  Either you have to
>learn to live without string constants, or put up with having to
>initialize everything with 'cvt_null2count ("string constant")' at run time.

The following works:

	typedef struct string {
		int	s_len;
		char	*s_str;
	} string;
	#define	STRING(x) { sizeof(x) - 1, x }	/* sizeof counts the terminating NUL */

	string foo = STRING("this is a counted string");

Unfortunately, some compilers (including 4BSD PCC) then generate
the string twice in data space, and the `xstr' program only makes
it worse.  In addition, since automatic aggregate initialisers are
not permitted, and there are no aggregate constants, automatic data
must be initialised with something like this:

	f()
	{
		static string s_init = STRING("initial value for s");
		string s = s_init;
		...

(I believe the dpANS allows automatic aggregate initialisers.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

peter@sugar.UUCP (Peter da Silva) (12/28/87)

In article <554@PT.CS.CMU.EDU>, lindsay@K.GP.CS.CMU.EDU (Donald Lindsay) writes:
> There isn't any "right" string representation. But, there are things 
> worth saying.
> 
> Nul terminated strings: useful, simple.
> 
> Length count at front: overhead, but some operations (like concatenation or
> block transfer) become easier.
> 
> Dope vector: (i.e. a pointer to the string, with length count attached to
> the pointer, not to the string): more overhead, but some more operations
> are eased. (Substring can be done in-place, with no moved characters.)

FORTH maintains static strings as a count "cell" (byte, word) followed by that
many bytes.  Once you start using them, however, they're maintained as dope
vectors on the stack (you convert from one to another with "count"). Some
strings (words read in but not yet interpreted, disk blocks, etc) are also
null terminated because that happens to be a handy way of dealing with them.
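
In C terms, "count" does roughly this (my sketch, not FORTH source):

	char *count(char *cs, int *len)	/* counted string -> dope vector */
	{
		*len = *cs & 0xff;	/* the leading count cell */
		return cs + 1;		/* address of the first character */
	}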

Of course FORTH has capabilities for building initialised static data that
would give a Modula-2 programmer nightmares for weeks.

Some strings are even maintained in a simple VM system, and addressed as
a block and offset. This is how the interpreter reads data from disk blocks.
It just lets the VM word "block" convert the block number into addresses
whenever it needs to read another word. Indirection becomes quite simple.
-- 
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- Disclaimer: These U aren't mere opinions... these are *values*.