stevem@auscso.UUCP (12/04/87)
A year or so ago I read about a machine called the Lilith (sp?) that was
developed in Switzerland. From what I read it seemed that this machine had
been developed with Modula-2 applications in mind. I was also led to
understand that, because of this design, the machine could execute compiled
Modula programs much faster than any general-purpose CPU in the same
word-size/clock-speed class.

If this is true, I would like to know if anyone out there is working on a
similar machine designed to work more efficiently with C programs. It seems
that this would be the ultimate UNIX machine, eliminating the need for
assembly patches to the kernel. It could run a completely portable version
of UNIX and be very quick about it.

If there are problems inherent in the design of C that make such a machine
impossible, I would like to know what these are. If there are no such
problems, then why in the world haven't I heard about it? I read all of the
UNIX literature I can get my hands on, and it seems that someone would have
noticed the potential of such a machine by now.

p.s. I've noticed a lot of people putting trademark notices in their notes
whenever they use the word UNIX. I hope I won't be taken away in the middle
of the night by the secret phone police for not doing so myself. 8-O
mash@mips.UUCP (John Mashey) (12/06/87)
In article <759@auscso.UUCP> stevem@auscso.UUCP (Steven G. Madere) writes:
>A year or so ago I read about a machine called the Lilith (sp?) that
>was developed in Switzerland. From what I read it seemed that this
>machine had been developed with Modula 2 applications in mind....
>If this is true, I would like to know if anyone out there is working
>on a similar machine designed to work more efficiently with C programs....
>If there are problems inherent in the design of C that make such a machine
>impossible I would like to know what these are. If there are no such
>problems then why in the world haven't I heard about it? I read all of
>the UNIX literature I can get my hands on and it seems that someone would
>have noticed the potential of such a machine by now.

What does it mean to develop a machine that works efficiently with C? It
means using the statistics of real C-compiler-generated code, and of real C
programs, to help drive the design of the architecture. At least the
following, to some extent or other, were so done (with recent references,
which have many references to earlier work; others are invited to post
references). *'d machines are commercially available today.

Bell Labs "C" machines and CRISP:
    Ditzel, McLellan, and Berenbaum, "Design Tradeoffs to Support the C
    Programming Language in the CRISP Microprocessor", Proc. ASPLOS II,
    ACM SIGARCH 15, 5 (Oct 87), 158-163.

*HP Precision:
    Magenheimer, Peters, Pettis, Zuras, "Integer Multiplication and
    Division on the HP Precision Architecture", Proc. ASPLOS II, 90-99.

*MIPS R2000:
    Chow, Correll, Himelstein, Killian, Weber, "How Many Addressing Modes
    Are Enough?", Proc. ASPLOS II, 117-121.

*Sun SPARC (via some heritage from Berkeley RISC, at least):
    I don't have an immediate reference to a published paper that has
    statistical analyses; maybe someone from Sun can point at one.

I'm sure there are more, but at least these machines or their predecessors
had considerable modeling work designed to make C good on them.

Now, the more serious issue is whether or not it's a good idea to build a
UNIX machine that's optimized ONLY for C, or in fact for any single language
in particular. Contrary to popular belief, many UNIX machines run FORTRAN
also, or even (gasp!) COBOL, BASIC, PL/I, ADA, LISP, Prolog, Modula, etc.
Whether or not you care about these things depends on what markets you think
you're in. For many of them [RISC bias on], you do fine if you just make
sure loads, stores, branches, and function calls go fast. For others, you
may have to do more work. [RISC bias off]

Of the machines above, as far as I can tell [originators, please correct me
if I'm wrong], tunings were mostly driven as follows:

C machine & CRISP:  C
HP Precision:       C, COBOL (there are 1-2 instructions that help decimal
                    arithmetic in an appropriate RISC fashion); FORTRAN
MIPS R2000:         C+FORTRAN, PASCAL; not particularly COBOL, but turns
                    out OK
SPARC:              C, LISP

In general, unless botched badly, most reasonably modern (CISC or RISC)
designs are at least reasonable for running C.
-- 
-john mashey    DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP:   {ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:    408-991-0253 or 408-720-1700, x253
USPS:   MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
drew@wolf.UUCP (Drew Dean) (12/07/87)
I apologize: as a non-vi hacker, I can't get this message to contain a copy
of the one it refers to; if someone could send me a little description of vi
I'd be very grateful.....

Anyway, the Lilith (you did spell it correctly) was/is a machine built by
Dr. Niklaus Wirth to run Modula-2 as its SOLE language. That is, the OS,
compiler, debugger, and everything else are written in Modula. The machine
has a 16-bit architecture, and has 4 AMD 2900-series bit-slice chips. (That
is 2900, NOT 29000; work on the Lilith started in the late 1970's.) The
4-board processor has 256 instructions, all chosen to help in the writing of
the Modula-2 compiler. For example, building a stack frame on procedure
entry is ONE instruction.

Due to the need to support an (at that time) hi-res display (768 by 594,
mono and interlaced), the system uses a 64-bit-wide read data bus and a
16-bit write data bus. With the average instruction length ~10 bits, each
instruction fetch got about 6 instructions. The Lilith was a very CISCy
design (process switching is also < 5 instructions), and was completely
stack based. All math operations received operands on top of the stack, and
pushed the result back on. If anyone wants further detail, send me email....

At any rate, the Lilith was a 1979-technology 16-bit machine running at
6 MHz. It was blindingly fast: the 5 (yes, 5) pass Modula compiler was
quicker than a lot of recent things, like Microsoft C 4.0. It had a barrel
shifter for fast graphics, and the OS had several interesting features, and
came in SOURCE code.

I remember hearing a few years ago about something called the BBN C machine,
which did essentially the same thing for C. Can someone supply further
details about it? Also, the Novix NC4000 and upcoming Buffalo processors do
the same thing for Forth. The Buffalo is supposed to run @ 100 MHz and
require .8 clock cycles/instruction. That's only rumor, but if it's anything
close to that it will be FAST....

Drew Dean
FROM Disclaimers IMPORT StandardDisclaimer;
UUCP: {ihnp4, sdcsvax}!jack!wolf!drew
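To make that stack-based evaluation model concrete, here is a minimal C
sketch of the kind of fetch-execute loop such a machine does in hardware.
The opcode names and encoding are invented for illustration; they are not
actual Lilith M-code:

    /* Toy stack machine in the Lilith style: every arithmetic operation
     * pops its operands off an evaluation stack and pushes the result
     * back on.  Opcodes are made up for this sketch. */
    #define PUSHC   0       /* push the constant that follows */
    #define ADD     1
    #define MUL     2
    #define HALT    3

    int run(code)
    int *code;
    {
            int stack[32];
            register int *sp = stack;       /* one past the top */

            for (;;) {
                    switch (*code++) {
                    case PUSHC: *sp++ = *code++;        break;
                    case ADD:   sp--; sp[-1] += sp[0];  break;  /* pop 2, push 1 */
                    case MUL:   sp--; sp[-1] *= sp[0];  break;
                    case HALT:  return sp[-1];          /* result on top */
                    }
            }
    }

    /* a*b + c with a=2, b=3, c=4 becomes: */
    int prog[] = { PUSHC, 2, PUSHC, 3, MUL, PUSHC, 4, ADD, HALT };
    /* run(prog) yields 10 */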
jkh@violet.berkeley.edu (Jordan K. Hubbard) (12/07/87)
In article <1061@winchester.UUCP> mash@winchester.UUCP (John Mashey) writes:
>In article <759@auscso.UUCP> stevem@auscso.UUCP (Steven G. Madere) writes:
>>A year or so ago I read about a machine called the Lilith (sp?) that
>>was developed in Switzerland. From what I read it seemed that this
>>machine had been developed with Modula 2 applications in mind....
>
>>If this is true, I would like to know if anyone out there is working
>>on a similar machine designed to work more efficiently with C programs....
>
>At least the following, to some extent or other, were so done (with
>recent references, which have many references to earlier work).
>(Others are invited to post references).
>*'d machines are commercially available today.

I think BBN developed a series of machines a few years back that were
supposedly optimized for C. They were available in various models, though
all I've ever seen is the C/60. It's been a long time since I looked at the
specs, so I really couldn't say how they were "optimized" or what sort of
performance they got out of them. I *do* know that they didn't exactly sell
like hotcakes, and people seem to be using them as INP's on the internet.
Does anyone know if this is because there are favorable reasons for using
them this way? I suspect that BBN pushes them on people that need internet
access..

Jordan Hubbard
jkh@violet.berkeley.edu

DISCLAIMER: "I don't know what the hell you're talking about.."
henry@utzoo.UUCP (Henry Spencer) (12/08/87)
> ... because of this design, the machine could
> execute compiled Modula programs much faster than any general purpose CPU
> in the same word-size/clock-speed class.

The real question, though, is how it compares in speed to general-purpose
CPUs that are in the same cost/delivery-time class. The usual answer is "not
well", unless the language needs some odd feature that conventional hardware
does not do well at. Even then, cleverness can often substitute for special
hardware. Worse, even if the thing is competitive when first built, remember
that there are enormous resources pushing the development of faster and
better general-purpose CPUs. The Lisp machines, for example, are visibly
dying as the combination of clever implementations and steadily climbing
general-purpose performance leaves them behind.

> If this is true, I would like to know if anyone out there is working
> on a similar machine designed to work more efficiently with C programs.

Since C is designed with efficiency on conventional machines in mind, it is
not clear why anyone would bother. Actually, there was one such, the BBN
C-70, but I don't think it went anywhere. About the only thing of real note
for running C well is an efficient procedure call, but after the bad example
of the VAX, most everybody does that very carefully anyway.

> It seems that this would be the ultimate UNIX machine, eliminating the
> need for assembly patches to the kernel. It could run a completely
> portable version of UNIX and be very quick about it.

Sorry, no. There are still things that are difficult or impossible to do in
C unless your compiler essentially includes a primitive form of assembler
inside it. Besides, why bother? The assembler parts of the kernel are
manageably small already. "Complete portability" is an illusion; most of the
work in a Unix port is in compilers and in things like memory management and
hardware handling anyway.

Also of significance is that many Unix machines are sold for applications
work, for example running big Fortran programs. One reason for Unix's appeal
is precisely that it is *not* committed to a single language.

Oh, if you did it carefully you might be able to get some small win without
losing too much in other areas, but why bother? You will get better results
investing that effort in speeding up the 68020, I'm afraid.
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry
johnl@ima.ISC.COM (John R. Levine) (12/09/87)
In article <6203@jade.BERKELEY.EDU> jkh@violet.berkeley.edu (Jordan K. Hubbard) writes:
>I think BBN developed a series of machines a few years back that
>were supposedly optimized for C. ...

The BBN C/70 was an interesting historical freak that really should never
have escaped from the lab. They had a microprogrammable engine originally
designed to emulate the DDP 516's used in Arpanet IMPs. That configuration,
the C/30, has sold reasonably well. As an experiment, somebody wrote a set
of microcode to implement something close to the reverse-Polish intermediate
language of one of the Unix C compilers, brought up Unix on the box, and the
C machine was born. On the way, they noticed that the 16-bit address space
was inadequate for a modern Unix system, so they stretched it, much the same
way that Boeing or Douglas stretches an airframe, so it ended up with a
20-bit address space and ten-bit bytes, making it the world's first fully
metric-compatible computer. The original machine was the C/70, and my
understanding was that the C/60 was the same machine in a smaller
configuration.

As you might imagine, the 99% of Unix software that assumes that bytes are
eight bits broke in funny ways when confronted with ten-bit bytes, but they
did get 7th Edition Unix running reasonably well and used it extensively for
network control on the Arpanet and the many private packet-switched nets
that BBN has sold. They tried with minimal success to sell it as a general
Unix engine, but it failed there for many reasons:

-- programs broke on ten-bit bytes
-- twenty-bit addresses aren't big enough
-- the heavily microcoded architecture caused lousy performance
-- it wasn't designed for volume production, so it was expensive to build
   and so to sell
-- nonstandard I/O interfaces meant that there weren't a lot of peripherals
   available

I suppose the lesson here is that when you try to turn a horse into a zebra,
you end up with a camel.
-- 
John R. Levine, IECC, PO Box 349, Cambridge MA 02238-0349, +1 617 492 3869
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.something
The Iran-Contra affair: None of this would have happened if Ronald Reagan
were still alive.
dmr@alice.UUCP (12/14/87)
djsalomon@watdragon.waterloo.edu, in common with lots of others, thinks that
C was designed to be optimal on the PDP-11, in particular because of the ++
and -- operators. Actually, this is much less so than usually believed.

In particular, ++ and -- were inherited from B, which was invented before
the PDP-11 existed. Now, the PDP-7 on which B was designed had a primitive
form of autoincrement cell, as did other machines of its era and before, and
this, no doubt, suggested the operators. However, the PDP-7 autoincrement
was of no use in implementing them; B was interpreted, not compiled, on that
machine. (Incidentally, what the PDP-7 provided was the operation
*(int *)X++ where X was an absolute address between 8 and 15, I believe.)

There is in fact rather little that is PDP-11 specific in C. Aside from
things that are nearly universal these days, it prefers

1) byte-addressed memory
2) ability to do simple arithmetic on pointers
3) recursive calls (i.e. ability to run a stack)
4) ability to use variadic functions

and these, though not universal, are common. Point 4 can be an annoyance on
pure stack machines where the caller and callee are strongly encouraged to
agree statically on the number of arguments.

One thing in current (pre-ANSI) C that was clearly influenced by
peculiarities of the PDP-11 was in the rules for the float->double
conversions; even here, I found independent justification, perhaps not
conclusive, for writing the rules as I did. Another was the notion of signed
characters. Other than these, I can't think of much.

	Dennis Ritchie
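For a concrete picture of what ++ and -- buy, the classic idiom is the
pointer-stepping copy loop; on hardware with autoincrement addressing each
*p++ can become a single operation, though nothing in the C source requires
that. A minimal sketch in period K&R style (the function name is invented):

    /* Copy the NUL-terminated string s to d, returning d.  The
     * *d++ = *s++ step maps naturally onto autoincrement addressing
     * where the hardware has it, but compiles fine anywhere. */
    char *copystr(d, s)
    register char *d, *s;
    {
            char *d0 = d;

            while (*d++ = *s++)     /* copies the terminating NUL too */
                    ;
            return d0;
    }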
henry@utzoo.uucp (Henry Spencer) (12/16/87)
> I have always thought of C as a language designed to be optimal on the
> PDP/11. The ++ and -- operators were designed to take advantage of the
> PDP/11's autoincrement and autodecrement features...

Sorry, not so. Admittedly this one fooled me too, but Dennis set me
straight: ++ and -- existed even in B, back on the PDP-7. Support for such
operations has been common in DEC hardware since long before the 11, and
some did exist on the 7, which may have influenced the design a bit.

> C also uses null
> terminated strings, which is OK for the PDP/11, but not especially
> efficient for machines with block move instructions.

It depends on what kind of block-move instructions you have. And one can
argue that NUL-terminated strings are perhaps the right decision regardless.
Few programs have their running times dominated by string copying, after
all, and NUL termination is convenient in various small ways.
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry
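One of those small conveniences, as a sketch (the function name is
invented): any loop can find the end of a string without carrying a count
around, because the terminator marks it.

    /* strlen-style scan: the NUL lets the loop discover the length
     * rather than having to be told it. */
    int slen(s)
    char *s;
    {
            register char *p = s;

            while (*p)
                    p++;
            return p - s;
    }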
crick@bnr-rsc.UUCP (Bill Crick) (12/16/87)
OOPS! I typed my article in the summary! Well, I'll retype it here. Drew
Dean mentioned the "Buffalo machine". Does anyone know where I can get some
info on it? I've never heard of it before. Thanks.

Bill Crick
Computo, Ergo Sum!
(V Pbzchgr, Gurersber, V Nz!)
davidsen@steinmetz.steinmetz.UUCP (William E. Davidsen Jr) (12/17/87)
One of the interesting things about B was that there were no types. There
were only objects. Therefore all objects had to be large enough to handle a
pointer. On machines which didn't support hardware byte addressing there was
a function (I *think* called char) which performed the accesses.

Arrays were allocated as pointers and vectors. If I declared a[10], a
variable a was created, and a vector of length ten. This is why in C the
name of an array behaves like an address. In addition, in B A[B] is the same
as B[A], since both are evaluated as *(A+B).

The original B compiler I saw was a total of 17 pages of code, and produced
assembler source. The output of pass one was pseudo-code, and could be
interpreted if desired. As I recall there was only a false jump, not a true,
so code resulting in a branch on true, such as "if (a < b) break;", would
generate a jump around an unconditional jump.

In about 1972 I developed a language called IMP, based on the ideas of B. I
also developed a peephole optimizer which operated on the pseudo-code (pass
1.5). The compiler was implemented on both GECOS (as it was spelled then)
and CP/M-80, and would cross-compile in either direction (but only assembler
source went from CP/M to GECOS). This was my first compiler, and in keeping
with being typeless but needing floating point, it used floating operators
instead. As I recall the assignment operator was changed to ":=" a la
Pascal, equality was just "=", and inequality was "<>" like BASIC. This made
the code somewhat easier to read if you didn't know the language.
-- 
bill davidsen (wedu@ge-crd.arpa)
{uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me
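That *(A+B) definition is easy to demonstrate in C itself; this little
fragment is a present-day illustration, not B code:

    #include <stdio.h>

    static char a[] = "hello";

    main()
    {
            int i = 1;

            /* a[i] is defined as *(a + i); since addition commutes,
             * i[a] names the very same element.  Both lines print 'e'. */
            printf("%c\n", a[i]);
            printf("%c\n", i[a]);
            return 0;
    }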
mac3n@babbage.acc.virginia.edu (Alex Colvin) (12/17/87)
Autoincrement & decrement are probably in C for the same reason they are in
the 11 (& 10 & 8 & GE635 &c.). They're useful.

I find myself beating on iNtel instructions these days, and have noticed
several things not amenable to C:

Several widths of registers. This means that the implicit widening of char
to int actually generates code. Ditto float to double.

Nonlinear addressing. This means that arithmetic on pointers is dangerous,
particularly equality. However, most reasonable uses work OK.

On the DPS8 (descendant of GE635) I note that C also assumes byte
addressability and not bit addressability. You'll see this on most new
designs.

Finally, I like to think of C as the apotheosis of Algol 68.
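A sketch of the widenings in question (variable names invented): in pre-ANSI
C both chars are promoted to int and both floats to double before the
arithmetic, and on a machine with several register widths each promotion can
cost a real instruction.

    f()
    {
            char   c1 = 'a', c2 = 'b';
            float  f1 = 1.5, f2 = 2.5;
            int    i;
            double d;

            /* K&R C widens char to int and float to double before
             * operating on them; with distinct 8/16/32-bit registers
             * each widening may be an explicit instruction. */
            i = c1 + c2;    /* two char -> int widenings, int add */
            d = f1 + f2;    /* two float -> double widenings, double add */
            return i;
    }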
ok@quintus.UUCP (Richard A. O'Keefe) (12/18/87)
In article <133@babbage.acc.virginia.edu>, mac3n@babbage.acc.virginia.edu (Alex Colvin) writes:
> Finally, I like to think of C as the apotheosis of Algol 68.

"Apotheosis", n.: deification, the act of raising any person or thing to the
status of a god.

Algol 68 was a very nice language which let you do all sorts of things that
are important for clear and correct coding (like dynamically sized arrays,
and heap allocation that didn't force you to kiss type-checking good-bye),
including a number of things that ANSI C is finally reinventing (prototypes,
several sizes of float). There is nothing in C (other perhaps than the
keywords 'int', 'void', 'struct', and the *word* "cast" -- which means
something utterly different in Algol 68) resembling Algol 68. It would be
more accurate to describe C as the ANTITHESIS of Algol 68.

Relevance to this discussion: there was at least one machine which ran Algol
68 as its "native" language. I think it was built at RRE in the UK and was
called FLEX.
collinge@uvicctr.UUCP (Doug Collinge) (12/19/87)
In article <133@babbage.acc.virginia.edu> mac3n@babbage.acc.virginia.edu (Alex Colvin) writes:
>
>Finally, I like to think of C as the apotheosis of Algol 68.

You mean that there is someone in North America who has heard of Algol 68?!
I had an Algol 68 phase a while back. I even imported two compilers from the
UK: Cambridge, and Algol 68S. I still like A68 -- too bad it didn't catch on
here...
-- 
Doug Collinge
School of Music, University of Victoria,
PO Box 1700, Victoria, B.C., Canada, V8W 2Y2
collinge@uvunix.BITNET
decvax!uw-beaver!uvicctr!collinge
ubc-vision!uvicctr!collinge
dave@sdeggo.UUCP (David L. Smith) (12/20/87)
In article <261@ivory.SanDiego.NCR.COM>, jan@ivory.SanDiego.NCR.COM (Jan Stubbs) writes:
> Personally, I can't imagine any convenience a null terminated string would
> have over a string preceded by its length.

Well, there is no limit imposed on the length of the string. It also takes
up one less byte per string in overhead (unless you wanted to limit your
string length to 255 characters), which isn't very important today, but
probably was when C was first defined.

Besides, there's nothing that says you can't write a string package which
has the string preceded by the length. If this were still NULL-terminated,
it would still work with existing functions (by passing the actual beginning
of the string to them).
-- 
David L. Smith
{sdcsvax!man,ihnp4!jack!man, hp-sdd!crash, pyramid}!sdeggo!dave
man!sdeggo!dave@sdcsvax.ucsd.edu
"rm -r / : A power toy is not a tool."
roy@phri.UUCP (Roy Smith) (12/20/87)
In article <164@sdeggo.UUCP> dave@sdeggo.UUCP (David L. Smith) writes:
> Besides, there's nothing that says you can't write a string package which
> has the string preceded by the length.

Sure you could, but the problem is that the C compiler only supports
null-terminated string constants (of the form "I am a string"). Either you
have to learn to live without string constants, or put up with having to
initialize everything with 'cvt_null2count ("string constant")' at run time.
-- 
Roy Smith, {allegra,cmcl2,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016
lindsay@K.GP.CS.CMU.EDU (Donald Lindsay) (12/21/87)
There isn't any "right" string representation. But there are things worth
saying.

Nul-terminated strings: useful, simple.

Length count at front: overhead, but some operations (like concatenation or
block transfer) become easier.

Dope vector (i.e. a pointer to the string, with the length count attached to
the pointer, not to the string): more overhead, but some more operations are
eased. (Substring can be done in place, with no moved characters.)

For example, suppose you want to read an entire text file into memory, and
then treat it as strings. It makes no sense to copy it piecemeal. So, build
dope vectors that point into the text region. Since you haven't written into
the stuff, you can easily put big chunks out to another file.

"All problems in computer science can be solved by adding layers of
indirection" - old joke.

We knew all this stuff years ago, so please, no rebuttals.
-- 
Don		lindsay@k.gp.cs.cmu.edu
CMU Computer Science
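A minimal C sketch of the dope-vector representation (the type and function
names are invented for illustration); note that substring really is just
pointer arithmetic, and no characters move:

    /* Length lives with the pointer, not with the characters, so many
     * dope vectors can share one underlying buffer. */
    struct dope {
            char    *ptr;   /* first character */
            int     len;    /* character count; no terminator needed */
    };

    /* Substring in place: assumes 0 <= off and off + n <= s.len. */
    struct dope substr(s, off, n)
    struct dope s;
    int off, n;
    {
            struct dope r;

            r.ptr = s.ptr + off;    /* point into the original storage */
            r.len = n;              /* no characters are copied */
            return r;
    }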
chris@mimsy.UUCP (Chris Torek) (12/21/87)
[I am moving this to comp.lang.c]

>In article <164@sdeggo.UUCP> dave@sdeggo.UUCP (David L. Smith) writes:
>>Besides, there's nothing that says you can't write a string package which
>>has the string preceded by the length.

It is not hard, but it is annoying. In article <3078@phri.UUCP>,
roy@phri.UUCP (Roy Smith) writes:
>... the problem is that the C compiler only supports null-terminated
>string constants (of the form "I am a string"). Either you have to
>learn to live without string constants, or put up with having to
>initialize everything with 'cvt_null2count ("string constant")' at run time.

The following works:

	typedef struct string {
		int	s_len;
		char	*s_str;
	} string;

	#define	STRING(x)	{ sizeof(x) - 1, x }

	string foo = STRING("this is a counted string");

Unfortunately, some compilers (including 4BSD PCC) then generate the string
twice in data space, and the `xstr' program only makes it worse. In
addition, since automatic aggregate initialisers are not permitted, and
there are no aggregate constants, automatic data must be initialised with
something like this:

	f()
	{
		static string s_init = STRING("initial value for s");
		string s = s_init;
		...

(I believe the dpANS allows automatic aggregate initialisers.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain: chris@mimsy.umd.edu	Path: uunet!mimsy!chris
peter@sugar.UUCP (Peter da Silva) (12/28/87)
In article <554@PT.CS.CMU.EDU>, lindsay@K.GP.CS.CMU.EDU (Donald Lindsay) writes:
> There isn't any "right" string representation. But, there are things
> worth saying.
>
> Nul terminated strings: useful, simple.
>
> Length count at front: overhead, but some operations (like concatenation or
> block transfer) become easier.
>
> Dope vector: (i.e. a pointer to the string, with length count attached to
> the pointer, not to the string): more overhead, but some more operations
> are eased. (Substring can be done in-place, with no moved characters.)

FORTH maintains static strings as a "cell" (byte, word) followed by that
many bytes. Once you start using them, however, they're maintained as dope
vectors on the stack (you convert from one to the other with "count"). Some
strings (words read in but not yet interpreted, disk blocks, etc.) are also
null terminated, because that happens to be a handy way of dealing with
them. Of course FORTH has capabilities for building initialised static data
that would give a Modula-2 programmer nightmares for weeks.

Some strings are even maintained in a simple VM system, and addressed as a
block and offset. This is how the interpreter reads data from disk blocks.
It just lets the VM word "block" convert the block number into addresses
whenever it needs to read another word. Indirection becomes quite simple.
-- 
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- Disclaimer: These U aren't mere opinions... these are *values*.