brnstnd@stealth.acf.nyu.edu (02/17/90)
I'm bored, it's a cloudy day, and I can't stand Ada.

This is about the most formal announcement there'll be of a new, still unnamed language. I'll bet that almost every programmer's tastes can be satisfied by a single language, and I'm willing to go the distance to find out.

So what do you want in a compiled, imperative, perhaps object-oriented language? Take C as a starting point for good ideas and feel free to use parts of any other language. Remember: This isn't Ada. If it gets too complicated, trash it. Simple is beautiful. Modular design is beautiful. And above all, remember that this is going to be a language people can actually like.

Don't bother complaining that there are too many languages already. I know. I'm just jumping on the bandwagon, with the unusual twist that the evolving design will incorporate (with credit) the ideas of programmers around the world.

Undue formality is out: I'm not a standards committee. I'm not too worried about logistics: if and when this project heats up, I'll start imposing a bit more organization. Until then, I'll just archive the discussion.

---Dan
kjj@varese.UUCP (Kevin Johnson) (02/17/90)
In article <22569:05:10:24@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
>So what do you want in a compiled, imperative, perhaps object-oriented
>language? Take C as a starting point for good ideas and feel free to
>use parts of any other language. Remember: This isn't Ada. If it gets
>too complicated, trash it. Simple is beautiful. Modular design is
>beautiful. And above all, remember that this is going to be a language
>people can actually like.

Rhetorical question: Aren't you talking about C++?

Semi-rhetorical question: What would be this language's intended use?

1. How about string operators.
   I hate handling allocing of space for something silly like strings...
2. Ability to dynamically define new operators
3. Ability to use existing C libraries and headers.
   Otherwise, I want:
   a. screen handling poop
   b. internet poop
   c. X poop
   d. :-)

Seriously, I would consider the ability to link in existing libraries, one way or another, an absolute must.

#include <standard_disclaimer>

Kevin Johnson                                   ...!mcdphx!QIS1!kjj
QIS System Administrator    Motorola MCD        kjj@phx.mcd.mot.com
jhallen@wpi.wpi.edu (Joseph H Allen) (02/18/90)
In article <22569:05:10:24@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
>I'm bored, it's a cloudy day, and I can't stand Ada.
>So what do you want in a compiled, imperative, perhaps object-oriented
>language? Take C as a starting point for good ideas and feel free to
>use parts of any other language. Remember: This isn't Ada. If it gets
>too complicated, trash it. Simple is beautiful. Modular design is
>beautiful. And above all, remember that this is going to be a language
>people can actually like.

Ok, I'll bite. Here's a compiled language I'd like to see:

(1) No semicolons.

(2) Except for end of line comments.    /* These comments are evil */

(3) Block structure indicated by indentation level:

        while a!=b
          int q                 ; Multi-line body
          q=z*5
          r+=foo(q)
        q=6

        while a!=b r=foo(z*5)   ; Single line body
        q=6

        if a==b c=d

        if a==b
          c=d
          r=500
        else
          q=r
          s=t

        etc.

     So that you can put blocks on single lines, [ and ] can also be used
     to indicate block structure in the conventional way.

(4) Overloadable AND definable operators

(5) All characters allowed in symbols. For example, a typical definition
    might be:

        int :^&%&^*?: = 8

    This is so that operators can be defined. There shouldn't be separate
    character sets for operators and identifiers. I.E., instead of
    detecting the end of identifiers with the presence of operator or
    whitespace characters, the longest possible string which can be a
    symbol is detected:

        if these are symbols:

                abc
                def
                abcdef

        then when the input sees:

                abc     ; abc is recognized
                abcdef  ; abcdef is recognized
                defabc  ; def is recognized and then abc is recognized

    This requires that special separators be used to delimit symbols in
    declarations (or wherever they first appear). Perhaps to save typing
    there might be a default identifier character set which doesn't
    require these delimiters.

    Symbol recognition should occur before constant recognition. I.E.,
    this way you can define:

        int :4: = 5     ; Make 4 equal to 5

(6) Nifty C declarations which allow one type to be shared among multiple
    declarations, each of which might have an initializer.

        Bad:
                it : integer;
                this : integer;
        Good:
                int it = 7, this = 5, that = 0, theother = 10

(7) However, the convoluted C declaration system needs to be replaced:

        instead of:
                int **foo[]     an array of pointers to int pointers
        do this:
                [] * * int foo

(8) Eliminate arrays. They aren't needed. Use pointers and macros instead.

(9) For constants:

        $hex
        decimal
        %binary
        'c'     ; Character
        'abc'   ; String

    (sorry, no octal. You could do it with 0777 but that's gross)

    These are equivalent strings:

        'a' \ 'b' \ 'c' \ 13 \ 10 \ 0
        'abc' \ 13 \ 10 \ 0

    I.E., no escape sequences needed. Strings are just integer constants
    concatenated together. And constant expressions can be used in these
    constants, so:

        const int CR = 13
        const int LF = 10
        const int EOS = 0

        'abc' \ CR \ LF \ EOS

    I would prefer ',' for the concatenation character but it's needed
    elsewhere.

(10) Standard operators. Grouped together in equal precedence:

        ( )             Precedence
        [ ]             Block and precedence
        `               Get symbol from previous scope level (C++'s '::')
        @               Get object at address (C's '*')
        #               Return address of object (C's '&')
        .               Member selector. No need for '->'. Why does C
                        use -> anyway?
        ~               Bit-wise not
        -               Negate
        sizeof          Size of argument on right
        base            Distance between member indicated on right and
                        base address of structure
        >> <<           Shift right and shift left
        *               Multiply
        /               Divide
        //              Modulus
        &               Bit-wise and
        +               Add
        -               Subtract
        |               Bit-wise or
        ^               Bit-wise exclusive or
        = += -= |= ^= *= /= //= &= >>= <<= &&= ||=
                        Assignments
        : +: -: |: ^: *: /: //: &: >>: <<: &&: ||:
                        Assignments which work the other way:
                        a += b  means add b to a and return the result
                        a +: b  means add b to a but return the original
                                value of a
        == >= <= != > < Comparison
        !               Logical not
        &&              Logical and
        ||              Logical or

(11) Blocks return the last value generated:

        a = [
          int q
          q=r
          r=t
          t=q
        ]               ; a gets r

(12) Statements return their last value:

        a = if b==c 500 else 1000       ; if b equals c a gets 500
                                        ; otherwise it gets 1000

     (This way, there is no need for the '?:' operator)

(13) Like C++, declarations can be made anywhere.

(14) Statements

        if expr expr else expr
        do expr until expr
        while expr expr
        return          (C's 'return expr' is 'expr return' in this language)
        break
        continue
        goto expr       (gotos take code addresses)

(15) Structure and code generation rules:

        int a
        int b

     These are always right next to each other and a is at a lower
     address. (GNU C actually puts b at a lower address.) The rules for
     this are the same as in structures.

     Structure members are placed in the order they appear in the
     definition; they are never sorted. Bytes are first packed and then
     padded on machines with alignment problems. I.E.,

        typedef IT
          int a
          char b
          char c
          char d
          int e

     (oh, did I mention that there is no 'struct' symbol? Use typedef and
     blocks instead)

     b, c and d are all in one integer. That integer has 1 extra byte of
     padding in it.

(16) Basic types should be:

        int expr        ; a signed integer of at least expr bits
        uint expr       ; an unsigned integer of at least expr bits

     A set of macros might be used for the machine standard types.

(17) More types shit:

        const           ; for addressable constants
        inline          ; for small non-addressable constants
                        ; (and inline functions)
        register        ; non-addressable variable
        (blank)         ; fully addressable variable
        macro           ; same as inline but with no type checking

        op LEFT RIGHT RETURN    ; an operator or function
                                ; LEFT indicates left-side arguments
                                ; RIGHT indicates right-side arguments
                                ; RETURN is the return type

        op void RIGHT RETURN    ; This is a traditional function

(18) There should be a symbol for the automatic conversion stuff. This way
     you can control how conversions work:

        op void NSTRING s int CONVERT = atoi(s.text)

     This overloads the conversion function CONVERT to allow automatic
     conversion from NSTRINGs (strings with a number in them, say) to
     integers.

(19) prec SYMBOL expr

     sets the precedence of operator SYMBOL to expr (a number).

(20) In this function, the right argument is a pointer to a string (s is
     an address of (#) a character (int 8)) and it returns a 32 bit
     integer. When it's called you actually give it an address.

        op void # int 8 s int 32 atoi = ...

     This defines the '+=' operator. a is a reference to an int. When you
     call it you put a variable on the left as usual:

        x += y

     but the function will actually receive the address of the variable:

        op @ int 32 a int 32 b int :+=: = @a = @a + b

     (and this is the '+:' operator)

        op @ int 32 a int 32 b int ::=: =
          int 32 tmp
          tmp=@a                ; Remember original value of left side
          @a = @a + b           ; Add
          tmp                   ; Return original value

     There should also be a modifier so that the '@' is automatically
     assumed in the function (I.E., like pointers in Pascal):

        op ref @ int 32 a int 32 b int ::=: =
          int 32 tmp
          tmp=a                 ; don't need @a since 'ref' is there
          a = a + b
          tmp

(21) More about structures

     - Classes == structures

     - There should be a word 'inherit' which copies the contents of the
       indicated structure definition into the new one. I.E.:

        typedef me
          int a
          int b

        typedef you
          inherit me
          int c

       is the same as

        typedef you
          int a
          int b
          int c

     - Inherits with clashing members are not allowed. Use instances
       instead.

     - Function arguments are really structures. If a function returns a
       structure, that structure is placed on the stack, not in a global
       variable.

     - Member functions are indicated in function declarations. There
       should be another type qualifier which indicates that a function
       gets a pointer to the structure and that all members of that
       structure look like local variables to the function.

     - To get the instance.message form, function pointers should be used
       in the structure.

     - There should be a way to indicate default structure values for when
       structures are created. Possibly this could be done in a
       constructor/destructor system.

(22) Named arguments. You should be able to call a function in two ways:

        func(10,20,30)                  ; positional arguments
        func(`a=20, `c=30, `b=20)       ; arguments are specifically named

There's much, much more to do and there are problems with what I have. But this is roughly what my ideal language should look like. The general goal is to make it both one step above assembly language and completely extendable.

--
"Come on Duke, lets do those crimes" - Debbie
"Yeah... Yeah, lets go get sushi... and not pay" - Duke
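The longest-match rule in point (5) can be sketched in a few lines. This is only an illustration in Python, not part of the proposal: the helper name `recognize`, the use of a plain set as the symbol table, and the rule that blanks merely separate symbols are all assumptions made for the example.

```python
def recognize(text, symbols):
    """Split text into known symbols, always taking the longest match.

    `symbols` stands in for the symbol table; `recognize` is a
    hypothetical helper invented for this sketch.
    """
    # Try longer candidates first so 'abcdef' wins over 'abc'.
    by_length = sorted(symbols, key=len, reverse=True)
    out = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():      # assumption: blanks only separate symbols
            pos += 1
            continue
        for sym in by_length:
            if text.startswith(sym, pos):
                out.append(sym)
                pos += len(sym)
                break
        else:
            raise ValueError(f"no known symbol at position {pos}")
    return out


table = {"abc", "def", "abcdef", "+="}
print(recognize("abc", table))       # ['abc']
print(recognize("abcdef", table))    # ['abcdef']
print(recognize("defabc", table))    # ['def', 'abc']
print(recognize("abc+=def", table))  # ['abc', '+=', 'def']
```

Note that this greedy scan never backtracks, so an unlucky symbol table could make it reject input that a different split would accept. That is one of the ambiguity problems the scheme would have to address.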
gateley@m2.csc.ti.com (John Gateley) (02/18/90)
In article <22569:05:10:24@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
>So what do you want in a compiled, imperative, perhaps object-oriented
>language? Take C as a starting point for good ideas and feel free to
>use parts of any other language.

Let's fix the brain-damaged, complicated syntax to start with: make all terms in the language look like:

        <simple term> ::= number or other constant etc.
        <term> ::= <simple term> | (<term> <term> ...)

In the parenthesized form, the first term is an "operation", like a special form name, a function call, or even possibly another term, and the remaining terms are "arguments" to the operation.

Presto: easy to understand/learn syntax, no messy parsers, a nice uniform syntax which allows program manipulation tools to be developed much more easily.

I don't take credit for this idea of course: it comes from Lisp.

John
gateley@m2.csc.ti.com
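To make the "no messy parsers" claim concrete, here is a rough sketch of a reader for the term grammar above. It is written in Python purely for illustration; the names `tokenize` and `read` are made up, and treating every token that is not a parenthesis as a simple term is an assumption.

```python
def tokenize(src):
    """Break source text into '(', ')' and simple terms."""
    return src.replace("(", " ( ").replace(")", " ) ").split()


def read(tokens):
    """Read one term from the front of the token list (consumes tokens)."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        term = []
        while tokens and tokens[0] != ")":
            term.append(read(tokens))   # nested terms are read recursively
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)                   # drop the closing ')'
        return term
    if tok == ")":
        raise SyntaxError("unexpected ')'")
    return tok                          # a simple term: number, name, constant


print(read(tokenize("(if (== a b) (= c d) (= q r))")))
# ['if', ['==', 'a', 'b'], ['=', 'c', 'd'], ['=', 'q', 'r']]
```

The whole reader is one recursive function, which is the point being made: the program arrives as a nested list that tools can walk directly.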
brnstnd@stealth.acf.nyu.edu (02/19/90)
Syntax is less important than semantics, though of course a clean, simple syntax is necessary for a language programmers actually like. (ALPAL: A Language Programmers Actually Like. Naaah, too pretentious.)

For the moment, general principles are more important than specifics. There should be some number of macro (preprocessing) levels to handle trivial syntactic issues. I don't know what system would be best, or if there even is a best system.

In article <8475@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:
  [ lots of suggestions ]

1, 2, 3. No semicolons. End-of-line comments. Block structure indicated
         by indentation.

These all relate to the syntax of simple statements and control structures. The most important general issue is whether structures should be explicitly terminated. The only advantage of C-ish failure to terminate is that single-statement structures are slightly shorter; and there are lots of syntactic disadvantages. Is there anyone out there who really wouldn't like loop ... end/endloop/pool, etc.?

You propose letting indentation determine structure, and using newlines as statement terminators. It's easy to convert between this and a more traditional syntax; in fact, it would be nice to have a macro facility good enough to do the job. Anyway, I favor a syntax that doesn't depend on lines or indentation: otherwise it's too easy to make syntax errors. A line-based syntax also feels very dirty: there are exceptions for multiple statements on a line, exceptions for single-statement structures, etc.

4. Overloadable and definable operators

This is another syntax issue. The language MUST provide an unambiguous syntax for everything. Fortran-90 is the only overloading language I know that does this well. Overloading just means ambiguous abbreviation, and definable operators are just a more convenient syntax for certain functions. I think overloading should be just kept in mind until function calls and any object-oriented facilities are worked out.

5. All characters allowed in symbols.

Would you really want to read a program with ?)*[! as an identifier? I wouldn't mind a macro facility that could handle this, or the ability to partition the character set the way you want. However, the basic language must have some namespace control to do any parsing at all. Also, this language MUST be interoperable with other languages to be useful.

The issue of defining your own character set relates strongly to the syntactic argument about overloading. Never force a reader to learn a new language.

6. C-like initialization power.

Well, okay. Take it for granted that declarations and definitions will be at least as powerful as in C.

7. int **foo[] becoming [] * * int foo

Yeah. C would be cleaner if all the ``type constructants'' had a single syntax. This needs to be considered in much more detail to see what people would like to use. Perhaps there's a simple, readable, consistent way to provide everything in both prefix and postfix form; then nobody can complain.

8. Eliminate arrays in favor of pointers and macros.

Say what? You need some way to express the concept of a contiguous region of memory. That's what arrays are for. How do pointers cleanly express multidimensional arrays? The language should know something about arrays, even if just for efficiency.

9. Constants: $hex, decimal, %binary, 'c', 'abc'

This is again a matter of taste; we'll see what people like. Many different forms of constants can be provided without hurting simplicity or readability. I don't agree with the combined syntax for strings and characters: what do you do with single-character strings? The language shouldn't have to know about strings; Pascal and Ada deal with strings poorly. (C's problem is that there isn't a good enough syntax to easily interface the language with different string-storage techniques.) I also disagree with the idea of leaving out octal: finding a better syntax is a good idea but there's no reason to take the feature away.

10. Standard operators.

This is, again, something that must be considered in much greater detail to get right. (Yes, I agree that @ is a much more logical symbol than * for indirection.) For the moment let's stick to general issues: You're right that there should be Algol 68C-like assignments that relate to a = b and a op= b the same way that a++ relates to ++a.

As for =/== vs. :=/= vs. your :=/== vs. statements-ain't-expressions =/= vs. =/.EQ. vs. ... : I dunno. When I'm coding on paper I alternate between paper-only left-arrow/= and C's =/==. On the screen I've begun using preprocessors that can handle my terminal's extended characters. As many writers have observed, the problem is balancing paper tradition with ASCII's rather inexpressive character set.

11, 12. BCPL-like statements returning values.

Yes, of course. C's restriction that you can't do something like a = {if (b == c) 2; else 3; } is purely annoying. At the very least, the language should solve this the way that GNU's C compiler does.

13. Declarations anywhere.

Yeah.

14. Control flow statements, control structures: [ various ]

I have some rather heretical thoughts on this subject. I'll make them clear in another message. (Remember that this isn't Ada. Given an infinite loop ... endloop, if, and break, you don't need to provide a terminating loop as a basic construct. Define it instead as a standard macro. Ada's infinite variety of control structures is awful.)

15. Structure and code generation rules: Variables are in memory in the
    order of declaration.

Yeah. I very much want more control over stack allocation and control flow than in C. This is not dealt with by any current language and needs a lot of thought. One idea I've been considering is replacing function types with statement types. This makes setjmp/longjmp, multiple function entry points, and various other techniques much cleaner. The problem is, once again, how and when to allocate stack variables.

I think two goals along these lines are (simpler:) that the language support varargs (and varargs passing!) cleanly, and (harder, assuming both a good exception mechanism and OS-generated timer exceptions:) that the language support enough stack control and longjmp control that a programmer can build a portable threads library.

Note that a truly working setjmp/longjmp would deal with register variables correctly; this is probably impossible without OS and hardware support for a ``register storage vector'' indicating storage locations for all register variables. It's certainly something to think about...

You mention that structures can disappear in favor of typedef and blocks. To me it doesn't look like you're simplifying anything; and it's a bad idea to confuse statement blocks with structure blocks.

Unions can and should be reorderable. union { int a; float b; } and union { float b; int a; } must, of course, be compatible---except that in C they'd be initialized differently. (I hope you don't mind unions?)

16. Basic types: int bits, uint bits

I disagree. The basic types should be those types that the machine can handle quickly. The language must be efficient! It's perfectly fine to have a standard notation for ``a type long enough to handle N bits'' or ``how many bits are in type X?'' but the language should not make restrictions on the size of basic types.

(Then again, every case in which portability takes second place to efficiency must be carefully considered and well documented. Two issues along these lines are bit sizes and the semantics of mod. As I feel very strongly that the second should be portable, I shouldn't assume that nobody feels the same way about the first. Then again, wouldn't a standard notation for your ``int 8'' be enough?)

What about characters? What about floating-point types, which many machines support better than ints? What about Ada-like fixed-point types?

ANSI C messed up in its restrictions on void. void should mean a 0-bit integer, aligned so that any pointer type can be safely converted back and forth to void *. So dereferencing a void always produces 0; sizeof(void) is 0; and so on.

I agree with C's philosophy of only allowing bit packing inside structures. Other packing methods would really mangle the concept of pointers.

17. const, inline, register, macro, op LEFT RIGHT RETURN

Interesting idea, the last one. The basic function call syntax should be what the most people like; if there's a clean way to integrate (say) C's functions, Forth's statements, and Lisp's whatevers, let's do it.

It would be wonderful to have a way to express more complex data flow than algebraic expressions and single-type function calls. Unfortunately, I don't know any good syntax or semantics for data flow. (This is NOT going to become a so-called ``functional'' language, thank you.) Data flow is just a convenient way to express temporary (register) variables; whenever I use an expression twice I wonder if there's some natural ``teeing'' extension to C's ``piping'' notation that would simplify my code.

18. Automatic conversion.

Yeah. Is it inconsistent how C really mangles the representation when you convert from int to float while it (typically) doesn't change it at all when you convert from int * to void *? I'm not sure. It may not be wise to integrate casts with user-defined conversion functions, as the former are implementation-dependent while the latter should not be.

19. User-defined precedence.

This is yet another syntactic preprocessing problem. Remember that the language should be readable!

20. Parameter passing: [ various ideas ]

This has to be dealt with very carefully. I like C's solution: it's clean while allowing every trick Ada can do. A general principle here (which you appear to disagree with) is that the form of a function call can make clear the fact that a variable is not modified.

21. Classes equal structures. Inheritance is just including one structure
    in another. Function arguments are really structures.

This is the kind of idea that I'm looking for. Object-oriented programming can be very clean given a sufficiently powerful syntax and semantics for function pointers and structures. Your ``inherit'' keyword is beautiful.

Function arguments being structures: This could be useful if it's combined with a simple way to deal with the program stack.

Default structure values upon creation: This brings up the issue of whether there should be a way to call an initialization function the first time a function is called (as in Modula-2, Fortran-90, and a few other obsolete [1/2 :-)] languages). I don't think there's any point: all related ``features'' can be much more cleanly implemented by combining function pointers with the more usual initializations, or by keeping an appropriate local variable. (Those are the two methods used in Modula-2 compilers: the point is that if they're easily implemented with simpler features, they should be. Modularity.)

> - Member functions are indicated in function declarations.
>   There should be another type qualifyer which indicates a
>   function gets a pointer to the structure and all members of
>   that structure look like local variables to the function.

I'm not sure what you're getting at here.

22. Named arguments.

Yeah. This is one of Ada's few good features. The syntax is a bit of a problem, but I'm sure it can be worked out.

> There's much, much more to do

No duh.

> The general goal
> is to make it both one step above assembly language and completely extendable.

And modular. And clean. And robust. And likable, even fun to use!

---Dan
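The remark under point 16 that the semantics of mod should be portable is easier to see with numbers. The sketch below, in Python only for convenience, implements the two common conventions for integer remainder when an operand is negative; C89 left the choice to the implementation. The function names are invented for the example and are not part of any proposal in this thread.

```python
def mod_truncated(a, b):
    """Remainder whose sign follows the dividend (quotient rounds toward zero)."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):   # operands of opposite sign give a negative quotient
        q = -q
    return a - b * q


def mod_floored(a, b):
    """Remainder whose sign follows the divisor (quotient rounds down)."""
    return a - b * (a // b)  # Python's // already rounds toward minus infinity


for a, b in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
    print(a, b, mod_truncated(a, b), mod_floored(a, b))
# 7 3 1 1
# -7 3 -1 2
# 7 -3 1 -2
# -7 -3 -1 -1
```

The two conventions agree when both operands have the same sign and differ otherwise, which is exactly where code that moves between compilers tends to break.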
brnstnd@stealth.acf.nyu.edu (02/19/90)
In article <12507@mcdphx.phx.mcd.mot.com> kjj@varese.UUCP (Kevin Johnson) writes:
> Rhetorical question: Aren't you talking about C++?

Of course not. C++ isn't even close to perfect. Isn't there some change you'd like to make to C++ so that you'd like programming even better than you do now? Fine, say so. Then iterate. Hopefully almost everyone's wishes will be synthesized into this new, still-to-be-named language.

(Now there's a name: FOO. Naaah, people might remember what foo stands for, and lots of young urban CS types will think it stands for something object oriented.)

> Semi-rhetorical question: What would be this language's intended use?

Similar to C. It will have the ``low-level'' features of C so that it's appropriate for systems programming, but there's no particular focus. (I use UNIX C for complex numerical programming, so I may be biased.)

> 1. How about string operators.
>    I hate handling allocing of space for something silly like strings...

This is mainly a library problem (though a good syntax helps).

> 2. Ability to dynamically define new operators

Expand. What exactly do you want? We're not talking p-code, you know. Are you looking for something that can't be implemented on top of the language?

> 3. Ability to use existing C libraries and headers.

At least to interface with the loader the same as other languages. As for headers: one of the first standard applications will be a program to convert C function prototypes to this language. (Having the same macro processing is too much to ask, because C's macro processor is so limited. But most libraries do fine with just the function interface.)

It would be nice if the language could compile to C, but it already looks like C just isn't powerful enough.

> Seriously, I would consider the ability to link in existing
> libraries, one way or another, an absolute must.

I agree.

---Dan
nick@lfcs.ed.ac.uk (Nick Rothwell) (02/19/90)
In article <22569:05:10:24@stealth.acf.nyu.edu>, brnstnd@stealth writes:
>This is about the most formal announcement there'll be of a new, still
>unnamed language. I'll bet that almost every programmer's tastes can
>be satisfied by a single language,

You're kidding, right?

>Take C as a starting point for good ideas

You're kidding, right? If you're going to mess around with antiquated low-level languages like C, I don't see the point of bringing yet another one into the world. Look at the functional languages like ML, or the newer specification/programming languages, or the modern OO languages like Cardelli's Quest. Or even take Eiffel and give it a decent formal semantics.

>---Dan

Nick.
--
Nick Rothwell, Laboratory for Foundations of Computer Science, Edinburgh.
nick@lfcs.ed.ac.uk    <Atlantic Ocean>!mcvax!ukc!lfcs!nick
...als das Kind, Kind war...
jhallen@wpi.wpi.edu (Joseph H Allen) (02/19/90)
In article <4489:05:14:19@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes: >Syntax is less important than semantics, though of course a clean, >simple syntax is necessary for a language programmers actually like. >(ALPAL: A Language Programmers Actually Like. Naaah, too pretentious.) Pretentious? How about D? >For the moment, general principles are more important than specifics. >There should be some number of macro (preprocessing) levels to handle >trivial syntactic issues. I don't know what system would be best, or >if there even is a best system. I think you hint at this later, but I think it should be just as easy to extend/add control statements as it is to extend/add functions. Perhaps some macro processing stage which is more heavily interwoven with the language is needed for this. I.E., a macro system in which you can say, "I want an expression here", "I want this symbol here" etc. >In article <8475@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes: > [ lots of suggestions ] > >1, 2, 3. No semicolons. End-of-line comments. Block structure indicated > by indentation. > >These all relate to the syntax of simple statements and control >structures. The most important general issue is whether structures >should be explicitly terminated. The only advantage of C-ish failure to >terminate is that single-statement structures are slightly shorter; and >there are lots of syntactic disadvantages. Is there anyone out there who >really wouldn't like loop ... end/endloop/pool, etc.? >You propose letting indentation determine structure, and using newlines >as statement terminators. I didn't mean newlines to be statement terminators. If a statment needs to go into an another line, that's fine. Statements should be terminated implicitly when they can no longer be parsed. This means we have to be very careful about not having identifiers which can both be operators and variables. Also infix must not be shared with prefix or postfix operators. 
One problem we will have is with '-'. When you see: it = this + 5 - 10 Does it mean it=this+5-10? or is the -10 a single return value for the block? I prepose we let this problem stand and solve it with parenthasis: it = this + 5 ( - 10) However, new lines could be used to terminate multi-statement lines (the single statement problem you talked about): If the statement (expression. No reason to distinguish between the two) starts after the if expression, then it's a single line statement. if expr expr expr expr expr \n If the statement doesn't start after the if expression then it's a multi-line block: if expr expr expr expr expr expr which ends when the indentation level becomes lower. > It's easy to convert between this and a more >traditional syntax; in fact, it would be nice to have a macro facility >good enough to do the job. Lets not cop out too early... >Anyway, I favor a syntax that doesn't depend >on lines or indentation: otherwise it's too easy to make syntax errors. I disagree with this. It's more of a pain when the indentationing doesn't match the block symbols: if dfjhkjddf {{{{{{{ sdfjkhdf }}}} else {{{{{ }}}}}} What people do with C makes things very confusing. ( :-) YOU use the macro processor to make it your way. The language defualt will, of course be my way.) >A line-based syntax also feels very dirty: there are exceptions for >multiple statements on a line, exceptions for single-statement >structures, etc. It's absolutely consistent. Only two rules are needed. Deeper indentation means a new block and when the body statement begins on the same line as the structure statement a single line block is indicated (oh, the end of line terminator shouldn't be "hard". Instead all statements beginning on the same line are part of block. The last statement should be able to continue onto the next line if it has to: if a==b a=d+ ; + means has to continue e ; onto this line ) >4. 
Overloadable and definable operators >I think overloading should be just kept in mind until >function calls and any object-oriented facilities are worked out. Overloading is too convenient not to have built into the language at every level. All of the language intrisics should be as unambiguous as possible. However, it will be possible for the user to screw up with definable operators. I think this is a style issure- don't overload unless you absolutely have to. >5. All characters allowed in symbols. > >Would you really want to read a program with ?)*[! as an identifier? Yes. And spaces should be allowed in symbols too (I hate those stupid _) >I wouldn't mind a macro facility that could handle this, or the ability >to partition the character set the way you want. Sure take it out of the language why don't you. >However, the basic >language must have some namespace control to do any parsing at all. No it doesn't. Operators and other symbols are all disginguished by what's in the symbol table not by what characters they use. The LEXer only finds words in the symbol table and passes this on to the parser. The LEXer doesn't do anything else (except constants (if you're a real purist, put these in the symbol table too- all of them :)) >Also, this language MUST be interoperable with other languages to be >useful. This and the fact that you can make some very instersting unambiguities are the downfall of this. I think the language shouldn't be restrictive. People should just excercise self control. Which would you be more annoyed at? The language not letting you use '$' in symbols so that you couldn't access VAX's special assembler symbols or using IBM's graphic characters and then discover that doing so isn't very portable? >The issue of defining your own character set relates strongly to the >syntactic argument about overloading. Never force a reader to learn a >new language. I don't want to start a war here but I'm more for writing then reading and maintaining. 
Let the managers force rules on the programmers to make things maintainable.
>8. Eliminate arrays in favor of pointers and macros.
>
>Say what? You need some way to express the concept of a contiguous
>region of memory.
No, no, no. This should be an initializer issue:
    inline # int array = # 256 dup int
Left side: a non-addressable pointer to integers (an equate). Right side:
the address of 256 uninitialized ints.
>That's what arrays are for. How do pointers cleanly
>express multidimensional arrays? The language should know something
>about arrays, even if just for efficiency.
What are you, some kind of math person? :) Systems languages don't need
arrays. C doesn't even really have arrays: there's no way of passing
multidimensional arrays without separately passing the size of each
dimension. Efficiency is another problem, however.
>9. Constants: $hex, decimal, %binary, 'c', 'abc'
>
>This is again a matter of taste;
No it's not. It's just plain stupid to do hex constants with 0x... or 0...h
(the C and Intel ways).
>I don't agree with the combined syntax for strings and
>characters: what do you do with single-character strings?
The language absolutely must do this. I find it very annoying that there
are things I can do with assembly language strings that I can't do with C's
(namely, have constant expressions in each character). Single-character
strings? No problem:
    string = # 'A'
    string = # 65
    string = # 'ABCDEFG'
    string = # 65 \ 66 \ 67 \ 68 \ 'EFG'
Admittedly, having to put a '#' before each string to get its address is a
pain.
> The language
>shouldn't have to know about strings; Pascal and Ada deal with strings
>poorly. (C's problem is that there isn't a good enough syntax to easily
>interface the language with different string-storage techniques.)
I agree. C's problem is solved with macros and overloadable operators.
> I also
>disagree with the idea of leaving out octal: finding a better syntax is
>a good idea but there's no reason to take the feature away.
Ok.
Let's make it the braindamaged type. Octal numbers end with 'O' (oh).
(Actually I'm kidding. I know we have to support octal. Perhaps there
should even be base-n constants.)
>10. Standard operators.
>As for =/== vs. :=/= vs. your :=/== vs. statements-ain't-expressions =/=
>vs. =/.EQ. vs. ... : I dunno. When I'm coding on paper I alternate
>between paper-only left-arrow/= and C's =/==. On the screen I've begun
>using preprocessors that can handle my terminal's extended characters.
>As many writers have observed, the problem is balancing paper tradition
>with ASCII's rather inexpressive character set.
Right. Allow all characters to be used in symbols.
>You mention that structures can disappear in favor of typedef and blocks.
>To me it doesn't look like you're simplifying anything; and it's a bad
>idea to confuse statement blocks with structure blocks.
What's the difference between this and what C does? I just don't see the
need for C's 'struct' keyword. Every structure I ever make begins like
this:
    typedef struct foo FOO;
    struct foo
     {
     FOO *next;
     etc...
     };
>Unions can and should be reorderable. union { int a; float b; } and
>union { float b; int a; } must, of course, be compatible---except that
>in C they'd be initialized differently. (I hope you don't mind unions?)
Frankly I wish there were some easier way to deal with unions. As far as
I'm concerned, unions are just as difficult to use as casts:
    x->thing.member=7       (union)
    (cast)x->thing=7        (cast)
Perhaps we should have overloadable variables?
>16. Basic types: int bits, uint bits
>
>I disagree. The basic types should be those types that the machine can
>handle quickly. The language must be efficient! It's perfectly fine to
>have a standard notation for ``a type long enough to handle N bits'' or
>``how many bits are in type X?'' but the language should not make
>restrictions on the size of basic types.
This isn't making any restrictions. The only types provided are the machine
primitive ones (char short long etc..)
this just provides a way of selecting the proper one for the machine being
used.
>nobody feels the same way about the first. Then again, wouldn't a
>standard notation for your ``int 8'' be enough?)
Yes, use a header file.
>What about characters? What about floating-point types, which many
>machines support better than ints? What about Ada-like fixed-point
>types?
Lots more to do...
>ANSI C messed up in its restrictions on void. void should mean a 0-bit
>integer, aligned so that any pointer type can be safely converted back
>and forth to void *. So dereferencing a void always produces 0;
>sizeof(void) is 0; and so on.
Yes, perhaps there should be both 'void' and 'unspecified'. There might be
some way to combine this with variable arguments.
>[only allow bit packing in structures]
I don't know. I think if the machine can handle chars and ints and also has
alignment problems, char packing isn't that bad. The only time there is a
problem with this is when you try to increment a pointer from a char to an
int. Inside of structures this isn't a problem, because I want to provide a
special 'sizeof'-like operator, 'base', which returns the distance between
the structure base address and one of its members.
>19. User-defined precedence.
>This is yet another syntactic preprocessing problem. Remember that the
>language should be readable!
This is really just part of definable operators.
--
"Come on Duke, lets do those crimes" - Debbie
"Yeah... Yeah, lets go get sushi... and not pay" - Duke
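The "deeper indentation means a new block" rule from the post above can be sketched in a few lines. This is only an illustration in Python, not the proposed language: the name parse_blocks and the (text, children) layout are made up here, indentation is assumed to be spaces only, and the second rule (a body that starts on the same line as its structure statement) is not handled.

```python
def parse_blocks(source):
    """Turn indented lines into nested (text, children) pairs.

    A line indented deeper than the line before it opens a block under that
    line; the block ends when the indentation level becomes lower.
    """
    root = []
    stack = [(-1, root)]  # (indent of the owning line, that line's children)
    for raw in source.splitlines():
        if not raw.strip():
            continue  # blank lines carry no structure
        indent = len(raw) - len(raw.lstrip(" "))
        # An equal or shallower indent closes every block opened at that depth.
        while indent <= stack[-1][0]:
            stack.pop()
        node = (raw.strip(), [])
        stack[-1][1].append(node)
        stack.append((indent, node[1]))
    return root


example = "while a!=b\n    int q\n    q=z*5\n    r+=foo(q)\nx=1\n"
for text, children in parse_blocks(example):
    print(text, [child_text for child_text, _ in children])
```

No block symbols are needed to recover the structure, which is the point being argued; whether that is easier or harder to get wrong than braces is exactly what the thread disagrees about.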
gateley@m2.csc.ti.com (John Gateley) (02/20/90)
In article <8475@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:
<Ok, I'll bite. Here's a compiled language I'd like to see:
<(1) No semicolons.
<(2) Except for end of line comments. /* These comments are evil */
<(3) Block structure indicated by indentation level:
<
< while a!=b
< int q ; Multi-line body
< q=z*5
< r+=foo(q)
<...
Now, just take it a little further: get rid of all infix notation,
and let "blocks" be denoted by ( and ) and you get:
(begin
(while (!= a b)
(let ((q integer))
(= q (* z 5))
(+= r (foo q))))
...)
and you have achieved a truly simple, easy-to-use syntax where you
don't have to worry about indentation (the editor does it for you),
and programs which manipulate source code are much, much easier to
write.
<(4) Overloadable AND definable operators
Using syntax like I described above avoids this issue entirely (there is
no such thing as an operator). The guy that occurs in the first position
of ( ... ) can be overloaded/defined in well-behaved ways easily.
<(5) All characters allowed in symbols.
Using syntax like I described above, the ONLY characters not allowed
in symbols are (, ), and ; (for comments). In real life, this set usually
is a little bit larger, but not much.
< the longest possible string which can be
< a symbol is detected:
< if these are symbols:
< abc
< def
< abcdef
< then when the input sees:
< defabc ; def is recognized and then abc is recognized
Uggh, gross!!
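For readers trying to picture the rule being objected to, here is a minimal sketch of longest-match symbol recognition against a symbol table. It is written in Python purely for illustration; tokenize and the plain set used as the symbol table are assumptions, not anything specified in the quoted proposal.

```python
def tokenize(text, symbols):
    """Split text into known symbols, always taking the longest match."""
    longest = max(len(s) for s in symbols)
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        # Try the longest candidate first and shrink until one is a symbol.
        for size in range(min(longest, len(text) - pos), 0, -1):
            candidate = text[pos:pos + size]
            if candidate in symbols:
                tokens.append(candidate)
                pos += size
                break
        else:
            raise ValueError(f"no symbol matches at position {pos}")
    return tokens


table = {"abc", "def", "abcdef"}
print(tokenize("defabc", table))   # ['def', 'abc']
print(tokenize("abcdef", table))   # ['abcdef']
```

Note that the result depends entirely on what is in the table at that moment, so defining a new symbol can change how existing source text is split.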
< Symbol recognition should occur before constant recognition. I.E.,
< this way you can define:
< int :4: = 5 ; Make 4 equal to 5
You are starting to have to make lots of rules to handle
your way of life. It should be simple, not complex, and
having to worry about details like "Does :4: mean : 4 : or a symbol
named ":4:"?" confuses people.
<(6) Nifty C declarations which allow one type to be shared among multiple
< declarations each of which might have an initializer.
Even better, let's support dynamic typing; then we can avoid this
issue as well.
<(7) However, the convoluted C declaration system needs to be replaced:
< instead of:
< int **foo[] an array of pointers to int pointers
< do this:
< [] * * int foo
Hmmm, this is very similar to my arguments for a simpler syntax:
by specifying everything in prefix notation, you avoid the
precedence problem (among other things). Even though you keep
operators and precedence for expressions, you want to remove them
from declarations! The first line is perfectly acceptable if you
remember the precedence/associativity of [] and *. Of course, I
think that it should be ([] (* (* int))) for the type spec.
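The appeal of the all-prefix form is that it can be read strictly left to right with no precedence rules. A small sketch, in Python for illustration only; describe and the WORDS table are hypothetical helpers, and only the two constructors that appear in the example are covered.

```python
# English for each prefix type constructor used in the quoted example.
WORDS = {"[]": "array of", "*": "pointer to"}


def describe(declaration):
    """Read a prefix declaration such as '[] * * int foo' left to right."""
    *type_tokens, name = declaration.split()
    base = type_tokens.pop()  # the last type token is the base type
    parts = [WORDS[token] for token in type_tokens] + [base]
    return f"{name}: " + " ".join(parts)


print(describe("[] * * int foo"))  # foo: array of pointer to pointer to int
```

Doing the same for the C form `int **foo[]` requires knowing that [] binds tighter than *, which is the precedence point made above.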
<(8) Eliminate arrays. They aren't needed. Use pointers and macros
< instead.
Ummm, if you pursue this argument ad infinitum (is that the right
Latin word?), then you wind up with a Turing machine, the lambda calculus,
a combinatory logic, or some other beast which is close to impossible
to program in. On the other hand, if you have a standard interface to
arrays, then looking at other people's code will be easier: you won't
have to learn their version of the macros.
Perhaps, though, you meant providing a standard set of macros to implement
arrays on top of pointers; I have no argument with that.
<(10) Standard operators. Grouped together in equal precedence:
< [HUGE TABLE DELETED]
I hate precedence: I can never remember which is which.
Instead, using the Lisp-style syntax again avoids this issue:
everything is parenthesized automatically and you don't ever
have to worry about precedence.
(+ 2 (* 3 4) (- 5 6))
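To make the "no precedence" claim concrete, here is a minimal sketch of a reader and evaluator for that expression. It is Python, not Lisp, and deliberately tiny: read, evaluate and the OPS table are illustrative names, only integers and three operators are supported, and each operator is simply folded left to right over its arguments.

```python
import operator
from functools import reduce

# Whatever sits in the first position of ( ... ) is looked up here.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def read(text):
    """Parse a fully parenthesized prefix expression into nested lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def parse(pos):
        if tokens[pos] == "(":
            items = []
            pos += 1
            while tokens[pos] != ")":
                item, pos = parse(pos)
                items.append(item)
            return items, pos + 1
        return tokens[pos], pos + 1

    tree, _ = parse(0)
    return tree


def evaluate(tree):
    """Evaluate a tree from read(); the grouping is explicit in the nesting."""
    if isinstance(tree, list):
        head, *args = tree
        return reduce(OPS[head], [evaluate(arg) for arg in args])
    return int(tree)


print(evaluate(read("(+ 2 (* 3 4) (- 5 6))")))  # 2 + 12 + (-1) = 13
```

Adding or overloading an "operator" here is just another entry in the table, which is the sense in which the operator-definition question goes away.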
<(11) Blocks return the last value generated:
<(12) Statements return their last value:
Now go one step further and eliminate the concept of a statement: now
everything is an expression and returns values. In the spirit of something
posted somewhere recently: we have reduced the number 2 (statements and
expressions) to 1 (expressions).
John
gateley@m2.csc.ti.com
gateley@m2.csc.ti.com (John Gateley) (02/20/90)
In article <12507@mcdphx.phx.mcd.mot.com> kjj@varese.UUCP (Kevin Johnson) writes:
>1. How about string operators.
> I hate handling allocing of space for something silly like strings...
But string sizes are not known at compile time, and so must be handled by
the heap (i.e. alloc).
>3. Ability to use existing C libraries and headers.
This is truly a difficult problem, because you have to say: "What's so
special about C? I want my <foo> libraries", where <foo> might be Ada,
Lisp, PDP-11 assembler, or whatever. Instead, why not write a routine
which, given a C library, would convert it into a library for the new
language? This would involve changing the entry points a little, but that
should be about it.
John
gateley@m2.csc.ti.com
peter@ficc.uu.net (Peter da Silva) (02/20/90)
Coroutines! Some minor C syntax cleanups. Replace C pointer syntax (prefix *) with Pascal pointer syntax (postfix ^). This will automatically clean up declarations and get rid of an ugly two character symbol ("->"). Replace = with :=, but don't replace ==. That way it's *never* OK to say a = b. Keep +=, -=, etc the same. -- _--_|\ Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>. / \ \_.--._/ Xenix Support -- it's not just a job, it's an adventure! v "Have you hugged your wolf today?" `-_-'
peter@ficc.uu.net (Peter da Silva) (02/20/90)
> > 2. Ability to dynamically define new operators
> Expand. What exactly do you want?
    MATRIX matrix.*(MATRIX a, MATRIX b);
    A := A matrix.* B;
This could be abbreviated to:
    A := A .* B;
But not (as in Ada):
    A := A * B;
-- _--_|\ Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>. / \ \_.--._/ Xenix Support -- it's not just a job, it's an adventure! v "Have you hugged your wolf today?" `-_-'
daniels@teklds.WR.TEK.COM (Scott Daniels) (02/20/90)
A pet peeve: why only decimal radix for floating point? How about a
notation like:
    d:xxx.xxx
where d is the largest digit allowable in the radix (thus solving the
"what base do I write the base in" problem). Then decimal 4.75 would be
9:4.75, (hex) F:4.C, (octal) 7:4.6, or (binary) 1:100.11. The base for the
exponent should be the same as the radix of the number itself (so the
exponent indicates radix-point shifts). Note: the exponent separator had
better be @ or something instead of `E'.
Another thing I would like to be able to do is to indicate that a function
is "pure" (only depends on its args), thus allowing the compiler to (1)
complain when I violate that, (2) compile-time evaluate values to produce
constants, and (3) expand inline as it sees fit.
On strings: SAIL had strings which consisted of a length and pointer part.
I know this prevents the "infinite length string" idea, but (1) substrings
are easy [lexing a line involves copying no character data], and (2) it is
(almost) as easy to get to the last few chars of a string as the first few.
On structures:
(1) How about some way to provide structs which have negatively indexed
fields as well as positively indexed fields. This allows structure
elaboration in two ways (great for protocol layering.)
(2) Rather than tightening the layout of records, have a modifier (like
Pascal's "packed", but not optional to implement), which says "no padding",
and otherwise use a loose rule that says "adding a field must not change
the arrangement of variables previously placed," and explicitly "fields may
be placed in holes in the structure". Thus the compiler is free to use the
same layout for the following structs (assuming aligned ints):
    struct { char c,d; int a; }   and   struct { char c; int a; char d; }
(3) Allow "anonymous" incorporation of a structure (or union): ie bring the
field names to the same level as explicitly provided names (of course name
conflicts are errors).
On types:
>16.
Basic types: int bits, uint bits
>I disagree. The basic types should be those types that the machine can
>handle quickly. The language must be efficient! It's perfectly fine to
>have a standard notation for ``a type long enough to handle N bits'' or
>``how many bits are in type X?'' but the language should not make
>restrictions on the size of basic types.
This was provided as a notation for accessing the basic types, but does not
go far enough. I would like to be able to give a range which must be
representable (like integral[1..29]), and have the type chosen; I don't
always know the number of bits (and I could always use [0:(1<<7)-1] for
7-bitters).
>(Then again, every case in which portability takes second place to
>efficiency must be carefully considered and well documented. Two issues
>along these lines are bit sizes and the semantics of mod. As I feel very
>strongly that the second should be portable, I shouldn't assume that
>nobody feels the same way about the first. Then again, wouldn't a
>standard notation for your ``int 8'' be enough?)
How about ADDING an operation `modulo' in addition to `%'. Then we can say
either "fast, fits integer divide," or "result in range [0,modulus)."
>I agree with C's philosophy of only allowing bit packing inside structures.
But it would be nice to have a packed vector of bits available somewhere
(inside structures only would be fine with me).
>Function arguments being structures: This could be useful if it's
>combined with a simple way to deal with the program stack.
This probably introduces another form of structure packing (and is a good
idea but...be sure to allow the compiler to delete unused variables if it
can remove them).
Type coercions: Something between C's "forget all your type checking" and
many other languages' "you can't get there from here." How about a coercion
that has both `from' and `to' parts.
Suppose:
    coerce ::= ( to_type : from_type ) e
then for:
    ( dest_type : source_type ) e;
it is an error if e is not "easily-coerced" to type "source_type";
otherwise the internal conversion (to dest_type) is performed, and that is
the type of the whole expression.
-Scott Daniels
daniels@cse.ogi.edu
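The d:xxx.xxx radix notation from this post is easy to try out. The sketch below is Python and covers only the mantissa (no @ exponent); parse_radix, digit and the DIGITS table are made-up names, bases above 16 are not handled, and exact fractions are used so the four spellings of 4.75 can be compared without rounding noise.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEF"


def digit(ch, base):
    """Value of one digit character, rejecting digits too large for the base."""
    value = DIGITS.index(ch)  # raises ValueError for non-digit characters
    if value >= base:
        raise ValueError(f"digit {ch!r} is not valid in base {base}")
    return value


def parse_radix(literal):
    """Parse 'd:xxx.xxx', where d is the largest digit of the radix."""
    largest, _, body = literal.partition(":")
    base = DIGITS.index(largest.upper()) + 1
    whole, _, frac = body.upper().partition(".")
    value = Fraction(0)
    for ch in whole:
        value = value * base + digit(ch, base)
    place = Fraction(1, base)
    for ch in frac:
        value += digit(ch, base) * place
        place /= base
    return value


for text in ("9:4.75", "F:4.C", "7:4.6", "1:100.11"):
    print(text, "=", float(parse_radix(text)))  # each prints 4.75
```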
new@udel.edu (Darren New) (02/20/90)
Actually, I've been playing with a language that has flexibility as its greatest goal. Each function is compiled as it is seen and can in turn compile other functions. This is essentially what FORTH and to some extent LISP do. However, in my language there are no parsers that cannot be overridden. That is, to parse the language, each character is read and appended to a buffer. Then each entry in a "parse" array is called in turn. Once one of the entries recognises the token in the buffer, it outputs the object code for that token and clears the buffer. This technique allows overloaded functions, new literals, "create-on-reference" functions (for example, it could create a "typecast" function given the name) and so on. This would be an especial boon if the intermediate code was standardized and fairly flexible, allowing optimizations to different architectures much like the Smalltalk or Pascal P-codes do. I think that if you can't add new literal types, you don't have a truly new language. To do this, you must be able to define internal representations of high-level structures in terms of low-level structures (like bit strings). From that, the language can be customized out the ptooks. Of course, it may not be readable when you are done, but... :-) -- Darren
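A rough sketch of the buffer-plus-parse-array scheme described above, in Python. Everything named here is hypothetical (compile_text, the recognizers, the ("PUSH", n) / ("CALL", name) tuples standing in for object code), and one assumption goes beyond the post: each recognizer is also shown the next character, so it can tell when a token has ended.

```python
import string


def recognize_hex(buffer, lookahead):
    """A user-added literal type: $FF style hex constants."""
    digits = buffer[1:]
    if (buffer.startswith("$") and digits
            and all(c in string.hexdigits for c in digits)
            and (lookahead is None or lookahead not in string.hexdigits)):
        return ("PUSH", int(digits, 16))
    return None


def recognize_number(buffer, lookahead):
    if buffer.isdigit() and not (lookahead or "").isdigit():
        return ("PUSH", int(buffer))
    return None


def recognize_word(buffer, lookahead):
    if buffer.isalpha() and not (lookahead or "").isalpha():
        return ("CALL", buffer)
    return None


def compile_text(text, parsers):
    """Append each character to a buffer and offer it to every parser in turn."""
    code = []
    buffer = ""
    for i, ch in enumerate(text):
        if ch.isspace() and not buffer:
            continue  # skip separators between tokens
        buffer += ch
        lookahead = text[i + 1] if i + 1 < len(text) else None
        for parser in parsers:
            emitted = parser(buffer, lookahead)
            if emitted is not None:
                code.append(emitted)  # "output the object code"
                buffer = ""           # "and clear the buffer"
                break
    if buffer:
        raise ValueError(f"nothing recognised {buffer!r}")
    return code


base_parsers = [recognize_number, recognize_word]
print(compile_text("12 dup", base_parsers))
print(compile_text("$1F dup", [recognize_hex] + base_parsers))
```

Adding the hex literal needed no change to compile_text, only a new entry at the front of the list, which is the flexibility being claimed.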
cik@l.cc.purdue.edu (Herman Rubin) (02/20/90)
In article <4489:05:14:19@stealth.acf.nyu.edu>, brnstnd@stealth.acf.nyu.edu writes: > Syntax is less important than semantics, though of course a clean, > simple syntax is necessary for a language programmers actually like. > (ALPAL: A Language Programmers Actually Like. Naaah, too pretentious.) What is a simple syntax? Simple for whom, the human or the machine? For example, most assembler languages, macro designs, etc., have simple syntax for the machine but not for the human. > For the moment, general principles are more important than specifics. > > There should be some number of macro (preprocessing) levels to handle > trivial syntactic issues. I don't know what system would be best, or > if there even is a best system. I find the lack of a versatile typed macro processor extremely inconvenient, and I would find one preferable to any existing language, even if no other tools were available. For example, x = y - z should be the (= -) macro (or some such designation, and it should allow for the types of the arguments. > In article <8475@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes: > [ lots of suggestions ] > > 1, 2, 3. No semicolons. End-of-line comments. Block structure indicated > by indentation. > > These all relate to the syntax of simple statements and control > structures. The most important general issue is whether structures > should be explicitly terminated. The only advantage of C-ish failure to > terminate is that single-statement structures are slightly shorter; and > there are lots of syntactic disadvantages. Is there anyone out there who > really wouldn't like loop ... end/endloop/pool, etc.? I believe that we should have semicolons, but that an end-of-line should terminate a statement unless a specific exception is made. This is one of the most common sources of errors in C programs, and is in any case a nuisance. I definitely do not like to have to use such clumsiness as typing unnecessary strings for the convenience of the compiler. 
I do not like endloop/pool. I also do not believe that indentation is necessarily the right method for block structure. For one thing, by the 10th block in, it is certainly a nuisance. A suggestion would be to allow arbitrary block labels, and have an end pseudoinstruction with multiple labels. This is especially important when aborting to an explicit earlier place. .................... > 4. Overloadable and definable operators > > This is another syntax issue. The language MUST provide an unambiguous > syntax for everything. Fortran-90 is the only overloading language I > know that does this well. Overloading just means ambiguous abbreviation, > and definable operators are just a more convenient syntax for certain > functions. NO NO NO! An operator is not a function, especially if it is different for arguments of different types, such as the sum, product, power operators, etc. Also, I see no more reason for a function call, or even function notation, for power than for sum. It is no more reasonable to require x = pow(y,z) than x = sum(y,z). > 5. All characters allowed in symbols. > > Would you really want to read a program with ?)*[! as an identifier? Only as a macro name (see above), with the macro being more in the form of x ? y )* z [n! or something similar. > I wouldn't mind a macro facility that could handle this, or the ability > to partition the character set the way you want. However, the basic > language must have some namespace control to do any parsing at all. > Also, this language MUST be interoperable with other languages to be > useful. This means that global names must not be changed by the compiler. It is a real nuisance that the function sin in C becomes _sin to the loader, and that erf in Fortran becomes _erf_. When writing a program, I should not have to know from which language the subroutine library got the subroutines used, nor should I have to replicate subroutine libraries because of this.
It is definitely the case that one may want to use subroutines from different sources, and this requires that names be unchanged by the compiler. This even applies if blocks are used across subroutines. > The issue of defining your own character set relates strongly to the > syntactic argument about overloading. Never force a reader to learn a > new language. This may be necessary. My basic operations are frequently so clumsy to duplicate in the existing languages that it is necessary to do otherwise. This includes the introduction of operator symbols and strings. For example, suppose I want to unpack floating point numbers into their exponents and mantissas. I do not want to have to try to do this with the debilities of languages like C. > 6. C-like initialization power. > > Well, okay. Take it for granted that declarations and definitions will > be at least as powerful as in C. > > 7. int **foo[] becoming [] * * int foo Or even better @ @ int foo. This is an unnecessary overloading of *, done because early UNIX had @ as the line kill character. > Yeah. C would be cleaner if all the ``type constructants'' had a single > syntax. This needs to be considered in much more detail to see what > people would like to use. Perhaps there's a simple, readable, consistent > way to provide everything in both prefix and postfix form; then nobody > can complain. > > 8. Eliminate arrays in favor of pointers and macros. > > Say what? You need some way to express the concept of a contiguous > region of memory. That's what arrays are for. How do pointers cleanly > express multidimensional arrays? The language should know something > about arrays, even if just for efficiency. I agree. This is one of the great lacks in C. > 9. Constants: $hex, decimal, %binary, 'c', 'abc' > > This is again a matter of taste; we'll see what people like. Many > different forms of constants can be provided without hurting simplicity > or readability.
I don't agree with the combined syntax for strings and > characters: what do you do with single-character strings? The language > shouldn't have to know about strings; Pascal and Ada deal with strings > poorly. (C's problem is that there isn't a good enough syntax to easily > interface the language with different string-storage techniques.) I also > disagree with the idea of leaving out octal: finding a better syntax is > a good idea but there's no reason to take the feature away. Here nothing should be left out. There is a great need for floating point numbers not in decimal, at least octal or hex for the mantissa and exponent, but a base 2 exponent. > 10. Standard operators. > > This is, again, something that must be considered in much greater > detail to get right. (Yes, I agree that @ is a much more logical symbol > than * for indirection.) For the moment let's stick to general issues: > You're right that there should be Algol 68C-like assignments that relate > to a = b and a op= b the same way that a++ relates to ++a. The use of ++ and -- is another example which leads to problems. I have no problem with op=, but using bad symbols because you did not think of anything better is at least highly debatable. There is also the systematic use of symbols in C which conflict with long-standing notation. ASCII is not enough in any case. > As for =/== vs. :=/= vs. your :=/== vs. statements-ain't-expressions =/= > vs. =/.EQ. vs. ... : I dunno. When I'm coding on paper I alternate > between paper-only left-arrow/= and C's =/==. On the screen I've begun > using preprocessors that can handle my terminal's extended characters. > As many writers have observed, the problem is balancing paper tradition > with ASCII's rather inexpressive character set. ........................ > 13. Declarations anywhere. > > Yeah. > > 14. Control flow statements, control structures: [ various ] > > I have some rather heretical thoughts on this subject. I'll make them > clear in another message. 
(Remember that this isn't Ada. Given an > infinite loop ... endloop, if, and break, you don't need to provide > a terminating loop as a basic construct. Define it instead as a standard > macro. Ada's infinite variety of control structures is awful.) Mine are even more heretical. I insist on goto, and frequently terminate a loop by jumping out of it. Spaghetti algorithms call for spaghetti code, and I have lots of them. Structured programming can cause huge inefficiencies, as well as being harder to understand. > 15. Structure and code generation rules: Variables are in memory in the > order of declaration. > > Yeah. I very much want more control over stack allocation and control > flow than in C. This is not dealt with by any current language and needs > a lot of thought. One idea I've been considering is replacing function > types with statement types. This makes setjmp/longjmp, multiple function > entry points, and various other techniques much cleaner. The problem is, > once again, how and when to allocate stack variables. DO NOT INSIST ON PASSING ARGUMENTS WITH STACKS. Register arguments are frequently better, and there are numerous other ways, such as argument arrays. There are situations where stacks are the way to do it, but memory references in general should be avoided where possible. ....................... > 16. Basic types: int bits, uint bits > > I disagree. The basic types should be those types that the machine can > handle quickly. The language must be efficient! It's perfectly fine to > have a standard notation for ``a type long enough to handle N bits'' or > ``how many bits are in type X?'' but the language should not make > restrictions on the size of basic types. A type need not even consist of adjacent elements. If a string requires a beginning address and a length, the pair is the designator of the string. It may or may not be desirable to have the indices in adjacent memory locations.
An array descriptor would have the location of the 0,0, ..., 0 element, the dimensions, and if necessary the storage locations; these need not be adjacent, and some of this information can be shared. > (Then again, every case in which portability takes second place to > efficiency must be carefully considered and well documented. Two issues > along these lines are bit sizes and the semantics of mod. As I feel very > strongly that the second should be portable, I shouldn't assume that > nobody feels the same way about the first. Then again, wouldn't a > standard notation for your ``int 8'' be enough?) > > What about characters? What about floating-point types, which many > machines support better than ints? What about Ada-like fixed-point > types? It is almost impossible to get full portability on anything other than integer arithmetic, and even here there are problems. ....................... > 19. User-defined precedence. > > This is yet another syntactic preprocessing problem. Remember that the > language should be readable! I suspect that many of the problems are with precedence. I am not sure that we would not be better off without trying to make it rigid. Some of the precedences in C are gotten wrong by just about everybody. We could possibly use numbered parentheses to get around it. > 20. Parameter passing: [ various ideas ] > > This has to be dealt with very carefully. I like C's solution: it's clean > while allowing every trick Ada can do. A general principle here (which > you appear to disagree with) is that the form of a function call can > make clear the fact that a variable is not modified. But it is clumsy and slow. Now if we had a decent notation, so a function could return a list of results (NOT a struct), we could get around this. But trying to keep functions from having side effects is a losing proposition. > 21. Classes equal structures. Inheritance is just including one structure > in another. Function arguments are really structures.
Classes equal typedefs. Structures are needed for more complicated situations. DO NOT insist on function arguments being structures; more time can be wasted by forming the structure than by computing the function. A list of values is more general than a structure by far, and the list need not be in consecutive locations in any way. ...................... > > The general goal > > is to make it both one step above assembly language and completely extendable. > > And modular. And clean. And robust. And likable, even fun to use! All of this can be done, but not portable. Good code can usually at most be semi-portable. ------------------------------------------------------------------------------ Another point that I wish to address is what I call the usurpation of notation. There are many somewhat standard uses of symbols in mathematics which are used in languages such as C for totally different meanings. The two most flagrant ones here are | and ^. I know of several uses of | in mathematics, none of which is "or". The notation by Backus was a long vertical, and it was used in the sense of || in C. The most common use of vertical lines in mathematics is for absolute value, and I believe it should have this meaning in programming as an overloaded operator. The use in mathematics extends for well over 100 years. There are even other uses of ^ in CS before C, none of which was exclusive or. This taking of symbols and defining their use to be something else because the inventors of the language are not sufficiently knowledgeable is a bad idea. Fortran designers avoided this and made it clear that they used * and ** because these were not already used for something else, and what they wanted to use was not available. I suggest that we make every effort to avoid using symbols which have other meanings. Whenever mathematical notation disagrees with that of applications, it is usually the mathematics that got there first. 
Also, make sure that the language is not restricted so that only an idiot can stand it. I believe it is possible to produce a language in which good programming can be done. Something to keep in mind is that people will find ways to do things that you have not thought of. So it is necessary to allow computer objects to be used as bit strings, to allow bitwise operations on floating point numbers, to allow the use of a number as something other than the language intended. Furthermore, the programmer may know things the compiler can use like frequency, etc. The programmer may have a good reason for keeping something in a register, or even insisting it be stored, which it is hard for the compiler to figure out. I have a natural example of a recursive program in which several registers should be kept across recursions. -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
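The recurring example in this post, unpacking a floating point number into its exponent and mantissa by treating it as a bit string, looks like this when the language gets out of the way. A minimal sketch in Python, assuming the IEEE 754 binary64 layout (which is what the struct format code "d" packs in standard mode); unpack_double is a name made up for the illustration.

```python
import struct


def unpack_double(x):
    """Split a binary64 float into (sign, biased exponent, fraction bits)."""
    # Reinterpret the 8 bytes of the float as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF        # 11 exponent bits, bias 1023
    fraction = bits & ((1 << 52) - 1)      # 52 stored fraction bits
    return sign, exponent, fraction


print(unpack_double(1.0))    # (0, 1023, 0)
print(unpack_double(-2.5))   # (1, 1024, 1125899906842624), i.e. fraction = 1 << 50
```

Once the bits are an integer, the ordinary bitwise operators apply, which is the kind of access being asked for.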
machaffi@fred.cs.washington.edu (Scott MacHaffie) (02/20/90)
In article <4489:05:14:19@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
>14. Control flow statements, control structures: [ various ]
>
>I have some rather heretical thoughts on this subject. I'll make them
>clear in another message. (Remember that this isn't Ada. Given an
>infinite loop ... endloop, if, and break, you don't need to provide
>a terminating loop as a basic construct. Define it instead as a standard
>macro. Ada's infinite variety of control structures is awful.)
Unconditional loops have a serious problem: you have to read all of the
code inside the loop to find out when (or if) it terminates. Replacing
while and for with loop would be a bad idea. Even providing loop means that
people will use it and stick "end loop" inside the loop (this happens in
Ada). The advantage of a while/for loop is that the terminating condition
(or at least the standard terminating condition) is easy to find. Then,
only exceptional terminating conditions are inside the loop.
Scott MacHaffie
kjj@varese.UUCP (Kevin Johnson) (02/20/90)
In article <4623:05:31:06@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
>In article <12507@mcdphx.phx.mcd.mot.com> kjj@varese.UUCP (Kevin Johnson) writes:
>> Rhetorical question: Aren't you talking about C++?
>Of course not. C++ isn't even close to perfect.
>Isn't there some change you'd like to make to C++ so that you'd like
>programming even better than you do now? Fine, say so. Then iterate.
>Hopefully almost everyone's wishes will be synthesized into this new,
>still-to-be-named language. (Now there's a name: FOO. Naaah, people
>might remember what foo stands for, and lots of young urban CS types
>will think it stands for something object oriented.)

I don't want to get in a flaming war - but that's an awfully long
response to a rhetorical question :-)
Oh well, that's what I get for not putting on a smiley face when I ask
a rhetorical question :-|

>> Semi-rhetorical question: What would be this language's intended use?
>Similar to C. It will have the ``low-level'' features of C so that
>it's appropriate for systems programming, but there's no particular
>focus. (I use UNIX C for complex numerical programming, so I may be
>biased.)
>> 1. How about string operators.
>> I hate handling allocing of space for something silly like strings...
>This is mainly a library problem (though a good syntax helps).

A good syntax is the crux of the biscuit (to quote a favorite Zappaism).

>> 2. Ability to dynamically define new operators
>Expand. What exactly do you want? We're not talking p-code, you know.
>Are you looking for something that can't be implemented on top of the
>language?

I realize 'We're not talking p-code'...
How about something similar in flavor to OO methods. It doesn't have to
be as lean and mean as the core operators, but having the ability to do
it would be extremely useful...
BTW, doesn't this smell like the string-operator point(s) brought up
earlier? A good syntax helps...

>> 3. Ability to use existing C libraries and headers.
>At least to interface with the loader the same as other languages. As
>for headers: one of the first standard applications will be a program to
>convert C function prototypes to this language. (Having the same macro
>processing is too much to ask, because C's macro processor is so
>limited. But most libraries do fine with just the function interface.)

I agree.

>> Seriously, I would consider the ability to link in existing
>> libraries, one way or another, an absolute must.
>I agree.

Well, now that that's over...

Some of the other replies to your original article mentioned a language
with semi-colons, with indentation providing the information about loop
bodies, etc... HEAR, HEAR! This feature is extremely cheap to provide.

In reference to article <4489:05:14:19@stealth.acf.nyu.edu>:
In general you have my vote (for what it's worth) on your responses in
the article. The following items cause me to input:

>Is there anyone out there who really wouldn't like loop ...
>end/endloop/pool, etc.?

My own personal feeling is that they are contrary to human nature.
Well, maybe not that bad, but... (maybe)

>Anyway, I favor a syntax that doesn't depend on lines or indentation:
>otherwise it's too easy to make syntax errors.

The same can, most definitely, be said of the trad C form...
raymond@twinkies.berkeley.edu (Raymond Chen) (02/20/90)
[I regret that I have but one line to give for my...MUNCH]
One of the few things I like about Pascal is its rigid typing.
If I have defined [using C-style notation]
typedef int xcoord;
typedef int ycoord;
typedef int attrib;
void put_as_at(char c, attrib a, xcoord x, ycoord y) { ... }
then it would be nice if the compiler would flag
{ xcoord x; ycoord y; char c; attrib a;
put_as_at(y,x,a,c);
}
as potentially erroneous (A warning is fine). It's amazing how many
stupid errors are caused by passing parameters to a function in the
wrong order. (And yes, of course, there should be a way to tell
the compiler "No, really, I know what I'm doing, trust me.")
Would also be fun if I could invoke the function above as
put c as a at (x,y)
^^^   ^^   ^^       <- these guys are the function name
(In the never-ending quest to make pseudo-code a proper computer language!)
As for using indentation to indicate block structure: This can lead
to ridiculous code when your indentation marches off the edge of the
paper. It also could create entries into the Obfuscated [language-name]
Code Contest like this:
foo(a,b,c)
  if blah
    while blah
      for blah
        if blah
          for blah
            while blah
              if blah
                for blah
                  if blah
                    while blah
                      if blah
                        grumble
blurfle // guess what indentation level this is at!
bar(x,y,z) // this too. Is it a function call or
// a function declaration?
Apart from having immense fun with indentation, it also causes problems
if you cut and paste a clump of code from one place to another if the
destination has a different indentation level from the source.
If my memory serves me right, this experiment with "indentation determines
block structure" was used in a Pascal dialect many years ago. I don't
remember what eventually happened to it.
--
raymond@math.berkeley.edu mathematician by training, hacker by choice
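In C as it stands, the three typedefs in the post above are all just int, so the compiler has nothing to complain about. One way to get the check today is to wrap each int in its own one-member struct. This is a sketch only: the screen buffer, its 80x25 size, and the body of put_as_at() are assumptions added to make the example self-contained.

```c
#include <assert.h>

/* One-member structs are distinct types; "typedef int" aliases are not. */
typedef struct { int v; } xcoord;
typedef struct { int v; } ycoord;
typedef struct { int v; } attrib;

/* Hypothetical 80x25 character-cell screen, attribute in the high byte. */
enum { COLS = 80, ROWS = 25 };
unsigned short screen[ROWS * COLS];

void put_as_at(char c, attrib a, xcoord x, ycoord y)
{
    assert(x.v >= 0 && x.v < COLS && y.v >= 0 && y.v < ROWS);
    screen[y.v * COLS + x.v] = (unsigned short)((a.v << 8) | (unsigned char)c);
}
```

With these definitions a call such as put_as_at(c, a, y, x) is rejected at compile time because a ycoord is not an xcoord. The "trust me" escape hatch is an explicit conversion, for example the C99 compound literal (xcoord){ y.v }.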
poser@csli.Stanford.EDU (Bill Poser) (02/20/90)
In article <2346@castle.ed.ac.uk> nick@lfcs.ed.ac.uk (Nick Rothwell) writes: > >You're kidding, right?. If you're going to mess around with antiquated >low-level languages like C, I don't see the point of bringing yet >another one into the world. Look at the functional languages like ML,... I don't quite agree. It's true that there are too many low-level languages (and there probably always will be), but there is a role for languages of this type. Many of the new-fangled languages are either nice for limited sorts of tasks (e.g. logic programming languages) or carry with them a lot of overhead and or "protection" from low-level aspects of the machine. However much you may like ML or Prolog or whatever, there is a role for system programming languages, and C is one of the best. Nonetheless, it isn't perfect, so it makes sense to design improved C-like languages. Perhaps eventually some of the nicer higher-level languages will prove to be good for the sorts of applications that C is used for, but that hasn't yet proved to be the case, has it? Another point to consider is how easy it is to try out innovations. Languages that are highly developed in certain directions may be quite difficult to modify in other directions. Lower-level languages like C are often better testbeds for innovations. Consider, for example, the sources of some programming language innovations. Pattern matching and goal-directed evaluation, as in ICON, have their origin in SNOBOL (and to some extent in still more primitive languages like COMIT). SNOBOL's structuring is poor even by the standards of the day (compare ALGOL) but it led to some important and interesting ideas. Similarly, object-oriented programming started off with Simula, a language that in other respects probably wasn't state-of-the-art. And look at how many innovations come from LISP. 
(To ward off the claim that LISP is a high-level language, let me point out that while it was innovative from the beginning, LISP was for many years quite low level, in terms, for example of the primitive nature of its control structures, its lack of aggregate data structures other than lists, and, to take a stand on a religious issue, the lack of useful syntax. It is only in recent years that LISP has acquired more reasonable control structures and other modern conveniences.)
freek@fwi.uva.nl (Freek Wiedijk) (02/20/90)
In article <111355@ti-csl.csc.ti.com> gateley@m2.csc.ti.com (John Gateley) writes:
>In article <8475@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:
><(3) Block structure indicated by indentation level:
><...
>Now, just take it a little further: get rid of all infix notation,
>and let "blocks" be denoted by ( and ) and you get:
> ...
>and you have achieved a truly simple easy to use syntax where you
>don't have to worry about indentation (the editor does it for you),
>and programs which manipulate source code are much much easier to
>write.

Ehrm, no, I don't like this, because it is too verbose:

    % cat foo.foo
    while a!=b
            int q
            q=z*5
            r+=foo(q)
    % cat foo.sch
    (begin
      (while (!= a b)
        (let ((q integer))
          (= q (* z 5))
          (+= r (foo q)))))
    % wc -c foo.*
          36 foo.foo
          92 foo.sch
         128 total
    % bc
    scale=2
    92/36
    2.55

Your solution has two-and-a-half times as many characters! In my
opinion the main advantage of C with respect to Pascal, is that C
enables you to write "int" where Pascal forces you to say "integer" :-)

Also, I don't like this amount of parentheses:

    (+= r (foo q)))))
                ^^^^^

I know that you can let the editor handle it, but it still confuses me.

--
Freek "the Pistol Major" Wiedijk Path: uunet!fwi.uva.nl!freek
#P:+/ = #+/P?*+/ = i<<*+/P?*+/ = +/i<<**P?*+/ = +/(i<<*P?)*+/ = +/+/(i<<*P?)**
kjj@varese.UUCP (Kevin Johnson) (02/21/90)
In article <1990Feb20.025947.16211@agate.berkeley.edu> raymond@math.berkeley.edu (Raymond Chen) writes: >Apart from having immense fun with indentation, it also causes problems >if you cut and paste a clump of code from one place to another if the >destination has a different indentation level from the source. Surely you jest...
peter@ficc.uu.net (Peter da Silva) (02/21/90)
In article <1990Feb20.025947.16211@agate.berkeley.edu> raymond@math.berkeley.edu (Raymond Chen) writes: > put c as a at (x,y) > ^^^ ^^ ^^ <- these guys are the function name > (In the never-ending quest to make pseudo-code a proper computer language!) Sounds like Smalltalk. Except then it'd be: Put: c as: a at: x@y -- _--_|\ Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>. / \ \_.--._/ Xenix Support -- it's not just a job, it's an adventure! v "Have you hugged your wolf today?" `-_-'
nick@lfcs.ed.ac.uk (Nick Rothwell) (02/21/90)
In article <12336@csli.Stanford.EDU>, poser@csli (Bill Poser) writes: >I don't quite agree. It's true that there are too many low-level >languages (and there probably always will be), but there is a role >for languages of this type. Many of the new-fangled languages are either >nice for limited sorts of tasks (e.g. logic programming languages) >or carry with them a lot of overhead and or "protection" from low-level >aspects of the machine. However much you may like ML or Prolog or >whatever, there is a role for system programming languages, and C >is one of the best. True; but the original article said something about programmers being able to agree on the nice features of a language; whereas what you're saying above (and I agree) is that this will never happen, since (for example) I'm going to be hacking away in C on my Mac for a long while yet, even though I think C is dreadful and that programming language design has moved on a long way. I think, though, that any serious time and effort on designing a better C-style language would be better spent getting decent modern languages to a state of maturity. Nick. -- Nick Rothwell, Laboratory for Foundations of Computer Science, Edinburgh. nick@lfcs.ed.ac.uk <Atlantic Ocean>!mcvax!ukc!lfcs!nick ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ...als das Kind, Kind war...
siebren@piring.cwi.nl (Siebren van der Zee) (02/21/90)
gateley@m2.csc.ti.com (John Gateley) writes: >In article <12507@mcdphx.phx.mcd.mot.com> kjj@varese.UUCP (Kevin Johnson) writes: >>1. How about string operators. >> I hate handling allocing of space for something silly like strings... >But, string sizes are not known at compile time, and so must be handled >by the heap (i.e. alloc). Right. Now if you're gonna put dynamic allocation in your language anyway, don't forget to handle "automatic" growing of the stacks in multithreaded environments. This cannot be done by the operating system, since the virtual memory just above the top of the stack that needs to be grown may be used by another thread's stack. The compiler can do this by checking at each procedure invocation that the stack is at least big enough for this frame, and if not, allocate a stackframe at the heap. If this allocation fails, you got a stack overflow. (I did something similar in a multithreading package for the AtariST). You can also take this message as a funny way to try to convince you that you're probably not going to succeed in designing a language that will please everybody. I guess the original poster is probably already convinced by messages suggesting to implement Lisp, C++, ML, and what-have-you-there :-) Siebren, siebren@cwi.nl
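The check-on-entry scheme described above can be simulated in plain C. The sketch below is not Siebren's Atari ST package; the struct layout and the names enter_frame/leave_frame are invented for illustration, frames are just byte counts, and alignment is ignored. It only shows the bookkeeping: if the frame does not fit in the current segment, chain a new one from the heap, and report overflow if the heap refuses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A thread's stack as a chain of heap-allocated segments. */
typedef struct segment {
    struct segment *prev;   /* segment that was active before this one */
    size_t size;            /* usable bytes in data[] */
    size_t used;            /* bytes handed out so far */
    unsigned char data[];   /* C99 flexible array member */
} segment;

typedef struct {
    segment *top;           /* currently active segment, NULL if none */
    size_t segment_size;    /* default size of a new segment */
    int segments;           /* number of live segments, for inspection */
} thread_stack;

/* Procedure entry: reserve frame_size bytes.
   Returns NULL on "stack overflow", i.e. when the heap allocation fails. */
void *enter_frame(thread_stack *ts, size_t frame_size)
{
    segment *s = ts->top;
    assert(frame_size > 0);
    if (s == NULL || s->size - s->used < frame_size) {
        /* The frame does not fit: chain a new segment from the heap. */
        size_t size = frame_size > ts->segment_size ? frame_size
                                                    : ts->segment_size;
        segment *n = malloc(sizeof *n + size);
        if (n == NULL)
            return NULL;
        n->prev = s;
        n->size = size;
        n->used = 0;
        ts->top = n;
        ts->segments++;
        s = n;
    }
    void *frame = s->data + s->used;
    s->used += frame_size;
    return frame;
}

/* Procedure exit: release the most recent frame. */
void leave_frame(thread_stack *ts, size_t frame_size)
{
    segment *s = ts->top;
    assert(s != NULL && s->used >= frame_size);
    s->used -= frame_size;
    if (s->used == 0) {     /* segment is empty again: give it back */
        ts->top = s->prev;
        ts->segments--;
        free(s);
    }
}
```

A real compiler would emit the size comparison inline in each function prologue and switch the stack pointer; the linked list here stands in for that.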
gateley@m2.csc.ti.com (John Gateley) (02/21/90)
In article <447@fwi.uva.nl> freek@fwi.uva.nl (Freek Wiedijk) writes:
>In article <111355@ti-csl.csc.ti.com> gateley@m2.csc.ti.com (John Gateley)
>writes:
>> [Introduces Lisp style syntax]
>Ehrm, no, I don't like this, because it is too verbose:
>
>[character counts deleted]
>
>Your solution has two-and-a-half times as many characters! In my
>opinion the main advantage of C with respect to Pascal, is that C
>enables you to write "int" where Pascal forces you to say "integer" :-)

If you are worried about character counts, you can abbreviate the words
I spelled out in my example; that takes care of some. However, as long
as the verbosity does not clutter the language (as is the case with
COBOL, though some will probably argue that), I feel that it should not
be an issue. You can always custom build an editor with macros to do
your typing for you.

>Also, I don't like this amount of parentheses:
> (+= r (foo q)))))
>I know that you can let the editor handle it, but it still confuses me.

This is the main complaint I have heard with Lisp style syntax. However,
after a few sessions with the editor to learn how to use it, the
confusion goes away. Each paren has a matching paren; that's not
confusing. It's wondering which paren goes with which when you see ))))))
that makes you feel confused. However, with automatic indenting (no
human mistakes), things line up nicely, and with the editor to jump from
any paren to its partner, life becomes easy.

John
gateley@m2.csc.ti.com
jlg@lambda.UUCP (Jim Giles) (02/21/90)
From article <22569:05:10:24@stealth.acf.nyu.edu>, by brnstnd@stealth.acf.nyu.edu: > [...] Take C as a starting point for good ideas and feel free to > use parts of any other language. [...] I'm sorry, but I can't find any good ideas in C. J. Giles
poser@csli.Stanford.EDU (Bill Poser) (02/21/90)
In article <2386@castle.ed.ac.uk> nick@lfcs.ed.ac.uk (Nick Rothwell) writes:
>
>True; but the original article said something about programmers being
>able to agree on the nice features of a language; whereas what you're
>saying above (and I agree) is that this will never happen,

Yes, I agree. Different languages are good for different tasks, and
even within a given area there are real differences in individual
taste. Attempts to design a language that does everything generally
seem to produce unpleasant results (predictable swipe at Ada omitted
for brevity).
jlg@lambda.UUCP (Jim Giles) (02/21/90)
From article <4489:05:14:19@stealth.acf.nyu.edu>, by brnstnd@stealth.acf.nyu.edu:
> [...]
> A line-based syntax also feels very dirty: there are exceptions for
> multiple statements on a line, exceptions for single-statement
> structures, etc.

I don't understand what exceptions you are referring to. If the
recognized statement separator is semicolon, then make the end_of_line
character an alias for semicolon. Now you have a line-based syntax
which is (almost) identical to the non line-based version you started
with. Productivity experiments have shown that people work better if
the end_of_line is also the end_of_statement and the end_of_comment
marker. (And, don't bring up compound statements at this point.
Experiments have also shown that people work better if flow control is
_not_ done with so-called compound statements.)

> 8. Eliminate arrays in favor of pointers and macros.
>
> Say what? You need some way to express the concept of a contiguous
> region of memory. That's what arrays are for. How do pointers cleanly
> express multidimensional arrays? The language should know something
> about arrays, even if just for efficiency.

Hear, hear!!!

> 13. Declarations anywhere.
>
> Yeah.

I third that motion! Compilers are multipass these days everywhere.
Why not take advantage of it?

> 14. Control flow statements, control structures: [ various ]
>
> I have some rather heretical thoughts on this subject. I'll make them
> clear in another message. (Remember that this isn't Ada. Given an
> infinite loop ... endloop, if, and break, you don't need to provide
> a terminating loop as a basic construct. Define it instead as a standard
> macro. Ada's infinite variety of control structures is awful.)

I disagree. Ada's control constructs are among the few things they did
well. Furthermore, they aren't all that complicated - they are
completely defined in the last 7 pages of chapter 5 in the Ada standard
document (and, _MOST_ of that text is examples).

> [...]
> I agree with C's philosophy of only allowing bit packing inside
> structures. Other packing methods would really mangle the concept of
> pointers.

In languages with recursive data types, direct dynamic memory (like
ALLOCATABLE in Fortran 90), and type coercion I've never seen the
need for pointers _AT_ALL_!! So, rejecting something because it
interferes with pointers is a null issue.

> 20. Parameter passing: [ various ideas ]
>
> This has to be dealt with very carefully. I like C's solution: it's clean
> while allowing every trick Ada can do. A general principle here (which
> you appear to disagree with) is that the form of a function call can
> make clear the fact that a variable is not modified.

It is a good idea for side-effects to be visible. On the other hand, it
is also a good idea (and often more important) for aliasing (or the lack
thereof) to be visible. C's solution is anything but 'clean'.

J. Giles
scott@bbxsda.UUCP (Scott Amspoker) (02/21/90)
In article <14241@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: >From article <22569:05:10:24@stealth.acf.nyu.edu>, by brnstnd@stealth.acf.nyu.edu: >> [...] Take C as a starting point for good ideas and feel free to >> use parts of any other language. [...] > >I'm sorry, but I can't find any good ideas in C. Well, here are a few ideas in C (although not unique to C): comments if-then-else looping arrays structures *sigh* I was thinking about joining in on this discussion but now I think I'll pass. -- Scott Amspoker Basis International, Albuquerque, NM (505) 345-5232 unmvax.cs.unm.edu!bbx!bbxsda!scott
ted@nmsu.edu (Ted Dunning) (02/21/90)
In article <12520@mcdphx.phx.mcd.mot.com> kjj@varese.UUCP (Kevin Johnson) writes: In article <1990Feb20.025947.16211@agate.berkeley.edu> raymond@math.berkeley.edu (Raymond Chen) writes: >Apart from having immense fun with indentation, it also causes problems >if you cut and paste a clump of code from one place to another if the >destination has a different indentation level from the source. Surely you jest... actually what he meant is that when you take the cards from one part of the deck and move them to another, you have trouble with indentation. -- Offer void except where prohibited by law.
brnstnd@stealth.acf.nyu.edu (02/21/90)
In article <111357@ti-csl.csc.ti.com> gateley@m2.csc.ti.com (John Gateley) writes: > >3. Ability to use existing C libraries and headers. > This is truly a difficult problem, because you have to say: "whats so > special about C, I want my <foo> libraries" where <foo> might be Ada, > Lisp, PDP-11 assembler, or whatever. No. Under UNIX, for example, one can without any trickery load Fortran, Pascal, and C routines together. This is useful, though it does dictate that the stack be used in particular ways. With N languages running around it's impossible to write N^2 translators. ---Dan
xanthian@saturn.ADS.COM (Metafont Consultant Account) (02/21/90)
Better, make the language strictly postfix, give it exactly one grouping operator, parentheses will do, and you get a regular syntax and lots of help for the compiler. See the on going discussion in comp.lang.forth. For operators where it makes sense, have a default number of arguments, and also an optional compilation (overloading) to handle the variable length list of arguments case. This completely eliminates questions of precedence, etc. For example: a b + is an expression for the sum of a and b. (a b c d) + is an expression for the sum of a, b, c, and d. and a b c Func1 is a fixed number of arguments function call (a b c d) Func2 is a variable number of arguments function call -- xanthian@ads.com xanthian@well.sf.ca.us (Kent Paul Dolan) Again, my opinions, not the account furnishers'.
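To make the proposal concrete, here is a small C evaluator for the two forms shown above: an operator with its default arity of two, and the same operator applied to a parenthesized, variable-length argument list. It is a sketch under stated assumptions: only non-negative integer literals, only '+' and '*', and only one level of parentheses. The name eval_postfix and all of those restrictions are mine, not part of the posting.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

enum { MAX_STACK = 64 };

/* Evaluate a postfix expression over long integers.
   Returns 0 and stores the value in *result, or -1 on malformed input. */
int eval_postfix(const char *s, long *result)
{
    long stack[MAX_STACK];
    int depth = 0;      /* number of values on the stack */
    int mark = -1;      /* stack depth at an open '(', or -1 if none */
    int group = 0;      /* size of a closed group awaiting its operator */
    int grouped = 0;    /* nonzero while such a group is pending */

    assert(s != NULL && result != NULL);
    while (*s != '\0') {
        if (isspace((unsigned char)*s)) {
            s++;
        } else if (isdigit((unsigned char)*s)) {
            char *end;
            if (depth == MAX_STACK || grouped)
                return -1;
            stack[depth++] = strtol(s, &end, 10);
            s = end;
        } else if (*s == '(') {
            if (mark >= 0 || grouped)
                return -1;          /* nesting is not handled in this sketch */
            mark = depth;
            s++;
        } else if (*s == ')') {
            if (mark < 0)
                return -1;
            group = depth - mark;   /* everything pushed since '(' */
            grouped = 1;
            mark = -1;
            s++;
        } else if (*s == '+' || *s == '*') {
            char op = *s;
            int n = grouped ? group : 2;    /* default arity is two */
            long acc = (op == '+') ? 0 : 1;
            if (mark >= 0 || n < 1 || n > depth)
                return -1;
            while (n-- > 0) {
                long v = stack[--depth];
                acc = (op == '+') ? acc + v : acc * v;
            }
            stack[depth++] = acc;
            grouped = 0;
            s++;
        } else {
            return -1;
        }
    }
    if (depth != 1 || mark >= 0 || grouped)
        return -1;
    *result = stack[0];
    return 0;
}
```

Note that no precedence table appears anywhere: the order of the tokens is the order of evaluation, which is the point being argued.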
kevin@argosy.UUCP (Kevin S. Van Horn) (02/22/90)
In article <14242@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: > >In languages with recursive data types, direct dynamic memory (like >ALLOCATABLE in Fortran 90), and type coersion I've never seen the >need for pointers _AT_ALL_!! So, rejecting something because it >interferes with pointers is a null issue. > Would you care to expand on this? I'm not sure what "direct dynamic memory" is, for starters. ------------------------------------------------------------------------------ Kevin S. Van Horn | The means determine the ends. kevin@argosy.maspar.com |
jlg@lambda.UUCP (Jim Giles) (02/22/90)
From article <628@bbxsda.UUCP>, by scott@bbxsda.UUCP (Scott Amspoker): > In article <14241@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: > [...] >>I'm sorry, but I can't find any good ideas in C. > > Well, here are a few ideas in C (although not unique to C): > [...] I'm sorry, I should have been more clear. I can't find any good ideas in C which aren't done as well or better (usually better) in many other languages. This includes languages which predate the invention of C. > [...] > arrays > [...] C doesn't _have_ arrays! It has a strange variant of pointers which can (on rare occasions) simulate arrays in a way that is almost as efficient and easy to read as arrays would have been. Usually, however, the simulated arrays are less efficient and cumbersome to use. J. Giles
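The complaint is easier to judge with the mechanics in view. In C an array keeps its length only inside the scope that declares it; passed to a function it becomes a pointer to its first element, and a two-dimensional layout has to be indexed by hand if only that pointer is available. A minimal sketch follows; the helper names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Works only where the array declaration itself is visible. */
#define COUNT_OF(a) (sizeof (a) / sizeof (a)[0])

/* Across a call the array has decayed to a pointer, so the length
   travels as a separate argument. */
long sum(const int *p, size_t n)
{
    long total = 0;
    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += p[i];              /* p[i] is defined as *(p + i) */
    return total;
}

/* A rows-by-cols matrix simulated in flat storage: the callee must be
   told the row length and does the index arithmetic itself. */
int at(const int *base, size_t cols, size_t row, size_t col)
{
    assert(base != NULL && col < cols);
    return base[row * cols + col];
}
```

For int v[4], COUNT_OF(v) is 4 in the declaring scope, but inside sum() nothing about p records how many elements it points at.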
gateley@m2.csc.ti.com (John Gateley) (02/22/90)
In article <10979@saturn.ADS.COM> xanthian@saturn.ADS.COM (Metafont Consultant Account) writes: > >Better, make the language strictly postfix, give it exactly one So, why is postfix better than prefix? One other poster mentioned postfix, and made the claim that it was better than prefix as well, and I am curious why y'all think so. (i'd take postfix over infix any day, but prefer prefix because I am used to it). John gateley@m2.csc.ti.com
cg@myrias.com (Chris Gray) (02/22/90)
In article <22569:05:10:24@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
>So what do you want in a compiled, imperative, perhaps object-oriented
>language? Take C as a starting point for good ideas and feel free to
>use parts of any other language. Remember: This isn't Ada. If it gets
>too complicated, trash it. Simple is beautiful. Modular design is
>beautiful. And above all, remember that this is going to be a language
>people can actually like.

The first thing to decide is a bit more detail on what kind of language
you are after - many of the decisions about features are affected by
that. Your hope that one language will satisfy most programmers is
pretty well doomed to failure - programmers are much too particular.
For example, the description posted by Joseph H Allen represents a
language I would be very uncomfortable with. One goal for ANY language
is that it be quickly readable by anyone, whether they are familiar with
that class of language or not. Another goal for a compilable language is
that it be reasonably compilable. C's main problem with compilation is
that the syntax is so ambiguous that a single error (try putting a
semicolon after a function header in gcc!) can lead to hundreds of error
messages. Some of Mr. Allen's ideas would be even worse in terms of
error recovery (and, in my opinion, in terms of readability).

So as to contribute a different viewpoint to this discussion, let me try
to summarize my language. It has some problems, but I and others have
found it to be quite usable. The language is called "Draco" and it
currently exists only for prehistoric CP/M computers and for the
Commodore Amiga. Its syntax is somewhat like that of Algol68, but with
much less overloading. Its semantics is somewhat like C's, but it is
much more strongly typed.

Draco is a very "dull" language - it uses regular rules for identifiers,
has pretty standard syntax and semantics, is only slightly extendible,
and has no especially interesting new ideas. It is very easy to parse,
and fairly simple to generate good code for. Most people have no trouble
at all reading it for the first time (given experience in C, Pascal,
Algol, whatever). Rather than try to give a grammar, I'll just type in
a mishmash program that tries to use most of the features.

#drinc:dos/libraries.g /* include a system header */

extern fred(int i; short j; long k; char c; bool flag)float;

uint MAX_DISKS = 10;

proc hanoi(unsigned MAX_DISKS n; *char left, middle, right)void:

    if n ~= 0 then
        hanoi(n - 1, left, right, middle);
        writeln("Move disk ", n, " from ", left, " peg to ", right, " peg.");
        hanoi(n - 1, middle, left, right);
    fi;
corp;

proc doit()void:

    hanoi(5, "left", "right", "center");
corp;

proc matmult([*,*] float a; [*,*] float b; [*,*] float c)bool:
    register uint i, j, k;
    float sum;

    if dim(a, 2) ~= dim(b, 1) or dim(a, 1) ~= dim(c, 1) or
        dim(b, 2) ~= dim(c, 2) /* did I get this right???? */
    then
        false
    else
        for i from 0 upto dim(a, 1) - 1 do
            for j from 0 upto dim(b, 2) - 1 do
                sum := 0;
                ...
            od;
        od;
        true
    fi
corp;

/* actually all declarations have to be inside functions or before all
   functions, but this is only an example */

type
    array_t = [MAX_DISKS * 3, MAX_DISKS + 17] uint,
    element_t = unknown 100, /* 100 bytes long */
    list_t = struct {
        *list_t l_next;
        element_t l_this;
    },
    thingType_t = enum {red, blue, green}, /* NOT just ints */
    otherThingType_t = union {
        *char ott_name;
        long ott_counter;
        *somethingorothertype ott_default;
    };

/* I won't try to do a user extendible type here - the compiler comes with
   a complex number package that I can use like: */

complex I = (0.,1.);

complex a, b, c;
*list_t BaseList;
[0b101] struct {
    *char name;
    thingType_t colour;
} words := {
    {"red", red},
    {"blue", blue},
    {"green", green},
    {"white", red},
    {"", red}
};

proc insert(element_t e)void:
    register *list_t l;

    l := new(list_t); /* language construct */
    l*.l_next := BaseList;
    l*.l_this := e;
    BaseList := l;
corp;

proc constructs(**[2,3,4][5,6]*float youGottaBeKidding)void:
    thingType_t tt;
    *list_t l;
    ulong n;

    a := b + c;
    if re(a) < 0.0 then
        writeln("a = ", a);
    elif a = complex(1.1, 1.1) then
        readln(a, b, c);
    fi;
    l := BaseList;
    while l ~= nil do
        ...
        l := l*.l_next;
    od;
    case tt
    incase red:
        writeln("It was red");
    incase green:
        writeln("It was green");
    incase blue:
        writeln("It was blue");
    default:
        writeln("Somebody boobooed!\(BEL)\(0x07)\(0x06+2)");
    esac;
    l := if tt = red then nil else l*.l_next fi;
    if tt = blue then return fi;
    while
        write("This is a prompt. Enter two integers: ");
        readln(i, j)
    do
        writeln("You entered ", i, '(', i:x:8, ':', i:b:32, ':', i:o:7, ")");
    od;
    n := (0xfdecba98 & 0o237458 >< 0b100000001) << ~n;
    l := pretend(128, *list_t) + 6 * sizeof(list_t);
corp;

Anyone who wants to try out the language and compiler is invited to go
out and buy an Amiga - the compiler, etc. are available freely.

--
Chris Gray Myrias Research, Edmonton +1 403 428 1616
cg@myrias.COM {uunet,alberta}!myrias!cg
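For readers who want to compare against the C baseline the thread keeps returning to, here is roughly how the Draco hanoi procedure above might look in C. This is a sketch, not a translation endorsed by the Draco author: the move count returned by the function is an addition so that the result can be checked, and the MAX_DISKS bound on n has no direct C equivalent and is dropped.

```c
#include <assert.h>
#include <stdio.h>

/* Move n disks from the left peg to the right peg, printing each move.
   Returns the number of moves made (added for this sketch). */
unsigned long hanoi(unsigned n, const char *left, const char *middle,
                    const char *right)
{
    unsigned long moves;

    assert(left != NULL && middle != NULL && right != NULL);
    if (n == 0)
        return 0;
    moves = hanoi(n - 1, left, right, middle);
    printf("Move disk %u from %s peg to %s peg.\n", n, left, right);
    moves += 1;
    moves += hanoi(n - 1, middle, left, right);
    return moves;
}
```

The visible differences are small: writeln's typed argument list becomes a printf format string, and if/fi becomes an early return.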
djones@megatest.UUCP (Dave Jones) (02/22/90)
From article <14242@lambda.UUCP>, by jlg@lambda.UUCP (Jim Giles):
> Experiments have also
> shown that people work better if flow control is _not_ done with
> so-called compound statements.)
>

Please elaborate.

>> ...
>> I agree with C's philosophy of only allowing bit packing inside
>> structures. Other packing methods would really mangle the concept of
>> pointers.
>
> In languages with recursive data types, direct dynamic memory (like
> ALLOCATABLE in Fortran 90), and type coercion I've never seen the
> need for pointers _AT_ALL_!! So, rejecting something because it
> interferes with pointers is a null issue.
>

Huh? I think somebody missed the point. Those 'recursive data
structures' etc. are just teeming with pointers. Whether the programmer
declares them, or the compiler sneaks them in, the hardware still wants
them to point to properly aligned data.
brnstnd@stealth.acf.nyu.edu (02/22/90)
In article <1873@wrgate.WR.TEK.COM> daniels@teklds.WR.TEK.COM (Scott Daniels) writes:
> A pet peeve: why only decimal radix for floating point?

Yeah. (Anyone know why Ada stops at base 16?)

> Another thing I would like to be able to do is to indicate that a function
> is "pure" (only depends on its args),

I like this.

> On strings: SAIL had strings which consisted of a length and pointer part

Library problem. It should be kept in mind as a syntax issue.

> (1) How about some way to provide structs which have negatively indexed
> fields as well as positively indexed fields.

I'm not sure what you're looking for.

[ struct layout ]

Must this be forced upon the compiler? Quality-of-implementation issues
are important but I can't imagine a portable program taking advantage
of this.

> (3) Allow "anonymous" incorporation of a structure (or union):

Yeah. JHA suggested ``inherit''---makes perfect sense to me.

[ integral[1..29] ]

This seems reasonable.

[ % ]

Would someone please show me an example of a real program that uses C's
% in a context where not knowing the sign would be okay? Until that
example shows up, this argument is purely facetious. It's fine to have
two portable operators with different results, like Ada's rem and mod.

> >I agree with C's philosophy of only allowing bit packing inside structures.
> But, it would be nice to have a packed vector of bits available somewhere
> (inside structures only would be fine with me).

Do you mean an actual array of bits? How would you integrate this with
the normal meaning of arrays?

> Type coercions:
> Something between C's "forget all your type checking" and many other
> languages' "you can't get there from here." How about a coercion that
> has both `from' and `to' parts.

Wait a minute! C's weak typing has nothing to do with the overloading
of its conversion functions. Even Ada, with the strongest typing around,
has overloaded conversions. Merely introducing the above change wouldn't
do anything to C's type checking.

---Dan
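On the % question, the two operators being contrasted differ only when the operands have opposite signs. The sketch below shows both in C, assuming C99 or later, where integer division truncates toward zero and the sign of % is therefore fixed (earlier C left it to the implementation, which is the complaint under discussion). The function names are illustrative.

```c
#include <assert.h>

/* Remainder that takes the sign of the dividend, like Ada's rem.
   In C99 and later this is exactly what % computes. */
int rem_trunc(int a, int b)
{
    assert(b != 0);
    return a % b;
}

/* Modulus that takes the sign of the divisor, like Ada's mod. */
int mod_floor(int a, int b)
{
    int r;

    assert(b != 0);
    r = a % b;
    if (r != 0 && ((r < 0) != (b < 0)))
        r += b;             /* shift into the divisor's sign */
    return r;
}
```

So rem_trunc(-7, 3) is -1 while mod_floor(-7, 3) is 2; for operands of the same sign the two agree.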
brnstnd@stealth.acf.nyu.edu (02/22/90)
In article <10790@june.cs.washington.edu> machaffi@fred.cs.washington.edu.cs.washington.edu (Scott MacHaffie) writes: > Unconditional loops have a serious problem: you have to read all of the > code inside the loop to find out when (or if) it terminates. Replacing > while and for with loop would be a bad idea. Even providing loop > means that people will use it and stick "end loop" inside the loop > (this happens in ada). This is specious. Assume break and if; then while, do, for, and loop are all equivalent, in the sense that each can be written purely syntactically in terms of any of the others. Therefore (in line with the goals of the language, to evolve in another thread) the simplest construction wins. (There are several criteria for choosing ``loop'' as the simplest; but I don't think anyone will argue, so I won't elaborate further.) ---Dan
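The "written purely syntactically in terms of the others" claim can be tried out with C's own preprocessor. This is a sketch only, with macro names chosen here (LOOP, WHILE); it takes an unconditional loop plus if and break as the primitives and derives a terminating loop from them.

```c
#include <assert.h>

/* The single primitive: an unconditional loop, left only by break. */
#define LOOP        for (;;)

/* A terminating loop defined on top of it. The trailing "else" makes the
   statement that follows the macro call the body of the loop. */
#define WHILE(cond) LOOP if (!(cond)) break; else

/* Sum 1..n using the derived loop. */
int sum_to(int n)
{
    int i = 1, total = 0;

    assert(n >= 0);
    WHILE (i <= n) {
        total += i;
        i++;
    }
    return total;
}

/* The same sum written directly with the primitives. */
int sum_to_raw(int n)
{
    int i = 1, total = 0;

    assert(n >= 0);
    LOOP {
        if (i > n)
            break;
        total += i;
        i++;
    }
    return total;
}
```

Both functions compute the same thing; which of the two is easier to read is exactly what the two posters disagree about.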
brnstnd@stealth.acf.nyu.edu (02/22/90)
In article <14242@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: [ make syntax line-based ] I agree! Let's make it column-based, too. And let's reserve a column for indicating comments. Wow, I think we're onto something here. :-) > Productivity > experiments have shown that people work better if the end_of_line is > also the end_of_statement and the end_of_comment marker. Comments, yes. Statements, no. The studies you're referring to compared line-based syntax to Pascal or C, where adding a semicolon after a block or statement can cause a syntax error or change the meaning of the code. Such syntactic problems disappear when loops are explicitly terminated. A study in Computer Languages several years back found that terminated loops ended up with by far the fewest errors per programmer. > Ada's control constructs are among the few thing they did well. Given the lack of any macro facility, they did fine. They would have done much better to provide a standard method to define new control structures, then reduced the standard set to three or four with no special cases. > In languages with recursive data types, direct dynamic memory (like > ALLOCATABLE in Fortran 90), and type coersion I've never seen the > need for pointers _AT_ALL_!! So, rejecting something because it > interferes with pointers is a null issue. Oh, yeah! In a language with tasks, automatic threading, and message passing, I've never seen the need for semaphores _AT_ALL_!! C'mon, be serious. > It is a good idea for side-effects to be visible. One solution is declaring ``pure'' functions. A more general solution is to allow a block to list each outside variable explicitly. This leads to the idea that the compiler should be able to automatically generate such lists within the code. > On the other hand, it > is also a good idea (an often more important) for aliasing (or the lack > thereof) to be visible. C's solution is anything but 'clean'. You have this hangup with aliasing... 
:-) I agree that there should at least be a way to specify that parameters aren't aliased, and to include that information into the function call syntax. This would take care of most of the common cases. ---Dan
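To make the aliasing point concrete, here is a minimal C sketch. The function names are made up for illustration, and the `restrict` qualifier comes from C99, which postdates this thread; it is shown only as one possible way for a caller to promise that parameters are not aliased.

```c
#include <assert.h>

/* Adds *src to *dst twice. Because dst and src may point at the same
   object, the compiler has to reload *src after the first store. */
void add_twice(int *dst, const int *src)
{
    assert(dst != 0 && src != 0);
    *dst += *src;
    *dst += *src;
}

/* Same operation, but the caller promises that the two pointers do not
   alias (C99 'restrict'), so *src may be loaded once and reused.
   Calling this with dst == src is undefined behavior. */
void add_twice_noalias(int *restrict dst, const int *restrict src)
{
    assert(dst != 0 && src != 0);
    *dst += 2 * *src;
}
```

With distinct arguments both functions add twice the source value. With `add_twice(&c, &c)` and `c` equal to 1, the first addition changes the source as well, so `c` ends up as 4 rather than 3; that difference is exactly what the compiler must preserve when it cannot rule out aliasing.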
preston@titan.rice.edu (Preston Briggs) (02/22/90)
In article <24349:04:46:47@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes: >Given the lack of any macro facility, they did fine. They would have >done much better to provide a standard method to define new control >structures, then reduced the standard set to three or four with no >special cases. Consider perhaps the Scheme ideal of lambda, if, call, set!, and catch as a base, then adding all the rest of the control structure with macros. >> In languages with recursive data types, direct dynamic memory (like >Oh, yeah! In a language with tasks, automatic threading, and message >passing, I've never seen the need for semaphores _AT_ALL_!! C'mon, be >serious. It's not such a bad point. ML (and other languages) with recursive data types manage very nicely without (explicit) pointers. Heap allocated data is probably needed, but automatic type coercions aren't. So, how about garbage collection?
khan@milton.acs.washington.edu (I Wish) (02/22/90)
In article <4489:05:14:19@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes: >14. Control flow statements, control structures: [ various ] > >[...] Remember that this isn't Ada. Given an >infinite loop ... endloop, if, and break, you don't need to provide >a terminating loop as a basic construct. Define it instead as a standard >macro. Ada's infinite variety of control structures is awful. This may be a nit-picky detail, but what's the difference whether a terminating loop is a "basic construct" or a "standard macro"? If it's "standard," everyone using ALPAL will have it, and, I believe, use it. If you're trying to discourage this variety of control structures, it seems you'd want to discourage this sort of macro.... and whether it's actually handled in the preprocessor or the compiler proper is an efficiency-of-implementation detail that could be transparent. (Ideally -- I sure *hope* no one wants to redefine "for" (: ) -- "indecipherable strangers handing out inexplicable humiliation and an unidentified army of horsemen laughing at him in his head ..." -- Douglas Adams Erik Seaberg (khan@milton.u.washington.edu)
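For readers who want to see what "a terminating loop as a standard macro" could look like, here is a rough sketch using the C preprocessor. The macro names `LOOP` and `WHILE` are hypothetical, and the C preprocessor is only a stand-in for whatever macro facility the proposed language might have.

```c
#include <assert.h>

/* Hypothetical macros: a terminating loop built from an infinite loop,
   'if', and 'break', as the quoted proposal suggests. */
#define LOOP         for (;;)
#define WHILE(cond)  LOOP if (!(cond)) break; else

/* Sums 1..n using the macro-defined loop. */
int sum_to(int n)
{
    int i = 1, total = 0;
    assert(n >= 0);
    WHILE (i <= n) {
        total += i;
        i++;
    }
    return total;
}
```

From the caller's side `WHILE` reads like a built-in construct, which is the point of the question above: once the macro is standard, users can hardly tell the difference.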
new@udel.edu (Darren New) (02/22/90)
In article <111706@ti-csl.csc.ti.com> gateley@m2.csc.ti.com (John Gateley) writes: >So, why is postfix better than prefix? Well, for one it is much easier to parse. I would estimate it is about as much easier to parse compared to prefix as prefix is to infix. Two, it is much more flexible. For example, in FORTH there is a word (aka procedure, function, ...) called : (pronounced COLON :-) that reads the next string from the input and starts compiling a new function with its name being that string. Much like defun in LISP, except that the syntax is not fixed. With prefix, you must make a distinction between functions that evaluate their arguments and functions that do not (defun, cons, etc). In postfix, the evaluated arguments come first and the non-evaluated arguments come afterwards. Of course, I've been working on my own language that is postfix and completely syntax-free; this may make me see some of this stuff in a prejudiced way. Maybe it is possible to make prefix as flexible as postfix, but I don't know how off hand. -- Darren
mike@cs.arizona.edu (Mike Coffin) (02/22/90)
From article <14244@lambda.UUCP>, by jlg@lambda.UUCP (Jim Giles): > C doesn't _have_ arrays! It has a strange variant of pointers which > can (on rare occasions) simulate arrays in a way that is almost as > efficient and easy to read as arrays would have been. A correction for those readers not familiar with C: the above is not true. Arrays and pointers are different beasts. The confusion arises because array names are *converted* to pointers when passed as parameters and because the [] operator can be used on both. To make an analogy, in both Fortran and C, integers are sometimes converted automatically to reals (floats in C) and many of the same operators apply to integers and reals but that doesn't mean that Fortran and C don't really _have_ an integer data type. -- Mike Coffin mike@arizona.edu Univ. of Ariz. Dept. of Comp. Sci. {allegra,cmcl2}!arizona!mike Tucson, AZ 85721 (602)621-2858
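The conversion described above can be observed with `sizeof`. This is a small sketch with made-up function names; the actual sizes depend on the platform.

```c
#include <assert.h>
#include <stddef.h>

/* In the declaring scope, 'a' is an array: sizeof yields the size of
   the whole array. */
size_t size_in_home_scope(void)
{
    int a[10];
    return sizeof a;            /* 10 * sizeof(int) */
}

/* A parameter written 'int a[10]' is adjusted to exactly this type,
   'int *a', so sizeof yields the size of a pointer. */
size_t size_as_parameter(int *a)
{
    return sizeof a;            /* sizeof(int *) */
}
```

Passing an `int v[10]` to `size_as_parameter` compiles without a cast because the array name is converted to a pointer to its first element at the call.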
gateley@m2.csc.ti.com (John Gateley) (02/23/90)
READABILITY CONSIDERED HARMFUL! In article <635635738.28255@myrias.com> cg@myrias.com (Chris Gray) writes: >One goal for ANY language is that it be quickly readable by anyone, whether >they are familiar with that class of language or not. I disagree with this statement. It might be fair to say that a language should be readable by anyone who is familiar with the basic concepts, but what Chris's statement does is limit a language to concepts that everyone will be familiar with. Anything that is subtle, powerful, or exciting also has the unfortunate drawback that it is harder to read. Consider "call/cc" in Scheme, forward inferencing a la OPS5, backwards inferencing a la Prolog, Combinatory logic, first class functions and ... John gateley@m2.csc.ti.com
toma@tekgvs.LABS.TEK.COM (Tom Almy) (02/23/90)
In article <111706@ti-csl.csc.ti.com> gateley@m2.csc.ti.com (John Gateley) writes: >So, why is postfix better than prefix? >One other poster mentioned postfix, and made the claim that it >was better than prefix as well, and I am curious why >y'all think so. (i'd take postfix over infix any day, but prefer >prefix because I am used to it). Most postfix fanatics (typically Forth programmers, of which I am one) will say postfix is better than prefix (thinking of LISP) because it eliminates the need for all of those parentheses. In fact, parentheses (as grouping operators) are only needed if you don't know how many arguments are needed for a function. You can get rid of most parentheses in a prefix language; for instance, LOGO behaves much like a parenthesis-free LISP. The only real advantage of postfix is that it can be directly executed in a stack architecture. What bothers me is not so much pre vs. in vs. post but what I call mixfix. Mixfix is a hodgepodge of pre/in/post fix notation that can be very confusing. At least LISP is consistently prefix (LOGO isn't). Some offenders: Back in the days of the calculator wars, there was HP touting postfix while TI touted algebraic (which I would consider to be infix dyadic functions and prefix monadic functions, e.g. 4 + sin(-x)). Yet HP wasn't fully postfix -- the register access functions (sto and rcl) were prefix. And TI's monadic functions were really postfix (e.g. 4 + (x - sin)). Forth, the "king of postfix", uses prefix functions for string arguments. C is a hodgepodge of pre/post/in fix, e.g. ++ is either pre or post, * is either pre or in, [] is post. In an integer constant 0 and 0x are pre and L is post. Tom Almy toma@tekgvs.labs.tek.com Standard Disclaimers Apply
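The remark that postfix "can be directly executed in a stack architecture" can be illustrated with a small evaluator. This is only a sketch under simplifying assumptions: operands are non-negative integer literals, tokens are separated by spaces, and the input is well formed. The name `eval_postfix` and the stack bound are made up.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define STACK_MAX 64    /* illustrative bound on the operand stack */

/* Evaluates a space-separated postfix expression with the operators
   + - * /. Operands are pushed; each operator pops two values and
   pushes the result. No parentheses or precedence rules are needed. */
long eval_postfix(const char *s)
{
    long stack[STACK_MAX];
    int top = 0;                        /* number of values on the stack */

    while (*s != '\0') {
        if (isspace((unsigned char)*s)) {
            s++;
        } else if (isdigit((unsigned char)*s)) {
            char *end;
            assert(top < STACK_MAX);
            stack[top++] = strtol(s, &end, 10);
            s = end;
        } else {
            long rhs, lhs;
            assert(top >= 2);
            rhs = stack[--top];
            lhs = stack[--top];
            switch (*s++) {
            case '+': stack[top++] = lhs + rhs; break;
            case '-': stack[top++] = lhs - rhs; break;
            case '*': stack[top++] = lhs * rhs; break;
            case '/': stack[top++] = lhs / rhs; break;
            default:  assert(!"unknown operator");
            }
        }
    }
    assert(top == 1);
    return stack[0];
}
```

For example, `eval_postfix("2 3 4 * +")` evaluates what infix notation would write as 2 + 3 * 4.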
econrad@thor.wright.edu (Eric Conrad) (02/23/90)
From article <10979@saturn.ADS.COM>, by xanthian@saturn.ADS.COM (Metafont Consultant Account):
> Better, make the language strictly postfix, give it exactly one
                                     ^^^^^^^
Why not prefix notation? Prefix notation is more common than postfix in mathematical literature,

    f(x,y,z) rather than (x,y,z)f

I suspect that it is easier to read for those of us used to reading from left to right since it emphasizes the operators rather than the operands. Of course I haven't used an HP calculator in a long time so I am probably prejudiced.
-- Eric Conrad
+----------------------------------------------------------+
| Eric Conrad - Wright State University                    |
| "Progress was all right once, but it went on too long."  |
+----------------------------------------------------------+
jlg@lambda.UUCP (Jim Giles) (02/23/90)
From article <390@argosy.UUCP>, by kevin@argosy.UUCP (Kevin S. Van Horn):
> In article <14242@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>>In languages with recursive data types, direct dynamic memory (like
>>ALLOCATABLE in Fortran 90), and type coersion I've never seen the
>>need for pointers _AT_ALL_!! So, rejecting something because it
>>interferes with pointers is a null issue.
>
> Would you care to expand on this? I'm not sure what "direct dynamic memory"
> is, for starters.

Ok. I'll tackle each feature separately.

I) Recursive data types.

These are things like linked lists, graphs, trees, etc. Internally, they are almost certainly implemented using pointers, although other implementations are possible if you don't mind a constraint against cyclic structures - C.A.R. Hoare claims that cyclic data objects are unstructured monsters that shouldn't be allowed. (Functional languages usually don't allow cyclic data structures.)

As an example, consider a possible declaration of a binary tree data type (all examples are in a Fortran/C-ish syntax - that is, the data types are on the left and a possible initialization is on the right of each declared item):

    type b_tree is recursive
        node value
        b_tree left
        b_tree right
    end type b_tree

and the data type 'node' might be a discriminated union so that the tree could hold a variety of data. Note that the declaration is very like what C does, except that C requires the left and right objects to be pointers.

So what is the advantage of this over pointers? Well, for one thing, after you get used to it, it's easier to use and to read. There are no 'dereferencing' operators and there is no distinction between '->' and '.' for field selectors.

    btree x = null
    btree y = null

Recursive data structures can be empty; these declarations explicitly initialize 'x' and 'y' to be empty (which is probably the default, but it's better to be sure).

    btree a = {1,null,null}

This declares 'a' to be a tree with one node already defined.
It also assumes that data type 'node' is assignment conformable with the integer '1' (which the compiler should check).

    x = btree{2,a,null}
    y = btree{3,x,btree{4,null,null}}
    a.value = 7

These executable statements have built the tree:

          y:3
         /   \
       x:2    :4
       /
     a:7

Note that new nodes are allocated by the type name applied to a conformable list. Nodes are deallocated when there are no references to them (for example: y.right = null will cause the node with the value '4' to be deallocated). Deallocation is done through reference counts or garbage collection. If overhead is a problem, an explicit deallocation could be added to the language. The implicit deallocation has the advantage of completely eliminating two common types of pointer errors: dangling pointers (which still point to memory that some other part of the code has deallocated) and orphaned data (which is not accessible through any pointer but was never returned to the memory manager).

II) Direct dynamic memory.

The concept behind this is simple. If you want an object to be dynamically allocated, you say so in the declaration and then you have to code an allocation statement which must be executed before the object can be used. The Fortran 90 syntax is used in the following example:

    real, allocatable :: x(:,:)
    ...
    allocate (x(100,1000))
    ... code using 'x'
    deallocate (x)
    ...

Between the execution of the ALLOCATE and DEALLOCATE statements, the object behaves (from the user's point of view) _exactly_ the same as a statically allocated object does. In particular, dynamically allocated objects are _NOT_ aliased with any other objects! This means that code can be fully optimized, etc. Note that arrays can be dynamically allocated to any size. Although I didn't show it here, the dimensions of 'x' could have been supplied by any integer expression. For some reason, the Fortran committee restricted the ALLOCATABLE attribute to arrays only.
Obviously, there is no reason that scalar objects can't be dynamically allocated as well. The function ALLOCATED(x) returns whether the array is currently allocated or not. The object is automatically deallocated when control returns from its scope unless the object also has the SAVE attribute.

There are several advantages to this. The obvious one is that no aliasing is involved (or even possible - without pointers). Since ALLOCATE is a statement and not a function call, it is generic in its arguments: no error prone (and oft omitted) type casting on the returned pointer (like C has), and no need to manually scale the memory request by sizeof() multiples. The ALLOCATE statement is more legible as are the uses of the allocated object (no dereferencing operator to mess with, no confusion between the object and its pointer, etc.).

III) Run-time type coercion.

There are two different activities that are both called 'coercion'. One is type conversion (like x = (double) i; in C). You can debate whether the language should be 'strict' and not do any such conversions automatically vs. a 'non-strict' language which allows 'mixed-mode' operations. This is _NOT_ the kind of coercion I am referring to.

When doing system programming it is often necessary to 'ignore' the data type of an object in order to directly manipulate its internal structure. This requires the ability to alter the type-tag that the compiler sees for the object - _WITHOUT_ altering the data itself. This is the type of coercion I'm referring to in this discussion.

'Structured Programming' enthusiasts will claim this is ugly and you shouldn't do it. Unfortunately, it is often the only efficient way to get something done (or, would you rather your customers chose someone else's system?). I won't go into the 'ethical' question here. I am only going to talk about _how_ to do the deed - once you've decided that it's a useful feature.
In C, the usual way is to use a pointer cast:

    double x;
    struct dbl_struct {       /* structure of an IEEE double */
        int sign_bit : 1;
        int expon    : 11;
        int fraction : 52;
    } *p;
    ...
    p = (struct dbl_struct *) &x;
    ...

Unfortunately, this won't work since C is allowed to 'pad' bit-fields in structs. So, C users usually just cast to a char pointer and shift&mask the fields they need. Either way, this is nothing more than (or less than) a run-time EQUIVALENCE statement. But, what is needed has nothing to do with aliasing or pointers! What is really needed is a way to do the type-coercion directly. How about:

    double x;
    map x as struct {
        int sign_bit : 1;
        int expon    : 11;
        int fraction : 52;
    }
    ...
    x = 1.0;              /* x still works as a regular 'double' if no
                             'map' fields are present */
    x.sign_bit = 1;       /* force x negative */
    x.expon = x.expon-3;  /* divide x by 8 */
    ...

This, together with a rule that 'map' structures are never padded, will accomplish what's needed without pointers. Your code will not take a performance hit from possible aliasing, etc.

Notice that all these mechanisms are much more precise than the pointer implementations are. Only the recursive data structures involve possible aliasing - but you could argue that you _want_ to allow aliasing here. Since they are more precise, they are easier to read and to write as well as being easier to compile. I have yet to find any algorithms for which the above kinds of features aren't sufficient (both functionally and for efficiency). So, I don't see the need for pointers at all!

J. Giles
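For comparison, the two 'map' operations above (force the sign negative, divide by 8 through the exponent) can be written in standard C by copying the bits into an integer and using shifts and masks. This is a sketch, not the proposed 'map' feature: it assumes that `double` is an IEEE 754 binary64 value (1 sign bit, 11 exponent bits, 52 fraction bits) and that `uint64_t` exists, and the function names are made up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns x with its sign bit set, i.e. -|x|. */
double force_negative(double x)
{
    uint64_t bits;
    assert(sizeof(double) == sizeof(uint64_t));
    memcpy(&bits, &x, sizeof bits);     /* reinterpret, no conversion */
    bits |= (uint64_t)1 << 63;          /* sign bit is the top bit */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Divides a normal, finite x by 8 by subtracting 3 from the biased
   exponent. The assert rejects zero, subnormals, infinities, NaNs, and
   values whose result would leave the normal range. */
double divide_by_8(double x)
{
    uint64_t bits, expon;
    assert(sizeof(double) == sizeof(uint64_t));
    memcpy(&bits, &x, sizeof bits);
    expon = (bits >> 52) & 0x7FF;       /* 11-bit exponent field */
    assert(expon > 3 && expon < 0x7FF);
    bits -= (uint64_t)3 << 52;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Using `memcpy` on a local copy avoids both the bit-field padding problem and any pointer aliasing of `x`, at the cost of the explicit shift-and-mask arithmetic that the 'map' notation would hide.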
jlg@lambda.UUCP (Jim Giles) (02/23/90)
From article <24349:04:46:47@stealth.acf.nyu.edu>, by brnstnd@stealth.acf.nyu.edu: > In article <14242@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: > [ make syntax line-based ] > I agree! Let's make it column-based, too. And let's reserve a column for > indicating comments. Wow, I think we're onto something here. :-) Don't laugh! There are people seriously considering using indentation level as the flow-control mechanism. If that's not column sensitive, I don't know what is! (By the way, indentation level _has_ been used this way in an existing language: MODCAP (used to be called MADCAP - don't know why they changed the name).) > [... end_of_line == end_of_construct ...] > Comments, yes. Statements, no. The studies you're referring to compared > line-based syntax to Pascal or C, where adding a semicolon after a block > or statement can cause a syntax error or change the meaning of the code. > Such syntactic problems disappear when loops are explicitly terminated. On the other hand, I've seen a study of C which indicated a large percentage of syntax errors was due to missing semicolons. The overwhelming majority of such errors were at the end_of_line. This seems to indicate that even C programmers tend to regard the end_of_line as the end_of_statement (at least subconsciously). Meanwhile, I've done my own study of C code available to me (sources that come with SUN, etc.). Out of over 5000 lines of C, I found only 11 cases of a simple statement which wrapped across a line boundary. One of these 11 was, manifestly, an error! So, it causes a lot of problems and is used very infrequently - why keep it? > A study in Computer Languages several years back found that terminated > loops ended up with by far the fewest errors per programmer. Yes, and the infamous 'Hare' experiments in the 70's showed that programs using 'IF ... ENDIF' had fewer errors than 'IF() GOTO L ... L:', which in turn had _FEWER_ errors than 'IF()BEGIN ... END'. After that, even N. 
Wirth gave up on 'compound-statements' as a flow control mechanism. > [...] >> Ada's control constructs are among the few thing they did well. > > Given the lack of any macro facility, they did fine. They would have > done much better to provide a standard method to define new control > structures, then reduced the standard set to three or four with no > special cases. There _are_ only three or four! IF ... ELSEIF ... ELSE ... ENDIF, LOOP ... END LOOP, CASE ... END CASE, GOTO. That's four. The special cases all apply to loops: the FOR ... LOOP, the WHILE ... LOOP, and the LOOP ... EXIT ... END LOOP. Also, there is a problem with using macros to make control structures: the lack of standardization. If you make up a CASE construct which I find peculiar, I will have difficulty with your code. Giving me the expanded version won't be of much help either - unless it turns into _very_ legible standard code (in which case, why bother inventing a new control structure?). > [... no need for pointers _AT_ALL_ ...] > Oh, yeah! In a language with tasks, automatic threading, and message > passing, I've never seen the need for semaphores _AT_ALL_!! C'mon, be > serious. I am serious! By your argument the following is valid: Oh, yeah! On a machine with conditional jumps, built-in arithmetic, and instructions in memory, I've never seen the need for a Turing machine _AT_ALL_!! C'mon, be serious. And yet, I don't know anyone who does production work on Turing machines. In the language you described above, semaphores would probably _not_ be very heavily used (at least, not directly). And, with a well designed language, pointers would probably not be needed either (at least, not directly). The advantage of these higher level ways of computing is that they allow the programmer to be more _precise_ about what the program does. By the way, the correct analogy for pointers is GOTO statements. 
Pointers and GOTOs are isomorphic to each other when you compare control constructs to data structuring features. Pointers can lead to 'spaghetti' data in the same way that GOTOs lead to 'spaghetti' code. Pointers have more severe disadvantages though: whereas GOTOs are usually restricted to purely local targets, pointers can, and usually do, point outside the local scope. The compiler is therefore completely unable to make any simplifying assumptions about the data flow of any code which involves pointers together with global data or other pointers or even local data which has been passed by reference to some other procedure! J. Giles
jlg@lambda.UUCP (Jim Giles) (02/23/90)
From article <12098@goofy.megatest.UUCP>, by djones@megatest.UUCP (Dave Jones): > [... don't need pointers ...] > > Huh? I think somebody missed the point. Those 'recursive data structures' > etc. are just teaming with pointers. Whether the programer declares them, > or the compiler sneaks them in, the hardware still wants them to point to > properly aligned data. Huh? I think somebody missed the point. Those 'structured' control constructs are just teaming with GOTOs! Whether the programer uses them directly, or the compiler sneaks them in, the hardware still wants them to point to code. The point is (so to speak), in both these cases, if the programmer doesn't code the thing directly the compiler can make simplifying assumptions about the way they are used. Also, the code is more readable and easier to maintain if you use the higher-level constructs. J. Giles
jlg@lambda.UUCP (Jim Giles) (02/23/90)
From article <18172@megaron.cs.arizona.edu>, by mike@cs.arizona.edu (Mike Coffin): > [...] > A correction for those readers not familiar with C: the above is not > true. Arrays and pointers are different beasts. The confusion arises > because array names are *converted* to pointers when passed as > parameters and because the [] operator can be used on both. To make > an analogy, in both Fortran and C, integers are sometimes converted > automatically to reals (floats in C) and many of the same operators apply > to integers and reals but that doesn't mean that Fortran and C don't > really _have_ an integer data type. A correction for those readers not familiar with C: the above is not true. Arrays are pretty useless unless you can pass them around as procedure arguments. C converts all arrays to pointers when passing them to procedures. AND: YOU _CAN'T_ CONVERT THEM BACK ONCE YOU'RE THERE!!!!! They are not treated as arrays anywhere except the scope in which they were declared - and CAN'T be treated as arrays anywhere except their home scope. To make an analogy, it's as if, once an integer was converted to real, it could _never_ be converted back! And normal usage of integers _forces_ you to convert them to real on frequent occasions. So that, in effect, you really _DON'T_ have integers. Fortunately, even C doesn't really do this to integers. But it DOES do the corresponding thing to arrays. J. Giles
new@udel.edu (Darren New) (02/23/90)
In article <14245@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: >I) Recursive data types. > As an example, consider a possible declaration of a binary tree > data type (all examples are in a Fortran/C-ish syntax - that is, > y = btree{3,x,btree{4,null,null}} Ok. Now how do I write the declaration of a function that returns a pointer to the first sub-tree with the value four at its root without winding up with an aliased value? AND how do I do this without using recursion? -- Darren
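For reference, this is roughly what the question looks like when answered in C with explicit pointers: a non-recursive search that returns a pointer to the first subtree whose root holds the wanted value. The result necessarily aliases part of the caller's tree, which is the crux of the question. The struct layout, function name, and stack bound are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct b_tree {
    int value;
    struct b_tree *left;
    struct b_tree *right;
};

#define PENDING_MAX 64   /* illustrative bound on pending nodes */

/* Preorder search without recursion, using an explicit stack. Returns
   a pointer to the first subtree whose root holds 'wanted', or NULL.
   The returned pointer refers into the caller's tree (an alias). */
struct b_tree *find_subtree(struct b_tree *root, int wanted)
{
    struct b_tree *pending[PENDING_MAX];
    int top = 0;

    if (root != NULL)
        pending[top++] = root;
    while (top > 0) {
        struct b_tree *node = pending[--top];
        if (node->value == wanted)
            return node;
        /* Push right before left so that left is visited first. */
        if (node->right != NULL) {
            assert(top < PENDING_MAX);
            pending[top++] = node->right;
        }
        if (node->left != NULL) {
            assert(top < PENDING_MAX);
            pending[top++] = node->left;
        }
    }
    return NULL;
}
```

Any change made through the returned pointer is visible through the original tree as well, so a compiler has to assume the two names may refer to the same node.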
sakkinen@tukki.jyu.fi (Markku Sakkinen) (02/23/90)
In article <18172@megaron.cs.arizona.edu> mike@cs.arizona.edu (Mike Coffin) writes: >From article <14244@lambda.UUCP>, by jlg@lambda.UUCP (Jim Giles): >> C doesn't _have_ arrays! It has a strange variant of pointers which >> can (on rare occasions) simulate arrays in a way that is almost as >> efficient and easy to read as arrays would have been. > >A correction for those readers not familiar with C: the above is not >true. Arrays and pointers are different beasts. The confusion arises >because array names are *converted* to pointers when passed as > ... A further correction for those readers not familiar with C: Giles's posting is a slight exaggeration. In the sense of memory allocation, C does have arrays: if you define an external, static, or automatic array, space is really reserved for it. But arrays certainly are not first-class objects in the same way as records (struct's) are. The confusion between arrays and pointers is perhaps the worst single flaw in C. Markku Sakkinen Department of Computer Science University of Jyvaskyla (a's with umlauts) Seminaarinkatu 15 SF-40100 Jyvaskyla (umlauts again) Finland SAKKINEN@FINJYU.bitnet (alternative network address)
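The contrast with structs can be shown directly: an array wrapped in a struct is passed and returned by value, and keeps its size inside the callee. A small sketch follows; the wrapper type `struct vec`, the length `N`, and the function name are hypothetical.

```c
#include <assert.h>

#define N 4

/* Hypothetical wrapper: a struct whose only member is an array. */
struct vec {
    int elem[N];
};

/* Receives a copy of the whole array. Inside the function the member
   is still an array of N ints, and changes do not affect the caller. */
struct vec doubled(struct vec v)
{
    int i;
    assert(sizeof v.elem == N * sizeof(int));
    for (i = 0; i < N; i++)
        v.elem[i] *= 2;
    return v;
}
```

A bare `int elem[N]` parameter would instead arrive as a pointer, so the callee would be modifying the caller's storage and could not recover the length.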
ted@nmsu.edu (Ted Dunning) (02/24/90)
In article <14246@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
I don't know what is! (By the way, indentation level _has_ been used
this way in an existing language: MODCAP (used to be called MADCAP - don't
know why they changed the name).)
we can only hope that modcap isn't an existing language much longer.
not only did it use a wildly non-standard character set as well as
indentation for blocking, it used variable amounts of white space
around operators to vary their precedence.
--
Offer void except where prohibited by law.
jlg@lambda.UUCP (Jim Giles) (02/24/90)
From article <11911@nigel.udel.EDU>, by new@udel.edu (Darren New): > In article <14245@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: >>I) Recursive data types. >> As an example, consider a possible declaration of a binary tree >> data type (all examples are in a Fortran/C-ish syntax - that is, >> y = btree{3,x,btree{4,null,null}} > > Ok. Now how do I write the declaration of a function that returns > a pointer to the first sub-tree with the value four at its root > without winding up with an aliased value? AND how do I do this > without using recursion? -- Darren Well, since there aren't any pointers, a function which returns a pointer is not a problem - it's impossible. And, who said anything about not using recursion? I didn't. I think recursion is a grand idea! Actually, you didn't read my entire post. I pointed out that recursive data structures probably _SHOULD_ allow aliasing. It was the only one of the set of data structures I gave which did. However, there are implementations of recursive data structures which _DON'T_ allow aliasing. (Many functional programming languages don't allow aliasing - they most certainly _DO_ have recursion in both data and function structures. If you think about it, aliasing isn't an issue in functional programming since assignment isn't allowed.) The advantage of using recursive data structures instead of pointers is mostly notational (no 'dereferencing' operator, no confusion between pointers and the objects they point to, etc.). This makes programs easier to read and maintain. Also, it helps the compiler to determine that the _only_ thing a 'b_tree' (for example) can be aliased to is another 'b_tree'. There isn't even the possibility of a pointer 'cast' accidentally (or deliberately) aliasing a 'b_tree' to a char string or something. This improves the compiler's ability to optimize the code. J. Giles
mike@umn-cs.cs.umn.edu (Mike Haertel) (02/24/90)
In article <14245@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes: >I) Recursive data types. [...] > type b_tree is recursive > node value > b_tree left > b_tree right > end type b_tree This is reasonable, even elegant. My one question is: How do you do mutually recursive data structures, rather than just directly recursive ones? -- Mike Haertel <mike@ai.mit.edu> "We are trying to support small memory machines." -- Larry McVoy
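As a point of comparison, C handles mutually recursive types by allowing a pointer to a struct type that has not been completed yet. The sketch below uses a made-up pair of types (a tree whose children form a forest, and a forest that is a list of trees); how the proposed pointer-free notation would express the same thing is the open question being asked.

```c
#include <assert.h>
#include <stddef.h>

struct forest;                  /* incomplete type, completed below */

struct tree {
    int value;
    struct forest *children;    /* NULL means no children */
};

struct forest {
    struct tree *first;
    struct forest *rest;        /* NULL ends the list */
};

size_t forest_size(const struct forest *f);

/* Counts the nodes of a tree. tree_size and forest_size call each
   other, mirroring the mutual recursion in the types. */
size_t tree_size(const struct tree *t)
{
    assert(t != NULL);
    return 1 + forest_size(t->children);
}

size_t forest_size(const struct forest *f)
{
    size_t n = 0;
    for (; f != NULL; f = f->rest)
        n += tree_size(f->first);
    return n;
}
```

The forward declaration works only because both members are pointers; C cannot embed an incomplete struct by value, which is where the pointer-based and pointer-free approaches differ.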
jlg@lambda.UUCP (Jim Giles) (02/24/90)
From article <3528@tukki.jyu.fi>, by sakkinen@tukki.jyu.fi (Markku Sakkinen): > In article <18172@megaron.cs.arizona.edu> mike@cs.arizona.edu (Mike Coffin) writes: >> [...] Arrays and pointers are different beasts. [...] This is the only part of Mike Coffin's submission which wasn't totally misleading. The question is, if arrays and pointers are different beasts, why does C mangle them together? Why doesn't it provide a mechanism for un-mangling them once the damage has been done? > A further correction for those readers not familiar with C: > Giles's posting is a slight exaggeration. [...] But _only_ a slight exaggeration. Modularity and library use is _very_ important. You can't do all your array manipulation with just globals or restrict your work to the same procedure in which the arrays were declared. In fact, for programs of any really useful size, passing array arguments is vital. In this respect, at least, C really _DOESN'T_ have arrays. > [...] > The confusion between arrays and pointers is perhaps the worst > single flaw in C. I almost agree. But C has an awful lot of flaws. It's hard to choose just this one as _the_ worst. J. Giles
dwp@willett.UUCP (Doug Philips) (02/25/90)
In <11558@nigel.udel.EDU>, new@udel.edu (Darren New) writes: > However, in my language there are > no parsers that cannot be overridden. That is, to parse the language, > each character is read and appended to a buffer. Then each entry in > a "parse" array is called in turn. Once one of the entries recognises > the token in the buffer, it outputs the object code for that token > and clears the buffer. Why aren't the parsers run in parallel (or to that effect?) I take it that the parsers don't have to emit anything if they don't want? How do you do the equivalent of read-time-macros-or-functions? (I don't remember my lisp well enough, but something like reader macros) (or Forth's IMMEDIATE words) > I think that if you can't add new literal types, you don't have a > truely new language. To do this, you must be able to define > internal representations of high-level structures in terms of low-level > structures (like bit strings). One of the nice things about Forth is that it is simple and transparent enough to allow you to do just this. You can rewrite the interpreter or just NUMBER. -Doug --- Preferred: willett!dwp@gateway.sei.cmu.edu OR ...!sei!willett!dwp Daily: ...!{uunet,nfsun}!willett!dwp [in a pinch: dwp@vega.fac.cs.cmu.edu]
dwp@willett.UUCP (Doug Philips) (02/25/90)
In <11864@nigel.udel.EDU>, new@udel.edu (Darren New) writes: [Brief discussion of simplicity of post-fix over pre-fix.] > In postfix, the evaluated > arguments come first and the non-evaluated arguments come afterwards. PostScript is an even cleaner (i.e. more postfix) language than Forth. All you need is a simple way of 'not-eval'ing a word. PostScript has '/foo' which means to push the symbol 'foo', '(asdf)' which is a string literal, and '{'/'}' which delimits code bodies whose address is pushed onto the stack. 'Twould be nice to avoid any kind of prefixing at all, but I don't yet see a way around it. -Doug --- Preferred: willett!dwp@gateway.sei.cmu.edu OR ...!sei!willett!dwp Daily: ...!{uunet,nfsun}!willett!dwp [in a pinch: dwp@vega.fac.cs.cmu.edu]
dwp@willett.UUCP (Doug Philips) (02/25/90)
In <111800@ti-csl.csc.ti.com>, gateley@m2.csc.ti.com (John Gateley) writes: > In article <635635738.28255@myrias.com> cg@myrias.com (Chris Gray) writes: > >One goal for ANY language is that it be quickly readable by anyone, whether > >they are familiar with that class of language or not. > > I disagree with this statement. It might be fair to say that > a language should be readable by anyone who is familiar with > the basic concepts, but what Chris's statement does is limit > a language to concepts that everyone will be familiar with. > Anything that is subtle, powerful, or exciting also has the unfortunate > drawback that it is harder to read. Consider "call/cc" in Scheme, > forward inferencing a la OPS5, backwards inferencing a la Prolog, > Combinatory logic, first class functions and ... I agree with John Gateley. If you limit yourself to what is already 'familiar' you can't do anything really new. Of course, it may be that, in the end, you end up coming back to something that is familiar. There is a problem of difference-distance. It will be more confusing to have something that looks familiar but doesn't work/act familiar than to have something different that both looks and works/acts different. Anyway, I don't think you necessarily need to aim at the bizarre, but it oughta be an option. BTW: I personally find readability to be tied to the regularity and simplicity of the language. Of course once you allow extensibility at the syntactic level you open the door for people to write incomprehensible gobbledygook. A clear case of contradictory 'goods'! -Doug --- Preferred: willett!dwp@gateway.sei.cmu.edu OR ...!sei!willett!dwp Daily: ...!{uunet,nfsun}!willett!dwp [in a pinch: dwp@vega.fac.cs.cmu.edu]
dwp@willett.UUCP (Doug Philips) (02/25/90)
In <6925@tekgvs.LABS.TEK.COM>, toma@tekgvs.LABS.TEK.COM (Tom Almy) writes:
> Most postfix fanatics (typically Forth programmers, of which I am one) will
> say postfix is better than prefix (thinking of LISP) because it eliminates
> the need for all of those parentheses. In fact, parentheses (as grouping
> operators) are only needed if you don't know how many arguments are needed
> for a function. You can get rid of most parentheses in a prefix language,
> for instance LOGO behaves much like a parenthesis-free LISP. The only
> real advantage of postfix is that it can be directly executed in a stack
> architecture.

There is another issue: eval time. In prefix languages you can have the 'function' itself control when it evaluates its arguments. The advantage of post-fix over pre-fix, for me, is that I find the clutter of parens gets in my way. It may be that I never stayed with LISP long enough to chunk that kind of processing.

> What bothers me is not so much pre vs. in vs. post but what I call mixfix.
> Mixfix is a hodgepodge of pre/in/post fix notation that can be very
> confusing. At least LISP is consistently prefix (LOGO isn't).

Yes, consistency is a big win. It conflicts with extensibility. Are you really using an extensible language if you can't extend the syntax?

> Forth, the "king of postfix" uses prefix functions for string arguments.

PostScript does too, but perhaps in a more consistent way. There is no reason, a priori, that a Forth dialect can't be built which does this. This kind of extensibility is what makes Forth interesting to me. However, it is also easy to abuse.

-Doug
---
Preferred: willett!dwp@gateway.sei.cmu.edu OR ...!sei!willett!dwp
Daily: ...!{uunet,nfsun}!willett!dwp [in a pinch: dwp@vega.fac.cs.cmu.edu]
dwp@willett.UUCP (Doug Philips) (02/25/90)
In <1106@thor.wright.EDU>, econrad@thor.wright.edu (Eric Conrad) writes:
> Why not prefix notation? Prefix notation is more common than postfix
> in mathematical literature,
>     f(x,y,z) rather than (x,y,z)f
> I suspect that it is easier to read for those of us used to reading
> from left to right since it emphasizes the operators rather than the
> operands.

Whoa. Let's avoid the left-to-right-is-better ethnocentrism trap here.

My personal inclination is to try new things, since that is the easiest way to get new perspectives and to evaluate existing practices. It may be that mathematics is the way to go, since it has been around for how many orders of magnitude longer than programming? On the other hand, it may be impossible to make new insights or breakthroughs without abandoning the old ways.

-Doug

P.S. My personal viewpoint is a fixation on post-fix. This is due to my perspective of 10 years of C programming and my recent discovery of Forth. From what I know of PostScript it is an even cleaner post-fix language, in that it is more regular and consistent in being post-fix. I'd be curious to hear what other people's perspectives are.
---
Preferred: willett!dwp@gateway.sei.cmu.edu OR ...!sei!willett!dwp
Daily: ...!{uunet,nfsun}!willett!dwp [in a pinch: dwp@vega.fac.cs.cmu.edu]
cik@l.cc.purdue.edu (Herman Rubin) (02/26/90)
In article <17349:05:24:18@stealth.acf.nyu.edu>, brnstnd@stealth.acf.nyu.edu writes:
> In article <111357@ti-csl.csc.ti.com> gateley@m2.csc.ti.com (John Gateley) writes:
> > >3. Ability to use existing C libraries and headers.
> > This is truly a difficult problem, because you have to say: "what's so
> > special about C, I want my <foo> libraries" where <foo> might be Ada,
> > Lisp, PDP-11 assembler, or whatever.
>
> No. Under UNIX, for example, one can without any trickery load Fortran,
> Pascal, and C routines together. This is useful, though it does dictate
> that the stack be used in particular ways.
>
> With N languages running around it's impossible to write N^2 translators.

It would only be necessary for each compiler to be able to use the calling sequences of the others. This can come up even with one language; the Fortran compilers on the CDC 6x00 could not call each other's subroutines. It is not likely to be the case that each compiler uses a different calling sequence, but each must be able to use the appropriate sequences of the others it calls, or a linking module must be able to change one into another.

With the systems I used before UNIX, this was the only problem. But the UNIX implementations and compilers I have seen change global names. I do not know to what extent this is universal, but C (and I have been told Pascal) prepends an underscore to every global name, and Fortran both prepends and appends an underscore. I have not seen an object file editor which can change names to ones of different length.

Now I have no problem in using these names to call Fortran from C and vice versa. But suppose one wishes to produce a subroutine to be called from different languages? If the names were unchanged, and calling sequences could be supplied to the compilers, this would be easy. I suggest we try to make it easy.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
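The name-changing problem described above is easy to see once the decoration rules are written down as data. This is a small Python sketch, not a description of any particular toolchain: the table simply encodes the rules as the post reports them (C and, reportedly, Pascal prepend an underscore; Fortran adds one at both ends), and the function names `link_name` and `name_to_write` are made up for the illustration.

```python
# Decoration rules as reported in the post above; real toolchains vary.
DECORATION = {
    "c":       ("_", ""),
    "pascal":  ("_", ""),    # second-hand, per the post
    "fortran": ("_", "_"),
}

def link_name(source_name, language):
    """Symbol the linker sees for a global written as source_name."""
    prefix, suffix = DECORATION[language]
    return prefix + source_name + suffix

def name_to_write(target_symbol, caller_language):
    """Source-level name a caller must use to reach target_symbol,
    or None if no name in that language decorates to it."""
    prefix, suffix = DECORATION[caller_language]
    if not (target_symbol.startswith(prefix) and target_symbol.endswith(suffix)):
        return None
    core = target_symbol[len(prefix):len(target_symbol) - len(suffix)]
    return core or None

sym = link_name("gamma", "fortran")
print(sym)                                # _gamma_
print(name_to_write(sym, "c"))            # gamma_
print(name_to_write(link_name("gamma", "c"), "fortran"))   # None
```

Under these assumed rules a C caller can reach a Fortran routine by spelling the trailing underscore itself, but a C routine is reachable from Fortran only if its author chose a name ending in an underscore. A single subroutine meant to be called from several languages would need one symbol per convention, which is the difficulty the post asks a new language to avoid.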
cik@l.cc.purdue.edu (Herman Rubin) (02/26/90)
In article <24123:04:14:07@stealth.acf.nyu.edu>, brnstnd@stealth.acf.nyu.edu writes:
> In article <10790@june.cs.washington.edu> machaffi@fred.cs.washington.edu.cs.washington.edu (Scott MacHaffie) writes:
< > Unconditional loops have a serious problem: you have to read all of the
< > code inside the loop to find out when (or if) it terminates. Replacing
< > while and for with loop would be a bad idea. Even providing loop
< > means that people will use it and stick "end loop" inside the loop
< > (this happens in ada).
>
> This is specious. Assume break and if; then while, do, for, and loop
> are all equivalent, in the sense that each can be written purely
> syntactically in terms of any of the others. Therefore (in line with
> the goals of the language, to evolve in another thread) the simplest
> construction wins. (There are several criteria for choosing ``loop''
> as the simplest; but I don't think anyone will argue, so I won't
> elaborate further.)

This is correct but bad. Any machine can be simulated in a small universal Turing machine with few states, but the operations will run very slowly. Also, there are situations where anything other than goto is very inefficient. Another one omitted is a "break to stage ..." which may close several blocks at once.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
new@udel.edu (Darren New) (02/27/90)
In article <551.UUL1.3#5129@willett.UUCP> dwp@willett.UUCP (Doug Philips) writes:
>
>> However, in my language there are
>> no parsers that cannot be overridden. That is, to parse the language,
>> each character is read and appended to a buffer. Then each entry in
>> a "parse" array is called in turn. Once one of the entries recognises
>> the token in the buffer, it outputs the object code for that token
>> and clears the buffer.
>Why aren't the parsers run in parallel (or to that effect?)

Because I'm running on a uni-processor :-). Also, the programmer has control over the order of items in the list. If you want the mention of a function to create the function, then this could be put in front of the routine to compile the call to the function. There are other transformations too: e.g., all functions are upper-case only, so there is a parser that uppercases the buffer and then claims to not recognise it. (Not really, but an example.)

>I take it that
>the parsers don't have to emit anything if they don't want?

The parsers are functions. They return either a special null value if the buffer is not recognised or the "thunk" of the object they recognised.

>How do you
>do the equivalent of read-time-macros-or-functions? (I don't remember my
>lisp well enough, but something like reader macros) (or Forth's IMMEDIATE
>words)

The parser itself reads ahead in the input stream. For example, for the "procedure" type, an opening left brace is recognised and then the parser is called RECURSIVELY on the input stream to compile the executable procedure. Any sub-procedures look at the return stack (yuck) or a special variable to see if they are inside of another procedure and clear the "executable" bit. The "if" command goes:

    u @ 0 < { true part } { false part } if

and "if" takes three parameters: a condition and two procedures. The internal coding of IF is

    { /* IF routine */
      012->120          /* ROT in FORTH */
      'true t-throw     /* goto label 'true if condition is true */
      01->1             /* swap drop */
      execute           /* execute the false part */
      'end throw        /* skip to end */
      'true label
      01->0             /* drop */
      execute
      'end label
    } 'if define

Note that 'true is parsed by creating an entry in the symbol table if not already there and then returning a "thunk" to it. Hence, "'true" pushes the symbol table entry for "true" on the stack. "'true label" just pops off 'true from the stack, but leaves a code sequence that "throw" can find by looking in and modifying the return stack (which is of course directly accessible).

Also note that 012->120 is a function created by a parser: if the normal lookup-in-dictionary fails, there is a parser which will recognise a string of digits, an arrow, and a string of digits and will build code to make the indicated transformation. Then the name will be installed in the dictionary so other lookups will use the same code. I do overloaded procedures and message sends the same way.

>> I think that if you can't add new literal types, you don't have a
>> truly new language. To do this, you must be able to define
>> internal representations of high-level structures in terms of low-level
>> structures (like bit strings).
>One of the nice things about Forth is that it is simple and transparent
>enough to allow you to do just this. You can rewrite the interpreter or
>just NUMBER.

So's mine. Mine is simple enough to add new parsers WITHOUT having to rewrite the entire world. One of the advantages of this language (which I call 2OL) is that the dynamic memory handling is much better than in FORTH, allowing things like 012->0221 to generate functions by allocating memory in a separate area from the procedure currently compiling.
-- Darren

P.S. I should make clear that the implementation, for various non-programming related reasons, has not progressed much beyond the design phase. I'm considering making the design and code-so-far public.
-- Darren
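The parser list described in this post is small enough to sketch. The Python below is an approximation under stated assumptions, not the 2OL implementation (which, per the post, barely exists yet): tokens are split on whitespace instead of being accumulated character by character, a "thunk" is just a Python function that acts on a list used as the stack, `None` plays the special null value, and all function names are invented. It does follow the post on the key points: parsers are tried in order, the first one to recognise the buffer wins, and a word such as 012->120 is built on first use and then installed in the dictionary.

```python
import re

dictionary = {}   # name -> thunk (a function that acts on the stack)

def parse_known_word(buf):
    """Ordinary dictionary lookup; None means 'not recognised'."""
    return dictionary.get(buf)

def parse_integer(buf):
    if re.fullmatch(r"-?\d+", buf):
        value = int(buf)
        return lambda stack: stack.append(value)
    return None

def parse_shuffle(buf):
    """Recognise digits->digits and build the stack shuffle it names."""
    m = re.fullmatch(r"(\d+)->(\d+)", buf)
    if not m:
        return None
    src, dst = m.group(1), m.group(2)
    # Assumption: src digits are distinct labels for the top len(src) items,
    # deepest first, and dst may only mention labels that occur in src.
    if len(set(src)) != len(src) or any(d not in src for d in dst):
        return None
    n = len(src)

    def shuffle(stack):
        if len(stack) < n:
            raise IndexError("stack underflow")
        taken = stack[len(stack) - n:]
        del stack[len(stack) - n:]
        stack.extend(taken[src.index(d)] for d in dst)

    dictionary[buf] = shuffle     # later lookups reuse the same code
    return shuffle

# The programmer controls this order; earlier parsers get first refusal.
parsers = [parse_known_word, parse_integer, parse_shuffle]

def compile_token(buf):
    for parser in parsers:
        thunk = parser(buf)
        if thunk is not None:
            return thunk
    raise SyntaxError(f"unrecognised token: {buf!r}")

def run(source, stack=None):
    stack = [] if stack is None else stack
    for token in source.split():
        compile_token(token)(stack)
    return stack

print(run("1 2 3 012->120"))      # [2, 3, 1]
print(run("7 8 01->1"))           # [8]
print("012->120" in dictionary)   # True
```

Adding a new literal type in this scheme means appending one more function to `parsers`; nothing else has to be rewritten, which is the property the post claims over replacing Forth's NUMBER.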
phipps@garth.UUCP (Clay Phipps) (02/27/90)
In article <5184@brazos.Rice.edu>, preston@titan.rice.edu (Preston Briggs) wrote:
>
>In article <24349:04:46:47@stealth.acf.nyu.edu>,
>brnstnd@stealth.acf.nyu.edu (Dan Bernstein) wrote:
>
>>>In languages with recursive data types [etc.,
>>>pointers are never or rarely needed]
>>
>>Oh, yeah! [...]
>
>It's not such a bad point. [...] [L]anguages [] with
>recursive data types manage very nicely without (explicit) pointers.
 ^^^^^^^^^^^^^^^^^^^^

You Q fans owe it to yourselves to read C.A.R. Hoare's "Notes on data structuring", in the 1972 Dahl, Dijkstra, Hoare classic _Structured Programming_.
--
[The foregoing may or may not represent the position, if any, of my employer, ]
[ who is identified solely to allow the reader to account for personal biases.]
[Besides, this article was written and posted after normal business hours.]

Clay Phipps
Intergraph APD: 2400#4 Geng Road, Palo Alto, CA 94303; 415/852-2327
UseNet (Intergraph internal): ingr!apd!phipps
UseNet (external): {apple,pyramid,sri-unix}!garth!phipps
EcoNet: cphipps
cik@l.cc.purdue.edu (Herman Rubin) (02/27/90)
In article <1106@thor.wright.EDU>, econrad@thor.wright.edu (Eric Conrad) writes:
> From article <10979@saturn.ADS.COM>, by xanthian@saturn.ADS.COM (Metafont Consultant Account):
> > Better, make the language strictly postfix, give it exactly one
>                                      ^^^^^^^
>
> Why not prefix notation? Prefix notation is more common than postfix
> in mathematical literature,
>     f(x,y,z) rather than (x,y,z)f
> I suspect that it is easier to read for those of us used to reading
> from left to right since it emphasizes the operators rather than the
> operands.
>
> Of course I haven't used an HP calculator in a long time so I am
> probably prejudiced.

Mathematical notation is mostly a mixture of prefix and infix notation, with infix predominating in use. There are syntactic advantages to prefix and postfix (lack of parentheses), although additional separator characters may be needed. If we had started out with a strictly prefix (Polish) notation or a strictly suffix (reverse Polish) notation, by all means we would design our computer languages accordingly. However, there are situations where infix notation is easier to understand.

When it comes to carrying out a calculation which is not inlined, how is this done? The normal procedure is to create the argument list (this is done in various ways) and then to call the procedure. Thus, computation seems to be essentially postfix. This is even true in simpler situations on calculators, and for hand calculation. When an arithmetic operation is performed, the arguments are normally assembled before the operation is begun. This is why I find RP calculators easier to use, besides being less error-prone (someone came into the office when I was making an entry; did I hit the + key or not?).

The aim of the language is for the human, not the computer, to have the easy time. This means that the language should be able to communicate to the compiler what the human wants done, and this should not be restricted to what the compiler-writer thinks the human might want done.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
dwh@twg.com (Dave W. Hamaker) (03/03/90)
In article <1966@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>In article <24123:04:14:07@stealth.acf.nyu.edu>, brnstnd@stealth.acf.nyu.edu writes:
>> In article <10790@june.cs.washington.edu> machaffi@fred.cs.washington.edu.cs.washington.edu (Scott MacHaffie) writes:
>< > Unconditional loops have a serious problem: you have to read all of the
>< > code inside the loop to find out when (or if) it terminates. Replacing
>< > while and for with loop would be a bad idea. Even providing loop
>< > means that people will use it and stick "end loop" inside the loop
>< > (this happens in ada).
>> This is specious. Assume break and if; then while, do, for, and loop
>> are all equivalent, in the sense that each can be written purely
>> syntactically in terms of any of the others. Therefore (in line with
>> the goals of the language, to evolve in another thread) the simplest
>> construction wins. (There are several criteria for choosing ``loop''
>> as the simplest; but I don't think anyone will argue, so I won't
>> elaborate further.)
>This is correct but bad. Any machine can be simulated in a small
>universal Turing machine with few states, but the operations will
>run very slowly. Also, there are situations where anything other
>than goto is very inefficient. Another one omitted is a "break to
>stage ..." which may close several blocks at once.

An idea I find attractive (I'm weird) is to delimit loops with statements in the following kind of way:

    label: loop while (top_break_condition);
              . . .
              exit label if (middle_break_condition);
              . . .
           repeat label while (bottom_break_condition);

The "label" is a normal kind of statement label (although there is a case for prohibiting a goto to the label on a loop statement). Also, the label and label references may be omitted, defaulting to the "innermost" case. The "while" and "if" (and any other allowed forms such as "for," "unless," ...) are all optional.

I like this because it ties together all the looping forms into a single concept at the language level. It also gives you a consistent way to show whether the exit test is done at the beginning of each iteration, or after, or both, or neither, or something else. It even allows unambiguous partial overlapping of loops, especially if you can do a similar bracketing of the if-else construct.

In fact, I strongly suspect (but cannot prove) that such "if-else" and "loop" constructs, taken together, have exactly the same expressive power as "goto" combined with the simplest "if." In other words, a compiler would have no difficulty generating just as efficient code either way. This is because "loop-repeat" is fundamentally an unconditional goto backwards in the control flow from "repeat" to "loop" (with provisions for jumping or continuing forward to the statement following "repeat"), and "if-else" provides for conditionally jumping in the forward direction. The only thing this doesn't seem to cover is jumping over completely dead code, or jumping to or from a piece of code which is logically but not physically sequential (and I don't mean a subroutine/function call). Neither of those cases is "efficient" or "useful" in anything but a bizarre context.

I should point out that I don't see how to express this idea in a syntax form that meshes well with C. It seems a bit wordy for C and goes counter to C's statement-within-statement approach to things. C would also use "break" instead of "exit"; this produces vivid mental images of "broken loops"! Seems OK for Fortran, though.

-Dave Hamaker
dwh@twg.com
...!sun!amdahl!twg-ap!dwh
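The claim that every loop form is a variation of one shape can be checked with a short sketch. The Python function below is an illustration only, with invented names: `general_loop` models the loop/exit/repeat skeleton from the post, each optional clause is passed as a callable over a state dictionary, and an omitted clause never stops the loop. Labels and overlapping loops are not modelled.

```python
def general_loop(state, top=None, before=None, middle_exit=None,
                 after=None, bottom=None):
    """One loop shape: optional test at the top, optional exit in the
    middle, optional test at the bottom. Omitted clauses never exit."""
    while True:
        if top is not None and not top(state):               # loop while (...)
            break
        if before is not None:
            before(state)
        if middle_exit is not None and middle_exit(state):   # exit if (...)
            break
        if after is not None:
            after(state)
        if bottom is not None and not bottom(state):         # repeat while (...)
            break
    return state

def step(s):
    s["total"] += s["i"]
    s["i"] += 1

def read_next(s):
    s["cur"] = next(s["items"])

def keep(s):
    s["seen"].append(s["cur"])

# while-style: test at the top only
print(general_loop({"i": 0, "total": 0}, top=lambda s: s["i"] < 5, before=step))
# {'i': 5, 'total': 10}

# do-while-style: test at the bottom only, so the body runs at least once
print(general_loop({"i": 10, "total": 0}, before=step, bottom=lambda s: s["i"] < 5))
# {'i': 11, 'total': 10}

# exit in the middle: read, test, then process
mid = general_loop({"items": iter([3, 4, 0, 9]), "cur": None, "seen": []},
                   before=read_next, middle_exit=lambda s: s["cur"] == 0,
                   after=keep)
print(mid["seen"])   # [3, 4]
```

All three familiar loops come out of the same skeleton by leaving clauses empty, which is the unification the post is after; whether a compiler generates equally good code for it is a separate question that this sketch does not address.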
peter@ficc.uu.net (Peter da Silva) (03/03/90)
In c-ish syntax, how about this as the loop construct:

    while (expr)
        statement
    while (expr)
        statement
    ...

Explicit breaks or continues would be deprecated...

The for loop would come out as:

    initial
    while (test)
        body
    while (true)
        increment
--
 _--_|\  Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/      \
\_.--._/ Xenix Support -- it's not just a job, it's an adventure!
      v  "Have you hugged your wolf today?" `-_-'
--
karl@haddock.ima.isc.com (Karl Heuer) (03/10/90)
In article <78=1NPDxds13@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>In c-ish syntax, how about this as the loop construct:
>    while (expr)
>        statement
>    while (expr)
>        statement
>    ...

You'd need to tweak the syntax to distinguish a two-exit loop from two adjacent loops, of course.

>Explicit breaks or continues would be deprecated...

I'm not sure how well the issue addressed by this "multiple while" corresponds to that addressed by break and continue.

(Note: If it's really a *new* language, rather than a suggestion for the next version of C, then you don't need to deprecate things--you simply remove them immediately. There's no backward compatibility problem.)

>The for loop would come out as:
>    initial
>    while (test)
>        body
>    while (true)
>        increment

I must object. The whole point of C's for-statement is to put all the loop control in one place, at the top of the statement. This looks like a step backwards.

And isn't that clause "while (true)" a no-op in this language? I'd expect it to normally be omitted, so that the increment is simply the last statement of the body.

Karl W. Z. Heuer (karl@ima.ima.isc.com or harvard!ima!karl), The Walking Lint
peter@ficc.uu.net (Peter da Silva) (03/11/90)
In article <16136@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
> You'd need to tweak the syntax to distinguish a two-exit loop from two
> adjacent loops, of course.

Yeah, I thought of this:

    do
        statement
    while(expr)
        statement
    ...

> (Note: If it's really a *new* language, rather than a suggestion for the next
> version of C, then you don't need to deprecate things--you simply remove them
> immediately. There's no backward compatibility problem.)

Good point. We'll use gotos instead.

> I must object. The whole point of C's for-statement is to put all the loop
> control in one place, at the top of the statement. This looks like a step
> backwards.

We can put it back later. Let me be a literal-minded language purist for a while. A good macro facility (like lisp's) that works within the syntax of the language can be used to put it back...

    macro "for(@?initial; @?test; @?final) @body"
    {
            @if(?initial) @initial;
            while(@if(?test) @test @else true) {
                    @body;
    1:              @if(?final) @final;
            }
    2:
    }
    macro "continue" { goto forward(1); }
    macro "break" { goto forward(2); }

> And isn't that clause "while (true)" a no-op in this language?

It provides a target for "continue". But since we're using gotos instead, let's leave that out.
--
 _--_|\  `-_-'  Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/      \  'U`
\_.--._/
      v
ge@kunivv1.sci.kun.nl (Ge' Weijers) (03/16/90)
siebren@piring.cwi.nl (Siebren van der Zee) writes:
>Right. Now if you're gonna put dynamic allocation in your language
>anyway, don't forget to handle "automatic" growing of the stacks
>in multithreaded environments.
>This cannot be done by the operating system, since the virtual
>memory just above the top of the stack that needs to be grown may
>be used by another thread's stack.

A decent, multi-segment machine can have N stacks that all grow and shrink as needed. Say: an MC68030/SPARC/.... operating system can't ...

Ge' Weijers                                    Internet/UUCP: ge@cs.kun.nl
Faculty of Mathematics and Computer Science,   (uunet.uu.net!cs.kun.nl!ge)
University of Nijmegen, Toernooiveld 1
6525 ED Nijmegen, the Netherlands              tel. +3180612483 (UTC-2)
ge@kunivv1.sci.kun.nl (Ge' Weijers) (03/16/90)
peter@ficc.uu.net (Peter da Silva) writes:
>We can put it back later. Let me be a literal minded language purist for a
>while. A good macro facility (like lisp's) that works within the syntax
>of the language can be used to put it back...
>macro "for(@?initial; @?test; @?final) @body"
>{
>        @if(?initial) @initial;
>        while(@if(?test) @test @else true) {
>                @body;
>1:              @if(?final) @final;
>        }
>2:
>}

Wow: STAGE2 rising from the grave.
(STAGE2 was a macroprocessor that could process multi-line macros. It was written in itself, targeting an assembler.)

Ge' Weijers                                    Internet/UUCP: ge@cs.kun.nl
Faculty of Mathematics and Computer Science,   (uunet.uu.net!cs.kun.nl!ge)
University of Nijmegen, Toernooiveld 1
6525 ED Nijmegen, the Netherlands              tel. +3180612483 (UTC-2)
dwh@twg.com (Dave W. Hamaker) (03/17/90)
machaffi@fred.cs.washington.edu (Scott MacHaffie) writes:
>Unconditional loops have a serious problem: you have to read all of the
>code inside the loop to find out when (or if) it terminates. Replacing
>while and for with loop would be a bad idea. Even providing loop
>means that people will use it and stick "end loop" inside the loop
>(this happens in ada).
>
>The advantage of a while/for loop is that the terminating condition
>(or at least the standard terminating condition) is easy to find.
>Then, only exceptional terminating conditions are inside the loop.

I like the loop syntax:

    <optional label:> loop <optional control clause>
        . . .
    repeat <optional label reference> <optional control clause>

And I suggest compilers should issue a warning for multi-exit loops. In that case, you only need to examine multi-exit loops in detail. Otherwise, you look for an exit condition in the control clause at the top; if not there, look for it at the bottom; otherwise locate it inside. Use of loop label references handles extra "repeat" statements, as well as multi-level exits. Does this approach mitigate the "serious problem"?

I especially like the way _all_ loops become variations of a single form at the language level. I believe this would improve thinking about loops overall.

-Dave Hamaker
dwh@twg.com
...!sun!amdahl!twg-ap!dwh