johnl@ee.brunel.ac.uk (John Lancaster) (12/02/86)
Will Modula-2 be Successful? NO!

I have been using Modula-2 for some time and watching the efforts of the BSI Modula-2 Working Group [1] to standardize the language. Throughout this period I have often asked myself: 'Will Modula-2 be efficient/versatile enough to be a winner in the general programming language market place?' In the form proposed by Wirth I feel the answer is no.

The BSI Working Group has done much to improve the situation and I would like to thank its members for their unpaid efforts. They have formalised the language definition and introduced extensions/changes to it where necessary: multi-dimension open arrays [2] and co-routines [3] are two examples. However, there are still deficiencies. Below I highlight a number of problems and propose solutions to them.

PROBLEM 1. Structured Constants

At present structured constants are implemented by declaring a global variable which is initialized at runtime by code in the body of each module. There are two disadvantages to this approach:

1. The "constant" is not safe because it is really a variable, hence the compiler cannot protect it from unintentional change.

2. For those structures whose value cannot be derived by simple computation, constants are duplicated in the data and code areas of the program. This can be a high overhead in memory-sensitive ROM based systems.

SOLUTION 1. Allow for the declaration of structured constants (CONST) as provided in the Turbo Pascal [4] dialect of Pascal.

PROBLEM 2. Parameter types.

Consider the following two procedures:

    PROCEDURE MatrixOp (Op1, Op2 : ARRAY OF WORD; VAR Result : ARRAY OF WORD)
    PROCEDURE Length (String : ARRAY OF CHAR) : CARDINAL

These procedures have their input parameters passed by value. Although this is a safe method, the time spent creating a local copy of input parameters can have a severe effect on execution speed.
Many programmers consider the above inefficient and work around it by the unsafe practice of declaring structured input parameters as VARs. Unfortunately in the case of the 'Length' procedure this causes another problem, namely the following is not legal:

    Size := Length('Literal String')

SOLUTION 2. Add to the language a new formal parameter PVAR (protected variable). The implementation of a PVAR parameter is identical to that of a VAR parameter except that compile-time checks protect it from modification within the procedure by not allowing assignment to it or its use as a VAR parameter.

As an aside I would also like to be able to code

    Result := MatrixOp (Op1, Op2)

where the function is supplied with a pointer to Result and operates on it directly rather than creating an internal result which is passed out and assigned to Result. Although the syntax of Pascal could be modified to accommodate it, it appears that Modula-2's syntax cannot. Does anyone have any idea?

PROBLEM 3. User exception handling.

There is no provision in Modula-2 for the programmer to implement exception handling. This is primarily due to the absence of an equivalent to the Pascal construct "GOTO Label_InOuterBlock". The BSI proposed IO library [5] works around this shortcoming by predicate pretesting, i.e. testing if an operation can be performed before trying to do it. Although such an approach has merit it is not universally applicable.

SOLUTION 3. Allow for user-written exception handlers as in Ada. Borland [6] have proposed a possible Modula-2 implementation.

PROBLEM 4. BITSET size and syntax.

The data type BITSET provides a simple mechanism for bit addressing (thereby allowing the crippled data type SET to be replaced by something more useful) and performing logical operations on word-wide variables. The deficiencies of BITSET become apparent on a machine with a variable word size addressing architecture like the 8086 & 68000 microprocessor families.
Direct bit-twiddling of the hardware registers on such machines requires the language to support more than one size of BITSET. For the 68000 family BITSETs of 8, 16 & 32 elements (bits wide) are required.

SOLUTION 4. Introduce a new type construction, which needs to be imported from SYSTEM, with syntax of the form:

    RegBitMap = BITSET OF [TxOn,RxOn,NIL,NIL,Reset,NIL,NIL,Error]

The size of the memory 'word' being addressed is given by the number of elements in the set. Allowing the elements of the set to belong to an enumerated type in addition to a CARDINAL sub-range brings to bit addressing the same benefits it gives to variable (word) addressing. The reserved word NIL is used as a padding (spacing) element, but also indicates to the compiler those bits which should not be accessed.

PROBLEM 5. Word subfields.

Consider a hardware device with the following bit allocation

    ---------------------------------
    |. . . .|* * * *|* * . .|. . . .|
    | | | | | | | | | | | | | | | | |
    ---------------------------------

where '.' are 1-bit flags and '*' is a 6-bit integer.

To be efficient a low-level module which accesses the integer subfield should generate in-line code for bit shifting and sign extension.

SOLUTION 5. Require SYSTEM to export Shift and SignExtend functions which operate on all word sizes.

PROBLEM 6. Low-level escape path.

When SYSTEM does not export a suitable low-level facility the programmer must resort to calling an assembly language module. Such a solution is unattractive because:

1. Calling a module to execute one or two machine instructions is inefficient.

2. The special link format used by many of the currently available Modula-2 compilers cannot (easily) be linked with non-Modula-2 object files.

SOLUTION 6. SYSTEM must export a "code insert" facility.

I have presented above a number of extensions to the currently proposed BSI Modula-2 standard. With the exception of the BITSET proposal their adoption would not invalidate any existing code.
I feel they are justified either by the inability of the proposed language to implement these features with (library) modules or by the resulting increase in program reliability and efficiency they offer.

Although I welcome comment on my proposals I feel that the interests of the community are best served by submissions to the BSI Modula-2 Working Group and encourage readers to do so.

REFERENCES:

1. The Modula-2 Working Group of the British Standards Institution can be contacted via Barry Cornelius, Department of Computer Science, University of Durham, Durham, DH1 3LE, United Kingdom.
       Barry_Cornelius%mts.durham.ac.uk@UCL-CS.ARPA
       Barry_Cornelius@uk.ac.durham.mts
       bjc@uk.ac.nott.cs

2. "BSI Accepted Change: Multi-dimensional open arrays", Willy Steiger, "MODUS Quarterly" Issue 5, pp. 8-9.

3. "Coroutines and Processes", Roger Henry, "BSI Modula-2 Working Group, Second Open Meeting", July 24th 1986.

4. Turbo Pascal is a registered trademark of Borland International Inc.

5. "Draft BSI Standard I/O Library for Modula-2", Susan Eisenbach, "MODUS Quarterly" Issue 5, pp. 15-18.

6. "Proposal for a standard library and an Extension to Modula-2", Odersky, Sollich & Weisert of Borland International, "MODUS Quarterly" Issue 4, pp. 13-25.

John Lancaster, London, UK
johnl@uk.ac.brunel.ee
johnl%ee.brunel.ac.uk@ucl-cs.arpa
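
For readers who want to see what the Shift and SignExtend operations of Problem 5 amount to, here is a minimal sketch in C rather than Modula-2. It rests on assumptions the post does not state: that the leftmost bit of the diagram is bit 15, so the 6-bit integer occupies bits 11..6, and that the field is two's complement. The names FIELD_SHIFT, FIELD_WIDTH, extract_signed_field and insert_signed_field are hypothetical.

```c
#include <stdint.h>

/* Assumed layout (not stated in the post): bit 15 is leftmost in the
   diagram, so the 6-bit integer occupies bits 11..6 of the 16-bit word. */
#define FIELD_SHIFT 6u
#define FIELD_WIDTH 6u
#define FIELD_MASK  ((1u << FIELD_WIDTH) - 1u)   /* 0x3F */
#define FIELD_SIGN  (1u << (FIELD_WIDTH - 1u))   /* 0x20 */

/* Shift the field down, mask it, then sign-extend it portably:
   XOR with the sign bit and subtract the sign bit again. */
static int extract_signed_field(uint16_t reg)
{
    unsigned raw = ((unsigned)reg >> FIELD_SHIFT) & FIELD_MASK;
    return (int)(raw ^ FIELD_SIGN) - (int)FIELD_SIGN;
}

/* Store a value in -32..31 back into the field, leaving the flags alone. */
static uint16_t insert_signed_field(uint16_t reg, int value)
{
    unsigned raw = (unsigned)value & FIELD_MASK;
    unsigned cleared = (unsigned)reg & ~(FIELD_MASK << FIELD_SHIFT);
    return (uint16_t)(cleared | (raw << FIELD_SHIFT));
}
```

With this layout a register value of 0x0FC0 holds the field value -1 and 0x0800 holds -32. A compiler that offered Shift and SignExtend as SYSTEM built-ins could plausibly emit the same two or three instructions in line, which is the efficiency the post asks for.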
bills@cca.UUCP (Bill Stackhouse) (12/06/86)
A comment about problem #5, which had to do with a 16-bit word that has a 6-bit integer between a bunch of 1-bit fields. Why not solve it with the Pascal approach of a packed structured record?

    x = packed record of
          b1, b2, b3, b4 : boolean;
          i : 0..63;
          b5, b6, b7, b8, b9, b10 : boolean;
        end;

The syntax is loose but the key is that a compiler can detect the intent that everything should fit into one word. You might want a shift function but not to solve the given example.

Something I would like to see in all procedure based languages is some syntax in the procedure def. that indicates that the procedure is to be included inline at all places it is called. That would allow the abstraction to still occur but would do away with the overhead of calling and returning just for a few lines of code.
--
Bill Stackhouse
Cambridge, MA.
bills@cca.cca.com
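
The packed record above has a close relative in C bit-fields, sketched below for comparison. This is only an illustration: the type name DeviceWord and the helper make_word are hypothetical, and C leaves the ordering and padding of bit-fields to the implementation, so the declaration alone does not guarantee a match with any particular hardware register.

```c
/* C analogue of the packed record: four flags, a 6-bit unsigned
   integer (0..63), then six more flags. Bit order within the storage
   unit is implementation-defined. */
struct DeviceWord {
    unsigned b1 : 1, b2 : 1, b3 : 1, b4 : 1;
    unsigned i  : 6;
    unsigned b5 : 1, b6 : 1, b7 : 1, b8 : 1, b9 : 1, b10 : 1;
};

/* Build a word with the first flag and the integer field set.
   The compiler generates the shifting and masking itself. */
static struct DeviceWord make_word(unsigned first_flag, unsigned value)
{
    struct DeviceWord w = {0};
    w.b1 = first_flag & 1u;
    w.i  = value & 0x3Fu;
    return w;
}
```

Field access then reads like ordinary record access, for example `make_word(1, 42).i`, which is the convenience the packed-record proposal is after.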
marty@ism780c.UUCP (Marty Smith) (12/08/86)
In article <11458@cca.UUCP> bills@cca.UUCP (Bill Stackhouse) writes:
>
>Something I would like to see in all procedure based languages
>is some syntax in the procedure def. that indicates that the
>procedure is to be included inline at all places it is called.
>That would allow the abstraction to still occur but would do
>away with the overhead of calling and returning just for a few
>lines of code.

Good idea.

Marty Smith
dan@prairie.UUCP (Daniel M. Frank) (12/09/86)
In article <11458@cca.UUCP> bills@cca.UUCP (Bill Stackhouse) writes:
>
>Something I would like to see in all procedure based languages
>is some syntax in the procedure def. that indicates that the
>procedure is to be included inline at all places it is called.

This is implemented in C++, and in many Ada compilers (it's an optional pragma).
--
Dan Frank
uucp: ... uwvax!prairie!dan
arpa: dan%caseus@spool.wisc.edu
firth@sei.cmu.edu (Robert Firth) (12/09/86)
In article <4814@ism780c.UUCP> marty@ism780c.UUCP (Marty Smith) writes:
>In article <11458@cca.UUCP> bills@cca.UUCP (Bill Stackhouse) writes:
>>
>>Something I would like to see in all procedure based languages
>>is some syntax in the procedure def. that indicates that the
>>procedure is to be included inline at all places it is called.
nagler@seismo.CSS.GOV@olsen.UUCP (Robert Nagler) (12/10/86)
In response to the comments about ``inline'' procedures in M2 (and other procedure languages): I do not see why the programmer should provide hints to the compiler or linker as to the ``correct'' way to optimize a particular program. Given that most linkers nowadays are static, I see no reason why a linker couldn't decide to in-line a particular procedure given enough information by the compiler (like the size of the procedure).

After programming in C a few years, it was particularly annoying to me to have to figure out which variables were supposed to go in a register and which were not. I also always wondered why I couldn't make global variables register variables, but that is beside the point. Clearly, I am making a common argument: let the compiler do it, it is smarter than you are.

Volition's M2 compiler for the P-system provides a method for doing in-line code in the definition module. It always seemed silly to me to provide such an interface. The argument was that it was a cheap way towards improving the speed of M2 code. I say that it is a good way to decrease the maintainability of a program or library. The same goes for in-line procedures which are not in the definition module.

Specifically, some machines have expensive procedure calling mechanisms and others have cheap ones (VAX vs. RISC). It is certainly reasonable that if you in-line something on a VAX it may go faster while the same ``optimization'' may reduce the speed of the same code running on a RISC. The reasons for this are complicated, but it boils down to how much is in the cache and how fast a procedure call is.

M2 has plenty of silly programmer-directed optimization techniques and some of them yield incorrect results when used without special checks (e.g. the CAP function). We should work on getting a good optimizing compiler for the language that exists instead of trying to add features that make the language even more complicated.
Rob Nagler Olsen & Associates seismo!mcvax!cernvax!unizh!olsen!nagler
stuart@bms-at.UUCP (Stuart D. Gathman) (12/10/86)
In article <11458@cca.UUCP>, bills@cca.UUCP (Bill Stackhouse) writes:
> x = packed record of
>       b1, b2, b3, b4 : boolean;
>       i : 0..63;
>       b5, b6, b7, b8, b9, b10 : boolean;
>     end;

This has my vote. Even bit arrays fit this idea (packed array of . . .)

> Something I would like to see in all procedure based languages
> is some syntax in the procedure def. that indicates that the
> procedure is to be included inline at all places it is called.

Many global optimizers do this automatically. Procedures less than a certain size or called a small number of times are automatically converted to inline. This optimization can even be performed on the binary! (I.e. language independent)

Adding an 'inline' key word allows good code from simpler compilers, however.
--
Stuart D. Gathman <..!seismo!dgis!bms-at!stuart>
firth@sei.cmu.edu.UUCP (12/10/86)
In article <307@bms-at.UUCP> stuart@bms-at.UUCP (Stuart D. Gathman) writes:
>Adding an 'inline' key word allows good code from simpler compilers, however.
>--
>Stuart D. Gathman <..!seismo!dgis!bms-at!stuart>

Let me agree at once with Stuart and others that a "good" compiler should automatically expand procedures inline where appropriate.

However, I think some pragma or hint is needed to tell most compilers where this is indeed appropriate. Suppose, for instance, that a DEFINITION module contains a procedure definition, and the IMPLEMENTATION module contains the body. If the procedure is compiled as true out-of-line code, then you can replace the implementation (body) without recompiling dependents. If however it is made inline, then generally you can't: all the dependents reference the body and therefore will change.

There is a tradeoff between execution efficiency and ease of change, and in general the compiler can't make that tradeoff, because it doesn't know whether you are in test-&-debug mode or whether you are in performance assessment mode, or even building the final load image.

At the highest level, therefore, you need a compiler option that says "yes, go for inline". At a finer level of control, you might want to say of any one procedure:

    never inline (rarely called, huge body, or whatever)
    always inline (critical code, body won't change, ...)
    sometimes inline (under option control)

But here I'd state my personal preference for more intelligent compilers. In general, the user should give the compiler any information it CANNOT find out for itself, and no other information.
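
The three levels of control listed above map fairly directly onto what C compilers offer, sketched here as an illustration only. Plain `inline` is the standard C99 hint ("sometimes, under option control"); the `always_inline` and `noinline` attributes are GCC/Clang extensions, and on other compilers the macros below simply expand to nothing. The function names and the ALWAYS_INLINE / NEVER_INLINE macros are hypothetical.

```c
/* "Sometimes inline": a standard C99 hint the compiler may ignore. */
static inline int clamp_to_byte(int v)
{
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

/* "Always" and "never" use GCC/Clang extensions; elsewhere the macros
   are empty and the decision is left to the compiler. */
#if defined(__GNUC__)
#define ALWAYS_INLINE __attribute__((always_inline))
#define NEVER_INLINE  __attribute__((noinline))
#else
#define ALWAYS_INLINE
#define NEVER_INLINE
#endif

/* "Always inline": small, critical, body unlikely to change. */
static inline ALWAYS_INLINE int twice(int v)
{
    return v + v;
}

/* "Never inline": kept out of line so it can be replaced or profiled. */
static NEVER_INLINE int checksum(const unsigned char *p, int n)
{
    int sum = 0;
    for (int k = 0; k < n; k++)
        sum += clamp_to_byte(twice(p[k]));
    return sum;
}
```

Note that this only works because the inlined bodies are visible to every caller, which is exactly the recompilation cost described above for procedures whose bodies live in an IMPLEMENTATION module.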
bobc@tikal.UUCP (12/11/86)
There are several articles based on John Lancaster's (johnl@uk.ac.brunel.ee) article entitled "Modula-2 standard". I will not include any of the original text, to reduce net traffic.

John lists six problems he sees in Modula-2 as it is currently defined. I feel that only one of these problems requires a "Language Change", and that most of them are requests for what I feel should be "Compiler Options" or a larger standard "SYSTEM" module. I don't feel that any of these problems will make the language unsuccessful, as even without these features I find that the language is better than either "C" or "Pascal", and is still much smaller than Ada. In all cases I consider Compiler Options to be embedded in comment blocks, so that the code can be ported to a compiler which does not support the features without rewriting large parts of the code.

Problem 1. structured constants

This would require a modification to the syntax of the language; if it is really greatly needed then perhaps it should be added to the language. If the BSI were really on their toes they might write an optional standard which could define the way that compilers which support this feature should do so. (And maybe John Lancaster should write it up and submit it to them.) An optional compiler feature which might solve this problem is a method for the compiler (of definition modules) to indicate read-only variables. This would allow a module to set up the "constant" variables and disallow other modules from modifying them; it would also mean that they can't be passed by address except as may be required in item 2.

Problem 2. PVAR

This is a code generation problem in which the programmer wishes to suggest to the compiler that the variable should be passed by value and will not be changed by the module. At times I have felt that the compiler should not allow the programmer to modify the values of "passed by value" parameters.
I feel that the correct solution is an optional compiler switch which can be used to indicate that the programmer wishes to have the compiler pass strings/structures by address and enforce the rule of not allowing the "value" parameters to be modified. Note that this "pass by address, will not change" type of definition will have to be in the definition modules, but could be done with a compiler switch.

Problem 3. User exception handling.

I don't agree that any more support for this is needed than is already available. I feel that a "setjmp"/"longjmp" and a signal catching routine can be created that will be every bit as good as the ones used by "C". I also know that Wirth's implementations supply all the low level features required to do this (even if it does require writing a small routine in machine language). I have an almost complete version of coroutines for MacMETH (Macintosh compiler from ETH) which proves that low level things like this can be done if the correct "SYSTEM" module is provided by the compiler.

Problem 4. BITSET size and syntax.

I don't see where this does any good. What would appear to be needed is a compiler which defines how a set is to be represented in memory. Then these sets can be used for hardware device registers. I also believe that the NILs are not needed, as the programmer can create names like "PAD1", "PAD2" ...; this can be done without any changes to the language as it is only a compiler implementation issue.

Problem 4. PACKING OF RECORDS

I feel that some form of compiler option could be created to allow the programmer to specify that a RECORD is to be packed into the smallest space possible.
I also feel that it should be done in a way that is portable to the current compilers. This is why I prefer options of the form (*$PACK*), which would allow me to quickly detect routines which make use of special compiler features, and at the same time allow compilers which do not support the feature to still compile the program (with small adjustments). This feature is not required; it is just an extra feature that allows the compiler to do some of the decoding for you (given that many processors now support bit fields it would be easier for the compiler to do it, but this is not true for things like PDP-11s, 68000s etc).

Problem 5. SHIFT, and SignExtend.

These functions should, I agree, be included in the standard SYSTEM module (and in fact the compilers that I have used have supported them in some form, either ASH or SHIFT, and the function LONG to do sign extension of a short).

Problem 6. Low-level escape path.

I feel that this is a function of finding a good compiler. First, the MacMETH compiler (also the generic 68K compiler, and I suppose the Lilith and 32032 versions also) supports inline machine language code, as well as allowing the definition of one-word code procedures which get expanded in line to support Operating System Calls via TRAPs. In any case the low-level escape paths should be considered to be machine specific, but it would be nice if there was a standard syntax for inserting code inline. (The current MacMETH compiler supports the pseudo function INLINE to do this.)

I also feel that the current implementations do not support linking to non-Modula-2 object files because of external restrictions on identifier lengths on some systems. I currently have a compiler which runs on a Macintosh and can link to assembler code, and there are only a few small changes left to allow it to link to Pascal routines.
Also the code from this compiler can be called from either assembler, "C", or "Pascal"; the only problem is that the init routines have to be called by the program (if for any reason someone would want to do this).

Bob Campbell
Teltone Corporation
18520 - 66th AVE NE     P.O. Box 657
Seattle, WA 98155       Kirkland, WA 98033
{amc,dataio,fluke,hplsla,sunup,uw-beaver}!tikal!bobc
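
For reference, the "setjmp"/"longjmp" style of exception handling mentioned under Problem 3 looks roughly like this in C. It is a minimal sketch under simplifying assumptions: a single global handler, no nesting, and no cleanup of resources held by the frames that are skipped. The names handler, checked_divide, try_divide and ERR_DIVIDE_BY_ZERO are hypothetical. A Modula-2 version would need an equivalent pair of primitives from SYSTEM or a small machine-language routine, as the post says.

```c
#include <setjmp.h>

static jmp_buf handler;              /* the single active handler */

enum { ERR_DIVIDE_BY_ZERO = 1 };

/* "Raise": unwind straight to the handler with a non-zero code. */
static int checked_divide(int a, int b)
{
    if (b == 0)
        longjmp(handler, ERR_DIVIDE_BY_ZERO);
    return a / b;
}

/* "Try": returns 0 on success, or the code delivered by longjmp.
   setjmp is used as the whole controlling expression of the switch,
   which is one of the contexts the C standard permits. */
static int try_divide(int a, int b, int *result)
{
    switch (setjmp(handler)) {
    case 0:                          /* direct call: run the protected code */
        *result = checked_divide(a, b);
        return 0;
    case ERR_DIVIDE_BY_ZERO:         /* arrived here via longjmp */
        return ERR_DIVIDE_BY_ZERO;
    default:
        return -1;
    }
}
```

The normal-case code in checked_divide's callers carries no return-code tests; only try_divide knows about the failure path.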
marty@ism780c.UUCP (Marty Smith) (12/12/86)
In article <8612091923.AA23422@olsen.uucp> nagler@seismo.CSS.GOV@olsen.UUCP (Robert Nagler) writes:
>In response to the comments about ``inline'' procedures in M2 (and other
>procedure languages. I do not see why the programmer should provide hints
>to the compiler or linker as to the ``correct'' way to optimize a particular
>program.
> [...]
>Clearly, I am making a common argument: let the compiler do it, it is smarter
>than you are.

I know of no compiler that is smarter than I am. When I want a program optimized for execution speed rather than memory usage, I want the compiler to optimize for speed.

Marty Smith
ahe@k.cc.purdue.edu.UUCP (12/12/86)
In article <4814@ism780c.UUCP> marty@ism780c.UUCP (Marty Smith) writes:
>In article <11458@cca.UUCP> bills@cca.UUCP (Bill Stackhouse) writes:
>>
>>Something I would like to see in all procedure based languages
>>is some syntax in the procedure def. that indicates that the
                        ^^^^^^^^^
>>procedure is to be included inline at all places it is called.

In article <475@aw.sei.cmu.edu.sei.cmu.edu>, firth@sei.cmu.edu.UUCP writes:
> Let me agree at once with Stuart and others that a "good" compiler
> should automatically expand procedures inline where appropriate.
>
> However, I think some pragma or hint is needed to tell most
> compilers where this is indeed appropriate. Suppose, for instance,
> that a DEFINITION module contains a procedure definition, and the
> IMPLEMENTATION module contains the body. If the procedure is
> compiled as true out-of-line code, then you can replace the
> implementation (body) without recompiling dependents. [...]

The original article referenced *procedures* and not modules. Furthermore, in the case of a module, the compiler has NO CHOICE but to generate out-of-line code; the separateness must be maintained.

In article <475@aw.sei.cmu.edu.sei.cmu.edu>, firth@sei.cmu.edu.UUCP writes:
> But here I'd state my personal preference for more intelligent
> compilers. In general, the user should give the compiler any
> information it CANNOT find out for itself, and no other
> information.

The compiler already has all the information it needs. When one specifies "module", the compiler can, should, and will take this as a clear indication that the code is *not* to be expanded in-line.

Bill Wolfe
firth@sei.cmu.edu (Robert Firth) (12/15/86)
In article <1656@k.cc.purdue.edu> ahe@k.cc.purdue.edu (Bill Wolfe) writes:
>In article <4814@ism780c.UUCP> marty@ism780c.UUCP (Marty Smith) writes:
>>In article <11458@cca.UUCP> bills@cca.UUCP (Bill Stackhouse) writes:
>>>
>>>Something I would like to see in all procedure based languages
>>>is some syntax in the procedure def. that indicates that the
>                        ^^^^^^^^^
>>>procedure is to be included inline at all places it is called.
>
>In article <475@aw.sei.cmu.edu.sei.cmu.edu>, firth@sei.cmu.edu.UUCP writes:
>> Let me agree at once with Stuart and others that a "good" compiler
>> should automatically expand procedures inline where appropriate.
>>
>> However, I think some pragma or hint is needed to tell most
>> compilers where this is indeed appropriate. Suppose, for instance,
>> that a DEFINITION module contains a procedure definition, and the
>> IMPLEMENTATION module contains the body. If the procedure is
>> compiled as true out-of-line code, then you can replace the
>> implementation (body) without recompiling dependents. [...]
>
> The original article referenced *procedures* and not modules.
> Furthermore, in the case of a module, the compiler has NO CHOICE but to
> generate out-of-line code; the separateness must be maintained.
>

And I was talking about procedures defined in modules. In M2, ALL procedures are defined in modules, so we are talking about the same thing.

The Modula-2 Report [third edition, Ch 14] requires only that modules be separately compiled; it does not require that an implementation module be REcompilable without visible effect. This is mentioned in the user manual merely as one possible feature of a "sophisticated" Modula compiler.

>In article <475@aw.sei.cmu.edu.sei.cmu.edu>, firth@sei.cmu.edu.UUCP writes:
>> But here I'd state my personal preference for more intelligent
>> compilers. In general, the user should give the compiler any
>> information it CANNOT find out for itself, and no other
>> information.
>
> The compiler already has all the information it needs. When one
> specifies "module", the compiler can, should, and will take this
> as a clear indication that the code is *not* to be expanded in-line.
>
> Bill Wolfe
>

Such a compiler would be wrong. It really does help to read the language definition, you know.
nagler@seismo.CSS.GOV@olsen.UUCP (Robert Nagler) (12/16/86)
From Marty Smith:
    When I want a program optimized for execution speed rather than memory usage, I want the compiler to optimize for speed.

I agree with the above statement whole-heartedly. I think all compilers should have a range of optimization: execution speed vs. space vs. compilation time. The programmer could specify it on the command line. The compiler could then trade off on the amount of loop unrolling, size of procedures to inline, complexity of register coloring algorithms, etc.

From Doug Johnston:
    The decision to include procedures inline may often be better done by compilers but compilers often do not have all of the information necessary to make good decisions.

Exactly what information does the compiler have that the programmer doesn't? I think there are many things that the programmer shouldn't know that the compiler does, e.g. knowing that the cache size of the target is 128Kbytes (fully associative), so that it might be better to re-execute the 5 lines of code from the cache than to re-fetch the instructions if the procedure is called three times in a row.

    ...I know that programmers may not always make the best decisions but it is not my job to control them.

Given two tasks, people always tend towards the easier one. Micro-optimizing source code is *easy* for your average programmer, but designing quality systems is hard (documented, maintainable, extensible, robust). I make this statement after catching a lot of very fast bugs. As a manager, I want the programmers I work with to design fast systems, not optimize bad designs. Sometimes, I think it would be better if we weren't even allowed to write code and we would have to hand over a design to a data entry typist who knows how to translate pseudo-code to Modula-2. Getting to the bits is very important, but only in a very few cases (10% of the code does 90% of the work).

I don't know. Maybe I am wrong.
If this discussion results in a Modula-2 standard for INLINE, I think we should also consider the following primitives:

LOCKDOWN - inserted before a memory declaration (procedure or variable), the declared memory is ``locked down'' so that swapping or paging will not occur.

REGISTER - before a variable declaration indicates that the variable should be kept in a register (even globals). Before a procedure's parameter list (in the def mod of course) it would specify that the parameters should be passed in registers.

SUBEXP[xxx] - before a parenthesized expression indicates to the compiler that the following expression is identified as 'xxx'. If another SUBEXP[xxx] appears, then the compiler should use the previous value instead of generating the code for this new expression.

REF - before a parameter declaration indicates to the compiler that the parameter should be passed by reference, but to the caller it should appear as a pass-by-value. The idea of course is that the called procedure will probably not touch the variable. This would replace the over-used VAR (see Strings.Length in Logitech's library for a good use of this new primitive).

DONTLOOK - instead of these silly function-like coercions that always seem to get in the way. This primitive would tell the compiler to turn off all levels of type, range, etc. checking until a matching END statement.

Did I leave out anything? If you would like to see a more complete list, see the "C Puzzle Book".

Rob Nagler

[If my employer knew what I was writing, I think he would fire me for wasting time. As far as opinions go, my employer has a lot of them.]
AlHall.osbunorth@XEROX.COM (01/06/87)
I'm interested in hearing from anyone who has given some serious thought to implementing "User exception handling" (as per Bob Campbell's message of 12/11/86) in Modula-2. Return codes really make me C-sick; I'd much rather write less-than-portable code in Modula-2 than pollute my normal-case code with a ridiculous number of IF statements [how would you like your conscious mind cluttered with paranoid thoughts like, "Have I been hit by a 747?" every time you cross the street?]. Implementation details would be nice, but I'm mostly interested in a "sanity check" regarding my own thoughts on its ease of implementation in Modula-2 as opposed to C. Thanks, Al