tjr@ihnet.UUCP (Tom Roberts) (05/11/84)
[WARNING: this article is 246 lines long]

Thanks to those of you who have sent me suggestions why YOU choose to program in C. Here are my current thoughts on the subject. This is intended to provide some "intellectual ammunition" to those unfortunate disciples of C who must justify its use to their management (or other Higher Authority). Its length is due to the inherent complexity of the problem of selecting a Programming Language.

I have NOT attempted to include ANY economic justifications, because they depend strongly upon your organization's history (I am thinking of such things as: "It would cost N staff-months to re-train our staff to use Programming-Language X"). Ultimately, such arguments are the strongest ones to use to convince management; they are also the most difficult to quantify.

Please send all flames and "ad hominem" attacks to /dev/null; quibbles about details are best left to mail; reasonable (and reasoned) discussions about issues raised herein are welcome.

INTRODUCTION: Here are a few random notes on the choice of a programming-language for a medium-to-large software project. I assume that the project will include computers from several manufacturers, ranging from micros to mainframes, so that only reasonably portable high-level languages need be considered (FORTRAN, COBOL, PL/I, PASCAL, ADA, and C). I assume the project is primarily data-management, does not contain large amounts of floating-point computations, and will involve several (many) programmers. I consider such languages as LISP and SNOBOL to be too limited in scope; I consider MODULA-2 to be too new and un-developed (but interesting....).

CAVEATS: My personal bias shows: I have written ~50000 lines of FORTRAN (Elementary Particle Physics), ~250 lines PASCAL (curiosity), ~12000 lines COBOL (data-base system), ~15000 lines C (OS, and misc applications), ~15000 lines assembler (Z80, CDC-6500, HP2100, PDP-10, PDP-15, IBM-404 (3-plugboards!)).
I have read several texts on ADA, but have not programmed in it. At present, I program in C whenever permissible, because it has the best blend of features and efficiency I desire. I know next to nothing about PL/I. ALL OPINIONS EXPRESSED ARE MY OWN.

PORTABILITY: FORTRAN, PASCAL, and COBOL have a central, portable base; this often is NOT sufficient to write complete systems (e.g. the lack of portability in their data-files). PASCAL is un-usable without some (non-portable) extensions. ADA is intended to be portable, and complete; this has yet to be demonstrated. C has a de-facto standard (K&R), and is portable as a language; the library functions are not always portable; operation under UNIX is completely portable, if the original code is written with portability in mind (the same can almost be said for the other languages - this is more a statement about UNIX than the language).

PRODUCTIVITY: FORTRAN and COBOL rate "low" on productivity, because of their lack of structured-programming features, and their lack of compile-time constructs (#define/#include/#if); pre-processors (e.g. RATFOR) can mitigate this. PASCAL rates "high", except for hardware-dependent programs, strings, and file I/O, where it has problems. C rates "high", except possibly for large projects, where poor readability can be a drawback; this is highly dependent upon the skills and experience of the programming staff. ADA has more features and complexity than C, and so can be less readable; it can overload operators and functions, which could make it either less or more readable; the readability of ADA programs will probably depend even more upon the skills and experience of the staff.

PERFORMANCE: COBOL tends to be slow in execution, mainly due to the type of applications using it; implementations on small machines (e.g. Z80) are usually interpreters (VERY slow).
FORTRAN is OK, but the lack of pointers means that data-structures are often implemented as two-dimensional arrays, with a multiplication for each reference - this can be intolerable on hardware without a multiply instruction. PASCAL is OK, if it is compiled; many implementations are tokenized interpreters (P-code), with poorer performance; its array-bounds checking (part of the language) can reduce performance significantly. C has none of these drawbacks, and can be partially "hand-optimized" by declaring some variables as "register"; most C compilers do not optimize code as well as might be wished. C models the instruction-sets of many computers very well (especially the modern microprocessor chips like the MC68000, WE32000, etc.); on such CPUs, C can approach assembly-language in speed and efficiency. There is not yet much experience with ADA; initial guesses are that "simple" things will be OK, "complicated" things (e.g. tasks) might not.

PROGRAM COMPLEXITY: PASCAL and ADA, with their strong-typing, can cause complexity to increase (e.g. dynamic-memory allocation). FORTRAN causes complexity because it lacks structures (this is the MAJOR reason to avoid FORTRAN); call-by-reference can interact with COMMON in un-expected (and non-portable) ways; dynamic allocation is very difficult, and usually involves non-portable operations (e.g. referencing arrays out-of-bounds). The lack of recursive functions (FORTRAN, COBOL) can seriously complicate inherently recursive algorithms. COBOL relies heavily upon global data, making scoping virtually impossible; its restricted set of statements makes even simple programs LOOK complex; dynamic allocation is virtually impossible. C makes it impossible to do single-precision floating-point arithmetic (double-precision is used); complex arithmetic is not defined (must be implemented as structures and functions). C (and to some extent, ADA) can perform low-level operations (e.g.
I/O drivers in UNIX are routinely written in C); this can greatly improve productivity, complexity, and readability when such operations must be performed.

SOFTWARE ENGINEERING: FORTRAN, PASCAL, and COBOL offer little or no help; pre-processing is essential. C contains its own pre-processor with the most-used features (#include/#define/#if). PASCAL does not specify separate compilation - a VERY big drawback. C is specified by a grammar, which can GREATLY ease the construction of sophisticated language-processing tools; it also improves the performance of the compiler (during development, most systems spend more resources compiling than executing). C can provide basic control of symbol location (e.g. RAM or ROM), which can simplify symbol-management, and permits writing ROM-able code (which is inherently non-portable). ADA is so large that I suspect it will be VERY slow during program builds; its support environment is also complicated (and, I suspect, inefficient) - much of this is dictated by the large-scale systems it is intended to support.

TOOLS: FORTRAN and COBOL have many existing tools, of varying quality and portability (many have strong OS dependencies); UNIX tools can be of reasonable utility. PASCAL has several integrated programming environments, most of which are reasonably portable. ADA has a complete, portable environment defined (but un-implemented at present). C uses UNIX tools quite well, and has some special language-processing tools (e.g. lex and yacc). Several major source-handling systems are specifically C-language.

RELIABILITY/MAINTAINABILITY: This seems to depend more upon the skills and experience of the programming and design staff, than upon the choice of language. The more intricate languages (C and ADA) can contain more subtle "hidden" errors, simply by virtue of their richer syntax; however, they can also result in shorter (i.e. fewer NCSL) programs.
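To make the "hidden error" point concrete, here is a minimal sketch in C of one well-known pitfall in the pre-processor. It is only an illustration: the macro and function names are hypothetical and do not come from any particular program.

```c
/* Hypothetical macros, for illustration only. */
#define SQUARE_BAD(x) x * x          /* argument and result not parenthesized */
#define SQUARE_OK(x)  ((x) * (x))    /* fully parenthesized */

/* Expands to a + b * a + b, so only b * a is multiplied. */
int square_of_sum_bad(int a, int b)
{
    return SQUARE_BAD(a + b);
}

/* Expands to ((a + b) * (a + b)), the intended square. */
int square_of_sum_ok(int a, int b)
{
    return SQUARE_OK(a + b);
}
```

With a = 2 and b = 3 the first function returns 11 and the second returns 25. Both forms are legal C, so the mistake in the first has to be caught by the reader rather than by the compiler.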
The lack of structures in FORTRAN is a serious drawback, because it can make programs un-readable; ditto for use of EQUIVALENCE. COBOL tends to be so readable that important items are obscured by the incredible amount of extraneous text (i.e. the "Purloined Letter" syndrome).

ADDITIONAL COMMENTS:

ADA: ADA is a new language, with no existing programmer base; it will take some time to become experienced in ADA programming, software engineering, and management. It LOOKS very promising, but it has looked so for so long that I worry that it is really too complicated, and too difficult to implement. I shy away from its byzantine complexity - top-notch programmers will have no serious trouble, but I suspect that "average" programmers will NEVER come to grips with all of its features/idiosyncrasies (and they're the ones who will maintain the code). ADA has tried to do EVERYTHING (numerical analysis, real-time control, scientific computation, data-base, concurrent programming, Operating Systems, etc.); and each application-programmer has to learn the special features designed for everyone else. Much will probably be written in ADA, but I doubt that many programmers will voluntarily choose it (their managers will choose it for them). Training programmers to use ADA will surely require more time and effort than any of these other languages; I am not yet convinced that the savings (mainly software engineering issues) will offset this. If ADA truly becomes a universal, portable language, with a portable environment to support it, it will probably (and justifiably) displace the other languages (you CAN program in a tractable subset...); don't hold your breath.

Strong Typing: Strong typing is the attribute of a language that assigns a specific "type" to each entity (variable) in a program, and then prohibits the mixing of different types.
Of the languages discussed, PASCAL and ADA are strongly-typed, the others are not (FORTRAN, COBOL and C are "weakly" typed in that some mixtures are legal, others are not; some type-conversions are automatically supplied). The advantage is that some programming errors can easily be detected, because mixing types is often illogical or nonsensical. The disadvantage is that when you really need to mix types, the compiler gets in the way, forcing you to do something special (sometimes un-obvious and non-portable). ADA and C have (portable) mechanisms to subvert the typing restrictions.

Portability: Portability is the attribute of a language that allows a program written in it to be run, without change, on several (many) computers. ADA is inherently portable, and C is nearly so; FORTRAN, COBOL, and PASCAL are NOT portable (they were all intended to be so, but the implementations fall far below the intentions); in practice they can sometimes be portable enough. I feel that portability is VERY important, because the time-scale of a software system is typically long compared to the time-scale of current hardware advances, and because many applications inherently require several different types of computers to cooperate and act as one system.

SUMMARY: COBOL is un-suitable because of its poor portability, its restricted set of operations, and its lack of efficiency in most implementations. Besides, it is just plain UGLY. FORTRAN is un-suitable because of its lack of data-structures, lack of pointers (and dynamic allocation), and poor portability. PASCAL is un-suitable because separate compilation is not specified, because its (necessary) extensions are not portable, and because of the complexity added by its strong typing. ADA is not suitable because it doesn't exist as a useful language on a sufficiently-large number of machines. C has only minor drawbacks compared to the other languages considered.

CAVEATS (revisited): All opinions expressed are my own.
Remember that I have virtually no experience in PASCAL, and none at all in ADA or PL/I.

Tom Roberts
ihnp4!ihnet!tjr
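As an illustration of the remark under PROGRAM COMPLEXITY that complex arithmetic in C "must be implemented as structures and functions", a minimal sketch might look like the following. The type and function names (cplx, cplx_make, cplx_add, cplx_mul) are hypothetical, and the code is written in present-day C with prototypes rather than the K&R dialect of the article.

```c
/* Hypothetical complex-number type; the names are illustrative only. */
typedef struct cplx {
    double re;   /* real part */
    double im;   /* imaginary part */
} cplx;

/* Build a complex value from its two parts. */
cplx cplx_make(double re, double im)
{
    cplx z;
    z.re = re;
    z.im = im;
    return z;
}

/* (a.re + a.im i) + (b.re + b.im i) */
cplx cplx_add(cplx a, cplx b)
{
    return cplx_make(a.re + b.re, a.im + b.im);
}

/* (a.re + a.im i) * (b.re + b.im i), using i*i = -1 */
cplx cplx_mul(cplx a, cplx b)
{
    return cplx_make(a.re * b.re - a.im * b.im,
                     a.re * b.im + a.im * b.re);
}
```

For example, cplx_mul(cplx_make(1, 2), cplx_make(3, 4)) yields -5 + 10i. Every operation becomes a function call, which is the notational cost the article is pointing at when it compares C with a language that has a built-in complex type.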
mauney@ncsu.UUCP (Jon Mauney) (05/16/84)
I cannot let the discussion of programming languages (Pascal, Ada, C, Cobol, Fortran) go by without challenge. In general I find the arguments poorly thought out, but in particular I object to these two statements:

> PASCAL and ADA, with their strong-typing, can cause complexity
> to increase (e.g. dynamic-memory allocation).

> C is specified by a grammar,
> which can GREATLY ease the construction of sophisticated language-
> processing tools; it also improves the performance of the compiler

These statements are so patently absurd that I don't know what to say. Instead of adding my own prejudices to the discussion, let me recommend that everyone read the book "Comparing and Assessing Programming Languages -- Ada, C, Pascal" edited by Feuer and Gehani, published by Prentice-Hall. The book reprints papers on the subject by all the greats, Wirth, Ritchie, Kernighan, Habermann, Shaw, Wulf.

It is perfectly all right to like or dislike a programming language, but if you are going to give reasons, you should know whereof you speak. Feuer and Gehani is a good place to start.

(And those of you who think Pascal is not portable might be interested to know that the users of my parser generators, like the owners of Remington Micro-screen shavers, almost never complain.)

--
_Doctor_ Jon Mauney, mcnc!ncsu!mauney
\__Mu__/ North Carolina State University
ags@pucc-i (Seaman) (05/17/84)
How can anyone discuss programming languages without mentioning Modula-2? Here is a language which combines the best features of C, Pascal and Ada, and which is small enough to run on microcomputers, and no one even considers it when evaluating languages.

--
Dave Seaman
..!pur-ee!pucc-i:ags
"Against people who give vent to their loquacity by extraneous bombastic circumlocution."
ab3@stat-l (Rsk the Wombat) (05/17/84)
Dave, no one mentioned Modula-2 because it's still an experiment, and, besides, the article was titled (see above) "...Support for C"; certainly a great many of the arguments in favor of C may also be applied to Modula-2; but I think it's a little early to consider using Wirth's latest product in a *big way*.

--
Rsk the Wombat
UUCP: { allegra, decvax, ihnp4, harpo, teklabs, ucbvax } !pur-ee!rsk
      { cornell, eagle, hplabs, ittvax, lanl-a, ncrday } !purdue!rsk
ags@pucc-i (Seaman) (05/18/84)
> Dave, no one mentioned Modula-2 because it's still an experiment,
> and, besides, the article was titled (see above) "...Support for C";

Then why did the article mention Pascal, Ada, FORTRAN and (*gasp*) COBOL?

Seriously, I recognize that this group is net.lang.c and not net.lang.mod2, but as long as other people are comparing languages, it seems worthwhile to have some good languages available for comparison. Modula-2 combines the simplicity and readability of Pascal, the low-level facilities and expressive power of C, and the information-hiding, high-level structuring and separate compilation capabilities of Ada in a single language.

I listened to Brian Kernighan's talk here in which he mentioned C++, the new experimental version of C. C++ looks like an attempt to add some of Modula-2's features to C (primarily information hiding), but it lacks Modula-2's built-in facilities for separating the "definition" and "implementation" portions of a module and providing automatic version control, both at compile time and at load time.

There is a fundamental difference between "adding on" features to a language and designing them in from the beginning, a point which Wirth obviously appreciated when he abandoned Pascal and started over with Modula-2. Remember the programmer's axiom: "Build one to throw away."

--
Dave Seaman
..!pur-ee!pucc-i:ags
"Against people who give vent to their loquacity by extraneous bombastic circumlocution."
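For comparison, the nearest conventional C idiom to a "definition"/"implementation" split is a header that declares an opaque type plus a source file that defines it. The sketch below shows both halves in one file so that it stands alone; the counter module and all of its names are hypothetical. Unlike the Modula-2 facility described above, this is only a convention: C itself does not check that the two halves were compiled against the same version.

```c
#include <stdlib.h>

/* ---- "definition" part: what would normally live in a header ---- */
struct counter;                               /* opaque: clients never see the layout */
struct counter *counter_create(int start);
void counter_increment(struct counter *c);
int counter_value(const struct counter *c);
void counter_destroy(struct counter *c);

/* ---- "implementation" part: what would normally live in a .c file ---- */
struct counter {
    int value;                                /* hidden representation */
};

/* Allocate a counter; returns NULL if allocation fails. */
struct counter *counter_create(int start)
{
    struct counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = start;
    return c;
}

void counter_increment(struct counter *c)
{
    c->value++;
}

int counter_value(const struct counter *c)
{
    return c->value;
}

void counter_destroy(struct counter *c)
{
    free(c);
}
```

A client that sees only the four declarations can create, increment, read, and destroy a counter, but cannot touch the value field directly, which is the information-hiding part of the comparison.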