[net.lang.c] Selecting a Prog-Lang: Support for C

tjr@ihnet.UUCP (Tom Roberts) (05/11/84)

[WARNING: this article is 246 lines long]

Thanks to those of you who have sent me suggestions why YOU choose
to program in C. Here are my current thoughts on the subject.

This is intended to provide some "intellectual ammunition" to those
unfortunate disciples of C who must justify its use to their management
(or other Higher Authority). Its length is due to the inherent
complexity of the problem of selecting a Programming Language.

I have NOT attempted to include ANY economic justifications, because
they depend strongly upon your organization's history (I am thinking
of such things as: "It would cost N staff-months to re-train our staff
to use Programming-Language X"). Ultimately, such arguments are the
strongest ones to use to convince management; they are also the most
difficult to quantify.

Please send all flames and "ad hominem" attacks to /dev/null; quibbles
about details are best left to mail; reasonable (and reasoned) discussions 
about issues raised herein are welcome.


INTRODUCTION:

	Here are a few random notes on the choice of a programming-language
	for a medium-to-large software project. I assume that the project
	will include computers from several manufacturers, ranging from micros
	to mainframes, so that only reasonably portable high-level languages
	need be considered (FORTRAN, COBOL, PL/I, PASCAL, ADA, and C).
	I assume the project is primarily data-management, does not contain
	large amounts of floating-point computations, and will involve several
	(many) programmers.  I consider such languages as LISP and SNOBOL to be
	too limited in scope; I consider MODULA-2 to be too new and un-developed
	(but interesting....).

CAVEATS:

	My personal bias shows: I have written ~50000 lines of FORTRAN
	(Elementary Particle Physics), ~250 lines PASCAL (curiosity),
	~12000 lines COBOL (data-base system), ~15000 lines C (OS, and
	misc applications), ~15000 lines assembler (Z80, CDC-6500, HP2100,
	PDP-10, PDP-15, IBM-404 (3-plugboards!)). I have read several
	texts on ADA, but have not programmed in it. At present,
	I program in C whenever permissible, because it has the
	best blend of features and efficiency I desire. I know next to
	nothing about PL/I.

	ALL OPINIONS EXPRESSED ARE MY OWN.

PORTABILITY:

	FORTRAN, PASCAL, and COBOL have a central, portable base;
	this often is NOT sufficient to write complete systems
	(e.g. the lack of portability in their data-files).
	PASCAL is un-usable without some (non-portable) extensions.
	ADA is intended to be portable, and complete; this has yet
	to be demonstrated. C has a de-facto standard (K&R), and
	is portable as a language; the library functions are not always
	portable; operation under UNIX is completely portable, if the
	original code is written with portability in mind (the same can
	almost be said for the other languages - this is more a statement
	about UNIX than the language).
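
	A minimal sketch of the data-file point, assuming the usual cure of
	writing integers one byte at a time in a fixed order, so that the
	file does not depend on the word layout of the machine that wrote
	it. The names put_u32 and get_u32 are hypothetical, and the code
	uses present-day C function prototypes.

```c
#include <assert.h>
#include <stdio.h>

/* Write the low 32 bits of v, most significant byte first.
 * Returns 0 on success, -1 on a write error. */
int put_u32(FILE *fp, unsigned long v)
{
    int shift;

    assert(fp != NULL);
    for (shift = 24; shift >= 0; shift -= 8)
        if (putc((int)((v >> shift) & 0xFF), fp) == EOF)
            return -1;
    return 0;
}

/* Read four bytes written by put_u32 and rebuild the value.
 * Returns 0 on success, -1 on end-of-file or a read error. */
int get_u32(FILE *fp, unsigned long *out)
{
    unsigned long v = 0;
    int i, c;

    assert(fp != NULL && out != NULL);
    for (i = 0; i < 4; i++) {
        if ((c = getc(fp)) == EOF)
            return -1;
        v = (v << 8) | (unsigned long)c;
    }
    *out = v;
    return 0;
}
```

	Writing the in-memory representation directly would be shorter, but
	the resulting file could then be read back only on a machine with
	the same byte order and word size.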

PRODUCTIVITY:

	FORTRAN and COBOL rate "low" on productivity, because of their lack
	of structured-programming features, and their lack of compile-time
	constructs (#define/#include/#if); pre-processors (e.g. RATFOR)
	can mitigate this. PASCAL rates "high", except for
	hardware-dependent programs, strings, and file I/O, where it
	has problems. C rates "high", except possibly for large projects, where
	poor readability can be a drawback; this is highly dependent upon
	the skills and experience of the programming staff. ADA has more
	features and complexity than C, and so can be less readable;
	it can overload operators and functions, which could make it
	either less or more readable; the readability of ADA programs
	will probably depend even more upon the skills and experience
	of the staff.

PERFORMANCE:

	COBOL tends to be slow in execution, mainly due to the type of
	applications using it; implementations on small machines (e.g.
	Z80) are usually interpreters (VERY slow). FORTRAN is OK, but
	the lack of pointers means that data-structures are often implemented
	as two-dimensional arrays, with a multiplication for each
	reference - this can be intolerable on hardware without
	a multiply instruction. PASCAL is OK, if it is compiled;
	many implementations are tokenized interpreters (P-code),
	with poorer performance; its array-bounds checking (part of the
	language) can reduce performance significantly. C has none of these
	drawbacks, and can be partially "hand-optimized" by declaring some
	variables as "register"; most C compilers do not optimize code as
	well as might be wished. C models the instruction-sets of many computers
	very well (especially the modern microprocessor chips like the MC68000,
	WE32000, etc.); on such CPUs, C can approach assembly-language
	in speed and efficiency.  There is little experience with ADA so far;
	initial guesses are that "simple" things will be OK, "complicated"
	things (e.g. tasks) might not.
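
	The contrast between indexed references and pointers can be sketched
	as below. This is only an illustration in present-day C syntax: the
	record type and both function names are hypothetical, and a good
	optimizer may well generate the same code for the two loops.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type, used only for illustration. */
struct record {
    int id;
    int qty;
};

/* Index style: each tab[i] nominally computes
 * base + i * sizeof(struct record), i.e. one multiply per reference. */
long sum_by_index(const struct record tab[], size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += tab[i].qty;
    return total;
}

/* Pointer style: the pointer is advanced by addition only.
 * "register" is a hint that the compiler is free to ignore. */
long sum_by_pointer(const struct record *tab, size_t n)
{
    register const struct record *p;
    const struct record *end;
    long total = 0;

    assert(tab != NULL);
    end = tab + n;
    for (p = tab; p < end; p++)
        total += p->qty;
    return total;
}
```

	Both functions return the same sum; the second simply writes out by
	hand what the text calls "hand-optimizing".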

PROGRAM COMPLEXITY:

	PASCAL and ADA, with their strong-typing, can cause complexity
	to increase (e.g. dynamic-memory allocation).
	FORTRAN causes complexity because it lacks structures
	(this is the MAJOR reason to avoid FORTRAN); call-by-reference can
	interact with COMMON in un-expected (and non-portable) ways;
	dynamic allocation is very difficult, and usually involves
	non-portable operations (e.g. referencing arrays out-of-bounds).
	The lack of recursive functions (FORTRAN, COBOL) can seriously
	complicate inherently recursive algorithms.
	COBOL relies heavily upon global data, making scoping
	virtually impossible; its restricted set of statements makes
	even simple programs LOOK complex; dynamic allocation is
	virtually impossible. C cannot do single-precision floating-point
	arithmetic (float operands are converted to double-precision); complex
	arithmetic is not defined (must be implemented as structures and
	functions). C (and to some extent, ADA) can perform low-level
	operations (e.g. I/O drivers in UNIX are routinely written in C);
	this can greatly improve productivity, complexity, and readability
	when such operations must be performed.
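
	As a sketch of what "implemented as structures and functions" means
	for complex arithmetic, the following defines a hypothetical cplx
	type with three operations. It assumes a compiler that can pass and
	return structures by value, and it uses present-day C prototypes;
	none of these names comes from any standard library.

```c
#include <assert.h>

/* Hypothetical complex type; the language itself defines none. */
struct cplx {
    double re;
    double im;
};

struct cplx cplx_add(struct cplx a, struct cplx b)
{
    struct cplx r;

    r.re = a.re + b.re;
    r.im = a.im + b.im;
    return r;
}

/* (a.re + i*a.im) * (b.re + i*b.im) */
struct cplx cplx_mul(struct cplx a, struct cplx b)
{
    struct cplx r;

    r.re = a.re * b.re - a.im * b.im;
    r.im = a.re * b.im + a.im * b.re;
    return r;
}

/* a / b, computed as a * conj(b) / |b|^2; b must be non-zero. */
struct cplx cplx_div(struct cplx a, struct cplx b)
{
    double d = b.re * b.re + b.im * b.im;
    struct cplx r;

    assert(d != 0.0);
    r.re = (a.re * b.re + a.im * b.im) / d;
    r.im = (a.im * b.re - a.re * b.im) / d;
    return r;
}
```

	The cost relative to FORTRAN's built-in COMPLEX is notational:
	z = cplx_add(cplx_mul(a, b), c) instead of Z = A*B + C.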

SOFTWARE ENGINEERING:

	FORTRAN, PASCAL, and COBOL offer little or no help; pre-processing
	is essential. C contains its own pre-processor with the most-used
	features (#include/#define/#if). PASCAL does not specify separate
	compilation - a VERY big drawback. C is specified by a grammar,
	which can GREATLY ease the construction of sophisticated language-
	processing tools; it also improves the performance of the compiler
	(during development, most systems spend more resources compiling
	than executing). C can provide basic control of symbol location
	(e.g. RAM or ROM), which can simplify symbol-management, and
	permits writing ROM-able code (which is inherently non-portable).
	ADA is so large that I suspect it will be VERY slow during
	program builds; its support environment is also complicated (and,
	I suspect, inefficient) - much of this is dictated by the
	large-scale systems it is intended to support.
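
	A small sketch of the three pre-processor features named above,
	working together. The switch SMALL_MACHINE, the macro names, and the
	table functions are all hypothetical; a real project would likely
	set the switch on the compiler command line (e.g. cc -DSMALL_MACHINE=1).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build-time switch, defaulting to "large machine". */
#ifndef SMALL_MACHINE
#define SMALL_MACHINE 0
#endif

/* #if selects one of two table sizes at compile time. */
#if SMALL_MACHINE
#define TABLE_SIZE 64
#else
#define TABLE_SIZE 1024
#endif

/* Number of elements in an array (not valid for a pointer). */
#define NELEM(a) (sizeof(a) / sizeof((a)[0]))

static int table[TABLE_SIZE];

/* Number of slots chosen at compile time. */
size_t table_capacity(void)
{
    return NELEM(table);
}

/* Store v in slot i; returns 0 on success, -1 if i is out of range. */
int table_put(size_t i, int v)
{
    if (i >= TABLE_SIZE)
        return -1;
    table[i] = v;
    return 0;
}

/* Fetch slot i; the caller must pass an index below TABLE_SIZE. */
int table_get(size_t i)
{
    assert(i < TABLE_SIZE);
    return table[i];
}
```

	The same source file thus builds for the micro and for the mainframe
	without editing, which is the sort of help FORTRAN and COBOL need an
	external pre-processor to provide.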

TOOLS:

	FORTRAN and COBOL have many existing tools, of varying quality and
	portability (many have strong OS dependencies); UNIX tools can be
	of reasonable utility. PASCAL has several integrated programming
	environments, most of which are reasonably portable. ADA has a
	complete, portable environment defined (but un-implemented at
	present). C uses UNIX tools quite well, and has some special
	language-processing tools (e.g. lex and yacc).
	Several major source-handling systems are specifically C-language.

RELIABILITY/MAINTAINABILITY:
	
	This seems to depend more upon the skills and experience of the
	programming and design staff, than upon the choice of language.
	The more intricate languages (C and ADA) can contain more subtle
	"hidden" errors, simply by virtue of their richer syntax; however,
	they can also result in shorter (i.e. fewer NCSL) programs.
	The lack of structures in FORTRAN is a serious drawback, because
	it can make programs un-readable; ditto for use of EQUIVALENCE.
	COBOL tends to be so readable that important items are obscured
	by the incredible amount of extraneous text (i.e. the "Purloined
	Letter" syndrome).

ADDITIONAL COMMENTS:

	ADA:

	ADA is a new language, with no existing programmer base; it will
	take some time to become experienced in ADA programming,
	software engineering, and management. It LOOKS very promising,
	but it has looked so for so long that I worry that it is really
	too complicated, and too difficult to implement. I shy away from
	its byzantine complexity - top-notch programmers will have no
	serious trouble, but I suspect that "average" programmers will
	NEVER come to grips with all of its features/idiosyncrasies (and
	they're the ones who will maintain the code).
	ADA has tried to do EVERYTHING (numerical analysis, real-time control,
	scientific computation, data-base, concurrent programming,
	Operating Systems, etc.); and each application-programmer has
	to learn the special features designed for everyone else. Much will
	probably be written in ADA, but I doubt that many programmers
	will voluntarily choose it (their managers will choose it for them).
	Training programmers to use ADA will surely require more time and
	effort than any of these other languages; I am not yet convinced that
	the savings (mainly software engineering issues) will offset this.
	If ADA truly becomes a universal, portable language, with a portable
	environment to support it, it will probably (and justifiably)
	displace the other languages (you CAN program in a tractable
	subset...); don't hold your breath.

	Strong Typing:

	Strong typing is the attribute of a language that assigns a
	specific "type" to each entity (variable) in a program, and then
	prohibits the mixing of different types. Of the languages discussed,
	PASCAL and ADA are strongly-typed, the others are not (FORTRAN,
	COBOL and C are "weakly" typed in that some mixtures are legal,
	others are not; some type-conversions are automatically supplied).
	The advantage is that some programming errors can easily be detected,
	because mixing types is often illogical or nonsensical. The
	disadvantage is that when you really need to mix types, the
	compiler gets in the way, forcing you to do something special
	(sometimes un-obvious and non-portable).  ADA and C have
	(portable) mechanisms to subvert the typing restrictions.
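
	For C, the two cases can be sketched as follows: an automatic
	conversion that the compiler supplies, and a cast used to give a
	type to storage from the allocator (the dynamic-memory case). The
	node type and function names are hypothetical, and the code uses
	present-day C prototypes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list node. */
struct node {
    int value;
    struct node *next;
};

/* The cast converts sum explicitly; count is then converted to
 * double automatically, so the division is not truncated. */
double average(int sum, int count)
{
    assert(count != 0);
    return (double)sum / count;
}

/* The allocator returns untyped storage; the cast states the
 * intended type. In present-day C the cast is optional; with the
 * char * allocator of older compilers it is required.
 * Returns NULL if no memory is available. */
struct node *node_new(int value, struct node *next)
{
    struct node *n = (struct node *)malloc(sizeof *n);

    if (n != NULL) {
        n->value = value;
        n->next = next;
    }
    return n;
}
```

	Neither function needs anything outside the language and its
	standard allocator, which is the sense in which the mechanism is
	portable.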

	Portability:

	Portability is the attribute of a language that allows a program
	written in it to be run, without change, on several (many)
	computers. ADA is inherently portable, and C is nearly so;
	FORTRAN, COBOL, and PASCAL are NOT portable (they were all intended
	to be so, but the implementations fall far below the intentions);
	in practice they can sometimes be portable enough.
	I feel that portability is VERY important, because the time-scale
	of a software system is typically long compared to the time-scale of
	current hardware advances, and because many applications inherently
	require several different types of computers to cooperate together
	and act as one system.

SUMMARY:

	COBOL is un-suitable because of its poor portability,
	its restricted set of operations, and its lack of efficiency in most
	implementations. Besides, it is just plain UGLY.

	FORTRAN is un-suitable because of its lack of data-structures,
	lack of pointers (and dynamic allocation), and poor portability.

	PASCAL is un-suitable because separate compilation is not specified,
	because its (necessary) extensions are not portable, and because
	of the complexity added by its strong typing.

	ADA is not suitable because it doesn't exist as a useful language
	on a sufficiently-large number of machines.

	C has only minor drawbacks compared to the other languages considered.

CAVEATS (revisited):

	All opinions expressed are my own. Remember that I have virtually
	no experience in PASCAL, and none at all in ADA or PL/I.



				Tom Roberts
				ihnp4!ihnet!tjr

mauney@ncsu.UUCP (Jon Mauney) (05/16/84)

I cannot let the discussion of programming languages (Pascal, Ada, C,
Cobol, Fortran) go by without challenge.  In general I find the 
arguments poorly thought out,  but in particular I object to these
two statements:

>	PASCAL and ADA, with their strong-typing, can cause complexity
>	to increase (e.g. dynamic-memory allocation).

>	C is specified by a grammar,
>	which can GREATLY ease the construction of sophisticated language-
>	processing tools; it also improves the performance of the compiler

These statements are so patently absurd that I don't know what to say.
Instead of adding my own prejudices to the discussion, let me recommend
that everyone read the book 
     "Comparing and Assessing Programming Languages -- Ada, C, Pascal"
     edited by Feuer and Gehani, published by Prentice-Hall.
The book reprints papers on the subject by all the greats, Wirth, Ritchie,
Kernighan, Habermann, Shaw, Wulf.  

It is perfectly all right to like or dislike a programming language,
but if you are going to give reasons,  you should know whereof you
speak.  Feuer and Gehani is a good place to start.

     (And those of you who think Pascal is not portable might be interested
     to know that the users of my parser generators, like the owners of
     Remington Micro-screen shavers, almost never complain.)

-- 

_Doctor_                           Jon Mauney,    mcnc!ncsu!mauney
\__Mu__/                           North Carolina State University

ags@pucc-i (Seaman) (05/17/84)

How can anyone discuss programming languages without mentioning Modula-2?
Here is a language which combines the best features of C, Pascal and Ada,
and which is small enough to run on microcomputers, and no one even considers
it when evaluating languages.
-- 

Dave Seaman
..!pur-ee!pucc-i:ags

"Against people who give vent to their loquacity 
by extraneous bombastic circumlocution."

ab3@stat-l (Rsk the Wombat) (05/17/84)

Dave, no one mentioned Modula-2 because it's still an experiment,
and, besides, the article was titled (see above) "...Support for C";
certainly a great many of the arguments in favor of C may also be
applied to Modula-2; but I think it's a little early to consider using Wirth's
latest product in a *big way*.
-- 
Rsk the Wombat
UUCP: { allegra, decvax, ihnp4, harpo, teklabs, ucbvax } !pur-ee!rsk
      { cornell, eagle, hplabs, ittvax, lanl-a, ncrday } !purdue!rsk

ags@pucc-i (Seaman) (05/18/84)

>  Dave, no one mentioned Modula-2 because it's still an experiment,
>  and, besides, the article was titled (see above) "...Support for C";

Then why did the article mention Pascal, Ada, FORTRAN and (*gasp*) COBOL?

Seriously, I recognize that this group is net.lang.c and not net.lang.mod2,
but as long as other people are comparing languages, it seems worthwhile
to have some good languages available for comparison.

Modula-2 combines the simplicity and readability of Pascal, the low-level 
facilities and expressive power of C, and the information-hiding, high-level
structuring and separate compilation capabilities of Ada in a single language.

I listened to Brian Kernighan's talk here in which he mentioned C++, the
new experimental version of C.  C++ looks like an attempt to add some of
Modula-2's features to C (primarily information hiding), but it lacks
Modula-2's built-in facilities for separating the "definition" and
"implementation" portions of a module and providing automatic version
control, both at compile time and at load time.  

There is a fundamental difference between "adding on" features to a language 
and designing them in from the beginning, a point which Wirth obviously 
appreciated when he abandoned Pascal and started over with Modula-2.  
Remember the programmer's axiom:  "Build one to throw away."
-- 

Dave Seaman
..!pur-ee!pucc-i:ags

"Against people who give vent to their loquacity 
by extraneous bombastic circumlocution."