jperry@UNIX.SRI.COM (John Perry) (05/27/87)
I have actually written about twice as much C code as Pascal or Modula-2 code and so my criticism of C is not made solely at an "airy" theoretical level. You say that "If C is so terrible, then how do you explain so many commercial software developers switching to it?". I thought I answered that in my previous memo, i.e. that some languages have dirty details that make them so desirable in a practical sense that programmers are willing to overlook even the grossest of structural defects. Turing and/or Modula-2 have failed to hold sway in the commercial world for the very simple reason that they arrived after both C and Ada and simply haven't had sufficient people to toot the horn for them. That's all it is --- C got there first and, like the inertia of FORTRAN programmers to learning a different language, C programmers don't want to change either (and I know plenty of them who HATE C but just don't want to make the effort to get out of their bind). Why? Well, as you put it, C is "good enough" --- the typical sunk-cost outlook --- another tool has to be 100% better to justify a changeover.

C, when examined closely, is so full of inconsistencies and unnecessarily cryptic syntactical finesses that (and I will reiterate my practical observation) even good C programmers can be seen consulting a C manual even after several years of practice! I hate to get bogged down in another "features" discussion but it seems I'm being drawn into it --- so here goes.

First, inconsistencies in parameters of library calls. Why is it that, in an fprintf statement, the file is the first parameter but in a putc statement it is the SECOND parameter? And why, oh why, does C have to be different than the rest of the world by making the destination string be the FIRST parameter in string subroutines such as strcat?
Oddly enough, in the original Kernighan and Ritchie C book, their version of strcpy has the destination string as the second parameter on page 100 --- why the change to the counterintuitive in commercial C?

Second, inconsistencies in rules for structured data types. Can anybody give me a good reason why arrays cannot be assigned in assignment statements as can structures? Another artificiality that one must keep in mind.

Third, C's allowance of mixing pointer and array types leads to abominably unreadable code. In some UNIX sources I have seen such mixtures leading to oddities like a[3] meaning "the third value after where the pointer to a is currently pointing" rather than simply the fourth value of the array. Try reading this kind of stuff and keeping your sanity.

Fourth, the simple C data declaration:

    int *pi;

gives you no clue whether pi points to a SCALAR integer or an ARRAY of integers. This kind of ambiguity does not exist in Modula-2 or Pascal where the declaration of pi would tell us whether we have a pointer to a scalar or structured data type.

Fifth, C's rules for initializing character arrays, especially "ragged" arrays of variable length characters, seem to differ from implementation to implementation.

Sixth, C allows an unreadable degree of programmer cleverness as in such code segments as:

    strcpy(s,t)
    char *s,*t;
    {
    while(*s++ = *t++);}

which relies on the artifact that a string terminator happens to be an ASCII zero and that, since the value of an assignment statement is the value of the left hand side, the while will terminate when NULL is encountered. C moguls actually ENCOURAGE this kind of "idiomatic" expression e.g. Kernighan and Ritchie -- "Although this may seem cryptic at first sight, the NOTATIONAL CONVENIENCE (my emphasis) is considerable, and the idiom should be mastered ...". By notational convenience they mean THAT IT CAN BE TYPED IN QUICKLY!!!
And, of course, the "idiom should be mastered" if one is to enter the pantheon known as "C cognoscenti" --- God forbid if one's C code looks like that of a Pascal programmer!!

I could rant on and on about the poor human engineering of the C language (how many times have you gotten caught on if a = c instead of if a == c or *p++ versus (*p)++ ??) and get even further bogged down in the "features" quagmire. But to what end?? The "bottom line" of my complaints about C is that it is a poorly engineered language which CANNOT be improved by continually adding "features". Its features are counterintuitive, poorly human engineered in that they invite error, and literally beckon the kind of cryptographic, "clever" code which seems to be ENCOURAGED by its leading proponents.

On the other hand, I feel that the basic skeleton structure of a language like Modula-2 is so sound that the addition of precious few features and a couple minor language changes would create a nearly ideal programming tool. But, it'll probably never happen because the artificialities and ambiguities of C create the sort of mystique about the "difficulties of programming" that programmers love. Next to chess players, their egos are the most insufferable.

John Perry
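For readers unfamiliar with the two slips named near the end of the post above, here is a minimal sketch of what each pair of expressions does. It is written in present-day standard C rather than the K&R style used in the thread, and the function names are hypothetical, invented only for this illustration.

```c
/* Hypothetical helper names; nothing here comes from the thread itself. */

/* "if (a = c)": the condition tests the value just assigned, i.e. c. */
int assign_in_condition(int a, int c)
{
    /* The doubled parentheses mark the assignment as deliberate. */
    if ((a = c))
        return 1;   /* taken whenever c is nonzero, whatever a held before */
    return 0;
}

/* "if (a == c)": the condition compares the two values and changes neither. */
int compare_in_condition(int a, int c)
{
    if (a == c)
        return 1;
    return 0;
}

/* *p++ : fetch the element p points to, then advance the pointer itself. */
int fetch_then_advance(const int **pp)
{
    const int *p = *pp;
    int value = *p++;   /* parsed as *(p++) */

    *pp = p;            /* hand the advanced pointer back to the caller */
    return value;
}

/* (*p)++ : increment the element p points to; the pointer does not move. */
int increment_target(int *p)
{
    return (*p)++;      /* yields the old value of the element */
}
```

Many compilers can be asked to warn about an unparenthesized assignment used as a condition, which is one common way the first slip gets caught in practice.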
gwyn@BRL.ARPA (Doug Gwyn, VLD/VMB) (05/28/87)
> From: John Perry <jperry@sri-unix.arpa>
> I have actually written about twice as much C code as Pascal or
> Modula-2 code and so my criticism of C is not made solely at an "airy"
> theoretical level.

That doesn't follow at all. Your criticisms amount to complaints that C does not meet some ideal model you have in mind for a programming language. As will be shown, your understanding of C is superficial, and evidently hampered by your prejudices. I'm not objecting to that in itself, but to the fact that you are making recommendations based on such views.

> You say that "If C is so terrible, then how do you explain so many
> commercial software developers switching to it?". I thought I answered
> that in my previous memo i.e. that some languages have dirty details
> that make them so desirable in a practical sense that programmers
> are willing to overlook even the grossest of structural defects.

I see. It's not utility you seek, but unusable structural "perfection". The things that make C a better language for practical work you call "dirty details", as if attention to such matters weren't fundamental. We're about to investigate what you apparently identify as these "gross structural defects" (of course your favorite language couldn't possibly have any defects, could it -- they'd just be "minor imperfections").

> ... Well, as you
> put it, C is "good enough" --- the typical sunk-cost outlook --- another
> tool has to be 100% better to justify a changeover.

That's not an accurate assessment. It is a waste of time to design yet another programming language that has comparable facilities to existing ones, unless it is clearly superior in several significant respects (which has not been demonstrated in this debate). One sufficiently good language of each general type is plenty. Programming languages are TOOLS, not ENDS IN THEMSELVES. (Leaving aside educational considerations.)
> C, when examined closely, is so full of inconsistencies and
> unnecessarily cryptic syntactical finesses that (and I will reiterate
> my practical observation) even good C programmers can be seen
> consulting a C manual even after several years of practice!

The only thing I've seen people whom I would consider to be good C programmers look up (infrequently) about the language is the precedence of operators in expressions, in cases involving combinations of bitwise, shift, conditional, and assignment operators. (Some programmers would simply use extra parentheses to be sure, rather than look up the precedence.) Most programming languages don't have this problem because they're not capable of supporting such expressions in the first place. I will admit, as will any good C programmer, that the bitwise operators would have been more conveniently given higher precedence than the equality operators. We're used to it, though, and again many languages don't have that problem because they force more cumbersome expression in the first place.

Occasionally one will look up the order of arguments to one of the vast number of infrequently-used functions in the C library. Generally one needs to check the functional specification anyway at that point. That would apply equally to any language. (Keyword parameters are only useful if you remember everything about the interface except parameter order; they might be a useful addition to C some day if a good syntax could be devised (name=value already has another meaning).)

> First, inconsistencies in parameters of library calls. Why is it
> that, in an fprintf statement, the file is the first parameter but in
> a putc statement it is the SECOND parameter?

Because putc(c,f) was considered to be the natural way to order "put character to file". The printf() family have to support variable parameter lists, and that consideration forced the parameters to come last. At least C HAS variable parameter lists!
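The parameter-order point can be sketched with a small variadic wrapper. This is an illustration in present-day standard C using `<stdarg.h>`, not code from the thread; the function name `report` is hypothetical. It shows why the stream and format come first in the printf() family (the variable arguments can only follow the last fixed parameter), next to putc(), which takes the character first and the stream second.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: prints a formatted line to the given stream.
   The fixed parameters (stream, fmt) have to precede the "..." part. */
int report(FILE *stream, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);              /* variable arguments begin after fmt */
    n = vfprintf(stream, fmt, ap);  /* stream first, as with fprintf() */
    va_end(ap);

    /* putc(): character first, stream second. */
    if (n >= 0 && putc('\n', stream) != EOF)
        n++;
    return n;                       /* characters written, or negative on error */
}
```

A call such as `report(stdout, "%d-%s", 42, "ok")` passes two variable arguments after the format string.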
> And why, oh why, does C
> have to be different than the rest of the world by making the destination
> string be the FIRST parameter in string subroutines such as strcat?

I tried explaining this by analogy with dst:=src, but you seem to have missed the point. It could have been done either way, just so long as the str*() functions are consistent in this (which they are).

Another response would be, your concept of the "rest of the world" is seriously limited in scope. Why isn't Modula-2 just like Fortran in everything it does, hmm? Isn't that what the "rest of the world" does?

> Oddly enough, in the original Kernighan and Ritchie C book, their version
> of strcpy has the destination string as the second parameter on page 100
> --- why the change to the counterintuitive in commercial C?

I knew already you were a C illiterate; this proves it. K&R's strcpy() is compatible with the one in the standard C library. In fact, they explain why the parameter order is the way it is, in much the way I did. Or maybe you don't read English either?

Again, what you consider "intuitive" is not necessarily universal. If I had no prior experience with dst:=src or MOV DST,SRC then I admit I would have a slight inclination toward strcpy(src,dst), and I normally design function interfaces with input parameters followed by output parameters, but the other way around doesn't bother me either. Why is this so important? Do you program by trying to GUESS how things work instead of LEARNING how things are used? Maybe it's a good thing that C does bite the sloppy craftsman; it might lead to improved work habits.

> Second, inconsistencies in rules for structured data types. Can
> anybody give me a good reason why arrays cannot be assigned in assignment
> statements as can structures? Another artificiality that one must keep
> in mind.

Yes: the name of an array in C is a pointer to the first element of the array. You CAN assign the name of an array to an appropriate pointer variable.
There simply is no syntax for specifying assignment of the entire array. The memcpy() function can of course be used for this when it is necessary to copy an array into another. In C, that is an infrequent situation, because pointers to objects are so easy to use that whenever possible experienced C programmers will copy pointers, not the objects they point to. After all, that's more efficient and C was specifically designed for use in efficient systems programming.

C's identification of array names with pointers has been extremely useful. It unfortunately had the side-effect of making arrays themselves "second-class" data objects with respect to some operations, but then virtually all languages do this. (You cannot meaningfully divide scalars by arrays, for example.)

> Third, C's allowance of mixing pointer and array types leads to
> abominably unreadable code. In some UNIX sources I have seen such
> mixtures leading to oddities like a[3] meaning "the third value after
> where the pointer to a is currently pointing" rather than simply the
> fourth value of the array. Try reading this kind of stuff and keeping
> your sanity.

Experienced C programmers virtually never access the nth element like this for n other than -1, 0, or 1. Try reading the equivalent operation (with a pointer!) in other languages and see if you're saner. The reason for the parenthetical qualification in the previous sentence is that pointers are often the natural way to access objects in C, unlike Pascal for instance where array indices are easier to use in practically all cases. C use of array indexing would look much like Pascal's, but often that is not the natural way to effectively exploit C's facilities.

> Fourth, the simple C data declaration:
>
>     int *pi;
>
> gives you no clue whether pi points to a SCALAR integer or an ARRAY
> of integers.
> This kind of ambiguity does not exist in Modula-2
> or Pascal where the declaration of pi would tell us whether we have
> a pointer to a scalar or structured data type.

In your example, `pi' is a pointer to an (int) object, just as the declaration says. The object might be one of several organized as an array, or it might not. A pointer to an array would be declared as

    int (*pi)[];

where the array dimension may optionally be specified (in such cases, the only dimensions that must be specified in C are the ones necessary to determine how to locate an array entry).

I realize you probably intended the point addressed by the second sentence in my paragraph above. Why is that a problem? It follows naturally from the very language structure that makes x[y] == *(x+y); again your complaint seems to be that C uses different abstractions from the ones you're happy with. That is simply a matter of upbringing and personal taste, unless you can demonstrate objectively that one alternative is superior on all counts of real-world value, which I doubt very much you can do.

> Fifth, C's rules for initializing character arrays, especially
> "ragged" arrays of variable length characters, seems to differ from
> implementation to implementation.

You want standards, we're about to publish one. There's no ambiguity in the standard language specification about this. The only possible ambiguity in current non-buggy implementations would concern whether or not to diagnose the following as an error:

    char s[2] = "xy";

(since the initializer seems to require space for its NUL terminator). This is a rather rare situation. Otherwise, the rules are quite clear.
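The array and pointer points argued over in the preceding exchange can be put side by side in a short sketch. It is written in present-day standard C, and the type and function names (`struct pair`, `copy_pair`, `copy_ints`, `kth_after`, `last_of_four`, `two_chars_no_terminator`) are hypothetical, chosen only for illustration. The last function assumes the behaviour that standard C settled on for `char s[2] = "xy";`, namely that it is accepted and no terminator is stored.

```c
#include <string.h>

struct pair { int a, b; };

/* Structures can be assigned as a whole ... */
void copy_pair(struct pair *dst, const struct pair *src)
{
    *dst = *src;
}

/* ... arrays cannot, so the elements are copied with memcpy()
   (destination first, as with the str*() functions). */
void copy_ints(int *dst, const int *src, size_t n)
{
    memcpy(dst, src, n * sizeof *src);
}

/* p may point at a lone int or anywhere inside an array; the type
   "const int *" does not say which. p[k] is defined as *(p + k). */
int kth_after(const int *p, int k)
{
    return p[k];
}

/* A pointer to a whole array of 4 ints carries the dimension in its type. */
int last_of_four(int (*pa)[4])
{
    return (*pa)[3];
}

/* Both elements are filled and no '\0' is stored, so s is a
   2-element char array but not a C string. */
int two_chars_no_terminator(void)
{
    char s[2] = "xy";

    return s[0] == 'x' && s[1] == 'y' && sizeof s == 2;
}
```

With `int v[4] = {10, 20, 30, 40};`, the call `kth_after(v + 1, 2)` reads `v[3]`, which is the "third value after where the pointer is currently pointing" case described in the quoted complaint.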
> Sixth, C allows an unreadable degree of programmer cleverness
> as in such code segments as:
> strcpy(s,t)
> char *s,*t;
> {
> while(*s++ = *t++);}
>
> Which relies on the artifact that a string terminator happens to be an
> ASCII zero and that, since the value of an assignment statement is the
> value of the left hand side, the while will terminate when NULL is
> encountered. C moguls actually ENCOURAGE this kind of "idiomatic"
> expression e.g. Kernighan and Ritchie -- "Although this may seem cryptic
> at first sight, the NOTATIONAL CONVENIENCE (my emphasis) is considerable,
> and the idiom should be mastered ...".

First of all, use the right whitespace, comments, etc., as in the reference you gave but did not understand:

    strcpy(s, t)    /* copy t to s; pointer version 3 */
    char *s, *t;
    {
        while (*s++ = *t++)
            ;
    }

Next, note that C strings are terminated by 0-valued (char)s; this is true no matter what the target character set -- ASCII has nothing to do with it. It does not "happen" by accident but is a key feature of C character strings directly supported by the language (one can define count+data style strings etc. himself if desired).

By the way, nobody I know of (including K&R) recommends using NULL as a symbol for anything in C other than a null pointer, which is not the same as the string terminator (invariably written as '\0' or simply 0).

The rest of the quotation, which you elided from your biased summary, is "..., if for no other reason than that you will see it frequently in C programs." This is a perfectly valid point for a text teaching C to make, and the implication is that one might want to consider whether one should write code like this oneself. As a matter of fact, many experienced C programmers (including myself) would show the comparison against the 0-terminator explicitly.
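A version with the comparison against the terminator written out might look like the sketch below. It is an illustration in present-day standard C, not a quotation from K&R or from the poster; the name `copy_string` is hypothetical and is used only to avoid clashing with the library's strcpy().

```c
/* Hypothetical name; behaves like strcpy() for non-overlapping strings. */
char *copy_string(char *dst, const char *src)
{
    char *start = dst;

    /* Copy one character per iteration, stopping after the '\0'
       terminator itself has been copied. */
    while ((*dst++ = *src++) != '\0')
        ;   /* empty body: all the work is done in the condition */

    return start;
}
```

The explicit `!= '\0'` does not change what the loop does; it only states the termination test that the shorter form leaves implicit.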
We would probably change other details of the strcpy() implementation, for which K&R showed four different versions, of which you selected the one that is specifically intended to help explain one of the more obscure commonly-encountered idioms. Note that K&R do NOT use that style in subsequent examples. This hardly constitutes "encouragement".

> By notational convenience they mean THAT IT CAN BE TYPED IN
> QUICKLY!!! And, of course, the "idiom should be mastered" if one is
> to enter the pantheon known as "C cognoscenti" --- God forbid if
> one's C code looks like that of a Pascal programmer!!

I'm not sure what a "Pascal programmer"'s code looks like. (I thought it varied from programmer to programmer, depending on training, experience, thoughtfulness, organizational standards, etc. Funny that C should be considered immune from these sources of variation.) Certainly my C code doesn't look like the picture you try to paint of code of "C cognoscenti" (which I take to mean "those who know what they're doing when it comes to C" -- is that meant to be derogatory?).

> I could rant on and on about the poor human engineering of the C
> language (how many times have you gotten caught on if a = c instead
> of if a == c

Never.

> or *p++ versus (*p)++ ??)

Never.

> and get even further bogged
> down in the "features" quagmire. But to what end??

C deliberately has few "features", in the sense of unnecessary frills. It does have "features" in the sense of useful facilities for doing certain types of jobs effectively. It is by no means one of the "big" languages (PL/I, Ada, etc.)

> ... On the other hand, I feel that the
> basic skeleton structure of a language like Modula-2 is so sound that
> the addition of precious few features and a couple minor language
> changes would create a nearly ideal programming tool.

Yes, one gathers you believe that.
Unlike you, I'm not about to criticize a language for attributes I don't fully appreciate or because it isn't just like my favorite language.

> But, it'll probably never happen because the artificialities
> and ambiguities of C create the sort of mystique about the "difficulties
> of programming" that programmers love. Next to chess players, their
> egos are the most insufferable.

How long have you had this complex?

Seriously, C was designed by software developers for use by software developers in a particular unregulated type of development environment. If the environment had been like those favored by structured design methodologies (which were barely in their infancy when C was designed), C very likely would have "package"-like facilities rather than the more simple (and more universally supported by linkers) linkage it provides. C was never intended to be a Beginner's All-purpose Symbolic Instruction Code, nor a teaching tool (Pascal), nor something intended to be read by business managers (COBOL), nor a replacement for myriads of specialized languages for embedded systems (Ada).

The real reason professional programmers are jumping on the C bandwagon is that C was designed for use by professional programmers; they find it an effective tool for developing applications. This is no more a "mystique" than a carpenter's preference for tools that an unskilled layman would have a hard time using properly.

If YOU PERSONALLY don't happen to like C, you don't have to use it. However, it is morally and ethically WRONG to make recommendations to people who might be looking for advice when you can't do so fairly. In this context, fairness would require that you evaluate the languages in comparable terms -- to state that one of them has serious structural flaws while the other is nearly perfect is simply not true and might mislead the innocent (of which I'm fortunately not one).
samples@RENOIR.BERKELEY.EDU (A. Dain Samples) (05/28/87)
Date: Thu, 28 May 87 10:26:53 PDT
From: samples (A. Dain Samples)
Message-Id: <8705281726.AA27318@renoir.Berkeley.EDU>
To: com-sys-apple@ucbvax
Subject: Re: The C Language
This debate/argument over language characteristics -- which is
better/best, which is dirty/clean -- is very much like two engineers
engaged in building a three-mile bridge arguing over the brand of
blue-print paper they use.
To say that languages survive and are used because they were there
first and are therefore entrenched is only trivially true. There are
two points to be made, each directed at two different users of
programming languages.
The first class of users are those that enjoy programming: they enjoy
the puzzles to be solved, they enjoy seeing ``Hello, world!'' come up
on a screen when programmed in an about-to-be-learned language, they
experience satisfaction when any set of programming sentences has been
successfully turned into a working program. These people then often
confuse cause-and-effect, or means-and-ends: ``I enjoyed that! And I
was programming in language X, therefore language X must be pretty good
to give me that kind of enjoyment.'' That is why we see religious
arguments over any and every programming language, from FORTRAN, to C,
to APL, to FORTH, to SNOBOL, to Algol, to Pascal, to LISP, to C++, to
... need I go on?
Let us not ignore the effect of the language, however. Some languages
are more fun to program in than others, and I agree with those who say
C is a ``fun'' language: it certainly is a never ending source of
surprise, it provides plenty of opportunity for puzzle-solving, and it
is a clever implementation of some pretty basic programming principles.
That is, I think it is fun until I try to get a serious, large project
completed. Then frustration sets in. But this brings me to the second
class of programmer: those trying to bring a serious, large programming
effort to successful completion. The primary reason that such people
use all of the ``bad'' languages, and the reason that people using the
``better'' languages don't seem to do any better than the users of
``bad'' languages is
THE PROBLEMS THAT PROGRAMMING LANGUAGES SOLVE ARE ORTHOGONAL TO
THE PROBLEMS THAT MUST BE SOLVED IN LARGE PROGRAMMING
PROJECTS.
In other words
THE PROGRAMMING LANGUAGE USED DOESN'T MATTER IN LARGE
PROGRAMMING PROJECTS.
Obviously, these are overstatements designed to make a point; in
reality, the problems are almost orthogonal, and language choice
usually doesn't matter. Again, in other words, the features of a
particular programming language do not attack the problems of large
systems. That is why the Shuttle software is written in FORTRAN (there
was a CACM article in the last couple of years on this). This is why
large financial programs are still written in COBOL. And that is why
DARPA is investing millions of dollars STILL, after all these years, in
how to develop large software systems (it seems there is this
project the President wants to do that requires several million
[billion?] lines of code).
Yes, translation and investment costs come into play, but only as a
second order effect.
The primary problems to be solved in any large system are communication
and consistency. But not JUST between programming units, but also
between programmers, programming teams, managers, managers' bosses,
users, applications engineers, salesmen, etc. etc. etc. And I
guarantee you, that solving these problems for PEOPLE is far worse than
solving them for PROGRAMS! So, the choice of a language is swamped by
the other problems to be solved, and it really doesn't make that much
difference in the global scope of things. It makes some difference,
granted, but not nearly as much as the lowly programmer having to work
with the language might think.
Develop a language that solves the programming-in-the-large problems, and
THEN we can have a meaningful, resolvable argument!
Summary: ANY programming language offers its form of fun, and has its
adherents. But any arguments about one language being better than
another make sense ONLY when we are talking about programming-in-the-
small (and even then they are religious arguments, as well they should
be). But when it comes to serious, large, important programming
projects, the decision of which language to use is not made until so
many other more important decisions have been made, that it almost
doesn't matter.
I mean, come on, do bridge engineers start a design meeting by saying,
``Well, what brand of blue-print paper shall we use on this project?''
In the spirit of spirited debate,
Dain
gwyn@brl-smoke.UUCP (05/29/87)
In article <8705281733.AA27424@renoir.Berkeley.EDU> samples@RENOIR.BERKELEY.EDU (A. Dain Samples) writes:
>Develop a language that solves the programming-in-the-large problems, and
>THEN we can have a meaningful, resolvable argument!

Thanks for a good article.

There is a widespread misconception that the proper programming (or meta-programming) language would solve all our problems. In fact, problems are solved by people thinking about them and getting good ideas, not by mechanical methods (except for a few especially boring classes of problems, or as AIDS for people solving problems). Programming language design can assist or hinder development of proper computer solutions, but as you observe that's not the really hard part.
sloan@pnet02.CTS.COM (Steve Smythe) (05/31/87)
Hear, hear... C is a wonderful language, but people should be aware of the fact that because of C's wonderful "features", programmers have to adapt to these practices.

Since there are so many of us who seem to have Apples and C compilers, why not start up a group? I'm in for posting tons of source code for those who are looking for portable code...

Sloan
UUCP: {akgua!crash, hplabs!hp-sdd!crash}!gryphon!pnet02!sloan
INET: sloan@pnet02.CTS.COM
ranger@ecsvax.UUCP (Rick N. Fincher) (06/01/87)
I'd like to interrupt here to thank Doug for his Posix postings to the net. It's people willing to do things like this who make the net so useful. I was just getting ready to sit down and write code to do what Doug's programs do (i.e. read Prodos directories in "C") when he posted his programs. This will be a great help to a lot of folks and it comes at an ideal time. With Apple pushing C as the development language for the //gs, it is a good time to set some standards.

Thanks, Doug

Rick Fincher
ranger@ecsvax
bdw@peaks.UUCP (06/01/87)
In article <575@gryphon.CTS.COM>, sloan@pnet02.CTS.COM (Steve Smythe) writes:
> Hear, hear... C is a wonderful language, but people should be aware of the fact
> that because of C's wonderful "features", programmers have to adapt to these
> practices..

{akgua!crash, hplabs!hp-sdd!crash}!gryphon!pnet02!sloan writes:
>
> Since there are so many of us that seem to have Apples and C compilers, why
> not start up a group? I'm in for posting tons of source code for those who
> are looking for portable code...
>
> Sloan

There are lots of unix-like tools which could make the PRODOS, and especially the DOS, environment more useful and efficient to developers and hackers using C if they (the tools) existed. However, why not share this stuff through comp.sys.apple and/or comp.sources?

Bruce as hao!boulder!peaks!bdw