[comp.unix.questions] comma operator: keep away?

jas@ernie.Berkeley.EDU (Jim Shankland) (04/18/89)

In article <628@gonzo.UUCP> daveb@gonzo.UUCP (Dave Brower) writes:
[A long diatribe against use of the comma operator in all contexts except
macro bodies, and, sometimes, "for" loop initialization and continuation.]

Granted, everyone should write code that is easily understandable by
others.  Granted, also, that the way to do that depends on idioms in
common use in a particular language:  constructs that are common usage
in, e.g., LISP, may be obscure in C.  Granted, finally, that it is quite
possible to write obscure C code by using the comma operator (and in
countless other ways).

I still suggest that a C programmer who understands:

(A)	if (x->in_use)
	{
		x++;
		y++;
	}

but who is mystified by:

(B)	if (x->in_use)
		x++, y++;

had best be investigating alternate career paths.
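
For anyone who wants to check that the two forms do the same work, here is a minimal sketch. The struct slot type and the function names advance_braces and advance_comma are made up for illustration; only the two "if" bodies come from the post, with y taken through a pointer so the caller can see the result.

```c
/* Hypothetical record type; the post only shows the in_use member. */
struct slot {
    int in_use;
};

/* Form (A): compound statement. Returns the advanced x and bumps *y. */
struct slot *advance_braces(struct slot *x, int *y)
{
    if (x->in_use)
    {
        x++;
        (*y)++;
    }
    return x;
}

/* Form (B): comma operator. The left operand is evaluated first and
   its value discarded, then the right operand is evaluated. */
struct slot *advance_comma(struct slot *x, int *y)
{
    if (x->in_use)
        x++, (*y)++;
    return x;
}
```

Both functions advance x and increment *y exactly when in_use is nonzero. The practical difference shows up later, when someone adds a third statement to (B) on its own line and forgets that there are no braces.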

The goal daveb@gonzo is pursuing is an admirable one:  how can we get
programmers, especially masses of programmers, to write code that
is correct, understandable, and modifiable by others, in reasonable
time?  It is certainly one of the most important questions in
programming, and answers have been slow in coming.  I understand
and sympathize with the temptation to tilt at windmills; but it
won't kill the dragons.

The world is full of bad programmers.  A magic potion that would turn
them all into good programmers would be a useful thing, indeed.
Brings to mind the (surely apocryphal) story about the ne'er-do-wells
who finagled an NSF grant to research ways to turn cow shit into butter.
Having partied away the entire grant, they applied for supplemental
funds, with the argument:  "We've gotten it to spread like butter,
now we need more money to work on the taste."

Exorcising the comma operator from bad code may make it spread better.
The taste, alas, remains unchanged.

Jim Shankland
jas@ernie.berkeley.edu

"Blame it on the lies that killed us, blame it on the truth that ran us down"

bill@twwells.uucp (T. William Wells) (04/19/89)

In article <28831@ucbvax.BERKELEY.EDU> jas@ernie.Berkeley.EDU (Jim Shankland) writes:
: I still suggest that a C programmer who understands:
:
: (A)   if (x->in_use)
:       {
:               x++;
:               y++;
:       }
:
: but who is mystified by:
:
: (B)   if (x->in_use)
:               x++, y++;
:
: had best be investigating alternate career paths.

The thing that is wrong with the latter has little to do with
mystification. What is wrong is that, for rapid and accurate
understanding of code, one should avoid appearing to do more than one
thing at a time.

In other words, the physical layout of the code should make each thing
being done appear distinct from all the other things being done.

How often have you been burned by missing the second part of a comma
operator? How frequently have you missed the assignment in some
idiocy like:

	foo(<long---complicated---expression>, ((bletch = 1) * glorch));

And how often have you missed the significance of an expression that
is pasted on the end of an if? As in awful code like:

		if (expression) { statement;
		statement;      /*!*/
	}

Look rapidly at the examples following the form feed.

Which one would you understand most quickly? And which one, after a
quick glance, would you believe you had understood more accurately?


	if (!(fp1 = fopen(name1, "r")) || !(fp2 = fopen(name2, "r"))) {
		error();
	}

	if (  !(fp1 = fopen(name1, "r"))
	   || !(fp2 = fopen(name2, "r"))) {
		error();
	}

The point is that the latter makes it clear that two different things
are happening. With the former, you have to figure this out for yourself.

The cost is in time, and if you are just skimming the code, in
immediate comprehension.

Of course, such differences are trivial in any given fragment of code,
but when carried out throughout the program, they can make the
difference between code that is easy to understand and code whose
reading feels like wading through a swamp.

---
Bill                            { uunet | novavax } !twwells!bill
(BTW, I may be looking for a new job sometime in the next few
months.  If you know of a good one where I can be based in South
Florida do send me e-mail.)

rupley@arizona.edu (John Rupley) (04/19/89)

In article <28831@ucbvax.BERKELEY.EDU>, jas@ernie.Berkeley.EDU (Jim Shankland)
writes:
> The world is full of bad programmers.  A magic potion that would turn
> them all into good programmers would be a useful thing, indeed.
> Brings to mind the (surely apocryphal) story about the ne'er-do-wells
> who finagled an NSF grant to research ways to turn cow shit into butter.
> Having partied away the entire grant, they applied for supplemental
> funds, with the argument:  "We've gotten it to spread like butter,
> now we need more money to work on the taste."

Ah, but the __true__ story is even better......

At a meeting of the National Academy, in 1967 I believe, there was a
panel discussion with several significant bureaucrats from the NSF and
NIH.  They floated the idea that science funding should be partly if
not largely through block grants to Universities, whose administrators
would have the power to distribute the funds.  This horrifying concept
prompted a response from Kistiakowsky (kinetics, explosions, atom bomb,
Eisenhower, science advisor -- for those too young in years to
remember).  He rose from the audience and said that the proposal
reminded him of conversations he had had with a Soviet Academician.
Several years previously, the Academician had described the Russian
project to convert shit into butter.  Subsequently, Kistiakowsky
recounted, the Academician told him the project was half completed --
they had learned to spread it on bread.  Metaphor complete,
Kistiakowsky returned to his seat.

John Rupley
rupley!local@megaron.arizona.edu

henry@utzoo.uucp (Henry Spencer) (04/25/89)

In article <28831@ucbvax.BERKELEY.EDU> jas@ernie.Berkeley.EDU (Jim Shankland) writes:
>I still suggest that a C programmer who understands:
>
>(A)	if (x->in_use)
>	{
>		x++;
>		y++;
>	}
>
>but who is mystified by:
>
>(B)	if (x->in_use)
>		x++, y++;
>
>had best be investigating alternate career paths.

This is very true.  On the other hand, a C programmer who writes the latter
should also be investigating alternate career paths, because he's clearly
an amateur in a business that needs professionals.  Anyone who says "any
competent programmer ought to be able to understand that!" rather than
"I should make my code as clear as possible!" is an amateur, and one with
an ego problem at that.  Readability is very much a matter of what you're
used to.  Like it or lump it, most C programmers are used to (A) and not (B).
A programmer with a professional attitude will therefore use (A) to make his
code as readable as possible, unless there is some special reason to do
otherwise.

To quote #8 from the Ten Commandments for C Programmers:

	Thou shalt make thy program's purpose and structure
	clear to thy fellow man by using [a familiar style],
	even if thou likest it not, for thy creativity is better
	used in solving problems than in creating beautiful new
	impediments to understanding.
-- 
Mars in 1980s:  USSR, 2 tries, |     Henry Spencer at U of Toronto Zoology
2 failures; USA, 0 tries.      | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

jas@ernie.Berkeley.EDU (Jim Shankland) (04/25/89)

In article <1989Apr24.172219.817@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>On the other hand, a C programmer who [uses a comma operator to sequence
>two closely related expression evaluations, instead of two statements]
>should ... be investigating alternate career paths, because he's clearly
>an amateur in a business that needs professionals.  Anyone who says "any
>competent programmer ought to be able to understand that!" rather than
>"I should make my code as clear as possible!" is an amateur, and one with
>an ego problem at that.  Readability is very much a matter of what you're
>used to.  Like it or lump it, most C programmers are used to [the compound
>statement] and not [the comma construct].

Sigh.  Must have been lousy weather in Toronto that morning.

We're agreed that a professional programmer should code as clearly
and readably as possible, and that gratuitous cleverness is a bad
thing.  My original posting said so.  What I consider self-evidently
true, and you consider self-evidently false, is that the comma
construct is sometimes at least as clear as the equivalent compound
statement.  Short of running code comprehension experiments on
hundreds of bad C programmers, I don't see how we can resolve our
difference of opinion.  But at least I haven't questioned your
professionalism.

Now look what you've done.  Made me grumpy.  Not, on the whole, an
easy thing to do.

Jim Shankland
jas@ernie.berkeley.edu

"Blame it on the lies that killed us, blame it on the truth that ran us down"

friedl@vsi.COM (Stephen J. Friedl) (04/25/89)

In article <1989Apr24.172219.817@utzoo.uucp>, henry@utzoo.uucp (Henry Spencer) writes:
> Anyone who says "any competent programmer ought to be able to understand
> that!" rather than "I should make my code as clear as possible!" is an
> amateur, and one with an ego problem at that.

When writing software, always assume that the next person reading
your code will be somebody much less qualified than you, and they
will be required to fix a very hot bug with little time for study.

You'll probably be right.

     Steve

-- 
Stephen J. Friedl / V-Systems, Inc. / Santa Ana, CA / +1 714 545 6442 
3B2-kind-of-guy   / friedl@vsi.com  / {attmail, uunet, etc}!vsi!friedl

As long as Bush is in office, you'll never see Nancy Reagan in *my* .sig.

crewman@bucsb.UUCP (JJS) (04/26/89)

In article <1989Apr24.172219.817@utzoo.uucp> (Henry Spencer) writes:
>
>On the other hand, a C programmer who [uses a comma operator to sequence
>two closely related expression evaluations, instead of two statements]
>should ... be investigating alternate career paths, because he's clearly
>an amateur in a business that needs professionals.  Anyone who says "any
>competent programmer ought to be able to understand that!" rather than
>"I should make my code as clear as possible!" is an amateur, and one with
>an ego problem at that.  Readability is very much a matter of what you're
>used to.  Like it or lump it, most C programmers are used to [the compound
>statement] and not [the comma construct].
>

I believe that should read, 'most *Pascal* programmers are used to the compound
statement and not the comma construct'.  This is just my opinion, and one
with which many people differ, but why are C programmers always advised not
to use the unique features of C?  So far, I've been advised against:

	-- using the comma construct
	-- using the (a ? b : c) expression
	-- using an assignment as a function
	-- using "if (a)" instead of "if (a != 0)"
	-- using += -= *= /= |= &= ^=
	-- using << >> <<= >>=
	-- using complex pointer-integer expressions
	-- generally complex expressions: map |= (line[k++] = string[j++]);
	-- writing compact source code
	etc.

I've also been advised to go out of my way to make my C programs look like
programs written in another language:

#define begin 	{
#define end	}
#define then			/* if (a == 1) then return; */
#define when	break; case	/* when 'a': foo(); when 'b': bar() */
#define otherwise break; default: /* otherwise baz(); */
etc.

Getting back to the poster of the above article, why is the person who likes
the comma construct an amateur?  Is it because s/he likes a unique feature
of C?  Or is it really because the above "professional" would get confused
reading such code?  In a business which so sorely "needs professionals", why
are we bending over backwards to make our programs easier to read for
amateurs?

C is designed to be a low-level language.  Its advantage over machine
language and even higher-level languages is its power derived from its
simplicity and compactness.  It is in this way an elegant language; as
C programmers, we have the tools to make powerful programs compact.  Yet
we are constantly advised to throw away these tools.  I believe that
that is VERY unprofessional.

I repeat: this is just my opinion so mail flames to me.

                  -- JJS

henry@utzoo.uucp (Henry Spencer) (04/27/89)

In article <28890@ucbvax.BERKELEY.EDU> jas@ernie.Berkeley.EDU.UUCP (Jim Shankland) writes:
>Sigh.  Must have been lousy weather in Toronto that morning.

No, actually, the Toronto weather's been fine.

>We're agreed that a professional programmer should code as clearly
>and readably as possible, and that gratuitous cleverness is a bad
>thing.  My original posting said so.  What I consider self-evidently
>true, and you consider self-evidently false, is that the comma
>construct is sometimes at least as clear as the equivalent compound
>statement...

As my original posting said, it is a well-established fact, backed by
various experiments on things like reading rates, that familiar styles
are easier to read.  This particular comma construct is uncommon and
hence likely to be unfamiliar.  QED.

There are always unusual cases where the usual style is, for some special
reason, less clear than some variant.  But these are unusual cases, and
should not be taken to indicate acceptable variation in normal code.
-- 
Mars in 1980s:  USSR, 2 tries, |     Henry Spencer at U of Toronto Zoology
2 failures; USA, 0 tries.      | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

henry@utzoo.uucp (Henry Spencer) (04/27/89)

In article <2495@bucsb.UUCP> crewman@bucsb.bu.edu writes:
>Getting back to the poster of the above article, why is the person who likes
>the comma construct an amateur?  Is it because s/he likes a unique feature
>of C?  Or is it really because the above "professional" would get confused
>reading such code?  In a business which so sorely "needs professionals", why
>are we bending over backwards to make our programs easier to read for
>amateurs?

We are not bending over backwards to make our programs easier to read for
amateurs; we are bending over backwards to make them easier to read for our
fellow professionals.  We are not trying to meet some minimum standard of
readability, and indulging our little egos beyond that; we are trying to
make the stuff AS CLEAR AS POSSIBLE even if it hurts.

>C is designed to be a low-level language...

Opinions vary on this.  How do you write "*p++ = *q++" in Fortran?  Which
is the higher level language, based on this comparison?  (Or any comparison,
apart from the marginal issue of complex numbers.)

>...as C programmers, we have the tools to make powerful programs compact.  Yet
>we are constantly advised to throw away these tools.  I believe that
>that is VERY unprofessional.

As C programmers, we have the tools to make powerful programs utterly
unreadable if we make the slightest effort in that direction.  Making them
readable, on the other hand, takes significant work and considerable restraint.
-- 
Mars in 1980s:  USSR, 2 tries, |     Henry Spencer at U of Toronto Zoology
2 failures; USA, 0 tries.      | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

gwyn@smoke.BRL.MIL (Doug Gwyn) (04/27/89)

In article <1989Apr26.214622.10697@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>As my original posting said, it is a well-established fact, backed by
>various experiments on things like reading rates, that familiar styles
>are easier to read.  This particular comma construct is uncommon and
>hence likely to be unfamiliar.  QED.

Oh, come on, Henry, what is "common" or "familiar" depends very much
on one's experience.  There are many common cases of truly horrible
coding style, and most of us are familiar with examples of it.  That
in itself does not make it easier to read, desirable, or anything
else along those lines.  The best one can say is that familiarity is
one positive factor in code readability, but there are many others too.
Clear logical structure is probably more important.

While I don't recommend unbridled use of comma operators in place of
semicolons, neither would I call usage such as
	while ( norm( x, y, z ) < distance )
		++x, ++y, ++z;
"unreadable".  It seems to me that it would be perfectly readable
to anyone who should be attempting to deal with C code at all.  It may
even be preferable to adding the extra punctuation that you recommend.
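
A compilable sketch of the loop above, for anyone who wants to try it. The post does not say what norm() computes or what types are involved, so this assumes a hypothetical norm() that returns the squared Euclidean length in integers (which avoids needing the math library) and wraps the loop in an invented function, advance_until.

```c
/* Hypothetical norm(): squared Euclidean length of (x, y, z),
   kept in integers so the sketch needs no math library. */
static long norm(long x, long y, long z)
{
    return x * x + y * y + z * z;
}

/* Step all three coordinates together until norm() reaches
   `distance`; returns the final x. The loop body is the one
   from the post, unchanged. */
long advance_until(long x, long y, long z, long distance)
{
    while (norm(x, y, z) < distance)
        ++x, ++y, ++z;
    return x;
}
```

Starting from (0, 0, 0) with a distance of 12, the loop runs twice and stops at (2, 2, 2), where the squared length is exactly 12.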

pjh@mccc.UUCP (Pete Holsberg) (04/29/89)

In article <1989Apr26.215113.10807@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
=As C programmers, we have the tools to make powerful programs utterly
=unreadable if we make the slightest effort in that direction.  Making them
=readable, on the other hand, takes significant work and considerable restraint.
=-- 

Hear, hear!  After spending a very frustrating day on a "2-3 hour job",
I want to say: write good documentation, too. Write an overview of the
program, and each major function.  If you print the source code, put
tabs on the pages and include a table of contents and maybe a "cflow"
output.  My client "wasted" some bucks today because of all the reading
and paper-shuffling I had to do.

Pete


-- 
Pete Holsberg                   UUCP: {...!rutgers!}princeton!mccc!pjh
Mercer College				CompuServe: 70240,334
1200 Old Trenton Road           GEnie: PJHOLSBERG
Trenton, NJ 08690               Voice: 1-609-586-4800

Kemp@DOCKMASTER.NCSC.MIL (04/30/89)

 >>C is designed to be a low-level language...
 Henry Spencer replies:
 > Opinions vary on this.  How do you write "*p++ = *q++" in Fortran?
 > Which is the higher level language, based on this comparison?  (or
 > any comparison, apart from the marginal issue of complex numbers.)

Much as I fear disagreeing with Henry, I have to respond to this :-)

In Fortran, one writes "p(i) = q(i)", at least inside a loop, which is
where such a construct is normally found.  This is "higher level" in
that it is more like the mathematical notion of "copy array q to array
p" than the machine level description "fetch data pointed to by q,
increment q, copy to location pointed to by p, increment p".
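
To make the comparison concrete, here are the two C spellings of the copy side by side. This is only a sketch; the function names and the choice of double elements are assumptions for illustration.

```c
#include <stddef.h>

/* Pointer form: the idiom quoted from Henry's article. */
void copy_ptr(double *p, const double *q, size_t n)
{
    while (n-- > 0)
        *p++ = *q++;
}

/* Index form: the closest C analogue of Fortran's p(i) = q(i). */
void copy_idx(double *p, const double *q, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        p[i] = q[i];
}
```

Both functions copy n elements from q to p and leave the caller's pointers untouched, since the increments act on local copies.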

While opinions may (and do) vary on everything, some opinions correspond
to reality more closely than others.  If you define "high level" as
meaning "a higher level of data abstraction" then Fortran is clearly
higher level than C for the types of data it was designed to handle
(numbers, arrays, strings, and structures thereof).  If you equate higher
level with "fewer characters needed to define an algorithm", then opinions may
vary.  For anything involving explicit pointer manipulation (trees, queues,
etc) Fortran is entirely unsuitable, and C is thus "higher level".

-----------------------
Arrays:
  Although C has an array notation, it is not used very effectively. I
don't know of any C compilers that do array bounds checking (which does
not mean that none exist, but they are not on the machines I use).
Fortran programmers take such checking for granted.
  In addition, C programmers use the *p++=*q++ construct to copy an array
because historically C compilers have been too stupid to optimize
"p[i] = q[i]".

  Fortran-8x allows an even higher level representation of array
operations, for example:

Fortran: do i = 1,100            C:  p = array_p;
              p(i) = q(i)            q = array_q;
           end do                    for (i=0; i<100; i++)
                                          *p++ = *q++;

is replaced by one line:

  p(1:100) = q(1:100)

----------------------------
I/O:
  In Fortran, one can write an array:

    write(6,1) "p", p
1   format(/a,":",1000(5f6.1/))

producing:

p:
   4.2  14.3  27.8  13.9  25.6
  41.7  32.5  83.3  54.2  71.0
  92.3   7.2  42.7  89.3  43.6
     . 
     .
     .

How "high level" is the C code to do the same thing?
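
One possible answer, as a sketch only: a hypothetical helper, format_rows, that imitates the 5f6.1 layout with snprintf (standardized in C99, so later than this thread). The function name, the buffer interface, and the return convention are all assumptions; it also omits the leading blank line that the "/" at the start of the Fortran format would produce.

```c
#include <stdio.h>
#include <stddef.h>

/* Write "label:" on its own line, then the values five per line,
   each formatted as %6.1f. Returns the number of characters
   written, or -1 if the buffer is too small. */
int format_rows(char *buf, size_t size, const char *label,
                const double *v, size_t n)
{
    size_t used;
    size_t i;
    int w;

    w = snprintf(buf, size, "%s:\n", label);
    if (w < 0 || (size_t)w >= size)
        return -1;
    used = (size_t)w;

    for (i = 0; i < n; i++) {
        /* End the line after every fifth value and after the last. */
        const char *end = ((i + 1) % 5 == 0 || i + 1 == n) ? "\n" : "";

        w = snprintf(buf + used, size - used, "%6.1f%s", v[i], end);
        if (w < 0 || (size_t)w >= size - used)
            return -1;
        used += (size_t)w;
    }
    return (int)used;
}
```

The loop and the bookkeeping are what the Fortran repeat count and format descriptor do implicitly, which is the point of the question.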

Fortran includes I/O within the language definition, which allows
the compiler to detect errors that C compilers can't.  For example,

  write(6,"(i4,3f6.2)") a, b, c

will produce a "data/format mismatch error" if a is not an integer,
otherwise it executes correctly.  What will the equivalent C code:

  printf("%4d%6.2f%6.2f%6.2f\n", a, b, c);

do if a is not an int, or when there is not a fourth variable to
print?  Which is the "higher level" behavior?
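
In C the job of matching conversions to arguments falls on the programmer. Passing too few arguments, or a value of the wrong type for a conversion, is undefined behavior rather than a diagnosed run-time error, although many compilers can warn about it when the format is a string literal. A sketch of a correctly matched call, using snprintf so the result can be inspected; the helper name format_record and the parameter d (the "fourth variable") are invented for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* Format one int and three doubles with the post's format string.
   Every conversion has a matching argument of the right type.
   Returns whatever snprintf returns. */
int format_record(char *buf, size_t size,
                  int a, double b, double c, double d)
{
    return snprintf(buf, size, "%4d%6.2f%6.2f%6.2f", a, b, c, d);
}
```

Giving the parameters real types in a wrapper like this lets the compiler check callers against the prototype, which recovers a little of what the Fortran compiler checks for free.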

Fortran has list-directed i/o (default formatting) and implied loops:

   write(6,*) "a=", a, " i=", i, " p=", (p(j), j=1,4)

Isn't this "higher level" than printf?

------------------------
Strings:
  Fortran strings are represented by a pointer and a length; C strings
are just a pointer, with a null byte representing the end of the string.
Now what kind of a scuzzy, data-dependent, low-level hack is that?
Strings without bounds are what caused the chfn() security hole. How
many other instances do you know of where the use of str???() instead
of strn???() caused problems?  Why does C need bzero(), bcopy(), etc
in addition to strcpy(), strcmp(), etc?

  In Fortran, one writes             instead of
     color = "red"                       strcpy(color, "red");

Which is "higher level"?
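
On the C side, the unbounded strcpy can at least be replaced by a length-checked copy. A sketch, using a hypothetical helper named set_string built on snprintf (C99, so not available when this was posted); snprintf never writes past the given size and always terminates the string when the size is nonzero.

```c
#include <stdio.h>
#include <stddef.h>

/* Copy src into dst, which has room for `size` bytes.
   Returns 0 on success, -1 if src did not fit and was truncated. */
int set_string(char *dst, size_t size, const char *src)
{
    int n = snprintf(dst, size, "%s", src);

    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

With char color[8], the call set_string(color, sizeof color, "red") succeeds; with a 3-byte buffer it stores "re" and reports the truncation instead of overrunning.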


How much code would it take to write the following in C (where for example
dir = "/home/dpk" and file = "junk")?

    open(1, dir(:lnblnk(dir))//"/"//file(:lnblnk(file))//".f")

How high level is the equivalent code in C?
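
For comparison, a sketch of the C side, assuming dir and file are ordinary null-terminated strings with no trailing blanks (so nothing like lnblnk is needed). The helper name source_path is made up, and the resulting name would still have to be handed to fopen.

```c
#include <stdio.h>
#include <stddef.h>

/* Build "<dir>/<file>.f" in buf. Returns 0 on success,
   -1 if the result did not fit in `size` bytes. */
int source_path(char *buf, size_t size, const char *dir, const char *file)
{
    int n = snprintf(buf, size, "%s/%s.f", dir, file);

    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

With dir = "/home/dpk" and file = "junk" this yields "/home/dpk/junk.f". The concatenation itself is one line; the buffer and the size check are the part Fortran's string expressions hide.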

----------------------------

  There indeed may be no objective reality to the term "high level", but
IMHO there are numerous cases (besides those involving complex numbers)
where Fortran is easier to write, easier to read, and easier for the
computer to compile into efficient code.

    Dave Kemp <Kemp@dockmaster.ncsc.mil>

thor@stout.ucar.edu (Rich Neitzel) (05/02/89)

In article <19378@adm.BRL.MIL> Kemp@DOCKMASTER.NCSC.MIL writes:
>
>Arrays:
>  Although C has an array notation, it is not used very effectively. I
>don't know of any C compilers that do array bounds checking (which does
>not mean that none exist, but they are not on the machines I use).
>Fortran programmers take such checking for granted.

BTW, on all Fortran compilers I've ever used, bounds checking is a compiler
option, typically NOT a default option. Certainly when I used to write
realtime programs for RSX-11M no one used the bounds checking. Not 
something taken for granted.

>  In addition, C programmers use the *p++=*q++ construct to copy an array
>because historically C compilers have been too stupid to optimize
>"p[i] = q[i]".

Sorry, but looking at the assembler output for several Fortran compilers
versus C compilers on the same machine, I find that Fortran compilers are
equally "too stupid to optimize" this construct. Alas, for
	do 10 i = 1, 100		for (i = 0; i < 100; i++)
	   j(i) = k(i)			   j[i] = k[i];
   10   continue

virtually the same assembler output was generated for both languages. 
If you want "smart", C is better here, since it lets the programmer
optimize "around" the "dumb" compiler.

Besides, if one wants to do a straight copy between two arrays, 
then
	j(1:100000) = k(1:100000)    vs.     bcopy(k,j,sizeof(k));
In this case, for large upper bounds the C form is more efficient
and certainly less data-dependent.
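
bcopy is a BSD routine; the standard C library spelling is memcpy (or memmove when the regions may overlap), and it takes the destination first. A sketch with an invented helper name, copy_ints:

```c
#include <string.h>
#include <stddef.h>

/* Copy n ints from k to j. The regions must not overlap; use
   memmove if they might. Note the argument order: destination
   first, the reverse of bcopy(src, dst, len). */
void copy_ints(int *j, const int *k, size_t n)
{
    memcpy(j, k, n * sizeof *k);
}
```

The length is passed explicitly because sizeof(k) gives the array size only where k is declared as an array; inside a function that receives a pointer it would give the size of the pointer.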

>
>Fortran includes I/O within the language definition, which allows
>the compiler to detect errors that C compilers can't.  

Granted, but there are times when I need to "violate" the rules. 
For example, I once needed to print out the bit patterns for 32
bit floating point numbers. In C this was simple and easy to follow:

	printf("%g = %x\n",float,float);
In Fortran it would require double declarations and equivalences and be
less clear (at least to me :-)).
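
A caution for anyone trying this today: "float" is a keyword and cannot be a variable name, a float argument is promoted to double in a call to printf, and %x expects an unsigned int, so the language does not define what the line above prints. A portable sketch of the same idea, assuming float is a 32-bit IEEE 754 single and using a hypothetical helper named float_bits:

```c
#include <stdint.h>
#include <string.h>

/* Return the bit pattern of f. Assumes float is 32 bits wide
   (IEEE 754 single precision on most current machines). */
uint32_t float_bits(float f)
{
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);  /* copy the bytes, no conversion */
    return bits;
}
```

It can be printed with something like printf("%g = %08lx\n", f, (unsigned long)float_bits(f)), which is still a good deal shorter than the declarations and EQUIVALENCE statements the Fortran version would need.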

>Strings:
>  Fortran strings are represented by a pointer and a length; C strings
                                                ^^^^^^^^^^^^
>are just a pointer, with a null byte representing the end of the string.
>Now what kind of a scuzzy, data-dependent, low-level hack is that?

In my humble opinion it is Fortran that is data dependent here. Try
finding out how big the actual string data is versus the amount of storage
allocated for the maximum size. Most Fortran compilers I've used have
initialized strings with blanks! Since these are often significant 
characters in a string, how do you find the end of valid data? That's
right, you do it yourself. Now you need a variable to keep track of the
end of data index. And as for parsing, tell me what Fortran routine 
can do anything similar to strtok? Gosh, you mean I have to have 
separate strings big enough to hold the maximum expected character
size for every parsed item? Nope, I believe it's Fortran that is data-
dependent.
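
For reference, a sketch of the strtok usage alluded to above. The helper name split_words and the choice of blank and tab as delimiters are illustrative. The tokens point into the original buffer, so no separate per-token storage is needed.

```c
#include <string.h>

/* Split `line` in place on blanks and tabs, storing up to `max`
   token pointers in `out`. Returns the number of tokens stored.
   strtok writes NUL bytes into `line`, so it must be modifiable. */
int split_words(char *line, char **out, int max)
{
    int n = 0;
    char *tok = strtok(line, " \t");

    while (tok != NULL && n < max) {
        out[n++] = tok;
        tok = strtok(NULL, " \t");
    }
    return n;
}
```

Note that strtok keeps hidden state between calls and treats runs of delimiters as one, so it suits simple word splitting better than fields that may be empty.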

>  There indeed may be no objective reality to the term "high level", but

Agreed, but it seems to me that the term is used to distinguish languages
like C and Fortran from assembler. I would have a hard time being 
convinced that C is not higher level in those terms.
-------------------------------------------------------------------------------

			Richard Neitzel
			National Center For Atmospheric Research

			thor@thor.ucar.edu

    	Torren med sitt skjegg		Thor with the beard
    	lokkar borni under sole-vegg	calls the children to the sunny wall
    	Gjo'i med sitt shinn		Gjo with the pelts
    	jagar borni inn.		chases the children in.

-------------------------------------------------------------------------------

bagpiper@oxy.edu (Michael Paul Hunter) (05/02/89)

In article <28831@ucbvax.BERKELEY.EDU> jas@ernie.Berkeley.EDU (Jim Shankland) writes:
>Granted, everyone should write code that is easily understandable by
>others.  Granted, also, that the way to do that depends on idioms in
>common use in a particular language:  constructs that are common usage
>in, e.g., LISP, may be obscure in C.  Granted, finally, that it is quite
>possible to write obscure C code by using the comma operator (and in
>countless other ways).

With a finite language (finite number of symbols), a finite number of states,
and a finite length program, I believe that you are only going to get
a countable (finite) number of ways to write obscure code... :)
maybe large...but finite!

				       mike