[comp.lang.c] learning C declaration syntax

throopw@xyzzy.UUCP (Wayne A. Throop) (08/26/87)

> john@bc-cis.UUCP (John L. Wynstra)

> Sure it's nice now to see the same thing in the declaration as in the
> procedural text, and aids like typedef help reduce the problem, but it
> took me a long time to figure out the "inside-out" logic required to
> read C declarations, and an especially nasty one, like the (*b)[10]
> discussed in the net now, will still throw me a little (I write it down
> and puzzle it out).

This is a widespread reaction to C declaration syntax, and in many ways
I can see the point.  I thought it arbitrary and bizarre until I read
the explanatory note in the K&R tutorial which states that declaration
syntax mimics expression syntax.  But once one knows this about
declarations, why is it still difficult?  After all, are assignments
like

        *a1[n] = 23;
or
        (*a2)[n] = 23;
or
        i = (*a3[n])();

confusing?  Surely it isn't really hard to figure out that a1 is an
array of pointers to int, a2 is a pointer to an array of int, and a3 is
an array of pointers to functions returning int?
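
As a rough sketch of the point above, the three assignments can be compiled against declarations that mirror them token for token. Only a1, a2, and a3 come from the post; the size N, the helper forty_two, and the function run_assignments are assumptions made up for illustration.

```c
#include <assert.h>

#define N 4                         /* hypothetical array size */

int forty_two(void) { return 42; }  /* hypothetical function for a3 to call */

/* Each declaration has the same shape as the expression that uses it. */
int *a1[N];           /* *a1[n] is an int:    array of pointers to int   */
int (*a2)[N];         /* (*a2)[n] is an int:  pointer to an array of int */
int (*a3[N])(void);   /* (*a3[n])() is an int: array of pointers to
                         functions returning int */

/* Performs the three assignments from the post and returns i. */
int run_assignments(int n)
{
    static int target;  /* the int that a1[n] will point at  */
    static int row[N];  /* the array that a2 will point at   */
    int i;

    assert(n >= 0 && n < N);
    a1[n] = &target;
    a2 = &row;
    a3[n] = forty_two;

    *a1[n] = 23;        /* index first, then dereference       */
    (*a2)[n] = 23;      /* dereference first, then index       */
    i = (*a3[n])();     /* index, dereference, then call       */

    assert(target == 23 && row[n] == 23);
    return i;
}
```

Calling run_assignments(2) from a main function exercises all three forms; in each case the declaration can be read by asking which expression yields a plain int.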

Note that I think C's declaration syntax *is* indeed misguided and
counterintuitive, especially since this rule is violated (for good
reason, which is the point) for structure, enum, and union declarations.
But once one knows the rule, why is it still difficult to parse
declarations?  The question boils down to: Is traditional expression
syntax for indirection, subscription, and function invocation hard (for
humans) to parse, and is that the reason C declarations are hard (for
humans) to parse?  If this is so, what does this say about such
expressions in general?  Does there need to be a cexpr that does for
expressions what cdecl does for declarations?  Was COBOL right to spell
things out in pseudo-english?

        SUBSCRIPT a3 BY n AND INDIRECT THAT AND INVOKE THAT GIVING i.
or
        EVALUATE 23 GIVING INDIRECT a2 SUBSCRIPTED BY n.
or
        EVALUATE 23 GIVING SUBSCRIPT a2 BY n AND INDIRECT THAT.

After all, this is the expression-like equivalent to what people seem to
prefer for declarations.

I'm not trying to be sarcastic... just wondering.

--
"That saw isn't very sharp."
                --- Comment by the Marquis of Anglesey to the doctor
                    amputating his leg.
-- 
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw

kent@xanth.UUCP (08/30/87)

[in response to a long, thoughtful note about the difficulty of C
declaration syntax, not included for brevity:]

Wayne,

If you judge by the amount of message traffic in comp.lang.c where one
C user is trying to explain to other C users how to read a declaration
involving pointers, arrays, and functions all mixed together (it runs
upwards of 50% in a long term average), then there surely is
_something_ confusing (wrong?) with C's way of doing declarations.

My own guess is that it is just much too general purpose.  One of the
early lessons embedded into the design of Ada, and in fact mandated in
the procurement document, is that making something easy (in the sense
of terse) to write is a long term loser in trying to get maintainable
(read easy to understand) code.  It would probably have made much more
sense to mandate for C the multiple typedef sort of declarations that
several correspondents have mentioned that they use to keep themselves
sane while writing complex C declarations, even at the cost of many
extra keystrokes.
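
The "multiple typedef" style mentioned above might look like the following sketch, which builds an array of function pointers one step at a time and compares it with the one-line form. Every name here (int_func, int_func_ptr, handler_table, call_all, and the rest) is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One step per typedef. */
typedef int int_func(void);              /* function returning int       */
typedef int_func *int_func_ptr;          /* pointer to such a function   */
typedef int_func_ptr handler_table[3];   /* array of 3 such pointers     */

static int one(void)   { return 1; }
static int two(void)   { return 2; }
static int three(void) { return 3; }

/* The same type written both ways. */
handler_table table_by_typedef = { one, two, three };
int (*table_direct[3])(void)   = { one, two, three };

/* Calls every handler in the table and returns the sum of the results. */
int call_all(int_func_ptr *handlers, int count)
{
    int total = 0;
    int i;

    assert(handlers != NULL);
    for (i = 0; i < count; i++)
        total += (*handlers[i])();
    return total;
}
```

Both tables can be passed to call_all unchanged, which shows that the typedef chain and the terse declarator name the same type; the cost is three extra lines and three extra names.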

C was designed by a couple of the top programmers of this century,
working in a shop full of their peers.  Things that seemed easy and
natural to them seem difficult and obscure to those of us more toward
the average in ability.  One usually designs tools for the average
user, not the super user, but that wasn't the case for C.  Many
problems with C are easier to explain from that perspective.

Also, as shown by the program names in Unix(tm), they surely hated to
type!  ;-)

Kent, the man from xanth.

ark@alice.UUCP (08/31/87)

In article <2305@xanth.UUCP>, kent@xanth.UUCP writes:
> Also, as shown by the program names in Unix(tm), they surely hated to
> type!  ;-)

If you were developing software on a 15 char/second terminal,
you'd keep it terse too.

gwl@rruxa.UUCP (08/31/87)

In article <2305@xanth.UUCP>, kent@xanth.UUCP (Kent Paul Dolan) writes:
> _something_ confusing (wrong?) with C's way of doing declarations.
> 
> My own guess is that it is just much too general purpose.  One of the
> early lessons embedded into the design of Ada, and in fact mandated in
> the procurement document, is that making something easy (in the sense
> of terse) to write is a long term loser in trying to get maintainable
> (read easy to understand) code.  It would probably have made much more
> sense to mandate for C the multiple typedef sort of declarations that
> several correspondents have mentioned that they use to keep themselves
> sane while writing complex C declarations, even at the cost of many
> extra keystrokes.
>

          When we have enough real life experience with Ada, then we
shall see just how effective a tool that language is.  For now, we 
have enough evidence on C.

> 
> C was designed by a couple of the top programmers of this century,
> working in a shop full of their peers.  Things that seemed easy and
> natural to them seem difficult and obscure to those of us more toward
> the average in ability.  One usually designs tools for the average
> user, not the super user, but that wasn't the case for C.  Many
> problems with C are easier to explain from that perspective.
>

      So, why put limits on those who don't need them?  C is flexible.
Try doing some of the things that C is capable of with a language such as
Pascal!!!!  If you do not feel comfortable with certain programming
constructs, then stay away from them, but don't force the rest of us
to do so just because you want to.

> 
> Also, as shown by the program names in Unix(tm), they surely hated to
> type!  ;-)
>
     Many a one fingered typist thanks them.  As for me I can beat the
secretaries in speed races and it makes me twice as fast as when I
used VM/CMS!!!!!

> 
> Kent, the man from xanth.


George W. Leach

Bell Communications Research      New Jersey Institute of Technology 
444 Hoes Lane       4A-1129       Computer & Information Sciences Dept.
Piscataway,  New Jersey   08854   Newark, New Jersey   07102
(201) 699-8639

UUCP:  ..!bellcore!indra!reggie
ARPA:  reggie%njit-eies.MAILNET@MIT-MULTICS.ARPA

From there to here, from here to there, funny things are everywhere
Dr. Seuss "One fish two fish red fish blue fish"

peter@sugar.UUCP (Peter da Silva) (09/02/87)

> Also, as shown by the program names in Unix(tm), they surely hated to
> type!  ;-)

If you had to do all your work on a 110 baud ASR33, you'd surely hate to
type, too.
-- 
-- Peter da Silva `-_-' ...!seismo!soma!uhnix1!sugar!peter
--                  U   <--- not a copyrighted cartoon :->

firth@sei.cmu.edu (Robert Firth) (09/03/87)

In article <223@xyzzy.UUCP> throopw@xyzzy.UUCP (Wayne A. Throop) writes:

[ is the syntax of C declarations confusing ]

>This is a widespread reaction to C declaration syntax, and in many ways
>I can see the point.  I thought it arbitrary and bizarre until I read
>the explanatory note in the K&R tutorial which states that declaration
>syntax mimics expression syntax.  But once one knows this about
>declarations, why is it still difficult?  After all, are assignments
>like
>
>        *a1[n] = 23;
>or
>        (*a2)[n] = 23;
>or
>        i = (*a3[n])();
>
>confusing?

Wayne, I find the syntax of both declaration and use very hard to read.
Perhaps if I used only C I'd get used to it, but I don't.

There seem to me two main sources of confusion for other than C experts.
The first is a use of graphical symbols that is, by almost any standard,
highly eccentric.  That "*" is a good example.  Any naive person knows
that "*" is the ASCII symbol for "multiply".  To use it for indirection
is strange.  To use it for both indirection AND to mean "POINTER TO" in
a declaration is doubly strange.

The second cause of confusion is really trivial, but its ramifications
are enormous.  There are basically three composable structuring symbols
in C: "*" (POINTER TO), "[]" (ARRAY OF), and "()" (FUNCTION RETURNING).
One uses prefix syntax, the other two use postfix.  That is a disastrous
error in human engineering.  If all three were postfix, I believe C code
would be much easier to read.  To see what I mean, try reading some
Modula-2 code, where all the destructuring operations are postfix; it's
easy to see that

	x[i]^[j].k

simply means "take the i'th element of x, dereference, take the j'th element,
select the k component."
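
For comparison, here is a sketch of how the same access might be spelled in C. The record type, the sizes, and the function names are assumptions invented for this example; only x, i, j, and k come from the Modula-2 expression.

```c
#include <assert.h>

struct rec { int k; };          /* hypothetical record with a k component */

enum { ROWS = 2, COLS = 3 };    /* hypothetical sizes */

/* x: array of ROWS pointers, each pointing to an array of COLS records. */
struct rec (*x[ROWS])[COLS];

/* Stores a pointer to a whole array of records in slot i of x. */
void set_row(int i, struct rec (*row)[COLS])
{
    assert(i >= 0 && i < ROWS);
    x[i] = row;
}

/* C spelling of x[i]^[j].k: the prefix * has to be parenthesized so that
   the dereference happens between the two subscripts. */
int get_k(int i, int j)
{
    assert(x[i] != 0);
    return (*x[i])[j].k;
}
```

The Modula-2 form reads strictly left to right, while the C form needs a pair of parentheses because the postfix subscript binds more tightly than the prefix dereference.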

The combination of prefix and postfix syntax also of course creates
problems over strength of binding, and hence the need for all those
parentheses.  You know, to create a binding order problem for essentially
monadic operators seems pretty incompetent to me (but I'm biased, as
you perceive).

ron@topaz.rutgers.edu (Ron Natalie) (09/05/87)

I'm sure Ken and Dennis had a Model 37 or they would have made UNIX
use more uppercase and more sensible character sequences other than
|, {, and } which Model 33's don't have.  (I, by the way, did write
my first C program on a model 33, the UNIX lcase mapping was used
more frequently then).

-Ron

throopw@xyzzy.UUCP (Wayne A. Throop) (09/08/87)

> firth@sei.cmu.edu (Robert Firth)
>> throopw@xyzzy.UUCP (Wayne A. Throop)
>> [...] the K&R tutorial which states that declaration
>>syntax mimics expression syntax.  But once one knows this about
>>declarations, why is it still difficult?
> There seem to me two main sources of confusion for other than C experts.
> The first is a use of graphical symbols that is, by almost any standard,
> highly eccentric.  

I agree that C chooses its symbols in an idiosyncratic manner, and for
someone trying to remember both C-and-something-else this is a problem.
But the difficulty seems to be something much worse than that.  People
aren't forgetting what symbols stand for what, or the details of whether
prefix operators or postfix operators are "done first".  The confusion
is more fundamental.  No, people are confused to the point of not
knowing how to proceed.  This is what I find strange.  Pick up K&R, find
the table of operators and their precedence, and you have everything
you need.  Yet people are so confused that they don't even do *this*,
and I am at a total loss to explain it.

> The combination of prefix and postfix syntax also of course creates
> problems over strength of binding, and hence the need for all those
> parentheses.  

Absolutely correct.  But again, the degree of confusion seems totally
out of line with the provoking cause.  If nobody told you about
anything, if there were no tables of the rules, one could get lost.  But
presumably people are told about the anomaly of indirection being
prefix.  Having been told about this, and having a wealth of reference
materials (K&R for example being particularly widespread), why do they
forget?  Surely people could at least remember that something is anomalous
here, and remember to look it up in the table of operators?

So I guess I'm agreeing that (a[i]^.f^(x)) is much clearer than
((*(*a[i]).f)(x)).  And I'll agree that it would even be clearer than
the simplest way I can see to say it in X3J11-style C: (a[i]->f(x)),
because of all the arcane abbreviations that that entails (and it
doesn't help the declaration syntax any).  But even so... is it really
*that* hard?  Do the idiosyncratic choices of symbol and the misbegotten
choice of prefixness for indirection really weigh *that* heavily?
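
A small sketch may help in comparing the C spellings of a[i]^.f^(x). It assumes that a is an array of pointers to structures whose member f is a pointer to a function; the structure, the function twice, and the helper names are hypothetical, and the long form shown is just one correctly parenthesized way to write it.

```c
#include <assert.h>

struct obj {
    int (*f)(int);              /* f: pointer to function from int to int */
};

int twice(int x) { return 2 * x; }      /* hypothetical function */
struct obj doubler = { twice };         /* hypothetical object   */

enum { COUNT = 2 };
struct obj *a[COUNT];           /* a: array of pointers to struct obj */

/* Stores a pointer to an object in slot i of a. */
void install(int i, struct obj *o)
{
    assert(i >= 0 && i < COUNT);
    a[i] = o;
}

/* Long form: index, dereference, select f, dereference, call. */
int call_long(int i, int x)
{
    return (*(*a[i]).f)(x);
}

/* X3J11-style form: -> combines dereference and select, and a function
   pointer may be called without an explicit dereference. */
int call_short(int i, int x)
{
    return a[i]->f(x);
}
```

After install(1, &doubler), both call_long(1, 21) and call_short(1, 21) perform the same steps; the long form needs two pairs of parentheses only because each prefix * would otherwise apply after the postfix operators to its right.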

Further, if symbols are arcane and hard to learn, in all seriousness
now, wouldn't it be better to say

        a SUBSCRIPTED BY i INDIRECT SELECT f INDIRECT INVOKE WITH x

rather than

        a[i]^.f^(x)

--
14. Are you tired?  If not, go back to step 7.
15. Go to sleep.  Then, wake up, and go back to step 7.
                        --- Knuth- Fundamental Algorithms Vol 1.
-- 
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw

henry@utzoo.UUCP (Henry Spencer) (09/09/87)

> Wayne, I find the syntax of both declaration and use very hard to read.
> perhaps if I used only C I'd get used to it, but I don't.

Doesn't necessarily help.  I've been using C pretty much exclusively for
the last 12 years, and I still find C declarations rather trying when they
get complicated.
-- 
"There's a lot more to do in space   |  Henry Spencer @ U of Toronto Zoology
than sending people to Mars." --Bova | {allegra,ihnp4,decvax,utai}!utzoo!henry

dsill@NSWC-OAS.arpa (Dave Sill) (09/15/87)

>Further, if symbols are arcane and hard to learn, in all seriousness
>now, wouldn't it be better to say
>
>  A.    a SUBSCRIPTED BY i INDIRECT SELECT f INDIRECT INVOKE WITH x
>
>rather than
>
>  B.    a[i]^.f^(x)

Surely you jest.  First, A requires 59 characters to say what B says
in 11.  Second, I don't think A is any easier to read than B (really
I think it's harder to read).  Third, it's not giving any more
information or presenting it any less ambiguously.  So what's the
gain?

Maybe what's needed is a preprocessor along the lines of cdecl that
would produce such a human-readable listing from source code.  Like
cdecl, it should also be able to do the inverse: transform verbose
English-like code to C or Modula-2 or whatever.  This would allow
those preferring the verbose interface to use it without affecting the
rest of us or unduly burdening our compilers and storage systems.
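
To make the idea concrete, here is a minimal sketch of the forward direction only, assuming input in the postfix notation of form B. It is not cdecl and not an existing tool; the function names verbose_expr and emit are made up, and it handles nothing beyond an identifier followed by a flat chain of [..], ^, .name, and (..).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Appends len bytes of text to out if they fit; returns 0 on overflow. */
static int emit(char *out, size_t cap, const char *text, size_t len)
{
    size_t used = strlen(out);

    if (used + len + 1 > cap)
        return 0;
    memcpy(out + used, text, len);
    out[used + len] = '\0';
    return 1;
}

/* Translates a postfix chain such as "a[i]^.f^(x)" into words.
   Returns 1 on success, 0 on a syntax error or if out is too small.
   Brackets and parentheses are not nested in this sketch. */
int verbose_expr(const char *src, char *out, size_t cap)
{
    const char *p = src;
    const char *start = p;

    assert(src != NULL && out != NULL);
    if (cap == 0)
        return 0;
    out[0] = '\0';

    /* Leading identifier. */
    while (isalnum((unsigned char)*p) || *p == '_')
        p++;
    if (p == start || !emit(out, cap, start, (size_t)(p - start)))
        return 0;

    /* Chain of postfix operators. */
    while (*p != '\0') {
        char op = *p++;
        char closer = '\0';
        const char *word;

        switch (op) {
        case '^': word = " INDIRECT";                        break;
        case '.': word = " SELECT ";                         break;
        case '[': word = " SUBSCRIPTED BY "; closer = ']';   break;
        case '(': word = " INVOKE WITH ";    closer = ')';   break;
        default:  return 0;
        }
        if (!emit(out, cap, word, strlen(word)))
            return 0;

        start = p;
        if (closer != '\0') {
            /* Operand is everything up to the closing bracket. */
            while (*p != '\0' && *p != closer)
                p++;
            if (*p != closer || !emit(out, cap, start, (size_t)(p - start)))
                return 0;
            p++;
        } else if (op == '.') {
            /* Operand is a field name. */
            while (isalnum((unsigned char)*p) || *p == '_')
                p++;
            if (p == start || !emit(out, cap, start, (size_t)(p - start)))
                return 0;
        }
    }
    return 1;
}
```

Given form B as input and a large enough buffer, verbose_expr fills the buffer with form A. The inverse direction, and any real C expression syntax with prefix operators and precedence, would need an actual parser and is left out of this sketch.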


-Dave