[net.lang.c] Cryptic C

jcz@ncsu.UUCP (Carl Zeigler) (08/13/85)

  Referring to:
> 
> Subject: Re:  Cryptic C code?
> Message-ID: <605@brl-tgr.ARPA>
> 
> I think case 2 is certainly more readable, but as the book says, you
> need to learn to read things like case 3 since a lot of code is like
> that.  More usually one will see something like
> 	char *s;
> 	...
> 	while ( *s++ )
> 		...
> I personally prefer to distinguish between Boolean
> expressions (such as comparisons) and arithmetic expressions, using
> strictly Boolean expressions as conditions.  Thus:
> 	while ( *s++ != '\0' )
> 
> Tests for NULL pointers and flags often are written
> 	if ( p )
> 		...
> 	if ( flag & BIT )
> 		...
> rather than
> 	if ( p != NULL )
> 		...
> 	if ( (flag & BIT) != 0 )
> 		...
> (I prefer the latter.)  Get used to it..


      While I certainly agree that code should be written
so that it is readable, I do not see how the relational operators
in the above examples really improve the readability of the code.

One of the definitions of C is that control statements work on the
zero - nonzero values of their controlling expressions.  Adding an extra
 != 0  has no effect other than to make the file larger.  (This is
hardly boolean!)

Readable code is code where the author's intent is clear, where
the data structures and the types of operations that can be
performed on the data structures make sense in the context of the
problem being solved.   This is a much larger issue than whether

if ( exp == 0 ) foo;

adds anything to the readability of the code. 

Perhaps we are discussing two separate issues:

One: how easy is it to see the code:  A cursory
    reader may miss the subtlety of 'while( *s++ )'
    whereas 'while( *s++ != 0 )' is a redundancy that
    makes missing the point more difficult.

Two: how easy it is to understand the code:
    Questions like how easy is it to tell just exactly why
    while( *s++ ) is needed here in the first place?  and  What
    is the goal?  Can I agree with the author's need to post-increment
    his way through s?    Why not a pre-increment?

This is the kind of issue I believe K&R were focusing attention on in
their book and why they advocated learning certain 'idioms'.   Just as in
some more complex languages, idioms can carry more semantic content than
is reflected in a first glance at the expression of the idiom itself.
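
The post- versus pre-increment question can be made concrete with a
minimal sketch.  The helper names len_post and len_pre are hypothetical
(neither appears in the thread); both count the characters in a string,
and the pre-increment version needs an extra test because it never
examines s[0] inside the loop.

```c
#include <stddef.h>

/* Post-increment: each pass tests the current character, then advances. */
size_t len_post(const char *s)
{
    const char *p = s;

    while (*p++ != '\0')
        ;
    /* p has been advanced one past the terminator, hence the -1. */
    return (size_t)(p - s) - 1;
}

/* Pre-increment: each pass advances first, then tests.  s[0] is never
   tested by the loop, so the empty string must be handled separately
   or the loop would read past the terminator. */
size_t len_pre(const char *s)
{
    const char *p = s;

    if (*p == '\0')
        return 0;
    while (*++p != '\0')
        ;
    return (size_t)(p - s);
}
```

Both functions agree on every string; the difference is only in where
the special case ends up, which is the kind of thing the idiom hides.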

John Carl Zeigler
SAS Institute Inc.
919 467 5322
mcnc!ncsu!jcz

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (08/15/85)

> > I personally prefer to distinguish between Boolean
> > expressions (such as comparisons) and arithmetic expressions, using
> > strictly Boolean expressions as conditions.  Thus:
> > 	while ( *s++ != '\0' )

>       While I certainly agree that code should be written
> so that it is readable, I do not see how the relational operators
> in the above examples really improve the readability of the code.

Perhaps I can make this point clearer.  Nearly all Algol-like
languages have a separate Boolean data type, and their conditional
expressions are required to be of Boolean type.  In C, arithmetic
expressions are made to do double duty and are interpreted in
Boolean contexts as "true" if and only if the value is nonzero.
While this kludge obviously works, I think it adds one extra level
of mental processing when one is reading conditionals in C.  This
is because there is no visible percept corresponding to one's
thoughts about comparison against zero; one has to explicitly apply
a conceptual language evaluation rule to map the expression into a
condition in order to extract the full meaning of the condition.

Many Boolean expressions are not best thought of as comparison of
an arithmetic quantity against zero.  For example:
	while ( ! Done() )
		Perform_Action();
I have found that introducing Booleans as an explicit data type
into my C code has helped me produce better-organized code having
clearer meaning (once you get used to the idea).  Although these
days I am pretty conservative when it comes to defining one's own
language extensions, this one seems like a winner:

	typedef int	bool;
	#define	false	0
	#define	true	1

And of course I can still read idiomatic C code produced by others,
but I do notice the extra effort required.
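
A minimal sketch of how the typedef reads in use.  The names are
capitalized here (an assumption, not the spelling in the post) so they
cannot collide with the bool/true/false that later C standards define,
and the countdown behind Done() is purely hypothetical.

```c
typedef int Bool;
#define False 0
#define True  1

/* Hypothetical state: how many actions are still to be performed. */
static int remaining;

Bool Done(void)
{
    return remaining <= 0;   /* a comparison already yields 0 or 1 */
}

void Perform_Action(void)
{
    --remaining;
}

/* Runs the loop from the post; returns how many actions were performed. */
int run(int n)
{
    int count = 0;

    remaining = n;
    while ( ! Done() ) {
        Perform_Action();
        ++count;
    }
    return count;
}
```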

> Readable code is code where the author's intent is clear, where
> the data structures and the types of operations that can be
> performed on the data structures make sense in the context of the
> problem being solved.

This is an excellent point.  Perhaps I shouldn't be taking that
for granted.

peter@baylor.UUCP (Peter da Silva) (08/18/85)

> While this kludge obviously works, I think it adds one extra level
> of mental processing when one is reading conditionals in C.  This
> is because there is no visible percept corresponding to one's
> thoughts about comparison against zero; one has to explicitly apply
> a conceptual language evaluation rule to map the expression into a
> condition in order to extract the full meaning of the condition.

Boy, I'd like to let a real psychologist go over that statement. I don't
notice any effort understanding if(some expression returning a small int)
in terms of booleans. Maybe because I never bothered with the "!= NULL"
construct...

> 	typedef int	bool;

#define bool char /* save a bit of memory eh? */

typedef int bool:1;	/* pity this won't work */
-- 
	Peter da Silva (the mad Australian werewolf)
		UUCP: ...!shell!neuro1!{hyd-ptd,baylor,datafac}!peter
		MCI: PDASILVA; CIS: 70216,1076

dsk@mtgzz.UUCP (d.s.klett) (08/19/85)

Instead of using #defines for the boolean values, I
would rather see enumerated data types used.  In general,
C programmers seem to prefer #defines to defining a data
type that can be checked during compilation.

	typedef enum { False , True } Boolean;

Don Klett

henry@utzoo.UUCP (Henry Spencer) (08/20/85)

> 	typedef int	bool;
> 	#define	false	0
> 	#define	true	1

It's interesting to note that Kernighan&Plauger use "yes" and "no" rather
than "true" and "false", and my own reaction is that the code often reads
better that way.  Now that's something to *really* start a raging debate
about... :-)
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

landauer@drivax.UUCP (Doug Landauer) (08/20/85)

Doug Gwyn says:
> I am pretty conservative when it comes to defining one's own
> language extensions, this one seems like a winner:
> 
> 	typedef int	bool;
> 	#define	false	0
> 	#define	true	1

My favorite way to do this one is
	typedef enum { false, true } boolean ;

It provides a little more type checking than your typedef.
--
			-- Doug Landauer --
	...[ ihnp4 | mot | ucscc | amdahl ] !drivax!landauer
		-- "I survived the DRI layoffs." --
			-- "(So far!)" --

pcf@drux3.UUCP (FryPC) (08/20/85)

> Instead of using #defines for the boolean values, I
> would rather see enumerated data types used.  In general,
> C programmers seem to prefer #defines to defining a data
> type that can be checked during compilation.

>	 typedef enum { False , True } Boolean;

I have seen programs written with this style of declaration; they also
had lines such as:

	bool = (x<y) ? True : False;

	if( bool == True ) ...

		  which the enumerated type requires. I prefer defines
and
	bool = x<y; 	if( bool ) ...
	       to keep my programs both readable and correct.

(The enumerated type will also take more space (an int).)
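
There is also a correctness angle to comparing against a "true"
constant, sketched below.  BIT and the three function names are
hypothetical.  A value that is merely nonzero counts as true in a
condition but need not be equal to 1.

```c
#define TRUE  1
#define FALSE 0
#define BIT   0x04   /* hypothetical flag bit */

/* Compares the raw masked value against TRUE: fails when the bit is set,
   because flag & BIT is 0 or 4, never 1. */
int test_by_equality(int flag)
{
    int b = flag & BIT;

    return b == TRUE;
}

/* Uses the value directly as a condition: any nonzero value is true. */
int test_by_truth(int flag)
{
    int b = flag & BIT;

    if (b)
        return 1;
    return 0;
}

/* Normalizes with != 0 first, so the stored value is exactly 0 or 1. */
int test_normalized(int flag)
{
    int b = (flag & BIT) != 0;

    return b == TRUE;
}
```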

Peter Fry
drux3!pcf

robert@gitpyr.UUCP (Robert Viduya) (08/21/85)

In article <1056@mtgzz.UUCP>, dsk@mtgzz.UUCP (d.s.klett) writes:
> 
> Instead of using #defines for the boolean values, I
> would rather see enumerated data types used.  In general,
> C programmers seem to prefer #defines to defining a data
> type that can be checked during compilation.
> 
> 	typedef enum { False , True } Boolean;
> 
> Don Klett

The problem with enums is that compilers allocate them as ints.  This
means 1 wasted byte on a machine with a 16-bit int, 3 wasted bytes on
a machine with a 32-bit int and so on and so forth.  All you really
need is 1 byte (on most conventional machines).  I personally prefer:

    #define	TRUE	1
    #define	FALSE	0
    typedef	char	bool;
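
A rough way to check the space argument on a given compiler.  The type
and function names are hypothetical.  Only sizeof(char) == 1 is
guaranteed by the language; the size of an enum is up to the
implementation, so the enum figure has to be measured rather than
assumed.

```c
#include <stddef.h>

typedef char char_bool;                 /* one byte by definition */
typedef int  int_bool;                  /* as wide as an int */
typedef enum { No, Yes } enum_bool;     /* width chosen by the compiler */

size_t size_of_char_bool(void) { return sizeof(char_bool); }
size_t size_of_int_bool(void)  { return sizeof(int_bool); }
size_t size_of_enum_bool(void) { return sizeof(enum_bool); }

/* Bytes saved by storing n flags as char_bool instead of int_bool. */
size_t bytes_saved(size_t n)
{
    return n * (sizeof(int_bool) - sizeof(char_bool));
}
```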


				robert
-- 
Robert Viduya							01111000
Georgia Institute of Technology

UUCP:   {akgua,allegra,amd,hplabs,ihnp4,masscomp,ut-ngp}!gatech!gitpyr!robert
        {rlgvax,sb1,uf-cgrl,unmvax,ut-sally}!gatech!gitpyr!robert
BITNET:	CCOPRRV @ GITVM1

david@ukma.UUCP (David Herron, NPR Lover) (08/22/85)

In article <675@gitpyr.UUCP> robert@gitpyr.UUCP (Robert Viduya) writes:
>In article <1056@mtgzz.UUCP>, dsk@mtgzz.UUCP (d.s.klett) writes:
...
>> 	typedef enum { False , True } Boolean;
...
>
>The problem with enums is that compilers allocate them as ints.  This
>means 1 wasted byte on a machine with a 16-bit int, 3 wasted bytes on
>a machine with a 32-bit int and so on and so forth.  All you really
>need is 1 byte (on most conventional machines).  I personally prefer:
>
>    #define	TRUE	1
>    #define	FALSE	0
>    typedef	char	bool;

Well, I personally prefer:

	#define TRUE (1==1)
	#define FALSE (1==0)
	typedef char bool;

Which is succinct, to the point, and *machine*independent*!

'sides, constant expressions are calculated at compile time anyway.
-- 
--- David Herron
--- ARPA-> ukma!david@ANL-MCS.ARPA
--- UUCP-> {ucbvax,unmvax,boulder,oddjob}!anlams!ukma!david
---        {ihnp4,decvax,ucbvax}!cbosgd!ukma!david

Hackin's in me blood.  My mother was known as Miss Hacker before she married!

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (08/23/85)

The reason I didn't use char or enum for boolean data definition
is that that is not how C actually works.  Int is a more accurate
representation of the current state of affairs.

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (08/23/85)

> It's interesting to note that Kernighan&Plauger use "yes" and "no" rather
> than "true" and "false", ...

Yes, but the conventional use in symbolic logic is true/false.
One is not always asking a natural yes/no question of a predicate.

arnold@gatech.CSNET (Arnold Robbins) (08/23/85)

In article <5884@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) writes:
> > 	typedef int	bool;
> > 	#define	false	0
> > 	#define	true	1
> 
> It's interesting to note that Kernighan&Plauger use "yes" and "no" rather
> than "true" and "false", and my own reaction is that the code often reads
> better that way.  Now that's something to *really* start a raging debate
> about... :-)
> -- 
> 				Henry Spencer @ U of Toronto Zoology

Well, this can get carried too far.  I have worked with code based on
Software Tools stuff that looks like

	dowrite (file, YES, NO, NO, YES);

Now, can you tell what the heck it is doing? Especially when the code for
dowrite() is 700 lines down in another file? I've often thought that a style
like

#define FORCEWRITE	1
#define NOFORCE		0

#define APPEND		1
#define NOAPPEND	0

	dowrite (file, FORCEWRITE, APPEND, ....);	/* call */

int dowrite (file, force, append,...)	/* actual procedure */
...
{
	if (force)
		....
	if (append)
		....
}

is much clearer than the first style.  This is the kind of thing, if anything,
that "enums" would be most useful for (no flames about how poorly enums are
implemented. I'm talking conceptually here.).  Overall, TRUE or FALSE or YES
or NO doesn't make much difference to me.  However, I much prefer the following

	if (boolean)
		something
and

	boolean = (x && y || c >= d);

to the overly verbose

	if (boolean == TRUE)
		something

	if (x && y || c >= d)
		boolean = TRUE;
	else
		boolean = FALSE;

Now *that* should start a really raging debate! :-)
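
For what it is worth, the two styles compute the same value.  A small
sketch, with hypothetical function names and with parentheses added
around x && y only to keep compilers from warning about precedence:

```c
/* Direct assignment of the condition. */
int assign_direct(int x, int y, int c, int d)
{
    int boolean = ((x && y) || c >= d);

    return boolean;
}

/* The verbose if/else form of the same assignment. */
int assign_verbose(int x, int y, int c, int d)
{
    int boolean;

    if ((x && y) || c >= d)
        boolean = 1;    /* TRUE */
    else
        boolean = 0;    /* FALSE */
    return boolean;
}
```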
-- 
Arnold Robbins
CSNET:	arnold@gatech	ARPA:	arnold%gatech.csnet@csnet-relay.arpa
UUCP:	{ akgua, allegra, hplabs, ihnp4, seismo, ut-sally }!gatech!arnold

Hello. You have reached the Coalition to Eliminate Answering Machines.
Unfortunately, no one can come to the phone right now....

robert@gitpyr.UUCP (Robert Viduya) (08/24/85)

In article <2076@ukma.UUCP>, david@ukma.UUCP (David Herron, NPR Lover) writes:
> In article <675@gitpyr.UUCP> robert@gitpyr.UUCP (Robert Viduya) writes:
> >
> > ... I personally prefer:
> >
> >    #define	TRUE	1
> >    #define	FALSE	0
> >    typedef	char	bool;
> 
> Well, I personally prefer:
> 
> 	#define TRUE (1==1)
> 	#define FALSE (1==0)
> 	typedef char bool;
> 
> Which is succinct, to the point, and *machine*independent*!
> 

Oh?  On what machine is (1==1) equal to 0, or (1==0) not equal to 0?  In
section 7.6 (Relational operators, Appendix A - C Reference Manual from
K&R's The C Programming Language), it explicitly states that the logical
operators all yield 0 if the relation is false and 1 if the relation is
true.  Nothing is mentioned about possible variations due to implementation
machine differences.

				robert
-- 
Robert Viduya							01111000
Georgia Institute of Technology

UUCP:   {akgua,allegra,amd,hplabs,ihnp4,masscomp,ut-ngp}!gatech!gitpyr!robert
        {rlgvax,sb1,uf-cgrl,unmvax,ut-sally}!gatech!gitpyr!robert
BITNET:	CCOPRRV @ GITVM1

ado@elsie.UUCP (Arthur David Olson) (08/24/85)

> > > . . .I personally prefer:
> > >
> > >    #define	TRUE	1
> > >    #define	FALSE	0
> > > . . .
> > . . .I personally prefer:
> > 
> >	#define TRUE (1==1)
> >	#define FALSE (1==0)
> > . . .
> Oh?  On what machine is (1==1) equal to 0, or (1==0) not equal to 0? . . .

Yes, the good book (K&R) says that (1==1) is always 1.
The advantage of the second approach above is that it obviates the need to
remember this fact.  The disadvantage of the second approach above is that it
gives "lint" fits ("constant in conditional context").

As for what *I* prefer:

	#ifndef TRUE
	#define	TRUE	(1)
	#define FALSE	(0)
	#endif

where the parenthesized definitions match those in "curses.h" to ensure that
if a reference to "curses.h" appears after the above lines I won't get a
"macro redefined to a different value" diagnostic from the C preprocessor.

				--ado

david@ukma.UUCP (David Herron, NPR Lover) (08/25/85)

In article <685@gitpyr.UUCP> robert@gitpyr.UUCP (Robert Viduya) writes:
>In article <2076@ukma.UUCP>, david@ukma.UUCP (David Herron, NPR Lover) writes:
>> Well, I personally prefer:
>> 
>> 	#define TRUE (1==1)
>> 	#define FALSE (1==0)
>> 	typedef char bool;
>> 
>> Which is succinct, to the point, and *machine*independent*!
>> 
>
>Oh?  On what machine is (1==1) equal to 0, or (1==0) not equal to 0?  In
>section 7.6 (Relational operators, Appendix A - C Reference Manual from
>K&R's The C Programming Language), it explicitly states that the logical
>operators all yield 0 if the relation is false and 1 if the relation is
>true.  Nothing is mentioned about possible variations due to implementation
>machine differences.

I don't particularly care if it's defined to be machine independent already.

I have always thought that to be a machine dependency (the value of true and
false).  Maybe I'm wrong.  But, different machines DO have different ideas
of which is true and false (at the assembler level).  And it is simply
a convention.

Still though, #define TRUE (1==1) is very obvious, to the point, correct,
proper, and all sorts of things.  And it doesn't require one to know
that detail about C that the convention is ~0 == TRUE and 0 == FALSE.

When I'm writing C I think of it as a *high*level*language*.  Not as 
simply one step away from assembler.  The difference between C and other 
*high*level*languages* is that it gives you precise control over an 
idealized machine.  This is what confuses most people about C, because
it can *feel* like assembler.  Especially when they've learned it on a 
PDP-11 and think like an assembly programmer.

But C gives you all these operators which allow you to define things
machine independently rather than hardcoding values.  Obviously I mean
casts and the sizeof operator.  Also arithmetic on pointers.  So why not
TRUE and FALSE?
-- 
--- David Herron
--- ARPA-> ukma!david@ANL-MCS.ARPA
--- UUCP-> {ucbvax,unmvax,boulder,oddjob}!anlams!ukma!david
---        {ihnp4,decvax,ucbvax}!cbosgd!ukma!david

Hackin's in me blood.  My mother was known as Miss Hacker before she married!

rlk@chinet.UUCP (Richard L. Klappal) (08/25/85)

In article <685@gitpyr.UUCP> robert@gitpyr.UUCP (Robert Viduya) writes:
>In article <2076@ukma.UUCP>, david@ukma.UUCP (David Herron, NPR Lover) writes:
>> In article <675@gitpyr.UUCP> robert@gitpyr.UUCP (Robert Viduya) writes:
>> >
>> > ... I personally prefer:
>> >
>> >    #define	TRUE	1
>> >    #define	FALSE	0
>> >    typedef	char	bool;
>> 
>> Well, I personally prefer:
>> 
>> 	#define TRUE (1==1)
>> 	#define FALSE (1==0)
>> 	typedef char bool;
>> 
>> Which is succinct, to the point, and *machine*independent*!
>> 
>
>Oh?  On what machine is (1==1) equal to 0, or (1==0) not equal to 0?  In
>section 7.6 (Relational operators, Appendix A - C Reference Manual from
>K&R's The C Programming Language), it explicitly states that the logical
>operators all yield 0 if the relation is false and 1 if the relation is
>true.  Nothing is mentioned about possible variations due to implementation
>machine differences.
>
>				robert
>-- 
>Robert Viduya							01111000

Maybe not 'machine independent' in C, but the logic will
be 'language independent' (using the appropriate equality operator).

I've had to debug an awful lot of assembly language where
a cpu 'Z' flag (zero-flag) was used as TRUE.  It is much like the way
strcmp returns 0 if two strings are the same, so that you cannot carry
over the concept 

	if (string1==string2) {...}

that works in F77, PL/I, BASIC, etc.



Richard Klappal

UUCP:		..!ihnp4!chinet!uklpl!rlk  | "Money is truthful.  If a man
MCIMail:	rklappal		   | speaks of his honor, make him
Compuserve:	74106,1021		   | pay cash."
USPS:		1 S 299 Danby Street	   | 
		Villa Park IL 60181	   |	Lazarus Long 
TEL:		(312) 620-4988		   |	    (aka R. Heinlein)
-------------------------------------------------------------------------

ark@alice.UucP (Andrew Koenig) (08/25/85)

> I have always thought that to be a machine dependancy (the value of true and
> false).  Maybe I'm wrong.  But, different machines DO have different ideas
> of which is true and false (at the assembler level).  And it is simply
> a convention.

You are wrong: the value of true and false in C is defined as part of
the language:

When I write    if(exp) foo(); else bar();   foo is called if exp is
nonzero and bar is called if exp is zero.

The result of relational operators, &&, ||, and ! is always 1 or 0
(not some random machine-dependent value or zero).
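
A short sketch of that rule.  The wrapper names are hypothetical; the
point is only that each operator hands back exactly 1 or 0 whatever
nonzero values go in.

```c
int eq(int a, int b)   { return a == b; }   /* relational: 1 or 0 */
int lnot(int x)        { return !x; }       /* 1 if x is zero, else 0 */
int land(int a, int b) { return a && b; }   /* 1 if both nonzero */
int lor(int a, int b)  { return a || b; }   /* 1 if either nonzero */

/* Double negation maps any nonzero value to 1 and zero to 0. */
int normalize(int x)   { return !!x; }
```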

throopw@rtp47.UUCP (Wayne Throop) (08/26/85)

In message 989@gatech, Arnold Robbins gives an interesting example:

> #define FORCEWRITE	1
> #define NOFORCE		0
>
> #define APPEND		1
> #define NOAPPEND	0
>
> 	dowrite (file, FORCEWRITE, APPEND, ....);	/* call */
>
> int dowrite (file, force, append,...)	/* actual procedure */

I like this notion in general, but I point out a problem with it that a
(currently nonexistent) lint-like tool could help with.  What if the
call was done like so:

    dowrite( file, NOAPPEND, FORCEWRITE, ... );

This doesn't do remotely what you intend, and it is very hard to detect.
What you "really want" is to be able to declare two enumerations, and
make lint check that you don't pass members of one enumeration to a
formal of another type.

Also, the "extra checking" that K&R says that lint-like tools are free
to do with typedefs would be welcome.  No current lint (that I am aware
of) will allow using typedef as a type abstraction device.
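
A sketch of the two-enumeration idea.  The type names, the constant
names, and the stand-in function dowrite_typed are all hypothetical.
Note the caveat: C converts freely between enumeration types and int,
so whether a swapped call draws a diagnostic depends on the compiler or
lint in use, which is exactly the gap described above.

```c
typedef enum { FORCE_OFF,  FORCE_ON  } force_mode;
typedef enum { APPEND_OFF, APPEND_ON } append_mode;

/* Hypothetical stand-in for dowrite(): encodes its arguments as
   2*force + append so the effect of each one can be inspected. */
int dowrite_typed(force_mode force, append_mode append)
{
    return (force == FORCE_ON ? 2 : 0) + (append == APPEND_ON ? 1 : 0);
}

/* The mistaken call would be written
       dowrite_typed(APPEND_OFF, FORCE_ON);
   and is still legal C; a tool that tracks enum types is needed to
   flag it. */
```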
-- 
Wayne Throop at Data General, RTP, NC
<the-known-world>!mcnc!rti-sel!rtp47!throopw

franka@mmintl.UUCP (Frank Adams) (08/27/85)

In article <989@gatech.CSNET> arnold@gatech.CSNET (Arnold Robbins) writes:
>
>Well, this can get carried too far.  I have worked with code based on
>Software Tools stuff that looks like
>
>	dowrite (file, YES, NO, NO, YES);
>
>Now, can you tell what the heck it is doing? Especially when the code for
>dowrite() is 700 lines down in another file? I've often thought that a style
>like
>
>#define FORCEWRITE	1
>#define NOFORCE		0
>
>#define APPEND		1
>#define NOAPPEND	0
>
>	dowrite (file, FORCEWRITE, APPEND, ....);	/* call */
>
>
>is much clearer than the first style.  This is the kind of thing, if anything,
>that "enums" would be most useful for (no flames about how poorly enums are
>implemented. I'm talking conceptually here.).

The problem with your solution is I can just as easily write

	dowrite(file, APPEND, NOFORCE, ...)

and will have a terrible time finding the error.  This is where Ada wins:

	dowrite(file, append=>true, force=>false, ...)

is clear, simple, and error-resistant.
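
C has no keyword arguments, but something close can be approximated.
The sketch below assumes a C99 or later compiler (designated
initializers and compound literals) and uses a hypothetical struct
write_opts and stand-in function dowrite_opts; it is not how any
dowrite() in the thread was declared.

```c
struct write_opts {
    int append;
    int force;
};

/* Hypothetical stand-in for dowrite(): encodes its options as
   2*force + append so the result can be inspected. */
int dowrite_opts(struct write_opts o)
{
    return (o.force ? 2 : 0) + (o.append ? 1 : 0);
}

/* Each option is named at the call site, so the order in which they
   are written no longer matters. */
int example_call(void)
{
    return dowrite_opts((struct write_opts){ .append = 1, .force = 0 });
}
```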

peter@baylor.UUCP (Peter da Silva) (08/29/85)

> #define FORCEWRITE	1
> #define NOFORCE		0
> 
> #define APPEND		1
> #define NOAPPEND	0
> 
> 	dowrite (file, FORCEWRITE, APPEND, ....);	/* call */

How about

#define FORCEWRITE 1
#define APPEND 2
#define OTHERFLAG 4
...

	dowrite(file, APPEND|OTHERFLAG);
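
A minimal sketch of the receiving side, assuming a hypothetical
dowrite_flags() that takes only the flag word (the file argument is
left out so the fragment stands alone).  Each option is tested with a
mask, and the argument order problem disappears because there is only
one argument.

```c
#define FORCEWRITE 1
#define APPEND     2
#define OTHERFLAG  4

/* Hypothetical stand-in for dowrite(): returns how many options are set. */
int dowrite_flags(int flags)
{
    int actions = 0;

    if (flags & FORCEWRITE)
        actions++;
    if (flags & APPEND)
        actions++;
    if (flags & OTHERFLAG)
        actions++;
    return actions;
}

/* Returns 1 if the given flag bit is set in flags, else 0. */
int has_flag(int flags, int flag)
{
    return (flags & flag) != 0;
}
```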
-- 
	Peter (Made in Australia) da Silva
		UUCP: ...!shell!neuro1!{hyd-ptd,baylor,datafac}!peter
		MCI: PDASILVA; CIS: 70216,1076

rcd@opus.UUCP (Dick Dunn) (08/30/85)

> Instead of using #defines for the boolean values, I
> would rather see enumerated data types used.  In general,
> C programmers seem to prefer #defines to defining a data
> type that can be checked during compilation.
> 
> 	typedef enum { False , True } Boolean;

Whether this works depends on your compiler's view of enums.  If it treats
enums as a slight variant on integers (which to my tastes is pretty
sloppy), you're OK.  However, if it uses the very restricted view which
doesn't allow arithmetic on enums, the above definition will prevent the
"usual" logical operators !, &, ^, and | from working with objects of type
Boolean.
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...Relax...don't worry...have a homebrew.

stew@harvard.ARPA (Stew Rubenstein) (08/31/85)

In article <625@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
>
>In article <989@gatech.CSNET> arnold@gatech.CSNET (Arnold Robbins) writes:
>
>  >Well, this can get carried too far.  I have worked with code based on
>  >Software Tools stuff that looks like
>  >
>  >	dowrite (file, YES, NO, NO, YES);
>...
>  >	dowrite (file, FORCEWRITE, APPEND, ....);	/* call */
>  >
>  >
>  >is much clearer than the first style.  This is the kind of thing, if
>  >anything, that "enums" would be most useful for (no flames about how poorly
>  >enums are implemented. I'm talking conceptually here.).
>
>The problem with your solution is I can just as easily write
>
>	dowrite(file, APPEND, NOFORCE, ...)
>
>and will have a terrible time finding the error.  This is where Ada wins:
>
>	dowrite(file, append=>true, force=>false, ...)

In a perfect world, the argument types for dowrite() would be declared
as different enum types and the compiler will complain if you mix them
up (i.e. in new ANSI C if I understand it right).  In a slightly less
perfect world, lint would complain about the inconsistency.  We were
talking conceptually, right?  Not to say that the keyword style isn't
better, but this is one the compiler ought to catch.

Stew