[comp.lang.c] C style

ok@quintus.UUCP (Richard A. O'Keefe) (01/13/88)

The book
	Professional Software, Volume II:
	    Programming Practice
	by Henry Ledgard with John Tauer
	Addison-Wesley, 1987
	ISBN 0-201-12232-4 (v.2)
	approx US$20

is about matters of programming style.  Most of what is said is
language-independent;  what is language-specific has Pascal,
Modula-2, and C versions.  I don't like everything he says,
and the documentation of data structures is not discussed, but
it's well worth reading.

Mind you, while the Example program at the end is about as
clear and beautiful as Pascal/Ada gets, it convinced me that I
should write programs like that (a text formatter) in Lisp...

ian@mucs.UX.CS.MAN.AC.UK (Ian Cottam) (02/18/88)

I would like to update the style list (below) that I give to students.
Any comments welcome.  Flames to /dev/null pls.
Email is best -- I will summarise.
ian@ux.cs.man.ac.uk
___________________

		Good C Style
		============

Note:  Advice that I, and a few others, have given but that has been
       universally ignored is indicated by a + rather than a * in the
       list below!


* choose a personal or project standard for layout and stick to it

+ show the difference between = and == by using asymmetric layout
  for = (e.g. ptr= NULL;)

* use  expr == var  to emphasise and catch some = for == mistakes
  (e.g. x*2 == y)

+ avoid EXCESSIVE use of multiple exits from a context

* do not rely on defaults, e.g. extern/static variables being set to 0

* use void when you are writing a non-value returning function
  (not the default int)

* cast the return values of functions to void if you are certain that
  their result is not needed (have a list of exceptions to this rule
  e.g. printf, free)

* use lint as a matter of course (not just after 10 hours debugging)

* write portable C (Kernel level code, e.g. drivers, excepting.  Put
  all such code in a separate file.)

* use standard I/O (i.e. don't use UNIX specific I/O without good reason)

* don't use a macro if a function call is acceptably efficient

* avoid complex macros

* don't use complex conditional expressions especially ones returning
  non-arithmetic values

+ don't use assignment inside a complex expression
  (e.g. use (chptr= malloc(N), chptr != NULL)
        rather than
            ((chptr = malloc(N)) != NULL))

+ avoid #ifdefs for version control purposes (use a proper version
  control system)

* use typedef to build up (complex) declarations piecemeal

* use enumerations and #define for constant names
  (enum is preferable to #define, especially with ANSI C)

* always check the error status of a system call

* use header files, e.g. <stdio.h>, rather than assume, e.g., -1 == EOF

* try to avoid assuming the world uses ASCII (difficult one this!)

* use casts rather than rely on representational coincidences

* only have one copy of global declarations in a .h file

* do not embed absolute pathnames in program code (at least #define them)

+ use side-effect operations in statement context only
  (exception: the comma operator)

* use static for variables and functions that are local to a file

* use the ``UNIX standard'' for command line argument syntax

* split programs over several files for ``information hiding''

* never fall through the case arm of a switch statement

+ use local blocks to indicate the extent of variables and comments, e.g.
	{ /* this does foo */
		int foovar;
		/* stuff for foo */
	}
	{ /* this does bar */
		...etc...
	}

* don't use the preprocessor to make C look like anything but C
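
A minimal sketch of how a handful of the items above might look together,
assuming a compiler that accepts void and enum.  The names make_buffer,
style_demo and BUF_SIZE are made up for illustration and come from nowhere
in the list itself.

```c
#include <stdio.h>
#include <stdlib.h>

/* enum rather than #define for a constant name (hypothetical name) */
enum { BUF_SIZE = 64 };

/* static: this helper is local to the file */
static char *make_buffer(void)
{
    char *chptr;

    /* assignment kept apart from the test by the comma operator;
       constant on the left so a mistyped "=" would not compile */
    if (chptr = malloc(BUF_SIZE), NULL == chptr)
        return NULL;
    chptr[0] = '\0';
    return chptr;
}

/* returns 1 on success, 0 if the allocation failed */
int style_demo(void)
{
    char *buf = make_buffer();

    /* always check the error status */
    if (NULL == buf) {
        fprintf(stderr, "out of memory\n");
        return 0;
    }
    printf("allocated %d bytes\n", (int) BUF_SIZE);
    free(buf);
    return 1;
}
```

Calling style_demo() from main() is enough to exercise it.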

pajari@grads.cs.ubc.ca (George Pajari) (04/12/88)

In article <130@obie.UUCP> wes@obie.UUCP (Barnacle Wes) writes:
>... `=' is NOT always invalid in an if clause:
>
>    if (status = *statreg) {

Article <5981@utcsri.UUCP> by flaps@utcsri.UUCP (Alan J Rosenthal)
> ... better written as "if ((status = *statreg))"
> some people even recommend "if ((status = *statreg), status)"

What is wrong with:

	if ((status = *statreg) != NULL)...

Most compilers I've tried generate the same code as for the original and
this version (I claim) is the most readable (i.e. scanned quickly by a human
with the fewest resulting semantic errors) of the three.

George Pajari

tneff@atpal.UUCP (Tom Neff) (04/13/88)

In article <1982@ubc-cs.UUCP> pajari@grads.cs.ubc.ca (George Pajari) writes:
>In article <130@obie.UUCP> wes@obie.UUCP (Barnacle Wes) writes:
>>    if (status = *statreg) {
>
>What is wrong with:
>
>	if ((status = *statreg) != NULL)...

For what it's worth, the latter is the preferred formulation as recommended 
by Harbison&Steele (2nd ed.) and Gimpel's PC-LINT, among others.  Certain
grizzled veterans may crave the older, less wordy, more "clever" formulation,
but the explicit comparison is easier to read and less likely to be confused
for something else.  If you get into the habit of never writing if(variable),
you don't tend to have =/== accidents, in my experience.  Also, both MSC
and PC-LINT are nice and flag the former usage above as being possibly a
typo.
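
For comparison, a small sketch of the doubled-parentheses form and the
explicit-comparison form side by side.  The "register" here is just a
variable standing in for *statreg, the comparison is against 0 on the
assumption that the status is an integer, and all names are hypothetical.

```c
/* stand-in for a memory-mapped status register (an assumption) */
static volatile int fake_statreg;
static int status;

void set_fake_statreg(int value)
{
    fake_statreg = value;
}

/* doubled parentheses: assignment, then an implicit test against zero;
   returns the captured status if the branch is taken, -1 otherwise */
int poll_double_parens(void)
{
    if ((status = fake_statreg))
        return status;
    return -1;
}

/* explicit comparison: same behaviour, intent spelled out */
int poll_explicit(void)
{
    if ((status = fake_statreg) != 0)
        return status;
    return -1;
}
```

Both functions take the branch for exactly the same register values;
whether a given compiler emits identical code for them is not something
the sketch can show.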
-- 

Tom Neff

franka@mmintl.UUCP (Frank Adams) (04/19/88)

In article <126@atpal.UUCP> tneff@atpal.UUCP (Tom Neff) writes:
>If you get into the habit of never writing if(variable),
>you don't tend to have =/== accidents, in my experience.

I wouldn't say "never write if(variable)"; rather "only write if(variable)
if variable is Boolean".  In my opinion, things like if(variable==TRUE) are
abominations.
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

jep@oink.UUCP (James E. Prior) (04/21/88)

In article <2823@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
>In article <126@atpal.UUCP> tneff@atpal.UUCP (Tom Neff) writes:
>>If you get into the habit of never writing if(variable),
>>you don't tend to have =/== accidents, in my experience.
>
>I wouldn't say "never write if(variable)"; rather "only write if(variable)
>if variable is Boolean".  In my opinion, things like if(variable==TRUE) are
>abominations.

Amen!, and I'll go one further

   if (var==TRUE)

is not only abominable, it can be dangerous.  var==TRUE tends to presume
that the only valid values of var are FALSE and TRUE.  There are times
when a var can very intentionally have a non-zero (true) value other than
TRUE (1).  The classic kind of case of this is var=isalpha(c).  The
isxxxxx(c) functions are often defined as macros like:

   #define isxxxxxx(c)  (attribute_array[c] & MASK_XXXX)

Each element of the array consists of a bunch of bit attributes.  The
MASK_XXXX selects the attributes of interest, often yielding true but
non-one (non-TRUE) values.
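
A sketch of the effect, using a home-made table rather than any real
<ctype.h> implementation.  The mask values, the table and the names
my_isalpha, naive_test and safe_test are all assumptions chosen for the
example.

```c
#define TRUE  1
#define FALSE 0

/* hypothetical attribute bits */
#define MASK_UPPER 0x01
#define MASK_LOWER 0x02
#define MASK_ALPHA (MASK_UPPER | MASK_LOWER)

static unsigned char attribute_array[256];

/* table-driven test in the style of the macro above */
#define my_isalpha(c) (attribute_array[(unsigned char) (c)] & MASK_ALPHA)

static void mark(const char *chars, unsigned char bits)
{
    for (; *chars != '\0'; chars++)
        attribute_array[(unsigned char) *chars] |= bits;
}

/* fill the table from string literals, so no character codes are assumed */
void init_attributes(void)
{
    mark("ABCDEFGHIJKLMNOPQRSTUVWXYZ", MASK_UPPER);
    mark("abcdefghijklmnopqrstuvwxyz", MASK_LOWER);
}

/* compares against TRUE: wrong whenever the macro yields 2 */
int naive_test(int c)
{
    int var = my_isalpha(c);
    return var == TRUE;
}

/* compares against FALSE: right for any non-zero result */
int safe_test(int c)
{
    int var = my_isalpha(c);
    return var != FALSE;
}
```

After init_attributes(), naive_test('a') reports false even though 'a'
is a letter, because the macro yields MASK_LOWER (2) for it.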

-- 
Jim Prior    {ihnp4|osu-cis}!n8emr!oink!jep    jep@oink.UUCP

Pointers are my friend.

tneff@atpal.UUCP (Tom Neff) (04/21/88)

In article <2823@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
>I wouldn't say "never write if(variable)"; rather "only write if(variable)
>if variable is Boolean".  In my opinion, things like if(variable==TRUE) are
>abominations.

I agree completely, Frank, and thanks for the correction.  I meant to say
"non-boolean variable" (or at least I realize that now :-)). Of course, now
we could get into the etiquette of  if(boolean-assignment-expression)... 

-- 
Tom Neff			UUCP: ...uunet!pwcmrd!skipnyc!atpal!tneff
	"None of your toys	CIS: 76556,2536		MCI: TNEFF
	 will function..."	GEnie: TOMNEFF		BIX: are you kidding?

davidsen@steinmetz.ge.com (William E. Davidsen Jr) (04/22/88)

In article <255@oink.UUCP> jep@oink.UUCP (James E. Prior) writes:

| Amen!, and I'll go one further
| 
|    if (var==TRUE)
| 
| is not only abominable, it can be dangerous.  var==TRUE tends to presume
| that the only valid values of var are FALSE and TRUE.  There are times
| when a var can very intentionally have a non-zero (true) value other than
| TRUE (1).  The classic kind of case of this is var=isalpha(c).  The

  If you *must* use stuff like this, at least you can write:
	if (var != FALSE)
which is more likely to work. There is only one good reason I can
determine to use code like that: some COBOL programmer wrote the style
specs for your organization.
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

karl@haddock.ISC.COM (Karl Heuer) (04/22/88)

In article <255@oink.UUCP> jep@oink.UUCP (James E. Prior) writes:
>In article <2823@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
>>In my opinion, things like if(variable==TRUE) are abominations.
>
>[It] is not only abominable, it can be dangerous.  var==TRUE tends to presume
>that the only valid values of var are FALSE and TRUE.

If this isn't the case, it shouldn't have been declared boolean.  My coding
style is as if `bool' were a true boolean type, and as if assigning anything
other than `YES' or `NO' (yeah, I follow Kernighan, not Wirth) would produce
undefined results.

>There are times when a var can very intentionally have a non-zero (true)
>value other than TRUE (1).  The classic kind of case of this is [isalpha].

Then isalpha() is not a boolean, in my book.  I'm not convinced there's any
good reason for this; it would be trivial to rewrite the function to return a
true boolean.  In most instances this would have zero run-time cost.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

barmar@think.COM (Barry Margolin) (04/23/88)

In article <3583@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>>There are times when a var can very intentionally have a non-zero (true)
>>value other than TRUE (1).  The classic kind of case of this is [isalpha].
>
>Then isalpha() is not a boolean, in my book.  I'm not convinced there's any
>good reason for this; it would be trivial to rewrite the function to return a
>true boolean.  In most instances this would have zero run-time cost.

Well, if you're coding in C, you'll have to live with C's quirks.
Nowhere does the C language specify that any particular non-zero value
should be returned by the isXXX functions to indicate truth.  The
person who wrote isalpha() might have done #define TRUE 2, and you
might have done #define TRUE 1.  Who is to say that you are right and
he is wrong?  The only thing that C specifies is that you must both do
#define FALSE 0.  Remember, C doesn't have real booleans, and #define
TRUE 1 isn't going to change that fact.

I can think of one reason, however, why 1 should be used as a standard
truth value.  Single-bit fields in structures are generally used as
flags, and it would be nice to be able to say flag = isXXX(...) rather
than flag = (isXXX(...) != FALSE).  But this is just a wish; portable
code must currently use the more verbose version.
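
A sketch of why the verbose version matters for a single-bit field,
assuming an unsigned bit-field and a made-up isXXX-style function that
signals truth with the value 4.  The names are hypothetical.

```c
struct flags {
    unsigned int is_set : 1;   /* single-bit flag */
};

#define ATTR_BIT 0x04

/* hypothetical isXXX-like function: non-zero (4) for true, 0 for false */
int raw_attribute(int on)
{
    return on ? ATTR_BIT : 0;
}

/* stores the raw result: only the low bit of 4 survives, which is 0 */
unsigned int store_raw(int on)
{
    struct flags f;

    f.is_set = raw_attribute(on);
    return f.is_set;
}

/* stores the normalized result: always 0 or 1 */
unsigned int store_normalized(int on)
{
    struct flags f;

    f.is_set = (raw_attribute(on) != 0);
    return f.is_set;
}
```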

Barry Margolin
Thinking Machines Corp.

barmar@think.com
uunet!think!barmar

friedl@vsi.UUCP (Stephen J. Friedl) (04/24/88)

In article <20126@think.UUCP>, barmar@think.COM (Barry Margolin) writes:
< I can think of one reason, however, why 1 should be used as a standard
< truth value.  Single-bit fields in structures are generally used as
< flags, and it would be nice to be able to say flag = isXXX(...) rather
< than flag = (isXXX(...) != FALSE).

Technically, shouldn't TRUE be `1U' if it goes into a bitfield?
I'm not sure but would like to hear some thoughts on this...

-- 
Steve Friedl      V-Systems, Inc.    Gene Spafford for President!
friedl@vsi.com   {backbones}!vsi.com!friedl    attmail!vsi!friedl

flaps@dgp.toronto.edu (Alan J Rosenthal) (04/24/88)

People complaining recently that "if(expr == TRUE)" is ridiculous have
missed the most convincing reason as to why it is ridiculous.  Note
that we are assuming that expr is a boolean expression in the
Indian-Hill-C-Style-Manual-as-annotated-by-Henry-Spencer sense that it
is known to evaluate to one of 0 or 1.

The clearest reason why "if(expr == TRUE)" is ridiculous is simply that
such a test begs the question.  Certainly "expr == TRUE" returns a
boolean result, so if it is necessary to test boolean results in this
fashion then we must write "if((expr == TRUE) == TRUE)"!  And so on,
making the expression infinitely large, which is explicitly prohibited
by the ANSI C standard.

ajr

--
"The goto statement has been the focus of much of this controversy."
	    -- Aho & Ullman, Principles of Compiler Design, A-W 1977, page 54.

wes@obie.UUCP (Barnacle Wes) (04/25/88)

In article <255@oink.UUCP>, jep@oink.UUCP (James E. Prior) writes:
>    if (var==TRUE)
> 
> is not only abominable, it can be dangerous.  var==TRUE tends to presume
> that the only valid values of var are FALSE and TRUE.

Right.  The only really safe way to do this is:

typedef enum {false, true} boolean;
...
	boolean foo;
	...
	if (foo == true) { ...

But, of course, then you can test your `booleans' directly too, like

	if (foo) { ...

-- 
    /\              -  "Against Stupidity,  -    {backbones}!
   /\/\  .    /\    -  The Gods Themselves  -  utah-cs!uplherc!
  /    \/ \/\/  \   -   Contend in Vain."   -   sp7040!obie!
 / U i n T e c h \  -       Schiller        -        wes

karl@haddock.ISC.COM (Karl Heuer) (04/25/88)

In article <20126@think.UUCP> barmar@fafnir.think.com.UUCP (Barry Margolin) writes:
>Nowhere does the C language specify that any particular non-zero value
>should be returned by the isXXX functions to indicate truth.

This is correct according to historical precedent and the Jan88 dpANS.  I
disapprove nevertheless.

>I can think of one reason, however, why 1 should be used as a standard
>truth value.  Single-bit fields ...

I think a more important reason is that the boolean operators such as "<" and
"&&" are already guaranteed to return normalized boolean values (i.e. truth is
denoted by 1).  Requiring the isXXX functions to do likewise would follow the
principle of least astonishment.
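
The guarantee mentioned above can be seen in a few lines; the function
names are made up for the illustration.

```c
/* a relational operator yields exactly 0 or 1 */
int less_than(int a, int b)
{
    return a < b;
}

/* so does &&, whatever non-zero values its operands have */
int both(int a, int b)
{
    return a && b;
}
```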

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

bts@sas.UUCP (Brian T. Schellenberger) (04/25/88)

In article <20126@think.UUCP> barmar@fafnir.think.com.UUCP (Barry Margolin) writes:
|[for one-bit bit fields]
|it would be nice to be able to say flag = isXXX(...) rather
|than flag = (isXXX(...) != FALSE).  But this is just a wish; portable
|code must currently use the more verbose version.

(I will probably start a "if( a = b )"-type flame war for this, but):

I prefer the shorter form:

	flag = !!isXXX(...)

You can think of "!!" as the "truth-value-of" or "convert-to-canonical-boolean" 
operator.

( To forestall 10 "I didn't know there was such an operator" postings:  )
( This is "!" followed by "!":                                          )
(       non-zero -> 0 -> 1                                              )
(              0 -> 1 -> 0                                              )
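
As a sketch, with hypothetical function names, the two spellings give
the same result for every int:

```c
/* "!!" spelling: non-zero -> 1, zero -> 0 */
int canonical(int v)
{
    return !!v;
}

/* the more verbose spelling, for comparison */
int canonical_verbose(int v)
{
    return v != 0;
}
```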
-- 
                                                         --Brian.
(Brian T. Schellenberger)				 ...!mcnc!rti!sas!bts

. . . now at 2400 baud, so maybe I'll stop bothering to flame long includes.

root@mfci.UUCP (SuperUser) (04/25/88)

In article <601@vsi.UUCP> friedl@vsi.UUCP (Stephen J. Friedl) writes:
>Technically, shouldn't TRUE be `1U' if it goes into a bitfield?
>I'm not sure but would like to hear some thoughts on this...

TRUE should have the same type and value as the constant expression (0 == 0).
Similarly, FALSE should have the same type and value as the constant
expression (0 != 0).  This principle holds for any language.  In the case
of C, TRUE and FALSE are signed 1 and signed 0, resp.

nevin1@ihlpf.ATT.COM (00704a-Liber) (04/27/88)

In article <3592@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:

>I think a more important reason is that the boolean operators such as "<" and
>"&&" are already guaranteed to return normalized boolean values (i.e. truth is
>denoted by 1).  Requiring the isXXX functions to do likewise would follow the
>principle of least astonishment.

This is a matter of opinion.  Personally, I would rather have the isXXX
macros be required to return their argument if true, since this conveys
more useful information than simply a TRUE (1) or FALSE (0) value (BTW,
this might not be possible with the isascii macro).

Also, I would rather have all bits set to represent TRUE (such as two's
complement for -1) instead of 1, since I can use the bitwise operators on
boolean values most effectively this way (I'm not expecting this to happen
due to prior art with < and &&, but one can hope).
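
A rough sketch of the bitwise-operator point, with made-up macro names:
applying ~ to an all-bits-set truth value gives 0, while applying it to
1 gives another non-zero (hence still "true") value.

```c
#define TRUE_ONE 1       /* the conventional C truth value */
#define TRUE_ALL (~0)    /* all bits set (-1 on two's complement) */

/* ~1 is non-zero, so bitwise NOT does not negate this kind of boolean */
int not_with_one(void)
{
    return ~TRUE_ONE;
}

/* ~(~0) is 0, so bitwise NOT does work as logical negation here */
int not_with_all(void)
{
    return ~TRUE_ALL;
}
```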
-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				"The secret compartment of my ring I fill
 /  / _ , __o  ____		 with an Underdog super-energy pill."
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah

nevin1@ihlpf.ATT.COM (00704a-Liber) (04/27/88)

In article <364@m3.mfci.UUCP> karzes@mfci.UUCP (Tom Karzes) writes:

>TRUE should have the same type and value as the constant expression (0 == 0).
>Similarly, FALSE should have the same type and value as the constant
>expression (0 != 0).  This principle holds for any language.  In the case
>of C, TRUE and FALSE are signed 1 and signed 0, resp.

Since when does this principle hold for any language??  Take Fortran, for
instance.  If I remember correctly, odd numbers were TRUE and even numbers
were FALSE (or vice-versa; it's really been a long time since I used
FORTRAN), since compilers were required to look at only the least
significant bit when checking for true/false values.  (This may have been
changed for F77; I'm not sure.)

There are many other examples I could cite (Icon, LISP, etc.).  Your
principle only holds for languages which

a) Have a boolean type

and

b) All logical operations result in that boolean type

such as Pascal.  Many (most?) languages do not conform to these
requirements.
-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				"The secret compartment of my ring I fill
 /  / _ , __o  ____		 with an Underdog super-energy pill."
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah

root@mfci.UUCP (SuperUser) (04/28/88)

In article <4547@ihlpf.ATT.COM> nevin1@ihlpf.UUCP (00704a-Liber,N.J.) writes:
>In article <364@m3.mfci.UUCP> karzes@mfci.UUCP (Tom Karzes) writes:
>
>>TRUE should have the same type and value as the constant expression (0 == 0).
>>Similarly, FALSE should have the same type and value as the constant
>>expression (0 != 0).  This principle holds for any language.  In the case
>>of C, TRUE and FALSE are signed 1 and signed 0, resp.
>
>Since when does this principle hold for any language??  Take Fortran, for
>instance.  If I remember correctly, odd numbers were TRUE and even numbers
>were FALSE (or vice-versa; it's really been a long time since I used
>FORTRAN), since compilers were required to look at only the least
>significant bit when checking for true/false values.  (This may have been
>changed for F77; I'm not sure.)
>
>There are many other examples I could cite (Icon, LISP, etc.).  Your
>principle only holds for languages which
>
>a) Have a boolean type
>
>and
>
>b) All logical operations result in that boolean type
>
>such as Pascal.  Many (most?) languages do not conform to these
>requirements.

First, let me clarify my previous remarks:

    1.  Obviously they applied to languages for which operations such as
        comparisons are genuine expressions which return a value with a
        particular type.  This rules out very primitive languages in
        which comparisons are only permitted in statements which test
        a condition (although my definitions of TRUE and FALSE could still
        work if they were simple macros rather than genuine constant
        expressions).  My original message pointed this out, but it was
        so obvious that I decided not to belabor the obvious and deleted it.
        Anyway, it's no big deal.

    2.  Consider the remaining languages which permit truth-valued expressions
        and variables.  In such languages, if TRUE and FALSE are not
        predefined symbols or special tokens, it is useful to define
        literals or macros to distinguish their use from, for example,
        the numbers 1 and 0 for languages which use these for their
        canonical TRUE and FALSE values.  In almost all such languages,
        0 == 0 will yield a canonical TRUE and 0 != 0 will yield a canonical
        FALSE.

        If TRUE and FALSE are to be defined by a programmer, they should
        have the canonical values which are almost always already
        defined by the language in question.  This is independent
        of whether a unique boolean type is defined by the language (as in
        Fortran) or whether it is not (as in C).  It is also independent of
        what values "test" as TRUE or FALSE.  When testing a value for
        truth/falsity, depending on the language it may be cast to some
        appropriate type first or it may not, after which some test is
        performed (for some languages the test may not even be defined for
        inappropriate values even if the type is acceptable).  However, it
        should always be the case that TRUE tests true and FALSE tests false.

        Here is a table of canonical TRUE/FALSE values for various languages:

            Language        True        False       Type
            --------        ----        -----       ----
            C               1           0           int
            Fortran         .TRUE.      .FALSE.     LOGICAL
            Pascal          TRUE        FALSE       BOOLEAN
            Ada             TRUE        FALSE       BOOLEAN
            PL/I            1           0           FIXED
            Bliss           1           0           <untyped>
            Lisp            T           NIL         SYMBOL
            APL             1           0           <scalar>
            etc.

        When testing for truth/falsity, different approaches are taken.
        In C, every value is merely tested for inequality with 0 (even
        doubles), with the appropriate type casts being performed.
        Fortran only defines truth/falsity for .TRUE. and .FALSE.,
        although many implementations define a bit pattern used to
        represent .TRUE. and .FALSE., and then define a test to determine
        truth/falsity of non-canonical logicals, integers, etc.  (Note
        that standard F77 only allows .TRUE. and .FALSE. for logicals.)
        Some implementations test the low-order bit, some test for
        inequality with zero, etc.  Most represent .FALSE. as 0.  Some
        represent .TRUE. as 1, others as -1.  But none of this is defined
        by the standard.  Bliss tests the low order bit.  Lisp tests for
        inequality with NIL.  Etc, etc.
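
For the C row, a small sketch of "tested for inequality with 0" applied
to a double, a pointer and an int.  The function names are hypothetical.

```c
#include <stddef.h>

/* a double tests true exactly when it compares unequal to 0 */
int tests_true_double(double d)
{
    return d ? 1 : 0;
}

/* a pointer tests true exactly when it is not a null pointer */
int tests_true_pointer(const void *p)
{
    return p ? 1 : 0;
}

/* an int tests true for any non-zero value, negative ones included */
int tests_true_int(int i)
{
    if (i)
        return 1;
    return 0;
}
```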

To address the individual points of confusion in the reply to my original
message:

    1.  You obviously don't know much about Fortran.  No, Fortran does
        not define a true/false test (truth predicate) for integers.
        As I mentioned above, some Fortran compilers choose to expose
        the internal representation of LOGICALS, and may even define
        a truth predicate for integers (i.e., they may permit integers
        to be used directly as logical expressions in logical IF
        statements, etc.).  So what?  Even if this WERE defined and
        required by the standard, it would change nothing.  The values
        .TRUE. and .FALSE. still have type LOGICAL and are still the
        canonical TRUE/FALSE values of the language.  The expression
        0 .EQ. 0 evaluates to .TRUE. and 0 .NE. 0 evaluates to .FALSE..

        I believe VAX/VMS Fortran allows any integer value to be stored
        in a logical, and it tests the low order bit when determining
        truth/falsity.  Integers may be tested directly for truth/falsity.
        Canonical TRUE is -1 and canonical FALSE is 0.  Perhaps you
        thought this was required by the standard (it is not).  How
        parochial.

    2.  I'm not familiar with icon.  However, as I mentioned above,
        the canonical TRUE/FALSE values for lisp are T and NIL.  The
        truth test used by lisp is inequality with NIL.  It happens
        that both T and NIL have the same data type (SYMBOL), although
        in object oriented languages such as lisp they could just as
        easily have different data types (there may well be lisps in
        which NIL is not regarded as a SYMBOL).

    3.  I believe I have already dispensed with the notion that a boolean
        type is required for my principle to hold.

Any more questions?

ok@quintus.UUCP (Richard A. O'Keefe) (04/28/88)

In article <371@m3.mfci.UUCP>, root@mfci.UUCP (SuperUser) writes:
>         Here is a table of canonical TRUE/FALSE values for various languages:
>             Language        True        False       Type
[deleted]
>             PL/I            1           0           FIXED
[deleted]
This entry is incorrect.  The right answer is
	      PL/I	      '1'B        '0'B        BIT(1)
PL/I has (fixed-length) bitstrings BIT(N) as a family of data types.
(It is not clear that this was a good idea.)

dg@lakart.UUCP (David Goodenough) (04/28/88)

From article <175@obie.UUCP>, by wes@obie.UUCP (Barnacle Wes):
> In article <255@oink.UUCP>, jep@oink.UUCP (James E. Prior) writes:
>>    if (var==TRUE)
>> 
>> is not only abominable, it can be dangerous.  var==TRUE tends to presume
>> that the only valid values of var are FALSE and TRUE.
> 
> Right.  The only really safe way to do this is:
 
[safe and silly way deleted]

Why can no one see the forest because of all the trees getting in the way?

	if (var != FALSE)

will *_ALWAYS_* work NO MATTER WHAT, and it satisfies those that need to test
their booleans in a strange manner.

	if (var)

is what I use when testing a boolean, and if it's an integer then I say

	if (var != 0)

which is what I usually mean.
--
	dg@lakart.UUCP - David Goodenough		+---+
							| +-+-+
	....... !harvard!adelie!cfisun!lakart!dg	+-+-+ |
						  	  +---+

hutchson@convex.UUCP (05/07/88)

If you must talk Fortran in a C newsgroup, at least talk Fortran, not some
brain-damaged non-error-checking dialect.

The current ANSI standard (X3.9-1978, "FORTRAN 77") does not allow one to know
the representation of the logical values .true. and .false. (except that it
fits in the same space in memory as an integer value), so each
implementor chooses his own.  I have seen at least these:
   -  sign bit==1 is true, sign bit==0 is false.
   -  lsb==1 is true, lsb==0 is false.
   -  any non-zero value is true, zero is false.
   -  any non-zero is considered true, .true. is canonically all one bits
   -  any non-zero is considered true, .true. is canonically==(int) 1
Reversing the meaning and representation of each of these would be equally
valid.

gwyn@smoke.BRL.MIL (Doug Gwyn ) (11/12/88)

In article <764@wsccs.UUCP> terry@wsccs.UUCP (Every system needs one) writes:
>Unions should be avoided, as well as non-pre-aligned structures.

I don't know what you mean by a "non-pre-aligned" structure; perhaps
one with holes?  In any case, object alignment is implementation-
dependent and should not be a major concern when designing most
data structures.  Unions have obvious uses for which there is no
acceptable substitute, but one should not go out of one's way to
use a union where one is not required.

>	if( expression operator expression)
>rather than
>	if (expressionoperatorexpression)
>which can break old compilers with either the space after the if or ...

I have never seen a C compiler that had trouble with white space between
the "if" and "(" tokens.  If you insist on "if(" then at least balance
the white space adjacent to the parentheses.

>It is sheer stupidity to depend on the supposed contents of a black box; for
>instance, a compiler.  This generates non-portable and lazy coding practices
>"aw... the compiler'll make it fast..."

To the contrary, the example you quoted showed that non-portable practices
could be avoided without sacrificing efficiency, by relying on the
compiler to produce code optimized for the particular architecture.
If some compiler has a shortcoming in this regard, effort spent to
improve the compiler will have a larger payoff than effort spent in
tweaking individual applications in an attempt to partially compile
the code by hand in advance.

>Ability to understand others code is the difference between a programmer and
>a person who can program.  Writing code for idiots is only good if you are
>an idiot and can do no better, or if you are willing to hire idiots.

You miss the point.  Even the most experienced C programmer stands to
benefit, EVEN WHEN LATER WORKING ON HIS OWN CODE, if the code is
written with maintainability in mind.  The more low-level tricks
contained in a stretch of code, the less maintainable it is even
to the original programmer.

>There is some truth to this, but the key word is "unnecessary".  It is
>also unnecessary for a computer programmer, whose first abstract concept
>should have been bits.  If the person is a C programmer, the second and third
>concepts should have been Hexadecimal and Octal.  Assuming that these
>operations haven't become automatic after experience is silly.  It is equally
>silly to think of a programmer doing bit operations with multiplies instead of
>shifts!

Judging by the errors I have seen in literally millions of lines of C
code, the C programmer who thinks primarily in terms of bits is probably
introducing scads of present and future bugs into his code.

>You have any idea how long a multiply takes on most architectures?

Not normally long enough to worry about.

One should micro-optimize only those sections of code that need it,
when it is clear that the tradeoff against maintainability is worthwhile.
To do this for all source code would be making a poor long-term tradeoff.
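
As a sketch of the tradeoff, here are the two spellings of "scale by 8"
for an unsigned value.  The names are made up, and whether a particular
compiler turns the multiply into a shift is implementation-dependent.

```c
/* the clear version: says what is meant */
unsigned long scale_clear(unsigned long n)
{
    return n * 8UL;
}

/* the hand-"optimized" version: same result for unsigned operands */
unsigned long scale_tricky(unsigned long n)
{
    return n << 3;
}
```

For unsigned operands the two always agree, so the shift buys nothing in
correctness and costs a little in readability.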

kim@msn034.misemi (Kim Letkeman) (11/20/88)

In article <8864@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:

[ much verbiage removed for brevity ...]

> 
> Judging by the errors I have seen in literally millions of lines of C
> code, the C programmer who thinks primarily in terms of bits is probably
> introducing scads of present and future bugs into his code.
> 
> >You have any idea how long a multiply takes on most architectures?
> 
> Not normally long enough to worry about.
> 
> One should micro-optimize only those sections of code that need it,
> when it is clear that the tradeoff against maintainability is worthwhile.
> To do this for all source code would be making a poor long-term tradeoff.

No question about it. Too many people write poorly structured, unclear
(add as many unpleasant labels as you like) code for no better reason than
a perceived need to write the fastest code on the planet. While I realize
that garden-variety 4.77 MHz machines really appreciate fast code (I own
one myself), micro-optimizing a whole program is overkill to say the least.

One only need read Software Tools and Elements of Programming Style to 
realize that for most programs, the vast majority of the code will have
almost no effect on the overall performance of the program (providing of
course that there are no gross algorithmic blunders.) I have found that
optimization efforts that focused on better algorithms paid off in
order-of-magnitude increases in performance while optimizing a single
routine usually gives back some fraction of 1 percent ...

Kim

peter@ficc.uu.net (Peter da Silva) (01/11/89)

I have been told that the following mechanism for handling nested includes
is unreliable and/or unportable, but for the life of me I can't see how:

graphics.h:
	#ifndef GRAPHICS_H
	#define GRAPHICS_H
	...
	#endif

windows.h:
	...
	#ifndef GRAPHICS_H
	#include GRAPHICS_H
	#endif
	...

menus.h:
	...
	#ifndef GRAPHICS_H
	#include GRAPHICS_H
	#endif
	...

Now this allows a programmer to include windows.h and menus.h, without
having to (a) know they need to include graphics.h, and (b) worry about
graphics.h being included twice.

What's wrong with this picture?
-- 
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.
Work: uunet.uu.net!ficc!peter, peter@ficc.uu.net, +1 713 274 5180.   `-_-'
Home: bigtex!texbell!sugar!peter, peter@sugar.uu.net.                 'U`
Opinions may not represent the policies of FICC or the Xenix Support group.

peter@ficc.uu.net (Peter da Silva) (01/11/89)

This should make the previous message <2688@ficc.uu.net> a little less absurd.

  windows.h:
  	...
  	#ifndef GRAPHICS_H
! 	#include "graphics.h"
  	#endif
  	...
  
  menus.h:
  	...
  	#ifndef GRAPHICS_H
! 	#include "graphics.h"
  	#endif
  	...

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/11/89)

In article <2688@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>I have been told that the following mechanism for handling nested includes
>is unreliable and/or unportable, but for the life of me I can't see how:
>graphics.h:
>	#ifndef GRAPHICS_H
>	#define GRAPHICS_H
>	...
>	#endif
>windows.h:
>	...
>	#ifndef GRAPHICS_H
>	#include GRAPHICS_H
>	#endif
>	...
>menus.h:
>	...
>	#ifndef GRAPHICS_H
>	#include GRAPHICS_H
>	#endif
>	...
>Now this allows a programmer to include windows.h and menus.h, without
>having to (a) know they need to include graphics.h, and (b) worry about
>graphics.h being included twice.
>What's wrong with this picture?

The only thing wrong is your syntax.  You mean
	#include "graphics.h"
in the latter two files.
In fact there is no need to place conditionals around those inclusions,
since the included file will have no effect if it is already in force,
because it checks its one-time lockout flag and avoids redefining things
after the first time it's included in a translation unit.

Notes:
	1.  GRAPHICS_H needs to be reserved for this use.  If this
	header is part of an application (as indicated), then you
	just have to keep track of such symbols, perhaps by making
	the rule that the _H suffix is reserved for them.  If you
	were implementing standard headers for a C implementation,
	you would need to use a lockout symbol that's in the
	implementation's reserved name space, e.g. __CTYPE_H.

	2.  If the header just defines macros and structures,
	and declares types of external objects and functions,
	then you don't need to ensure one-time actions, because
	such actions can be repeated safely.  Typedefs are the main
	things that need to be protected against a second invocation.

	3.  Notwithstanding point 2, if the header includes others
	then it should probably use lock-out symbols, to avoid
	infinite recursion if the other headers include THIS one.

	4.  Application headers should NOT include standard headers,
	because many C implementations do not provide idempotent
	standard headers, so bookkeeping becomes a real mess unless
	you adopt the simple rule that all application headers are
	idempotent and never include system headers inside themselves.
	(ANSI C requires the standard headers to be idempotent, i.e.
	includable multiple times with the same effect as a single
	inclusion.)

	5.  Include headers before doing anything else in the source.

	6.  We use this scheme in a major project and it works
	fine.
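
A minimal sketch of the one-time lockout described above.  To keep it in
one compilable file, the body of a hypothetical graphics.h is written out
twice, standing in for two #include "graphics.h" lines reaching the same
translation unit; the Point type and point_sum() are assumptions made up
for the illustration.

```c
#include <assert.h>

/* First "inclusion": GRAPHICS_H is not yet defined, so the body is compiled. */
#ifndef GRAPHICS_H
#define GRAPHICS_H
typedef struct { int x, y; } Point;
#endif

/* Second "inclusion": GRAPHICS_H is now defined, so everything up to the
   #endif is skipped.  Without the lockout this typedef would be a
   redefinition error. */
#ifndef GRAPHICS_H
#define GRAPHICS_H
typedef struct { int x, y; } Point;
#endif

/* Uses the type, showing that exactly one definition of Point is in force. */
int point_sum(const Point *p)
{
    assert(p != 0);
    return p->x + p->y;
}
```

The lockout symbol is tested and set only inside the header itself, so a
file that includes the header never has to mention GRAPHICS_H.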

bill@twwells.uucp (T. William Wells) (01/11/89)

In article <2688@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
: I have been told that the following mechanism for handling nested includes
: is unreliable and/or unportable, but for the life of me I can't see how:
:
: graphics.h:
:       #ifndef GRAPHICS_H
:       #define GRAPHICS_H
:       ...
:       #endif
:
: windows.h:
:       ...
:       #ifndef GRAPHICS_H
:       #include GRAPHICS_H

(Presumably you mean #include "graphics.h".)

:       #endif
:       ...
:
: menus.h:
:       ...
:       #ifndef GRAPHICS_H
:       #include GRAPHICS_H
:       #endif
:       ...
:
: Now this allows a programmer to include windows.h and menus.h, without
: having to (a) know they need to include graphics.h, and (b) worry about
: graphics.h being included twice.
:
: What's wrong with this picture?

Not a thing. The company I work for does this routinely.  Our source
code is ported all over the place (70+ different systems for one
customer alone!) and we've never had a complaint about this.

Strictly speaking, the #ifndef's in the other include files aren't
necessary but they do save on the time needed to skip the useless
files.

---
Bill
{ uunet!proxftl | novavax } !twwells!bill
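
The time saving mentioned above can be sketched in a single file.  The
fragments below stand in for hypothetical graphics.h, windows.h and
menus.h contents (the Point, Window and Menu types are invented for the
example).  Once GRAPHICS_H is defined, each outer #ifndef puts the
#include line in a skipped group, so the preprocessor does not try to
open the file at all; in fact this compiles even if no graphics.h exists.

```c
#include <assert.h>

/* ---- what graphics.h would contribute ---- */
#ifndef GRAPHICS_H
#define GRAPHICS_H
typedef struct { int x, y; } Point;
#endif

/* ---- fragment of windows.h ---- */
#ifndef GRAPHICS_H
#include "graphics.h"       /* skipped group: the file is never opened */
#endif
typedef struct { Point origin; int width, height; } Window;

/* ---- fragment of menus.h ---- */
#ifndef GRAPHICS_H
#include "graphics.h"       /* skipped for the same reason */
#endif
typedef struct { Point anchor; int items; } Menu;

int window_area(const Window *w)
{
    assert(w != 0);
    return w->width * w->height;
}
```

The price is that windows.h and menus.h must both know the lockout symbol
that graphics.h uses.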

peter@ficc.uu.net (Peter da Silva) (01/12/89)

In article <9336@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:
> The only thing wrong is your syntax.  You mean
> 	#include "graphics.h"
> in the latter two files.

Yeah, yeah, I know.

> In fact there is no need to place conditionals around those inclusions,
> since the included file will have no effect if it is already in force,
> because it checks its one-time lockout flag and avoids redefining things
> after the first time it's included in a translation unit.

Depends on the number of open files the operating system and 'C' library
allows, and on the cost of opening a file. If the files are large, this
extra set of conditionals might significantly enhance compilation speed.

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/12/89)

In article <2700@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
-In article <9336@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:
-> In fact there is no need to place conditionals around those inclusions,
-> since the included file will have no effect if it is already in force,
-> because it checks its one-time lockout flag and avoids redefining things
-> after the first time it's included in a translation unit.
-Depends on the number of open files the operating system and 'C' library
-allows, and on the cost of opening a file. If the files are large, this
-extra set of conditionals might significantly enhance compilation speed.

Well, a good reason for nonetheless letting the compiler take care of it
for you is that otherwise the user of the header needs to also know the
special lock symbol for the header.  General design principles argue that
the symbol should be the private property of the header.

daw@houxs.ATT.COM (David Wolverton) (01/12/89)

In article <9336@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:
> 
> 	2.  If the header just defines macros and structures,
> 	and declares types of external objects and functions,
> 	then you don't need to ensure one-time actions, because
> 	such actions can be repeated safely.  Typedefs are the main
> 	things that need to be protected against a second invocation.

You may still want to use the lockout even when it is safe to include
the header more than once.  My gut feeling is that your compile time
would be improved if you have long headers WITH lockouts, because then
the preprocessor is effectively the only part of the compiler that must
process/scan that part of the incoming source.  Without lockouts,
the compiler must parse that stuff, do bookkeeping, etc. which adds
to the compile time.

Dave Wolverton
daw@houxs.att.com

leo@philmds.UUCP (Leo de Wit) (01/13/89)

In article <9345@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
|In article <2700@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
|-Depends on the number of open files the operating system and 'C' library
|-allows, and on the cost of opening a file. If the files are large, this
|-extra set of conditionals might significantly enhance compilation speed.
|
|Well, a good reason for nonetheless letting the compiler take care of it
|for you is that otherwise the user of the header needs to also know the
|special lock symbol for the header.  General design principles argue that
|the symbol should be the private property of the header.

An argument against this is that if the user of the header does NOT
need to know this, it could possibly #define the same symbol (for
whatever reason). This would result in the header not being included
properly.

The strategy chosen should perhaps be such that one (or more ?) symbol(s)
per file are being reserved for this purpose, so that each module or
header file knows what symbols can / cannot be used.

Peter's argument 'Depends on the number of open files' doesn't seem
valid:  if you don't use the lock symbol, you open the header file;
since the whole header file is #ifdef'ed out, no other files will be
included in the process. So the total number of files open at any time
is at most one more than in the 'use the lock symbol' strategy. If
your system can't handle that one extra open file, you probably already
had too many open (read: abused the #include facility).

                Leo.

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/13/89)

In article <918@philmds.UUCP> leo@philmds.UUCP (Leo de Wit) writes:
>An argument against this is that if the user of the header does NOT
>need to know this, it could possibly #define the same symbol (for
>whatever reason). This would result in the header not being included
>properly.

I assumed that the available name space had already been divided
among "packages" using some general guideline, such as package
prefixes (explained in a previous posting).  You need to take
care of this anyway.  It is easier to cooperate with general
name space partitioning guidelines than to remember specific
symbols.

leo@philmds.UUCP (Leo de Wit) (01/19/89)

In article <9361@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
|I assumed that the available name space had already been divided
|among "packages" using some general guideline, such as package
|prefixes (explained in a previous posting).  You need to take
|care of this anyway.  It is easier to cooperate with general
|name space partitioning guidelines than to remember specific
|symbols.

We use that strategy here too, and it is easy to extend the
guideline to include the preprocessor token:

Global functions are prefixed with a 3 letter & underscore prefix, the
3 letters identifying the module. For instance, if you have a queue
handling module, you would name the functions que_.... . The header
file that exports the module's global functions, variables, types &
macros (and is included by the module itself and all others that use
the module) defines the symbol QUE to prevent multiple inclusion.

Never noticed any problems remembering this scheme ...

  Leo.
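
A sketch of how such a module might look, assuming a small fixed-size
queue of ints.  Only the que_ prefix and the QUE lockout symbol come from
the description above; the type, the constant and the three functions are
invented for the illustration.  Header and module are shown in one file
so that the example compiles on its own.

```c
#include <assert.h>

/* ---- que.h (hypothetical): everything the queue module exports ---- */
#ifndef QUE
#define QUE
#define QUE_MAX 8
typedef struct { int item[QUE_MAX]; int head, count; } que_queue;
void que_init(que_queue *q);
int  que_put(que_queue *q, int v);      /* 1 on success, 0 if the queue is full */
int  que_get(que_queue *q, int *v);     /* 1 on success, 0 if the queue is empty */
#endif

/* ---- que.c (hypothetical): the module itself includes que.h too ---- */
void que_init(que_queue *q)
{
    assert(q != 0);
    q->head = 0;
    q->count = 0;
}

int que_put(que_queue *q, int v)
{
    assert(q != 0);
    if (q->count == QUE_MAX)
        return 0;
    q->item[(q->head + q->count) % QUE_MAX] = v;    /* store at the tail */
    q->count++;
    return 1;
}

int que_get(que_queue *q, int *v)
{
    assert(q != 0 && v != 0);
    if (q->count == 0)
        return 0;
    *v = q->item[q->head];                          /* take from the head */
    q->head = (q->head + 1) % QUE_MAX;
    q->count--;
    return 1;
}
```

The three external names already differ within their first six characters
(que_in, que_pu, que_ge), which matters on linkers that look no further.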

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/20/89)

In article <925@philmds.UUCP> leo@philmds.UUCP (Leo de Wit) writes:
>Global functions are prefixed with a 3 letter & underscore prefix, the
>3 letters identifying the module.

Four of the six guaranteed significant characters seems like too
great a price.  We use 2-character "package prefixes"; e.g.
	MmAllo
where "Mm" denotes the "Memory manager" package and "Allo" is of
course the specific function of "Allocation".  This has worked
well for me over the decades.

>The header file that exports the module's global functions, variables,
>types & macros (and is included by the module itself and all others
>that use the module) defines the symbol QUE to prevent multiple inclusion.

Short macros like that worry me, because the probability of conflict
with some other use seems too great.  Our package interface definition
header for the "Mm" package uses the symbol MmH_INCLUDED, which is
practically certain not to clash with anybody else.
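
The kind of clash in question can be sketched as follows.  The earlier
#define of QUE and the two *_SEEN marker macros are assumptions made up
for the demonstration; only QUE and MmH_INCLUDED appear in the posts.

```c
#include <assert.h>

/* Some unrelated code happens to use the short name for its own purposes,
   say as a hypothetical queue-depth constant. */
#define QUE 3

/* ---- que.h, locked out by the short symbol ---- */
#ifndef QUE                 /* already defined above, so the body is skipped */
#define QUE
#define QUE_HEADER_SEEN 1
#endif

/* ---- Mm.h, locked out by a longer package-specific symbol ---- */
#ifndef MmH_INCLUDED
#define MmH_INCLUDED
#define MM_HEADER_SEEN 1
#endif

int que_header_was_compiled(void)
{
#ifdef QUE_HEADER_SEEN
    return 1;
#else
    return 0;
#endif
}

int mm_header_was_compiled(void)
{
#ifdef MM_HEADER_SEEN
    return 1;
#else
    return 0;
#endif
}

/* The header guarded by QUE silently vanished; the other one did not. */
void check_guards(void)
{
    assert(que_header_was_compiled() == 0);
    assert(mm_header_was_compiled() == 1);
}
```

Nothing warns about the lost header contents; the first symptom is
usually an undeclared identifier somewhere far from the real cause.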

leo@philmds.UUCP (Leo de Wit) (01/24/89)

In article <9451@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
|In article <925@philmds.UUCP> leo@philmds.UUCP (Leo de Wit) writes:
|>Global functions are prefixed with a 3 letter & underscore prefix, the
|>3 letters identifying the module.
|
|Four of the six guaranteed significant characters seems like too
|great a price.  We use 2-character "package prefixes"; e.g.
|	MmAllo
|where "Mm" denotes the "Memory manager" package and "Allo" is of
|course the specific function of "Allocation".  This has worked
|well for me over the decades.

The project I'm currently working on has about 80 modules. The average
number of globals per module is certainly less than 80 (more like
10-20, I guess).  So it seems to me in this case it is completely
justified to use more identifying characters for the modules than for
the globals within a module.

As for the number of names possible, this is still 37 * 37 per module
(considering only uppercase, digits and underscore). You can still use
longer names for readability (whilst keeping the first 6 characters
unique).
Using the underscore may seem a bit of a waste, but it makes reading
linker output a lot easier in environments that only support uppercase
(in the link phase).

   [about 3 char inclusion macros ...]
|Short macros like that worry me, because the probability of conflict
|with some other use seems too great.  Our package interface definition
|header for the "Mm" package uses the symbol MmH_INCLUDED, which is
|practically certain not to clash with anybody else.

OK, accepted 8-).

	 Leo.
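
The arithmetic behind the 37 * 37 figure, as a small sketch.  The function
name is invented; the inputs are the assumptions stated in the posts: six
significant characters, some of them used up by the module prefix, and an
alphabet of 26 uppercase letters, 10 digits and the underscore.

```c
#include <assert.h>

/* Number of distinct external names left to one module when prefix_len of
   the significant characters are taken by the module prefix and each
   remaining position may hold any of alphabet_size characters. */
unsigned long names_per_module(unsigned significant, unsigned prefix_len,
                               unsigned alphabet_size)
{
    unsigned long n = 1;
    unsigned i;

    assert(prefix_len <= significant);
    for (i = prefix_len; i < significant; i++)
        n *= alphabet_size;             /* one free position per iteration */
    return n;
}
```

With a 4-character prefix such as que_, names_per_module(6, 4, 37) gives
37 * 37 = 1369 names per module, far more than the 10-20 actually needed.
A 2-character prefix would leave 37^4 = 1874161.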

flaps@dgp.toronto.edu (Alan J Rosenthal) (07/09/90)

hannum@haydn.psu.edu (Charles Hannum) writes:
>"... preprocessor games, ..."?  It's not a game.  It's a perfectly legitimate
>use of the C preprocessor.  If we weren't supposed to use it, it wouldn't be
>included (based on C's minimalist philosophy).

Are you claiming that all of the obfuscated C contest winners are legitimate C
programs?  Would you like to have to maintain them?

hannum@handel.psu.edu (Charles Hannum) (07/13/90)

In article <1990Jul8.145418.3368@jarvis.csri.toronto.edu> flaps@dgp.toronto.edu (Alan J Rosenthal) writes:

   hannum@haydn.psu.edu (Charles Hannum) writes:
   >"... preprocessor games, ..."?  It's not a game.  It's a perfectly legitimate
   >use of the C preprocessor.  If we weren't supposed to use it, it wouldn't be
   >included (based on C's minimalist philosophy).

   Are you claiming that all of the obfuscated C contest winners are legitimate C
   programs?  Would you like to have to maintain them?


No.  Most of them are non-portable.  Coding style is not a function of the
language; it is a function of personal taste and that of the people you work
for, if any.
--
 
Virtually,
Charles Martin Hannum		 "Those who say a thing cannot be done should
Please send mail to:		  under no circumstances stand in the way of
hannum@schubert.psu.edu		  he who is doing it." - a misquote