[comp.lang.c] documentation standards........

evh@vax1.acs.udel.EDU (SAVILLE) (10/09/87)

Here are some standards developed at the univ. of del. for C code,
and program design/'style'. I don't agree with everything here
(as some of you probably don't either). Please send any
criticisms/comments directly to me via e-mail at:

evh@vax1.acs.udel.edu

If you can not send to that address, then try
tsavill@apg-5.arpa
AS A LAST RESORT (i'm doing development/design work here, and the node
is bloody busy, so i'll probably catch some flak if you do).
If any of you people out there play frisbee golf, let me know.
I need some new courses. Another mach one course.

The doc is in nroff form.
Please do not send the whole document back to me with
interpolated comments.  This is sort of like a survey.
If you don't have nroff, then i'll send the 'straight' text
version.
----------------cut--here-----------------

.ll 70
.ce 2
Coding Standard For C Programs
Paul D. Amer
.sp 2
1.  Comments
.sp 1
Comments serve two main purposes:
.br
.in+4
(a)  They help a reader understand broad concepts, and explain
the approach a program uses to solve a problem.
.sp  
Every module (procedure or function) must begin with a block of comments
that describes its purpose.  
Modules are
.ul
completely
documented internally, where 'completely' encompasses that amount of
documentation which enables a programmer who has never seen the
code before to understand the purpose without having to read the code.
(Such a programmer is often the original coder six months after
implementation when a problem occurs.)
.sp
Therefore every module begins with the following header block of comments:
.in +4
.ti -4
1.  Purpose: what does the module do?
This description is in prose form following
all rules of English grammar and punctuation.  Do not describe the 
algorithm in step by step detail.  The program description should be an 
overview written for someone unfamiliar with the assignment.
.ti -4
2.  Author and Date 
.ti -4
3.  Input Parameters:  names and descriptions of parameters passed
to the procedure
.ti -4
4.  Returned Values:  for a procedure, names and descriptions of affected
parameters; for a function, in addition, an explanation of
the single returned value
.ti -4
5.  Procedures/Functions Called: names of procedures and functions
that are called from the module (with optional descriptions).  System
utilities such as printf, getchar need not be documented
.ti -4
6.  Procedures/Functions Calling:  names of procedures and functions
that call this module
.ti -4
7.  Local Variables: names and descriptions of every local variable
.ti -4
8.  Global Variables Used (and how modified):  names of all global
variables used and how the module modifies them
.ti -4
9.  Global Constants: names and description of globals [This section
would be found in the main program only.]
.ti -4
10. Bugs:  known input values or conditions for which a module will not
perform properly
.in -4
.sp 2
(b)  Comments explain code which is otherwise unclear.
.br
When every line of code is commented, the obvious question is "If
the code is obscure enough to require a comment - shouldn't the code
be rewritten?"  Good code doesn't need imbedded comments; try to 
rewrite unclear code instead of commenting it.
Clearly this isn't always possible, and imbedded comments may be necessary.
.sp 1
Begin comments with a verb if possible. 
For example, "Input number of students",
"Compute mean value of score", "Convert answer to ascii".
Avoid phrases such as "The following
code ...", "Now we compute ...", "Lines 20 thru 28 evaluate ...", "This
procedure tries to ...".  
.sp
A comment within code must start in the same column as
(and above) the code it explains, or be offset to the right of the code.
.sp
.ul
A comment is of negative value if it is wrong or does not agree with the code.
.sp 1
.in-4
2.  Variable names.
.in+4
.sp 1
Choose meaningful variable names.  Do not over-abbreviate (also known
as 'Fortranizing') variable names (e.g., ttl for total, lgth for length).
A good rule of thumb is 5-9 letters for every variable name.  Single 
letter variable names, such as i, j, x, are acceptable only as
dummy loop variables for tasks such as outputting an array.
The only reason you need more than 1 dummy variable is for nested
loops.
.sp 1
.in-4
3.  How to indent code.
.in+4
Opinions differ on how to best make code readable, but everyone
agrees on one rule - Be Consistent.
.sp 1
a.  Always indent 3 spaces; not 2, not 5, always 3
.sp 1
b.  Align corresponding { }'s and keep code aligned with
its bounding { }, do not indent again
.nf

      Correct                  Incorrect

     {                           {
     -----;                         -----;
     -----;                         -----;
        {                              {
        -----;                            -----;
        -----;                            -----;
        -----;                            -----;
        }                                 }
     }                                 }

.fi
.sp 1
.ne 4
c.  If statements: always indent the then part under the if.
.nf
.sp
     if (relation)
        x = 5;

.ne 6
     if (relation)
        {
        -----;
        -----;
        -----;
        }
.ne 14
d.  If-then-else statements: align the words if and else,
and align the then and else code portion
.nf

     if (relation)  /* comment on primary relational test */
        {
        -----;
        -----;
        }
     else 
        /* comment on initial else */
        {
        -----;
        -----;
        if (relation)
           -----;
        else
           /* comment on the purpose of the deeper else */
           {
           -----;
           -----;
           }
        }
.sp 
.ne 14
e. switch statement: align the cases with the {}'s, indent each case's code

     switch (nextchar)
        {
        case 'a':
           -----;
           -----;
           break;

        case 'b':
        case 'c':
           -----;
           break;

        default:
           -----;
           -----;
        }
.sp
.ne 4
f. for and while statements: always indent the body of the loop

      for (i=1; i<10; i++)
         array[i] = 0;

.ne 5
      while (relation)
         {
         ------;
         ------;
         }
.fi
.in-4
.sp 1
4.  Miscellaneous Rules
.sp 1
.in+4
Order modules in some logical manner, either alphabetically or according
to function.  For instance, there might be 3 procedures which perform
output routines.  They should be grouped together.
An index showing on what page each module can be found is a good idea when
there are more than 10 modules.
.sp 1
Besides performing a computerized function, code should be
written to achieve three main goals. 
.br 
.in+4
Clarity (Readability)
.br 
Maintainability (easy for someone else to modify)
.br 
Portability (even if you never plan to move it to another machine)
.sp 1
.in-4
Some rules which help achieve these goals are:
.sp 1
Keep things simple. (KISS - Keep It Simple Stupid)
.br
.in+4
Use simple control structures (goto's aren't simple)
.br 
Simple does not mean short, nor does it mean executes fastest.
.sp 1
.in-4
Modularize different functions.
.in+4
If you find a block of code that performs a single function and is used
in more than one place in a program,
you should probably make the block a separate module.
.sp 1
Keep modules short (less than 1 page? 40-50 lines of code?)
.sp 
A good module is written such that the module can later be
changed (maybe a better algorithm for the same function) without
having to rewrite other modules.
.sp 1
.in-4
Do not use constants (i.e., magic numbers)
in a program (perhaps with the exception of the
constants 0, 1, -1); instead use #define's. 
.sp 1
Minimize, if not avoid, the use of global variables. (You need a
good reason for using globals and 'it is easier than passing
variables as parameters' is NOT a good reason.)
.sp 1
Avoid assumptions about the underlying machine (such as word length,
character set, etc.)
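.sp 1
As a rough sketch of this rule (not part of the original standard), the
fragment below takes the width of a word from limits.h and sizeof, and
classifies characters with isdigit from ctype.h, instead of hard-coding
either.  The names UINT_BITS, digit_value and bit_is_set are
hypothetical, chosen only for illustration.
.nf

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

/* Bits in an unsigned int, computed rather than assumed to be 16 or 32 */
#define UINT_BITS (sizeof(unsigned int) * CHAR_BIT)

/* Convert a decimal digit character to its value; return -1 otherwise */
int digit_value(int ch)
   {
   /* isdigit avoids assumptions about how the character set is laid out */
   if (!isdigit((unsigned char)ch))
      return -1;

   /* C guarantees that '0' through '9' are contiguous and ascending */
   return ch - '0';
   }

/* Test bit number position (0 = least significant) of word */
int bit_is_set(unsigned int word, unsigned int position)
   {
   /* shifting by the full width of the type or more is undefined */
   assert(position < UINT_BITS);

   return (int)((word >> position) & 1u);
   }
```

.fi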
.sp 1
Except for static local variables, put all initialization statements
in a procedure entitled "initialize"
and call it at the beginning of the main program.
.sp 1
Reference:
.ul
The Elements of Programming Style 
by Kernighan & Plauger, McGraw-Hill, 1974.
.sp 1
.in-4
5.  Example program.
.sp 1
.in+4
The following program is not perfect, but is a good example of  well
documented code.  Student suggestions for improvement are welcome.
     (also found in    ~amer/examples/c.program.c  on vax1)


.nf
/************************************************************************ 
 *           Assignment 1:  A Structured Program                        * 
 *                                                                      * 
 * Programmer: Paul Amer                  Due Date: April 16, 1984      * 
 * Course: C courses                      Revised:  August 23, 1986     * 
 ************************************************************************ 
1. Purpose:
   This program serves two purposes.  First, it is a general example of 
   how to properly document and structure a C program.  Second, it finds 
   the root of a function when it is given the function, an interval 
   to search, and the accuracy with which the root is to be found.       

   Four conditions must be satisfied for the program to work:          
   1) The function, f,  must be a polynomial of degree 10 or less,
   2) The interval contains exactly one real root,
   3) f(lower) and f(upper) have opposite signs where lower is the lower      
      endpoint and upper is the upper endpoint of the interval.
   4) User must input correct data.

   Three methods are used to numerically approximate the root in the
   given interval: (1) bisection (2) secant (3) Newton's. Each method is
   explained in detail in its associated procedure below.
 
2. Paul Amer,  August 1984

3. Inputs:
   User is prompted for a polynomial, its degree, upper and lower initial
   interval endpoints, and an epsilon to control the accuracy of computation.

4. Outputs:
   The program prints the root of the polynomial as computed by three
   independent methods.  Also, the number of iterations necessary to
   derive each solution is printed.

5. Procedures/Functions called: bisection, secant, Newton, Horner

6. Procedures/Functions calling: none

7. Local Variables:
   i       -- dummy loop counter for work with the polynomial
   epsilon -- amount of error allowed when finding the root
   lower   -- lower endpoint of the interval being searched
   upper   -- upper endpoint of the interval being searched
   poly_coeffs -- coefficients of the polynomial
   degree  -- degree of the user's polynomial

8. Global Variables (and how modified in this module): None

9. Global Constants:
   MAXDEGREE -- the maximum degree for any polynomial input
   RSTARS    -- used for ease of reading when printing out roots
   STARS     -- used for ease of reading when printing out the polynomial
   TAB       -- equal to four spaces, used to separate output

10. Bugs:  this program has never been compiled and executed!
********************************************************************/

#include <stdio.h>

#define MAXDEGREE 10
#define RSTARS "**************************************************************"
#define STARS "***************************************************"
#define TAB "    "

/******************************************************************
*                                                                 *
*                       MAIN PROGRAM                              *
*                                                                 *
******************************************************************/
main( )
   {
   int i, degree;
   float epsilon, lower, upper, poly_coeffs[MAXDEGREE+1];
   float Horner( );

   printf( "Input the degree of the polynomial:\n" );
   scanf( "%d", &degree );

   /* check if degree of polynomial to be solved is out of range */
   while ((degree <= 0) || (degree > MAXDEGREE))
      {
      if ( degree < 0 )
         printf("The only polynomial with negative degree is 0.\n");
      else if (degree == 0)
              printf("Constant polynomials do not have roots.\n");
           else
              printf("Polynomial too large for program.\n");

      printf("Reenter degree between 1 and %d inclusive:\n", MAXDEGREE);
      scanf("%d", &degree);
      }
 
   /* read coefficients of polynomial */
   for (i=0; i<=degree; i++)
      {
      printf("Enter the coefficient of x^%d term:\n", i);
      scanf("%f", &poly_coeffs[i]);
      }
 
   printf("Input the lower and upper bounds of the region to be searched:\n");
   scanf("%f %f", &lower, &upper);
   printf("Input the amount of error allowable when finding the root:\n");
   scanf("%f", &epsilon);
 
   /* output the polynomial */
   printf("\n");
   for (i=degree; i>=0; i--)
      {
      printf("%s\n\n", STARS );
      printf("*%sCoefficient of x^%d term%s*%s%8.4f%s*\n\n",
         TAB, i, TAB, TAB, poly_coeffs[i], TAB);
      }
   printf("%s\n", STARS);

   /* check if either endpoint is a root, if not, compute roots in three ways*/
   if (Horner(poly_coeffs, lower, degree) == 0.0)
      printf("Root is exactly: %f\n", lower);
   else if (Horner(poly_coeffs, upper, degree) == 0.0)
           printf("Root is exactly: %f\n", upper);
        else
           {
           Newton(poly_coeffs, lower, upper, epsilon, degree);
           secant(poly_coeffs, lower, upper, epsilon, degree);
           bisection(poly_coeffs, lower, upper, epsilon, degree);
           }
   }
 
/***************************************************************************
*                                                                          *
*                       procedure bisection                                *
*                                                                          *
****************************************************************************
1. Purpose
  Procedure bisection numerically approximates the root of a polynomial  
  using the "bisection method".  This method uses the following recursive
  formula:
      mid = ( lower + upper ) / 2;
      if ( sign( f( lower ) ) == sign( f( mid ) ) )
          lower = mid;
      else
          upper = mid;
  This means that an interval [lower,upper] is halved after each
  iteration.  If the function evaluated at the half way point has the same
  sign as the function at the lower bound, the root must lie within the
  interval [mid, upper].  Otherwise the root lies within the interval
  [lower, mid].  The variables lower and upper are reassigned to reflect
  this new knowledge and the process is repeated until the difference
  between the upper and lower bounds becomes less than a certain error.
  
2. Paul Amer, August 1984

3. Input Parameters:
      fofx  -- array of coefficients of a polynomial
      lower -- lower endpoint of interval containing the root
      upper -- upper endpoint of interval containing the root
      epsilon -- controls accuracy of solution by bounding the minimum
                 difference between upper and lower to be tested    
      degree -- highest degree of all terms in the polynomial

4. Returned Values: None

5. Procedures/Functions called: Horner, sign

6. Procedures/Functions calling: main

7. Local Variables:
     iterations -- the number of times the bisection is performed
     bisect_root -- the root as computed by the bisection method
     mid    -- the midpoint of the interval [lower, upper]

8. Global Variables Used (and how modified):  None
***************************************************************************/
bisection(fofx, lower, upper, epsilon, degree)
float fofx[ MAXDEGREE + 1 ], lower, upper, epsilon;
int degree;
   {
   float Horner( );   /* defined later in the file and does not return int */
   float mid, bisect_root;
   int iterations = 0;

   /* perform bisection until within tolerance of answer */
   while ( upper - lower > epsilon )
      {
      mid = ( lower + upper ) / 2;

      /* check signs of function values using sign function */
      if (sign(Horner(fofx, mid, degree)) == sign(Horner(fofx, lower, degree)))
         lower = mid;
      else
         upper = mid;

      /* increment number of times bisection is performed */
      iterations++;
      }

   /* assign and output answer */
   bisect_root = ( lower + upper ) / 2;
   printf( "\n%s\n\n", RSTARS );
   printf( "*%sBISECTION       - root is approximately:%s%8.4f%s*\n",
      TAB, TAB, bisect_root, TAB );
   printf( "*%sBISECTION       - number of iterations:%s%d%s*\n",
      TAB, TAB, iterations, TAB );
   printf( "\n%s\n", RSTARS );
   }

/******************************************************************
*                                                                 *
*                       procedure derivative                      *
*                                                                 *
*******************************************************************
1. Purpose:
   Procedure derivative computes the derivative of a polynomial

2. Paul Amer, August 1984

3. Input Parameters:
    fofx  -- array of coefficients of a polynomial
    degree -- highest degree of all terms in the polynomial

4. Returned Values: 
    deriv_coeffs   -- array which stores the coefficients of the derivative
             of the polynomial whose coefficients were passed in fofx.

5. Procedures/Functions Called: None
   
6. Procedures/Functions Calling: Newton

7. Local Variables:
     i -- loop counter representing the exponent of a term

8. Global Variables Used (and how modified): None
******************************************************************/
derivative(fofx, degree, deriv_coeffs)
float fofx[MAXDEGREE+1], deriv_coeffs[MAXDEGREE+1];
int degree;
   {
   int i;

   /* clear out array of returned values */
   for (i=0; i<=MAXDEGREE; i++)
      deriv_coeffs[i] = 0.0;

   /* transfer the coefficients of original polynomial into deriv_coeffs */
   for (i=0; i<=degree; i++)
      deriv_coeffs[i] = fofx[i];

   /* multiply coefficients by powers */
   for (i=1; i<=degree; i++)
      deriv_coeffs[i] = i * deriv_coeffs[i];

   /* move coefficients to power-1 */
   for (i=0; i<degree; i++)
      deriv_coeffs[i] = deriv_coeffs[i+1];

   /* zero out the "leading" term */
   deriv_coeffs[degree] = 0.0;
   }

/******************************************************************
*                                                                 *
*                       function Horner                           *
*                                                                 *
*******************************************************************
1. Purpose:
   Function Horner evaluates a polynomial at a given point using
   Horner's rule for polynomial evaluation (synthetic division)

2. Paul Amer, August 1984

3. Input Parameters:
      fofx  -- array of coefficients of a polynomial
      degree -- highest degree of all terms in the polynomial
      pointx -- point at which the polynomial is to be evaluated
  
4. Returned Value:
     The value of the supplied polynomial evaluated at the supplied point

5. Procedures/Functions Called: None

6. Procedures/Functions Calling: Newton, main, secant, bisection

7. Local Variables:
     poly_at_pointx -- used to compute the sum of terms of the polynomial at 
                  the given point x 
     i    -- a dummy loop counter for multiplications
   
8. Global Variables Used (and how modified): None
******************************************************************/
float Horner(fofx, pointx, degree)
float fofx[MAXDEGREE+1], pointx;
int degree;
   {
   float poly_at_pointx;
   int i;

   poly_at_pointx = fofx[degree];

   /* iteratively compute all terms of the polynomial evaluated at x */
   for (i=degree-1; i>=0; i--) 
      poly_at_pointx = (pointx * poly_at_pointx) + fofx[i];

   return(poly_at_pointx);
   }

/******************************************************************
*                                                                 *
*                       procedure Newton                          *
*                                                                 *
*******************************************************************
1. Purpose:
   Newton's method is an improvement on the secant method.  It
   uses the tangent instead of the secant so it converges faster.
   Newton numerically approximates the root of a polynomial using 
   "Newton's method".  This method uses the following recursive formula:
        x    =  x    -    f(x   ) / f'(x   )
         k       k-1         k-1        k-1
  where f' represents the instantaneous derivative at a point on
  the curve and x  represents the  k th approximation of the root.
                 k
 
2. Paul Amer, August 1984

3. Input Parameters:
      fofx  -- array of coefficients of a polynomial
      lower -- lower endpoint of interval containing the root
      upper -- upper endpoint of interval containing the root
      epsilon -- controls accuracy of solution by bounding the minimum
                 difference between upper and lower to be tested    
      degree -- highest degree of all terms in the polynomial

4. Returned values:  None
 
5. Procedures/Functions Called: derivative, fabs, Horner

6. Procedures/Functions Calling: main

7. Local Variables:
     last       the last (x   ) approximation of the root             
                           k-2                                      
     present    the current (x   ) approximation of the root            
                              k-1
     next       the next (x ) approximation of the root           
                           k                                      
     newton_root  -- root of the polynomial computed by "Newton's" method
     iterations     the number of iterations (k) needed to find the root  
     deriv_coeffs  -- coefficients of polynomial's derivative

8. Global Variables Used (and how modified): None
******************************************************************/
Newton(fofx, lower, upper, epsilon, degree)
float fofx[ MAXDEGREE + 1 ], lower, upper, epsilon;
int degree;
   {
   int iterations = 0;
   float next, present, last, newton_root, deriv_coeffs[MAXDEGREE+1];
   double fabs( );   /* abs is for ints only; fabs handles floating point */

   /* start from the lower endpoint and the midpoint of the interval */
   last = lower;
   present = (upper + lower)/2;

   /* compute derivative of function */
   derivative(fofx, degree, deriv_coeffs);

   /* while successive approximations differ by > tolerance use Newton's method */
   while (fabs(present - last) > epsilon)
      {
      next = present - (Horner(fofx, present, degree)
                           / Horner(deriv_coeffs, present, degree));

      /* prepare for next iteration */
      last = present;
      present = next;

      /* increment number of times Newton's method is thus far performed */
      iterations++;
      }

   /* assign and output answer */
   newton_root = present;
   printf("\n%s\n\n", RSTARS);
   printf("*%sNEWTON'S METHOD - root is approximately:%s%8.4f%s*\n",
      TAB, TAB, newton_root, TAB);
   printf("*%sNEWTON'S METHOD - number of iterations:%s%d%s*\n",
      TAB, TAB, iterations, TAB);
   printf("\n%s\n", RSTARS);
   }

/******************************************************************************
*                                                                             *
*                       procedure secant                                      *
*                                                                             *
*******************************************************************************
1. Purpose:
   Procedure secant numerically approximates the root of a polynomial using
   the "secant method". This method uses the following recursive formula:
        x[k]  =  x[k-1]  -  f(x[k-1]) / m

   where

        m  =  ( f(x[k-1]) - f(x[k-2]) ) / ( x[k-1] - x[k-2] )

   which is the slope of the curve over the interval containing the
   points x[k-1] and x[k-2], the (k-1)st and (k-2)nd approximations of
   the root.  m is the slope of the secant.

2. Paul Amer, August 1984

3. Input Parameters:
     fofx  -- array of coefficients of a polynomial
     lower -- lower endpoint of interval containing the root
     upper -- upper endpoint of interval containing the root
     epsilon -- controls accuracy of solution by bounding the minimum
                difference between upper and lower to be tested    
     degree -- highest degree of all terms in the polynomial

4. Returned Values: None

5. Procedures/Functions Called:  Horner, fabs

6. Procedures/Functions Calling:  main

7. Local Variables:
     last       the last (x   ) approximation of the root
                           k-2
     present    the current (x   ) approximation of the root
                              k-1
     next       the next (x ) approximation of the root
                           k
     iterations  -- number of iterations (k) needed to find the root
     secant_root  -- a root of the polynomial as found by "secant" method

8. Global Variables Used (and how modified): None
******************************************************************************/
secant(fofx, lower, upper, epsilon, degree)
float fofx[MAXDEGREE+1], lower, upper, epsilon;
int degree;
   {
   float Horner( );
   double fabs( );   /* abs is for ints only; fabs handles floating point */
   float last, next, present, secant_root;
   int iterations = 0;

   /* make initial approximations the endpoints of the interval */
   last = lower;
   present = upper;

   /* use secant method while approximation is not accurate enough */
   while (fabs(present - last) > epsilon)
      /* calculate next approximation of root */
      {
      next = present - (Horner(fofx, present, degree) /
         ((Horner(fofx, present, degree) - Horner(fofx, last, degree)) /
            (present - last)));
      last = present;
      present = next;
      iterations++;
      }

   /* assign and output answer */
   secant_root = present;
   printf("\n%s\n\n", RSTARS);
   printf("*%sSECANT   METHOD - root is approximately:%s%8.4f%s*\n",
      TAB, TAB, secant_root, TAB);
   printf("*%sSECANT   METHOD - number of iterations :%s%d%s*\n",
      TAB, TAB, iterations, TAB);
   printf("\n%s\n", RSTARS);
   }

/******************************************************************
*                                                                 *
*                       function sign                             *
*                                                                 *
*******************************************************************
1. Purpose:
   Function sign tests whether or not a supplied floating point
   value is positive (0 is considered positive) 

2. Paul Amer, August 1984
 
3. Input Parameters:
     number -- a floating point value to be tested
 
4. Returned Values:
     function is 1 if number is positive (>=0)
                 0 if number is negative

5. Procedures/Functions Called: None
6. Procedures/Functions Calling: bisection
7. Local Variables: None
8. Global Variables Used (and how modified): None
*********************************************************************/
sign(number)
float number;
   {
   if (number >= 0.0)
      return(1);
   else
      return(0);
   }

djones@megatest.UUCP (Dave Jones) (10/13/87)

You asked for comments.

This was one of the best documents I have seen on C programming standards.
I would like to add just a couple of suggestions.

1. Emphasize the documentation of variables and data-structures rather
than algorithms.  With few exceptions, algorithms should be so simple
as to require no comment.  If you are writing algorithms which require
documentation, you probably should simplify them.  There are exceptions
of course.  If you have taken an algorithm from a book, give a reference.

I have had a great deal of practice (and a great deal of frustration)
maintaining old code.  The trick to figuring out what the code does is
to figure out what the data represent.  In the parlance, "What are the
invariant relations?"  That is to say, what relationships between the
various data must be maintained, except in the "critical regions" which
manipulate the data?  If you know what the data represent, you know
what it is that the code MUST do (or SHOULD do), regardless of how
obscurely the original programmer coded the algorithms.

Internal documentation of algorithms is usually just distracting.
"Update the lset counter and remove entry from the exception table"
means absolutely nothing if you don't know what the lset counter represents,
or when entries belong in the exception table.  If you know those things,
the comment is extraneous.

Requiring that the variables be described in a header introduces the
possibility of the header and the code getting out of synch, and
creates busy-work.  Require instead that the variables be documented where 
they are declared.
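
To illustrate documenting the data rather than the algorithm, here is a minimal sketch (not code from this thread) of a small queue whose invariant relations are stated where the fields are declared. The names int_queue, queue_put, queue_get and QUEUE_CAPACITY are hypothetical. The routines themselves carry almost no comments, because the invariants already say what they must do.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAPACITY 8   /* hypothetical size, chosen for illustration */

/*
 * A fixed-size FIFO queue of ints.
 * Invariants (hold whenever no queue_* routine is executing):
 *   count <= QUEUE_CAPACITY
 *   head  <  QUEUE_CAPACITY
 *   the i-th oldest element is items[(head + i) % QUEUE_CAPACITY]
 *   for 0 <= i < count
 */
struct int_queue
   {
   int items[QUEUE_CAPACITY];   /* storage; only count slots are live */
   size_t head;                 /* index of the oldest live element */
   size_t count;                /* number of live elements */
   };

void queue_init(struct int_queue *queue)
   {
   queue->head = 0;
   queue->count = 0;
   }

/* Returns 1 on success, 0 if the queue is full. */
int queue_put(struct int_queue *queue, int value)
   {
   assert(queue->count <= QUEUE_CAPACITY && queue->head < QUEUE_CAPACITY);
   if (queue->count == QUEUE_CAPACITY)
      return 0;
   queue->items[(queue->head + queue->count) % QUEUE_CAPACITY] = value;
   queue->count++;
   return 1;
   }

/* Returns 1 and stores the oldest element in *value, or 0 if empty. */
int queue_get(struct int_queue *queue, int *value)
   {
   assert(queue->count <= QUEUE_CAPACITY && queue->head < QUEUE_CAPACITY);
   if (queue->count == 0)
      return 0;
   *value = queue->items[queue->head];
   queue->head = (queue->head + 1) % QUEUE_CAPACITY;
   queue->count--;
   return 1;
   }
```

The assert calls restate the invariants at the entry to each routine, so a violation shows up at the point where the data first stops meaning what the declaration says it means.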

2. Alphabetical organization makes no sense at all.  Organization by
form is little better.  One constant head-ache in the UNIX environment
is caused by the fact that .h files must be separated from the
object files which they represent.  Keep related files together.  
I recommend that the source code be organized by data-structure.
For each related set of data-structure definitions, have two .h files:
one which is used by the routines which access the data-structures directly,
another for the routines which will access them only indirectly by means
of the first group of routines.  Put all the direct access routines in
one file.  If that file grows too large for efficient compilation,
put the routines in separate files in their own directory, or give them
names which clearly indicate that they (and only they) belong to the
direct-access .h file.  Example naming convention:

	hash_table.h   /* .h file describing routines which implement 
	                *  hash-tables, and opaque handles for hash-tables
		        */
	hash_table.i.h  /* .h used by the hash-table routines (only) */
	hash_table.create.c  /* One source file */
	hash_table.update.c  /* another. */
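
A rough sketch of how that split might look, collapsed into one listing so it can be compiled as is. The banner comments mark which file each piece would live in. Only the file names come from the convention above; the function names, the fixed capacity HASH_SLOTS, and the int-to-int table are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- would live in hash_table.h: all that client code sees ---- */
typedef struct hash_table hash_table;      /* opaque handle */
hash_table *hash_table_create(void);
int hash_table_update(hash_table *table, int key, int value);
int hash_table_lookup(const hash_table *table, int key, int *value);
void hash_table_destroy(hash_table *table);

/* ---- would live in hash_table.i.h: direct-access routines only ---- */
#define HASH_SLOTS 16                      /* hypothetical fixed capacity */
struct hash_slot
   {
   int in_use;                             /* nonzero once the slot holds a key */
   int key;
   int value;
   };
struct hash_table
   {
   struct hash_slot slots[HASH_SLOTS];     /* open addressing, linear probing */
   };

/* ---- would live in hash_table.create.c ---- */
hash_table *hash_table_create(void)
   {
   return calloc(1, sizeof(hash_table));   /* every slot starts unused */
   }

void hash_table_destroy(hash_table *table)
   {
   free(table);
   }

/* ---- would live in hash_table.update.c ---- */
/* Index of the slot holding key, else of the first free slot; -1 if full. */
static int find_slot(const hash_table *table, int key)
   {
   unsigned int start = (unsigned int)key % HASH_SLOTS;
   unsigned int probe;

   for (probe = 0; probe < HASH_SLOTS; probe++)
      {
      unsigned int index = (start + probe) % HASH_SLOTS;

      if (!table->slots[index].in_use || table->slots[index].key == key)
         return (int)index;
      }
   return -1;
   }

/* Insert or overwrite; returns 1 on success, 0 if the table is full. */
int hash_table_update(hash_table *table, int key, int value)
   {
   int index;

   assert(table != NULL);
   index = find_slot(table, key);
   if (index < 0)
      return 0;
   table->slots[index].in_use = 1;
   table->slots[index].key = key;
   table->slots[index].value = value;
   return 1;
   }

/* Returns 1 and stores the value if key is present, else 0. */
int hash_table_lookup(const hash_table *table, int key, int *value)
   {
   int index;

   assert(table != NULL);
   index = find_slot(table, key);
   if (index < 0 || !table->slots[index].in_use)
      return 0;
   *value = table->slots[index].value;
   return 1;
   }
```

Client code that includes only the first part can hold and pass a hash_table pointer but cannot touch the slots, so the representation can change without recompiling anything outside the direct-access group.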

msb@sq.UUCP (10/16/87)

I mailed a long response to the person who posted that "Coding Standard
for C Programs", but I think it may be worthwhile to also post a summary
of my major points.

1.  It isn't a coding standard for C programs.  It's a coding standard for
    modular programs in general, partly adapted for C.  For instance, it
    doesn't specify anything about how to align enum or struct declarations.
    (The Indian Hill Style Sheet does.)  Little is said about types.

2.  The examples show several signs that they were not written by a C
    programmer; for instance, using a while-loop instead of

	for (iterations = 0; upper - lower > epsilon; iterations++)

    and the use of scanf() as if it read an input *line*.

3.  The examples also show bad modularization, as the same function is
    used to compute and print a value, and unnecessary complexities,
    like assigning a variable to another with a different name because
    something else is going to be done to it now.

    The last function, with 31 lines of comment and code, should have been:

	#define sign(x) ((x) >= 0)

4.  The fill-in-the-blank header comments are hard to read and hard to
    maintain, especially the "called by" entry.  Some other rules are
    similarly well-intentioned-but-too-strict; for instance, there are
    other stereotyped uses that call for 1-letter variable names besides
    loop indexing.

5.  Despite these and other points, such a standard is still better than
    no standard, especially in a student environment where people may have
    to be weaned off all-variables-1-letter and goto-spaghetti.  But I would
    recommend The Elements of Programming Style above all.
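
Points 2 and 3 can be sketched together. What follows is one possible reading, not a rewrite endorsed by anyone in the thread: the bisection loop is written as the for-loop from point 2, the routine only computes and leaves printing to its caller, sign is the one-line macro from point 3, and input is read a whole line at a time with fgets before being parsed. The names horner, bisect and read_int_line and the 128-byte line buffer are assumptions.

```c
#include <assert.h>
#include <stdio.h>

#define sign(x) ((x) >= 0)   /* the one-line macro from point 3 */

/* Evaluate a polynomial by Horner's rule; coeffs[i] multiplies x^i. */
double horner(const double coeffs[], int degree, double x)
   {
   double value = coeffs[degree];
   int i;

   for (i = degree - 1; i >= 0; i--)
      value = x * value + coeffs[i];
   return value;
   }

/*
 * Compute only: return the midpoint of the final bracket and report the
 * iteration count through *iterations.  Printing is the caller's job.
 * Requires lower < upper, epsilon > 0 (and large enough for the bracket
 * to reach it in floating point), and a sign change on [lower, upper].
 */
double bisect(const double coeffs[], int degree, double lower, double upper,
              double epsilon, int *iterations)
   {
   assert(lower < upper && epsilon > 0.0);
   for (*iterations = 0; upper - lower > epsilon; (*iterations)++)
      {
      double mid = (lower + upper) / 2;

      if (sign(horner(coeffs, degree, mid))
            == sign(horner(coeffs, degree, lower)))
         lower = mid;
      else
         upper = mid;
      }
   return (lower + upper) / 2;
   }

/*
 * Read one whole input line and parse an int from its start, so leftover
 * text on the line cannot leak into the next prompt.  Returns 1 on
 * success, 0 on end of input or if the line does not begin with an int.
 * Lines longer than the buffer are not handled in this sketch.
 */
int read_int_line(FILE *input, int *value)
   {
   char line[128];

   if (fgets(line, sizeof line, input) == NULL)
      return 0;
   return sscanf(line, "%d", value) == 1;
   }
```

A caller would print the result itself, for example with printf("%8.4f after %d iterations\n", root, iterations), which keeps bisect usable from code that wants the root for some other purpose.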

Mark Brader		"It's important not to let the structure of the
Toronto			code determine the functionality of the program ...
utzoo!sq!msb		The desired functionality should be predetermined
msb@sq.com		before the code writing is done."	-- Dave Sill

daveb@geac.UUCP (10/21/87)

  In reference to the discussion of documentation standards arising out of
"Coding Standard for C Programs", I should like to point out that
the proposed voluminous header comments are counter-productive.

Point 1:
  I had the interesting (and unpleasant) experience some years ago
of becoming involved in backing out of a set of mistaken decisions in
a major project[1], and one of the things which
	a) caused the erroneous decisions to be believed, and
	b) caused them to be hard to change,
was an enthusiasm toward documenting a design in full and complete
detail (but without verification of the correctness of the design).

  The header comments used by this project were substantially
identical to those proposed, and did contribute to the problem.
  I propose that such comments are suitable for functions which can be
shown to be complete and correct under verification, and that their
use be restricted to the documentation of complete and correct
functions for purposes of maintenance.
  (Yes, my tongue is in my cheek... I'll make a constructive
suggestion later).


Second Point:
  Repeating the declarations and uses of variables in the header
separates the declaration and description unnecessarily, and breaks
the "write once" rule[2].  It is far better to put the description
as close to the declaration as possible.  It would be even better if
the _uses_ could be close to the declaration, but that can be
approximated by writing small functions, a normal "good practice".

Proposal:
  David Gries[3] suggests that one document the preconditions for
successful completion of a function and its result.  This is
consistent with what I've seen in "well-written" C programs, so I'll
throw this out for consideration:


/*
 * strchr -- find a character in a string.  Returns a pointer
 *	to the character or NULL. Requires a pointer to a
 *	null-terminated string and a character in the range 0
 *	to 0xFF. Behavior out of these ranges is undefined.
 */
 char *
strchr(p,c) 
	register char *p; /* The string to be searched. */
	register int c;  /* The thing to search for. */
{
	/* Precondition: */
	while (condition) {
		loop
	}
	/* Postcondition: */
	return p;
}

 --dave (note the **brevity** of the header comment) c-b
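
For comparison, here is one way the skeleton above might be filled in. It
is a sketch only: it uses prototype-style parameters rather than the
older declaration style shown, and it is named find_char (a made-up name)
so that it does not collide with the library's strchr. The loop condition
and the wording of the pre- and postconditions are assumptions.

```c
#include <stddef.h>

/*
 * find_char -- find a character in a string.  Returns a pointer
 *	to the first occurrence or NULL.  Requires a pointer to a
 *	null-terminated string and a character in the range 0
 *	to 0xFF.  Behavior out of these ranges is undefined.
 */
char *
find_char(char *p, int c)
{
	/* Precondition: p points to a null-terminated string. */
	while (*p != '\0' && (unsigned char)*p != c) {
		p++;
	}
	/* Postcondition: *p is c, or p is at the terminator. */
	return ((unsigned char)*p == c) ? p : NULL;
}
```

Searching for 0 returns a pointer to the terminator, which is also what
the library strchr does.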

[1] Hinted at in "My Accordian's Stuffed Full of Paper", ACM
    SIGSOFT Software Engineering News, July 1984, p 58ff.

[2] A practice which stems from the MIT/Bell/GE Multics project:
    Have only one copy of a datum, and refer to it from wherever it
    is needed, preferably via a published interface.  For example,
    sys_info_$max_seg_size is a size constraint. The rule can also
    be applied to documentation... 

[3] Gries, David, "The Science of Programming", Springer-Verlag (I
    forget the date).
-- 
 David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind)
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.

chip@ateng.UUCP (Chip Salzenberg) (10/22/87)

I can't resist...

In article <1987Oct15.212022.20550@sq.uucp> msb@sq.UUCP (Mark Brader) writes:
>
>2.  The examples show several signs that they were not written by a C
>    programmer; for instance, using a while-loop instead of
>
>       for (iterations = 0; upper - lower > epsilon; iterations++)

It is _not_ true that a "C programmer" will use a "for" loop whenever
possible.  For example, I consider myself a fairly experienced C
programmer, and my personal rule is that only variables involved in loop
_control_ should appear in the for statement.
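
As a sketch of how this rule might apply to the loop quoted above (the
function name halvings and the loop body are assumptions made for
illustration): upper, lower, and epsilon decide when the loop ends, so
they stay in the loop test, while iterations is only bookkeeping and
moves into the body.

```c
/*
 * halvings -- count how many times [lower, upper] must be halved
 *	before it is no wider than epsilon.  Requires epsilon > 0.
 */
int
halvings(double lower, double upper, double epsilon)
{
	int iterations = 0;	/* bookkeeping, not loop control */

	while (upper - lower > epsilon) {
		upper = lower + (upper - lower) / 2.0;
		iterations++;
	}
	return iterations;
}
```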

-- 
Chip Salzenberg         "chip@ateng.UUCP"  or  "{uunet,usfvax2}!ateng!chip"
A.T. Engineering        My employer's opinions are not mine, but these are.
   "Gentlemen, your work today has been outstanding.  I intend to recommend
   you all for promotion -- in whatever fleet we end up serving."   - JTK

tom@vax1.acs.udel.EDU (10/22/87)

From:  SAVILLE <evh@vax1.acs.udel.edu>
Subject:  documentation standards........

> Here are some standards developed at the univ. of del. for C code,
> and program design/'style'. I don't agree with everything here
> (as some of you probably don't either).

This standard was actually for general programming; this is the 'C'
version. there is also a nearly identical 'Pascal' standard around 
here. This standard is not well suited to the way *I think* C code
should be written and documented. Nor, I might add, is it an accepted
standard around here. Dr. Amer (the author) insists upon it in all
of his classes, as do a few instructors who learned C from him, but
most code is written/doc'ed according to no particular standard.

I think that coding standards are a good thing as long as
they are well adapted to the language, and not too restrictive, i.e.
i'm much more likely to accept a set of guidelines than a 15+ page
formal specification of how every facet of a program WILL be done.

We have a number of "AmerDoc" programs, scripts, emacs-fns etc. written
by people who over the years decided that they were not going to deal
with some of the more insidious busy-work aspects of this standard.

For example:

> Comments serve two main purposes:
>     [...]
> 
>     3.  Input Parameters:  names and descriptions of parameters passed
>         to the procedure
>     4.  Returned Values:  for a procedure, names and  descriptions  of
>         affected  parameters; for a function, in addition, an explana-
>         tion of the single returned value
>     5.  Procedures/Functions Called: names of procedures and functions
>         that  are called from the module (with optional descriptions).
>         System utilities such as printf, getchar need not be document-
>         ed
>     6.  Procedures/Functions Calling:  names of procedures  and  func-
>         tions that call this module
>     7.  Local Variables: names and descriptions of every  local  vari-
>         able
>     8.  Global Variables Used (and how modified):  names of all global
>         variables used and how the module modifies them
>     9.  Global Constants: names and description of globals [This  sec-
>         tion would be found in the main program only.]

None of this is needed in a header comment for a function. It can be
produced on demand if it is ever desired by any half decent cross
reference program. Data structures, vars and things should of course be
adequately documented where they are declared and used.

Some of these items, esp. #6, are impossible to satisfy for any general
purpose code. Imagine trying to write some library fn, say 'printf()',
to conform to this standard. After all, isn't one of the goals of a
standard to encourage modularity and generality, rather than coding many
things over and over because their design was too problem-specific?

I agree with most of the other guidelines about documenting as long as
nothing but purpose/description, author, history & known bugs goes in a
header comment. If one wants to comment a piece of C code excessively,
why not create a foo.readme file rather than making it hard to find the
code in foo.c?

Here are some other improvements I would suggest for this standard:

1. Delete section on indenting code.

Almost everyone who has written more than 10 programs has their own
indentation style that they think is better than anyone else's. As long
as it is consistent and readable it doesn't really matter. Any group
working on the same project should indent the same way. If you don't
like the way someone indents you can always feed their code to a
pretty printer before you look at it. Besides, his indentation style
looks ugly :-)

2. This section on initialization is NOT for C and is not a good idea
   for any language which allows compile time initialization.

> Except for static local variables, put all initialization statements
> in a procedure entitled "initialize" and call it at the beginning of
> the main program.

This is a foolish thing to do since all externals and statics can be
initialized "for free" to nearly anything desired at compile time,
(cf. constant expressions) but an 'initialize()' proc would generate
code and take time to execute. Automatic variables could of course be
handled that way but it seems clearer (and less expensive) to me to
assign them values right where they are declared. Alter the above lines
to read:
   "Group all initialization statements together into one clearly
labelled place. For externals this should be either the beginning of
the main program file or a file with a name like "init.c". For locals
this should be the beginning of the block where they are declared."
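
A short sketch of the difference, with made-up names (table_size, banner,
limits, runtime_size, sum_limits) chosen only for illustration. Whether
the compile-time form really costs nothing at startup depends on the
compiler and loader, so treat the comments as the usual case rather than
a guarantee.

```c
/*
 * Initialized at compile time from constant expressions.  On most
 * systems no statements are executed to set these values.
 */
int table_size = 8 * 8;
const char *banner = "ready";
int limits[3] = { 10, 20, 30 };

/* The alternative the standard asks for: assignment at run time. */
int runtime_size;		/* external, so it starts out as 0 */

void
initialize(void)
{
	runtime_size = 8 * 8;	/* generates code and takes time */
}

/* Automatic variables get their values where they are declared. */
int
sum_limits(void)
{
	int i = 0;
	int total = 0;

	while (i < 3)
		total += limits[i++];
	return total;
}
```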

3. Include more guidelines on using:

types

scope of variables (there are 3 types: global, file, and block)

cpp & header files.
    Only preprocessor directives, and declarations (not definitions)
    should appear in ".h" files. Never executable code. Fully
    parenthesize all #define constants and macros. If you say

    #define FOO 	sizeof(int) + sizeof(char)

    Someone will inevitably say "4 * FOO" and then hate you because
    they got 17 as the value instead of 20 (with 4-byte ints).
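
The precedence problem is easy to reproduce. In this sketch FOO_BAD
repeats the definition above and FOO_GOOD is the fully parenthesized
form; both names and the two helper functions are made up for the
illustration.

```c
#include <stddef.h>

#define FOO_BAD		sizeof(int) + sizeof(char)
#define FOO_GOOD	(sizeof(int) + sizeof(char))

/* Expands to 4 * sizeof(int) + sizeof(char). */
size_t
four_foo_bad(void)
{
	return 4 * FOO_BAD;
}

/* Expands to 4 * (sizeof(int) + sizeof(char)). */
size_t
four_foo_good(void)
{
	return 4 * FOO_GOOD;
}
```

On a machine where sizeof(int) is 4, four_foo_bad() returns 17 and
four_foo_good() returns 20.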

file and/or module organization in general.
    (His example leads one to believe that programs should be one
    monolithic file.) I prefer a separate file for every function unless
    you have several that clearly belong together in one "module" (see
    K&R p.77 and section 4.6 for the classic stack module example).
    This gives you the maximum benefit from 'make' and lets even dumb
    linkers produce the smallest binaries possible. If you have 'make'
    you should take advantage of it, and if you don't have make you
    should get it. At times the 1 fn per file approach may seem a bit
    extreme but it certainly encourages modularity.

4. Get a real example.
The current one is a thinly disguised Pascal program.

						tom

gardner@prls.UUCP (Robert Gardner) (10/23/87)

In article <9897@brl-adm.ARPA> tom@vax1.acs.udel.EDU (Tom Uffner) writes:
>all externals and statics can be
>initialized "for free" to nearly anything desired at compile time,
>(cf. constant expressions) but an 'initialize()' proc would generate
>code and take time to execute.

This isn't necessarily true and depends on the details of the linker
and loader (and compiler implementation). One compiler I worked with
allocates space for global variables dynamically before calling main(),
zeroes it, and then executes a bunch of initialization statements that
move the initialization values into the proper places, then calls main().
There is approx. one move for each initializer, so each initializer
takes about twice as much space as you would expect. On the other hand,
this is a net gain if there are only a few initializers compared to the
number of global variables, since allocating space for all the global
variables in the executable (as RSX does) can take more space than the
initialization code if there are a lot of globals.

I prefer compile-time initialization when possible, but it's not
necessarily "for free".

Robert Gardner

peter@sugar.UUCP (Peter da Silva) (10/26/87)

Indents should be one tab-stop. That way you can set your editor and your
"make print" command line to set the indentation depth your way, and some
other person who might be using your code can set them his, her, or its way.

I myself like changing the tab-size in my editor depending on how heavily
indented the code I'm working on is.

I know I can always run it through "cb", but that tends to mess up the comments
something fierce.

Now whether you do this:

	if(...) {

or this:

	if(...)
	{

or this:

	if(...)
		{

is really irrelevant. It would be nice if you can indent the body of your
functions, though:

foo()
{
	int variable;

	code;

	if(...)
	{
		code;
	}
}

is a bit easier to work with than:

foo()
{
int variable;

code;

if(...)
	{
	code;
	}
}

it may take up a bit more room, but that's what the variable tabs are for.
I always have trouble remembering whether I'm in a function or not.
-- 
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- Disclaimer: These U aren't mere opinions... these are *values*.

daveb@geac.UUCP (10/29/87)

In article <913@sugar.UUCP> peter@sugar.UUCP (Peter da Silva) writes:
>Indents should be one tab-stop. That way you can set your editor and your
>"make print" command line to set the indentation depth your way, and some
>other person who might be using your code can set them his, her, or its way.
>
>I myself like changing the tab-size in my editor depending on how heavily
>indented the code I'm working on is.

  Back in the Multics days, a tab was 10 characters, or 1" on the
standard (printing) terminal then in use.
  This had the pleasant effect of discouraging deeply-indented
nested blocks of code, which in turn tended to discourage long
procedures/functions.

  As a result, I always recommend "broad" tabs for C and Ada (which
can be easily written with a "comb-like" indenting style), and
"narrow" tabs only for Pascal and sh.

 --dave
-- 
 David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind)
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/30/87)

In article <913@sugar.UUCP> peter@sugar.UUCP (Peter da Silva) writes:
>is really irrelevant. It would be nice if you can indent the body of your
>functions, though:

Quite right, discussion of personal stylistic preferences is a waste
of network bandwidth.  On the other hand, there can be practical
reasons for some aspects of coding style.  For example, if identifiers
for function definitions occur against the left margin but most other
text is indented, then it is easy with most text editors to quickly
jump to the beginning of any function whose name is known, or to
grep a set of sources to find out which one defines the function (as
opposed to calling it).

brian@ncrcan.UUCP (11/05/87)

In article <6606@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <913@sugar.UUCP> peter@sugar.UUCP (Peter da Silva) writes:
>>is really irrelevant. It would be nice if you can indent the body of your
>>functions, though:
>
>Quite right, discussion of personal stylistic preferences is a waste
>of network bandwidth.  On the other hand, there can be practical
>reasons for some aspects of coding style.  For example, if identifiers
>for function definitions occur against the left margin but most other
>text is indented, then it is easy with most text editors to quickly
>jump to the beginning of any function whose name is known, or to
>grep a set of sources to find out which one defines the function (as
>opposed to calling it).

Which is why all my functions are declared as follows:

int 
myfunction()
{
}

and not:

int myfunction()
{
}