[comp.lang.c] A question on C programming style

bxpfac@umiami.ir.miami.edu (04/12/91)

How do the gurus on the net feel about the following two styles?

Style 1: (No nested includes - user responsible for proper order of includes).
--------
foo.h
  extern save_data (FILE *fp);

use.c
  #include <stdio.h>    /* Needed because FILE is used in foo.h, so this
                           has to be included before foo.h. */
  #include "foo.h"

Style 2: (Nested inclusion).
--------
foo.h
  #include <stdio.h>     /* We know that this has to be included with this. */
  extern save_data (FILE *fp);

use.c
  #include "foo.h"

  #include <stdio.h>    /* Is now optional and if included, would not be 
                           included twice provided that the <stdio.h> is
                           set up properly. */
   

Bimal / devebw9f@miavax.ir.miami.edu

markh@csd4.csd.uwm.edu (Mark William Hopkins) (04/13/91)

In article <1991Apr12.103621.8907@umiami.ir.miami.edu> devebw9f@miavax.ir.miami.edu writes:
>How do the gurus on the net feel about the following two styles?
(nested includes)

ANSI C doesn't care if you include the same .h file twice or more (as long
as they contain only declarations and not definitions).  The standards-makers
went out of their way to ensure this is so.

Anyway, if you include a.h into b.h, a.h is usually made secure by nesting it
in an #ifndef:

#ifndef A_INCLUDED
#   define A_INCLUDED
... rest of a.h
#endif

<stdio.h> might have something like this in it...

amewalduck@trillium.waterloo.edu (Andrew Walduck) (04/13/91)

In article <1991Apr12.103621.8907@umiami.ir.miami.edu> devebw9f@miavax.ir.miami.edu writes:
>How do the gurus on the net feel about the following two styles?
>
>Style 1: (No nested includes - user responsible for proper order of includes).
>--------
>foo.h
>  extern save_data (FILE *fp);
>
>use.c
>  #include <stdio.h>    /* Needed because FILE is used in foo.h, so this
>                           has to be included before foo.h. */
>  #include "foo.h"
>
>Style 2: (Nested inclusion).
>--------
>foo.h
>  #include <stdio.h>     /* We know that this has to be included with this. */
>  extern save_data (FILE *fp);
>
>use.c
>  #include "foo.h"
>
>  #include <stdio.h>    /* Is now optional and if included, would not be 
>                           included twice provided that the <stdio.h> is
>                           set up properly. */
>   
>
>Bimal / devebw9f@miavax.ir.miami.edu

Well...to put in my two cents (Canadian $), I've just recently started using
the nested form...
Disadvantages:
1. Slower compilation due to multiple references to the same file...
2. Must include references to the embedded includes in the make file
   dependency list....
3. Potential for loops if one file omits the #ifndef/#endif wrapper
   around it.
Advantages:
1. Data hierarchy is better....but harder to maintain, as the maintainer must
   understand where in the tree to insert his new definitions...a clear
   design document (or graph) showing how the datatypes and objects relate
   is a plus.
2. Also if you include a file, it comes "ready to use" as its already included
   whatever definitions you may need.

Myself, I'm beginning to prefer the nested style.  

Andrew Walduck
                    

scs@adam.mit.edu (Steve Summit) (04/13/91)

In article <1991Apr12.103621.8907@umiami.ir.miami.edu> devebw9f@miavax.ir.miami.edu writes:
>Style 1: (No nested includes - user responsible for proper order of includes).
>foo.h
>  extern save_data (FILE *fp);
>
>Style 2: (Nested inclusion).
>foo.h
>  #include <stdio.h>     /* We know that this has to be included with this. */
>  extern save_data (FILE *fp);

[A fine FAQ list question, this.]

Whether nested #included files are good style is, historically,
(like so many of these style questions :-( ) a bit of a religious
question.  (ANSI's new guarantees may have shifted the balance
somewhat.)

When one header file makes use of something (usually a macro or
typedef) defined in another, I feel that a nested #include
directive is a good idea.  The alternative -- requiring the
#includer to #include other header files first -- loses big
points in the information-hiding department, and leads to real
maintenance headaches in practice.  (It's too easy to forget to
pre-#include something else, and the errors that result are
typically non-obvious.  If the requirements of the header file in
question change, all its #includers must be modified.)

Obviously, if header files are to be #included recursively, they
must be "idempotent" (a word which only comes up in this
discussion, meaning something like "can be multiply invoked
safely"), since it is likely that they will end up getting
#included twice.  (In Bimal's second example, the #includer of
foo.h might well #include <stdio.h> anyway.)  The standard
technique for guaranteeing idempotency, which has already been
mentioned, is to "turn off" each include file if it has been
processed already:

foo.h:
	#ifndef FOO_H
	#define FOO_H

	/* contents of foo.h */

	#endif

To cut down on compilation time a bit (by eliminating unnecessary
file opening and namei overhead) some people prefer to do the
protecting in the #includer:

	#ifndef FOO_H_INCLUDED
	#define FOO_H_INCLUDED
	#include "foo.h"
	#endif

This is ugly, as the #ifndef has to be repeated (and the sentinel
macro name agreed upon) in each #includer.

Nested #include files can, in general, be confusing.  Finding out
where something is defined can require traversing a twisty little
maze of #include directives.  (This Gordian knot can be cut
easily, however, with "grep <pattern> *.h" .)  Manual Makefile
maintenance quickly becomes nearly impossible, so an automatic
Makefile generating scheme, which follows the nested #include
directives reliably, is a requirement.

The confusion and Makefile maintenance drawbacks lead some people
to recommend strongly against nested #include files.  (Some
people also feel that the #ifndef/#define technique is an
unacceptably ugly kludge.  As I said, it has been a religious
argument.)  However, I feel that the disadvantages of not using
nested #include files (the loss of information hiding, and the
burdensome requirements placed on each #includer) outweigh the
disadvantages of using them.  (As I've suggested, given the
existence of grep and a good Makefile dependency generator, the
disadvantages of nested #include files largely disappear.  And
the #ifndef/#define technique isn't really ugly, either,
especially if I remember to call it a "technique" and not a
"trick" :-) . )

I've slanted this discussion a bit towards the case when the file
being recursively #included is a "project" file, under the
programmer's control (and probably #included with "" rather than <>).
When the header file to be recursively #included is a standard
header file, as it was in Bimal's example, the options shift
slightly.

The ANSI C Standard guarantees that standard header files may
safely be #included multiple times, so nested #inclusion (of
standard headers) is safe in an ANSI environment.  However, most
pre-ANSI header files do not implement any idempotency, so
programmers must be careful if portability to pre-ANSI systems is
important.

Since the "clean" #ifndef/#define technique is implemented inside
the header file being protected, you can't apply it retroactively
to an older standard header file which lacks it.  (Even if you
have write access to the standard header files on your system,
and don't mind modifying them, you can't assume that a person you
give your code to can.)  However, for <stdio.h> at least, a
variation on the second, deprecated idempotency technique is
viable:

	#ifndef EOF
	#include <stdio.h>
	#endif

It's not at all unusual for a data structure or subroutine
interface defined in a header file to refer to the stdio FILE *
type (exactly as in Bimal's example), so usages like this are
common, and shouldn't be frowned upon.

(It helps if, in top-level .c files, you always #include standard
headers first, followed by "local" ones.  If, on the other hand,
you had things equivalent to

	#include "foo.h"
	#include <stdio.h>

, errors on pre-ANSI systems would be more likely.  It's easy to
protect "#include <stdio.h>" with "#ifndef EOF" inside of foo.h;
it's harder and messier to do so in every .c file.  "It is agreed
that the ice is thin here.")

[Now all I have to do is shrink this discussion down to three
sentences for the FAQ list.]

                                            Steve Summit
                                            scs@adam.mit.edu

torek@elf.ee.lbl.gov (Chris Torek) (04/13/91)

In article <1991Apr13.013911.18151@athena.mit.edu> scs@adam.mit.edu writes:
>Whether nested #included files are good style is, historically,
>(like so many of these style questions :-( ) a bit of a religious
>question.  (ANSI's new guarantees may have shifted the balance
>somewhat.)

Indeed.  One thing for which I argued (and still will argue---it
seems `right' to me), though it is not standard, would tip the balance
all the way:  I believe that `#include' should (always) have been
defined as `read this if you have not already'.

Such a definition would mean that

	#define TABLEENTRY(a, b, c) a
	int atable[] = {
	#include "table"
	};
	#undef TABLEENTRY
	#define TABLEENTRY(a, b, c) c
	int ctable[] = {
	#include "table"
	};
	#undef TABLEENTRY

would fail.  It also leaves open the question of spelling: is

	#include "./table"

`the same' as

	#include "table"

?  All of these would have to have been answered in trial
implementations, which (alas) did not exist when the standard was in
progress, and now no doubt never will.  (My answers would be
`#read'---unconditionally read in a file---and `files are considered
identical only if spelled identically' or `identity of files is
implementation-defined'.)
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov

wirzenius@cc.helsinki.fi (Lars Wirzenius) (04/13/91)

In article <1991Apr12.103621.8907@umiami.ir.miami.edu>, bxpfac@umiami.ir.miami.edu writes:
> How do the gurus on the net feel about the following two styles?

(Uh, more like a bandwidth-waster, I'm afraid...)

> Style 1: (No nested includes - user responsible for proper order of includes).
> Style 2: (Nested inclusion [each header includes everything it needs
> included before it]).

I very much prefer style 1. Style 2 will very likely cause multiple
inclusions, which at best wastes time (the compiler has to process each
include every time), and at worst causes errors, since not all headers
can be included multiple times.  The only exception is an include file
whose only purpose is to include a set of headers that are needed in all
or most of the source files in a given project.

-- 
Lars Wirzenius    wirzenius@cc.helsinki.fi

barnettr@snaggle.rtp.dg.com (Richard Barnette) (04/14/91)

In article <1991Apr12.103621.8907@umiami.ir.miami.edu> devebw9f@miavax.ir.miami.edu writes:
>How do the gurus on the net feel about the following two styles?
>
>Style 1: (No nested includes - user responsible for proper order of includes).
    I tend not to prefer this style.  Forcing the user of header files
to be aware of other header dependencies drastically complicates the
programming process.
    Things get further complicated when the dependencies change
(and they almost always do).  Then some poor joe (probably not the
one who wrote the original code) has to find all the places where the
header was included, and add more #include's to handle the new
dependency.  Or else you forget to delete extra #include's when
a dependency is dropped.
    There are some advantages with this style (see below).

>Style 2: (Nested inclusion).
    My preference.  Note however that care should be taken not
to include a header twice.  This can produce warnings and/or errors
depending on the kinds of things in the header.  It's wise to make
all your #include'd headers look something like this:

#ifndef foo
#define foo
/* contents of file "foo.h" */
#endif

    Unfortunately, some systems don't do something like this
with their standard header files.  A #include <stdio.h> inside
your header file may look innocuous enough, but if the header
is included by a file with its own #include <stdio.h>, disaster
may strike.

Richard Barnette      | Data General Corporation | obligatory (in)famous quote:
(919) 248-6225        | RTP, NC  27709	         |  You wascal wabbit!
Commercial Languages  | 62 T.W. Alexander Drive	 |  Wandering wizards won't
barnettr@dg-rtp.dg.com				 |  win! - /usr/lib/sendmail

scott@bbxsda.UUCP (Scott Amspoker) (04/14/91)

In article <12060@dog.ee.lbl.gov> torek@elf.ee.lbl.gov (Chris Torek) writes:
>Indeed.  One thing, for which I argued (and still will argue---it
>seems `right' to me) which is not standard, would tip the balance
>all the way:  I believe that `#include' should (always) have been
>defined as `read this if you have not already'.

Turbo C has an option to do exactly that.

-- 
Scott Amspoker                       | Touch the peripheral convex of every
Basis International, Albuquerque, NM | kind, then various kinds of blaming
(505) 345-5232                       | sound can be sent forth.
unmvax.cs.unm.edu!bbx!bbxsda!scott   |    - Instructions for a little box that
                                     |      blurts out obscenities.

mike@bria.UUCP (Michael Stefanik) (04/15/91)

In an article, wirzenius@cc.helsinki.fi (Lars Wirzenius) writes:
|I very much prefer style 1. Style 2 will very likely cause multiple
|inclusions, which at best wastes time (the compiler has to process each
|include every time), and at worst causes errors, since not all headers
|can be included multiple times. The only exception is an include file,
|whose only purpose is to include a set of headers that are needed in all
|or most of the source files in a given project.

Actually, if the include files are written correctly, very *little* time
is wasted and no errors are produced.  It's a simple matter of wrapping
the include file in:

	#ifndef _STDIO_H_
	#define _STDIO_H_
		.
		.
	#endif /* _STDIO_H_ */

Personally, I have found that nested includes are both convenient, and
make a great deal of sense.

-- 
Michael Stefanik, MGI Inc, Los Angeles | Opinions stated are never realistic
Title of the week: Systems Engineer    | UUCP: ...!uunet!bria!mike
-------------------------------------------------------------------------------
If MS-DOS didn't exist, who would UNIX programmers have to make fun of?

kers@hplb.hpl.hp.com (Chris Dollin) (04/15/91)

Steve Summit says [apropos the Nested Include discussion]:

   ... Manual Makefile
   maintenance quickly becomes nearly impossible, so an automatic
   Makefile generating scheme, which follows the nested #include
   directives reliably, is a requirement.

Why does manual makefile maintenance ``become nearly impossible'' (presuming
that it wasn't already)? I handle this particular problem by having entries
for each .h file, treating the includes as ``dependencies'', with a ``touch''
command as the action: thus, if foo.h includes bar.h and baz.h, I have

    foo.h: bar.h baz.h
	touch foo.h

This seems to work fine (if files are split across separate directories, one
needs a make with VPATH or some similar extension; but this isn't really any
worse than the situation without nested includes).

Mind you, I'd much prefer automatic maintenance; I'm just pointing out that the
manual case doesn't seem *that* much harder. I'd be delighted to have the
problems pointed out for me before I have to cope with them for real ...

--

Regards, Kers.      | "You're better off  not dreaming of  the things to come;
Caravan:            | Dreams  are always ending  far too soon."

wirzenius@cc.helsinki.fi (Lars Wirzenius) (04/15/91)

In article <189@bria.UUCP>, mike@bria.UUCP (Michael Stefanik) writes:
> In an article, wirzenius@cc.helsinki.fi (Lars Wirzenius) writes:
> | [ ... that I don't prefer nested inclusions -- lw ]
> 
> Actually, if the include files are written correctly, very *little* time
> is wasted and no errors are produced.  It's a simple matter of wrapping
> the include file in:
> [ #ifndef SYM / #define SYM / ... / #endif ]

Very much time can be wasted, if the compiler has to process, say,
stdio.h (or another large header) multiple times. I can easily find
several of my own programs, for which nested includes like this could
cause some headers to be included about 10 or 15 times. Of course, if
you're the only user on a 100 GIPS computer, that doesn't really affect
compile times, but on my humble PC the time escalates very quickly.

Furthermore, not all system headers are protected by the #ifndef/#define
construction, and not everybody can change them, so the possibility for
errors due to multiple inclusion will always be there, if the program is
ever going to be ported.

> Personally, I have found that nested includes are both convenient, and
> make a great deal of sense.

I agree, in theory, but I have great reservations in practice.
-- 
Lars Wirzenius    wirzenius@cc.helsinki.fi

pds@lemming.webo.dg.com (Paul D. Smith) (04/16/91)

I definitely prefer the nested form.  It does have disadvantages, but
they are far outweighed by the advantages (IMHO).  In my projects
we've had lots of libraries (code reusage :-), and the task of trying
to get every include file in the correct order was not worth the
effort.

[] On 13 Apr 91 00:11:42 GMT, amewalduck@trillium.waterloo.edu (Andrew Walduck) said:

AW> Disadvantages:
AW> 1. Slower compilation due to multiple references to the same file...

This is easily avoided by enhancing your multiple-inclusion scheme a
little bit.  As above, in foo.h, use:

foo.h ------------------------------
/*
 * foo.h - prototypes for foo functions
 */

#ifndef FOO_H
#define FOO_H

...

#endif  /* FOO_H */
------------------------------ foo.h

Then, when including foo.h in another file, use:

#ifndef FOO_H
#include "foo.h"
#endif

This way the preprocessor doesn't even try to include the file if
FOO_H is already defined.  This is important if you have lots of
include files: I worked on a C compiler which ran out of space when we
had too many files included.

AW> 2. Must include references to the imbedded includes in the make file
AW>    dependancy list....

This is a pain, I admit.  You need a bit of trial-and-error to get
the first file compiled: you find out you're missing a file & add its
directory to the compile line & retry, then you find out *that* file
needed another, so you add its directory and retry, etc.

However, you only have to do it for one or two files, then you've
generally gotten all of them.

AW> 3. Potential for loops if one file omits having a #ifndef, #endif wrapper
AW>    around it.

You probably won't get loops: most of the time you get redefinition
errors and your compile aborts.  This is usually trivial to find and
fix.

-----

Another popular scheme is to have one "master header file" which
includes every header file you need, then include the "master" in your
code, like this:

foo.h ------------------------------
/*
 * foo.h - Includes everything I need ...
 */

#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include <sys/time.h>
 ...
#include "bar.h"
 ...
------------------------------ foo.h

Then in your .c files you just say `#include "foo.h"' and nothing
else.

The disadvantage here, of course, is that the preprocessor has to
process all of these headers for each source file, even if most of it
isn't used.

I think the moral of the post is, as always, use the scheme which fits
your needs.  If you've got a large project with lots of
interdependencies, you should go with the nested-inclusion scheme to
save yourself headaches.  If you've got a simpler project, either go
with non-nested inclusion or use the "master header file" scheme.

                                                                paul
-----
 ------------------------------------------------------------------
| Paul D. Smith                          | pds@lemming.webo.dg.com |
| Data General Corp.                     |                         |
| Network Services Development Division  |   "Pretty Damn S..."    |
| Open Network Systems Development       |                         |
 ------------------------------------------------------------------

gwyn@smoke.brl.mil (Doug Gwyn) (04/16/91)

In article <1991Apr12.103621.8907@umiami.ir.miami.edu> devebw9f@miavax.ir.miami.edu writes:
>foo.h
>  #include <stdio.h>
>use.c
>  #include "foo.h"
>  #include <stdio.h>

Unless you're sure you will never be porting to a non-ANSI C environment,
I recommend against this, since in general you cannot be sure that the
standard headers can be safely included multiple times in the same
translation unit.  (I have had trouble with this in actual practice.)

For headers that you supply yourself, so long as you arrange proper
idempotency locks in the headers, nested inclusion should be fairly
safe.  Be aware, however, that applications need to know what ranges
of identifiers will be usurped by inclusion of any header, and when
nested inclusion is used, the name space usurpation is larger than may
at first be apparent.  If you follow some strict naming scheme such
as the "package prefix" approach that I have described here previously,
this should not be a large problem.  If, on the other hand, you don't
control what names are used in the headers, it makes application
programming more difficult than it need be.

gwyn@smoke.brl.mil (Doug Gwyn) (04/16/91)

In article <12060@dog.ee.lbl.gov> torek@elf.ee.lbl.gov (Chris Torek) writes:
>I believe that `#include' should (always) have been defined as
>`read this if you have not already'.

Well, we certainly did not want that to apply to #include <assert.h>.

id8rld06@serss0.fiu.edu (Michael Johnston) (04/16/91)

In response to the question of whether to use nested includes:

I use nested includes. At the very top of each include is a comment
identifying the file. Following that is something like:

#ifndef MODULE_NAME
#define MODULE_NAME

#include ...

[the rest of the include file goes here]

#endif

This way, any include file necessary is right there. Now, anyone using
this module only needs to know what it does and not how it does it. It
is one less place to make a mistake in the future.
    If you choose not to nest include files, make sure you document
what each file needs. AND DOCUMENT IT WELL!!

Mike


--
====================================================================
Michael Johnston
id8rld06@serss0.fiu.edu or
26793271x@servax.fiu.edu

scs@hstbme.mit.edu (Steve Summit) (04/16/91)

In article <KERS.91Apr15093213@cdollin.hpl.hp.com> kers@hplb.hpl.hp.com (Chris Dollin) writes:
>Steve Summit says [apropos the Nested Include discussion]:
>   ... Manual Makefile
>   maintenance quickly becomes nearly impossible...
>
>Why does manual makefile maintenance ``become nearly impossible'' (presuming
>that it wasn't already)? I handle this particular problem by having entries
>for each .h file, treating the includes as ``dependencies'', with a ``touch''
>command as the action: thus, if foo.h includes bar.h and baz.h, I have
>
>    foo.h: bar.h baz.h
>	touch foo.h

I'm a stickler for meaningful modification times (I want the
modification time to reflect the last time *I* modified the
file), so I can't use this trick.  The names of all the
(recursively #included) header files appear on the object file
dependency line:

	a.o: a.c b.h c.h

if a.c #includes b.h, and b.h #includes c.h .

Strictly speaking, this is a more "correct" makefile, since b.h
doesn't really "depend" on c.h in the sense of "has to be rebuilt
if c.h changes."  (This is, to be sure, a debatable point, and often
debated.  I believe the make documentation recommends the latter
interpretation, however.)

I gather that the "touch" trick is fairly widely used, so I
suppose it must work, but I can't say if it has any real
practical drawbacks (other than losing the "real" modification
time, and the philosophical objections).

"Impossible" is, of course, a relative term in this field.
I'm profoundly lazy, and I refuse to keep even simple Makefile
dependencies up-to-date manually, let alone complicated ones
which require chasing through several levels of #inclusion.
That's why I use an automatic dependency generator.  (For me,
doing so has no disadvantages, because I wrote my own that
behaves just as I expect it to.)

                                            Steve Summit
                                            scs@adam.mit.edu
                      (temporary alternate) scs@hstbme.mit.edu

cschmidt@lynx.northeastern.edu (04/17/91)

> How do the gurus on the net feel about the following two styles?
>
> [two examples, one of which uses nested include directives]

My practice is to use nested include directives, but the technique I
employ requires some explanation.  This is a bit long, and I hope my
fellow C programmers will find it of interest.

Rather than writing header files directly, I write only .C files, then
use a program I wrote that generates header files.  This has several
advantages, some of which have to do with include directives.

The header generator program writes directives that prevent the
compiler from including or interpreting a header file more than once
when processing a single source file.

o   The program adds three lines to each output header file, enclosing
    the entire file in an "if" statement.  This prevents the compiler
    from interpreting the same header file more than once.  Example:

        #ifndef STDIO_H
        #define STDIO_H
        ...
        #endif

o   Every include directive written to the output header file is
    enclosed in an "if" statement.  This prevents the compiler from
    including the same header file more than once.  Example:

        #ifndef STDIO_H
        #include <stdio.h>
        #endif

Here is an important advantage of using this technique.  A source
file needs to include only the headers for modules that define things
the source file uses directly.  In other words, it is never necessary
to include a header only to declare things required for the compiler
to correctly interpret a header included subsequently.

For example, if module A requires module B, and module B requires
module C, then the module A source file only includes the B header
explicitly.  If the module B requirements are later modified, no
changes to the modules that require B will be necessary.  When you
write module A, you do not need to know what module B requires.  If
module A happens to include header C before including header B, the
conditional statements shown above prevent the compiler from opening
header C twice.

The header generator program encourages you to divide the input source
file into two sections.  (Actually, there are four section types in
all, but this is enough complication for this message.)

o   Export section.  This section contains the things that are copied
    to the output header file, such as include directives and other
    declarations for all other modules that use this module.  A
    certain pragma directive marks the end of the export section.

o   Implementation section.  This section contains the definitions (as
    opposed to the declarations) of exported variables and functions,
    and it may define non-exported constants, types, variables, and
    functions, and it may contain include directives that are required
    only for the implementation section.

The header generator program takes an input file specification
argument, which may include wildcards.  For each specified source
file, the program generates a header file only if there does not
already exist a header file with a newer time stamp.

An additional advantage: The compiler runs perceptibly faster because
the header files are compressed and because fewer header files are
accessed.  Note that a module does not include its own header file.
Header file compression consists of removing all comments and blank
lines, and replacing consecutive spaces and tabs with a single space.

I would like to know what you all think about this.

Christopher Schmidt
cschmidt@lynx.northeastern.edu

torek@elf.ee.lbl.gov (Chris Torek) (04/17/91)

(this is drifting well off the comp.lang.c beam; followups should probably
go elsewhere.)

>In article <KERS.91Apr15093213@cdollin.hpl.hp.com> kers@hplb.hpl.hp.com
>(Chris Dollin) writes:
>>I handle [nested includes] by having [makefile] entries for each .h
>>file, treating the includes as ``dependencies'', with a ``touch''
>>command as the action ... [so why not use this technique?]

In article <1991Apr16.002821.19233@athena.mit.edu> scs@adam.mit.edu writes:
>I'm a stickler for meaningful modification times ... so I can't use
>this trick. ... Strictly speaking, ... b.h doesn't really "depend" on
>c.h in the sense of "has to be rebuilt if c.h changes."

The latter is true but almost irrelevant (the difference between `c.h
has changed, therefore everything that uses b.h-which-uses-c.h must
change' and `c.h has changed, therefore b.h has changed, therefore
everything that uses b.h must change' is slim).  The former point is
actually more relevant.  In particular, the `touch' trick does not work
well with source code version control systems, which often either
depend on modification times or (more likely) keep files read-only
unless they are specifically being `changed'.

You can solve the latter problem by making b.h be, not just a source,
but a source-and-object: instead of including b.h, include b.i, and
build b.i by copying b.h.  Then b.i depends on b.h and may also depend
on c.h; b.i is not under source control (being an object), and files
that `use b.h-which-uses-c.h' depend on (and use) b.i.  This introduces
a new set of problems.  The best solution is to avoid intermediaries
and use real dependencies.  With a little help from the compiler, the
whole system can be automated; `mkdep' and company are a step in the
right direction, and are a sufficient solution when used with care.

(The compiler is the best place to compute dependency relations because
only the compiler knows for certain just what information the compiler
used to put together the object file.  In a perfect system one would
list only the commands used to build a program, and the system would
infer all the rest.  Perfect systems are hard to come by, and most
attempts have been rather slow; mkdep and `manually automatic'
dependencies are a good compromise, for now.)
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov

ross@contact.uucp (Ross Ridge) (04/17/91)

In article <KERS.91Apr15093213@cdollin.hpl.hp.com> kers@hplb.hpl.hp.com (Chris Dollin) writes:
>I handle this particular problem by having entries
>for each .h file, treating the includes as ``dependencies'', with a ``touch''
>command as the action: thus, if foo.h includes bar.h and baz.h, I have
>
>    foo.h: bar.h baz.h
>	touch foo.h

Well I tend to think something on the order of:

	fred.o: fred.c foo.h bar.h baz.h
	bob.o: bob.c foo.h bar.h baz.h
	jane.o: jane.c foo.h bar.h baz.h

is the way to go.  It's a better representation of the true dependencies.
But to make this more manageable I use a simple make macro:

	FOO_H = foo.h bar.h baz.h

	fred.o: fred.c $(FOO_H)
	bob.o: bob.c $(FOO_H)
	jane.o: jane.c $(FOO_H)

This lets you quickly update your makefile when you add or remove #include's
from your foo.h file.  (It also has the great advantage of letting you
do "FOO_H =" when you get frustrated because every little change to foo.h
causes everything to be recompiled.)

								Ross Ridge

Disclaimer:  This isn't my idea, I stole it from the GNU CC makefile.
I wonder if this makes this covered by the GNU copyleft?  Hmm...
-- 
Ross Ridge								 //
"The Great HTMU"							[oo]
ross@contact.uucp							-()-
ross@watcsc.uwaterloo.ca						 //

bhoughto@bishop.intel.com (Blair P. Houghton) (04/19/91)

In article <memo.928070@lynx.northeastern.edu> cschmidt@lynx.northeastern.edu writes:

Sounds like a great idea, but...

>The header generator program encourages you to divide the input source
>file into two sections.  (Actually, there are four section types in
>all, but this is enough complication for this message.)

Ick.  Constraints on source layout are often much
more trouble than they're worth.

Can you make it smart enough to simply create correct
external declarations in the .h from all things defined in
the .c (all things with external linkage, that is)?  Plus
all #defines?  Generally programmers put definitions of
objects in the region of the file preceding the first
function definition.  Sometimes, however, it's useful to
keep certain constants and variables close to the function
that uses them, so one can cut and paste the function if
the entire package isn't needed.  It's not any harder to
find a variable definition in the middle of a file than at
the beginning, especially when the file defines several
hundred variables.

I can see how it would be a couple of orders of magnitude
easier to do it with an explicit dividing line between
objects and functions, though.

You didn't mention it, but the .h file should also contain the
prototypes of the functions in the .c file, of course...

				--Blair
				  "I laughed, I cried,
				   I compiled my popcorn..."

lerman@stpstn.UUCP (Ken Lerman) (04/19/91)

In article <1848@bbxsda.UUCP> scott@bbxsda.UUCP (Scott Amspoker) writes:
|>In article <12060@dog.ee.lbl.gov> torek@elf.ee.lbl.gov (Chris Torek) writes:
|>>Indeed.  One thing, for which I argued (and still will argue---it
|>>seems `right' to me) which is not standard, would tip the balance
|>>all the way:  I believe that `#include' should (always) have been
|>>defined as `read this if you have not already'.
|>
|>Turbo C has an option to do exactly that.
|>

Objective-C uses #import to mean `read this if you have not already'.
#include has its usual meaning (which is sometimes useful).

There remains the question of what does it mean to say a file has
already been included.  One with the same name?  With the same path
and name?  With the same inode (on the same file system)?  With the
same contents?

For reasons (IMHO) having to do with where searches take place when
the included file is written as "filename.h" (instead of
<filename.h>), importing files with the same name (ignoring the path)
only once seems to be the right answer.

Ken

cschmidt@lynx.northeastern.edu (04/21/91)

> I'm profoundly lazy, and I refuse to keep even simple Makefile
> dependencies up-to-date manually, let alone complicated ones which
> require chasing through several levels of #inclusion.  That's why I
> use an automatic dependency generator.  (For me, doing so has no
> disadvantages, because I wrote my own that behaves just as I expect it
> to.)

It was heartening to read this.  I have long believed that writing
MAKE scripts by hand is both too time consuming and too unreliable.
In one project with which I was associated, it was discovered AFTER
the product shipped that about 14 dependencies were missing from the
circa 500-line MAKE script.

My current solution is similar to yours, Steve, but with an unusual
twist: my dependency detector program never writes the dependency list
to disk.  Instead, it outputs a list of source files that need to be
compiled right now.  It then calls the compiler (if the list is not
empty).  The MAKE program is not used.  (Naturally, when I am
debugging a single module, I just call the compiler directly.)

One obvious advantage to this approach is that the dependency list is
always guaranteed to be up to date.  When using a conventional
dependency generator, there is a temptation to postpone using it.

You might think that this approach would be slower than using a
traditional MAKE program, but it turns out that it is actually faster
in many cases.  The MAKE program resolves dependencies one at a time.
If ten modules need to be compiled, MAKE calls the C compiler ten
times.  Using my dependency detector, the C compiler is called just
once; the names of the ten modules to be compiled are passed to the
compiler as a single parameter in the form of a "response file".

In any case, my dependency detector program employs these performance
improvement techniques:

o   The source file is not scanned for include directives if the
    object file does not exist, or if the object file's time stamp is
    older than that of the source file.

o   To avoid reading the entire source file for include directives,
    the program does not read beyond a certain optional pragma
    directive, which happens to be the same pragma directive that my
    header file generator recognizes as marking the start of the
    implementation section.

o   By default, the program does not check the time stamps of standard
    header files (those whose names are enclosed in angle brackets),
    but there is an optional command line switch to override this.

o   Once the time stamp of a header file is determined, that header
    file name and its time stamp are stored in a global list, in case
    another source file includes the same header file.  The time stamp
    assigned to a header file is either that header file's actual time
    stamp, or the time stamp of the newest header file specified in a
    nested include directive, whichever time stamp is newest.

o   Before scanning source files, the program notes the time stamp of
    the newest object file.  When determining the time stamp of a
    header file, the scanning for nested include directives stops if
    the program finds a header file with a time stamp that is newer
    than that of the newest object file.

I would be interested in any remarks you folks might have about this,
especially if you have had any experience designing or using a
dependency detector that uses this approach.

Christopher Schmidt
cschmidt@lynx.northeastern.edu

geels@cs.vu.nl (Arnold Geels) (04/22/91)

In comp.lang.c you write:

>I would be interested in any remarks you folks might have about this,
>especially if you have had any experience designing or using a
>dependency detector that uses this approach.

I'll bite.  My main objection is that your program could be incorrect.
There is only one program that really knows what the dependencies are and
that is the compiler itself.  All other programs must act like the compiler,
bugs and all(!).  And the more complex they become, the more you are copying
the compiler code.

So: use your compiler to generate the dependencies.

>I have long believed that writing
>MAKE scripts by hand is both too time consuming and too unreliable.
>In one project with which I was associated, it was discovered AFTER
>the product shipped that about 14 dependencies were missing from the
>circa 500-line MAKE script.

So: use automatic dependency generators.  I think we agree here.

>One obvious advantage to this approach is that the dependency list is
>always guaranteed to be up to date.  When using a conventional
>dependency generator, there is a temptation to postpone using it.

Right.  Force the system to check the dependencies every time you
compile.

>You might think that this approach would be slower than using a
>traditional MAKE program, but it turns out that it is actually faster
>in many cases.  The MAKE program resolves dependencies one at a time.
>If ten modules need to be compiled, MAKE calls the C compiler ten
>times.  Using my dependency detector, the C compiler is called just
>once; the names of the ten modules to be compiled are passed to the
>compiler as a single parameter in the form of a "response file".

I think that when projects get so big that makefiles get too large,
dependencies become hard to find, etc., performance is no longer an issue.
You want to be sure that things are correct, and that's hard enough already.
Suppose you run into a bug.  You search the code for days, and finally
find that it is caused by an out-of-date object file that was missed
by your home-made dependency generator.  Now what do you say?

>In any case, my dependency detector program employs these performance
>improvement techniques:

> (techniques deleted)

>o   To avoid reading the entire source file for include directives,
>    the program does not read beyond a certain optional pragma
>    directive, which happens to be the same pragma directive that my
>    header file generator recognizes as marking the start of the
>    implementation section.

But the compiler DOES read beyond the pragma, and if there is an include
directive there, you're hung.  You could say "bad luck," but we are talking
about tools here, and tools should make life easier, not harder.

If you are interested in this topic, here is a nice article:

%T On the design of the Amoeba Configuration Manager
%A E.H. Baalbergen
%A C. Verstoep
%A A.S. Tanenbaum
%J ACM SIGSOFT Software Engineering Notes
%V 17
%N 7
%D November 1989
%P 15-22

Cheers,

Arnold.

 - - - - - - - - - - - - - - 
Arnold Geels
Vrije Universiteit Amsterdam
The Netherlands

rjc@cstr.ed.ac.uk (Richard Caley) (04/23/91)

[Slightly off C but...]

In article <memo.942884@lynx.northeastern.edu>, cschmidt  (c) writes:

c> One obvious advantage to this approach is that the dependency list is
c> always guaranteed to be up to date.  When using a conventional
c> dependency generator, there is a temptation to postpone using it.

There is another (IMHO better) solution. The Makefile can arrange for
the dependencies to be recalculated when necessary, that is after all
what make is good for.

You need a make which understands `include' directives, but after that
it is relatively painless.  

Details left as an exercise.
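[One way to fill in those details, sketched here assuming GNU make,
which treats included makefiles as targets: it first brings .depend up
to date, then restarts itself and reads the fresh dependency list, so
no manual `make depend' step is ever needed.  `.depend' is just a
conventional name.]

```make
SRCS = fred.c bob.c jane.c

# `-include' does not complain if .depend is missing yet;
# GNU make will build it from the rule below and then restart.
-include .depend

.depend: $(SRCS)
	$(CC) -MM $(SRCS) > .depend
```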

--
rjc@cstr.ed.ac.uk			_O_
					 |<

martelli@cadlab.sublink.ORG (Alex Martelli) (04/26/91)

scs@hstbme.mit.edu (Steve Summit) writes:
	...
:I gather that the "touch" trick is fairly widely used, so I
:suppose it must work, but I can't say if it has any real
:practical drawbacks (other than losing the "real" modification
:time, and the philosophical objections).

Well, this trick (the "foo.h: bar.h; touch foo.h" approach to
nested include handling in makefiles) surely has the potential
of causing a lot of extra work.  Makefiles are not just for
recompiling and linking sources; lots more things can depend 
on a text file than just a .o... the extra touching, depending
on what you do in your makefiles, may cause redundant attempts
at RCS check-ins of sources, redundant printouts of unmodified
sources, redundant backups to slow media, redundant broadcast
of sources over a net, redundant updating of tarfiles, etc etc.
I also share your dislike for having the last-modified date lose
information value in a long-listing.
Despite all this, it's just *so* convenient that I find myself
using it from time to time, mostly for makefiles intended for
short-term use...
-- 
Alex Martelli - CAD.LAB s.p.a., v. Stalingrado 53, Bologna, Italia
Email: (work:) martelli@cadlab.sublink.org, (home:) alex@am.sublink.org
Phone: (work:) ++39 (51) 371099, (home:) ++39 (51) 250434; 
Fax: ++39 (51) 366964 (work only), Fidonet: 332/401.3 (home only).