[net.unix] make depend

kim@enea.UUCP (Kim Walden) (03/15/85)

> From article 3517 in net.lang.c:
>
> ... [a make dependency generating tool] should be picked up
> and stuck into every UNIX in existence (picked up by vendors,
> for the benefit of their licensees), because the absence of
> (semi-)automated dependency generators is a nuisance.
> People don't keep their dependency lists up to date, things
> don't get remade when a header changes, and all sorts of bugs
> crop up.  Or, worse, they only appear when ".o" files are removed
> and the modules are recompiled, so the bug appears rather distant
> in time from the actual change to the header file.
> -- 
> 	Guy Harris
> 	{seismo,ihnp4,allegra}!rlgvax!guy


I agree entirely with Guy concerning the importance of automated
dependency generators.

There seems to be an abundance of such generators spreading around,
ranging from simplistic sed scripts to more ambitious work, but the
generators I have seen, including the ones from Berkeley, Stanford and
AT&T, are all wrong. The reason for this is quite fundamental.

Scanning source files recursively to find out exactly what files will be
included in what other files, which lately seems to have been added
as an extra option to the C compiler, simply does not work.

In many cases, files include other files, which are not source files,
e.g. the typical #include "scanner.c" in a yacc program.
Such files may not be present at the time of dependency generation,
or if they are, they may be obsolete, and thus leave out dependencies
or introduce faulty ones.

One cannot force them to be up to date either, since this would require
running make, but make cannot run properly before its dependencies
have all been generated. Thus we have an inevitable hen-and-egg situation.

The solution is to use ONLY source files as a basis for
dependency generation.

I have described this at some length in the article "Automatic Generation
of Make Dependencies", Software Practice & Experience, vol. 14(6),
pp. 575-585, June 1984, and I will talk about it at the EUUG Usenix
Conference in Paris, April 1-3.

When standard make suffix conventions are used (renaming files like
y.tab.c and lex.yy.c to get the base file name of the respective input
files etc.), simple transformation rules can be used to deduce from
the source file names and include statements extracted from them just
what include statements will be present in the generated files,
without actually creating these.

This also has the advantage of being easily parameterized to handle
include mechanisms in languages other than C.

I have a program that takes a complete set of source files,
extracts all include statements from them, and generates the
correct set of dependencies using a set of default suffix
transformation rules, without requiring any generated files to
be present.
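
A minimal sketch of the source-only idea, in Python. It is not the program
described above: the rule table GENERATED_FROM, the function names, and the
assumption that include lines in a .y or .l file pass unchanged into the
generated .c file are all hypothetical, chosen only to illustrate how
dependencies can be deduced without any generated file being present.

```python
import re

# Matches lines of the form: #include "name"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)

# Hypothetical default suffix rules: source suffix -> generated suffix.
GENERATED_FROM = {".y": ".c", ".l": ".c"}


def split_suffix(name):
    dot = name.rfind(".")
    return (name, "") if dot < 0 else (name[:dot], name[dot:])


def source_dependencies(sources):
    """sources maps source file name -> text; returns {object: sorted deps}."""
    includes = {}
    for name, text in sources.items():
        found = INCLUDE_RE.findall(text)
        includes.setdefault(name, set()).update(found)
        base, suffix = split_suffix(name)
        if suffix in GENERATED_FROM:
            # Attribute the same include lines to the file that WILL be
            # generated, without requiring that file to exist.
            generated = base + GENERATED_FROM[suffix]
            includes.setdefault(generated, set()).update(found)

    def closure(name, seen):
        # Follow nested includes through source and deduced entries alike.
        for inc in includes.get(name, ()):
            if inc not in seen:
                seen.add(inc)
                closure(inc, seen)
        return seen

    included_somewhere = set().union(*includes.values()) if includes else set()
    deps = {}
    for name in includes:
        base, suffix = split_suffix(name)
        # Only .c files that are compiled on their own get an object rule.
        if suffix == ".c" and name not in included_somewhere:
            deps[base + ".o"] = sorted({name} | closure(name, set()))
    return deps
```

Given parse.y containing #include "defs.h" and #include "scanner.c", and
scanner.l containing #include "tokens.h", the sketch yields the single rule
parse.o: defs.h parse.c scanner.c tokens.h, although neither parse.c nor
scanner.c exists yet.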

If Berkeley is interested, I would be willing to include it as
user contributed software to bsd4.3.
-- 
	Kim Walden
	ENEA DATA Sweden

	UUCP:	{seismo,decvax,philabs}!{mcvax,ukc,unido}!enea!kim
	ARPA:	decvax!mcvax!enea!kim@berkeley.arpa
		mcvax!enea!kim@seismo.arpa

throopw@rtp47.UUCP (Wayne Throop) (03/25/85)

> > ...
> > People don't keep their dependency lists up to date, things
> > ...
> > 	Guy Harris
> 
> I agree entirely with Guy concerning the importance of automated
> dependency generators.
> ...
> The solution is to use ONLY source files as a basis for
> dependency generation.
>  ...
>     Kim Walden

Kim's solution seems quite good given the way make works.  If you are
willing to change make's model of the world, another solution becomes
available.  In particular, if a make-like tool allowed dependencies to
be specified dynamically, the problem of intermediate files being needed
before make is invoked (to allow dependencies to be discovered) becomes
a non-problem.

There are many ways for make to be modified to allow for dynamic inputs,
but suppose a syntax something like

    foo.o: foo.c (foo.c; find_includes foo.c)
            cc -c foo.c
    foo.c: foo.x
            make_c_from_x foo.x

The parenthesized text specifies a list of inputs to the "command"
at the end of the list.  The output of the command will be a list of
include files, and the effect is as though that list of files had been
supplied instead of the parenthesized text.  What make would do for
this makefile fragment would be to invoke make_c_from_x, then invoke
find_includes (discovering any include file dependencies of foo.c),
and then cc foo.c, producing foo.o.

This is essentially an "incremental make-file generator".  It has problems,
such as what to do about include files that include other files, and so
on, but these problems can be overcome.  Note also that make would then need
a database of already-run dynamic input lists, for efficiency (otherwise
it would need to re-run find_includes every time make runs, not just when
foo.c changes).
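
The database mentioned above could be sketched roughly as follows, in
Python. Everything here is an assumption made for illustration: the class
name IncludeCache, the JSON file format, the use of modification times to
detect change, and the find_includes function, which merely stands in for
the find_includes command of the example.

```python
import json
import os
import re

INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)


def find_includes(path):
    """Stand-in for the find_includes command: directly included files."""
    with open(path) as f:
        return INCLUDE_RE.findall(f.read())


class IncludeCache:
    """Remembers the include list of each file, keyed by its mtime."""

    def __init__(self, db_path):
        self.db_path = db_path
        self.scans = 0  # counts real scans, to make cache hits visible
        try:
            with open(db_path) as f:
                self.entries = json.load(f)
        except FileNotFoundError:
            self.entries = {}

    def includes(self, path):
        mtime = os.stat(path).st_mtime_ns
        entry = self.entries.get(path)
        if entry is None or entry["mtime"] != mtime:
            # File is new or has changed: re-run the scan and record it.
            self.scans += 1
            entry = {"mtime": mtime, "includes": find_includes(path)}
            self.entries[path] = entry
        return entry["includes"]

    def save(self):
        with open(self.db_path, "w") as f:
            json.dump(self.entries, f)
```

With this arrangement the scan is repeated only when foo.c itself changes;
a run that finds the recorded modification time unchanged reuses the stored
list.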

kim@enea.UUCP (Kim Walden) (04/10/85)

In article <9201@rtp47.UUCP> throop@rtp47.UUCP (Wayne Throop) writes:
>> ...
>> The solution is to use ONLY source files as a basis for
>> dependency generation.
>>  ...
>>     Kim Walden

> Kim's solution seems quite good given the way make works.  If you are
> willing to change make's model of the world, another solution becomes
> available.  In particular, if a make-like tool allowed dependencies to
> be specified dynamically, the problem of intermediate files being needed
> before make is invoked (to allow dependencies to be discovered) becomes
> a non-problem.
> 
> There are many ways for make to be modified to allow for dynamic inputs,
> but suppose a syntax something like
> 
>     foo.o: foo.c (foo.c; find_includes foo.c)
>             cc -c foo.c
>     foo.c: foo.x
>             make_c_from_x foo.x
> 
> The parenthesized text specifies a list of inputs to the "command"
> at the end of the list.  The output of the command will be a list of
> include files, and the effect is as though that list of files had been
> supplied instead of the parenthesized text.  What make would do for
> this makefile fragment would be to invoke make_c_from_x, then invoke
> find_includes (discovering any include file dependencies of foo.c),
> and then cc foo.c, producing foo.o.
> 
> ...

I do not agree with your proposal, because:

	1. Very few people really understand the implications
	   of make's basic model as it is, so the last thing
	   we would want to do is to complicate it further.

	2. It would not solve the problem anyway.

	   In the example, make itself forces the generated file foo.c
	   up-to-date before invoking a command to extract include lines
	   from it.
	   But an extraction command will have to deal with nested includes,
	   and when an INCLUDED file is a generated file, the command cannot
	   force it up-to-date, and hence cannot proceed to search the
	   file for more include lines.

	   The hen-and-egg syndrome is still there, and cannot be
	   circumvented.

-- 
	Kim Walden
	ENEA DATA Sweden

	UUCP:	{seismo,decvax,philabs}!{mcvax,ukc,unido}!enea!kim
	ARPA:	decvax!mcvax!enea!kim@berkeley.arpa
		mcvax!enea!kim@seismo.arpa

throopw@rtp47.UUCP (Wayne Throop) (04/13/85)

In <853@enea.UUCP>, Kim Walden objects to the notion of dynamic
dependency derivation, saying:

>I do not agree with your proposal, because:
>
>        1. Very few people really understand the implications
>           of make's basic model as it is, so the last thing
>           we would want to do is to complicate it further.
>
>        2. It would not solve the problem anyway.
>
>           In the example, make itself forces the generated file foo.c
>           up-to-date before invoking a command to extract include lines
>           from it.
>           But an extraction command will have to deal with nested includes,
>           and when an INCLUDED file is a generated file, the command cannot
>           force it up-to-date, and hence cannot proceed to search the
>           file for more include lines.
>
>           The hen-and-egg syndrome is still there, and cannot be
>           circumvented.

I find myself agreeing with point 1.  Adding yet another wart to make is
not the answer.  However, point 2 turns out not to be the case, since I
use an existing make-like tool that dynamically derives dependencies.
Kim's objections are quite valid, but my original posting (in
retrospect) did not adequately present my position.  Let me try to
clarify.

First, my example was illustrative only.  I did not mean to imply that I
thought that warping the existing make was the proper way to proceed.  I
chose a make-like syntax, since most readers are familiar with it.  The
actual syntax of the tool I use is very un-make-like, as is its basic
model of the world.

Second, when I said in my original article that the "include file
includes another file" problem could be solved, I had reason to be
pretty sure I was right.  Because it has been solved.  Granted, it has
problems with conditionally included files, but then, so does Kim's
tool.  I made no claim to dynamic derivation's superiority, just that it
was a viable alternative.  (On the other hand, since user-specified
dependency rules can easily be added, it probably IS better in cases
where non-standard derivations are used.  Back on the original hand, it
is NOT better in cases where all the transforms you want to apply are
known to a make-file generator.)

So: how does it deal with the include file problem?  Well, the basic
model is that for each buildable item there is a derivation action, and
a construction action.  The derivation action tells the dependency
manager what items need to be built before the construction action can
take place.  If these are include files, they have derivation actions
that specify that the recursively included files must be available
before the level-one include file can be considered available.

Given the sources
    foo.c
        #include "a.h"
        foo(){}
    a.h
        #include "b.h"
    b.h
        /* no more includes */

Our automated build tool produced this trace:

    % 01 VISITING foo.c.cc
    % 02   DERIVING foo.c.cc
    % 03     VISITING foo.c.c_source
    % 03       Changed:   File sources:foo.c
    % 03       Invoking build macro for foo.c.c_source
    % 03     END VISITING foo.c.c_source {Ok}
    % 02     Changed:   File foo.c
    % 02     Invoking derive macro for foo.c.cc
    % 02   END DERIVING foo.c.cc {Ok}
    % 02   VISITING a.h.c_source
    % 03     DERIVING a.h.c_source
    % 03       Changed:   File sources:a.h
    % 03       Invoking derive macro for a.h.c_source
    % 03     END DERIVING a.h.c_source {Ok}
    % 03     VISITING b.h.c_source
    % 04       DERIVING b.h.c_source
    % 04         Changed:   File sources:b.h
    % 04         Invoking derive macro for b.h.c_source
    % 04       END DERIVING b.h.c_source {Ok}
    % 03       Changed:   File sources:b.h
    % 03       Invoking build macro for b.h.c_source
    % 03     END VISITING b.h.c_source {Ok}
    % 02     Changed:   File sources:a.h
    % 02     Changed:   File b.h
    % 02     Invoking build macro for a.h.c_source
    % 02   END VISITING a.h.c_source {Ok}
    % 01   Changed:   File foo.c
    % 01   Changed:   File a.h
    % 01   Invoking build macro for foo.c.cc
    % 01 END VISITING foo.c.cc {Ok}

A lot of huffing and puffing to go through for a fairly simple compile,
but note that in these cases foo.c, a.h and b.h are all GENERATED FILES!
That is, they didn't exist in the file system at the start of the
"make", but were instead in a source archive.  The "Invoking build macro
for <mumble>.c_source" are the "extract from archive" actions.  Thus
there is nothing to prevent the derive action for a.h from forcing b.h to
be created on the fly, if b.h is produced by something else.  In fact, this
sort of thing is done in many of our automated builds.  The crucial
ability here is that derive actions can communicate with the dependency
manager.
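
The derive/build model can be sketched in a few lines of Python.  This is
not the tool that produced the trace above, and the order of its actions
differs from that trace; the class Manager, the two item kinds "c_source"
and "cc", and the dictionary standing in for the source archive are all
assumptions made for illustration.  The point it shows is that a derive
action may call back into the dependency manager to force a file into
existence before scanning it.

```python
import re

INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)


class Manager:
    def __init__(self, archive):
        self.archive = archive  # name -> text; stands in for the archive
        self.files = {}         # files that exist so far in the "file system"
        self.done = set()
        self.active = set()
        self.log = []

    def visit(self, kind, name):
        item = (kind, name)
        if item in self.done:
            return
        if item in self.active:
            raise RuntimeError("circular dependency at %s.%s" % (name, kind))
        self.active.add(item)
        # Derivation action: discover what must be built first.
        for dep in self.derive(kind, name):
            self.visit(*dep)
        # Construction action: everything it needs is now available.
        self.build(kind, name)
        self.active.discard(item)
        self.done.add(item)

    def derive(self, kind, name):
        if kind == "c_source":
            text = self.archive[name]
        else:
            # "cc": ask the manager to produce the C file, then scan it.
            self.visit("c_source", name)
            text = self.files[name]
        return [("c_source", inc) for inc in INCLUDE_RE.findall(text)]

    def build(self, kind, name):
        if kind == "c_source":
            self.files[name] = self.archive[name]  # extract from archive
            self.log.append("extract " + name)
        else:
            self.log.append("compile " + name)     # stands in for cc -c
```

For the three sources shown earlier, visiting ("cc", "foo.c") extracts
b.h, a.h and foo.c in that order and then compiles foo.c, with none of the
three files existing beforehand.  A pair of headers that include each other
raises the circular-dependency error instead of looping.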

The chicken-and-egg problem does not arise here.  In fact, if there are
no circular dependencies, I don't see how it CAN arise.  And, if there
are circular dependencies, I don't see how any automated method can do
much better.

The crucial points I am trying to make:
  - Dynamic derivation is conceptually simple.
    In a dynamic derivation, a derive rule needs only to know locally
    what is going on.  E.g., I have a C file as input and I want to know
    what include files there are.  I don't need to worry about the fact
    that the C file was produced by YAPG (yet another program generator),
    nor do I care what the original source was, or even if there WAS an
    original source in any traditional sense.  In a from-source
    derivation, the deriver needs to know globally what transforms are
    going to be made.
  - Dynamic derivation is flexible.
    It is easy to add new derivation rules, and these new rules don't
    need to know about how the entire world fits together.
  - Dynamic derivation is practical.
    It exists in a working system, and further development is proceeding
    on these tools.  It turns out that I am not at liberty to distribute
    these tools (and they don't run under unix anyhow), but I can do the
    next best thing and distribute the idea.  As Kim pointed out,
    enhancing make is not the best way to implement dynamic dependency
    generation.  Creating a simpler but more flexible tool "from
    scratch" seems a better idea.  (A reasonable first-cut version of
    the tool itself can be implemented in a couple of man-weeks.)
-- 
Wayne Throop at Data General, RTP, NC
<the-known-world>!mcnc!rti-sel!rtp47!throopw