nagle@well.UUCP (John Nagle) (10/23/89)
The problem is well-known. It would be desirable if all include files
themselves included everything they needed, so that order of inclusion was
taken care of automatically. But, of course, this results in multiple
inclusion of the same files, which causes problems.

A common work-around for this problem is to use a construct like the
following in each include file:

    #ifndef XXX
    #define XXX
    ...content...
    #endif

This works, but on the second inclusion, the file still has to be read and
parsed, at least by the level of processing that reads "#" statements. (Many
newer compilers do "#" processing in the same pass as main compilation, so
referring to a "preprocessor" in this context is not necessarily correct.)
With widespread use of this technique within library files, some files may be
read a large number of times, mostly to be ignored. This slows compilation.
The problem is especially severe in large C++ programs, where large numbers
of header files are necessary, and nested header files are not at all
uncommon.

It's been proposed that the semantics of "#include" be changed to avoid all
multiple inclusion. But this is controversial, and would require ANSI
approval.

I propose a solution via compiler optimization. The compiler should behave
as follows:

1. If, when reading an "included" file, there are no non-comment statements
   before the first "#ifndef" (if any), and no non-comment statements after
   the "#endif" matching said "#ifndef", the compiler shall associate the
   tag found on the "#ifndef" line with the name of the "#include" file.

2. When processing an "#include" statement, if the file has an associated
   tag as defined in 1) above, and the tag is defined (in the sense of
   "#define") the file shall not be included.

This is a completely compatible solution to the problem. Old compilers will
compile include files written with the "ifndef" convention correctly, but
slowly, and new compilers will do it faster. No standardization action is
required. Any implementor can install this now, and speed up their product.

One could argue for a more elegant but less compatible solution, but the
political hassles aren't worth it.

John Nagle
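Rule 1 of the proposal above can be sketched as a small scanner. This is only
an illustration under stated assumptions, not any compiler's actual
implementation: the function name find_guard is hypothetical, and the sketch
does not understand comments or backslash line-splicing, so it accepts only
blank lines outside the wrapper. That makes it stricter than the proposal (it
may miss some headers), but it should never report a guard for a file that is
not fully wrapped.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 and stores the guard name in guard[] if the
 * header text is exactly one "#ifndef NAME ... #endif" wrapper, else 0.
 * Assumption of this sketch: no comments, no backslash-newline splicing. */
int find_guard(const char *text, char *guard, size_t guard_size)
{
    int depth = 0;   /* nesting depth of conditional directives */
    int opened = 0;  /* seen the leading #ifndef */
    int closed = 0;  /* seen its matching #endif */
    const char *p = text;

    if (guard_size == 0)
        return 0;

    while (*p != '\0') {
        const char *eol = strchr(p, '\n');
        const char *end = eol ? eol : p + strlen(p);
        const char *next = eol ? eol + 1 : end;

        while (p < end && isspace((unsigned char)*p))
            p++;
        if (p == end) {          /* blank line: allowed anywhere */
            p = next;
            continue;
        }
        if (closed)
            return 0;            /* something follows the wrapper */

        if (*p == '#') {
            char word[16];
            size_t n = 0;

            p++;
            while (p < end && (*p == ' ' || *p == '\t'))
                p++;
            while (p < end && isalpha((unsigned char)*p) && n + 1 < sizeof word)
                word[n++] = *p++;
            word[n] = '\0';

            if (!opened) {
                if (strcmp(word, "ifndef") != 0)
                    return 0;    /* first directive is not #ifndef */
                while (p < end && (*p == ' ' || *p == '\t'))
                    p++;
                n = 0;
                while (p < end && (isalnum((unsigned char)*p) || *p == '_')
                       && n + 1 < guard_size)
                    guard[n++] = *p++;
                guard[n] = '\0';
                if (n == 0 || (p < end && (isalnum((unsigned char)*p) || *p == '_')))
                    return 0;    /* missing name, or name did not fit */
                opened = 1;
                depth = 1;
            } else if (strcmp(word, "if") == 0 || strcmp(word, "ifdef") == 0
                       || strcmp(word, "ifndef") == 0) {
                depth++;
            } else if (strcmp(word, "endif") == 0) {
                if (--depth == 0)
                    closed = 1;
            } else if (depth == 1 && (strcmp(word, "else") == 0
                                      || strcmp(word, "elif") == 0)) {
                return 0;        /* wrapper has an alternative branch */
            }
        } else if (!opened) {
            return 0;            /* text before the #ifndef */
        }
        p = next;
    }
    return opened && closed;
}
```

A compiler using this idea would call such a function once per header and
remember the returned name alongside the file name.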
tada@athena.mit.edu (Michael J Zehr) (10/24/89)
In article <14240@well.UUCP> nagle@well.UUCP (John Nagle) writes:
>[problem of multiple inclusion, generally solved by:]
>    #ifndef XXX
>    #define XXX
>    ...content...
>    #endif
>This works, but on the second inclusion, the file still has to be read and
>parsed, at least by the level of processing that reads "#" statements.
>[slowing compilation, particularly in C++ programs]
>[proposed compiler optimization binding the 'XXX' defined above to the
>included file and not including it a second time.]

there's another solution you can manage without having to get your compiler
vendor to make such an extension:

<foo.h:>
    #ifndef FOO
    #define FOO
    #include "foo_real.h"
    #endif

admittedly there's an extra file open for the first time reading through,
and you still have to open files when #including the second time, but it's
not as bad as having to read the entire file in (particularly if you have
some really *large* files).

library files could be changed to this with relatively little effort and no
compiler changes, and users' header files for large systems could be changed
to work faster *today* without having to wait for the vendors to decide to
do this.

(unless i'm missing something obvious and being really stupid this
works....)

-michael j zehr
rsalz@bbn.com (Rich Salz) (10/24/89)
Gee, if

    #ifndef _HAVE_FOO_H_
    ... contents of foo.h ...
    #define _HAVE_FOO_H_
    #endif /* _HAVE_FOO_H_ */

is too slow, then do this:

    #ifndef _HAVE_FOO_H_
    #include <hidden_real_name_of_foo.h>
    #define _HAVE_FOO_H_
    #endif /* _HAVE_FOO_H_ */

John's rules about the first #ifndef after the first comment sound much too
complicated for me -- I have enough problem with election day...
	/r$
--
Please send comp.sources.unix-related mail to rsalz@uunet.uu.net.
Use a domain-based address or give alternate paths, or you may lose out.
crowl@cs.rochester.edu (Lawrence Crowl) (10/24/89)
In article <14240@well.UUCP> nagle@well.UUCP (John Nagle) notes three
solutions to the multiple file inclusion problem:
(1) An include file protects itself via #ifndef on a symbol that it defines.
This causes the file to be read multiple times.
(2) Modifying the semantics of #include to only include a file once. This is
incompatible with the current definition of #include.
(3) His proposal for modifying the implementation of #include to recognize the
idiom in (1). Implementations need not read a file twice.
However, a solution to the multiple inclusion problem already exists. It is
not necessary (nor desirable in my opinion) to modify #include semantics or
implementations. The solution is as follows.
Each include file defines a symbol (preferably related to its name). For
example, in foo.h:
    #define foo_h
    ...
Each file that includes foo.h protects the inclusion with a #ifndef:
    ...
    #ifndef foo_h
    #include "foo.h"
    #endif /* foo_h */
    ...
Programmers of include files can further protect against multiple inclusion
by using the standard mechanism:
    #ifndef foo_h
    #define foo_h
    ...
    #endif /* foo_h */
This solution has the following properties:
- The compiler reads include files exactly once.
- No modifications to current systems are required.
- Includes are three lines long instead of one.
- A small (really) amount of additional programmer effort is required.
I have used this solution as a matter of course since shortly after I learned
to program in C. It is an obvious solution, and leads me to wonder why it is
not common practice. Any explanations?
--
Lawrence Crowl 716-275-9499 University of Rochester
crowl@cs.rochester.edu Computer Science Department
...!{allegra,decvax,rutgers}!rochester!crowl Rochester, New York, 14627
hascall@atanasoff.cs.iastate.edu (John Hascall) (10/24/89)
In some article with a silly huge id#, Lawrence Crowl writes:
}In article <14240@well.UUCP> nagle@well.UUCP (John Nagle) notes three
}solutions to the multiple file inclusion problem:
}(2) Modifying the semantics of #include to only include a file once. This is
}    incompatible with the current definition of #include.
}    [yet another scheme...]

Since the impending ANSI standard requires that including a file more
than once have exactly the same effect as including it once...why can't
a compiler just ignore #includes for files it has already #included???
(at least for the "standard" includes)

Any comments from the ANSI mavens?

I know some people use stuff like:

    main.c:                      subr.c:
        #define MAINDEF 1            #define MAINDEF 0
        #include "vars.h"            #include "vars.h"
        main()                       subrtn()
        { ...                        { ...

but, does anyone do this thing in the *same* file??

John Hascall
tneff@bfmny0.UU.NET (Tom Neff) (10/24/89)
In article <1659@atanasoff.cs.iastate.edu> hascall@atanasoff.UUCP (John Hascall) writes:
> Since the impending ANSI standard requires that including a file more
> than once have exactly the same effect as including it once...why can't
> a compiler just ignore #includes for files it has already #included???
> (at least for the "standard" includes)

Including standard HEADER files should be idempotent.

Back here in the real world there are plenty of uses for including a file
multiple times with a desired substantial effect on each inclusion. Examples
include program generated data tables, copyright strings, and machine
dependent code sequences. Any compiler that unconditionally ignored an
include file on the second mention would be horribly broken.
--
"My God, Thiokol, when do you     \\	Tom Neff
want me to launch? Next April?"   \\	tneff@bfmny0.UU.NET
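One way to picture the "data table" case is the pattern where the same list
is expanded twice with a different effect each time. The sketch below is an
assumption-laden illustration, not code from the thread: to keep it in one
block the list lives in a macro (COLOR_LIST) instead of a separate header,
and all names are hypothetical. With a real header the two expansions would
be two #include lines of the same file, which is exactly what a compiler
that skipped repeat inclusions would break.

```c
/* In a real project this list would typically live in its own file (say, a
 * hypothetical "colors.def") and be #included once per expansion below. */
#define COLOR_LIST \
    X(RED)         \
    X(GREEN)       \
    X(BLUE)

/* First expansion: generate the enumerators. */
#define X(name) COLOR_##name,
enum color { COLOR_LIST COLOR_COUNT };
#undef X

/* Second expansion of the same list: generate a matching name table. */
#define X(name) #name,
static const char *const color_names[] = { COLOR_LIST };
#undef X

/* Look up the printable name; "?" for out-of-range values. */
const char *color_name(enum color c)
{
    return (c >= 0 && c < COLOR_COUNT) ? color_names[c] : "?";
}
```

Because both tables come from one list, adding an entry keeps the enum and
the strings in step.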
ken@cs.rochester.edu (Ken Yap) (10/24/89)
|(1) An include file protects itself via #ifndef on a symbol that it defines.
|    This causes the file to be read multiple times.

But is this really as inefficient as people think? I tried the
following on a Sun-4/60

% wc grammar0.cc
     932    2944   19700 grammar0.cc
% g++ -I../h -E grammar0.cc | wc
    3728    8219   63497
% time g++ -I../h -E grammar0.cc > /tmp/foo.cc
0.4u 0.3s 0:01 44% 0+208k 0+9io 0pf+0w

Looks pretty insignificant compared to parsing and CG time.

Just to prove that multiple inclusions were attempted

% grep '#' /tmp/foo.cc | sort +2 -3 | uniq -2 -c
   2 # 1 "../h/pg_types.h" 1
   1 # 42 "../h/pg_types.h"
   1 # 1 "/usr/su/lib/g++-include/BitSet.h" 1
   2 # 28 "/usr/su/lib/g++-include/BitSet.h" 2
   1 # 1 "/usr/su/lib/g++-include/File.h" 1
   3 # 27 "/usr/su/lib/g++-include/File.h" 2
   1 # 1 "/usr/su/lib/g++-include/builtin.h" 1
   3 # 48 "/usr/su/lib/g++-include/builtin.h" 2
   1 # 1 "/usr/su/lib/g++-include/math.h" 1
   1 # 126 "/usr/su/lib/g++-include/math.h" 2
   2 # 1 "/usr/su/lib/g++-include/std.h" 1
   1 # 225 "/usr/su/lib/g++-include/std.h"
   2 # 1 "/usr/su/lib/g++-include/stddef.h" 1
   1 # 59 "/usr/su/lib/g++-include/stddef.h"
   1 # 1 "/usr/su/lib/g++-include/stdio.h" 1
   2 # 1 "/usr/su/lib/g++-include/stream.h" 1
   1 # 160 "/usr/su/lib/g++-include/stream.h"
   1 # 27 "/usr/su/lib/g++-include/stream.h" 2
   2 # 1 "/usr/su/lib/g++-include/values.h" 1
   2 # 92 "/usr/su/lib/g++-include/values.h"
   1 # 1 "error.h" 1
   2 # 1 "grammar.h" 1
   2 # 11 "grammar.h" 2
   1 # 127 "grammar.h"
   4 # 13 "grammar.h" 2
   1 # 1 "grammar0.cc"
  10 # 1 "grammar0.cc" 2
   2 # 1 "item.h" 1
   1 # 98 "item.h"
   2 # 1 "option.h" 1
   1 # 40 "option.h"
   1 # 1 "pg.h" 1
   2 # 10 "pg.h" 2
   1 # 1 "pggram.h" 1
   1 # 1 "predict.h" 1
   1 # 9 "predict.h" 2
   1 # 1 "production.h" 1
   3 # 10 "production.h" 2
   5 # 1 "symbol.h" 1
   1 # 10 "symbol.h" 2
   4 # 140 "symbol.h"
   1 # 9 "symbol.h" 2
   4 # 1 "symset.h" 1
   2 # 10 "symset.h" 2
   3 # 84 "symset.h"
   1 # 9 "symset.h" 2
   1 # 1 "symtab.h" 1
   1 # 9 "symtab.h" 2
   3 # 1 "termset.h" 1
   2 # 50 "termset.h"
   1 # 9 "termset.h" 2

Some of the .h files are pretty hefty, as you can see from the size of the
expanded source. I don't think I will lose any sleep over what cpp is doing.
meissner@dg-rtp.dg.com (Michael Meissner) (10/24/89)
In article <1659@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
> Since the impending ANSI standard requires that including a file more
> than once have exactly the same effect as including it once...why can't
> a compiler just ignore #includes for files it has already #included???
> (at least for the "standard" includes)
>
> Any comments from the ANSI mavens?

Only the include files specified by the standard (stdio.h, string.h, etc.)
are required to work when included multiple times (i.e., there must be some
sort of guard around parts that cannot be redeclared, like typedefs and
structures). No such requirement is mandated for any other include file.
--
Michael Meissner, Data General.                 If compiles were much
Uucp:      ...!mcnc!rti!xyzzy!meissner          faster, when would we
Internet:  meissner@dg-rtp.DG.COM               have time for netnews?
sartin@hplabsz.HPL.HP.COM (Rob Sartin) (10/25/89)
In article <1659@atanasoff.cs.iastate.edu> hascall@atanasoff.UUCP (John Hascall) writes:
> Since the impending ANSI standard requires that including a file more
> than once have exactly the same effect as including it once...why can't
> a compiler just ignore #includes for files it has already #included???
> (at least for the "standard" includes)

That introduces some potential oddities for <assert.h>, which doesn't try to
prevent itself from being included twice. My C++ 2.0 assert.h includes a
comment that says:

/* This header file intentionally has no wrapper, since the user
 * may want to re-include it to turn off/on assertions for only
 * a portion of the source file.
 */

> but, does anyone do this thing in the *same* file??

=begin foo.c
#define NDEBUG
#include <assert.h>

int tested_function()
{ ... }

#undef NDEBUG
#include <assert.h>

int untested_function()
{
}
=end foo.c

Rob Sartin                 internet: sartin@hplabs.hp.com
Software Technology Lab    uucp    : hplabs!sartin
Hewlett-Packard            voice   : (415) 857-7592
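The <assert.h> case can be checked with a small translation unit. This is a
minimal sketch with hypothetical function names (trusted_half, checked_half);
it relies only on the standard behavior that each inclusion of <assert.h>
redefines assert() according to whether NDEBUG is defined at that point.

```c
#define NDEBUG
#include <assert.h>   /* assert() now expands to a no-op */

/* Assertions are compiled out here: an odd argument slips through. */
int trusted_half(int n)
{
    assert(n % 2 == 0);
    return n / 2;
}

#undef NDEBUG
#include <assert.h>   /* second inclusion: assert() is active again */

/* Assertions are compiled in here: an odd argument would abort. */
int checked_half(int n)
{
    assert(n % 2 == 0);
    return n / 2;
}
```

Calling trusted_half(7) returns normally, while checked_half(7) would
terminate the program through the failed assertion. A compiler that ignored
the second #include would leave assertions off in both functions.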
gwyn@smoke.BRL.MIL (Doug Gwyn) (10/25/89)
In article <14240@well.UUCP> nagle@well.UUCP (John Nagle) writes:
> It's been proposed that the semantics of "#include" be changed to
>avoid all multiple inclusion. But this is controversial, and would require
>ANSI approval.

It's not especially controversial, because as you imply it would be a
change to a well-defined characteristic of the C language. Thus when it
was proposed to X3J11, we had little difficulty in determining that the
proposal must be rejected. Several people attested to the fact that they
have existing code that requires the existing semantics.

> I propose a solution via compiler optimization.

Your solution does not at all seem to me to preserve existing semantics.
shap@delrey.sgi.com (Jonathan Shapiro) (10/25/89)
Why does everybody feel compelled to reinvent this wheel?

The current most widely accepted solution for single inclusion is to
insert a pragma into the header file:

    #pragma once

Jonathan Shapiro
Synergistic Computing Associates
shap@delrey.sgi.com (Jonathan Shapiro) (10/25/89)
Okay, here's another cute trick.

Have an include file called include-files.h, which contains things like

    #define FRED_H "fred.h"
    #define WILMA_H "wilma.h"

include them by doing the following in all source files:

    #include "include-files.h"    // done once in all source files
    #include FRED_H
    #include ...

inside FRED_H do the following:

    #ifdef FRED_H
    #undef FRED_H
    #define FRED_H "/dev/null"
    #endif

This works, doesn't require much overhead, and can be applied
automatically to existing code by a fairly simple shell script.

Jonathan S. Shapiro
Synergistic Computing Associates
bph@buengc.BU.EDU (Blair P. Houghton) (10/25/89)
How about (in <foo.h>):

    #pragma Never_Again

Which tells the BlairTech/ANSI1.0 compiler that when it hits
a #include to read this file again, it should just ignore it.

				--Blair
				  "Which is what the original poster
				   was saying, only in an 'I want it
				   standardized, maybe' manner..."
henry@utzoo.uucp (Henry Spencer) (10/25/89)
In article <11396@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>> I propose a solution via compiler optimization.
>
>Your solution does not at all seem to me to preserve existing semantics.

Can you elaborate on this, Doug? Seems to me like what he was proposing --
have compiler recognize files bracketed with `#ifndef FOO_H' and remember
the bracketing -- comes under the "as if" rule. Re-including such a file
with FOO_H defined cannot possibly have any effect except to slow down
the compilation.
--
A bit of tolerance is worth a  |     Henry Spencer at U of Toronto Zoology
megabyte of flaming.           | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
nagle@well.UUCP (John Nagle) (10/27/89)
In article <1989Oct24.060920.28655@cs.rochester.edu> ken@cs.rochester.edu writes:
->But is this really as inefficient as people think? I tried the
->following on a Sun-4/60
->
->% wc grammar0.cc
-> 932 2944 19700 grammar0.cc
->% g++ -I../h -E grammar0.cc | wc
-> 3728 8219 63497
->% time g++ -I../h -E grammar0.cc > /tmp/foo.cc
->0.4u 0.3s 0:01 44% 0+208k 0+9io 0pf+0w
->
->Looks pretty insignificant compared to parsing and CG time.
What are you comparing to what? Only one time measurement is given.
This makes it rather meaningless to draw any conclusions.
John Nagle
diamond@csl.sony.co.jp (Norman Diamond) (10/27/89)
Someone:
>>> I propose a solution via compiler optimization.

In article <11396@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>>Your solution does not at all seem to me to preserve existing semantics.

In article <1989Oct25.164145.29980@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>Can you elaborate on this, Doug? Seems to me like what he was proposing --
>have compiler recognize files bracketed with `#ifndef FOO_H' and remember
>the bracketing -- comes under the "as if" rule. Re-including such a file
>with FOO_H defined cannot possibly have any effect except to slow down
>the compilation.

I mostly agree with Mr. Spencer. However, perhaps we need an additional
interpretation ahead of this one. Is inclusion a "volatile" operation?

Hmm. In fact, if a program is being interpreted instead of compiled,
the program itself can be responsible for its include files being
changed between two successive inclusions. Now maybe C will finally
replace LISP. :-) :-)
--
Norman Diamond, Sony Corp. (diamond%ws.sony.junet@uunet.uu.net seems to work)
  Should the preceding opinions be caught or     |  James Bond asked his
  killed, the sender will disavow all knowledge  |  ATT rep for a source
  of their activities or whereabouts.            |  licence to "kill".
ken@cs.rochester.edu (Ken Yap) (10/27/89)
|->But is this really as inefficient as people think? I tried the
|->following on a Sun-4/60
|->
|->% wc grammar0.cc
|->     932    2944   19700 grammar0.cc
|->% g++ -I../h -E grammar0.cc | wc
|->    3728    8219   63497
|->% time g++ -I../h -E grammar0.cc > /tmp/foo.cc
|->0.4u 0.3s 0:01 44% 0+208k 0+9io 0pf+0w
|->
|->Looks pretty insignificant compared to parsing and CG time.
|
| What are you comparing to what? Only one time measurement is given.
|This makes it rather meaningless to draw any conclusions.

Sorry, sloppy of me:

% time g++ -I../h -c grammar0.cc
10.5u 2.5s 0:22 57% 0+1532k 34+37io 202pf+0w

5% of the total time is not something I care to worry about at this time.
gwyn@smoke.BRL.MIL (Doug Gwyn) (10/28/89)
In article <1989Oct25.164145.29980@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
-Can you elaborate on this, Doug? Seems to me like what he was proposing --
-have compiler recognize files bracketed with `#ifndef FOO_H' and remember
-the bracketing -- comes under the "as if" rule. Re-including such a file
-with FOO_H defined cannot possibly have any effect except to slow down
-the compilation.
I received some private correspondence on this also, and apparently I
didn't grasp the actual meat of the proposal.
I suppose if further #undef/#define of the identifier were properly
tracked, it would work.
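The point about tracking #undef/#define can be sketched as follows. This is
an illustrative model under stated assumptions, not a description of any real
compiler: the types and functions (name_set, guard_cache, can_skip_include,
and so on) are hypothetical, tables are fixed-size, and strings are stored by
pointer so the caller must keep them alive. The key idea is that the decision
to skip a file consults the macro table as it stands at the moment of the
#include, so a guard that has since been #undef'd causes the file to be read
again.

```c
#include <string.h>

#define MAX_NAMES 64

/* Set of currently defined macro names (stand-in for a real macro table). */
struct name_set {
    const char *name[MAX_NAMES];
    int count;
};

/* Remembered association: header file name -> its guard macro. */
struct guard_cache {
    const char *file[MAX_NAMES];
    const char *guard[MAX_NAMES];
    int count;
};

static int name_index(const struct name_set *s, const char *name)
{
    for (int i = 0; i < s->count; i++)
        if (strcmp(s->name[i], name) == 0)
            return i;
    return -1;
}

/* Model of "#define name". */
void macro_define(struct name_set *macros, const char *name)
{
    if (name_index(macros, name) < 0 && macros->count < MAX_NAMES)
        macros->name[macros->count++] = name;
}

/* Model of "#undef name": remove by moving the last entry into the gap. */
void macro_undef(struct name_set *macros, const char *name)
{
    int i = name_index(macros, name);
    if (i >= 0)
        macros->name[i] = macros->name[--macros->count];
}

/* Record that a header was found to be fully wrapped by the given guard. */
void remember_guard(struct guard_cache *c, const char *file, const char *guard)
{
    if (c->count < MAX_NAMES) {
        c->file[c->count] = file;
        c->guard[c->count] = guard;
        c->count++;
    }
}

/* Skip only if the file has a known guard AND that guard is defined now. */
int can_skip_include(const struct guard_cache *c,
                     const struct name_set *macros, const char *file)
{
    for (int i = 0; i < c->count; i++)
        if (strcmp(c->file[i], file) == 0)
            return name_index(macros, c->guard[i]) >= 0;
    return 0;   /* unknown file: must be read */
}
```

Files with no remembered guard are always read, so headers that depend on
repeated inclusion keep their existing behavior in this model.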
dhesi@sun505.UUCP (Rahul Dhesi) (10/28/89)
In article <1088@odin.SGI.COM> shap@delrey.sgi.com (Jonathan Shapiro) writes:
> #include FRED_H

Please be alert for problems. K&R requires the token after the
"#include" to be a filename enclosed in double quotes or angle
brackets, not an arbitrary symbol. It was not until the ANSI C
standard that the generalized syntax was blessed.

Rahul Dhesi <dhesi%cirrusl@oliveb.ATC.olivetti.com>
UUCP: oliveb!cirrusl!dhesi
Use above addresses--email sent here via Sun.com will probably bounce.
marc@dumbcat.UUCP (Marco S Hyman) (10/29/89)
In article <1011@cirrusl.UUCP> dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) writes:
    In article <1088@odin.SGI.COM> shap@delrey.sgi.com (Jonathan Shapiro) writes:
    > #include FRED_H

    Please be alert for problems. K&R requires the token after the
    "#include" to be a filename enclosed in double quotes or angle
    brackets, not an arbitrary symbol.

And the C compiler shipped with System V/386 3.2 (ISC's flavor) coffs (I
couldn't help myself ;-) on that one -- At least it didn't like
#include __FILE__. (Another reason to use gcc/g++).

--marc
--
// Marco S. Hyman
{ames,pyramid,sun}!pacbell!dumbcat!marc
dhesi@sunscreen.UUCP (Rahul Dhesi) (10/30/89)
In article <1087@odin.SGI.COM> shap@delrey.sgi.com (Jonathan Shapiro) writes:
>The current most widely accepted solution for single inclusion is to
>insert a pragma into the header file:
>
> #pragma once

A serious mistake, because this pragma can affect the meaning of a
program, and therefore cannot be safely ignored.

Rahul Dhesi <dhesi%cirrusl@oliveb.ATC.olivetti.com>
UUCP: oliveb!cirrusl!dhesi
Use above addresses--email sent here via Sun.com will probably bounce.