sources-request@genrad.UUCP (01/08/85)
From: minow@decvax

This is part two of a distribution of the Decus public-domain
implementation of the C preprocessor.

-h- cpp1.c Mon Jan 7 23:59:31 1985 cpp1.c
/*
 * CPP main program.
 *
 * Edit history
 * 21-May-84  MM          "Field test" release
 * 23-May-84  MM          Some minor hacks.
 * 30-May-84  ARF         Didn't get enough memory for __DATE__
 *                        Added code to read stdin if no input
 *                        files are provided.
 * 29-Jun-84  MM          Added ARF's suggestions, Unixifying cpp.
 * 11-Jul-84  MM          "Official" first release (that's what I thought!)
 * 22-Jul-84  MM/ARF/SCK  Fixed line number bugs, added cpp recognition
 *                        of #line, fixed problems with #include.
 * 23-Jul-84  MM          More (minor) include hacking, some documentation.
 *                        Also, redid cpp's #include files
 * 25-Jul-84  MM          #line filename isn't used for #include searchlist
 *                        #line format is <number> <optional name>
 * 25-Jul-84  ARF/MM      Various bugs, mostly serious.  Removed homemade doprint
 * 01-Aug-84  MM          Fixed recursion bug, remove extra newlines and
 *                        leading whitespace from cpp output.
 * 02-Aug-84  MM          Hacked (i.e. optimized) out blank lines and unneeded
 *                        whitespace in general.  Cleaned up unget()'s.
 * 03-Aug-84  Keie        Several bug fixes from Ed Keizer, Vrije Universiteit.
 *                        -- corrected arg. count in -D and pre-defined
 *                        macros.  Also, allow \n inside macro actual parameter
 *                        lists.
 * 06-Aug-84  MM          If debugging, dump the preset vector at startup.
 * 12-Aug-84  MM/SCK      Some small changes from Sam Kendall
 * 15-Aug-84  Keie/MM     cerror, cwarn, etc. take a single string arg.
 *                        cierror, etc. take a single int. arg.
 *                        changed LINE_PREFIX slightly so it can be
 *                        changed in the makefile.
 * 31-Aug-84  MM          USENET net.sources release.
 *  7-Sep-84  SCH/ado     Lint complaints
 * 10-Sep-84  Keie        Char's can't be signed in some implementations
 * 11-Sep-84  ado         Added -C flag, pathological line number fix
 * 13-Sep-84  ado         Added -E flag (does nothing) and "-" file for stdin.
 * 14-Sep-84  MM          Allow # 123 as a synonym for #line 123
 * 19-Sep-84  MM          scanid always reads to token, make sure #line is
 *                        written to a new line, even if -C switch given.
 *                        Also, cpp - - reads stdin, writes stdout.
 * 03-Oct-84  ado/MM      Several changes to line counting and keepcomments
 *                        stuff.  Also a rewritten control() hasher -- much
 *                        simpler and no less "perfect".  Note also changes
 *                        in cpp3.c to fix numeric scanning.
 * 04-Oct-84  MM          Added recognition of macro formal parameters if
 *                        they are the only thing in a string, per the
 *                        draft standard.
 * 08-Oct-84  MM          One more attack on scannumber
 * 15-Oct-84  MM/ado      Added -N to disable predefined symbols.  Fixed
 *                        linecount if COMMENT_INVISIBLE enabled.
 * 22-Oct-84  MM          Don't evaluate the #if/#ifdef argument if
 *                        compilation is suppressed.  This prevents
 *                        unnecessary error messages in sequences such as
 *                            #ifdef FOO        -- undefined
 *                            #if FOO == 10     -- shouldn't print warning
 * 25-Oct-84  MM          Fixed bug in false ifdef suppression.  On vms,
 *                        #include <foo> should open foo.h -- this duplicates
 *                        the behavior of Vax-C
 * 31-Oct-84  ado/MM      Parameterized $ in identifiers.  Added a better
 *                        token concatenator and took out the trial
 *                        concatenation code.  Also improved #ifdef code
 *                        and cleaned up the macro recursion tester.
 *  2-Nov-84  MM/ado      Some bug fixes in token concatenation, also
 *                        a variety of minor (uninteresting) hacks.
 *  6-Nov-84  MM          Happy Birthday.  Broke into 4 files and added
 *                        #if sizeof (basic_types)
 *  9-Nov-84  MM          Added -S* for pointer type sizes
 * 13-Nov-84  MM          Split cpp1.c, added vms defaulting
 * 23-Nov-84  MM/ado      -E suppresses error exit, added CPP_INCLUDE,
 *                        fixed strncpy bug.
 *  3-Dec-84  ado/MM      Added OLD_PREPROCESSOR
 *  7-Dec-84  MM          Stuff in Nov 12 Draft Standard
 * 17-Dec-84  george      Fixed problems with recursive macros
 * 17-Dec-84  MM          Yet another attack on #if's (f/t)level removed.
 * 07-Jan-85  ado         Init defines before doing command line options
 *                        so -Uunix works.
 */

/*)BUILD
        $(PROGRAM)      = cpp
        $(FILES)        = { cpp1 cpp2 cpp3 cpp4 cpp5 cpp6 }
        $(INCLUDE)      = { cppdef.h cpp.h }
        $(STACK)        = 2000
        $(TKBOPTIONS)   = {
                STACK   = 2000
        }
*/

#ifdef  DOCUMENTATION

title   cpp             C Pre-Processor
index                   C pre-processor

synopsis
        .s.nf
        cpp [-options] [infile [outfile]]
        .s.f
description

        CPP reads a C source file, expands macros and include
        files, and writes an input file for the C compiler.
        If no file arguments are given, CPP reads from stdin
        and writes to stdout.  If one file argument is given,
        it will define the input file, while two file arguments
        define both input and output files.  The file name "-"
        is a synonym for stdin or stdout as appropriate.

        The following options are supported.  Options may
        be given in either case.
        .lm +16
        .p -16
        -C              If set, source-file comments are written
                        to the output file.  This allows the output
                        of CPP to be used as the input to a program,
                        such as lint, that expects commands embedded
                        in specially-formatted comments.
        .p -16
        -Dname=value    Define the name as if the programmer wrote

                            #define name value

                        at the start of the first file.  If "=value"
                        is not given, a value of "1" will be used.

                        On non-Unix systems, all alphabetic text will
                        be forced to upper-case.
        .p -16
        -E              Always return "success" to the operating
                        system, even if errors were detected.
                        Note that some fatal errors, such as a missing
                        #include file, will terminate CPP, returning
                        "failure" even if the -E option is given.
        .p -16
        -Idirectory     Add this directory to the list of
                        directories searched for #include "..."
                        and #include <...> commands.  Note that
                        there is no space between the "-I" and
                        the directory string.  More than one -I
                        command is permitted.  On non-Unix systems
                        "directory" is forced to upper-case.
        .p -16
        -N              CPP normally predefines some symbols defining
                        the target computer and operating system.
                        If -N is specified, no symbols will be
                        predefined.  If -N -N is specified, the
                        "always present" symbols, __LINE__, __FILE__,
                        and __DATE__ are not defined.
        .p -16
        -Stext          CPP normally assumes that the size of
                        the target computer's basic variable
                        types is the same as the size of these
                        types on the host computer.  (This can
                        be overridden when CPP is compiled,
                        however.)  The -S option allows dynamic
                        respecification of these values.  "text"
                        is a string of numbers, separated by
                        commas, that specifies correct sizes.
                        The sizes must be specified in the exact
                        order:

                            char short int long float double

                        If you specify the option as "-S*text",
                        pointers to these types will be specified.
                        -S* takes one additional argument for
                        pointer to function (e.g. int (*)())

                        For example, to specify sizes appropriate
                        for a PDP-11, you would write:

                               c s i l f d func
                             -S1,2,2,2,4,8,
                            -S*2,2,2,2,2,2,2

                        Note that all values must be specified.
        .p -16
        -Uname          Undefine the name as if

                            #undef name

                        were given.  On non-Unix systems, "name" will
                        be forced to upper-case.
        .p -16
        -Xnumber        Enable debugging code.  If no value is
                        given, a value of 1 will be used.  (For
                        maintenance of CPP only.)
        .s.lm -16

Pre-Defined Variables

        When CPP begins processing, the following variables will
        have been defined (unless the -N option is specified):
        .s
        Target computer (as appropriate):
        .s
            pdp11, vax, M68000 m68000 m68k
        .s
        Target operating system (as appropriate):
        .s
            rsx, rt11, vms, unix
        .s
        Target compiler (as appropriate):
        .s
            decus, vax11c
        .s
        The implementor may add definitions to this list.
        The default definitions match the definition of the
        host computer, operating system, and C compiler.
        .s
        The following are always available unless undefined (or
        -N was specified twice):
        .lm +16
        .p -12
        __FILE__        The input (or #include) file being compiled
                        (as a quoted string).
        .p -12
        __LINE__        The line number being compiled.
        .p -12
        __DATE__        The date and time of compilation as
                        a Unix ctime quoted string (the trailing
                        newline is removed).
        Thus,
        .s
            printf("Bug at line %d,", __LINE__);
            printf(" source file %s", __FILE__);
            printf(" compiled on %s", __DATE__);
        .s.lm -16

Draft Proposed Ansi Standard Considerations

        The current version of the Draft Proposed Standard
        explicitly states that "readers are requested not to specify
        or claim conformance to this draft."  Readers and users
        of Decus CPP should not assume that Decus CPP conforms
        to the standard, or that it will conform to the actual
        C Language Standard.

        When CPP is itself compiled, many features of the Draft
        Proposed Standard that are incompatible with existing
        preprocessors may be disabled.  See the comments in CPP's
        source for details.

        The latest version of the Draft Proposed Standard (as reflected
        in Decus CPP) is dated November 12, 1984.

        Comments are removed from the input text.  The comment
        is replaced by a single space character.  The -C option
        preserves comments, writing them to the output file.

        The '$' character is considered to be a letter.  This is
        a permitted extension.

        The following new features of C are processed by CPP:
        .s.comment Note: significant spaces, not tabs, .br quotes #if, #elif
        .br;####_#elif expression    (_#else _#if)
        .br;####'_\xNNN'             (Hexadecimal constant)
        .br;####'_\a'                (Ascii BELL)
        .br;####'_\v'                (Ascii Vertical Tab)
        .br;####_#if defined NAME    1 if defined, 0 if not
        .br;####_#if defined (NAME)  1 if defined, 0 if not
        .br;####_#if sizeof (basic type)
        .br;####unary +
        .br;####123U, 123LU          Unsigned ints and longs.
        .br;####12.3L                Long double numbers
        .br;####token_#token         Token concatenation
        .br;####_#include token      Expands to filename

        The Draft Proposed Standard has extended C, adding a constant
        string concatenation operator, where

            "foo" "bar"

        is regarded as the single string "foobar".  (This does not
        affect CPP's processing but does permit a limited form of
        macro argument substitution into strings as will be discussed.)

        The Standard Committee plans to add token concatenation
        to #define command lines.
        One suggested implementation is as follows:  the
        sequence "Token1#Token2" is treated as if the programmer wrote
        "Token1Token2".  This could be used as follows:

            #line 123
            #define ATLINE foo#__LINE__

        ATLINE would be defined as foo123.

        Note that "Token2" must either have the format of an
        identifier or be a string of digits.  Thus, the string

            #define ATLINE foo#1x3

        generates two tokens: "foo1" and "x3".

        If the tokens T1 and T2 are concatenated into T3,
        this implementation operates as follows:

          1. Expand T1 if it is a macro.
          2. Expand T2 if it is a macro.
          3. Join the tokens, forming T3.
          4. Expand T3 if it is a macro.

        A macro formal parameter will be substituted into a string
        or character constant if it is the only component of that
        constant:

            #define VECSIZE 123
            #define vprint(name, size) \
              printf("name" "[" "size" "] = {\n")
              ... vprint(vector, VECSIZE);

        expands (effectively) to

              printf("vector[123] = {\n");

        Note that this will be useful if your C compiler supports
        the new string concatenation operation noted above.
        As implemented here, if you write

            #define string(arg) "arg"
              ... string("foo") ...

        this implementation generates "foo", rather than the strictly
        correct ""foo"" (which will probably generate an error message).
        This is, strictly speaking, an error in CPP and may be removed
        from future releases.

error messages

        Many.  CPP prints warning or error messages if you try to
        use multiple-byte character constants (non-transportable),
        if you #undef a symbol that was not defined, or if your
        program has potentially nested comments.

author

        Martin Minow

bugs

        The #if expression processor uses signed integers only.
        I.e., #if 0xFFFFu < 0 may be TRUE.

#endif

#include        <stdio.h>
#include        <ctype.h>
#include        "cppdef.h"
#include        "cpp.h"

/*
 * Commonly used global variables:
 * line         is the current input line number.
 * wrongline    is set in many places when the actual output
 *              line is out of sync with the numbering, e.g.,
 *              when expanding a macro with an embedded newline.
 *
 * token        holds the last identifier scanned (which might
 *              be a candidate for macro expansion).
 * errors       is the running cpp error counter.
 * infile       is the head of a linked list of input files (extended by
 *              #include and macros being expanded).  infile always points
 *              to the current file/macro.  infile->parent to the includer,
 *              etc.  infile->fp is NULL if this input stream is a macro.
 */
int             line;                   /* Current line number          */
int             wrongline;              /* Force #line to compiler      */
char            token[IDMAX + 1];       /* Current input token          */
int             errors;                 /* cpp error counter            */
FILEINFO        *infile = NULL;         /* Current input file           */
#if DEBUG
int             debug;                  /* TRUE if debugging now        */
#endif
/*
 * This counter is incremented when a macro expansion is initiated.
 * If it exceeds a built-in value, the expansion stops -- this tests
 * for a runaway condition:
 *      #define X Y
 *      #define Y X
 *      X
 * This can be disabled by falsifying rec_recover.  (Nothing does this
 * currently: it is a hook for an eventual invocation flag.)
 */
int             recursion;              /* Infinite recursion counter   */
int             rec_recover = TRUE;     /* Unwind recursive macros      */

/*
 * instring is set TRUE when a string is scanned.  It modifies the
 * behavior of the "get next character" routine, causing all characters
 * to be passed to the caller (except <DEF_MAGIC>).  Note especially that
 * comments and \<newline> are not removed from the source.  (This
 * prevents cpp output lines from being arbitrarily long).
 *
 * inmacro is set by #define -- it absorbs comments and converts
 * form-feed and vertical-tab to space, but returns \<newline>
 * to the caller.  Strictly speaking, this is a bug as \<newline>
 * shouldn't delimit tokens, but we'll worry about that some other
 * time -- it is more important to prevent infinitely long output lines.
 *
 * instring and inmacro are parameters to the get() routine which
 * were made global for speed.
 */
int             instring = FALSE;       /* TRUE if scanning string      */
int             inmacro = FALSE;        /* TRUE if #defining a macro    */

/*
 * work[] and workp are used to store one piece of text in a temporary
 * buffer.  To initialize storage, set workp = work.  To store one
 * character, call save(c);  (This will fatally exit if there isn't
 * room.)  To terminate the string, call save(EOS).  Note that
 * the work buffer is used by several subroutines -- be sure your
 * data won't be overwritten.  The extra byte in the allocation is
 * needed for string formal replacement.
 */
char            work[NWORK + 1];        /* Work buffer                  */
char            *workp;                 /* Work buffer pointer          */

/*
 * keepcomments is set TRUE by the -C option.  If TRUE, comments
 * are written directly to the output stream.  This is needed if
 * the output from cpp is to be passed to lint (which uses commands
 * embedded in comments).  cflag contains the permanent state of the
 * -C flag.  keepcomments is always falsified when processing #control
 * commands and when compilation is suppressed by a false #if.
 *
 * If eflag is set, CPP returns "success" even if non-fatal errors
 * were detected.
 *
 * If nflag is non-zero, no symbols are predefined except __LINE__,
 * __FILE__, and __DATE__.  If nflag > 1, absolutely no symbols
 * are predefined.
 */
int             keepcomments = FALSE;   /* Write out comments flag      */
int             cflag = FALSE;          /* -C option (keep comments)    */
int             eflag = FALSE;          /* -E option (never fail)       */
int             nflag = 0;              /* -N option (no predefines)    */

/*
 * ifstack[] holds information about nested #if's.  It is always
 * accessed via *ifptr.  The information is as follows:
 *      WAS_COMPILING   state of compiling flag at outer level.
 *      ELSE_SEEN       set TRUE when #else seen to prevent 2nd #else.
 *      TRUE_SEEN       set TRUE when #if or #elif succeeds
 * ifstack[0] holds the compiling flag.  It is TRUE if compilation
 * is currently enabled.  Note that this must be initialized TRUE.
 */
char            ifstack[BLK_NEST] = { TRUE };   /* #if information      */
char            *ifptr = ifstack;               /* -> current ifstack[] */

/*
 * incdir[] stores the -I directories (and the system-specific
 * #include <...> directories).
 */
char            *incdir[NINCLUDE];      /* -I directories               */
char            **incend = incdir;      /* -> free space in incdir[]    */

/*
 * This is the table used to predefine target machine and operating
 * system designators.  It may need hacking for specific circumstances.
 * Note: it is not clear that this is part of the Ansi Standard.
 * The -N option suppresses preset definitions.
 */
char            *preset[] = {           /* names defined at cpp start   */
#ifdef  MACHINE
    MACHINE,
#endif
#ifdef  SYSTEM
    SYSTEM,
#endif
#ifdef  COMPILER
    COMPILER,
#endif
#if     DEBUG
    "decus_cpp",                        /* Ourselves!                   */
#endif
    NULL                                /* Must be last                 */
};

/*
 * The value of these predefined symbols must be recomputed whenever
 * they are evaluated.  The order must not be changed.
 */
char            *magic[] = {            /* Note: order is important     */
    "__LINE__",
    "__FILE__",
    NULL                                /* Must be last                 */
};

main(argc, argv)
int             argc;
char            *argv[];
{
    register int        i;

#if HOST == SYS_VMS
    argc = getredirection(argc, argv);  /* vms >file and <file          */
#endif
    initdefines();                      /* O.S. specific def's          */
    i = dooptions(argc, argv);          /* Command line -flags          */
    switch (i) {
    case 3:
        /*
         * Get output file, "-" means use stdout.
         */
        if (!streq(argv[2], "-")) {
#if HOST == SYS_VMS
            /*
             * On vms, reopen stdout with "vanilla rms" attributes.
             */
            if ((i = creat(argv[2], 0, "rat=cr", "rfm=var")) == -1
             || dup2(i, fileno(stdout)) == -1) {
#else
            if (freopen(argv[2], "w", stdout) == NULL) {
#endif
                perror(argv[2]);
                cerror("Can't open output file \"%s\"", argv[2]);
                exit(IO_ERROR);
            }
        }                               /* Continue by opening input    */
    case 2:                             /* One file -> stdin            */
        /*
         * Open input file, "-" means use stdin.
         */
        if (!streq(argv[1], "-")) {
            if (freopen(argv[1], "r", stdin) == NULL) {
                perror(argv[1]);
                cerror("Can't open input file \"%s\"", argv[1]);
                exit(IO_ERROR);
            }
            strcpy(work, argv[1]);      /* Remember input filename      */
            break;
        }                               /* Else, just get stdin         */
    case 0:                             /* No args?                     */
    case 1:                             /* No files, stdin -> stdout    */
#if HOST == SYS_UNIX
        work[0] = EOS;                  /* Unix can't find stdin name   */
#else
        fgetname(stdin, work);          /* Vax-11C, Decus C know name   */
#endif
        break;

    default:
        exit(IO_ERROR);                 /* Can't happen                 */
    }
    setincdirs();                       /* Setup -I include directories */
    addfile(stdin, work);               /* "open" main input file       */
#if DEBUG
    if (debug > 0)
        dumpdef("preset #define symbols");
#endif
    cppmain();                          /* Process main file            */
    if ((i = (ifptr - &ifstack[0])) != 0) {
#if OLD_PREPROCESSOR
        ciwarn("Inside #ifdef block at end of input, depth = %d", i);
#else
        cierror("Inside #ifdef block at end of input, depth = %d", i);
#endif
    }
    fclose(stdout);
    if (errors > 0) {
        fprintf(stderr, (errors == 1)
            ? "%d error in preprocessor\n"
            : "%d errors in preprocessor\n", errors);
        if (!eflag)
            exit(IO_ERROR);
    }
    exit(IO_NORMAL);                    /* No errors or -E option set   */
}

FILE_LOCAL
cppmain()
/*
 * Main process for cpp -- copies tokens from the current input
 * stream (main file, include file, or a macro) to the output
 * file.
 */
{
    register int        c;              /* Current character    */
    register int        counter;        /* newlines and spaces  */
    extern int          output();       /* Output one character */

    /*
     * Explicitly output a #line at the start of cpp output so
     * that lint (etc.) knows the name of the original source
     * file.  If we don't do this explicitly, we may get
     * the name of the first #include file instead.
     */
    sharp();
    /*
     * This loop is started "from the top" at the beginning of each line.
     * wrongline is set TRUE in many places if it is necessary to write
     * a #line record.  (But we don't write them when expanding macros.)
     *
     * The counter variable has two different uses:  at
     * the start of a line, it counts the number of blank lines that
     * have been skipped over.
     * These are then either output via
     * #line records or by outputting explicit blank lines.
     * When expanding tokens within a line, the counter remembers
     * whether a blank/tab has been output.  These are dropped
     * at the end of the line, and replaced by a single blank
     * within lines.
     */
    for (;;) {
        counter = 0;                        /* Count empty lines    */
        for (;;) {                          /* For each line, ...   */
            while (type[(c = get())] == SPA) /* Skip leading blanks */
                ;                           /* in this line.        */
            if (c == '\n')                  /* If line's all blank, */
                ++counter;                  /* Do nothing now       */
            else if (c == '#') {            /* Is 1st non-space '#' */
                keepcomments = FALSE;       /* Don't pass comments  */
                counter = control(counter); /* Yes, do a #command   */
                keepcomments = (cflag && compiling);
            }
            else if (c == EOF_CHAR)         /* At end of file?      */
                break;
            else if (!compiling) {          /* #ifdef false?        */
                skipnl();                   /* Skip to newline      */
                counter++;                  /* Count it, too.       */
            }
            else {
                break;                      /* Actual token         */
            }
        }
        if (c == EOF_CHAR)                  /* Exit process at      */
            break;                          /* End of file          */
        /*
         * If the loop didn't terminate because of end of file, we
         * know there is a token to compile.  First, clean up after
         * absorbing newlines.  counter has the number we skipped.
         */
        if ((wrongline && infile->fp != NULL) || counter > 4)
            sharp();                        /* Output # line number */
        else {                              /* If just a few, stuff */
            while (--counter >= 0)          /* them out ourselves   */
                putchar('\n');
        }
        /*
         * Process each token on this line.
         */
        unget();                            /* Reread the char.     */
        for (;;) {                          /* For the whole line,  */
            do {                            /* Token concat. loop   */
                for (counter = 0; (type[(c = get())] == SPA);) {
#if COMMENT_INVISIBLE
                    if (c != COM_SEP)
                        counter++;
#else
                    counter++;              /* Skip over blanks     */
#endif
                }
                if (c == EOF_CHAR || c == '\n')
                    goto end_line;          /* Exit line loop       */
                else if (counter > 0)       /* If we got any spaces */
                    putchar(' ');           /* Output one space     */
                c = macroid(c);             /* Grab the token       */
            } while (type[c] == LET && catenate());
            if (c == EOF_CHAR || c == '\n') /* From macro exp error */
                goto end_line;              /* Exit line loop       */
            switch (type[c]) {
            case LET:
                fputs(token, stdout);       /* Quite ordinary token */
                break;

            case DIG:                       /* Output a number      */
            case DOT:                       /* Dot may begin floats */
                scannumber(c, output);
                break;

            case QUO:                       /* char or string const */
                scanstring(c, output);      /* Copy it to output    */
                break;

            default:                        /* Some other character */
                cput(c);                    /* Just output it       */
                break;
            }                               /* Switch ends          */
        }                                   /* Line for loop        */
end_line:
        if (c == '\n') {                    /* Compiling at EOL?    */
            putchar('\n');                  /* Output newline, if   */
            if (infile->fp == NULL)         /* Expanding a macro,   */
                wrongline = TRUE;           /* Output # line later  */
        }
    }                                       /* Continue until EOF   */
}

output(c)
int             c;
/*
 * Output one character to stdout -- output() is passed as an
 * argument to scanstring()
 */
{
#if COMMENT_INVISIBLE
    if (c != TOK_SEP && c != COM_SEP)
#else
    if (c != TOK_SEP)
#endif
        putchar(c);
}

static char     *sharpfilename = NULL;

FILE_LOCAL
sharp()
/*
 * Output a line number line.
 */
{
    register char       *name;

    if (keepcomments)                       /* Make sure # comes on */
        putchar('\n');                      /* a fresh, new line.   */
    printf("#%s %d", LINE_PREFIX, line);
    if (infile->fp != NULL) {
        name = (infile->progname != NULL)
            ? infile->progname : infile->filename;
        if (sharpfilename == NULL
         || sharpfilename != NULL && !streq(name, sharpfilename)) {
            if (sharpfilename != NULL)
                free(sharpfilename);
            sharpfilename = savestring(name);
            printf(" \"%s\"", name);
        }
    }
    putchar('\n');
    wrongline = FALSE;
}
-h- cpp2.c Mon Jan 7 23:59:31 1985 cpp2.c
/*
 *                              C P P 2 . C
 *
 *                         Process #control lines
 *
 * Edit history
 * 13-Nov-84  MM          Split from cpp1.c
 */

#include        <stdio.h>
#include        <ctype.h>
#include        "cppdef.h"
#include        "cpp.h"
#if HOST == SYS_VMS
/*
 * Include the rms stuff.  (We can't just include rms.h as it uses the
 * VaxC-specific library include syntax that Decus CPP doesn't support.
 * By including things by hand, we can CPP ourself.)
 */
#include        <nam.h>
#include        <fab.h>
#include        <rab.h>
#include        <rmsdef.h>
#endif

/*
 * Generate (by hand-inspection) a set of unique values for each control
 * operator.  Note that this is not guaranteed to work for non-Ascii
 * machines.  CPP won't compile if there are hash conflicts.
 */

#define L_assert        ('a' + ('s' << 1))
#define L_define        ('d' + ('f' << 1))
#define L_elif          ('e' + ('i' << 1))
#define L_else          ('e' + ('s' << 1))
#define L_endif         ('e' + ('d' << 1))
#define L_if            ('i' + (EOS << 1))
#define L_ifdef         ('i' + ('d' << 1))
#define L_ifndef        ('i' + ('n' << 1))
#define L_include       ('i' + ('c' << 1))
#define L_line          ('l' + ('n' << 1))
#define L_nogood        (EOS + (EOS << 1))      /* To catch #i          */
#define L_pragma        ('p' + ('a' << 1))
#define L_undef         ('u' + ('d' << 1))
#if DEBUG
#define L_debug         ('d' + ('b' << 1))      /* #debug               */
#define L_nodebug       ('n' + ('d' << 1))      /* #nodebug             */
#endif

int
control(counter)
int             counter;        /* Pending newline counter              */
/*
 * Process #control lines.  Simple commands are processed inline,
 * while complex commands have their own subroutines.
 *
 * The counter is used to force out a newline before #line, and
 * #pragma commands.  This prevents these commands from ending up at
 * the end of the previous line if cpp is invoked with the -C option.
 */
{
    register int        c;
    register char       *tp;
    register int        hash;
    char                *ep;

    c = skipws();
    if (c == '\n' || c == EOF_CHAR)
        return (counter + 1);
    if (!isdigit(c))
        scanid(c);                      /* Get #word to token[]         */
    else {
        unget();                        /* Hack -- allow #123 as a      */
        strcpy(token, "line");          /* synonym for #line 123        */
    }
    hash = (token[1] == EOS) ?
            L_nogood : (token[0] + (token[2] << 1));
    switch (hash) {
    case L_assert:      tp = "assert";          break;
    case L_define:      tp = "define";          break;
    case L_elif:        tp = "elif";            break;
    case L_else:        tp = "else";            break;
    case L_endif:       tp = "endif";           break;
    case L_if:          tp = "if";              break;
    case L_ifdef:       tp = "ifdef";           break;
    case L_ifndef:      tp = "ifndef";          break;
    case L_include:     tp = "include";         break;
    case L_line:        tp = "line";            break;
    case L_pragma:      tp = "pragma";          break;
    case L_undef:       tp = "undef";           break;
#if DEBUG
    case L_debug:       tp = "debug";           break;
    case L_nodebug:     tp = "nodebug";         break;
#endif
    default:            hash = L_nogood;
    case L_nogood:      tp = "";                break;
    }
    if (!streq(tp, token))
        hash = L_nogood;
    /*
     * hash is set to a unique value corresponding to the
     * control keyword (or L_nogood if we think it's nonsense).
     */
    if (infile->fp == NULL)
        cwarn("Control line \"%s\" within macro expansion", token);
    if (!compiling) {                   /* Not compiling now    */
        switch (hash) {
        case L_if:                      /* These can't turn     */
        case L_ifdef:                   /* compilation on, but  */
        case L_ifndef:                  /* we must nest #if's   */
            if (++ifptr >= &ifstack[BLK_NEST])
                goto if_nest_err;
            *ifptr = 0;                 /* !WAS_COMPILING       */
        case L_line:                    /* Many                 */
            /*
             * Are pragma's always processed?
             */
        case L_pragma:                  /* options              */
        case L_include:                 /* are uninteresting    */
        case L_define:                  /* if we                */
        case L_undef:                   /* aren't               */
        case L_assert:                  /* compiling.           */
dump_line:  skipnl();                   /* Ignore rest of line  */
            return (counter + 1);
        }
    }
    /*
     * Make sure that #line and #pragma are output on a fresh line.
     */
    if (counter > 0 && (hash == L_line || hash == L_pragma)) {
        putchar('\n');
        counter--;
    }
    switch (hash) {
    case L_line:
        /*
         * Parse the line to update the line number and "progname"
         * field and line number for the next input line.
         * Set wrongline to force it out later.
         */
        c = skipws();
        workp = work;                   /* Save name in work    */
        while (c != '\n' && c != EOF_CHAR) {
            save(c);
            c = get();
        }
        unget();
        save(EOS);
        /*
         * Split #line argument into <line-number> and <name>
         * We subtract 1 as we want the number of the next line.
         */
        line = atoi(work) - 1;          /* Reset line number    */
        for (tp = work; isdigit(*tp) || type[*tp] == SPA; tp++)
            ;                           /* Skip over digits     */
        if (*tp != EOS) {               /* Got a filename, so:  */
            if (*tp == '"' && (ep = strrchr(tp + 1, '"')) != NULL) {
                tp++;                   /* Skip over left quote */
                *ep = EOS;              /* And ignore right one */
            }
            if (infile->progname != NULL)   /* Give up the old name */
                free(infile->progname);     /* if it's allocated.   */
            infile->progname = savestring(tp);
        }
        wrongline = TRUE;               /* Force output later   */
        break;

    case L_include:
        doinclude();
        break;

    case L_define:
        dodefine();
        break;

    case L_undef:
        doundef();
        break;

    case L_else:
        if (ifptr == &ifstack[0])
            goto nest_err;
        else if ((*ifptr & ELSE_SEEN) != 0)
            goto else_seen_err;
        *ifptr |= ELSE_SEEN;
        if ((*ifptr & WAS_COMPILING) != 0) {
            if (compiling || (*ifptr & TRUE_SEEN) != 0)
                compiling = FALSE;
            else {
                compiling = TRUE;
            }
        }
        break;

    case L_elif:
        if (ifptr == &ifstack[0])
            goto nest_err;
        else if ((*ifptr & ELSE_SEEN) != 0) {
else_seen_err:
            cerror("#%s may not follow #else", token);
            goto dump_line;
        }
        if ((*ifptr & (WAS_COMPILING | TRUE_SEEN)) != WAS_COMPILING) {
            compiling = FALSE;          /* Done compiling stuff */
            goto dump_line;             /* Skip this clause     */
        }
        doif(L_if);
        break;

    case L_if:
    case L_ifdef:
    case L_ifndef:
        if (++ifptr >= &ifstack[BLK_NEST])
if_nest_err:
            cfatal("Too many nested #%s statements", token);
        *ifptr = WAS_COMPILING;
        doif(hash);
        break;

    case L_endif:
        if (ifptr == &ifstack[0]) {
nest_err:
            cerror("#%s must be in an #if", token);
            goto dump_line;
        }
        if (!compiling && (*ifptr & WAS_COMPILING) != 0)
            wrongline = TRUE;
        compiling = ((*ifptr & WAS_COMPILING) != 0);
        --ifptr;
        break;

    case L_assert:
        if (eval() == 0)
            cerror("Preprocessor assertion failure", NULLST);
        break;

    case L_pragma:
        /*
         * #pragma is provided to pass "options" to later
         * passes of the compiler.  cpp doesn't have any yet.
         */
        printf("#pragma ");
        while ((c = get()) != '\n' && c != EOF_CHAR)
            cput(c);
        unget();
        break;

#if DEBUG
    case L_debug:
        if (debug == 0)
            dumpdef("debug set on");
        debug++;
        break;

    case L_nodebug:
        debug--;
        break;
#endif

    default:
        /*
         * Undefined #control keyword.
         * Note: the correct behavior may be to warn and
         * pass the line to a subsequent compiler pass.
         * This would allow #asm or similar extensions.
         */
        cerror("Illegal # command \"%s\"", token);
        break;
    }
    if (hash != L_include) {
#if OLD_PREPROCESSOR
        /*
         * Ignore the rest of the #control line so you can write
         *      #if     foo
         *      #endif  foo
         */
        goto dump_line;                 /* Take common exit     */
#else
        if (skipws() != '\n') {
            cwarn("Unexpected text in #control line ignored", NULLST);
            skipnl();
        }
#endif
    }
    return (counter + 1);
}

FILE_LOCAL
doif(hash)
int             hash;
/*
 * Process an #if, #ifdef, or #ifndef.  The latter two are straightforward,
 * while #if needs a subroutine of its own to evaluate the expression.
 *
 * doif() is called only if compiling is TRUE.  If false, compilation
 * is always suppressed, so we don't need to evaluate anything.  This
 * suppresses unnecessary warnings.
 */
{
    register int        c;
    register int        found;

    if ((c = skipws()) == '\n' || c == EOF_CHAR) {
        unget();
        goto badif;
    }
    if (hash == L_if) {
        unget();
        found = (eval() != 0);          /* Evaluate expr, != 0 is TRUE  */
        hash = L_ifdef;                 /* #if is now like #ifdef       */
    }
    else {
        if (type[c] != LET)             /* Next non-blank isn't letter  */
            goto badif;                 /* ... is an error              */
        found = (lookid(c) != NULL);    /* Look for it in symbol table  */
    }
    if (found == (hash == L_ifdef)) {
        compiling = TRUE;
        *ifptr |= TRUE_SEEN;
    }
    else {
        compiling = FALSE;
    }
    return;

badif:
    cerror("#if, #ifdef, or #ifndef without an argument", NULLST);
#if !OLD_PREPROCESSOR
    skipnl();                           /* Prevent an extra     */
    unget();                            /* Error message        */
#endif
    return;
}

FILE_LOCAL
doinclude()
/*
 * Process the #include control line.
 * There are three variations:
 *      #include "file"         search somewhere relative to the
 *                              current source file, if not found,
 *                              treat as #include <file>.
 *      #include <file>         Search in an implementation-dependent
 *                              list of places.
 *      #include token          Expand the token, it must be one of
 *                              "file" or <file>, process as such.
 *
 * Note: the November 12 draft forbids '>' in the #include <file> format.
 * This restriction is unnecessary and not implemented.
 */
{
    register int        c;
    register int        delim;
#if HOST == SYS_VMS
    char                def_filename[NAM$C_MAXRSS + 1];
#endif

    delim = macroid(skipws());
    if (delim != '<' && delim != '"')
        goto incerr;
    if (delim == '<')
        delim = '>';
    workp = work;
    instring = TRUE;                    /* Accept all characters */
    while ((c = get()) != '\n' && c != EOF_CHAR)
        save(c);                        /* Put it away.          */
    unget();                            /* Force nl after includee */
    /*
     * The draft is unclear if the following should be done.
     */
    while (--workp >= work && *workp == ' ')
        ;                               /* Trim blanks from filename */
    if (*workp != delim)
        goto incerr;
    *workp = EOS;                       /* Terminate filename    */
    instring = FALSE;
#if HOST == SYS_VMS
    /*
     * Assume the default .h filetype.
     */
    if (!vmsparse(work, ".H", def_filename)) {
        perror(work);                   /* Oops.                 */
        goto incerr;
    }
    else if (openinclude(def_filename, (delim == '"')))
        return;
#else
    if (openinclude(work, (delim == '"')))
        return;
#endif
    /*
     * No sense continuing if #include file isn't there.
     */
    cfatal("Cannot open include file \"%s\"", work);

incerr:
    cerror("#include syntax error", NULLST);
    return;
}

FILE_LOCAL int
openinclude(filename, searchlocal)
char            *filename;              /* Input file name         */
int             searchlocal;            /* TRUE if #include "file" */
/*
 * Actually open an include file.  This routine is only called from
 * doinclude() above, but was written as a separate subroutine for
 * programmer convenience.  It searches the list of directories
 * and actually opens the file, linking it into the list of
 * active files.  Returns TRUE if the file was opened, FALSE
 * if openinclude() fails.  No error message is printed.
 */
{
    register char   **incptr;
#if HOST == SYS_VMS
#if NWORK < (NAM$C_MAXRSS + 1)
    << error, NWORK isn't greater than NAM$C_MAXRSS >>
#endif
#endif
    char            tmpname[NWORK]; /* Filename work area           */

    if (searchlocal) {
        /*
         * Look in local directory first
         */
#if HOST == SYS_UNIX
        /*
         * Try to open filename relative to the directory of the current
         * source file (as opposed to the current directory). (ARF, SCK).
         */
        if (filename[0] != '/'
         && hasdirectory(infile->filename, tmpname))
            strcat(tmpname, filename);
        else {
            strcpy(tmpname, filename);
        }
#else
        if (!hasdirectory(filename, tmpname)
         && hasdirectory(infile->filename, tmpname))
            strcat(tmpname, filename);
        else {
            strcpy(tmpname, filename);
        }
#endif
        if (openfile(tmpname))
            return (TRUE);
    }
    /*
     * Look in any directories specified by -I command line
     * arguments, then in the builtin search list.
     */
    for (incptr = incdir; incptr < incend; incptr++) {
        if (strlen(*incptr) + strlen(filename) >= (NWORK - 1))
            cfatal("Filename work buffer overflow", NULLST);
        else {
#if HOST == SYS_UNIX
            if (filename[0] == '/')
                strcpy(tmpname, filename);
            else {
                sprintf(tmpname, "%s/%s", *incptr, filename);
            }
#else
            if (!hasdirectory(filename, tmpname))
                sprintf(tmpname, "%s%s", *incptr, filename);
#endif
            if (openfile(tmpname))
                return (TRUE);
        }
    }
    return (FALSE);
}

FILE_LOCAL int
hasdirectory(source, result)
char            *source;            /* Directory to examine         */
char            *result;            /* Put directory stuff here     */
/*
 * If a device or directory is found in the source filename string, the
 * node/device/directory part of the string is copied to result and
 * hasdirectory returns TRUE.  Else, nothing is copied and it returns FALSE.
 */
{
#if HOST == SYS_UNIX
    register char   *tp;

    if ((tp = strrchr(source, '/')) == NULL)
        return (FALSE);
    else {
        strncpy(result, source, tp - source + 1);
        result[tp - source + 1] = EOS;
        return (TRUE);
    }
#else
#if HOST == SYS_VMS
    if (vmsparse(source, NULLST, result)
     && result[0] != EOS)
        return (TRUE);
    else {
        return (FALSE);
    }
#else
    /*
     * Random DEC operating system (RSX, RT11, RSTS/E)
     */
    register char   *tp;

    if ((tp = strrchr(source, ']')) == NULL
     && (tp = strrchr(source, ':')) == NULL)
        return (FALSE);
    else {
        strncpy(result, source, tp - source + 1);
        result[tp - source + 1] = EOS;
        return (TRUE);
    }
#endif
#endif
}

#if HOST == SYS_VMS

/*
 * EXP_DEV is set if a device was specified, EXP_DIR if a directory
 * is specified.  (Both set indicate a file-logical, but EXP_DEV
 * would be set by itself if you are reading, say, SYS$INPUT:)
 */
#define DEVDIR (NAM$M_EXP_DEV | NAM$M_EXP_DIR)

FILE_LOCAL int
vmsparse(source, defstring, result)
char            *source;
char            *defstring;         /* non-NULL -> default string.  */
char            *result;            /* Size is at least NAM$C_MAXRSS + 1 */
/*
 * Parse the source string, applying the default (properly, using
 * the system parse routine), storing it in result.
 * TRUE if it parsed, FALSE on error.
 *
 * If defstring is NULL, there are no defaults and result gets
 * (just) the node::[directory] part of the string (possibly "")
 */
{
    struct FAB      fab = cc$rms_fab;   /* File access block        */
    struct NAM      nam = cc$rms_nam;   /* File name block          */
    char            fullname[NAM$C_MAXRSS + 1];
    register char   *rp;                /* Result pointer           */

    fab.fab$l_nam = &nam;               /* fab -> nam               */
    fab.fab$l_fna = source;             /* Source filename          */
    fab.fab$b_fns = strlen(source);     /* Size of source           */
    fab.fab$l_dna = defstring;          /* Default string           */
    if (defstring != NULLST)
        fab.fab$b_dns = strlen(defstring); /* Size of default       */
    nam.nam$l_esa = fullname;           /* Expanded filename        */
    nam.nam$b_ess = NAM$C_MAXRSS;       /* Expanded name size       */
    if (sys$parse(&fab) == RMS$_NORMAL) { /* Parse away             */
        fullname[nam.nam$b_esl] = EOS;  /* Terminate string         */
        result[0] = EOS;                /* Just in case             */
        rp = &result[0];
        /*
         * Remove stuff added implicitly, accepting node names and
         * dev:[directory] strings (but not process-permanent files).
         */
        if ((nam.nam$l_fnb & NAM$M_PPF) == 0) {
            if ((nam.nam$l_fnb & NAM$M_NODE) != 0) {
                strncpy(result, nam.nam$l_node, nam.nam$b_node);
                rp += nam.nam$b_node;
                *rp = EOS;
            }
            if ((nam.nam$l_fnb & DEVDIR) == DEVDIR) {
                strncpy(rp, nam.nam$l_dev, nam.nam$b_dev + nam.nam$b_dir);
                rp += nam.nam$b_dev + nam.nam$b_dir;
                *rp = EOS;
            }
        }
        if (defstring != NULLST) {
            strncpy(rp, nam.nam$l_name, nam.nam$b_name + nam.nam$b_type);
            rp += nam.nam$b_name + nam.nam$b_type;
            *rp = EOS;
            if ((nam.nam$l_fnb & NAM$M_EXP_VER) != 0) {
                strncpy(rp, nam.nam$l_ver, nam.nam$b_ver);
                rp[nam.nam$b_ver] = EOS;
            }
        }
        return (TRUE);
    }
    return (FALSE);
}
#endif
-h- cpp3.c Mon Jan 7 23:59:31 1985 cpp3.c
/*
 *                          C P P 3 . C
 *
 *              File open and command line options
 *
 * Edit history
 * 13-Nov-84    MM      Split from cpp1.c
 */

#include        <stdio.h>
#include        <ctype.h>
#include        "cppdef.h"
#include        "cpp.h"

#if DEBUG && (HOST == SYS_VMS || HOST == SYS_UNIX)
#include        <signal.h>
extern int      abort();            /* For debugging                */
#endif

int
openfile(filename)
char            *filename;
/*
 * Open a file, add it to the linked list of open files.
 * This is called only from openinclude().
 */
{
    register FILE   *fp;

    if ((fp = fopen(filename, "r")) == NULL) {
#if DEBUG
        perror(filename);
#endif
        return (FALSE);
    }
#if DEBUG
    if (debug)
        fprintf(stderr, "Reading from \"%s\"\n", filename);
#endif
    addfile(fp, filename);
    return (TRUE);
}

addfile(fp, filename)
FILE            *fp;                /* Open file pointer            */
char            *filename;          /* Name of the file             */
/*
 * Initialize tables for this open file.  This is called from openfile()
 * above (for #include files), and from the entry to cpp to open the main
 * input file.  It calls a common routine, getfile() to build the FILEINFO
 * structure which is used to read characters.  (getfile() is also called
 * to setup a macro replacement.)
 */
{
    register FILEINFO   *file;
    extern FILEINFO     *getfile();

    file = getfile(NBUFF, filename);
    file->fp = fp;                  /* Better remember FILE *       */
    file->buffer[0] = EOS;          /* Initialize for first read    */
    line = 1;                       /* Working on line 1 now        */
    wrongline = TRUE;               /* Force out initial #line      */
}

setincdirs()
/*
 * Append system-specific directories to the include directory list.
 * Called only when cpp is started.
 */
{

#ifdef  CPP_INCLUDE
    *incend++ = CPP_INCLUDE;
#define IS_INCLUDE      1
#else
#define IS_INCLUDE      0
#endif

#if HOST == SYS_UNIX
    *incend++ = "/usr/include";
#define MAXINCLUDE      (NINCLUDE - 1 - IS_INCLUDE)
#endif

#if HOST == SYS_VMS
    extern char     *getenv();

    if (getenv("C$LIBRARY") != NULL)
        *incend++ = "C$LIBRARY:";
    *incend++ = "SYS$LIBRARY:";
#define MAXINCLUDE      (NINCLUDE - 2 - IS_INCLUDE)
#endif

#if HOST == SYS_RSX
    extern int      $$rsts;             /* TRUE on RSTS/E           */
    extern int      $$pos;              /* TRUE on PRO-350 P/OS     */
    extern int      $$vms;              /* TRUE on VMS compat.      */

    if ($$pos) {                        /* P/OS?                    */
        *incend++ = "SY:[ZZDECUSC]";    /* C #includes              */
        *incend++ = "LB:[1,5]";         /* RSX library              */
    }
    else if ($$rsts) {                  /* RSTS/E?                  */
        *incend++ = "SY:@";             /* User-defined account     */
        *incend++ = "C:";               /* Decus-C library          */
        *incend++ = "LB:[1,1]";         /* RSX library              */
    }
    else if ($$vms) {                   /* VMS compatibility?       */
        *incend++ = "C:";
    }
    else {                              /* Plain old RSX/IAS        */
        *incend++ = "LB:[1,1]";
    }
#define MAXINCLUDE      (NINCLUDE - 3 - IS_INCLUDE)
#endif

#if HOST == SYS_RT11
    extern int      $$rsts;             /* RSTS/E emulation?        */

    if ($$rsts)
        *incend++ = "SY:@";             /* User-defined account     */
    *incend++ = "C:";                   /* Decus-C library disk     */
    *incend++ = "SY:";                  /* System (boot) disk       */
#define MAXINCLUDE      (NINCLUDE - 3 - IS_INCLUDE)
#endif
}

int
dooptions(argc, argv)
int             argc;
char            *argv[];
/*
 * dooptions is called to process command line arguments (-Detc).
 * It is called only at cpp startup.
 */
{
    register char   *ap;
    register DEFBUF *dp;
    register int    c;
    int             i, j;
    char            *arg;
    SIZES           *sizp;              /* For -S                   */
    int             size;               /* For -S                   */
    int             isdatum;            /* FALSE for -S*            */
    int             endtest;            /* For -S                   */

    for (i = j = 1; i < argc; i++) {
        arg = ap = argv[i];
        if (*ap++ != '-' || *ap == EOS)
            argv[j++] = argv[i];
        else {
            c = *ap++;                  /* Option byte              */
            if (islower(c))             /* Normalize case           */
                c = toupper(c);
            switch (c) {                /* Command character        */
            case 'C':                   /* Keep comments            */
                cflag = TRUE;
                keepcomments = TRUE;
                break;

            case 'D':                   /* Define symbol            */
#if HOST != SYS_UNIX
                zap_uc(ap);             /* Force define to U.C.
                                         */
#endif
                /*
                 * If the option is just "-Dfoo", make it -Dfoo=1
                 */
                while (*ap != EOS && *ap != '=')
                    ap++;
                if (*ap == EOS)
                    ap = "1";
                else
                    *ap++ = EOS;
                /*
                 * Now, save the word and its definition.
                 */
                dp = defendel(argv[i] + 2, FALSE);
                dp->repl = savestring(ap);
                dp->nargs = DEF_NOARGS;
                break;

            case 'E':                   /* Ignore non-fatal         */
                eflag = TRUE;           /* errors.                  */
                break;

            case 'I':                   /* Include directory        */
                if (incend >= &incdir[MAXINCLUDE])
                    cfatal("Too many include directories", NULLST);
                *incend++ = ap;
                break;

            case 'N':                   /* No predefineds           */
                nflag++;                /* Repeat to undefine       */
                break;                  /* __LINE__, etc.           */

            case 'S':
                sizp = size_table;
                if (isdatum = (*ap != '*')) /* If it's just -S,     */
                    endtest = T_FPTR;   /* Stop here                */
                else {                  /* But if it's -S*          */
                    ap++;               /* Step over '*'            */
                    endtest = 0;        /* Stop at end marker       */
                }
                while (sizp->bits != endtest && *ap != EOS) {
                    if (!isdigit(*ap)) { /* Skip to next digit      */
                        ap++;
                        continue;
                    }
                    size = 0;           /* Compile the value        */
                    while (isdigit(*ap)) {
                        size *= 10;
                        size += (*ap++ - '0');
                    }
                    if (isdatum)
                        sizp->size = size;  /* Datum size           */
                    else
                        sizp->psize = size; /* Pointer size         */
                    sizp++;
                }
                if (sizp->bits != endtest)
                    cwarn("-S, too few values specified in %s", argv[i]);
                else if (*ap != EOS)
                    cwarn("-S, too many values, \"%s\" unused", ap);
                break;

            case 'U':                   /* Undefine symbol          */
#if HOST != SYS_UNIX
                zap_uc(ap);
#endif
                if (defendel(ap, TRUE) == NULL)
                    cwarn("\"%s\" wasn't defined", ap);
                break;

#if DEBUG
            case 'X':                   /* Debug                    */
                debug = (isdigit(*ap)) ? atoi(ap) : 1;
#if (HOST == SYS_VMS || HOST == SYS_UNIX)
                signal(SIGINT, abort);  /* Trap "interrupt"         */
#endif
                fprintf(stderr, "Debug set to %d\n", debug);
                break;
#endif

            default:                    /* What is this one?
                                         */
                cwarn("Unknown option \"%s\"", arg);
                fprintf(stderr, "The following options are valid:\n\
 -C\t\t\tWrite source file comments to output\n\
 -Dsymbol=value\tDefine a symbol with the given (optional) value\n\
 -Idirectory\t\tAdd a directory to the #include search list\n\
 -N\t\t\tDon't predefine target-specific names\n\
 -Stext\t\tSpecify sizes for #if sizeof\n\
 -Usymbol\t\tUndefine symbol\n");
#if DEBUG
                fprintf(stderr, " -Xvalue\t\tSet internal debug flag\n");
#endif
                break;
            }                           /* Switch on all options    */
        }                               /* If it's a -option        */
    }                                   /* For all arguments        */
    if (j > 3) {
        cerror(
            "Too many file arguments. Usage: cpp [input [output]]",
            NULLST);
    }
    return (j);                         /* Return new argc          */
}

#if HOST != SYS_UNIX
FILE_LOCAL
zap_uc(ap)
register char   *ap;
/*
 * Dec operating systems mangle upper-lower case in command lines.
 * This routine forces the -D and -U arguments to uppercase.
 * It is called only on cpp startup by dooptions().
 */
{
    while (*ap != EOS) {
        /*
         * Don't use islower() here so it works with Multinational
         */
        if (*ap >= 'a' && *ap <= 'z')
            *ap = toupper(*ap);
        ap++;
    }
}
#endif

initdefines()
/*
 * Initialize the built-in #define's.  There are two flavors:
 *      #define decus   1               (static definitions)
 *      #define __FILE__ ??             (dynamic, evaluated by magic)
 * Called only on cpp startup.
 *
 * Note: the built-in static definitions are suppressed by the -N option.
 * __LINE__, __FILE__, and __DATE__ remain defined unless -N is given twice.
 */
{
    register char   **pp;
    register char   *tp;
    register DEFBUF *dp;
    int             i;
    long            tvec;
    extern char     *ctime();

    /*
     * Predefine the built-in symbols.  Allow the
     * implementor to pre-define a symbol as "" to
     * eliminate it.
     */
    if (nflag == 0) {
        for (pp = preset; *pp != NULL; pp++) {
            if (*pp[0] != EOS) {
                dp = defendel(*pp, FALSE);
                dp->repl = savestring("1");
                dp->nargs = DEF_NOARGS;
            }
        }
    }
    /*
     * The magic pre-defines (__FILE__ and __LINE__) are
     * initialized with negative argument counts.  expand()
     * notices this and calls the appropriate routine.
     * DEF_NOARGS is one greater than the first "magic" definition.
     */
    if (nflag < 2) {
        for (pp = magic, i = DEF_NOARGS; *pp != NULL; pp++) {
            dp = defendel(*pp, FALSE);
            dp->nargs = --i;
        }
#if OK_DATE
        /*
         * Define __DATE__ as today's date.
         */
        dp = defendel("__DATE__", FALSE);
        dp->repl = tp = getmem(27);
        dp->nargs = DEF_NOARGS;
        time(&tvec);
        *tp++ = '"';
        strcpy(tp, ctime(&tvec));
        tp[24] = '"';                   /* Overwrite newline        */
#endif
    }
}

#if HOST == SYS_VMS
/*
 * getredirection() is intended to aid in porting C programs
 * to VMS (Vax-11 C) which does not support '>' and '<'
 * I/O redirection.  With suitable modification, it may be
 * useful for other portability problems as well.
 */

int
getredirection(argc, argv)
int             argc;
char            **argv;
/*
 * Process vms redirection arg's.  Exit if any error is seen.
 * If getredirection() processes an argument, it is erased
 * from the vector.  getredirection() returns a new argc value.
 *
 * Warning: do not try to simplify the code for vms.  The code
 * presupposes that getredirection() is called before any data is
 * read from stdin or written to stdout.
 *
 * Normal usage is as follows:
 *
 *      main(argc, argv)
 *      int             argc;
 *      char            *argv[];
 *      {
 *              argc = getredirection(argc, argv);
 *      }
 */
{
    register char   *ap;                /* Argument pointer         */
    int             i;                  /* argv[] index             */
    int             j;                  /* Output index             */
    int             file;               /* File_descriptor          */
    extern int      errno;              /* Last vms i/o error       */

    for (j = i = 1; i < argc; i++) {    /* Do all arguments         */
        switch (*(ap = argv[i])) {
        case '<':                       /* <file                    */
            if (freopen(++ap, "r", stdin) == NULL) {
                perror(ap);             /* Can't find file          */
                exit(errno);            /* Is a fatal error         */
            }
            break;

        case '>':                       /* >file or >>file          */
            if (*++ap == '>') {         /* >>file                   */
                /*
                 * If the file exists, and is writable by us,
                 * call freopen to append to the file (using the
                 * file's current attributes).  Otherwise, create
                 * a new file with "vanilla" attributes as if the
                 * argument was given as ">filename".
                 * access(name, 2) returns zero if we can write on
                 * the specified file.
                 */
                if (access(++ap, 2) == 0) {
                    if (freopen(ap, "a", stdout) != NULL)
                        break;          /* Exit case statement      */
                    perror(ap);         /* Error, can't append      */
                    exit(errno);        /* After access test        */
                }                       /* If file accessible       */
            }
            /*
             * On vms, we want to create the file using "standard"
             * record attributes.  creat(...) creates the file
             * using the caller's default protection mask and
             * "variable length, implied carriage return"
             * attributes.  dup2() associates the file with stdout.
             */
            if ((file = creat(ap, 0, "rat=cr", "rfm=var")) == -1
             || dup2(file, fileno(stdout)) == -1) {
                perror(ap);             /* Can't create file        */
                exit(errno);            /* is a fatal error         */
            }                           /* If '>' creation          */
            break;                      /* Exit case test           */

        default:
            argv[j++] = ap;             /* Not a redirector         */
            break;                      /* Exit case test           */
        }
    }                                   /* For all arguments        */
    argv[j] = NULL;                     /* Terminate argv[]         */
    return (j);                         /* Return new argc          */
}
#endif