minow@decvax.UUCP (Martin Minow) (09/02/84)
-h- readme.txt Sat Sep 1 21:43:35 1984 readme.txt

Decus CPP is a public-domain implementation of the C preprocessor.
It runs on VMS native (Vax C), VMS compatibility mode (Decus C),
RSX-11M, RSTS/E, P/OS, and RT11, as well as on several varieties
of Unix, including Ultrix-32.

These notes describe how to extract the cpp source files and
configure cpp for your needs, and mention a few design decisions
that may be of interest to maintainers.

Installation

Because the primary development of cpp was not on Unix, it is
distributed using the Decus C archive program (quite similar to the
archiver published in Kernighan and Plauger's Software Tools).

To extract the files from the net.sources distribution, save this
message as cpp1.arc (and the other two distribution files as
cpp2.arc and cpp3.arc). Then, using your favorite editor, locate
the archx.c program, just following the line beginning with
"-h- archx.c" -- the format of the tape is just:

    ... stuff
    -h- archx.c
    ... archx.c program
    -h- archc.c
    ... archc.c program

Compile archx.c -- it shouldn't require any special editing.
Then run it as follows:

    archx cpp1.arc
    archx cpp2.arc
    archx cpp3.arc

You do not need to remove mail headers from the saved messages.

You should then read through cppdef.h to make sure the HOST and
TARGET (and other implementation-specific) definitions are set
correctly for your machine, editing cppdef.h or makefile.txt as
needed. You may then copy makefile.txt to Makefile.

On Unix, cpp should be compiled by make without further difficulty.
On other operating systems, you should compile the three source
modules, linking them together. Note that, on Decus C based
systems, you must extend the default stack allocation. The Decus C
build utility will create the appropriate command file.
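
As a rough illustration of what archx looks for while extracting,
here is a minimal sketch in C. It is not the archx.c source: the
function names is_header() and unquote() are invented for this
example. It assumes only what the archive format states, namely that
a member starts at a line beginning with "-h-" followed by the member
name, and that a member line which originally began with '-' is
stored with that '-' doubled.

```c
#include <stdio.h>
#include <string.h>
#include <ctype.h>

/*
 * Hypothetical helper: return 1 and copy the member name into
 * name[] if line is an archive header ("-h- file.name ..."),
 * otherwise return 0 and leave name[] alone.
 */
int is_header(const char *line, char *name, size_t size)
{
    size_t n = 0;

    if (strncmp(line, "-h-", 3) != 0)
        return 0;
    line += 3;
    while (*line != '\0' && isspace((unsigned char) *line))
        line++;                         /* Skip blanks before the name */
    while (*line != '\0' && !isspace((unsigned char) *line)
            && n + 1 < size)
        name[n++] = *line++;            /* Copy the name, bounded */
    name[n] = '\0';
    return 1;
}

/*
 * Hypothetical helper: undo the archiver's quoting.  A stored line
 * that starts with two '-' characters loses the first one; any
 * other line is returned unchanged.
 */
const char *unquote(const char *line)
{
    return (line[0] == '-' && line[1] == '-') ? line + 1 : line;
}
```

A caller would read the saved message line by line, open a new output
file whenever is_header() succeeds, and write unquote(line) to the
current output file otherwise. Text ahead of the first header (mail
headers, for instance) needs separate handling, which this sketch
leaves out.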

Support Notes

The distribution kit was designed to keep all submissions just
under 50,000 bytes:

cpp1.arc:
    readme.txt      This file
    cpp.mem         Documentation page (see below)
    archx.c         Archive extraction program
    archc.c         Archive construction program
    cpp.rno         Source for cpp.mem (see below)
    makefile.txt    Unix makefile -- copy to Makefile

cpp2.arc:
    cpp.h           Main header file (structure def's and globals)
    cppdef.h        Configuration file (host and target definitions)
    cpp1.c          Mainline code, most #control processing

cpp3.arc:
    cpp2.c          #define, macro expansion, and #if <expr> processors
    cpp3.c          Support code (symbol table and I/O routines)

While cpp intentionally does not rely on the presence of a
full-scale macro preprocessor, it does require the simple parameter
substitution preprocessor capabilities of Unix V6 and Decus C.
If your C compiler lacks full preprocessing, you should make sure
"nomacarg" is #define'd in cpp.h. (This is done automatically by
the Decus C compiler.)

The documentation (manual page) for cpp is included as cpp.mem and
cpp.rno. These are in Dec Runoff format, built by a Decus C utility
(getrno) from original source which is embedded in cpp1.c. To my
knowledge, there is no equivalent program that creates the nroff
source appropriate for Unix.

I would be happy to receive fixes to any problems you encounter.
As I do not maintain distribution kit base-levels, bare-bones diff
listings without sufficient context are not very useful. It is
unlikely that I can find time to help you with other difficulties.

Acknowledgements

I received a great deal of help from many people in debugging cpp.
Alan Feuer and Sam Kendall used "state of the art" run-time code
checkers to locate several errors. Ed Keiser found problems when
cpp was used on machines with different int and pointer sizes.
Dave Conroy helped with the initial debugging.

Martin Minow
decvax!minow
-h- cpp.mem Sat Sep 1 21:43:35 1984 cpp.mem

1.0  C Pre-Processor

                *******
                * cpp *
                *******

NAME:   cpp -- C Pre-Processor

SYNOPSIS:

        cpp [-options] [infile [outfile]]

DESCRIPTION:

        CPP reads a C source file, expands macros and include
        files, and writes an input file for the C compiler. If
        no file arguments are given, cpp reads from stdin and
        writes to stdout. If one file argument is given, it
        will define the input file, while two file arguments
        define both input and output files.

        The following options are supported. Options may be
        given in either case.

        -Idirectory     Add this directory to the list of
                        directories searched for #include "..."
                        and #include <...> commands. Note that
                        there is no space between the "-I" and
                        the directory string. More than one -I
                        command is permitted. On non-Unix
                        systems "directory" is forced to
                        upper-case.

        -Dname=value    Define the name as if the programmer
                        wrote

                            #define name value

                        at the start of the first file. If
                        "=value" is not given, a value of "1"
                        will be used.

                        On non-Unix systems, all alphabetic
                        text will be forced to upper-case.

        -Uname          Undefine the name as if

                            #undef name

                        were given. On non-Unix systems, "name"
                        will be forced to upper-case.

        The following variables are pre-defined:

        Target computer (as appropriate):

            pdp11, vax, M68000, m68000, m68k

        Target operating system (as appropriate):

            rsx, rt11, vms, unix

        Target compiler (as appropriate):

            decus, vax11c

        The implementor may add definitions to this list. The
        default definitions match the definition of the host
        computer, operating system, and C compiler.

        The following are always available unless undefined:

            __FILE__    The input (or #include) file being
                        compiled (as a quoted string).

            __LINE__    The line number being compiled.

            __DATE__    The date and time of compilation as a
                        Unix ctime quoted string (the trailing
                        newline is removed).
                        Thus,

                            printf("Bug at line %d,", __LINE__);
                            printf(" source file %s", __FILE__);
                            printf(" compiled on %s", __DATE__);

        -Xnumber        Enable debugging code. If no value is
                        given, a value of 1 will be used. (For
                        maintenance of CPP only.)

DRAFT ANSI STANDARD CONSIDERATIONS:

        Comments are removed from the input text. The comment
        is replaced by a single space character. This differs
        from usage on some existing preprocessors (but it
        follows the Draft ANSI C Standard).

        Note that arguments may be concatenated as follows:

            #define I(x)x
            #define CAT(x,y)I(x)y
            int value = CAT(1,2);

        If the above macros are defined and invoked without
        extraneous spaces, they will be transportable to other
        implementations. Unfortunately, this will not properly
        expand

            int CAT(foo,__LINE__);
            int CAT(foo,__LINE__);

        as __LINE__ is copied into the input stream, yielding
        "foo__LINE__" in both cases, rather than the expected
        "foo123", "foo124", which would result if __LINE__
        were expanded and the result copied into the input
        stream.

        Macro formal parameters are not recognized within
        quoted strings and character constants in macro
        definitions.

        CPP implements most of the ANSI draft standard. You
        should be aware of the following differences:

        o   In the draft standard, the \n (backslash-newline)
            character is "invisible" to all processing. In
            this implementation, it is invisible to strings,
            but acts as "whitespace" (token-delimiter) outside
            of strings. This considerably simplifies error
            message handling.

        o   The following new features of C are processed by
            cpp:

                #elif expression      (#else #if)
                '\xNNN'               (Hexadecimal constants)
                '\a'                  (Ascii BELL [silly])
                '\v'                  (Ascii VT)
                #if defined NAME      (1 if defined, 0 if not)
                #if defined (NAME)    (1 if defined, 0 if not)
                unary +               (gag me with a spoon)

        o   The draft standard has extended C, adding a string
            concatenation operator, where

                "foo" "bar"

            is regarded as the single string "foobar". (This
            does not affect CPP's processing.)

ERROR MESSAGES:

        Many.
        CPP prints warning messages if you try to use
        multiple-byte character constants (non-transportable)
        or if you #undef a symbol that was not defined.

BUGS:

        Cpp prints spurious error or warning messages in #if
        sequences such as the following:

            #define foo 0
            #if (foo != 0) ? (100 / foo) : 0
            #undef foo
            #if ((defined(foo)) ? foo : 0) == 1

        Cpp should suppress the error message if the
        expression's value is already known.

AUTHOR:

        Martin Minow

-h- archx.c Sat Sep 1 21:43:35 1984 archx.c
/*
 *                      A R C H X
 *
 * Archive extraction
 *
 */
/*)BUILD
        $(TKBOPTIONS) = {
                TASK    = ...ARX
        }
*/

#ifdef DOCUMENTATION

title   archx   text file archiver extraction
index           text file archiver extraction

synopsis

        arch archive_files

description

        Archx manages archives (libraries) of source files, allowing
        a large number of small files to be stored without using
        excessive system resources. Archx extracts all files from
        an archive.

        If no archive_name file is given, the standard input is read.
        Archive header records are echoed to the standard output.

archive file format

        Archive files are standard text files. Each archive element is
        preceded by a line of the format:
        .s.nf
        -h- file.name date true_name
        .s.f
        Note that there is no line or byte count. To prevent problems,
        a '-' at the beginning of a record within a user file or embedded
        archive will be "quoted" by doubling it. The date and true filename
        fields are ignored. On some operating systems, file.name is
        forced to lowercase.

        If the first non-blank line of an input file does not
        begin with "-h", the text will be appended to "archx.tmp".
        This is needed if archives are distributed by mail
        and arrive with initial routing and subject information.
diagnostics Diagnostic messages should be self-explanatory author Martin Minow bugs #endif #include <stdio.h> #include <ctype.h> #define EOS 0 #define FALSE 0 #define TRUE 1 #ifdef vms #include <ssdef.h> extern int errno; #define IO_ERROR errno #define IO_NORMAL SS$_NORMAL #endif #ifndef IO_NORMAL #define IO_NORMAL 0 #endif #ifndef IO_ERROR #define IO_ERROR 1 #endif /* * The following status codes are returned by gethdr() */ #define DONE 0 #define GOTCHA 1 #define NOGOOD 2 char text[513]; /* Working text line */ char name[81]; /* Current archive member name */ char filename[81]; /* Working file name */ char arfilename[81]; /* Archive file name */ char fullname[81]; /* Output for argetname() */ int verbose = TRUE; /* TRUE for verbosity */ int first_archive; /* For mail header skipping */ main(argc, argv) int argc; /* Arg count */ char *argv[]; /* Arg vector */ { register int i; /* Random counter */ int status; /* Exit status */ #ifdef vms argc = getredirection(argc, argv); #endif status = IO_NORMAL; if (argc == 1) process(); else { for (i = 1; i < argc; i++) { if (freopen(argv[i], "r", stdin) != NULL) process(); else { perror(argv[i]); status = IO_ERROR; } } } exit(status); } process() /* * Process archive open on stdin */ { register char *fn; /* File name pointer */ register FILE *outfd; register int i; text[0] = EOS; while ((i = gethdr()) != DONE) { switch (i) { case GOTCHA: if ((outfd = fopen(name, "w")) == NULL) { perror(name); fprintf(stderr, "Can't create \"%s\"\n", name); arskip(); continue; } break; case NOGOOD: fprintf(stderr, "Missing -h-, writing to archx.tmp\n"); fprintf(stderr, "Current text line: %s", text); strcpy(name, "archx.tmp"); if ((outfd = fopen(name, "a")) == NULL) { perror(name); fprintf(stderr, "Cannot append to %s\n", name); arskip(); continue; } break; } arexport(outfd); fclose(outfd); } } int gethdr() /* * If text is null, read a record, returning to signal input state: * DONE Eof read * NOGOOD -h- wasn't first non-blank line. 
Line is in text[] * GOTCHA -h- found, parsed into name. */ { register char *tp; register char *np; again: if (text[0] == EOS && fgets(text, sizeof text, stdin) == NULL) return (DONE); if (text[0] == '\n' && text[1] == EOS) { text[0] = EOS; goto again; } if (text[0] != '-' || text[1] != 'h' || text[2] != '-') return (NOGOOD); for (tp = &text[3]; isspace(*tp); tp++) ; for (np = name; !isspace(*tp); *np++ = *tp++) ; *np = EOS; return (GOTCHA); } arskip() /* * Skip to next header */ { while (fgets(text, sizeof text, stdin) != NULL) { if (text[0] == '-' && text[1] != '-') return; } text[0] = EOS; /* EOF signal */ } arexport(outfd) register FILE *outfd; /* * Read secret archive format, writing archived data to outfd. * Clean out extraneous <cr>,<lf>'s */ { register char *tp; unsigned int nrecords; printf("Creating \"%s\", ", name); nrecords = 0; while (fgets(text, sizeof text, stdin) != NULL) { tp = &text[strlen(text)]; if (tp > &text[1] && *--tp == '\n' && *--tp == '\r') { *tp++ = '\n'; *tp = EOS; } if (text[0] == '-') { if (text[1] != '-') goto gotcha; fputs(text+1, outfd); } else { fputs(text, outfd); } nrecords++; } text[0] = EOS; gotcha: printf("%u records\n", nrecords); if (ferror(stdin) || ferror(outfd)) printf("Creation of \"%s\" completed with error\n", name); } /* * getredirection() is intended to aid in porting C programs * to VMS (Vax-11 C) which does not support '>' and '<' * I/O redirection. With suitable modification, it may * useful for other portability problems as well. */ #ifdef vms static int getredirection(argc, argv) int argc; char **argv; /* * Process vms redirection arg's. Exit if any error is seen. * If getredirection() processes an argument, it is erased * from the vector. getredirection() returns a new argc value. * * Warning: do not try to simplify the code for vms. The code * presupposes that getredirection() is called before any data is * read from stdin or written to stdout. 
 *
 * Normal usage is as follows:
 *
 *      main(argc, argv)
 *      int             argc;
 *      char            *argv[];
 *      {
 *              argc = getredirection(argc, argv);
 *      }
 */
{
    register char   *ap;            /* Argument pointer     */
    int             i;              /* argv[] index         */
    int             j;              /* Output index         */
    int             file;           /* File_descriptor      */
    extern int      errno;          /* Last vms i/o error   */

    for (j = i = 1; i < argc; i++) {    /* Do all arguments */
        switch (*(ap = argv[i])) {
        case '<':                       /* <file            */
            if (freopen(++ap, "r", stdin) == NULL) {
                perror(ap);             /* Can't find file  */
                exit(errno);            /* Is a fatal error */
            }
            break;                      /* Don't fall into '>' */

        case '>':                       /* >file or >>file  */
            if (*++ap == '>') {         /* >>file           */
                /*
                 * If the file exists, and is writable by us,
                 * call freopen to append to the file (using the
                 * file's current attributes). Otherwise, create
                 * a new file with "vanilla" attributes as if
                 * the argument was given as ">filename".
                 * access(name, 2) returns 0 if we can write on
                 * the specified file.
                 */
                if (access(++ap, 2) == 0) {
                    if (freopen(ap, "a", stdout) != NULL)
                        break;          /* Exit case statement */
                    perror(ap);         /* Error, can't append */
                    exit(errno);        /* After access test   */
                }                       /* If file accessible  */
            }
            /*
             * On vms, we want to create the file using "standard"
             * record attributes. creat(...) creates the file
             * using the caller's default protection mask and
             * "variable length, implied carriage return"
             * attributes. dup2() associates the file with stdout.
*/ if ((file = creat(ap, 0, "rat=cr", "rfm=var")) == -1 || dup2(file, fileno(stdout)) == -1) { perror(ap); /* Can't create file */ exit(errno); /* is a fatal error */ } /* If '>' creation */ break; /* Exit case test */ default: argv[j++] = ap; /* Not a redirector */ break; /* Exit case test */ } } /* For all arguments */ return (j); } #endif -h- archc.c Sat Sep 1 21:43:35 1984 archc.c /* * A R C H I V E * * Create an archive * */ /*)BUILD $(TKBOPTIONS) = { TASK = ...ARC } */ #ifdef DOCUMENTATION title archc text file archive creation index text file archive creation synopsis archc file[s] >archive description Archc manages archives (libraries) of source files, allowing a large number of small files to be stored without using excessive system resources. It copies the set of named files to standard output in archive format. The archx program will recreate the files from an archive. Note: there are no checks against the same file appearing twice in an archive. archive file format Archive files are standard text files. Each archive element is preceeded by a line of the format: .s.nf -h- file.name date true_path_name .s.f Note that there is no line or byte count. To prevent problems, a '-' at the beginning of a record within a user file or embedded archive will be "quoted" by doubling it. The date and true filename fields are ignored. On Dec operating systems, file.name is forced to lowercase. 
diagnostics Diagnostic messages should be self-explanatory author Martin Minow #endif #include <stdio.h> #include <ctype.h> #define EOS 0 #define FALSE 0 #define TRUE 1 char text[513]; /* Working text */ char name[81]; /* Current archive member name */ char pathname[81]; /* Output for argetname() */ char *timetext; /* Time of day text */ int verbose = TRUE; /* TRUE for verbosity */ FILE *infd; /* Input file */ main(argc, argv) int argc; /* Arg count */ char *argv[]; /* Arg vector */ { register int i; /* Random counter */ register char *fn; /* File name pointer */ register char *argp; /* Arg pointer */ int nfiles; extern char *ctime(); extern long time(); long timval; time(&timval); timetext = ctime(&timval); timetext[24] = EOS; argc = getredirection(argc, argv); if (argc <= 1) fprintf(stderr, "No files to archive?\n"); #ifdef unix for (i = 1; i < argc; i++) { if ((infd = fopen(argv[i], "r")) == NULL) perror(argv[i]); else { strcpy(pathname, argv[i]); import(); fclose(infd); } } #else for (i = 1; i < argc; i++) { if ((infd = fwild(argv[i], "r")) == NULL) perror(argv[i]); else { for (nfiles = 0; fnext(infd) != NULL; nfiles++) { fgetname(infd, pathname); import(); } fclose(infd); if (nfiles == 0) fprintf(stderr, "No files match \"%s\"\n", argv[i]); } } #endif } import() /* * Add the file open on infd (with file name in pathname) to * the archive. */ { unsigned int nrecords; fixname(); nrecords = 0; printf("-h- %s\t%s\t%s\n", name, timetext, pathname); while (fgets(text, sizeof text, infd) != NULL) { if (text[0] == '-') putchar('-'); /* Quote */ fputs(text, stdout); nrecords++; } if (ferror(infd)) { perror(name); fprintf(stderr, "Error when importing a file\n"); } if (verbose) { fprintf(stderr, "%u records read from %s\n", nrecords, pathname); } } fixname() /* * Get file name (in pathname), stripping off device:[directory] * and ;version. The archive name ("file.ext") is written to name[]. * On a dec operating system, name is forced to lowercase. 
*/ { register char *tp; register char *ip; char bracket; extern char *strrchr(); #ifdef unix /* * name is after all directory information */ if ((tp = strrchr(pathname, '/')) != NULL) tp++; else tp = pathname; strcpy(name, tp); #else strcpy(name, pathname); if ((tp = strrchr(name, ';')) != NULL) *tp = EOS; while ((tp = strchr(name, ':')) != NULL) strcpy(name, tp + 1); switch (name[0]) { case '[': bracket = ']'; break; case '<': bracket = '>'; break; case '(': bracket = ')'; break; default: bracket = EOS; break; } if (bracket != EOS) { if ((tp = strchr(name, bracket)) == NULL) { fprintf(stderr, "? Illegal file name \"%s\"\n", pathname); } else { strcpy(name, tp + 1); } } for (tp = name; *tp != EOS; tp++) { if (isupper(*tp)) *tp = tolower(*tp); } #endif } #ifdef unix char * strrchr(stng, chr) register char *stng; register char chr; /* * Return rightmost instance of chr in stng. * This has the wrong name on some Unix systems. */ { register char *result; result = NULL; do { if (*stng == chr) result = stng; } while (*stng++ != EOS); return (result); } #endif /* * getredirection() is intended to aid in porting C programs * to VMS (Vax-11 C) which does not support '>' and '<' * I/O redirection. With suitable modification, it may * useful for other portability problems as well. */ static int getredirection(argc, argv) int argc; char **argv; /* * Process vms redirection arg's. Exit if any error is seen. * If getredirection() processes an argument, it is erased * from the vector. getredirection() returns a new argc value. * * Warning: do not try to simplify the code for vms. The code * presupposes that getredirection() is called before any data is * read from stdin or written to stdout. 
 *
 * Normal usage is as follows:
 *
 *      main(argc, argv)
 *      int             argc;
 *      char            *argv[];
 *      {
 *              argc = getredirection(argc, argv);
 *      }
 */
{
#ifdef vms
    register char   *ap;            /* Argument pointer     */
    int             i;              /* argv[] index         */
    int             j;              /* Output index         */
    int             file;           /* File_descriptor      */
    extern int      errno;          /* Last vms i/o error   */

    for (j = i = 1; i < argc; i++) {    /* Do all arguments */
        switch (*(ap = argv[i])) {
        case '<':                       /* <file            */
            if (freopen(++ap, "r", stdin) == NULL) {
                perror(ap);             /* Can't find file  */
                exit(errno);            /* Is a fatal error */
            }
            break;                      /* Don't fall into '>' */

        case '>':                       /* >file or >>file  */
            if (*++ap == '>') {         /* >>file           */
                /*
                 * If the file exists, and is writable by us,
                 * call freopen to append to the file (using the
                 * file's current attributes). Otherwise, create
                 * a new file with "vanilla" attributes as if
                 * the argument was given as ">filename".
                 * access(name, 2) returns 0 if we can write on
                 * the specified file.
                 */
                if (access(++ap, 2) == 0) {
                    if (freopen(ap, "a", stdout) != NULL)
                        break;          /* Exit case statement */
                    perror(ap);         /* Error, can't append */
                    exit(errno);        /* After access test   */
                }                       /* If file accessible  */
            }
            /*
             * On vms, we want to create the file using "standard"
             * record attributes. creat(...) creates the file
             * using the caller's default protection mask and
             * "variable length, implied carriage return"
             * attributes. dup2() associates the file with stdout.
             */
            if ((file = creat(ap, 0, "rat=cr", "rfm=var")) == -1
             || dup2(file, fileno(stdout)) == -1) {
                perror(ap);             /* Can't create file   */
                exit(errno);            /* is a fatal error    */
            }                           /* If '>' creation     */
            break;                      /* Exit case test      */

        default:
            argv[j++] = ap;             /* Not a redirector    */
            break;                      /* Exit case test      */
        }
    }                                   /* For all arguments   */
    return (j);
#else
    /*
     * Note: argv[] is referenced to fool the Decus C
     * syntax analyser, suppressing an unneeded warning
     * message.
*/ return (argv[0], argc); /* Just return as seen */ #endif } -h- cpp.rno Sat Sep 1 21:43:35 1984 cpp.rno .lm 8.rm 72.nhy .no autosubtitle .style headers 3,0,0 .pg.uc.ps 58,80.lm 8.rm 72 .hd .hd mixed .head mixed .st ########cpp#####C Pre-Processor .pg .hl 1 ^&C Pre-Processor\& .s 2 .c ;******* .c ;* cpp * .c ;******* .s 2 .lm +8 .s.i -8;NAME: cpp -- C Pre-Processor .s.f .i -8;SYNOPSIS: .s.nf cpp [-options] [infile [outfile]] .s.f .i -8;DESCRIPTION: .s CPP reads a C source file, expands macros and include files, and writes an input file for the C compiler. If no file arguments are given, cpp reads from stdin and writes to stdout. If one file argument is given, it will define the input file, while two file arguments define both input and output files. .s The following options are supported. Options may be given in either case. .lm +16 .p -16 --Idirectory Add this directory to the list of directories searched for _#include "..." and _#include <...> commands. Note that there is no space between the "-I" and the directory string. More than one -I command is permitted. On non-Unix systems "directory" is forced to upper-case. .p -16 --Dname=value Define the name as if the programmer wrote .s .nf _#define name value .fill .s at the start of the first file. If "=value" is not given, a value of "1" will be used. .s On non-unix systems, all alphabetic text will be forced to upper-case. .s .p -16 --Uname Undefine the name as if .s .nf _#undef name .fill .s were given. On non-Unix systems, "name" will be forced to upper-case. .s.lm -16 The following variables are pre-defined: .s Target computer (as appropriate): .s .nf pdp11, vax, M68000 m68000 m68k .fill .s Target operating system (as appropriate): .s .nf rsx, rt11, vms, unix .fill .s Target compiler (as appropriate): .s .nf decus, vax11c .fill .s The implementor may add definitions to this list. The default definitions match the definition of the host computer, operating system, and C compiler. 
.s The following are always available unless undefined: .lm +16 .p -12 ____FILE____ The input (or _#include) file being compiled (as a quoted string). .p -12 ____LINE____ The line number being compiled. .p -12 ____DATE____ The date and time of compilation as a Unix ctime quoted string (the trailing newline is removed). Thus, .s .nf printf("Bug at line _%s,", ____LINE____); printf(" source file _%s", ____FILE____); printf(" compiled on _%s", ____DATE____); .fill .p -16 --Xnumber Enable debugging code. If no value is given, a value of 1 will be used. (For maintenence of CPP only.) .s.lm -16 .s .i -8;DRAFT ANSI STANDARD CONSIDERATIONS: .s Comments are removed from the input text. The comment is replaced by a single space character. This differs from usage on some existing preprocessors (but it follows the Draft Ansi C Standard). .s Note that arguments may be concatenated as follows: .s.nf .nf _#define I(x)x _#define CAT(x,y)I(x)y int value = CAT(1,2); .fill .s.f If the above macros are defined and invoked without extraneous spaces, they will be transportable to other implementations. Unfortunately, this will not properly expand .s.nf .nf int CAT(foo,____LINE____); int CAT(foo,____LINE____); .fill .s.f as ____LINE____ is copied into the input stream, yielding "foo____LINE____" in both cases, rather than the expected "foo123", "foo124", which would result if ____LINE____ were expanded and the result copied into the input stream. .s Macro formal parameters are not recognized within quoted strings and character constants in macro definitions. .s CPP implements most of the ANSI draft standard. You should be aware of the following differences: .lm +4 .s.i-4;o###In the draft standard, the _\n (backslash-newline) character is "invisible" to all processing. In this implementation, it is invisible to strings, but acts a "whitespace" (token-delimiter) outside of strings. This considerably simplifies error message handling. 
.s.i-4;o###The following new features of C are processed by cpp: .s .br;####_#elif expression####(_#else _#if) .br;####'_\xNNN'#############(Hexadecimal constants) .br;####'_\a'################(Ascii BELL [silly]) .br;####'_\v'################(Ascii VT) .br;####_#if defined NAME####(1 if defined, 0 if not) .br;####_#if defined (NAME)##(1 if defined, 0 if not) .br;####_unary +#############(gag me with a spoon) .s.i-4;o###The draft standard has extended C, adding a string concatenation operator, where .s .nf "foo" "bar" .fill .s is regarded as the single string "foobar". (This does not affect CPP's processing.) .s.lm -4 .i -8;ERROR MESSAGES: .s Many. CPP prints warning messages if you try to use multiple-byte character constants (non-transportable) or if you _#undef a symbol that was not defined. .s .i -8;BUGS: .s Cpp prints spurious error or warning messages in _#if sequences such as the following: .s .br;####_#define foo 0 .br;####_#if (foo != 0) _? (100 / foo) _: 0 .br;####_#undef foo .br;####_#if ((defined(foo)) _? foo _: 0) == 1 .s Cpp should supress the error message if the expression's value is already known. .s .i -8;AUTHOR: .s Martin Minow .s .lm 8.rm 72.nhy -h- makefile.txt Sat Sep 1 21:43:35 1984 makefile.txt # Unix makefile for cpp CFLAGS = -O # # The following is needed for 4.2 bsd (and maybe some other Unices) # CFLAGS = -O -Dstrchr=index -Dstrrchr=rindex LINT = lint # # ** compile cpp # SRCS = cpp1.c cpp2.c cpp3.c OBJECTS.cpp = cpp1.o cpp2.o cpp3.o cpp: $(OBJECTS.cpp) $(CC) $(CFLAGS) $(OBJECTS.cpp) -o cpp # # ** Test cpp by preprocessing itself, compiling the result, # ** repeating the process and diff'ing the result. Note: this # ** is not a good test of cpp, but a simple verification. # ** The diff's should not report any changes. 
# test: cpp cpp1.c >old.tmp1.c cpp cpp2.c >old.tmp2.c cpp cpp3.c >old.tmp3.c $(CC) $(CFLAGS) old.tmp[123].c a.out cpp1.c >new.tmp1.c a.out cpp2.c >new.tmp2.c a.out cpp3.c >new.tmp3.c diff old.tmp1.c new.tmp1.c diff old.tmp2.c new.tmp2.c diff old.tmp3.c new.tmp3.c rm a.out old.tmp[123].* new.tmp[123].* # # ** Lint the code # lint: $(SRCS) $(LINT) $(SRCS) # # ** Rebuild the archive files needed to distribute cpp # ** Uses the Decus C archive utility. # archc: archc.c $(CC) $(CFLAGS) archc.c -o archc archx: archx.c $(CC) $(CFLAGS) archx.c -o archx archive: archc archc readme.txt cpp.mem archx.c archc.c cpp.rno makefile.txt >cpp1.arc archc cpp*.h cpp1.c >cpp2.arc archc cpp2.c cpp3.c >cpp3.arc cpp1.c : cpp.h cppdef.h cpp2.c : cpp.h cppdef.h cpp3.c : cpp.h cppdef.h
minow@decvax.UUCP (Martin Minow) (09/02/84)
-h- cpp.h Sat Sep 1 21:43:39 1984 cpp.h /* * I n t e r n a l D e f i n i t i o n s f o r C P P * * In general, definitions in this file should not be changed. */ #ifndef TRUE #define TRUE 1 #define FALSE 0 #endif #define EOS '\0' /* End of string */ #define EOF_CHAR 0 /* Returned by get() on eof */ #define NULLST ((char *) NULL) /* Pointer to nowhere (linted) */ #if COMMENT_INVISIBLE #define COM_SPACE 0x1F /* End of comment separator */ #endif #define DEF_NOARGS (-1) /* #define foo vs #define foo() */ /* * Note -- in Ascii, the following will map macro formals onto the C1 * control character region (decimal 128 .. (128 + NPARM)) which will * be ok as long as NPARM is less than 32). */ #define PFLAG 0x80 /* Macro formals start here */ #if NPARM >= 32 assertion fails -- NPARM isn't less than 32 #endif /* * Character type codes. */ #define INV 0 /* Invalid, must be zero */ #define OP_EOE INV /* End of expression */ #define DIG 1 /* Digit */ #define LET 2 /* Identifier start */ #define FIRST_BINOP OP_ADD #define OP_ADD 3 #define OP_SUB 4 #define OP_MUL 5 #define OP_DIV 6 #define OP_MOD 7 #define OP_ASL 8 #define OP_ASR 9 #define OP_AND 10 /* &, not && */ #define OP_OR 11 /* |, not || */ #define OP_XOR 12 #define OP_EQ 13 #define OP_NE 14 #define OP_LT 15 #define OP_LE 16 #define OP_GE 17 #define OP_GT 18 #define OP_ANA 19 /* && */ #define OP_ORO 20 /* || */ #define OP_QUE 21 /* ? */ #define OP_COL 22 /* : */ #define OP_CMA 23 /* , (relevant?) */ #define LAST_BINOP OP_CMA /* Last binary operand */ /* * The following are unary. */ #define FIRST_UNOP OP_PLU /* First Unary operand */ #define OP_PLU 24 /* + (draft ANSI standard) */ #define OP_NEG 25 /* - */ #define OP_COM 26 /* ~ */ #define OP_NOT 27 /* ! 
*/ #define LAST_UNOP OP_NOT #define OP_LPA 28 /* ( */ #define OP_RPA 29 /* ) */ #define OP_END 30 /* End of expression marker */ #define OP_MAX (OP_END + 1) /* Number of operators */ #define OP_FAIL (OP_END + 1) /* For error returns */ /* * The following are for lexical scanning only. */ #define QUO 65 /* Both flavors of quotation */ #define DOT 66 /* . can start a number */ #define SPA 67 /* Space and tab */ #define BSH 68 /* Just a backslash */ #define END 69 /* EOF */ /* * The DEFBUF structure stores information about #defined * macros. Note that the defbuf->repl information is always * in malloc storage. */ typedef struct defbuf { struct defbuf *link; /* Next define in chain */ char *repl; /* -> replacement */ int hash; /* Symbol table hash */ int nargs; /* For define(args) */ char name[1]; /* #define name */ } DEFBUF; /* * The FILEINFO structure stores information about open files * and macros being expanded. */ typedef struct fileinfo { char *bptr; /* Buffer pointer */ int line; /* for include or macro */ FILE *fp; /* File if non-null */ struct fileinfo *parent; /* Link to includer */ char *filename; /* File/macro name */ char *progname; /* From #line statement */ char buffer[1]; /* current input line */ } FILEINFO; /* * nomacarg is a built-in #define on Decus C. */ #if COMMENT_INVISIBLE #ifdef nomacarg #define cput output /* Comment concatenates tokens */ #else #define cput(c) { if (c != COM_SPACE) putchar(c); } #endif #else #define cput putchar /* Comment == space */ #define cget get /* Normal get routine */ #endif #ifndef nomacarg #define streq(s1, s2) (strcmp(s1, s2) == 0) #endif /* * Error codes. VMS uses system definitions. * Decus C codes are defined in stdio.h. * Others are cooked to order. 
*/ #if HOST == SYS_VMS #include <ssdef.h> #include <stsdef.h> #define IO_NORMAL (SS$_NORMAL | STS$M_INHIB_MSG) #define IO_ERROR SS$_ABORT #endif /* * Note: IO_NORMAL and IO_ERROR are defined in the Decus C stdio.h file */ #ifndef IO_NORMAL #define IO_NORMAL 0 #endif #ifndef IO_ERROR #define IO_ERROR 1 #endif /* * Externs */ extern int line; /* Current line number */ extern int wrongline; /* Force #line to cc pass 1 */ extern char type[]; /* Character classifier */ extern char token[IDMAX]; /* Current input token */ extern int instring; /* TRUE if scanning string */ extern int errors; /* Error counter */ extern int recursion; /* Macro depth counter */ extern FILEINFO *infile; /* Current input file */ extern char work[NWORK]; /* #define scratch */ extern char *workp; /* Free space in work */ #if DEBUG extern int debug; /* Debug level */ #endif extern char *getmem(); /* Get memory or die. */ extern DEFBUF *lookid(); /* Look for a #define'd thing */ extern DEFBUF *defendel(); /* Symbol table enter/delete */ extern char *savestring(); /* Stuff string in malloc mem. */ extern char *strcpy(); extern char *strcat(); extern char *strrchr(); extern char *strchr(); extern long time(); -h- cppdef.h Sat Sep 1 21:43:39 1984 cppdef.h /* * S y s t e m D e p e n d e n t * D e f i n i t i o n s f o r C P P * * Definitions in this file may be edited to configure CPP for particular * host operating systems and target configurations. * * NOTE: cpp assumes it is compiled by a compiler that supports macros * with arguments. If this is not the case (as for Decus C), #define * nomacarg -- and provide function equivalents for all macros. * * cpp also assumes the host and target implement the Ascii character set. * If this is not the case, you will have to do some editing here and there. */ /* * This redundant definition of TRUE and FALSE works around * a limitation of Decus C. */ #ifndef TRUE #define TRUE 1 #define FALSE 0 #endif /* * Define the HOST operating system. 
This is needed so that * cpp can use appropriate filename conventions. */ #define SYS_UNKNOWN 0 #define SYS_UNIX 1 #define SYS_VMS 2 #define SYS_RSX 3 #define SYS_RT11 4 #define SYS_LATTICE 5 #define SYS_ONYX 6 #define SYS_68000 7 #ifndef HOST #ifdef unix #define HOST SYS_UNIX #else #ifdef vms #define HOST SYS_VMS #else #ifdef rsx #define HOST SYS_RSX #else #ifdef rt11 #define HOST SYS_RT11 #endif #endif #endif #endif #endif #ifndef HOST #define HOST SYS_UNKNOWN #endif /* * We assume that the target is the same as the host system */ #ifndef TARGET #define TARGET HOST #endif /* * In order to predefine machine-dependent constants, * several strings are defined here: * * MACHINE defines the target cpu * SYSTEM defines the target operating system * COMPILER defines the target compiler * * The above may be #defined as "" if they are not wanted. * They should not be #defined as NULL. * * LINE_PREFIX defines the # output line prefix, if not "line" * FILE_LOCAL marks functions which are referenced only in the * file in which they reside. Some C compilers allow these * to be marked "static" even though they are referenced * by "extern" statements elsewhere.
*/ #if TARGET == SYS_LATTICE #define MACHINE "i8086" #define SYSTEM "pcdos" /* Dos for IBM PC */ #endif #if TARGET == SYS_ONYX #define MACHINE "z8000" #define SYSTEM "unix" #endif #if TARGET == SYS_VMS #define MACHINE "vax" #define SYSTEM "vms" #define COMPILER "vax11c" #endif #if TARGET == SYS_RSX #define MACHINE "pdp11" #define SYSTEM "rsx" #define COMPILER "decus" #endif #if TARGET == SYS_RT11 #define MACHINE "pdp11" #define SYSTEM "rt11" #define COMPILER "decus" #endif #if TARGET == SYS_68000 #define MACHINE "M68000", "m68000", "m68k" #define SYSTEM "unix" #endif #if TARGET == SYS_UNIX #define SYSTEM "unix" #ifdef pdp11 #define MACHINE "pdp11" #else #ifdef vax #define MACHINE "vax" #endif #endif #endif /* * defaults */ #ifndef MSG_PREFIX #define MSG_PREFIX "cpp: " #endif #ifndef LINE_PREFIX #ifdef decus #define LINE_PREFIX "" #else #define LINE_PREFIX "line" #endif #endif /* * BITS_CHAR may be defined to set the number of bits per character. * it is needed only for multi-byte character constants. */ #ifndef BITS_CHAR #define BITS_CHAR 8 #endif /* * BIG_ENDIAN is set TRUE on machines (such as the IBM 360 series) * where 'ab' stores 'a' in the high-bits and 'b' in the low-bits. * It is set FALSE on machines (such as the PDP-11 and Vax-11) * where 'ab' stores 'a' in the low-bits and 'b' in the high-bits. * (Or is it the other way around?) */ #ifndef BIG_ENDIAN #define BIG_ENDIAN FALSE #endif /* * NO_REG_UNION is set TRUE for host compilers that do not allow * the following construction: * register union { * int i; * char *p; * } */ #ifndef NO_REG_UNION #ifdef pcc #define NO_REG_UNION 1 #else #define NO_REG_UNION 0 #endif #endif #if NO_REG_UNION #define REG_UNION union #else #define REG_UNION register union #endif /* * COMMENT_INVISIBLE may be defined to allow "old-style" comment * processing, whereby the comment becomes a zero-length token * delimiter. This permitted tokens to be concatenated in macro * expansions. This was removed from the Draft Ansi Standard. 
*/ #ifndef COMMENT_INVISIBLE #define COMMENT_INVISIBLE 0 /* Comment == space */ #endif /* * STRING_FORMAL may be defined to allow recognition of macro parameters * in replacement strings. This was removed from the Draft Ansi Standard. */ #ifndef STRING_FORMAL #define STRING_FORMAL 0 /* No string formals */ #endif /* * Some common definitions. */ #ifndef DEBUG #define DEBUG 1 /* Compile debugging code */ #endif /* * The following definitions are used to allocate memory for * work buffers. In general, they should not be modified * by implementors. * * NPARM The maximum number of #define parameters * IDMAX The longest identifier * NBUFF Input buffer size * NWORK Work buffer size -- the longest macro * must fit here after expansion. * NEXP The nesting depth of #if expressions. * NINCLUDE The number of directories that may be specified * on a per-system basis, or by the -I option. */ #define NPARM 10 /* Max number of parameters */ #ifndef IDMAX #define IDMAX 31 /* Longest identifier per std. */ #endif #define NBUFF 256 /* Input buffer (line) size */ #define NWORK 256 /* Work buffer size */ #define NEXP 20 /* #if expression stack depth */ #define NINCLUDE 7 /* #include directories */ #define NPARMWORK (NWORK * 2) /* Parm work buffer size */ /* * Some special constants. These may need to be changed if cpp * is ported to a weird machine. * * NOTE: if cpp is run on a non-ascii machine, ALERT and VT may * need to be changed. They are used to implement the proposed * ANSI standard C control characters '\a' and '\v' only. * DEL is used to tag macro tokens to prevent #define foo foo * from looping. Note that we don't try to prevent more elaborate * #define loops from occurring.
*/ #define ALERT '\007' /* '\a' is "Bell" */ #define VT '\013' /* Vertical Tab (CTRL/K) */ #define DEL '\177' /* Magic for #defines */ #ifndef FILE_LOCAL #ifdef decus #define FILE_LOCAL static #else #ifdef vax11c #define FILE_LOCAL static #else #define FILE_LOCAL /* gets global scope on others */ #endif #endif #endif -h- cpp1.c Sat Sep 1 21:43:39 1984 cpp1.c /* * CPP main program. * * Edit history * 21-May-84 MM "Field test" release * 23-May-84 MM Some minor hacks. * 30-May-84 ARF Didn't get enough memory for __DATE__ * Added code to read stdin if no input * files are provided. * 29-Jun-84 MM Added ARF's suggestions, Unixifying cpp. * 11-Jul-84 MM "Official" first release (that's what I thought!) * 22-Jul-84 MM/ARF/SCK Fixed line number bugs, added cpp recognition * of #line, fixed problems with #include. * 23-Jul-84 MM More (minor) include hacking, some documentation. * Also, redid cpp's #include files * 25-Jul-84 MM #line filename isn't used for #include searchlist * #line format is <number> <optional name> * 25-Jul-84 ARF/MM Various bugs, mostly serious. Removed homemade doprint * 01-Aug-84 MM Fixed recursion bug, remove extra newlines and * leading whitespace from cpp output. * 02-Aug-84 MM Hacked (i.e. optimized) out blank lines and unneeded * whitespace in general. Cleaned up unget()'s. * 03-Aug-84 Keie Several bug fixes from Ed Keizer, Vrije Universiteit. * -- corrected arg. count in -D and pre-defined * macros. Also, allow \n inside macro actual parameter * lists. * 06-Aug-84 MM If debugging, dump the preset vector at startup. * 12-Aug-84 MM/SCK Some small changes from Sam Kendall * 15-Aug-84 Keie/MM cerror, cwarn, etc. take a single string arg. * cierror, etc. take a single int. arg. * changed LINE_PREFIX slightly so it can be * changed in the makefile. * 31-Aug-84 MM USENET net.sources release.
*/ /*)BUILD $(PROGRAM) = cpp $(FILES) = { cpp1 cpp2 cpp3 } $(INCLUDE) = { cppdef.h cpp.h } $(STACK) = 2000 $(TKBOPTIONS) = { STACK = 2000 } */ #ifdef DOCUMENTATION title cpp C Pre-Processor index C pre-processor synopsis .s.nf cpp [-options] [infile [outfile]] .s.f description CPP reads a C source file, expands macros and include files, and writes an input file for the C compiler. If no file arguments are given, cpp reads from stdin and writes to stdout. If one file argument is given, it will define the input file, while two file arguments define both input and output files. The following options are supported. Options may be given in either case. .lm +16 .p -16 -Idirectory Add this directory to the list of directories searched for #include "..." and #include <...> commands. Note that there is no space between the "-I" and the directory string. More than one -I command is permitted. On non-Unix systems "directory" is forced to upper-case. .p -16 -Dname=value Define the name as if the programmer wrote .s #define name value .s at the start of the first file. If "=value" is not given, a value of "1" will be used. .s On non-unix systems, all alphabetic text will be forced to upper-case. .s .p -16 -Uname Undefine the name as if .s #undef name .s were given. On non-Unix systems, "name" will be forced to upper-case. .s.lm -16 The following variables are pre-defined: .s Target computer (as appropriate): .s pdp11, vax, M68000 m68000 m68k .s Target operating system (as appropriate): .s rsx, rt11, vms, unix .s Target compiler (as appropriate): .s decus, vax11c .s The implementor may add definitions to this list. The default definitions match the definition of the host computer, operating system, and C compiler. .s The following are always available unless undefined: .lm +16 .p -12 __FILE__ The input (or #include) file being compiled (as a quoted string). .p -12 __LINE__ The line number being compiled. 
.p -12 __DATE__ The date and time of compilation as a Unix ctime quoted string (the trailing newline is removed). Thus, .s printf("Bug at line %d,", __LINE__); printf(" source file %s", __FILE__); printf(" compiled on %s", __DATE__); .p -16 -Xnumber Enable debugging code. If no value is given, a value of 1 will be used. (For maintenance of CPP only.) .s.lm -16 Draft Ansi Standard Considerations Comments are removed from the input text. The comment is replaced by a single space character. This differs from usage on some existing preprocessors (but it follows the Draft Ansi C Standard). Note that arguments may be concatenated as follows: .s.nf #define I(x)x #define CAT(x,y)I(x)y int value = CAT(1,2); .s.f If the above macros are defined and invoked without extraneous spaces, they will be transportable to other implementations. Unfortunately, this will not properly expand .s.nf int CAT(foo,__LINE__); int CAT(foo,__LINE__); .s.f as __LINE__ is copied into the input stream, yielding "foo__LINE__" in both cases, rather than the expected "foo123", "foo124", which would result if __LINE__ were expanded and the result copied into the input stream. Macro formal parameters are not recognized within quoted strings and character constants in macro definitions. CPP implements most of the ANSI draft standard. You should be aware of the following differences: .lm +4 .s.i-4;o###In the draft standard, the _\n (backslash-newline) character is "invisible" to all processing. In this implementation, it is invisible to strings, but acts as "whitespace" (token-delimiter) outside of strings. This considerably simplifies error message handling.
.s.i-4;o###The following new features of C are processed by cpp: .s .br;####_#elif expression####(_#else _#if) .br;####'_\xNNN'#############(Hexadecimal constants) .br;####'_\a'################(Ascii BELL [silly]) .br;####'_\v'################(Ascii VT) .br;####_#if defined NAME####(1 if defined, 0 if not) .br;####_#if defined (NAME)##(1 if defined, 0 if not) .br;####_unary +#############(gag me with a spoon) .s.i-4;o###The draft standard has extended C, adding a string concatenation operator, where .s "foo" "bar" .s is regarded as the single string "foobar". (This does not affect CPP's processing.) .s.lm -4 error messages Many. CPP prints warning messages if you try to use multiple-byte character constants (non-transportable) or if you #undef a symbol that was not defined. bugs Cpp prints spurious error or warning messages in #if sequences such as the following: .s .br;####_#define foo 0 .br;####_#if (foo != 0) _? (100 / foo) _: 0 .br;####_#undef foo .br;####_#if ((defined(foo)) _? foo _: 0) == 1 .s Cpp should suppress the error message if the expression's value is already known. author Martin Minow #endif #include <stdio.h> #include <ctype.h> #include "cppdef.h" #include "cpp.h" /* * Commonly used global variables: * line is the current input line number. * wrongline is set in many places when the actual output * line is out of sync with the numbering, e.g., * when expanding a macro with an embedded newline. * * Note that line and wrongline are initialized in such * a way that the code starts by outputting a #line. * * token holds the last identifier scanned (which might * be a candidate for macro expansion). * errors is the running cpp error counter. * infile is the head of a linked list of input files (extended by * #include and macros being expanded). infile always points * to the current file/macro, infile->parent points to the includer, * etc. infile->fp is NULL if this input stream is a macro.
*/ int line; /* Current line number */ int wrongline; /* Force #line to compiler */ char token[IDMAX]; /* Current input token */ int errors; /* cpp error counter */ FILEINFO *infile = NULL; /* Current input file */ #if DEBUG int debug; /* TRUE if debugging now */ #endif /* * This counter is incremented when a macro expansion is initiated. * If it exceeds a built-in value, the expansion stops -- this tests * for a runaway condition: * #define X Y * #define Y X * X * It is decremented, in get(), when the macro expansion terminates. */ int recursion; /* Infinite recursion counter */ /* * instring is set TRUE when a string is scanned. It modifies the * behavior of the "get next character" routine -- comments aren't * skipped over, and \<newline> is silently absorbed. It is set * by routines that scan "string" and 'char'. It is essentially * a parameter to the get() routine, but made global for speed. */ int instring = FALSE; /* TRUE if scanning string */ /* * work[] and workp are used to store one piece of text in a temporary * buffer. To initialize storage, set workp = work. To store one * character, call save(c); (This will fatally exit if there isn't * room.) To terminate the string, call save(EOS). Note that * the work buffer is used by several subroutines -- be sure your * data won't be overwritten. */ char work[NWORK]; /* Work buffer */ char *workp; /* Work buffer pointer */ /* * flevel and tlevel are used to compute #if nesting. If flevel == 0, * cpp is emitting tokens; if > 0, it is skipping over tokens to an * #else or #endif. Hard to understand code in control() modifies the * counters when #else, #endif, or another #if is processed. */ static int flevel = 0; /* #ifdef false level */ static int tlevel = 0; /* #ifdef true level */ /* * incdir[] and ninclude store the -i directories (and the system-specific * #include <...> directories).
*/ static char *incdir[NINCLUDE]; /* -i directories */ static int ninclude; /* Number of -i directories */ /* * This is the table used to predefine target machine and operating * system designators. It may need hacking for specific circumstances. * Note: it is not clear that this is part of the Ansi Standard. */ static char *preset[] = { /* names defined at cpp start */ #ifdef MACHINE MACHINE, #endif #ifdef SYSTEM SYSTEM, #endif #ifdef COMPILER COMPILER, #endif #if DEBUG "decus_cpp", /* Ourselves! */ #endif NULL /* Must be last */ }; /* * The value of these predefined symbols must be recomputed whenever * they are evaluated. The order must not be changed. */ static char *magic[] = { /* Note: order is important */ "__LINE__", "__FILE__", NULL /* Must be last */ }; main(argc, argv) int argc; char *argv[]; { register int i; #if HOST == SYS_VMS argc = getredirection(argc, argv); #endif initdefines(); switch (dooptions(argc, argv)) { case 0: /* No args? */ case 1: /* No files, stdin -> stdout */ #if HOST == SYS_UNIX work[0] = EOS; /* Unix can't find stdin name */ #else fgetname(stdin, work); /* Vax-11C, Decus C know name */ #endif break; case 3: #if HOST == SYS_VMS /* * Reopen stdout with "vanilla rms" attributes. 
*/ if ((i = creat(argv[2], 0, "rat=cr", "rfm=var")) == -1 || dup2(i, fileno(stdout)) == -1) { #else if (freopen(argv[2], "w", stdout) == NULL) { #endif perror(argv[2]); cerror("Can't open output file \"%s\"", argv[2]); exit(IO_ERROR); } /* Continue by opening input */ case 2: /* One file -> stdout */ if (freopen(argv[1], "r", stdin) == NULL) { perror(argv[1]); cerror("Can't open input file \"%s\"", argv[1]); exit(IO_ERROR); } strcpy(work, argv[1]); /* Remember input filename */ break; default: exit(IO_ERROR); /* Can't happen */ } setincdirs(); /* Setup -I include directories */ addfile(stdin, work); /* "open" main input file */ #if DEBUG if (debug > 0) dumpdef("preset #define symbols"); #endif cppmain(); /* Process main file */ if ((i = flevel + tlevel) != 0) cierror("Inside #ifdef block at end of input, depth = %d", i); fclose(stdout); if (errors > 0) { fprintf(stderr, (errors == 1) ? "%d error in preprocessor" : "%d errors in preprocessor", errors); exit(IO_ERROR); } exit(IO_NORMAL); } FILE_LOCAL cppmain() /* * Main process for cpp -- copies tokens from the current input * stream (main file, include file, or a macro) to the output * file. */ { register int c; /* Current character */ register int counter; /* newlines and spaces */ extern int output(); /* Output one character */ /* * Explicitly output a #line at the start of cpp output so * that lint (etc.) knows the name of the original source * file. If we don't do this explicitly, we may get * the name of the first #include file instead. */ line = 1; sharp(); line = 0; /* * This loop is started "from the top" at the beginning of each * line. wrongline is set TRUE in many places if it is necessary * to write a #line record. (But we don't write them when expanding * macros.) * The counter variable has two different uses: at * the start of a line, it counts the number of blank lines that * have been skipped over. These are then either output via * #line records or by outputting explicit blank lines. 
* * When expanding tokens within a line, the counter remembers * whether a blank/tab has been output. These are dropped * at the end of the line, and replaced by a single blank * within lines. */ for (;;) { for (counter = 0;; counter++) { while (type[(c = get())] == SPA) /* Skip leading blanks */ ; /* in this line. */ if (c == '\n') /* If line's all blank, */ ; /* Do nothing now */ else if (c == '#') /* Is 1st non-space '#' */ control(); /* Yes, do a #command */ else if (c == EOF_CHAR) /* At end of file? */ break; else if (flevel > 0) /* #ifdef false? */ skipnl(); /* Skip to newline */ else { break; /* Actual token */ } } if (c == EOF_CHAR) /* Exit process at */ break; /* End of file */ /* * If the loop didn't terminate because of end of file, we * know there is a token to compile. First, clean up after * absorbing newlines. counter has the number we skipped. */ if (wrongline && infile->fp != NULL) sharp(); /* Output # line number */ else if (counter > 0) { /* Get rid of the */ if (counter > 4) /* pending newlines. */ sharp(); /* (lots of them here) */ else { /* If just a few, stuff */ while (--counter >= 0) /* them out ourselves */ putchar('\n'); } } /* * Process each token on this line. counter * is now used to skip over trailing blanks. */ for (counter = 0; c != EOF_CHAR && c != '\n';) { if (type[c] == SPA) counter++; else { if (counter > 0) { /* Any pending */ putchar(' '); /* whitespace is output */ counter = 0; /* restart the counter */ } switch (type[c]) { case LET: if (!macroid(c)) /* Scan ID; do macros */ fputs(token, stdout); /* Just output if not */ break; case DIG: /* Digits and '.' may */ case DOT: /* begin numbers */ scannumber(c, output); /* Output the number */ break; case QUO: /* char or string const */ scanstring(c, output); /* Copy it to output */ break; default: /* Some other character */ cput(c); /* Just output it */ break; } /* Switch ends */ } /* if not a space */ c = get(); /* And get another */ } if (c == '\n') { /* Compiling at EOL? 
*/ putchar('\n'); /* Output newline, if */ if (infile->fp == NULL) /* Expanding a macro, */ wrongline = TRUE; /* Output # line later */ } } /* Continue until EOF */ } FILE_LOCAL output(c) int c; /* * Output one character to stdout -- output() is passed as an * argument to scanstring() */ { #if COMMENT_INVISIBLE if (c != COM_SPACE) putchar(c); #else putchar(c); #endif } static char *sharpfilename = NULL; FILE_LOCAL sharp() /* * Output a line number line. */ { register char *name; printf("#%s %d", LINE_PREFIX, line); if (infile->fp != NULL) { name = (infile->progname != NULL) ? infile->progname : infile->filename; if (sharpfilename == NULL || sharpfilename != NULL && !streq(name, sharpfilename)) { if (sharpfilename != NULL) free(sharpfilename); sharpfilename = savestring(name); printf(" \"%s\"", name); } } putchar('\n'); wrongline = FALSE; } /* * Process #control lines */ #define ISIFNDEF FALSE /* Must be FALSE */ #define ISIFDEF TRUE /* Must be TRUE */ #define ISIF (TRUE + 1) /* Must have one bit set */ #if (ISIF == ISIFNDEF) error << The above won't work >> #endif /* * The following is generated by a "perfect hash" routine.
*/ #define L_else 4 #define L_line 5 #define L_define 6 #define L_elif 7 #define L_endif 8 #define L_if 9 #define L_undef 10 #define L_include 11 #define L_ifdef 12 #define L_ifndef 13 #define L_assert 14 #define L_option 15 #define FIRST 'a' #define LAST 'u' static char px_assoc[] = { 0, /* 'a' */ -1, /* 'b' */ -1, /* 'c' */ 0, /* 'd' */ 0, /* 'e' */ 3, /* 'f' */ -1, /* 'g' */ -1, /* 'h' */ 4, /* 'i' */ -1, /* 'j' */ -1, /* 'k' */ 1, /* 'l' */ -1, /* 'm' */ 9, /* 'n' */ 0, /* 'o' */ -1, /* 'p' */ -1, /* 'q' */ -1, /* 'r' */ -1, /* 's' */ 8, /* 't' */ 2, /* 'u' */ }; static char *px_table[] = { NULL, /* 0 */ NULL, /* 1 */ NULL, /* 2 */ NULL, /* 3 */ "else", /* 4 */ "line", /* 5 */ "define", /* 6 */ "elif", /* 7 */ "endif", /* 8 */ "if", /* 9 */ "undef", /* 10 */ "include", /* 11 */ "ifdef", /* 12 */ "ifndef", /* 13 */ "assert", /* 14 */ "option", /* 15 */ }; FILE_LOCAL control() /* * Process #control lines. Simple commands are processed inline, * while complex commands have their own subroutines. */ { register int c; register char *tp; register int hash; char *ep; c = skipws(); if (c == '\n' || c == EOF_CHAR) return; scanid(token, c); /* * Look for keyword (string of alpha) in the perfect hash table. * Set hash to the index (L_xxx value) or 0 if not found */ if (token[0] < FIRST || token[0] > LAST) hash = 0; else { for (tp = token; isalpha(*tp); tp++) ; hash = (tp - token); if (*--tp < FIRST || *tp > LAST) hash = 0; else { hash += px_assoc[*token - FIRST] + px_assoc[*tp - FIRST]; if (px_table[hash] == NULL || !streq(token, px_table[hash])) hash = 0; } } /* * hash is now set to a unique value corresponding to the * control keyword (or zero if it's not in the table). */ if (infile->fp == NULL) cwarn("Control line \"%s\" within macro expansion", token); if (flevel > 0) { switch (hash) { case L_line: /* These aren't */ case L_include: /* interesting */ case L_define: /* if we */ case L_undef: /* aren't */ case L_assert: /* compiling. 
*/ case L_option: /* New option, too. */ skipnl(); return; } } switch (hash) { case L_line: /* * Parse the line to update the line number and "progname" * field and line number for the next input line. * Set wrongline to force it out later. */ c = skipws(); workp = work; /* Save name in work */ while (c != '\n' && c != EOF_CHAR) { if (c != '"') save(c); c = get(); } unget(); save(EOS); /* * Split #line argument into <line-number> and <name> * We subtract 1 as we want the number of the next line. */ line = atoi(work) - 1; /* Reset line number */ for (tp = work; isdigit(*tp) || type[*tp] == SPA; tp++) ; /* Skip over digits */ if (*tp != EOS) { /* Got a filename, so: */ if (*tp == '"' && (ep = strrchr(tp + 1, '"')) != NULL) { tp++; /* Skip over left quote */ *ep = EOS; /* And ignore right one */ } if (infile->progname != NULL) /* Give up the old name */ free(infile->progname); /* if it's allocated. */ infile->progname = savestring(tp); } wrongline = TRUE; /* Force output later */ break; case L_include: doinclude(); break; case L_define: dodefine(); break; case L_undef: doundef(); break; case L_ifdef: doif(ISIFDEF); break; case L_ifndef: doif(ISIFNDEF); break; case L_elif: case L_else: if (flevel == 0) { /* Compiling now? */ if (tlevel == 0) /* Yes, but in an if? */ cerror("#%s without corresponding #if", (hash == L_elif) ? "elif" : "else"); else { /* Ok: */ flevel++; /* Make it false. */ tlevel--; /* False isn't true */ } } else if (--flevel == 0) { /* Drop false count, */ tlevel++; /* Step true if need be */ wrongline = TRUE; /* Need #line now */ } else { /* Not compiling yet so */ flevel++; /* Keep it false */ } if (hash == L_else) /* Else stops here */ break; if (flevel > 0) /* #elif, fake an */ flevel--; /* #endif and fall */ else if (tlevel > 0) /* into #if */ tlevel--; /* processor. 
*/ case L_if: doif(ISIF); break; case L_endif: if (flevel > 0) { /* If not compiling */ if (--flevel == 0) /* Maybe start, if so, */ wrongline = TRUE; /* Need a #line first */ } else if (tlevel > 0) /* Still compiling, but */ tlevel--; /* Drop true counter */ else { cerror("#endif without corresponding #if", NULLST); } break; case L_assert: if (eval() == 0) cerror("Preprocessor assertion failure", NULLST); break; case L_option: /* * #option is provided to pass "pragmas" to later * passes of the compiler. cpp doesn't have any yet. */ printf("#option "); while ((c = get()) != '\n' && c != EOF_CHAR) cput(c); unget(); break; default: #if DEBUG /* * For debugging, we allow #debug and #nodebug */ if (streq("debug", token)) { debug++; break; } if (streq("nodebug", token)) { debug--; break; } #endif /* * Undefined #control keyword. * Note: the correct behavior may be to warn and * pass the line to a subsequent compiler pass. * This would allow #asm or similar extensions. */ cwarn("Illegal # line", NULLST); skipws(); unget(); break; } #if 1 skipnl(); /* Dump rest of control line */ #else if (skipws() != '\n') { /* * Some people have written: * #ifdef foobar * ... * #endif foobar * * Vax-11 C doesn't print a warning, so we don't either. */ cwarn("Unrecognized text after control command", NULLST); while ((c = get()) != '\n' && c != EOF_CHAR) ; } #endif } FILE_LOCAL doif(isifdef) int isifdef; /* * Process an #if, #ifdef, or #ifndef. The latter two are straightforward, * while #if needs a subroutine of its own to evaluate the expression. * Eventually, tlevel and flevel are modified accordingly. */ { register int c; register int found; if ((c = skipws()) == '\n' || c == EOF_CHAR) { unget(); goto badif; } if (isifdef == ISIF) { unget(); found = (eval() != 0); /* Evaluate expr, != 0 is TRUE */ isifdef = TRUE; /* #if is now like #ifdef */ } else { if (type[c] != LET) /* Next non-blank isn't letter */ goto badif; /* ... 
is an error */ found = (lookid(c) != NULL); /* Look for it in symbol table */ } if (flevel == 0 && (isifdef == found)) tlevel++; else flevel++; return; badif: cerror("#if, #ifdef, or #ifndef without an argument", NULLST); } FILE_LOCAL doinclude() /* * Process the #include control line. */ { register int c; int delim; delim = skipws(); if (delim != '<' && delim != '"') goto incerr; if (delim == '<') delim = '>'; workp = work; while ((c = get()) != EOF_CHAR && c != '\n' && c != delim) { #if COMMENT_INVISIBLE if (c != COM_SPACE) save(c); #else save(c); #endif } save(EOS); if (c != delim) goto incerr; skipnl(); /* Ignore rest of #include line */ unget('\n'); /* Force nl after includee */ openinclude(work, (delim == '"')); return; incerr: cerror("#include syntax error", NULLST); return; } FILE_LOCAL openinclude(filename, searchlocal) char *filename; /* Input file name */ int searchlocal; /* TRUE if #include "file" */ /* * Actually open an include file. This routine is only called from * doinclude() above, but was written as a separate subroutine for * programmer convenience. It searches the list of directories * and actually opens the file, linking it into the list of * active files. */ { register char *tp; /* -> source file name */ register int i; char tmpname[NWORK]; /* Filename work area */ if (searchlocal) { /* * Look in local directory first */ #if HOST == SYS_UNIX /* * Try to open filename relative to the directory of the current * source file (as opposed to the current directory). (ARF, SCK). */ if (filename[0] == '/' || (tp = strrchr(infile->filename, '/')) == NULL) strcpy(tmpname, filename); else { sprintf(tmpname, "%.*s/%s", tp - infile->filename, infile->filename, filename); } if (openfile(tmpname)) return; #else /* * Same problem, but for DEC operating systems. 
* Filenames may have "device:[directory]" */ if (strchr(filename, ']') != NULL || strchr(filename, ':') != NULL || ( (tp = strrchr(infile->filename, ']')) == NULL && (tp = strrchr(infile->filename, ':')) == NULL)) strcpy(tmpname, filename); else { sprintf(tmpname, "%.*s%s", tp - infile->filename + 1, infile->filename, filename); } if (openfile(tmpname)) return; #endif } /* * Look in any directories specified by -I command line * arguments, then in the builtin search list. */ for (i = 0; i < ninclude; i++) { if (strlen(incdir[i]) + strlen(filename) >= (NWORK - 1)) cfatal("Filename work buffer overflow", NULLST); else { #if HOST == SYS_UNIX if (filename[0] == '/') strcpy(tmpname, filename); else { sprintf(tmpname, "%s/%s", incdir[i], filename); } #else if (strrchr(filename, ']') != NULL || strrchr(filename, ':') != NULL) strcpy(tmpname, filename); else { sprintf(tmpname, "%s%s", incdir[i], filename); } #endif if (openfile(tmpname)) return; } } /* * No sense continuing if #include file isn't there. */ cfatal("Cannot open include file \"%s\"", filename); } FILE_LOCAL int openfile(filename) char *filename; /* * Open a file, add it to the linked list of open files. * This is called only from openinclude() above. */ { register FILE *fp; if ((fp = fopen(filename, "r")) == NULL) return (FALSE); #if DEBUG if (debug) fprintf(stderr, "Reading from \"%s\"\n", filename); #endif addfile(fp, filename); return (TRUE); } FILE_LOCAL addfile(fp, filename) FILE *fp; /* Open file pointer */ char *filename; /* Name of the file */ /* * Initialize tables for this open file. This is called from openfile() * above (for #include files), and from the entry to cpp to open the main * input file. It calls a common routine, getfile() to build the FILEINFO * structure which is used to read characters. (getfile() is also called * to setup a macro replacement.)
*/ { register FILEINFO *file; extern FILEINFO *getfile(); file = getfile(NBUFF, filename); file->fp = fp; /* Better remember FILE * */ file->buffer[0] = '\n'; /* Fake initial newline to */ file->buffer[1] = EOS; /* initialize for first read */ line = 0; /* Note correct line number */ wrongline = TRUE; /* Force out initial #line */ } FILE_LOCAL setincdirs() /* * Append system-specific directories to the include directory list. * Called only when cpp is started. */ { #if HOST == SYS_UNIX incdir[ninclude++] = "/usr/include"; #define MAXINCLUDE (NINCLUDE - 1) #endif #if HOST == SYS_VMS extern char *getenv(); if (getenv("C$LIBRARY") != NULL) incdir[ninclude++] = "C$LIBRARY:"; incdir[ninclude++] = "SYS$LIBRARY:"; #define MAXINCLUDE (NINCLUDE - 2) #endif #if HOST == SYS_RSX extern int $$rsts; /* TRUE on RSTS/E */ extern int $$pos; /* TRUE on PRO-350 P/OS */ extern int $$vms; /* TRUE on VMS compat. */ if ($$pos) { /* P/OS? */ incdir[ninclude++] = "SY:[ZZDECUSC]"; incdir[ninclude++] = "LB:[1,5]"; } else if ($$rsts) { /* RSTS/E? */ incdir[ninclude++] = "SY:@"; /* User-defined account */ incdir[ninclude++] = "C:"; /* Decus-C library */ incdir[ninclude++] = "LB:[1,1]"; /* RSX library */ } else if ($$vms) { /* VMS compatibility? */ incdir[ninclude++] = "C:"; } else { /* Plain old RSX/IAS */ incdir[ninclude++] = "LB:[1,1]"; } #define MAXINCLUDE (NINCLUDE - 3) #endif #if HOST == SYS_RT11 extern int $$rsts; /* RSTS/E emulation? */ if ($$rsts) incdir[ninclude++] = "SY:@"; /* User-defined account */ incdir[ninclude++] = "C:"; /* Decus-C library disk */ incdir[ninclude++] = "SY:"; /* System (boot) disk */ #define MAXINCLUDE (NINCLUDE - 3) #endif } FILE_LOCAL int dooptions(argc, argv) int argc; char *argv[]; /* * dooptions is called to process command line arguments (-Detc). * It is called only at cpp startup. 
*/ { register char *ap; register DEFBUF *dp; register int c; int i, j; char *arg; for (i = j = 1; i < argc; i++) { arg = ap = argv[i]; if (*ap++ != '-') argv[j++] = argv[i]; else { c = *ap++; /* Option byte */ if (islower(c)) /* Normalize case */ c = toupper(c); switch (c) { /* Command character */ case 'I': /* Include directory */ if (ninclude >= MAXINCLUDE) cfatal("Too many include directories", NULLST); incdir[ninclude++] = ap; break; case 'D': /* Define symbol */ #if HOST != SYS_UNIX zap_uc(ap); /* Force define to U.C. */ #endif /* * If the option is just "-Dfoo", make it -Dfoo=1 */ while (*ap != EOS && *ap != '=') ap++; if (*ap == EOS) ap = "1"; else *ap++ = EOS; /* * Now, save the word and its definition. */ dp = defendel(argv[i] + 2, FALSE); dp->repl = savestring(ap); dp->nargs = DEF_NOARGS; break; case 'U': /* Undefine symbol */ #if HOST != SYS_UNIX zap_uc(ap); #endif if (defendel(ap, TRUE) == NULL) cwarn("\"%s\" wasn't defined", ap); break; #if DEBUG case 'X': /* Debug */ debug = (isdigit(*ap)) ? atoi(ap) : 1; fprintf(stderr, "Debug set to %d\n", debug); break; #endif default: /* What is this one? */ cwarn("Unknown option \"%s\"\n\ The following options are valid:\n\ -Dsymbol=value\tDefine a symbol with the given (optional) value\n\ -Idirectory\tAdd a directory to the #include search list\n\ -Usymbol\tUndefine symbol\n", arg); #if DEBUG fprintf(stderr, "-Xvalue\tSet internal debug flag\n"); #endif break; } /* Switch on all options */ } /* If it's a -option */ } /* For all arguments */ if (j > 3) { cerror( "Too many file arguments. Usage: cpp [input [output]]", NULLST); } return (j); /* Return new argc */ } #if HOST != SYS_UNIX FILE_LOCAL zap_uc(ap) register char *ap; /* * Dec operating systems mangle upper-lower case in command lines. * This routine forces the -D and -U arguments to uppercase. * It is called only on cpp startup by dooptions(). 
*/ { while (*ap != EOS) { /* * Don't use islower() here so it works with Multinational */ if (*ap >= 'a' && *ap <= 'z') *ap = toupper(*ap); ap++; } } #endif FILE_LOCAL initdefines() /* * Initialize the built-in #define's. There are two flavors: * #define decus 1 (static definitions) * #define __FILE__ ?? (dynamic, evaluated by magic) * Called only on cpp startup. */ { register char **pp; REG_UNION { int i; char *p; } t; register DEFBUF *dp; long tvec; extern char *ctime(); /* * Predefine the built-in symbols. Allow the * implementor to pre-define a symbol as "" to * eliminate it. */ for (pp = preset; *pp != NULL; pp++) { if (*pp[0] != EOS) { dp = defendel(*pp, FALSE); dp->repl = savestring("1"); dp->nargs = DEF_NOARGS; } } /* * The magic pre-defines (__FILE__ and __LINE__) are * initialized with negative argument counts. expand() * notices this and calls the appropriate routine. * DEF_NOARGS is one greater than the first "magic" definition. */ for (pp = magic, t.i = DEF_NOARGS; *pp != NULL; pp++) { dp = defendel(*pp, FALSE); dp->nargs = --t.i; } /* * Define __DATE__ as today's date. */ dp = defendel("__DATE__", FALSE); dp->repl = t.p = getmem(27); dp->nargs = DEF_NOARGS; time(&tvec); *t.p++ = '"'; strcpy(t.p, ctime(&tvec)); t.p[24] = '"'; /* Overwrite newline */ } #if HOST == SYS_VMS /* * getredirection() is intended to aid in porting C programs * to VMS (Vax-11 C) which does not support '>' and '<' * I/O redirection. With suitable modification, it may * be useful for other portability problems as well. */ FILE_LOCAL int getredirection(argc, argv) int argc; char **argv; /* * Process vms redirection arg's. Exit if any error is seen. * If getredirection() processes an argument, it is erased * from the vector. getredirection() returns a new argc value. * * Warning: do not try to simplify the code for vms. The code * presupposes that getredirection() is called before any data is * read from stdin or written to stdout.
* * Normal usage is as follows: * * main(argc, argv) * int argc; * char *argv[]; * { * argc = getredirection(argc, argv); * } */ { register char *ap; /* Argument pointer */ int i; /* argv[] index */ int j; /* Output index */ int file; /* File_descriptor */ extern int errno; /* Last vms i/o error */ for (j = i = 1; i < argc; i++) { /* Do all arguments */ switch (*(ap = argv[i])) { case '<': /* <file */ if (freopen(++ap, "r", stdin) == NULL) { perror(ap); /* Can't find file */ exit(errno); /* Is a fatal error */ } break; case '>': /* >file or >>file */ if (*++ap == '>') { /* >>file */ /* * If the file exists, and is writable by us, * call freopen to append to the file (using the * file's current attributes). Otherwise, create * a new file with "vanilla" attributes as if the * argument was given as ">filename". * access(name, 2) returns zero if we can write on * the specified file. */ if (access(++ap, 2) == 0) { if (freopen(ap, "a", stdout) != NULL) break; /* Exit case statement */ perror(ap); /* Error, can't append */ exit(errno); /* After access test */ } /* If file accessible */ } /* * On vms, we want to create the file using "standard" * record attributes. creat(...) creates the file * using the caller's default protection mask and * "variable length, implied carriage return" * attributes. dup2() associates the file with stdout. */ if ((file = creat(ap, 0, "rat=cr", "rfm=var")) == -1 || dup2(file, fileno(stdout)) == -1) { perror(ap); /* Can't create file */ exit(errno); /* is a fatal error */ } /* If '>' creation */ break; /* Exit case test */ default: argv[j++] = ap; /* Not a redirector */ break; /* Exit case test */ } } /* For all arguments */ argv[j] = NULL; /* Terminate argv[] */ return (j); /* Return new argc */ } #endif
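Editorial note, not part of the distributed archive: the argument-vector compaction that getredirection() performs can be tried on any host by separating it from the VMS-specific file handling. The sketch below is only an illustration under stated assumptions. The names strip_redirects and struct redirects are hypothetical and do not appear in cpp. Unlike getredirection(), the sketch does not call freopen(), creat(), or dup2(); it only records the file names it finds and leaves the reopening to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of what was found; not part of cpp. */
struct redirects {
    const char *input;    /* Name after '<', or NULL       */
    const char *output;   /* Name after '>' or '>>'        */
    int         append;   /* Nonzero if '>>' was given     */
};

/*
 * Remove "<file", ">file" and ">>file" arguments from argv[],
 * moving the surviving arguments down in place, in the same
 * way getredirection() does.  Returns the new argc.
 * Assumes argv has room for a terminating NULL at argv[argc],
 * which C guarantees for the vector passed to main().
 */
int strip_redirects(int argc, char **argv, struct redirects *r)
{
    int i;                /* argv[] index                  */
    int j;                /* Output index                  */

    assert(argv != NULL && r != NULL);
    r->input = NULL;
    r->output = NULL;
    r->append = 0;
    for (j = i = 1; i < argc; i++) {
        char *ap = argv[i];

        switch (*ap) {
        case '<':                     /* <file             */
            r->input = ap + 1;
            break;
        case '>':                     /* >file or >>file   */
            r->append = (ap[1] == '>');
            r->output = ap + (r->append ? 2 : 1);
            break;
        default:                      /* Not a redirector  */
            argv[j++] = ap;
            break;
        }
    }
    argv[j] = NULL;                   /* Terminate argv[]  */
    return j;                         /* Return new argc   */
}
```

A caller would invoke this first thing in main(), as with getredirection(), and then reopen stdin and stdout itself from the recorded names. Keeping the compaction separate from the I/O is a choice made for this sketch so the logic can be checked without touching any files; the original deliberately does both in one pass.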
minow@decvax.UUCP (Martin Minow) (09/04/84)
-h- cpp2.c Sat Sep 1 21:43:42 1984 cpp2.c /* * C P P 2 . C * M a c r o D e f i n i t i o n s * a n d E x p r e s s i o n E v a l u a t i o n * * Edit History * 31-Aug-84 MM USENET net.sources release */ #include <stdio.h> #include <ctype.h> #include "cppdef.h" #include "cpp.h" /* * parm[], parmp, and parlist[] are used to store #define() argument * lists. nargs contains the actual number of parameters stored. */ static char parm[NPARMWORK + 1]; /* define param work buffer */ static char *parmp; /* Free space in parm */ static char *parlist[NPARM]; /* -> start of each parameter */ static int nargs; /* Parameters for this macro */ dodefine() /* * Called from control when a #define is scanned. This module * parses formal parameters and the replacement string. When * the formal parameter name is encountered in the replacement * string, it is replaced by a character in the range 128 to * 128+NPARM (this allows up to 32 parameters within the * Dec Multinational range). If cpp is ported to an EBCDIC * machine, you will have to make other arrangements. * * There is some special case code to distinguish * #define foo bar * from #define foo() bar * * Also, we make sure that * #define foo foo * doesn't put cpp into an infinite loop. * * A warning message is printed if you redefine a symbol to a * different text. I.e., * #define foo 123 * #define foo 123 * is ok, but * #define foo 123 * #define foo +123 * is not. * * The following subroutines are called from dodefine(): * checkparm called when a token is scanned. It checks through the * array of formal parameters. If a match is found, the * token is replaced by a control byte which will be used * to locate the parameter when the macro is expanded. * textput puts a string in the macro work area (parm[]), updating * parmp to point to the first free byte in parm[]. * textput() tests for work buffer overflow. * charput puts a single character in the macro work area (parm[]) * in a manner analogous to textput().
*/ { register int c; register DEFBUF *dp; /* -> new definition */ int isredefine; /* TRUE if redefined */ char *old; /* Remember redefined */ #if STRING_FORMAL int delim; /* String delimiter */ #endif extern int save(); /* Save char in work[] */ if (type[(c = skipws())] != LET) goto bad_define; isredefine = FALSE; /* Set if redefining */ if ((dp = lookid(c)) == NULL) /* If not known now */ dp = defendel(token, FALSE); /* Save the name */ else { /* It's known: */ isredefine = TRUE; /* Remember this fact */ old = dp->repl; /* Remember replacement */ dp->repl = NULL; /* No replacement now */ } parlist[0] = parmp = parm; /* Setup parm buffer */ if ((c = get()) == '(') { /* With arguments? */ nargs = 0; /* Init formals counter */ do { /* Collect formal parms */ if (nargs >= NPARM) cfatal("Too many arguments for macro", NULLST); else if ((c = skipws()) == ')') break; /* Got them all */ else if (type[c] != LET) /* Bad formal syntax */ goto bad_define; scanid(token, c); /* Get the formal param */ parlist[nargs++] = parmp; /* Save its start */ textput(token); /* Save text in parm[] */ } while ((c = skipws()) == ','); /* Get another argument */ if (c != ')') /* Must end at ) */ goto bad_define; c = ' '; /* Will skip to body */ } else { /* * DEF_NOARGS is needed to distinguish between * "#define foo" and "#define foo()". */ nargs = DEF_NOARGS; /* No () parameters */ } if (type[c] == SPA) /* At whitespace? */ c = skipws(); /* Not any more. */ workp = work; /* Replacement put here */ while (c != EOF_CHAR && c != '\n') { /* Compile macro body */ switch (type[c]) { case LET: checkparm(c, dp); /* Might be a formal */ break; case DIG: /* Number in mac. body */ case DOT: scannumber(c, save); break; case QUO: /* String in mac. 
body */ #if STRING_FORMAL save(c); instring = TRUE; delim = c; while ((c = get()) != delim && c != '\n' && c != EOF_CHAR) { if (type[c] == LET) /* Maybe formal parm */ checkparm(c, dp); else { save(c); if (c == '\\') save(get()); } } instring = FALSE; if (c != delim) cerror("Unterminated string in macro body", NULLST); save(c); #else scanstring(c, save); #endif break; case BSH: /* Backslash */ if ((c = get()) == '\n') { save('\n'); wrongline = TRUE; } else { save('\\'); save(c); } break; case SPA: /* Absorb whitespace */ /* * Note: the "end of comment" marker is passed on * to allow comments to separate tokens. */ if (workp[-1] == ' ') /* Absorb multiple */ break; /* spaces */ else if (c == '\t') c = ' '; /* Normalize tabs */ /* Fall through to store character */ default: /* Other character */ save(c); break; } c = get(); } unget(); /* For control check */ if (workp > work && workp[-1] == ' ') /* Drop trailing blank */ workp--; *workp = EOS; /* Terminate work */ dp->repl = savestring(work); /* Save the string */ dp->nargs = nargs; /* Save arg count */ #if DEBUG if (debug) dumpadef("macro definition", dp); #endif if (isredefine) { /* Warn on redefinition */ if ((old != NULL && dp->repl != NULL && !streq(old, dp->repl)) || (old == NULL && dp->repl != NULL) || (old != NULL && dp->repl == NULL)) { cwarn("Redefining macro \"%s\"", dp->name); } if (old != NULL) /* We don't need the */ free(old); /* old definition now. */ } return; bad_define: cerror("#define syntax error", NULLST); } checkparm(c, dp) register int c; DEFBUF *dp; /* * Replace this param if it's defined. Note that the macro name is a * possible replacement token. We stuff DEL in front of the token * which is treated as a LETTER by the token scanner and eaten by * the output routine. This prevents the macro expander from * looping if someone writes "#define foo foo". 
*/ { register int i; register char *cp; scanid(token, c); for (i = 0; i < nargs; i++) { /* For each argument */ if (streq(parlist[i], token)) { /* If it's known */ save(i + PFLAG); /* Save a magic cookie */ return; /* And exit the search */ } } if (streq(dp->name, token)) /* Macro name in body? */ save(DEL); /* Save magic marker */ for (cp = token; *cp != EOS;) /* And save */ save(*cp++); /* The token itself */ } doundef() /* * Remove the symbol from the defined list. * Called from the #control processor. */ { register int c; if (type[(c = skipws())] != LET) cerror("Illegal #undef argument", NULLST); else { scanid(token, c); if (defendel(token, TRUE) == NULL) { cwarn("Symbol \"%s\" not defined in #undef", token); } } } textput(text) char *text; /* * Put the string in the parm[] buffer. */ { register int size; size = strlen(text) + 1; if ((parmp + size) >= &parm[NPARMWORK]) cfatal("Macro work area overflow", NULLST); else { strcpy(parmp, text); parmp += size; } } charput(c) register int c; /* * Put the byte in the parm[] buffer. */ { if (parmp >= &parm[NPARMWORK]) cfatal("Macro work area overflow", NULLST); else { *parmp++ = c; } } /* * M a c r o E x p a n s i o n */ static DEFBUF *macro; /* Catches start of infinite macro */ expand(tokenp) register DEFBUF *tokenp; /* * Expand a macro. Called from the cpp mainline routine (via subroutine * macroid()) when a token is found in the symbol table. It calls * expcollect() to parse actual parameters, checking for the correct number. * It then creates a "file" containing a single line containing the * macro with actual parameters inserted appropriately. This is * "pushed back" onto the input stream. (When the get() routine runs * off the end of the macro line, it will dismiss the macro itself.) */ { register int c; register FILEINFO *file; extern FILEINFO *getfile(); #if DEBUG if (debug) dumpadef("expand entry", tokenp); #endif /* * If no macro is pending, save the name of this macro * for an eventual error message. 
*/ if (recursion == 0) macro = tokenp; else if (recursion >= 30) { /* Too many recursions */ cerror("Recursive macro definition of \"%s\"", tokenp->name); fprintf(stderr, "(Defined by \"%s\")\n", macro->name); do { /* Unwind the macros */ c = get(); /* Tossing all text */ } while (recursion > 0); unget(); return; } /* * Here's a macro to expand. */ nargs = 0; /* Formals counter */ parmp = parm; /* Setup parm buffer */ switch (tokenp->nargs) { case (-2): /* __LINE__ */ printf("%d", line); break; case (-3): /* __FILE__ */ for (file = infile; file != NULL; file = file->parent) { if (file->fp != NULL) { printf("\"%s\"", (file->progname != NULL) ? file->progname : file->filename); break; } } break; default: /* * Nothing funny about this macro. */ if (tokenp->nargs < 0) cfatal("Bug: Illegal __ macro \"%s\"", tokenp->name); while ((c = skipws()) == '\n') /* Look for (, skipping */ wrongline = TRUE; /* spaces and newlines */ if (c != '(') { /* * If the programmer writes * #define foo() ... * ... * foo [no ()] * just write foo to the output stream. */ unget(); cwarn("Macro \"%s\" needs arguments", tokenp->name); printf("%s", tokenp->name); return; } else if (expcollect()) { /* Collect arguments */ if (tokenp->nargs != nargs) { /* ?? != or > */ cwarn("Wrong number of macro arguments for \"%s\"", tokenp->name); } #if DEBUG if (debug) dumpparm("expand"); #endif } /* Collect arguments */ case DEF_NOARGS: /* No parameters just stuffs */ expstuff(tokenp); /* Do actual parameters */ } /* nargs switch */ } FILE_LOCAL int expcollect() /* * Collect the actual parameters for this macro. TRUE if ok. */ { register int c; register int paren; /* For embedded ()'s */ extern int charput(); for (;;) { paren = 0; /* Collect next arg. */ while ((c = skipws()) == '\n') /* Skip over whitespace */ wrongline = TRUE; /* and newlines. */ if (c == ')') { /* At end of all args? */ /* * Note that there is a guard byte in parm[] * so we don't have to check for overflow here. 
*/ *parmp = EOS; /* Make sure terminated */ break; /* Exit collection loop */ } else if (nargs >= NPARM) cfatal("Too many arguments in macro expansion", NULLST); parlist[nargs++] = parmp; /* At start of new arg */ for (;; c = get()) { /* Collect arg's bytes */ if (c == EOF_CHAR) { cerror("end of file within macro argument", NULLST); return (FALSE); /* Sorry. */ } else if (c == '\\') { /* Quote next character */ charput(c); /* Save the \ for later */ charput(cget()); /* Save the next char. */ continue; /* And go get another */ } else if (type[c] == QUO) { /* Start of string? */ scanstring(c, charput); /* Scan it off */ continue; /* Go get next char */ } else if (c == '(') /* Worry about balance */ paren++; /* To know about commas */ else if (c == ')') { /* Other side too */ if (paren == 0) { /* At the end? */ unget(); /* Look at it later */ break; /* Exit arg getter. */ } paren--; /* More to come. */ } else if (c == ',' && paren == 0) /* Comma delimits args */ break; else if (c == '\n') /* Newline inside arg? */ wrongline = TRUE; /* We'll need a #line */ charput(c); /* Store this one */ } /* Collect an argument */ charput(EOS); /* Terminate argument */ #if DEBUG if (debug) printf("parm[%d] = \"%s\"\n", nargs, parlist[nargs - 1]); #endif } /* Collect all args. */ return (TRUE); /* Normal return */ } FILE_LOCAL expstuff(tokenp) DEFBUF *tokenp; /* Current macro being expanded */ /* * Stuff the macro body, replacing formal parameters by actual parameters. */ { register int c; /* Current character */ register char *inp; /* -> repl string */ register char *defp; /* -> macro output buff */ int size; /* Actual parm. 
size */ char *defend; /* -> output buff end */ FILEINFO *file; /* Funny #include */ extern FILEINFO *getfile(); file = getfile(NBUFF, tokenp->name); recursion++; /* In a macro, now */ inp = tokenp->repl; /* -> macro replacement */ defp = file->buffer; /* -> output buffer */ defend = defp + (NBUFF - 1); /* Note its end */ if (inp != NULL) { while ((c = (*inp++ & 0XFF)) != EOS) { if (c >= PFLAG && c <= (PFLAG + NPARM)) { /* * Replace formal parameter by actual parameter string. */ if ((c -= PFLAG) < nargs) { size = strlen(parlist[c]); if ((defp + size) >= defend) goto nospace; strcpy(defp, parlist[c]); defp += size; } } else if (defp >= defend) { nospace: cfatal("Out of space in macro \"%s\" arg expansion", tokenp->name); } else { *defp++ = c; } } } *defp = EOS; #if DEBUG if (debug > 1) printf("macroline: \"%s\"\n", file->buffer); #endif } #if DEBUG dumpparm(why) char *why; /* * Dump parameter list. */ { register int i; printf("dump of %d parameters (%d bytes total) %s\n", nargs, parmp - parm, why); for (i = 0; i < nargs; i++) { printf("parm[%d] (%d) = \"%s\"\n", i + 1, strlen(parlist[i]), parlist[i]); } } #endif /* * Evaluate an #if expression. */ static char *opname[] = { /* For debug and error messages */ "end of expression", "val", "id", "+", "-", "*", "/", "%", "<<", ">>", "&", "|", "^", "==", "!=", "<", "<=", ">=", ">", "&&", "||", "?", ":", ",", "unary +", "unary -", "~", "!", "(", ")", "stack end", }; /* * opdope[] has the operator precedence: * Bits * 7 Unused (so the value is always positive) * 6-2 Precedence (000x .. 017x) * 1-0 Binary op. flags: * 01 The binop flag should be set/cleared when this op is seen. * 10 The new value of the binop flag. * Note: Expected, New binop * constant 0 1 Binop, end, or ) should follow constants * End of line 1 0 End may not be preceded by an operator * binary 1 0 Binary op follows a value, value follows.
* unary 0 0 Unary op doesn't follow a value, value follows * ( 0 0 Doesn't follow value, value or unop follows * ) 1 1 Follows value. Op follows. */ static char opdope[OP_MAX] = { 0001, /* End of expression */ 0002, /* Digit */ 0000, /* Letter (identifier) */ 0141, 0141, 0151, 0151, 0151, /* ADD, SUB, MUL, DIV, MOD */ 0131, 0131, 0101, 0071, 0071, /* ASL, ASR, AND, OR, XOR */ 0111, 0111, 0121, 0121, 0121, 0121, /* EQ, NE, LT, LE, GE, GT */ 0061, 0051, 0041, 0041, 0031, /* ANA, ORO, QUE, COL, CMA */ /* * Unary op's follow */ 0160, 0160, 0160, 0160, /* NEG, PLU, COM, NOT */ 0170, 0013, 0023, /* LPA, RPA, END */ }; /* * OP_QUE and OP_RPA have alternate precedences: */ #define OP_RPA_PREC 0013 #define OP_QUE_PREC 0034 typedef struct optab { char op; /* Operator */ char prec; /* Its precedence */ } OPTAB; static int evalue; /* Current value from evallex() */ #ifdef nomacargs FILE_LOCAL int isbinary(op) register int op; { return (op >= FIRST_BINOP && op <= LAST_BINOP); } FILE_LOCAL int isunary(op) register int op; { return (op >= FIRST_UNOP && op <= LAST_UNOP); } #else #define isbinary(op) (op >= FIRST_BINOP && op <= LAST_BINOP) #define isunary(op) (op >= FIRST_UNOP && op <= LAST_UNOP) #endif #ifdef DEBUG_EVAL dumpstack(opstack, opp, value, valp) OPTAB opstack[NEXP]; /* Operand stack */ register OPTAB *opp; /* Operator stack */ int value[NEXP]; /* Value stack */ register int *valp; /* -> value vector */ { printf("op stack dump\n"); while (opp > opstack) { printf("[%d] %d, %s 0%o\n", opp - opstack, opp->op, opname[opp->op], opp->prec); opp--; } while (--valp >= value) { printf("value[%d] = %d\n", (valp - value), *valp); } } #endif int eval() /* * Evaluate an expression. Straight-forward operator precedence. * This is called from control() on encountering an #if statement. * It calls the following routines: * evallex Lexical analyser -- returns the type and value of * the next input token. * evaleval Evaluate the current operator, given the values on * the value stack. 
Returns a pointer to the (new) * value stack. */ { register int op; /* Current operator */ register int *valp; /* -> value vector */ register OPTAB *opp; /* Operator stack */ int prec; /* Op precedence */ int binop; /* Set if binary op. needed */ int op1; /* Operand from stack */ int value[NEXP]; /* Value stack */ OPTAB opstack[NEXP]; /* Operand stack */ extern int *evaleval(); /* Does actual evaluation */ valp = value; opp = opstack; opp->op = OP_END; /* Mark bottom of stack */ opp->prec = opdope[OP_END]; /* And its precedence */ binop = 0; again: ; #ifdef DEBUG_EVAL printf("In #if at again:, binop = %d, line is: %s", binop, infile->bptr); #endif if ((op = evallex()) == OP_SUB && !binop) op = OP_NEG; /* Unary minus */ else if (op == OP_ADD && !binop) op = OP_PLU; /* Unary plus */ else if (op == OP_FAIL) return (0); /* Error in evallex */ #ifdef DEBUG_EVAL printf("op = %s, opdope = 0%03o, binop = %d\n", opname[op], opdope[op], binop); #endif if (op == DIG) { /* Value? */ if (binop) return (cerror("misplaced constant", NULLST)); else if (valp >= &value[NEXP-1]) return (cerror("if expression stack overflow", NULLST)); else { #ifdef DEBUG_EVAL printf("pushing %d onto stack[%d]\n", evalue, valp - value); #endif *valp++ = evalue; binop = 1; } goto again; } else if (op > OP_END) return (cerror("Illegal #if line", NULLST)); prec = opdope[op]; if (binop != (prec & 1)) return(cerror("Operator %s in incorrect context", opname[op])); binop = ((prec & 2) != 0); for (;;) { #ifdef DEBUG_EVAL printf("op %s, prec %d., stacked op %s, prec %d\n", opname[op], prec, opname[opp->op], opp->prec); #endif if (prec > opp->prec) { if (op == OP_LPA) prec = OP_RPA_PREC; else if (op == OP_QUE) prec = OP_QUE_PREC; /* * Push operator onto op. stack. 
*/ opp++; if (opp >= &opstack[NEXP]) return (cerror("expression stack overflow", NULLST)); #ifdef DEBUG_EVAL printf("push %s (0%o) onto operand stack[%d]\n", opname[op], prec, opp - opstack); #endif opp->op = op; opp->prec = prec; goto again; } /* * Pop operator from op. stack and evaluate it. * End of stack and '(' are specials. */ switch ((op1 = (opp--)->op)) { /* Looked at stacked op */ case OP_END: /* Stack end marker */ if (op == OP_EOE) return (valp[-1]); /* Finished ok. */ opp++; /* More to come. */ goto again; /* Read another op. */ case OP_LPA: /* ( on stack */ if (op != OP_RPA) { /* Matches ) on input */ #ifdef DEBUG_EVAL printf("Expecting match to ), read '%s'\n", opname[op]); dumpstack(opstack, opp, value, valp); #endif return (cerror("unbalanced paren's", NULLST)); } goto again; case OP_QUE: opp++; /* Keep it for a while */ goto again; /* Evaluate next op. */ case OP_COL: /* : on stack. */ if ((opp--)->op != OP_QUE) { /* Matches ? on stack? */ return(cerror( "Misplaced '?' or ':', previous operator is %s", opname[(opp+1)->op])); } /* * Evaluate op1. */ default: /* Others: */ #ifdef DEBUG_EVAL printf("Stack before evaluation of %s\n", opname[op1]); dumpstack(opstack, opp, value, valp); #endif valp = evaleval(valp, op1); /* Evaluate value(s) */ #ifdef DEBUG_EVAL printf("Stack after evaluation\n"); dumpstack(opstack, opp, value, valp); #endif } /* op1 switch end */ } /* Stack unwind loop */ } FILE_LOCAL int evallex() /* * Return next eval operator or value. Called from eval(). It * calls special-purpose routines for 'char' strings and * numeric values: * evalchar called to evaluate 'x' * evalnum called to evaluate numbers.
*/ { register int c, c1, t; again: if ((c = skipws()) == EOF_CHAR || c == '\n') { unget(); return (OP_EOE); /* End of expression */ } if ((t = type[c]) == INV) { /* Total nonsense */ if (isascii(c) && isprint(c)) cierror("illegal character '%c' in #if", c); else cierror("illegal character (%d decimal) in #if", c); return (OP_FAIL); } else if (t == QUO) { /* ' or " */ if (c == '\'') { /* Character constant */ evalue = evalchar(); /* Somewhat messy */ #ifdef DEBUG_EVAL printf("evalchar returns %d.\n", evalue); #endif return (DIG); /* Return a value */ } cerror("Can't use a string in an #if", NULLST); return (OP_FAIL); } else if (t == LET) { /* ID must be a macro */ if (macroid(c)) /* Try to expand it */ goto again; /* Reread if so. */ else if (streq(token, "defined")) { /* Or defined name */ c1 = c = skipws(); if (c == '(') /* Allow defined(name) */ c = skipws(); if (type[c] == LET) { evalue = (lookid(c) != NULL); if (c1 != '(' /* Need to balance */ || skipws() == ')') /* Did we balance? */ return (DIG); /* Parsed ok */ } cerror("Bad #if ... defined() syntax", NULLST); return (OP_FAIL); } /* * The Draft ANSI C Standard says that an undefined symbol * in an #if has the value zero. We should really check that * the programmer didn't write "#if defined(foo) ? foo : 0" * before printing the warning. */ cwarn("undefined symbol \"%s\" in #if, 0 used", token); evalue = 0; return (DIG); } else if (t == DIG) { /* Numbers are harder */ evalue = evalnum(c); #ifdef DEBUG_EVAL printf("evalnum returns %d.\n", evalue); #endif } else if (strchr("!=<>&|\\", c) != NULL) { /* * Process a possible multi-byte lexeme. */ c1 = get(); /* Peek at next char */ switch (c) { case '!': if (c1 == '=') return (OP_NE); break; case '=': if (c1 != '=') { /* Can't say a=b in #if */ unget(); cerror("= not allowed in #if", NULLST); return (OP_FAIL); } return (OP_EQ); case '>': case '<': if (c1 == c) return ((c == '<') ? OP_ASL : OP_ASR); else if (c1 == '=') return ((c == '<') ? 
OP_LE : OP_GE); break; case '|': case '&': if (c1 == c) return ((c == '|') ? OP_ORO : OP_ANA); break; case '\\': if (c1 == '\n') /* Multi-line if */ goto again; cerror("Unexpected \\ in #if", NULLST); return (OP_FAIL); } unget(); } return (t); } FILE_LOCAL int evalnum(c) register int c; /* * Expand number for #if lexical analysis. */ { register int value; register int base; register int c1; if (c != '0') base = 10; else if ((c = get()) == 'x' || c == 'X') { base = 16; c = get(); } else base = 8; value = 0; for (;;) { c1 = c; if (isascii(c) && isupper(c1)) c1 = tolower(c1); if (c1 >= 'a') c1 -= ('a' - 10); else c1 -= '0'; if (c1 < 0 || c1 >= base) break; value *= base; value += c1; c = get(); } unget(); return (value); } /* * GETCC is called by evalchar() to read a character. It absorbs * the embedded-comment magic cookie that some Unix implementations use to * allow token concatenation. */ #if COMMENT_INVISIBLE #define GETCC getcc FILE_LOCAL int getcc() { register int c; do { c = get(); } while (c == COM_SPACE); return (c); } #else #define GETCC get #endif FILE_LOCAL int evalchar() /* * Get a character constant */ { register int c; register int value; register int count; instring = TRUE; if ((c = get()) == '\\') { switch ((c = GETCC())) { case 'a': value = ALERT; /* New in Standard */ break; case 'b': value = '\b'; break; case 'f': value = '\f'; break; case 'n': value = '\n'; break; case 'r': value = '\r'; break; case 't': value = '\t'; break; case 'v': value = VT; /* Vertical tab */ break; case 'x': /* '\xFF' */ count = 3; value = 0; while ((((c = get()) >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')) && (--count >= 0)) { value *= 16; value += (c <= '9') ? 
(c - '0') : ((c & 0xF) + 9); } unget(); break; default: if (c >= '0' && c <= '7') { count = 3; value = 0; while (c >= '0' && c <= '7' && --count >= 0) { value *= 8; value += (c - '0'); c = get(); } unget(); } else value = c; break; } } else if (c == '\'') value = 0; else value = c; /* * We warn on multi-byte constants and try to hack * (big|little)endian machines. */ #if BIG_ENDIAN count = 0; #endif while ((c = get()) != '\'' && c != EOF_CHAR && c != '\n') { ciwarn("multi-byte constant '%c' isn't portable", c); #if BIG_ENDIAN count += BITS_CHAR; value += (c << count); #else value <<= BITS_CHAR; value += c; #endif } instring = FALSE; return (value); } FILE_LOCAL int * evaleval(valp, op) register int *valp; int op; /* * Apply the argument operator to the data on the value stack. * One or two values are popped from the value stack and the result * is pushed onto the value stack. * * OP_COL is a special case. * * evaleval() returns the new pointer to the top of the value stack. */ { register int v1, v2; if (isbinary(op)) v2 = *--valp; v1 = *--valp; #ifdef DEBUG_EVAL printf("%s op %s", (isbinary(op)) ? 
"binary" : "unary", opname[op]); if (isbinary(op)) printf(", v2 = %d.", v2); printf(", v1 = %d.\n", v1); #endif switch (op) { case OP_EOE: break; case OP_ADD: v1 += v2; break; case OP_SUB: v1 -= v2; break; case OP_MUL: v1 *= v2; break; case OP_DIV: if (v2 == 0) { cwarn("divide by zero in #if, zero result assumed", NULLST); v1 = 0; } else v1 /= v2; break; case OP_MOD: if (v2 == 0) { cwarn("modulus by zero in #if, zero result assumed", NULLST); v1 = 0; } else v1 %= v2; break; case OP_ASL: v1 <<= v2; break; case OP_ASR: v1 >>= v2; break; case OP_AND: v1 &= v2; break; case OP_OR: v1 |= v2; break; case OP_XOR: v1 ^= v2; break; case OP_EQ: v1 = (v1 == v2); break; case OP_NE: v1 = (v1 != v2); break; case OP_LT: v1 = (v1 < v2); break; case OP_LE: v1 = (v1 <= v2); break; case OP_GE: v1 = (v1 >= v2); break; case OP_GT: v1 = (v1 > v2); break; case OP_ANA: v1 = (v1 && v2); break; case OP_ORO: v1 = (v1 || v2); break; case OP_COL: /* * v1 has the "true" value, v2 the "false" value. * The top of the value stack has the test. */ v1 = (*--valp) ? v1 : v2; break; case OP_NEG: v1 = (-v1); break; case OP_PLU: break; case OP_COM: v1 = ~v1; break; case OP_NOT: v1 = !v1; break; default: cierror("#if bug, operand = %d.", op); v1 = 0; } *valp++ = v1; return (valp); } -h- cpp3.c Sat Sep 1 21:43:42 1984 cpp3.c /* * C P P . 3 * S u p p o r t R o u t i n e s * * Edit History * 25-May-84 MM Added 8-bit support to type table. * 30-May-84 ARF sharp() should output filename in quotes * 02-Aug-84 MM Newline and #line hacking. sharp() now in cpp1.c * 31-Aug-84 MM USENET net.sources release */ #include <stdio.h> #include <ctype.h> #include "cppdef.h" #include "cpp.h" /* * skipnl() skips over input text to the end of the line. * skipws() skips over "whitespace" (spaces or tabs), it * does not skip over the end of the line. * scanid() reads the next token (C identifier) into a token * buffer (usually token[]). The caller has already * read the first character of the identifier. 
* macroid() reads the next token (C identifier) into token[]. * If it is a #defined macro, it is expanded, and * macroid() returns TRUE, otherwise, FALSE. * scanstring() Reads a string from the input stream, calling * a user-supplied function for each character. * This function may be output() to write the * string to the output file, or save() to save * the string in the work buffer. * scannumber() Reads a C numeric constant from the input stream, * calling the user-supplied function for each * character. (output() or save() as noted above.) * save() Save one character in the work[] buffer. * savestring() Saves a string in malloc() memory. * getfile() Initialize a new FILEINFO structure, called when * #include opens a new file, or a macro is to be * expanded. * getmem() Get a specified number of bytes from malloc memory. * output() Write one character to stdout (calling putchar) -- * implemented as a function so its address may be * passed to scanstring() and scannumber(). * lookid() Scans the next token (identifier) from the input * stream. Looks for it in the #defined symbol table. * Returns a pointer to the definition, if found, or NULL * if not present. The identifier is stored in token[]. * defendel() Define enter/delete subroutine. Updates the * symbol table. * get() Read the next byte from the current input stream, * handling end of (macro/file) input and embedded * comments appropriately. Note that the global * instring is -- essentially -- a parameter to get(). * unget() Push last gotten character back on the input stream. * cerror(), cwarn(), cfatal(), cierror(), ciwarn() * These routines format and print messages to the user. * cerror & cwarn take a format and a single string argument. * cierror & ciwarn take a format and a single int (char) argument. * cfatal takes a format and a single string argument. * sharp() Output the #line.
*/ /* * Note that DEL is a letter -- this is needed to hack #define foo foo * This table must be modified for non-Ascii machines. */ char type[256] = { /* Character type codes Hex */ END, 000, 000, 000, 000, 000, 000, 000, /* 00 */ 000, SPA, 000, 000, 000, 000, 000, 000, /* 08 */ 000, 000, 000, 000, 000, 000, 000, 000, /* 10 */ 000, 000, 000, 000, 000, 000, 000, SPA, /* 18 */ SPA,OP_NOT, QUO, 000, LET,OP_MOD,OP_AND, QUO, /* 20 !"#$%&' */ OP_LPA,OP_RPA,OP_MUL,OP_ADD, 000,OP_SUB, DOT,OP_DIV, /* 28 ()*+,-./ */ DIG, DIG, DIG, DIG, DIG, DIG, DIG, DIG, /* 30 01234567 */ DIG, DIG,OP_COL, 000, OP_LT, OP_EQ, OP_GT,OP_QUE, /* 38 89:;<=>? */ 000, LET, LET, LET, LET, LET, LET, LET, /* 40 @ABCDEFG */ LET, LET, LET, LET, LET, LET, LET, LET, /* 48 HIJKLMNO */ LET, LET, LET, LET, LET, LET, LET, LET, /* 50 PQRSTUVW */ LET, LET, LET, 000, BSH, 000,OP_XOR, LET, /* 58 XYZ[\]^_ */ 000, LET, LET, LET, LET, LET, LET, LET, /* 60 `abcdefg */ LET, LET, LET, LET, LET, LET, LET, LET, /* 68 hijklmno */ LET, LET, LET, LET, LET, LET, LET, LET, /* 70 pqrstuvw */ LET, LET, LET, 000, OP_OR, 000,OP_COM, LET, /* 78 xyz{|}~ */ 000, 000, 000, 000, 000, 000, 000, 000, /* 80 .. FF */ 000, 000, 000, 000, 000, 000, 000, 000, /* 80 .. FF */ 000, 000, 000, 000, 000, 000, 000, 000, /* 80 .. FF */ 000, 000, 000, 000, 000, 000, 000, 000, /* 80 .. FF */ 000, 000, 000, 000, 000, 000, 000, 000, /* 80 .. FF */ 000, 000, 000, 000, 000, 000, 000, 000, /* 80 .. FF */ 000, 000, 000, 000, 000, 000, 000, 000, /* 80 .. FF */ 000, 000, 000, 000, 000, 000, 000, 000, /* 80 .. FF */ }; skipnl() /* * Skip to the end of the current input line. */ { register int c; do { /* Skip to newline */ c = get(); } while (c != '\n' && c != EOF_CHAR); } int skipws() /* * Skip over whitespace */ { register int c; do { /* Skip whitespace */ c = get(); } while (type[c] == SPA); return (c); } scanid(buffer, c) char *buffer; /* Store ID here */ register int c; /* First char of id */ /* * Get the next token (an id) into the buffer.
 * Note: this code is duplicated in lookid().
 * Change one, change both.
 */
{
    register char       *bp;

    if (c == DEL)                       /* Eat the magic token  */
        c = get();                      /* undefiner.           */
    bp = buffer;
    do {
        if (bp < &buffer[IDMAX])
            *bp++ = c;
        c = get();
    } while (type[c] == LET || type[c] == DIG);
    unget();
    *bp = EOS;
}

int
macroid(c)
/*
 * Scan the id, if it's #defined, expand it and return TRUE.
 * Else, the id is in "token", return FALSE.
 */
{
    register DEFBUF     *dp;

    if ((dp = lookid(c)) == NULL)
        return (FALSE);
    else {
        expand(dp);
        return (TRUE);
    }
}

scanstring(delim, outfun)
register int    delim;                  /* ' or "               */
int             (*outfun)();            /* Output function      */
/*
 * Scan off a string.  Warning if terminated by newline or EOF.
 * outfun() outputs the character -- to a buffer if in a macro.
 */
{
    register int        c;

    instring = TRUE;                    /* Don't strip comments */
    (*outfun)(delim);
    while ((c = get()) != delim
         && c != '\n'
         && c != EOF_CHAR) {
        (*outfun)(c);
        if (c == '\\')
            (*outfun)(get());
    }
    if (c == delim)
        (*outfun)(c);
    else {
        cerror("Unterminated string", NULLST);
        unget();
    }
    instring = FALSE;
}

scannumber(c, outfun)
register int    c;                      /* First char of number */
register int    (*outfun)();            /* Output/store func    */
/*
 * Process a number
 */
{
    if (c == '0') {                     /* Octal or hex         */
        (*outfun)(c);
        if ((c = get()) == 'X' || c == 'x') {   /* Is it hex?   */
            (*outfun)(c);               /* Hex                  */
            while (((c = get()) >= '0' && c <= '9')
                 || (c >= 'A' && c <= 'F')
                 || (c >= 'a' && c <= 'f')) {
                (*outfun)(c);
            }
        }
        else {
            while (c >= '0' && c <= '7') {      /* Octal        */
                (*outfun)(c);
                c = get();
            }
        }
        if (c == 'l' || c == 'L') {
            (*outfun)(c);               /* Long hex/oct         */
            c = get();
        }
    }
    else {                              /* Int or float         */
        while (type[c] == DIG) {        /* Int part             */
            (*outfun)(c);
            c = get();
        }
        if (c == 'l' || c == 'L') {     /* Long int             */
            (*outfun)(c);
            c = get();
        }
        else {                          /* Maybe Float          */
            if (c == '.') {             /* '.'
                                           is float */
                (*outfun)(c);           /* Do fraction          */
                while (type[(c = get())] == DIG)
                    (*outfun)(c);
            }
            if (c == 'E' || c == 'e') { /* Exponential          */
                (*outfun)(c);
                if ((c = get()) == '+' || c == '-') {
                    (*outfun)(c);
                    c = get();
                }
                while (type[c] == DIG) {
                    (*outfun)(c);
                    c = get();
                }
            }                           /* If E format          */
        }                               /* If not long int      */
    }                                   /* If Decimal number    */
    unget();                            /* Rescan next char     */
}

save(c)
register int    c;
{
    if (workp >= &work[NWORK])
        cfatal("Work buffer overflow", NULLST);
    else
        *workp++ = c;
}

char *
savestring(text)
char            *text;
/*
 * Store a string into free memory.
 */
{
    register char       *result;

    result = getmem(strlen(text) + 1);
    strcpy(result, text);
    return (result);
}

FILEINFO *
getfile(bufsize, name)
int             bufsize;                /* Line or define buffer size   */
char            *name;                  /* File or macro name string    */
/*
 * Common FILEINFO buffer initialization for a new file or macro.
 */
{
    register FILEINFO   *file;
    register int        size;

    size = strlen(name);                /* File/macro name      */
    file = (FILEINFO *) getmem(sizeof (FILEINFO) + bufsize + size);
    file->parent = infile;              /* Chain files together */
    file->fp = NULL;                    /* No file yet          */
    file->filename = savestring(name);  /* Save file/macro name */
    file->progname = NULL;              /* No #line seen yet    */
    file->bptr = file->buffer;          /* Initialize line ptr  */
    file->buffer[0] = EOS;              /* Force first read     */
    file->line = 0;                     /* (Not used just yet)  */
    if (infile != NULL)                 /* If #include file     */
        infile->line = line;            /* Save current line    */
    infile = file;                      /* New current file     */
    line = 1;                           /* Note first line      */
    return (file);                      /* All done.            */
}

char *
getmem(size)
int             size;
/*
 * Get a block of free memory.
 */
{
    register char       *result;
    extern char         *malloc();

    if ((result = malloc((unsigned) size)) == NULL)
        cfatal("Out of memory", NULLST);
    return (result);
}

/*
 *                      C P P   S y m b o l   T a b l e s
 */

#ifndef SBSIZE
#define SBSIZE  64                      /* Hash chain size (power of 2) */
#endif
#define SBMASK  (SBSIZE - 1)
#if (SBSIZE ^ SBMASK) != ((SBSIZE * 2) - 1)
        << error, SBSIZE must be a power of 2 >>
#endif

static DEFBUF   *symtab[SBSIZE];        /* Symbol table queue headers   */

DEFBUF *
lookid(c)
int             c;                      /* First character of token     */
/*
 * Look for the next token in the symbol table.  Returns token in "token".
 * If found, returns the table pointer;  Else returns NULL.
 */
{
    register int        nhash;
    register DEFBUF     *dp;
    REG_UNION {
        char            *np;
        int             temp;
    } r;
    int                 isrecurse;      /* For #define foo foo  */

    r.np = token;
    nhash = 0;
    if ((isrecurse = (c == DEL)))       /* If recursive macro   */
        c = get();                      /* hack, skip over DEL  */
    do {
        if (r.np < &token[IDMAX]) {
            *r.np++ = c;                /* Store token byte     */
            nhash += c;                 /* Update hash value    */
        }
        c = get();                      /* And get another byte */
    } while (type[c] == LET || type[c] == DIG);
    unget();                            /* Rescan terminator    */
    *r.np = EOS;                        /* Terminate token      */
    if (isrecurse)                      /* Recursive definition */
        return (NULL);                  /* undefined just now   */
    nhash += (r.np - token);            /* Fix hash value       */
/*
    printf("look for '%s' [%d], hash %d, index %d\n",
**      token, (r.np - token), nhash, nhash & SBMASK);
*/
    dp = symtab[nhash & SBMASK];        /* Starting bucket      */
    while (dp != (DEFBUF *) NULL) {     /* Search symbol table  */
        if (dp->hash == nhash           /* Fast precheck        */
         && (r.temp = strcmp(dp->name, token)) >= 0)
            break;
        dp = dp->link;                  /* Nope, try next one   */
    }
    return ((r.temp == 0) ? dp : NULL);
}

DEFBUF *
defendel(name, delete)
char            *name;
int             delete;                 /* TRUE to delete a symbol */
/*
 * Enter this name in the lookup table (delete = FALSE)
 * or delete this name (delete = TRUE).
 * Returns a pointer to the define block (delete = FALSE)
 * Returns NULL if the symbol wasn't defined (delete = TRUE).
 */
{
    register DEFBUF     *dp;
    REG_UNION {
        DEFBUF          **prevp;
        char            *np;
    } r;
    register int        nhash;
    int                 temp;
    int                 size;

    for (nhash = 0, r.np = name; *r.np != EOS;)
        nhash += *r.np++;
    size = (r.np - name);
    nhash += size;
/*
    printf("'%s', [%d], hash = %d, index = %d\n",
**      name, size, nhash, nhash & SBMASK);
*/
    r.prevp = &symtab[nhash & SBMASK];
    while ((dp = *r.prevp) != (DEFBUF *) NULL) {
        if (dp->hash == nhash
         && (temp = strcmp(dp->name, name)) >= 0) {
            if (temp > 0)
                dp = NULL;              /* Not found            */
            else {
                *r.prevp = dp->link;    /* Found, unlink and    */
                if (dp->repl != NULL)   /* Free the replacement */
                    free(dp->repl);     /* if any, and then     */
                free((char *) dp);      /* Free the symbol      */
            }
            break;
        }
        r.prevp = &dp->link;
    }
    if (!delete) {
        dp = (DEFBUF *) getmem(sizeof (DEFBUF) + size);
        dp->link = *r.prevp;
        *r.prevp = dp;
        dp->hash = nhash;
        dp->repl = NULL;
        dp->nargs = 0;
        strcpy(dp->name, name);
    }
    return (dp);
}

#if DEBUG

dumpdef(why)
char            *why;
{
    register DEFBUF     *dp;
    register DEFBUF     **syp;

    printf("CPP symbol table dump %s\n", why);
    for (syp = symtab; syp < &symtab[SBSIZE]; syp++) {
        if ((dp = *syp) != (DEFBUF *) NULL) {
            printf("symtab[%d]\n", (syp - symtab));
            do {
                dumpadef((char *) NULL, dp);
            } while ((dp = dp->link) != (DEFBUF *) NULL);
        }
    }
}

dumpadef(why, dp)
char            *why;                   /* Notation             */
register DEFBUF *dp;
{
    register char       *cp;
    register int        c;

    printf(" \"%s\" [%d]", dp->name, dp->nargs);
    if (why != NULL)
        printf(" (%s)", why);
    if (dp->repl != NULL) {
        printf(" => ");
        for (cp = dp->repl; (c = *cp++ & 0xFF) != EOS;) {
            if (c >= PFLAG && c <= (PFLAG + NPARM))
                printf("<%d>", c - PFLAG);
            else if (isprint(c) || c == '\n' || c == '\t')
                putchar(c);
            else if (c < ' ')
                printf("<^%c>", c + '@');
            else
                printf("<\\0%o>", c);
        }
    }
    else {
        printf(", no replacement.");
    }
    putchar('\n');
}
#endif

/*
 *                      G E T
 */

int
get()
/*
 * Return the next character from a macro or the current file.
 * Handle end of file from #include files.
 */
{
    register int        c;
    register FILEINFO   *file;

get_from_file:
    if ((file = infile) == NULL)
        return (EOF_CHAR);
newline:
#if 0
    printf("get(%s), line %d, bptr = %d, buffer \"%s\"\n",
        file->filename, line, file->bptr - file->buffer, file->buffer);
#endif
    /*
     * Read a character from the current input line or macro.
     * At EOS, either finish the current macro (freeing temp.
     * storage) or read another line from the current input file.
     * At EOF, exit the current file (#include) or, at EOF from
     * a command-line specified file, return EOF_CHAR to trigger
     * processing.
     */
    if ((c = *file->bptr++ & 0xFF) == EOS) {
        /*
         * Nothing in current line or macro.  Get next line (if
         * input from a file), or do end of file/macro processing.
         * In the latter case, jump back to restart from the top.
         */
        if (file->fp == NULL)           /* NULL if macro        */
            infile = file->parent;      /* Unwind file chain    */
        else {                          /* Get from a file      */
            if ((file->bptr = fgets(file->buffer, NBUFF, file->fp))
                    != NULL) {
#if DEBUG
                if (debug > 1) {        /* Dump it to cpp.tmp   */
                    printf("\n#line %d (%s), %s",
                        line, file->filename, file->buffer);
                }
#endif
                goto newline;           /* process the line     */
            }
            else {
                fclose(file->fp);       /* Close finished file  */
                if ((infile = file->parent) != NULL) {
                    /*
                     * There is an "ungotten" newline in the current
                     * infile buffer (set there by doinclude() in
                     * cpp1.c).  Thus, we know that the mainline code
                     * is skipping over blank lines and will do a
                     * #line at its convenience.
                     */
                    wrongline = TRUE;   /* Need a #line now     */
                }
            }
        }
        /*
         * Free up space used by the (finished) file or macro and
         * restart input from the parent file/macro, if any.
         */
        free(file->filename);           /* Free name and        */
        if (file->progname != NULL)     /* if a #line was seen, */
            free(file->progname);       /* free it, too.        */
        free((char *) file);            /* Free file space      */
        if (infile == NULL)             /* If at end of file    */
            return (EOF_CHAR);          /* Return end of file   */
        line = infile->line;            /* Reset line number    */
        goto get_from_file;             /* Get from the top.
                                         */
    }
    else if (file->fp != NULL) {
        /*
         * A byte was read from a "real" file.
         *
         * The macro recursion hacking is a bit messy and
         * deserves an explanation:
         * To expand a macro, we read a token from an input file.
         * The character just after the token is pushed back on
         * the input stream.  Thus, the true next byte from the input
         * file is signaled by uindex == 0 on entrance, and input
         * from a file.  In that case, the macro recursion counter
         * is set to zero.  We reach this point only if a character
         * was actually read from a real input file.  Not if
         * ungotten, and not if read from a macro.
         */
        recursion = 0;                  /* Stop recursive worry */
    }
    /*
     * Common processing for the new character.
     */
    if (c == DEL && file->fp != NULL)   /* Don't allow DEL from */
        goto newline;                   /* a file               */
    if (c == '\n')                      /* Maintain current     */
        ++line;                         /* line counter         */
    if (instring)                       /* Don't test for       */
        return (c);                     /* Comments in strings  */
    if (c == '\f' || c == VT)           /* Form feed, vertical  */
        c = ' ';                        /* Tab are whitespace   */
    if (c != '/')                       /* / begins a comment   */
        return (c);                     /* Not / so exit.       */
    else {
        instring = TRUE;                /* So get() won't loop  */
        if ((c = get()) != '*') {       /* Next byte '*'?       */
            instring = FALSE;           /* Nope, no comment     */
            unget();                    /* Push the char. back  */
            return ('/');               /* Return the slash     */
        }
        for (;;) {                      /* Eat a comment        */
            c = get();
test:       switch (c) {
            case EOF_CHAR:
                cerror("EOF in comment", NULLST);
                return (EOF_CHAR);

            case '/':
                if ((c = get()) != '*') /* Don't let comments   */
                    break;              /* Nest.                */
                cwarn("Nested comments", NULLST);
                                        /* Fall into * stuff    */
            case '*':
                if ((c = get()) != '/') /* If comment doesn't   */
                    goto test;          /* end, look at next    */
                instring = FALSE;       /* End of comment,      */
#if COMMENT_INVISIBLE
                return (COM_SPACE);     /* Syntactic space      */
#else
                return (' ');           /* Real space           */
#endif

            case '\n':                  /* we'll need a #line   */
                wrongline = TRUE;       /* later...
                                         */
            default:                    /* Anything else is     */
                break;                  /* Just a character     */
            }                           /* End switch           */
        }                               /* End comment loop     */
    }                                   /* End if in comment    */
}

unget()
/*
 * Backup the pointer to reread the last character.  Fatal error
 * (code bug) if we backup too far.  unget() may be called,
 * without problems, at end of file.
 */
{
    register FILEINFO   *file;

    if ((file = infile) == NULL)
        return;                         /* Unget after EOF      */
    if (--file->bptr < file->buffer)
        cfatal("Too much pushback", NULLST);
    if (*file->bptr == '\n')            /* Ungetting a newline? */
        --line;                         /* Unget the line number, too */
}

#if COMMENT_INVISIBLE
int
cget()
/*
 * Get one character, absorb "funny space" after comments.
 */
{
    register int        c;

    do {
        c = get();
    } while (c == COM_SPACE);
    return (c);
}
#endif

/*
 * Error messages and other hacks.  The first byte of severity
 * is 'S' for string arguments and 'I' for int arguments.  This
 * is needed for portability with machines that have int's that
 * are shorter than char *'s.
 */

static
domsg(severity, format, arg)
char            *severity;              /* "Error", "Warning", "Fatal"  */
char            *format;                /* Format for the error message */
char            *arg;                   /* Something for the message    */
/*
 * Print filenames, macro names, and line numbers for error messages.
 */
{
    register char       *tp;
    register FILEINFO   *file;
    char                buf[80];

#ifdef MSG_PREFIX
    fputs(MSG_PREFIX, stderr);
#endif
    if (*severity++ == 'S')
        sprintf(buf, format, arg);
    else
        sprintf(buf, format, (int) arg);
    fprintf(stderr, "line %d, %s: %s.\n", line, severity, buf);
    if ((file = infile) == NULL)
        return;                         /* At end of file       */
    if (file->fp != NULL) {
        tp = file->buffer;              /* Print current file   */
        fprintf(stderr, "%s", tp);      /* name, making sure    */
        if (tp[strlen(tp) - 1] != '\n') /* there's a newline    */
            putc('\n', stderr);
    }
    while ((file = file->parent) != NULL) {     /* Print #includes, too */
        if (file->fp == NULL)
            fprintf(stderr, "from macro %s\n", file->filename);
        else {
            tp = file->buffer;
            fprintf(stderr, "from file %s, line %d:\n%s",
                (file->progname != NULL) ?
                    file->progname : file->filename,
                file->line, tp);
            if (tp[strlen(tp) - 1] != '\n')
                putc('\n', stderr);
        }
    }
}

int
cerror(format, sarg)
char            *format;
char            *sarg;                  /* Single string argument       */
/*
 * Print a normal error message -- return zero to simplify #if evaluator
 */
{
    domsg("SError", format, sarg);
    errors++;
    return (0);                         /* For expression parser        */
}

int
cierror(format, narg)
char            *format;
int             narg;                   /* Single numeric argument      */
/*
 * Print a normal error message -- return zero to simplify #if evaluator
 */
{
    domsg("IError", format, (char *) narg);
    errors++;
    return (0);                         /* For expression parser        */
}

cfatal(format, sarg)
char            *format;
char            *sarg;                  /* Single string argument       */
/*
 * A real disaster
 */
{
    domsg("SFatal error", format, sarg);
    exit(IO_ERROR);
}

cwarn(format, sarg)
char            *format;
char            *sarg;                  /* Single string argument       */
/*
 * A non-fatal error
 */
{
    domsg("SWarning", format, sarg);
}

ciwarn(format, narg)
char            *format;
int             narg;                   /* Single numeric argument      */
/*
 * A non-fatal error
 */
{
    domsg("IWarning", format, (char *) narg);
}
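
A note on the symbol table hashing used by lookid() and defendel() above: both sum the bytes of the identifier, add its length, and use the low bits of the result (nhash & SBMASK) to pick a chain in symtab[]. The fragment below is only a sketch of that calculation and is not part of cpp. The names sketch_hash(), sketch_bucket(), SKETCH_SBSIZE and SKETCH_SBMASK are made up for illustration, and the default SBSIZE of 64 is assumed.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_SBSIZE 64                    /* assumed default; power of 2  */
#define SKETCH_SBMASK (SKETCH_SBSIZE - 1)

/*
 * Sum of the identifier's bytes plus its length, mirroring the
 * arithmetic in lookid() and defendel().  Hypothetical helper.
 */
static int sketch_hash(const char *name)
{
    const char *np;
    int nhash = 0;

    assert(name != NULL);
    for (np = name; *np != '\0'; np++)
        nhash += *np;                       /* add each byte        */
    return nhash + (int) (np - name);       /* then add the length  */
}

/*
 * The low bits of the hash select one of SKETCH_SBSIZE chains.
 */
static int sketch_bucket(const char *name)
{
    return sketch_hash(name) & SKETCH_SBMASK;
}
```

Because the hash is a plain sum, identifiers that are rearrangements of each other (for example "ab" and "ba") hash identically. This is presumably why the comparison of dp->hash is marked only as a "fast precheck" and is followed by strcmp() on the name.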