rsalz@uunet.uu.net (Rich Salz) (11/18/88)
Submitted-by: arizona!rupley!local (John Rupley) Posting-number: Volume 16, Issue 88 Archive-name: identlist TITLE: cdeclist, identlist - list external declarations for C source files; list identifiers SUMMARY: The attached Lex source files are for filters that generate, for a C source file: (1) a list of external declarations (functions, arrays, variables, structures); (2) a file of identifiers suitable for inverted indexing, making a word list, etc. Run under SysV or BSD. See README for test instructions: Hope you find them useful, John Rupley uucp: ..{cmcl2 | hao!ncar!noao}!arizona!rupley!local internet: rupley!local@megaron.arizona.edu ------------------------------------------------------------------------- #!/bin/sh # to extract, remove the header and type "sh filename" if `test ! -s ./README` then echo "writing ./README" cat > ./README << '\Rogue\Monster\' README - Sat Aug 20 23:22:04 MST 1988 The attached Lex source files are for filters that generate, for a C source file: (1) a list of external declarations (functions, arrays, variables, structures); each declaration on one line, for manipulation by awk, etc.; initializers replaced by a /*comment*/; function definitions with parameter declarations and a dummy statement with appropriate return: { [return [(n)];] } preprocessing with cpp replaces defined names and executes compilation conditionals; (2) a file of identifiers suitable for inverted indexing, making a word list, etc. Use of the filters is explained by example: (1) mk_cdeclist and the test targets in the Makefile show the generation of a declaration list; (2) mk_indentlist shows the generation of a list of identifiers. The make file works under csh/BSD and ksh/SYSV. 
To test: copy a C source file and any non-system #include dependencies into the directory with the unshared files from cdeclist.shar run: make TESTC="C_src_file" testall you should get a file LLLLfilename, which is the declaration list, some intermediate files LLfilename and temp?, which show what is going on at each stage, and a file ZZZZfilename, which is the list of C keywords and identifiers. The cdeclist filters, although simple, should parse most styles of coding. Adjustment may be needed for the new ANSI standard. The output is suitable for making a lint library. The identlist filters are offered without much being claimed for them. I use the list as a guide and aid, without assuming it absolutely correct. The shell scripts mk_* were written under ksh on a SYSV machine, and they need modification to run under csh or BSD. John Rupley uucp: ..{cmcl2 | hao!ncar!noao}!arizona!rupley!local internet: rupley!local@megaron.arizona.edu (H) 30 Calle Belleza, Tucson AZ 85716 - (602) 325-4533 (O) Dept. Biochemistry, Univ. Arizona, Tucson AZ 85721 - (602) 621-3929 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ MANIFEST: (for cdeclist.shr1) README Makefile cdeclist.1 man page uncomment.l | cdeclist1.l | cdeclist2.l | shell wrapper and filters generating declaration list cdeclist3.l | cdeclist4.l | mk_cdeclist | identlist.l | shell wrapper and filters generating file of identifiers identlist1.l | mk_identlist | \Rogue\Monster\ else echo "will not over write ./README" fi if [ `wc -c ./README | awk '{printf $1}'` -ne 2358 ] then echo `wc -c ./README | awk '{print "Got " $1 ", Expected " 2358}'` fi if `test ! 
-s ./Makefile` then echo "writing ./Makefile" cat > ./Makefile << '\Rogue\Monster\' #MAKEFILE - #5 filters for extracting a list of declarations from a C source file; # delete: comments; # (uncomment.l) # filter with cpp: replace #defines; execute #ifdef control statements; # (cdeclist1.l pre-processes, cdeclist2.l post-processes) # delete: function {bodies}; initializations (= ....;); # (cdeclist3.l) # reformat: to one-line declarations; and a bit more; # (cdeclist4.l) #output should be suitable for: # making /*LINTLIBRARY*/, massaging with grep, awk..., etc; #all filters read stdin, write stdout; #probably need some adaptation for new ANSI standard; # #also 2 filters (identlist and identlist1) that prepare a C source #file for making a word list or index of identifiers; please don't #flame me if the word list misses a few identifiers or includes a few #non-existent ones -- I make no great claim for it -- and it assumes, #I am sure, an idiosyncratic style (mine). # #John Rupley # uucp: ..{cmcl2 | hao!ncar!noao}!arizona!rupley!local # internet: rupley!local@megaron.arizona.edu # (H) 30 Calle Belleza, Tucson AZ 85716 - (602) 325-4533 # (O) Dept. Biochemistry, Univ. 
Arizona, Tucson AZ 85721 - (602) 621-3929 OPT= dummy: @echo please supply a target all: uncomment cdeclist1 cdeclist2 cdeclist3 cdeclist4 identlist identlist1 #remove comments uncomment: uncomment.l lex -v uncomment.l $(CC) lex.yy.c $(CFLAGS) -ll $(LDFLAGS) -o uncomment #remove comments and quoted strings etc identlist: identlist.l lex -v identlist.l $(CC) lex.yy.c $(CFLAGS) -ll $(LDFLAGS) -o identlist #remove upper-case words and numbers identlist1: identlist1.l lex -v identlist1.l $(CC) lex.yy.c $(CFLAGS) -ll $(LDFLAGS) -o identlist1 #prepare C source file for cpp processing cdeclist1: cdeclist1.l lex -v cdeclist1.l $(CC) lex.yy.c $(CFLAGS) -ll $(LDFLAGS) -o cdeclist1 #remove additions associated with cpp processing cdeclist2: cdeclist2.l lex -v cdeclist2.l $(CC) lex.yy.c $(CFLAGS) -ll $(LDFLAGS) -o cdeclist2 #remove function {bodies}, put in appropriate { return (n); } cdeclist3: cdeclist3.l lex -v cdeclist3.l $(CC) lex.yy.c $(CFLAGS) -ll $(LDFLAGS) -o cdeclist3 #one-line declarations;remove initializatons;beautify a little cdeclist4: cdeclist4.l lex -v cdeclist4.l $(CC) lex.yy.c $(OPT) $(CFLAGS) -ll $(LDFLAGS) -o cdeclist4 SHARLIST= \ README\ Makefile\ cdeclist.1\ uncomment.l\ cdeclist1.l\ cdeclist2.l\ cdeclist3.l\ cdeclist4.l\ mk_cdeclist\ identlist.l\ identlist1.l\ mk_identlist mkshar: shar -f cdeclist -c $(SHARLIST) TESTC="please define TESTC=C_src_file on make line" #HDIR=hard_wired_path_for_headers #CDIR=hard_wired_path_for_sources HDIR="." CDIR="." 
#test making a declaration list testc: all cat $(CDIR)/$(TESTC)|uncomment|cdeclist1 >temp1 /lib/cpp -P -C -I$(HDIR) temp1 >temp2 cat temp2|cdeclist2 >temp3 echo /*$(TESTC)*/ >LL$(TESTC) echo >>LL$(TESTC) cat temp3|cdeclist3 >>LL$(TESTC) test2c: cdeclist2 cat temp2|cdeclist2 >temp3 test3c: cdeclist3 echo /*$(TESTC)*/ >LL$(TESTC) echo >>LL$(TESTC) cat temp3|cdeclist3 >>LL$(TESTC) TESTLL=LL$(TESTC) testll: cdeclist4 cat $(TESTLL)|cdeclist4 >LL$(TESTLL) #test making a word list testw: identlist identlist1 cat $(TESTC)|identlist|identlist1 |\ tr -s " " "\012\012"|sort|uniq >ZZZZ$(TESTC) #careful! testlint: lint -uvx -Ml -otemp LL$(TESTLL) cp llib-ltemp.ln /usr/lib/large lint -uvx -Ml -ltemp $(TESTC) testall: testc testll testw #testlint \Rogue\Monster\ else echo "will not over write ./Makefile" fi if [ `wc -c ./Makefile | awk '{printf $1}'` -ne 3409 ] then echo `wc -c ./Makefile | awk '{print "Got " $1 ", Expected " 3409}'` fi if `test ! -s ./cdeclist.1` then echo "writing ./cdeclist.1" cat > ./cdeclist.1 << '\Rogue\Monster\' .TH CDECLIST 1 .SH NAME mk_cdeclist, mk_identlist \- list external declarations for C source files; list identifiers .SH SYNOPSIS .B mk_cdeclist filelist .br .B mk_identlist filelist .SH DESCRIPTION .B Mk_cdeclist is a shell script that links a set of filters and the C preprocessor, .B cpp, to convert C source files into a list of external declarations (functions, arrays, variables, structures). Each declaration is on one line, to simplify subsequent manipulation by awk, etc. Initializers are replaced by a comment, /*INITIALIZED*/. Function definitions include parameter declarations and a dummy statement with appropriate return: .in +10 { [return [(n)];] } .in The C preprocessor is used to replace defined names and to execute compilation conditionals; declarations introduced by the preprocessing are removed, and include directives are restored. .PP The output of .B mk_cdeclist can be compiled into a lint library. 
.PP .B Mk_identlist is a shell script with filters for converting C source files into a file of identifiers suitable for inverted indexing, making a word list, etc. .PP .SH FILES The filters for both .B mk_cdeclist and .B mk_identlist are from Lex source files: .in +5 uncomment.l, cdeclist[1-4].l and identlist[ 1].l. .in The executables are correspondingly named. .br .B Cpp is used in .B mk_cdeclist. .SH AUTHOR John Rupley .br uucp: ..{cmcl2 | hao!ncar!noao}!arizona!rupley!local .br internet: rupley!local@megaron.arizona.edu .br Dept. Biochemistry, Univ. Arizona, Tucson AZ 85721 .SH BUGS The cdeclist filters, although simple, should parse most styles of coding. Adjustment may be needed for the new ANSI standard. If there is a problem, try filtering first with .B cb or .B indent. .sp The identlist filters are offered without much being claimed for them. Use the list as a guide and aid, without assuming it absolutely correct. .sp The shell scripts mk_* were written under ksh on a SYSV machine, and they need modification to run under csh or BSD. \Rogue\Monster\ else echo "will not over write ./cdeclist.1" fi if [ `wc -c ./cdeclist.1 | awk '{printf $1}'` -ne 1998 ] then echo `wc -c ./cdeclist.1 | awk '{print "Got " $1 ", Expected " 1998}'` fi if `test ! -s ./uncomment.l` then echo "writing ./uncomment.l" cat > ./uncomment.l << '\Rogue\Monster\' %{ /*UNCOMMENT- based on usenet posting by: */ /* Chris Thewalt; thewalt@ritz.cive.cmu.edu */ %} STRING \"([^"\n]|\\\")*\" COMMENTBODY ([^*\n]|"*"+[^*/\n])* COMMENTEND ([^*\n]|"*"+[^*/\n])*"*"*"*/" %START COMMENT %% <COMMENT>{COMMENTBODY} ; <COMMENT>{COMMENTEND} BEGIN 0; <COMMENT>.|\n ; "/*" BEGIN COMMENT; {STRING} ECHO; .|\n ECHO; \Rogue\Monster\ else echo "will not over write ./uncomment.l" fi if [ `wc -c ./uncomment.l | awk '{printf $1}'` -ne 349 ] then echo `wc -c ./uncomment.l | awk '{print "Got " $1 ", Expected " 349}'` fi if `test ! 
-s ./cdeclist1.l` then echo "writing ./cdeclist1.l" cat > ./cdeclist1.l << '\Rogue\Monster\' %{ /*-CDECLIST1: prepare for cpp execution of #ifdefs, etc.; */ /*i.e., setup to restore #includes & remove code added by cpp */ %} %% ^\#[ \t]*include.*$ {printf("/*%s*/\n", yytext); printf("/*DINGDONGDELL*/\n"); printf("%s\n", yytext); printf("/*DELLDONGDING*/\n");} .|\n ECHO; \Rogue\Monster\ else echo "will not over write ./cdeclist1.l" fi if [ `wc -c ./cdeclist1.l | awk '{printf $1}'` -ne 289 ] then echo `wc -c ./cdeclist1.l | awk '{print "Got " $1 ", Expected " 289}'` fi if `test ! -s ./cdeclist2.l` then echo "writing ./cdeclist2.l" cat > ./cdeclist2.l << '\Rogue\Monster\' %{ /*-CDECLIST2: remove cpp-included code, restore #include's */ %} %START DING %% ^"/*DINGDONGDELL*/"$ BEGIN DING; ^"/*DELLDONGDING*/"$ BEGIN 0; <DING>^"/*#"[^*]*"*/"$ ; <DING>.|\n ; ^"/*#"[^*]*"*/"$ {yytext[yyleng-2] = 0; printf("%s", &yytext[2]);} ^[ \t]*\n ; .|\n ECHO; \Rogue\Monster\ else echo "will not over write ./cdeclist2.l" fi if [ `wc -c ./cdeclist2.l | awk '{printf $1}'` -ne 291 ] then echo `wc -c ./cdeclist2.l | awk '{print "Got " $1 ", Expected " 291}'` fi if `test ! -s ./cdeclist3.l` then echo "writing ./cdeclist3.l" cat > ./cdeclist3.l << '\Rogue\Monster\' %{ /*-CDECLIST3: for each function, remove the {function body}; */ /* (look out for {}'s escaped or within " or ' pairs) */ %} int curly, retval; WLF ([ \t\n\r\f]*) FUNCSTRT (\){WLF}\{|\;{WLF}\{) SKIPALLQUOTED (\"([^"\n]|\\\")*\"|\'.\'|\\.) 
%START CURLY %% <CURLY>\{ curly++; <CURLY>\} {if (--curly == 0) { if (retval > 1) printf(" return (%d); }", retval); else if (retval) printf(" return; }"); else printf("}"); BEGIN 0; }} <CURLY>{FUNCSTRT} curly++; <CURLY>{SKIPALLQUOTED}|.|\n ; <CURLY>"return"{WLF}\; retval |= 1; <CURLY>"return"{WLF}([^;]|{SKIPALLQUOTED})*\; retval |= 2; {FUNCSTRT} {curly=1;retval=0;ECHO;BEGIN CURLY;} ^[ \t]*\n ; .|\n ECHO; \Rogue\Monster\ else echo "will not over write ./cdeclist3.l" fi if [ `wc -c ./cdeclist3.l | awk '{printf $1}'` -ne 720 ] then echo `wc -c ./cdeclist3.l | awk '{print "Got " $1 ", Expected " 720}'` fi if `test ! -s ./cdeclist4.l` then echo "writing ./cdeclist4.l" cat > ./cdeclist4.l << '\Rogue\Monster\' %{ /*-CDECLIST4: process output of cdeclist3: delete initialization expressions; reformat for one-line declarations; some beautfying -- wise to process original source with cb|indent; parsing in DECLST is hacked; would be better to base it cleanly on std ANSI; to delete externs or whatever, try: {WLF}extern[^;]*\;{WLF} */ %} int curly; W ([ \t]*) WLF ([ \t\n\f\r]*) LET [_a-zA-Z] DIGIT [0-9+-/*] DIGLET [_a-zA-Z0-9] NAME ([*]*{LET}{DIGLET}*) ARRAY (\[{DIGIT}*\]) DECL ([;,*]|{WLF}|{NAME}|{ARRAY}) FUNCPTR (\({DECL}*\)\({DECL}*\)) DECLST ({DECL}|{FUNCPTR})* FINDSTRUCT (struct|union|enum){WLF}{NAME}?{WLF}\{ ENDSTRUCT {WLF}\}{WLF} FINDFUNC \){DECLST}\{[ ]?(return[^}\n]*)?\} ENDFUNC {WLF}\{[ ]?(return[^}\n]*)?\} FINDINIT {WLF}\={WLF} ENDINIT1 {WLF}\;{WLF} ENDINIT2 {WLF}\,{WLF} SKIPALLQUOTED (\"([^"\n]|\\\")*\"|\'.\'|\\.) 
%START NORM DELETE DECL %{ #include <ctype.h> main() { BEGIN NORM; yylex(); return 0; } %} %% <NORM>"/*"[^\n]*"*/" print_skip(yytext); <NORM>"#"[^\n]*$ print_skip(yytext); <NORM>{WLF}{FINDINIT} {BEGIN 0;BEGIN DELETE;} <NORM>{FINDFUNC}|{FINDSTRUCT} {curly=0;BEGIN 0;BEGIN DECL;REJECT;} <DELETE>{SKIPALLQUOTED}|[^{},;] ; <DELETE>\{ curly++; <DELETE>\} curly--; <DELETE>{ENDINIT1} {if (curly == 0) { printf("\;\t/*INITIALIZED*/\n"); ;BEGIN 0; BEGIN NORM;}} <DELETE>{ENDINIT2} {if (curly == 0) { printf(" /*INITIALIZED*/, "); ;BEGIN 0; BEGIN NORM;}} <DECL>{ENDSTRUCT} {printf("} "); if (--curly == 0) { BEGIN 0;BEGIN NORM; }} <DECL>{ENDFUNC} {printf(" ");print_skip(yytext); BEGIN 0;BEGIN NORM;} <DECL>\{ {ECHO;curly++;} <DECL>{WLF}\){WLF}/{ENDFUNC} printf(")"); <DECL>{WLF}\;{WLF}/{ENDFUNC} printf(";"); <DECL>{WLF}\;{WLF} printf("; "); <NORM>{WLF}\;{WLF} printf(";\n"); <NORM,DECL>{WLF}\,{WLF} printf(", "); <NORM,DECL>{WLF}/[/#] ; <NORM,DECL>{WLF} printf(" "); <NORM,DECL>. ECHO; %% print_skip(s) char * s; { int c; while (isspace(*s)) s++; printf("%s\n", s); while (isspace(c=input())) ; unput(c); } \Rogue\Monster\ else echo "will not over write ./cdeclist4.l" fi if [ `wc -c ./cdeclist4.l | awk '{printf $1}'` -ne 2012 ] then echo `wc -c ./cdeclist4.l | awk '{print "Got " $1 ", Expected " 2012}'` fi if `test ! -s ./mk_cdeclist` then echo "writing ./mk_cdeclist" cat > ./mk_cdeclist << '\Rogue\Monster\' # MK_CDECLIST DEFAULTDIR="." CDIR=`dirname $1` if [ ${CDIR} = "." 
] then CDIR=${DEFAULTDIR} fi HDIR=${CDIR} CDECLIST="" #option: if define LLOUT="Sall.1", then output concatenated onto Sall.1; #else get individual output files with prefix "LL"; LLOUT="Sall.1" #LLOUT="dumdumdum" for i in $* do FILE=`basename ${i}` echo processing file: $i >&2 if [ ${LLOUT} != "Sall.1" ] then LLOUT=LL${FILE} CDECLIST="${CDECLIST} ${LLOUT}" fi echo "\n\n/*********************************************/\n" >>${LLOUT} echo /*${FILE}*/ >>${LLOUT} echo >>${LLOUT} cat ${CDIR}/${FILE}|uncomment|cdeclist1|/lib/cpp -P -C -I${HDIR}| cdeclist2|cdeclist3|cdeclist4 >>${LLOUT} done #temporary++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ exit #NOTE: what follows is an example of manipulation of the cdeclist output. #the following cats a list of separate "LL" files onto Sall.1 echo echo +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ echo combining files: ${CDECLIST} >Sall.1 for i in ${CDECLIST} do cat ${i} >>Sall.1 done #the following extracts a list of function definitions _only_; #(no static or external declarations; no definitions of arrays/vars/structs); #collect comment header lines, functions, and struct declarations; #get rid of register specs; #strip any variable declarations trailing struct declarations; #change return (n) -> return (0), so ok for return of pointers; #insert VARARGS where needed; #to include structure definitions, add to 1st egrep: |struct[ ]+[^;({]+\{ #remove several functions with "static" scope -- and that give errors in #compilation without struct definitions (same scope); #this gives a partial lint library - full library needs array/variables; cat Sall.1|egrep '\/\*[^L]|\{\}|\ return[ ;]' | egrep -v '^static\ |INITIALIZED' | egrep -v '\ engr_at[(]|^del_engr[(]|\ outentry[(]|^ini_inv[(]' | sed -e "s/register[ ]\([^;, ][^;, ]*[;,]\)/int\ \1/g" \ -e "s/register[ ]//g" \ -e "s/\}[^;}]*\;$/\}\ \;/" \ -e "s/return\ .[1-3]./return\ \(0\)/g" \ -e "s/^panic/\/\*VARARGS1\*\/\\ panic/" \ -e 
"s/^error/\/\*VARARGS1\*\/\\ error/" \ -e "s/^pline/\/\*VARARGS1\*\/\\ pline/" \ -e "s/^impossible/\/\*VARARGS1\*\/\\ impossible/p" >Sall.2 \Rogue\Monster\ else echo "will not over write ./mk_cdeclist" fi if [ `wc -c ./mk_cdeclist | awk '{printf $1}'` -ne 2230 ] then echo `wc -c ./mk_cdeclist | awk '{print "Got " $1 ", Expected " 2230}'` fi if `test ! -s ./identlist.l` then echo "writing ./identlist.l" cat > ./identlist.l << '\Rogue\Monster\' %{ /*IDENTLIST- comments & strings deleted; punc to space; etc*/ /*(a filter to prep C source for bib INDEXing, making a word list, ...)*/ /*convert to white space all tokens except C keywords and identifiers;*/ /*comment recognition based on usenet posting by: */ /* Chris Thewalt; thewalt@ritz.cive.cmu.edu */ %} W [ \t] STRING \"([^"\n]|\\\")*\" %START COMMENT %% <COMMENT>([^*\n]|"*"+[^*/\n])* ; <COMMENT>([^*\n]|"*"+[^*/\n])*"*"*"*/" BEGIN 0; <COMMENT>.|\n ; "/*" BEGIN COMMENT; {STRING} ; ^#.*$ ; [a-zA-Z_0-9]\.[a-zA-Z_0-9] ECHO; [a-zA-Z_0-9]\-\>[a-zA-Z_0-9] ECHO; \\|\||\+|\)|\(|\*|\&|\^|\%|\$ printf(" "); \#|\@|\!|\~|\`|\-|\=|\}|\{ printf(" "); \]|\[|\"|\:|\'|\;|\?|\/|\>|\.|\<|\, printf(" "); .|\n ECHO; \Rogue\Monster\ else echo "will not over write ./identlist.l" fi if [ `wc -c ./identlist.l | awk '{printf $1}'` -ne 735 ] then echo `wc -c ./identlist.l | awk '{print "Got " $1 ", Expected " 735}'` fi if `test ! 
-s ./identlist1.l` then echo "writing ./identlist1.l" cat > ./identlist1.l << '\Rogue\Monster\' %{ /*IDENTLIST1- remove strings that are all uppercase or all numbers*/ /*(another filter to prep C source for bib INDEXing, making a word list, ...)*/ /* ^[A-Z_0-9]+/{W}+ printf(" "); {W}+[A-Z_0-9]+$ printf(" "); ^[A-Z_0-9]+$ printf(" "); */ %} W [ \t] %START COMMENT %% (^|{W}+)[A-Z_0-9]+/(\n|{W}+) printf(" "); .|\n ECHO; \Rogue\Monster\ else echo "will not over write ./identlist1.l" fi if [ `wc -c ./identlist1.l | awk '{printf $1}'` -ne 335 ] then echo `wc -c ./identlist1.l | awk '{print "Got " $1 ", Expected " 335}'` fi if `test ! -s ./mk_identlist` then echo "writing ./mk_identlist" cat > ./mk_identlist << '\Rogue\Monster\' # MK_IDENTLIST #prepare C source files for making an inverted index, or whatever, using #identlist and identlist1 to: delete quoted strings, comments; #convert punctuation to white space; delete numbers and all upper case words; #the result, with some stuff flagged for later reversal, #should be suitable for making an inverted index of identifiers in each #source file. # #if use "invert" from the "bib" suite of programs, can exclude, as common #words, the C keywords and the system library # #(NOTE: as a test, the output of the loop containing the filters is #sent through tr, sort and uniq, to produce a word list) DEFAULTDIR="." CDIR=`dirname $1` if [ ${CDIR} = "." 
] then CDIR=${DEFAULTDIR} fi for i in $* do FILE=`basename ${i}` echo processing file: $i >&2 cat ${CDIR}/${FILE}|identlist|identlist1| sed -e '/^[ ]*$/d' done |tr -s " " "\012\012"|sort|uniq #produce a word list #temporary++++++++++++++++++++++++++++++++++++++++++++++++++ exit #to make an inverted index, replace the above loop by the following: for i in $* do FILE=`basename ${i}` echo processing file: $i cat ${CDIR}/${FILE}|identlist|identlist1| sed -e 's/\([a-zA-Z_0-9]\)\.[^ \t]*[ \t]/\1\qsq\ /g' \ -e 's/\([a-zA-Z_0-9]\)\-\>[^ \t]*[ \t]/\1qsq\ /g' \ -e 's/\([a-zA-Z0-9]\)[_]\([a-zA-Z0-9]\)/\1quq\2/g' | sed -e '/^[ ]*$/d' >${FILE} FILELIST=${FILELIST}" "${FILE} done #NOTE: if the above changes are made, mk_identlist should be run in #an empty (scratch) directory; #intermediate files of the same name as the C source files are created; #the directory with the C source files can be given as DEFAULTDIR or as #the absolute pathname of the first file in the list; #NOTE: the following is an example of processing to get an inverted index. invert -ccommon -k5000 -l30 ${FILELIST} mv INDEX INDEX.long cat INDEX.long| sed -e 's/quq/\_/g' -e 's/qsq/STRUCT/g' \ -e 's/\([^0-9]\)[0-9][0-9]*\/[0-9][0-9]*\([^0-9]\)/\1\2/g' \ -e 's/\([^0-9]\)[0-9][0-9]*\/[0-9][0-9]*$/\1/' | sort -o INDEX \Rogue\Monster\ else echo "will not over write ./mk_identlist" fi if [ `wc -c ./mk_identlist | awk '{printf $1}'` -ne 2001 ] then echo `wc -c ./mk_identlist | awk '{print "Got " $1 ", Expected " 2001}'` fi echo "Finished archive 1 of 1" exit -- Please send comp.sources.unix-related mail to rsalz@uunet.uu.net.
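
A note on the word-list stage: both the testw target in the Makefile and the first loop in mk_identlist finish with the same tail, tr | sort | uniq, which turns the space-separated output of identlist and identlist1 into one identifier per line. The fragment below is only a sketch of that tail in isolation. The function name "wordlist" is hypothetical and not part of the archive, and it assumes its input has already been through the two Lex filters (it does nothing about comments, strings, or punctuation itself).

```shell
#!/bin/sh
# Sketch only: "wordlist" is a hypothetical helper name, not part of the archive.
# It mirrors the tail of the testw target and of the mk_identlist loop:
#   ... | tr -s " " "\012\012" | sort | uniq
# and assumes the input has already passed through identlist and identlist1.
wordlist() {
    # turn blanks and tabs into newlines, squeezing runs of them into one
    tr -s ' \t' '\n\n' |
    # drop an empty first line left behind by leading white space
    sed -e '/^$/d' |
    # one identifier per line, sorted, duplicates removed
    sort | uniq
}

# Example: three lines standing in for the output of the Lex filters.
printf 'main argc argv\n  argv   argc\nmain\n' | wordlist
```

Run as is, the example should print argc, argv and main, one per line. The extra sed step is an addition over the Makefile version; mk_identlist itself deletes blank lines with sed before the tr stage, so the sketch simply folds that into one place.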