peckham@svax.cs.cornell.edu (Stephen Peckham) (03/25/89)
In article <620@gonzo.UUCP> daveb@gonzo.UUCP (Dave Brower) writes:
>So, I offer this week's challenge: Smallest program that will take
>"blank line" style cpp output on stdin and send to stdout a scrunched
>version with appropriate #line directives. [f]lex, Yacc, [na]awk, sed,
>perl, c, c++ are all acceptable. This will be an amusing exercise in
>typical text massaging that can be enlightening for many people.
>

Here's an awk program that will do the trick. Single blank lines are left
as is. Runs of two or more blank lines are removed, and a new line
directive is added in their place.

{
    if (NF == 0)
        blanks++
    else if ($1 == "#") {       # cpp marker: remember line number and file
        l_no = $2-1; f = $3; blanks = 2;
    }
    else {
        if (blanks > 1) print "#", l_no, f;
        else if (blanks == 1) print "";
        blanks = 0;
        print $0;
    }
    l_no++;
}

Steve Peckham
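For readers who do not speak awk, the same logic can be sketched in Python. This is a line-for-line transliteration rather than anything posted in the thread; the function name scrunch and the variable names are made up for illustration, and it assumes every "# N file" marker in the input is well formed.

```python
def scrunch(lines):
    """Collapse runs of blank lines in cpp output, re-emitting '# N file' markers."""
    out = []
    blanks = 0      # blank lines seen since the last text line was emitted
    line_no = 0     # source line number of the current input line
    filename = ""   # file named by the most recent '# N file' marker
    for line in lines:
        fields = line.split()
        if not fields:
            blanks += 1
        elif fields[0] == "#":
            # cpp marker: the NEXT line is source line N of this file
            line_no = int(fields[1]) - 1
            filename = fields[2] if len(fields) > 2 else ""
            blanks = 2  # force a fresh marker before the next text line
        else:
            if blanks > 1:
                out.append(f"# {line_no} {filename}")
            elif blanks == 1:
                out.append("")
            blanks = 0
            out.append(line)
        line_no += 1
    return out


cpp_output = ['# 1 "a.c"', "int x;", "", "", "", "int y;", "", "int z;"]
for line in scrunch(cpp_output):
    print(line)
```

With this input the three blank lines before "int y;" become a single '# 5 "a.c"' marker, while the lone blank line before "int z;" is passed through unchanged.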
duane@cg-atla.UUCP (Andrew Duane) (03/27/89)
> In article <620@gonzo.UUCP> daveb@gonzo.UUCP (Dave Brower) writes:
>So, I offer this week's challenge: Smallest program that will take
>"blank line" style cpp output on stdin and send to stdout a scrunched
>version with appropriate #line directives. [f]lex, Yacc, [na]awk, sed,
>perl, c, c++ are all acceptable.

If shell scripts are acceptable, how about:

	#!/bin/sh
	cat -s

You may have to use "more" rather than cat. The moral: please
don't reinvent the wheel [1/2 ;-)]

Andrew L. Duane (JOT-7)  w:(508)-658-5600 X5993  h:(603)-434-7934
Compugraphic Corp.          decvax!cg-atla!duane
200 Ballardvale St.         ulowell/    \laidback
Wilmington, Mass. 01887     cbosgd!ima/ \cgeuro
Mail Stop 200II-3-5S        ism780c/    \wizvax

Only my cat shares my opinions, and she has no blank lines.
daveb@gonzo.UUCP (Dave Brower) (03/29/89)
In article <6839@cg-atla.UUCP> duane@cg-atla.UUCP (Andrew Duane) writes:
>> In article <620@gonzo.UUCP> daveb@gonzo.UUCP (Dave Brower) writes:
>>So, I offer this week's challenge: Smallest program that will take
>>"blank line" style cpp output on stdin and send to stdout a scrunched
>>version with appropriate #line directives. [f]lex, Yacc, [na]awk, sed,
>>perl, c, c++ are all acceptable.
>
>If shell scripts are acceptable, how about:
>
>	#!/bin/sh
>	cat -s
>
>You may have to use "more" rather than cat. The moral: please
>don't reinvent the wheel [1/2 ;-)]

Sorry, you leapt at the naive and incorrect solution. Please note the
words "with appropriate #line directives." Cat -s squeezes the blank
lines but emits no directives, so the output lines can no longer be
matched with the input lines. Preserving that correspondence is the
point of the challenge.

I have two entries so far, one in "lex" and another in "awk". Both are
less than 20 lines. It will be interesting to compare timings between
awk, gawk, nawk, lex and flex.

-dB
--
"I came here for an argument." "Oh. This is getting hit on the head"
{sun,mtxinu,amdahl,hoptoad}!rtech!gonzo!daveb	daveb@gonzo.uucp
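The objection to cat -s can be made concrete with a small Python sketch. The helper squeeze_blank is a hypothetical stand-in for cat -s (it keeps at most one blank line from each run), and the five-line input is invented for the example.

```python
def squeeze_blank(lines):
    """Mimic `cat -s`: keep at most one blank line from each run."""
    out = []
    for line in lines:
        if line == "" and out and out[-1] == "":
            continue  # second or later blank line in a run: drop it
        out.append(line)
    return out


source = ["int x;", "", "", "", "int y;"]
squeezed = squeeze_blank(source)

print(source.index("int y;") + 1)    # line number in the input: 5
print(squeezed.index("int y;") + 1)  # line number after squeezing: 3
```

Once two lines have been dropped without a directive recording the jump, a compiler error reported against line 3 of the squeezed text no longer points at line 5 of the original.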
bernsten@phoenix.Princeton.EDU (Dan Bernstein) (03/30/89)
Dave Brower asks for a filter ``that will take "blank line" style cpp
output on stdin and send to stdout a scrunched version with appropriate
#line directives.'' If we may combine built-in utilities to handle the
problem, then this 9-line shell script will do it (combine the last
two lines to make it 8):

#!/bin/sh
( tr XY '\375\376' | sed 's/^\(.\)\(.*\)/X\1\2Y/
tend
i\
X#line
d
:end
=' | uniq | tr '\012X' ' \012'; echo ''; ) |
sed 's/Y.*//' | tr '\375\376' XY | sed -n '1!p'

The idea is reasonably simple; one could use, e.g., grep -n '.' to obtain
a similar solution. This particular version destroys any \375 and \376
you may have in your source, and because it's based on sed, it omits the
final line if it has no newline. It has been tested successfully on a
wide variety of sources, and I must say the next time I feel compelled
to look at cpp output, I'll definitely use it.

> I have two entries so far, one in "lex" and another in "awk". Both are
> less than 20 lines. It will be interesting to compare timings between
> awk, gawk, nawk, lex and flex.

Ahem? Are we forgetting sed here? (Then again, I hate awk, love sed,
and prefer C to lex. I'd rather have a sed script twice as slow as an
awk script. But that's just personal bias.)

If you time, make sure to test out on really long sources too. I'd hate
to see my script penalized just because it totals eight+sh execs :-).

---Dan Bernstein, bernsten@phoenix.princeton.edu
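The grep -n '.' variant mentioned above can be sketched in Python as follows. This is not a port of the shell script: the name scrunch_numbered is hypothetical, the numbers are line numbers in the cpp output itself (not in the original source file), and details such as trailing spaces after the number are not reproduced.

```python
def scrunch_numbered(lines):
    """Drop blank lines; emit '#line N' wherever the numbering jumps."""
    # Equivalent of `grep -n '.'`: (line number, text) for non-empty lines only.
    numbered = [(n, text) for n, text in enumerate(lines, start=1) if text]
    out = []
    expected = 1  # number the next line would carry if nothing had been skipped
    for n, text in numbered:
        if n != expected:
            out.append(f"#line {n}")
        out.append(text)
        expected = n + 1
    return out


print(scrunch_numbered(["a", "", "", "b", "c", "", "d"]))
```

Numbering first and filtering second means the blank lines never have to be counted explicitly; a gap in the surviving numbers is enough to show where a directive belongs.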
rupley@arizona.edu (John Rupley) (03/31/89)
In article <7472@phoenix.Princeton.EDU>, bernsten@phoenix.Princeton.EDU (Dan Bernstein) writes:
> Dave Brower asks for a filter ``that will take "blank line" style cpp
> output on stdin and send to stdout a scrunched version with appropriate
                                                              ^^^^^^^^^^^
> #line directives.'' If we may combine built-in utilities to handle the
> problem, then this 9-line shell script will do it (combine the last
> two lines to make it 8):
>
> #!/bin/sh
> ( tr XY '\375\376' | sed 's/^\(.\)\(.*\)/X\1\2Y/
> tend
> i\
> X#line
> d
> :end
> =' | uniq | tr '\012X' ' \012'; echo ''; ) |
> sed 's/Y.*//' | tr '\375\376' XY | sed -n '1!p'

I am not sure this is what the original poster wanted, i.e.,
``appropriate'' may refer to #line directives with line numbers that
reference the source file, not the cpp output.

Regardless, the above script is truly trivial in Lex:

%%
\n\n+	printf("\n#line %d \n", yylineno);
.|\n	ECHO;

> Ahem? Are we forgetting sed here? (Then again, I hate awk, love sed,
> and prefer C to lex. I'd rather have a sed script twice as slow as an
> awk script. But that's just personal bias.)

How could one forget sed (:-)? But for matching patterns that cross
line boundaries, Lex is a natural, because it sees a file as a stream
of characters rather than as a stream of records. Sed and awk are
record-based and thus seem forced for multi-line matching.

Prefer C to Lex? Hmmm... Lex is just the machinery for a pattern-based
switch statement, with the user supplying ``case'' statements written
in C.

John Rupley
rupley!local@megaron.arizona.edu
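The stream-of-characters view carries over to any regular-expression library that can match across newlines. A minimal Python sketch of the same rule follows; the name scrunch_stream is made up, and it assumes that lex's yylineno at the time of the action equals the number of the line following the blank run, which is how the count is computed here.

```python
import re


def scrunch_stream(text):
    """Replace each run of two or more newlines with a '#line N' marker."""
    def marker(match):
        # Number of the line that follows the run: one more than the
        # count of newlines consumed up to the end of the match.
        next_line = text.count("\n", 0, match.end()) + 1
        return f"\n#line {next_line} \n"
    return re.sub(r"\n\n+", marker, text)


print(scrunch_stream("a\n\n\nb\nc\n\nd\n"), end="")
```

Because two consecutive newlines already amount to one blank line, this version, like the shell script it mirrors, replaces single blank lines as well as longer runs, and the numbers refer to the cpp output rather than to the source file.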
rupley@arizona.edu (John Rupley) (04/01/89)
From rupley!local Fri Mar 31 13:43:14 1989
In article <620@gonzo.UUCP> daveb@gonzo.UUCP (Dave Brower) writes:
>So, I offer this week's challenge: Smallest program that will take
>"blank line" style cpp output on stdin and send to stdout a scrunched
>version with appropriate #line directives.

The following Lex source is somewhat shorter than a previous Lex version.

Specifications assumed: single blank lines, as well as runs of blank
lines with or without <#> line directives, are to be replaced by
<# lineno "filename">; only truly blank lines (no space or tab) are
to be considered blank.

------------------------------------------------------------------------
	char f[80];
%S P
%%
#.+\n		{sscanf(yytext,"#%d%s",&yylineno,f);BEGIN P;}
<P>.+\n		{printf("# %d %s\n",yylineno-1,f);ECHO;BEGIN 0;}
\n		BEGIN P;
.+\n		ECHO;
------------------------------------------------------------------------

John Rupley
rupley!local@megaron.arizona.edu
rupley@arizona.edu (John Rupley) (04/01/89)
In article <620@gonzo.UUCP> daveb@gonzo.UUCP (Dave Brower) writes:
>So, I offer this week's challenge: Smallest program that will take
>"blank line" style cpp output on stdin and send to stdout a scrunched
>version with appropriate #line directives.

Yet another Lex version:

------------------------------------------------------------------------
	char f[80]; int x;
%%
#.+\n		{sscanf(yytext,"#%d%s",&yylineno,f); x++;}
.+\n		{if(x)printf("# %d %s\n",yylineno-1,f); ECHO; x=0;}
\n		x++;
------------------------------------------------------------------------

John Rupley
rupley!local@megaron.arizona.edu