maart@cs.vu.nl (Maarten Litmaath) (03/25/89)
jim@bilpin.UUCP (Jim G) writes:
\#{ zapcom.sh }
\# Remove comments from a C program
\# sed removes comment strings which begin and end on the same line
\# awk removes comment strings which extend across multiple lines
\# sed/awk both handle nesting of comments within their context
Aha! You're using a SHELL script! Well, in that case there's another word
for my `sed approach' :-)
No awk necessary. This pipeline is reasonably fast too!
Usage:
sed -f Cstrip.1.sed foo.c | sed -f Cstrip.2.sed | sed -f Cstrip.3.sed
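For anyone wondering what the first stage hands to the second, here is a
small sketch (not part of the archive below). It runs the two substitutions
from Cstrip.1.sed by themselves on one sample line; the variable names
stage1 and out, and the sample input, are made up for illustration.

```shell
# The two commands of Cstrip.1.sed, run with -n in place of the "#n" line.
stage1='s/\(.\)/\1\
/g
s/$/==/p'

# Each input character comes out on a line of its own, and "==" marks
# where the original line ended so the last stage can glue it back together.
out=$(printf 'a/*b\n' | sed -n -e "$stage1")
printf '%s\n' "$out"
```

The second script then only ever sees one character per line, which is what
lets it step through strings, character constants and comments with plain
`n' commands; the third script joins the lines again at each == marker.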
: This is a shar archive. Extract with sh, not csh.
: This archive ends with exit, so do not worry about trailing junk.
: --------------------------- cut here --------------------------
PATH=/bin:/usr/bin:/usr/ucb
echo Extracting 'Cstrip.1.sed'
sed 's/^X//' > 'Cstrip.1.sed' << '+ END-OF-FILE ''Cstrip.1.sed'
X#n
Xs/\(.\)/\1\
X/g
Xs/$/==/p
+ END-OF-FILE Cstrip.1.sed
chmod 'u=rw,g=r,o=r' 'Cstrip.1.sed'
set `wc -c 'Cstrip.1.sed'`
count=$1
case $count in
27) :;;
*) echo 'Bad character count in ''Cstrip.1.sed' >&2
echo 'Count should be 27' >&2
esac
echo Extracting 'Cstrip.2.sed'
sed 's/^X//' > 'Cstrip.2.sed' << '+ END-OF-FILE ''Cstrip.2.sed'
X#n
X/"/{
X	: L0
X	p
X	n
X	/"/{
X		p
X		b
X	}
X	/\\/{
X		p
X		n
X	}
X	b L0
X}
X/'/{
X	: L1
X	p
X	n
X	/'/{
X		p
X		b
X	}
X	/\\/{
X		p
X		n
X	}
X	b L1
X}
X/\\/{
X	p
X	n
X	p
X	b
X}
X/\//{
X	h
X	n
X	/*/{
X		: L2
X		n
X		: L3
X		/*/{
X			n
X			/\//b
X			b L3
X		}
X		b L2
X	}
X	H
X	g
X}
Xp
+ END-OF-FILE Cstrip.2.sed
chmod 'u=rw,g=r,o=r' 'Cstrip.2.sed'
set `wc -c 'Cstrip.2.sed'`
count=$1
case $count in
232) :;;
*) echo 'Bad character count in ''Cstrip.2.sed' >&2
echo 'Count should be 232' >&2
esac
echo Extracting 'Cstrip.3.sed'
sed 's/^X//' > 'Cstrip.3.sed' << '+ END-OF-FILE ''Cstrip.3.sed'
X#n
X/==/{
X	g
X	s/\n//gp
X	s/.*//
X	x
X	b
X}
XH
+ END-OF-FILE Cstrip.3.sed
chmod 'u=rw,g=r,o=r' 'Cstrip.3.sed'
set `wc -c 'Cstrip.3.sed'`
count=$1
case $count in
40) :;;
*) echo 'Bad character count in ''Cstrip.3.sed' >&2
echo 'Count should be 40' >&2
esac
exit 0
--
Modeless editors and strong typing: |Maarten Litmaath @ VU Amsterdam:
both for people with weak memories. |maart@cs.vu.nl, mcvax!botter!maart
jim@bilpin.UUCP (Jim G) (03/30/89)
#{ v_langC.2 }
IN ARTICLE <2216@solo8.cs.vu.nl>, maart@cs.vu.nl (Maarten Litmaath) WRITES:
> jim@bilpin.UUCP (Jim G) [**THAT'S ME, FOLKS!**] writes:
> \#{ zapcom.sh }
> \# Remove comments from a C program
> \# sed removes comment strings which begin and end on the same line
> \# awk removes comment strings which extend across multiple lines
> \# sed/awk both handle nesting of comments within their context
[small but perfectly formed awk/sed script deleted]
>
> Aha! You're using a SHELL script! Well, in that case there's another word
> for my `sed approach' :-)
> No awk necessary. This pipeline is reasonably fast too!
[immense sed script deleted]
Although I don't dispute the efficacy of the supplied script ( I haven't
checked it out, though ), I think that this m-iii-ght be taking a
preference for sed a m-iii-te too far. My 3 line sed + 13 line awk
script has been replaced by a 101 line script with 66 lines of sed -
hmmm. Although awk is undoubtedly slower than sed, I use it in
preference for solving editing problems which can be defined on a field
basis, as I find it much easier to conceptualise solutions; I do not
find the sed syntax or operation conducive to an intuitive
problem/solution association ( obviously some peculiarity in how my
brain, errrm, works ).
I aimed for conciseness and a simple, balanced structure in the code
(rather than maximum efficiency, or universal application), as this is
easier for people (including me) to understand, and therefore
alter/improve, if they wish; especially for novice users, who would
probably feel safe in tinkering with zapcom.sh, but would probably have
to be restrained and sedated after seeing Cstrip :-)
Also, zapcom.sh is not universally applicable, in that it requires
comment delimiters to be themselves delimited by white space/EOL (so awk
can treat them as individual fields); and it won't correctly handle
comment delimiters embedded in quotes. There obviously comes a point
where the effort required to handle a special case outweighs the benefit
achieved; I considered these cases to come into that category.
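To make the white-space restriction concrete, here is a minimal sketch. It
is not zapcom.sh itself, only awk's default field splitting applied to two
made-up lines of C; it counts the fields on each line that are exactly "/*".

```shell
# Count the fields that are exactly "/*" on each line. awk's default
# splitting is on white space, so "2;/*" is one field, not two.
out=$(printf '%s\n' 'x = 1; /* set x */' 'y = 2;/* set y */' |
    awk '{ n = 0; for (i = 1; i <= NF; i++) if ($i == "/*") n++; print n }')
printf '%s\n' "$out"
```

The first line reports one delimiter field and the second reports none,
because in "2;/*" the comment opener is glued to the preceding token. Any
script that looks for the delimiters as whole fields will miss that comment.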
We have now had a reasonable number of constructive postings on this
subject to give all interested parties a good set of approaches from
which to choose. Thank you and goodnight ...
--
<Path: mcvax!ukc!icdoc!bilpin!jim> <UUCP: jim@bilpin.uucp>
Programmers' maxim : If it's not aesthetically pleasing, it's probably wrong.