[comp.lang.perl] /lib/cpp equivalent in perl anyone?

Harald.Eikrem@elab-runit.sintef.no (03/04/91)

is the question.  That is, has anyone written a perl program that at
least does #include, #define, #undef, #if, #ifdef, #elif, #else ...
and of course #endif down to arbitrary levels of inclusion?

Whether or not #line directives are being inserted does not matter.

Thanks.   --Harald E

schaefer@ogicse.ogi.edu (Barton E. Schaefer) (03/12/91)

In article <1991Mar3.214406*Harald.Eikrem@elab-runit.sintef.no> Harald.Eikrem@elab-runit.sintef.no writes:
} is the question.  That is, has anyone written a perl program that at
} least does #include, #define, #undef, #if, #ifdef, #elif, #else ...
} and of course #endif down to arbitrary levels of inclusion?
} 
} Whether or not #line directives are being inserted does not matter.
} 
} Thanks.   --Harald E


Attached is a perl program I wrote quite some time ago to check that
#ifs match properly with #endifs and so forth.  It understands the
defined() CPP "function" and the && and || connectives.  However, it's
only a sort of syntax checker as it stands -- it needs work to actually
perform the text substitutions.  Someone with more time than I have to
play with it might make it a starting point ...

#! /usr/bin/perl
#
# Usage: ckcpp [-dvx] [-Ddefinition ...] [files ...]
#	-d	Print debugging output
#	-v	Toggle Verbose output (on by default)
#	-x	Echo every line scanned
#	-Ddef	As the C preprocessor
#
# Reads STDIN if no files named.

$HANDLE = 'FILE00000';

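# Open an #include'd file on a fresh filehandle and scan it recursively.
# <name> is looked up under /usr/include; "name" is used as given.
sub incl {
    local($file) = @_;
    local($FILE) = $HANDLE++;	# string increment: FILE00000, FILE00001, ...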
    if ($file =~ m:\<(.*)\>:) { $file = "/usr/include/$1"; }
    elsif ($file =~ m:"([^"]*)":) { $file = $1; }
    print "Reading $file\n" if $verbose;
    unless (open($FILE, "<$file")) {
	warn "Can't read $file\n";
	return;
    }
    do scan($FILE);
    close($FILE);
    print "Finished $file\n" if $verbose;
}

# Skip to the #endif matching the current #if, passing over nested #if blocks.
sub endif {
    local($FILE) = @_;
    local($_);
    while (<$FILE>) {
	print if $echo;
	if ( /^#[ \t]*if/ ) {
	    do endif($FILE);
	}
	elsif ( /^#[ \t]*endif/ ) {
	    return;
	}
    }
    warn "#if with no #endif !!!\n";
}

# Skip a false branch: stop at the matching #else or #endif.
sub hashelse {
    local($FILE) = @_;
    local($_);
    while (<$FILE>) {
	print if $echo;
	if ( /^#[ \t]*if/ ) {
	    do endif($FILE);
	}
	elsif ( /^#[ \t]*else/ || /^#[ \t]*endif/ ) {
	    return;
	}
    }
    warn "#if with no #endif !!!\n";
}

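# Evaluate the text after "#if" (so "def X" and "ndef X" cover #ifdef and
# #ifndef); on a false condition, skip to the matching #else or #endif.
sub hashif {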
    local($_) = $_[0];
    local($not,$token,$symbol,$conj);
    if ( /^(n{0,1})def[ \t]*([_A-Za-z][_A-Za-z0-9]*)/ ) {
	if ($1 ne "n" && ! $defined{$2} || $1 eq "n" && $defined{$2}) {
	    print "Skipping to #else or #endif /* $2 */\n" if $debug;
	    do hashelse($FILE);
	}
    }
    elsif ( /^[ \t]*(!{0,1})[ \t]*defined[ \t]*\([ \t]*([_A-Za-z][_A-Za-z0-9]*)[ \t]*\)/ ) {
	$not = $1;
	$token = $symbol = $2;
	$_ = $';
	if ( /(\&\&){1}/ || /(\|\|){1}/ ) {
	    $conj = $1;
	    $_ = $';
	}
	if ($not eq "!") {
	    if ($defined{$symbol}) {
		undef $symbol;
	    }
	} else {
	    if (! $defined{$symbol}) {
		undef $symbol;
	    }
	}
	if ($symbol && $conj eq "&&" || ! $symbol && $conj eq "||") {
	    print "Testing $conj$_\n" if $debug;
	    do hashif($_);
	} elsif (! $symbol) {
	    print "Skipping to #endif /* $token */\n" if $debug;
	    do hashelse($FILE);
	}
    }
    elsif ( /([0-9][0-9]*)/ ) {
	if ($1 == 0) {
	    print "Skipping to #endif\n" if $debug;
	    do hashelse($FILE);
	}
    }
    else {
	if ($debug || $echo || $verbose) {
	    print "\tUnrecognized #if expression: $_\n";
	}
	else {
	    warn "Unrecognized #if expression $_\n";
	}
    }
}

# Read one open file, tracking #define/#undef and following #if and #include.
sub scan {
    local($FILE) = @_;
    local($_,$if);
    while (<$FILE>) {
	print if $echo;
	if ( /^#[ \t]*if(.*)/ ) {
	    print "Testing $_" if $debug;
	    do hashif($1);
	}
	elsif ( /^#[ \t]*else/ ) {
	    do endif($FILE);
	}
	elsif ( /^#[ \t]*undef[ \t]+([_A-Za-z][_A-Za-z0-9]*)/ ) {
	    if ($defined{$1}) {
		print "\tUndefining $1\n" if $debug;
		$defined{$1} = 0;
		if ($max < $define) { $max = $define; }
		$define--;
	    }
	}
	elsif ( /^#[ \t]*define[ \t]+([_A-Za-z][_A-Za-z0-9]*)/ ) {
	    if (! $defined{$1}) {
		$define++;
		print "\tDefining $1\n" if $debug;
		$defined{$1} = $file;
	    }
	    else {
		if ($debug || $echo || $verbose) {
		    print "\t$1 redefined (seen in $defined{$1})\n";
		}
		else {
		    warn "$1 redefined (seen in $defined{$1})\n";
		}
	    }
	} elsif ( /^#[ \t]*include[ \t]+([^ \t]*)/ ) {
	    do incl($1);
	}
    }
}

$verbose = 1;

while ($_ = shift(@ARGV)) {
    if ( /^-D([^=]+)/ ) { $predef++; $predefined{$1} = 1; }
    elsif ( /^-d/ ) { $debug = 1; }
    elsif ( /^-x/ ) { $echo = 1; }
    elsif ( /^-v/ ) { $verbose = ! $verbose; }
    else { push(@Files,$_); }
}
if ($#Files < $[) { unshift(@Files,"&STDIN"); }
print "Predefined $predef symbols:\n\t", join("\n\t",keys(%predefined)), "\n";

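# Take the first handle name for the top-level file so that incl()
# starts with the next one and never reopens this handle.
$FILE = $HANDLE++;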

while ($_ = shift(@Files)) {
    unless (open($FILE,"<$_")) { warn "Can't read $_\n"; next; }
    $file = $_;	# scan records $file as the place a symbol was defined
    %defined = %predefined;
    $max = $define = $predef;
    do scan($FILE);
    if ($define > $max) { $max = $define; }
    print "Total of $define symbols defined, max was $max.\n";
}
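
The checker above only tracks which symbols are defined; it does not
rewrite any text.  As a rough sketch of the substitution step it leaves
out, and not part of the original program, the fragment below expands
object-like macros only.  The name expand_line, the %macros table and the
limit of 20 rescans are all made up for illustration.  Function-like
macros, string literals, comments and #if handling are not attempted, and
it is written for perl 5 rather than the perl the script above targets.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: expand object-like macros in one line of text.
# $macros is a reference to a hash mapping macro names to replacement text.
sub expand_line {
    my ($line, $macros) = @_;
    # Rescan so that a replacement naming another macro is expanded too.
    # The pass limit (an arbitrary choice) stops mutually recursive
    # definitions from looping forever.
    for my $pass (1 .. 20) {
        my $before = $line;
        $line =~ s/\b([_A-Za-z][_A-Za-z0-9]*)\b/exists $macros->{$1} ? $macros->{$1} : $1/ge;
        last if $line eq $before;
    }
    return $line;
}

my %macros;
my @out;
my @input = (
    "#define SIZE 10\n",
    "#define BUF SIZE\n",
    "char b[BUF];\n",
    "#undef SIZE\n",
    "int n = SIZE;\n",
);

for my $line (@input) {
    if ($line =~ /^#[ \t]*define[ \t]+([_A-Za-z][_A-Za-z0-9]*)[ \t]*(.*)$/) {
        $macros{$1} = $2;          # remember the replacement text
    }
    elsif ($line =~ /^#[ \t]*undef[ \t]+([_A-Za-z][_A-Za-z0-9]*)/) {
        delete $macros{$1};        # forget the macro
    }
    else {
        push @out, expand_line($line, \%macros);
    }
}
print @out;
```

Here BUF is expanded through SIZE while both are defined, and the last
line is left alone because SIZE has been undefined by then.  Hooking this
into scan would mean storing the replacement text at #define time and
calling the helper on every line that is not a directive.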
-- 
Bart Schaefer                                           schaefer@zipcode.com
Z-Code Software Corporation                             schaefer@cse.ogi.edu