lgy@phys.washington.edu (Laurence G. Yaffe) (03/13/91)
I frequently use a recursive directory comparison program to compare the contents of entire directory trees. I recently decided to rewrite my previous shell based 'dcmp' program in Perl (and of course, add various bells and whistles). Appended below is the result (my first serious Perl programming). If others find it useful, great. Comments on Perl programming style welcome. By default, this program reports on files present in one directory tree, but absent in the other, and files present in both trees but with differing contents, ownerships or permissions. 'Diff' may run on differing text files to display differences. Its output is designed to be terse but informative. (I find the sysV "dircmp" unpleasantly verbose and insufficiently flexible.) It's wrapped with its own man page (but not shar'ed). Enjoy -------------------------------------------------------------------------- Laurence G. Yaffe Internet: lgy@newton.phys.washington.edu University of Washington Bitnet: yaffe@uwaphast.bitnet -------------------------------------------------------------------------- #!/usr/local/bin/perl 'di'; 'ig00'; # # Usage: dcmp [options] dir1 dir2 [pattern] # # Compare all (or selected files in) two directories trees # # Options: -r Don't recursively descend into subdirectories. # -a Don't report absent files. # -o Don't report differing ownerships. # -p Don't report differing permissions. # -t Don't report differing file types. # -f Don't follow symbolic links. (Default is to follow.) # -n Don't run 'diff' on differing text files, w/o querying. # -y Do run 'diff' on differing text files, w/o querying. # -q Query about running 'diff', even if stdin/out not ttys. # -v Run 'verbosely'. # # Laurence G. Yaffe # lgy@phys.washington.edu # 3/12/91 # # Check args && fetch directory names. require "stat.pl" ; require "getopts.pl" ; $options = 'afnopqrtvy' ; (&Getopts ($options) && (@ARGV >= 2)) || die "Usage: $0 [-$options] dir1 dir2 [pattern]\n"; $statcmd = ($opt_f) ? 'lstat' : 'stat' ; $rpt_absent = !$opt_a ; $rpt_owner = !$opt_o ; $rpt_perm = !$opt_p ; $rpt_type = !$opt_t ; $recurse = !$opt_r ; $diff_all = $opt_y ; $verbose = $opt_v ; $query = $opt_q || (!$opt_y && !$opt_n && (-t STDIN && -t STDOUT)) ; $pat = splice (@ARGV,2,1) || '^' ; &dcmp (@ARGV) ; sub gripe { local ($cond,$file,$left,$right) = @_ ; local ($lt,$rt) ; if (length ($file) < 20) { $lt = -16 ; $rt = -19 ; } else { $lt = -7 ; $rt = 28 ; } printf (" %-7s %-${lt}s %${rt}s %s\n", $cond, $left, $file, $right) ; } sub dcmp { local ($dir1, $dir2) = @_; local (@recurse,$_) ; undef (%files) ; return unless opendir(DIR1,$dir1) || !warn " Can't open directory $dir1\n"; return unless opendir(DIR2,$dir2) || !warn " Can't open directory $dir2\n"; $rem = (length ($dir1) > 37) ? 0 : 30 ; printf ("Comparing %-37s %-$rem.${rem}s\n", $dir1, $dir2) ; while ($_ = readdir (DIR1)) { $files {$_}++ ; } while ($_ = readdir (DIR2)) { $files {$_}+=2 ; } closedir (DIR1) ; closedir (DIR2) ; if ($rpt_absent) { foreach (sort grep ($files {$_} != 3 && /$pat/o, keys (%files))) { if ($files{$_} == 1) { &gripe ('absent:', $_, '::::::', '') ;} else { &gripe ('absent:', $_, '', '::::::') ;} } } $dir1 .= '/' ; $dir2 .= '/' ; compare: foreach $file (sort grep ($files {$_} == 3, keys (%files))) { next if $file eq "." ; next if $file eq ".." ; $SIG {'INT'} = 'interrupt' ; $flag = &comp ($dir1, $dir2, $file) ; push (@recurse, $file) if ($flag > 0) ; &interrupt if ($flag < 0) ; sub interrupt { if (-t STDIN && -t STDOUT) { print "\nInterrupt: quit, continue, ascend? " ; $reply = <STDIN> ; next if $reply =~ /^c/io ; @recurse = () ; last compare if $reply =~ /^a/io ; } exit 1 ; } } if ($recurse) { foreach $file (@recurse) { &dcmp ($dir1.$file, $dir2.$file, '') ; } } } sub comp { local ($dir1, $dir2, $file) = @_ ; local ($file1) = $dir1 . $file ; local ($file2) = $dir2 . $file ; ($type1, $mod1, $uid1, $gid1, $size1) = &filetype ($file1) ; ($type2, $mod2, $uid2, $gid2, $size2) = &filetype ($file2) ; if ($file =~ /$pat/o) { $ne = 0 ; $ne = &gripe ('owners:', $file, $uid1.'/'.$gid1, $uid2.'/'.$gid2) if ($rpt_owner && ($uid1 != $uid2 || $gid1 != $gid2)) ; $ne = &gripe ('types:',$file,sprintf("%lo",$mod1),sprintf("%lo",$mod2)) if ($rpt_type && ($mod1 | 07777) != ($mod2 | 07777)) ; $ne = &gripe ('perms:',$file,sprintf("%lo",$mod1),sprintf("%lo",$mod2)) if ($rpt_perm && ($mod1 & 07777) != ($mod2 & 07777) && ($mod1 | 07777) == ($mod2 | 07777)) ; if ($type1 eq 'f' && $type2 eq 'f') { $differ = ($size1 != $size2) || &cmp ($file1, $file2) ; if ($differ) { if ($differ > 0 && ($do_diff = $diff_all) || $query) { if ($query) { printf (" %-25s %s ? ", 'diff [ny]', $file) ; $reply = <STDIN> ; exit 0 if $reply =~ /^q/io ; $do_diff = $reply =~ /^y/io ; return 0 if !$do_diff ; } } else { $do_diff = 0 ; } if ($do_diff) { return -1 if (system ("diff $file1 $file2") & 255) ; } else { &gripe ('differ:', $file, '', '') ; } } elsif ($verbose && !$ne) { &gripe ('same:', $file, '', '') ; } return 0 ; } } ($type1 eq 'd' && $type2 eq 'd') ; } sub cmp { local ($FILE1, $FILE2) = @_ ; if (!open FILE1) { warn " Can't open $FILE1\n" ; } elsif (!open FILE2) { warn " Can't open $FILE2\n" ; } else { while () { $len1 = sysread (FILE1, $buf1, 16384) ; $len2 = sysread (FILE2, $buf2, 16384) ; if ((!defined $len1) || (!defined $len2)) { next if $! =~ /^Interrupted/ ; warn " System read error: $!\n" ; last ; } last if ($len1 == 0 && $len2 == 0) ; next if ($len1 == $len2 && $buf1 eq $buf2) ; $wierd = ($buf1 =~ tr/\000\200-\255/\000\200-\255/) ; return $wierd ? -1 : 1 ; } } 0 ; } sub filetype { local ($file) = @_ ; local (@st) = eval "$statcmd (\$file)" ; local ($type) = -d _ ? 'd' : -f _ ? 'f' : -p _ ? 'p' : -S _ ? 'S' : 's' ; return ($type, $st[$ST_MODE], $st[$ST_UID], $st[$ST_GID], $st[$ST_SIZE]) ; } ######################## .00; 'di .nr nl 0-1 .nr % 0 '; __END__ .TH DCMP 1 "March 7, 1991" .AT 3 .SH NAME dcmp \- recursive directory comparison .SH SYNOPSIS .B dcmp [options] dir1 dir2 [pattern] .SH DESCRIPTION .I Dcmp compares all (or selected files within) two directories trees. By default, .I dcmp reports files which are present in one directory tree but not the other, and files present in both trees with differing contents, permissions, or ownerships. When run from a terminal, the user is queried about each differing text file, and may choose to view the differences (produced by 'diff'). The optional .I pattern may be any regular expression (in the style of 'ed' or 'perl'). If given, only files with names matching the pattern are compared. The default bahavior may be modified by the following options: -r Don't recursively descend into subdirectories. -a Don't report absent files. -o Don't report differing ownerships. -p Don't report differing permissions. -t Don't report differing file types. -f Don't follow symbolic links. (Default is to follow symbolic links.) -n Don't run 'diff' on differing text files. -y Do run 'diff' on differing text files. -q Query about running 'diff', even if stdin or stdout are not connected to ttys. -v Run 'verbosely'. Report identical files. .SH ENVIRONMENT No Environment variables used. .SH FILES None. .SH AUTHOR Laurence G. Yaffe (lgy@phys.washington.edu) .SH "SEE ALSO" dircmp(1) .SH DIAGNOSTICS Complains if files or directories are unreadable. -- -------------------------------------------------------------------------- Laurence G. Yaffe Internet: lgy@newton.phys.washington.edu University of Washington Bitnet: yaffe@uwaphast.bitnet
rbj@uunet.UU.NET (Root Boy Jim) (03/13/91)
<lgy.668812268@newton> lgy@phys.washington.edu (Laurence G. Yaffe) writes: > > I frequently use a recursive directory comparison program to compare >the contents of entire directory trees. I recently decided to rewrite >my previous shell based 'dcmp' program in Perl (and of course, add various >bells and whistles). Appended below is the result (my first serious Perl >programming). If others find it useful, great. Comments on Perl programming >style welcome. Funny how things tend to get invented at about the same time. I am planning to post my version soon. It does have one other feature, in that it removes identical versions from one half of the tree. > By default, this program reports on files present in one directory tree, >but absent in the other, and files present in both trees but with differing >contents, ownerships or permissions. 'Diff' may run on differing text files >to display differences. Its output is designed to be terse but informative. >(I find the sysV "dircmp" unpleasantly verbose and insufficiently flexible.) >It's wrapped with its own man page (but not shar'ed). So get GNU diff instead. It does recursive comparison nicer. I will look at your version more thoroughly soon. However, one thing SCREAMS for attention. DON'T FOLLOW SYMBOLIC LINKS BY DEFAULT! This is a dangerous thing to do in any program that deals with the filesystem as a recursive structure. -- [rbj@uunet 1] stty sane unknown mode: sane
lgy@phys.washington.edu (Laurence G. Yaffe) (03/13/91)
rbj@uunet.UU.NET (Root Boy Jim) writes: >I will look at your version more thoroughly soon. However, one >thing SCREAMS for attention. DON'T FOLLOW SYMBOLIC LINKS BY DEFAULT! >This is a dangerous thing to do in any program that deals >with the filesystem as a recursive structure. Some people like to live dangerously! :-) Still, your point is well taken. -- -------------------------------------------------------------------------- Laurence G. Yaffe Internet: lgy@newton.phys.washington.edu University of Washington Bitnet: yaffe@uwaphast.bitnet