[comp.lang.perl] Bug -- variable corruption inside sub/foreach

brian@hpausla.aso.hp.com (Brian Coogan) (02/06/90)

I seem to have managed to find another bug in perl.  A variable $file is
corrupted by reading from a filehandle (variable name doesn't seem
to matter, nor does it matter whether $file is local or not, or the
first/most recent variable use beforehand.)  I've worked around this
bug, but it lost me a lot of time.  The variable gets corrupted to
the value of $_.

Unfortunately, I haven't been able to get the bug to appear in reduced
versions of the code, so I'm including the original code below.  To
reproduce the bug, try
	rcslocks -vvv
in a directory containing some RCS files.  If you get any messages
about corruption, you reproduced the bug successfully (or try perl -d with
a breakpoint on line 210, $file should be eq $savefile, a reasonable file
name).

Corrections appreciated.  Perl version 3.0 pl 8 with JMPCLOBBER.

regards,

Brian Coogan,
Hewlett-Packard ASO.

#---------------------------------- cut here ----------------------------------
# This is a shell archive.  Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by Brian Coogan <brian@hpausln> on Tue Feb  6 14:29:05 1990
#
# This archive contains:
#	rcslocks	rcslocks.1	testit		oldcheck	
#
# Error checking via wc(1) will be performed.

unset LANG

echo x - rcslocks
cat >rcslocks <<'@EOF'
#! /usr/bin/perl

# $Header: rcslocks,v 1.5 90/02/06 14:25:05 brian Exp $
#
#	Lists names of locked RCS files on stdout.
#	You may give it as arguments RCS file names or directories.
#	If a directory argument is given, the locks in all directories
#	under that directory are recursively reported.
#	The name of either an RCS or working file may be given. 
#	By default, only the locks held by the current user are listed.
#
#	Usage: rcslocks [-alv | -u user[,user...]] [directory|file]...
#
#	Options:
#		none		list only locks that current user
#				holds (by uid) in or under .
#		-u user...	list locks held by user(s)
#		-a		list files with any locks
#		-l		long listing - list who holds the locks
#		-v		verbose (trace find starts) (debug)
#		-vv		trace file names as processed (debug)
#
#	Supports RCS style symbolic links, though not seamlessly -
#	the RCS file name is listed rather than the working file name.
#
#					Brian Coogan and others,
#					Hewlett-Packard ASO, Jan 90.
#	Examples:
#		ci -u `rcslocks`
#		rcslocks -al
#
# $check_users flag is necessitated by a perl bug where defined(%userlist)
# is always true, even when it hasn't been referenced yet.
#

#
#	findexp: find expression to return names of RCS dirs and *,v
#	findfoll: true if find follows sym links, false if we should
#	          follow the sym link (to one level only) ourselves.
#
$findexp = '\\( -name RCS -o -name "*,v" \\)';
$findfoll = 0;

($me = $0) =~ s%.*/%%;
$USAGE = "Usage: $me [-alv | -u user[,user...]] [directory|file]...";

#
#	Interpret options
#		-u user
#		-a
#		-v
#		-l
#

while ($_ = $ARGV[0], /^-(.)(.*)/ && shift(@ARGV)) {
	($f,$r) = ($1,$2);
	last if $f eq '-';
	if ($f eq 'v')			# VERBOSE
	{
		$verbose++;
		$r =~ /^(.)(.*)/,redo if $r ne '';
	}
	elsif ($f eq 'a')		# ALL
	{
		$all++;
		$r =~ /^(.)(.*)/,redo if $r ne '';
	}
	elsif ($f eq 'l')		# LONG listing
	{
		$long++;
		$r =~ /^(.)(.*)/,redo if $r ne '';
	}
	elsif ($f eq 'u')		# USERS
	{
		$users = $r eq '' ? shift(@ARGV) : $r;
		for $n (split(/,/, $users))
		{
			$users{$n} = 1;
		}
		$check_users++;
	}
	else{
		# usage error
		print "$USAGE\n";
		exit(1);
	}
}

print stderr "$me: Warning: -a given, -u ignored\n" if ($all && $check_users);

if (! $all && ! $check_users)
{
	@pwline = getpwuid($<);
	$myname = $pwline[0];
	$users{$myname} = 1;
	$check_users++;
}

push(ARGV, ".") if $#ARGV < $[;		# default to current directory


#
#	Process each argument
#
for $arg (@ARGV)
{
	#
	#	If it is a directory, recurse with a find.
	#
	if (-d $arg)
	{
		print "Running find on directory $arg\n" if $verbose;
		# HP-UX find doesn't return anything across symlinks.
		open(FIND, "find $arg $findexp -print|")
			|| die "$me: can't find $arg: $!\n";
		while (<FIND>)
		{
			chop;
			s%^\./%%;  # strip leading ./
			if (-d $_)
			{
				#
				#	If it's a symlink to a directory,
				#	and find doesn't follow symlinks,
				#	follow it ourselves, one level deep.
				#
				do checkfiles(<$_/*,v>) if ! $findfoll && -l _;
			}
			elsif (m=(^|/)RCS$=)	# RCS pseudo-symlink
			{
				unless (open(RCS, "$_"))
				{
					print stderr "$me: cannot open $_\n";
					next;
				}
				chop($path = <RCS>);
				close(RCS);
				next if ! -d $path;
				do checkfiles(<$path/*,v>);
			}
			else
			{
				do checkfiles($_);
			}
		}
		close(FIND);
		next;
	}

	#
	#	If not an RCS file, look for the corresponding
	#	RCS file.
	#
	if ($arg !~ /,v$/)
	{
		#
		#	Add ,v and look for that
		#
		$try = $arg . ",v";
		-f $try && do checkfiles($try) && next;

		#
		#	Add RCS/ and look for that
		#
		$try =~ s%/([^/]+)%/RCS/$1% || $try =~ s%^%RCS/%;
		-f $try && do checkfiles($try) && next;
		if (-f $arg)
		{
			print stderr "$me: $arg -- no corresponding RCS file\n";
			next;
		}
	}
	if (! -f $arg)
	{
		print stderr "$me: $arg -- No such file\n";
	}
	do checkfiles($arg);
}


sub checkfiles
{
	local($file);

	foreach $file (@_)
	{
		chop($file) if $file =~ /\n$/;
		print "$file\n" if $verbose >= 2;
		next if ! $file || $file !~ /,v$/; # de-bug
		next if $seen{$file}++;
		$savefile = $file; # perl bug

		unless (open(file, "<$file"))
		{
			print stderr "$me: cannot read $file: $!\n";
			next;
		}

		#
		#	Look for the locks line, which appears in the header
		#
		#	PERL BUG: $file gets mangled to be $_ in
		#	the following loop
		#
		while (<file>)
		{
			last if /^locks\s+/;
		}

		print "\$file corrupted from $savefile to $file\n"
		      if ($file ne $savefile && $verbose >= 3);

		#
		#	Quit unless there are locks
		#
		if (eof(file) || ! /^locks\s+([^;]*);/)
		{
			print stderr "$file: RCS file may be corrupted\n";
			next;
		}
		next unless $1;
		@locks = split(' ', $1);

		#
		#	Delete all the locks we arent interested in
		#
		if ($check_users && defined(%users)) # perl bug
		{
			@locks = grep(/^([^:]+):/ && defined($users{$1}),
					@locks);
		}
		next unless $#locks >= $[; # no applicable locks

		#
		#	Print out the working file name
		#	and the locks (if requested)
		#	If the file doesn't appear to be from a local
		#	RCS directory, print the RCS file name.
		#
		$file = $savefile;	# perl bug
		($wfile = $file) =~ s%(^|/)RCS/%$1% &&
			$file =~ s%,v$%%;
		if ($long)
		{
			print "$wfile: locked by @locks\n";
		}
		else
		{
			print "$wfile\n";
		}
	}
	close(file);
	return 1;
}
@EOF
set `wc -lwc <rcslocks`
if test $1$2$3 != 2549035262
then
	echo ERROR: wc results of rcslocks are $* should be 254 903 5262
fi

chmod 555 rcslocks

echo x - rcslocks.1
cat >rcslocks.1 <<'@EOF'
.\" $Header: rcslocks.1,v 1.2 90/02/06 13:58:22 brian Exp $
.if t .ds ' \h@.05m@\s+4\v@.333m@\'\v@-.333m@\s-4\h@.05m@
.if n .ds ' '
.if t .ds ` \h@.05m@\s+4\v@.333m@\`\v@-.333m@\s-4\h@.05m@
.if n .ds ` `
.TH RCSLOCKS 1 "" "" ASO
.SH NAME
rcslocks \- list details of RCS locks
.SH SYNOPSIS
.B rcslocks
[
.B \-alv
|
.B -u
.IR user [ ,user... ]
]
[
.I file
|
.I directory
]
.I ...
.br
.SH DESCRIPTION
.I Rcslocks\^
lists files with RCS locks.
By default,
.I rcslocks\^
lists just the file names of the file locks held by the current user
on standard output.
.PP
If a directory argument is given, RCS directories and
files
are searched for recursively and any locks found are reported.
If no file or directory argument is given,
.I rcslocks
looks in the current directory for looked files.
.SS Options
.TP 8
.BI -u " user"
The
.B -u
option limits the locks reported to those held by
.IR user .
.I user
may be a single user name or a comma separated list of users.
If neither
.B -b
or
.B -a
are given,
.I rcslocks
only reports on locks held by the current user.
.TP 8
.B -a
prints all locks found.
.TP 8
.B -l
lists all locks in long format.
The locked files are listed, along with the locked versions and who
holds the locks.
.TP 5
.B -v
Provides trace output for debugging.
One
.B -v
traces
.I find (1)
commands as they are executed;
.B -vv
prints file names as they are checked.
.SH EXAMPLES
The following command will print all locks under the directory /aso/source:
.PP
.RS
rcslocks -al /aso/source
.RE
.PP
The following command checks in all the files you have locked in the
current directory:
.PP
.RS
ci -u `rcslocks`
.RE
.SH RETURNS
Returns 1 for fatal errors.
Returns 0 for all other situations.
Non-fatal errors are indicated by a message and do not affect
exit status.
.SH NOTES
.I Rcsmerge\^
supports RCS style pseudo-symbolic links.
.SH SEE ALSO
perl(1), rcs(1), rlog(1).
@EOF
set `wc -lwc <rcslocks.1`
if test $1$2$3 != 943421880
then
	echo ERROR: wc results of rcslocks.1 are $* should be 94 342 1880
fi

chmod 444 rcslocks.1

echo x - testit
cat >testit <<'@EOF'
: use /bin/sh

if [ ! -d try -o ! -d try/RCS ]
then
	mkdir try try/RCS
	cd try
	cat > file <<!
	To be or not to be, that is the question.
	Whether 'tis noble to suffer the slings and arrows of
	outrageous fortune.
	or to take arms against a sea of troubles, and by opposing,
	conquer 'em
!
	cp file mylock
	cp file hislock
	cp file bothlock
	cp file nowkgfile
	# better than stuffing with whoami/id
	me=`perl -e '@pw = getpwuid($<); print $pw[0];'`
	if test -z "$me"
	then
		echo Could not work out name for current user id
		exit 1
	fi

	ci -l nowkgfile < /dev/null
	/bin/rm -f nowkgfile
	ci -l mylock < /dev/null
	ci -l hislock < /dev/null
	ci -l bothlock < /dev/null
	echo A small change >> bothlock
	ci -m'A small change' -l bothlock
	for file in hislock bothlock
	do
		sed -e "s/$me/root/" < RCS/$file,v > RCS/$file,vt
		/bin/rm -f RCS/$file,v
		mv RCS/$file,vt RCS/$file,v
		chmod -w RCS/$file,v
	done
	rcs -l1.1 bothlock
	mv file norcsfile
else
	cd try
fi
set +x

(
	echo '+ ../rcslocks -al `pwd`'
	../rcslocks -al `pwd` | sed -e "s!^`pwd`!!"
	set -x
	../rcslocks -a
	../rcslocks -al
	../rcslocks -u root -l
	../rcslocks bothlock
	../rcslocks hislock
	../rcslocks -l bothlock
	../rcslocks -al bothlock mylock hislock
	../rcslocks -al RCS
	../rcslocks nonexist
	../rcslocks norcsfile		# exists but no RCS file
	../rcslocks nowkgfile		# RCS exists but no working file
) > ../newcheck 2>&1

cd ..
if diff newcheck oldcheck
then
	echo Tests succeeded.
	/bin/rm -f newcheck
else
	echo "TEST failed!  Check differences output"
	exit 1
fi
@EOF
set `wc -lwc <testit`
if test $1$2$3 != 722571539
then
	echo ERROR: wc results of testit are $* should be 72 257 1539
fi

chmod 664 testit

echo x - oldcheck
cat >oldcheck <<'@EOF'
+ ../rcslocks -al `pwd`
/hislock,v: locked by root:1.1
/nowkgfile,v: locked by brian:1.1
/mylock,v: locked by brian:1.1
/bothlock,v: locked by brian:1.1 root:1.2
+ ../rcslocks -a 
hislock,v
nowkgfile,v
mylock,v
bothlock,v
+ ../rcslocks -al 
hislock,v: locked by root:1.1
nowkgfile,v: locked by brian:1.1
mylock,v: locked by brian:1.1
bothlock,v: locked by brian:1.1 root:1.2
+ ../rcslocks -u root -l 
hislock,v: locked by root:1.1
bothlock,v: locked by root:1.2
+ ../rcslocks bothlock 
bothlock,v
+ ../rcslocks hislock 
+ ../rcslocks -l bothlock 
bothlock,v: locked by brian:1.1
+ ../rcslocks -al bothlock mylock hislock 
bothlock,v: locked by brian:1.1 root:1.2
mylock,v: locked by brian:1.1
hislock,v: locked by root:1.1
+ ../rcslocks -al RCS 
hislock,v: locked by root:1.1
nowkgfile,v: locked by brian:1.1
mylock,v: locked by brian:1.1
bothlock,v: locked by brian:1.1 root:1.2
+ ../rcslocks nonexist 
rcslocks: nonexist -- No such file
+ ../rcslocks norcsfile 
rcslocks: norcsfile -- no corresponding RCS file
+ ../rcslocks nowkgfile 
nowkgfile,v
@EOF
set `wc -lwc <oldcheck`
if test $1$2$3 != 381391050
then
	echo ERROR: wc results of oldcheck are $* should be 38 139 1050
fi

chmod 664 oldcheck

exit 0

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (02/08/90)

In article <4080010@hpausla.aso.hp.com> brian@hpausla.aso.hp.com (Brian Coogan) writes:
: I seem to have managed to find another bug in perl.  A variable $file is
: corrupted by reading from a filehandle (variable name doesn't seem
: to matter, nor does it matter whether $file is local or not, or the
: first/most recent variable use beforehand.)  I've worked around this
: bug, but it lost me a lot of time.  The variable gets corrupted to
: the value of $_.

Oddly enough, it's not really a perl bug in this case.  What you've got is

	do checkfiles($_);

	sub checkfiles {
	    foreach $file (@_) {
		while (<file>) {
		    ...
		}
	    }
	}

What you have to remember is that @_ is an array of references to the
actual parameters (not copies, as in 2.0), and that foreach iterates over
an array by making the variable ($file, in this case) to be a reference
to the actual array elements.  Hence, when checkfiles is called with $_,
it ends up aliased to $file.  So reading into $_ then clobbers $file too.

That's the fun of passing parameters by reference.  I still think it's
worth it for the efficiency gain.  If it worries you, just be consistent
about copying your parameters out of @_ into something local.  Or tell
yourself to worry about aliasing whenever you don't.  Or something like that.

Larry