[comp.unix.wizards] Interactive 2.2 File zapper

smitty@essnj1.ESSNJAY.COM (Hibbard T. Smith JR) (07/25/90)

Within the past 2 weeks, we've upgraded several systems from 2.0.2 to 2.2.
On one of those systems, on Sunday morning at 05:17 or thereabouts, most of
the files on the system were deleted.  The problem was caused by a root
crontab-driven execution of /etc/cleanup.  This system's /lost+found
directory was inadvertently lost during the upgrade installation, and we
were planning on recreating it on Monday morning.


The last two lines of the distributed /etc/cleanup are as follows:
--	cd /lost+found
--	find . -mtime +14 -exec rm -rf {} \;
If there's no lost and found directory in the root file system, this deletes
everything in the system that's older than 14 days. Two possible fixes exist:
-- cd /lost+found && find . -mtime +14 -exec rm -rf {} \;
-- find /lost+found -mtime +14 -exec rm -rf {} \;
Either of these is much safer than the distributed code.  This bad code
differs from what 2.0.2 shipped, so beware!
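
To see the failure mode in isolation, here is a minimal sketch (the
directory name is just a stand-in), assuming an sh that does not abort
the script when cd fails:

	cd /no/such/dir			# fails...
	pwd				# ...but we are still in the old directory
	find . -mtime +14 -print	# ...so this walks the old tree instead

Run as root with / as the current directory, the rm -rf version of that
last line is exactly what emptied our system.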

I hope this saves someone the grief of starting over, or worse yet, losing
a whole system when you're not prepared to rebuild it.

-- 
		Smitty
-------------------------------------------
Hibbard T. Smith JR                 smitty@essnj1.ESSNJAY.COM	
ESSNJAY Systems Inc.                uunet!hsi!essnj1!smitty

dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) (08/02/90)

>--	cd /lost+found
>--	find . -mtime +14 -exec rm -rf {} \;
>If there's no lost and found directory in the root file system, this deletes
>everything in the system that's older than 14 days.

The last time I looked, it was an undocumented feature in sh and csh
(and probably in ksh though I didn't check) that a cd that failed would
abort the rest of the script.  In fact, sh and csh (but not ksh) went a
bit too far, and the statement

     cd dir || exit 1

would never execute the exit 1.

It looks like the sh you are using has had this undocumented feature
removed, resulting in disaster.
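
If you want to check which behaviour your sh has, a quick test along
these lines should settle it (my own sketch; the file and directory
names are made up):

	echo 'cd /no/such/dir'      >  /tmp/cdtest
	echo 'echo still running'   >> /tmp/cdtest
	sh /tmp/cdtest

If "still running" gets printed, your sh does not abort a script on a
failed cd.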

Standard practice in cleanup scripts is to do a cd followed by
something else on the same line:

     cd /lost+found; find . -mtime +14 -exec rm -rf {} \;

If the cd fails, no damage is done, because the rest of the line is not
executed.  Any sensible shell ought to let at least this work, even if
it doesn't abort the entire script.
--
Rahul Dhesi <dhesi%cirrusl@oliveb.ATC.olivetti.com>
UUCP:  oliveb!cirrusl!dhesi

mpl@pegasus.ATT.COM (Michael P. Lindner) (08/02/90)

In article <2108@cirrusl.UUCP> dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) writes:
	deleted
>The last time I looked, it was an undocumented feature in sh and csh
>(and probably in ksh though I didn't check) that a cd that failed would
>abort the rest of the script.  In fact, sh and csh (but not ksh) went a
	deleted
>--
>Rahul Dhesi <dhesi%cirrusl@oliveb.ATC.olivetti.com>
>UUCP:  oliveb!cirrusl!dhesi

I don't know of any undocumented feature wrt. "cd", but for safety's sake,
all my shell scripts start with the line

	set -e

which says "exit on error".  Anyplace where I expect a command to fail
but it's OK to go on, I put either

	# do something special if the command fails
	if command
	then
		:
	else
		echo >&2 "command failed -- exit code $?"
	fi

	# or

	# ignore the code - useful for those commands which
	# don't return a meaningful exit code
	command || :

	# or

	# ignore the failure - useful for things like
	mkdir -p $dir 2> /dev/null || :
	# or
	mv -f $files 2> /dev/null || :
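
Applied to the cleanup fragment that started this thread, that style
would look something like this (a sketch only, not the distributed
/etc/cleanup):

	set -e				# exit on the first command that fails
	cd /lost+found			# if /lost+found is missing we stop right here
	find . -mtime +14 -exec rm -rf {} \;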

Mike Lindner
AT&T Bell Labs
attmail!mplindner

walter@mecky.UUCP (Walter Mecky) (08/03/90)

In article <2108@cirrusl.UUCP> dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) writes:
< >--	cd /lost+found
< >--	find . -mtime +14 -exec rm -rf {} \;
< >If there's no lost and found directory in the root file system, this deletes
< >everything in the system that's older than 14 days.

Guys, you have talked about many aspects of the problem but missed the
most important one.  It was discussed here in November of last year:

If fsck links a file into /lost+found, its mtime is left unchanged.
The same is true for all the files in a directory tree when fsck links
in a whole directory.  So you MUST NOT use the mtime to decide whether
to delete files in /lost+found, because find will then delete files in
your filesystem that you simply have not changed in the last 14 days.
The idea behind the "find ..." seems to have been: delete the files and
directory trees which have been sitting in /lost+found for longer than
14 days.

Some solutions were posted in the November discussion.  I don't
remember them and don't trust any of them.  My /etc/cleanup only
produces mail for user root and deletes no files:

   # for each mounted filesystem, mail root if its lost+found holds
   # anything (the echo glob stays unexpanded only when it is empty)
   for i in `/etc/mount | cut -d' ' -f1`
   do
	 [ "`echo $i/lost+found/*`" = "$i/lost+found/*" ] || 
		   echo "There is something in $i/lost+found.\nLook at it!" | 
		   mail -s 'File(s) in /lost+found' root
   done
-- 
Walter Mecky	[ walter@mecky.uucp	or  ...uunet!unido!mecky!walter ]

Dan_Jacobson@ATT.COM (08/03/90)

dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) writes:
>     cd /lost+found; find . -mtime +14 -exec rm -rf {} \;
>If the cd fails, no damage is done, because the rest of the line is not
>executed.  Any sensible shell ought to let at least this work, even if
>it doesn't abort the entire script.

Saying that there should be a special case just for the cd command, and
just for the rest of this line, rips up the whole uniformity and
generality of the shell [/bin/sh family of shells assumed].  If you want
a failed cd to kill the script, then do "set -e" or "cd dir || exit 1".
For just missing the rest of the line: "cd dir && bla bla bla".
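
In terms of the line that started this thread, that is (just a sketch
of the two forms):

	cd /lost+found || exit 1	# a failed cd kills the whole script
	find . -mtime +14 -exec rm -rf {} \;

	cd /lost+found && find . -mtime +14 -exec rm -rf {} \;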

[I'm speaking from a general UNIX view, and don't even read the i386
newsgroup, Followup-To: comp.unix.wizards]
-- 
Dan_Jacobson@ATT.COM +1-708-979-6364

daveh@marob.masa.com (Dave Hammond) (08/04/90)

In article <2108@cirrusl.UUCP> dhesi%cirrusl@oliveb.ATC.olivetti.com writes:
>>--	cd /lost+found
>>--	find . -mtime +14 -exec rm -rf {} \;
>>If there's no lost and found directory in the root file system, this deletes
>>everything in the system that's older than 14 days.
>
>The last time I looked, it was an undocumented feature in sh and csh
>(and probably in ksh though I didn't check) that a cd that failed would
>abort the rest of the script.

The /bin/sh in both Xenix 386 and Altos Unix V/386 only aborts the
script on a failed cd if it is invoked as `sh script'.  If the script
has been made executable and is invoked simply as `script', then sh
does not abort on a failed cd:

Script started [typescript] at Fri Aug  3 17:27:24 1990
daveh$ cat >foo
cd /fred/ethel/wilma ; who
daveh$ sh foo
foo: /fred/ethel/wilma: bad directory
daveh$ chmod +x foo
daveh$ ./foo
./foo: /fred/ethel/wilma:  not found
daveh      tty5E        Aug  3 17:27
clifford   tty02        Aug  2 00:21
daveh$ 
Script ended [typescript] at Fri Aug  3 17:28:04 1990

BTW, I just checked the action taken when /bin/sh sources (as in
`. ./foo') the script -- there also, the script is not aborted on cd
failure.

--
Dave Hammond
daveh@marob.masa.com
uunet!masa.com!marob!daveh

guy@auspex.auspex.com (Guy Harris) (08/05/90)

>If you want a failed cd to kill the script, then do...

If you want a failed "cd" to kill the script, don't bother doing
anything.  The SunOS 4.0.3 Bourne shell, based on the S5R3.1 one, will
kill the script if a "cd" fails; I checked the source code to the 4.3BSD
Bourne shell, based on the V7 one, and it appears as if it'll do the
same.

Given that, and given that, as far as I know, neither Sun nor Berkeley
introduced this feature, it's probably in most if not all UNIX Bourne
shells, going back at least as far as V7 (it existed, at least within
Bell Labs, before V7 came out; I can't speak for those versions).

davidsen@sixhub.UUCP (Wm E. Davidsen Jr) (08/07/90)

In article <3819@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:

| If you want a failed "cd" to kill the script, don't bother doing
| anything.  The SunOS 4.0.3 Bourne shell, based on the S5R3.1 one, will
| kill the script if a "cd" fails; I checked the source code to the 4.3BSD
| Bourne shell, based on the V7 one, and it appears as if it'll do the
| same.

  Yes, only ksh gives you the choice of catching the failure. In ksh you
can check status by doing something like

	cd $1 || break; do_more

where the cd will return bad status but still continue. You learn to be
VERY careful about typing "cd xxx;rm *" and other dangerous things!
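
A slightly fuller sketch of catching it under ksh (do_more is just a
stand-in for the real work, as above):

	if cd "$1"
	then
		do_more				# only runs if the cd succeeded
	else
		echo "cannot cd to $1" >&2	# ksh lets us get here and carry on
	fi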
-- 
bill davidsen - davidsen@sixhub.uucp (uunet!crdgw1!sixhub!davidsen)
    sysop *IX BBS and Public Access UNIX
    moderator of comp.binaries.ibm.pc and 80386 mailing list
"Stupidity, like virtue, is its own reward" -me

idallen@watcgl.uwaterloo.ca (D. Allen [CGL]) (08/07/90)

In article <1438@sixhub.UUCP> davidsen@sixhub.UUCP (bill davidsen) writes:
>  Yes, only ksh gives you the choice of catching the failure.

No, I think most any sh or csh shell will let you catch the failure,
but you have to put the failing command in a subshell.  Even if you're
stuck with a cd that kills your shell, you can get by using:

	( cd "$1" ) && cd "$1"

This puts the first cd in a subshell, which may well die but you don't
care since you're only interested in the return code.  Of course, this
has a small window between the first cd and the second, where things
might change, and so your shell may get blown away anyway; but you
hope that happens rarely.

The above trick is the only way to test for failure in various other
built-in shell commands.  I often use:

	( trap "" 18 22 ) >/dev/null 2>&1 && trap "" 18 22

because many sh shells don't handle signals above 16, but some do.
-- 
-IAN! (Ian! D. Allen) idallen@watcgl.uwaterloo.ca idallen@watcgl.waterloo.edu
 [129.97.128.64]  Computer Graphics Lab/University of Waterloo/Ontario/Canada

chip@tct.uucp (Chip Salzenberg) (08/11/90)

According to davidsen@sixhub.UUCP (bill davidsen):
>Yes, only ksh gives you the choice of catching the failure.

Bash 1.05 also continues after a "cd" failure.
-- 
Chip Salzenberg at ComDev/TCT     <chip@tct.uucp>, <uunet!ateng!tct!chip>

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (08/12/90)

In article <26C2F1A0.205B@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
: According to davidsen@sixhub.UUCP (bill davidsen):
: >Yes, only ksh gives you the choice of catching the failure.
: 
: Bash 1.05 also continues after a "cd" failure.

Likewise Perl.  The idiom to catch the failure is

	chdir $dir || die "Can't cd to $dir: $!\n";

Larry Wall
lwall@jpl-devvax.jpl.nasa.gov

les@chinet.chi.il.us (Leslie Mikesell) (08/14/90)

In article <9118@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
>In article <26C2F1A0.205B@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>: According to davidsen@sixhub.UUCP (bill davidsen):
>: >Yes, only ksh gives you the choice of catching the failure.
>: 
>: Bash 1.05 also continues after a "cd" failure.

>Likewise Perl.  The idiom to catch the failure is

>	chdir $dir || die "Can't cd to $dir: $!\n";

This is reasonable behaviour for perl, since it doesn't claim any
compatibility with /bin/sh scripts.  Those other two mentioned above
will cause serious problems when executing scripts that are
perfectly valid for /bin/sh.  They could (should) have required that a
"set" option be given to make them operate differently.

Les Mikesell
  les@chinet.chi.il.us

chip@tct.uucp (Chip Salzenberg) (08/17/90)

[ Discussion cross-posted to gnu.bash.bug. ]

According to les@chinet.chi.il.us (Leslie Mikesell):
>[Bash and ksh] will cause serious problems when executing scripts that are
>perfectly valid for /bin/sh.  They could (should) have required a "set"
>option to be done to make them operate differently.

What an excellent idea!  I intend to change my bash sources to exit on
a failed "cd" unless the shell is interactive or the variable
"no_exit_on_failed_cd" is set.

(Yes, the name is awkward, but it is a logical companion to the
already-implemented "no_exit_on_failed_exec" variable.)
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
 "Most of my code is written by myself.  That is why so little gets done."
                 -- Herman "HLLs will never fly" Rubin