smitty@essnj1.ESSNJAY.COM (Hibbard T. Smith JR) (07/25/90)
Within the past 2 weeks, we've upgraded several systems from 2.0.2 to 2.2.
On one of those systems, on Sunday morning at 05:17 or thereabouts, most
of the files on the system were deleted.  The problem was caused by a root
crontab-driven execution of /etc/cleanup.  This system's /lost+found
directory was inadvertently lost during the upgrade installation, and we
were planning on recreating it on Monday morning.

The last two lines of the distributed /etc/cleanup are as follows:

-- cd /lost+found
-- find . -mtime +14 -exec rm -rf {} \;

If there's no lost+found directory in the root file system, this deletes
everything on the system that's older than 14 days.  Two possible fixes
exist:

-- cd /lost+found && find . -mtime +14 -exec rm -rf {} \;
-- find /lost+found -mtime +14 -exec rm {} \;

Either of these is much safer than the distributed code.  This bad code is
different from 2.0.2, so beware!  I hope this saves someone the grief of
starting over, or worse yet, losing a whole system when you're not
prepared to rebuild it.

					Smitty
-------------------------------------------
Hibbard T. Smith JR                 smitty@essnj1.ESSNJAY.COM
ESSNJAY Systems Inc.                uunet!hsi!essnj1!smitty
dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) (08/02/90)
>-- cd /lost+found
>-- find . -mtime +14 -exec rm -rf {} \;
>If there's no lost and found directory in the root file system, this deletes
>everything in the system that's older than 14 days.

The last time I looked, it was an undocumented feature in sh and csh (and
probably in ksh, though I didn't check) that a cd that failed would abort
the rest of the script.  In fact, sh and csh (but not ksh) went a bit too
far, and the statement

	cd dir || exit 1

would never execute the "exit 1".  It looks like the sh you are using has
had this undocumented feature removed, resulting in disaster.

Standard practice in cleanup scripts is to do a cd followed by something
else on the same line:

	cd /lost+found; find . -mtime +14 -exec rm -rf {} \;

If the cd fails, no damage is done, because the rest of the line is not
executed.  Any sensible shell ought to let at least this work, even if it
doesn't abort the entire script.
--
Rahul Dhesi <dhesi%cirrusl@oliveb.ATC.olivetti.com>
UUCP:  oliveb!cirrusl!dhesi
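[In a shell that does not abort the script on a failed cd, the difference
between ";" and "&&" after the cd is easy to demonstrate.  A minimal
sketch; the directory name is invented and assumed not to exist:]

```shell
#!/bin/sh
# Contrast ';' with '&&' after a failed cd, in a shell that does
# not abort the script when cd fails.  /no-such-dir-xyz is assumed
# not to exist.  Each probe runs in a command substitution so the
# current directory is untouched.

# With ';' the second command runs anyway, in whatever directory
# we happened to be in -- this is the dangerous case.
after_semicolon=`cd /no-such-dir-xyz 2>/dev/null; echo ran-anyway`

# With '&&' the second command is skipped when the cd fails.
after_and=`cd /no-such-dir-xyz 2>/dev/null && echo ran-anyway`

echo "semicolon: $after_semicolon"
echo "and-and:   $after_and"
```

[With ";" the echo still runs; with "&&" it is skipped, which is exactly
why the "&&" form of the cleanup fix is safe even in shells that keep
going after a failed cd.]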
mpl@pegasus.ATT.COM (Michael P. Lindner) (08/02/90)
In article <2108@cirrusl.UUCP> dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) writes:
	deleted
>The last time I looked, it was an undocumented feature in sh and csh
>(and probably in ksh though I didn't check) that a cd that failed would
>abort the rest of the script.  In fact, sh and csh (but not ksh) went a
	deleted
>--
>Rahul Dhesi <dhesi%cirrusl@oliveb.ATC.olivetti.com>
>UUCP:  oliveb!cirrusl!dhesi

I don't know of any undocumented feature wrt. "cd", but for safety's
sake, all my shell scripts start with the line

	set -e

which says "exit on error".  Anyplace where I expect a command to fail
but it's OK to go on, I put either

	# do something special if the command fails
	if command
	then
		:
	else
		echo >&2 "command failed -- exit code $?"
	fi

	# or

	# ignore the code - useful for those commands which
	# don't return a meaningful exit code
	command || :

	# or

	# ignore the failure - useful for things like
	mkdir -p $dir 2> /dev/null || :

	# or

	mv -f $files 2> /dev/null || :

Mike Lindner
AT&T Bell Labs
attmail!mplindner
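[The interaction of "set -e" with a tolerated failure ("|| :") versus an
unguarded one can be seen in a few lines.  A sketch; the experiment runs
in a child sh -c so the failing cd kills only that child, and all the
path names are invented:]

```shell
#!/bin/sh
# Run a small 'set -e' script under a child shell and capture
# what it manages to print before the unguarded failure kills it.
out=`sh -c '
	set -e                                      # abort on any unguarded failure
	mkdir /no-such-parent/child 2>/dev/null || :  # expected failure, tolerated
	echo survived
	cd /no-such-dir-xyz 2>/dev/null             # unguarded: script dies here
	echo never-reached
'`
echo "$out"
```

[The mkdir fails but "|| :" absorbs it, so "survived" is printed; the
bare cd then ends the child script before "never-reached" can run, which
is the whole point of starting cleanup scripts with "set -e".]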
walter@mecky.UUCP (Walter Mecky) (08/03/90)
In article <2108@cirrusl.UUCP> dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) writes:
< >-- cd /lost+found
< >-- find . -mtime +14 -exec rm -rf {} \;
< >If there's no lost and found directory in the root file system, this deletes
< >everything in the system that's older than 14 days.
Guys, you talked about very many aspects of the problem and missed the
most important one.  It was discussed here in November last year:

If fsck links a file into /lost+found, its mtime is left unchanged.  The
same is true for all the files in a directory tree, if fsck links in a
directory.  So you MUST NOT use the mtime to decide whether to delete
files in /lost+found, because find will then delete files in your
filesystem that you simply have not changed in the last 14 days.  The
idea behind the "find ..." seemed to be: delete the files and directory
trees which have been in /lost+found longer than 14 days.

Some solutions were posted in the November discussion.  I don't remember
them, and I don't trust any of them.  In my /etc/cleanup there is only
mail produced for user root, and no deletion of files:
for i in `/etc/mount | cut -d' ' -f1`
do
	[ "`echo $i/lost+found/*`" = "$i/lost+found/*" ] ||
	echo "There is something in $i/lost+found.\nLook at it!" |
	mail -s 'File(s) in /lost+found' root
done
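[The test in the loop above relies on sh leaving a glob pattern
unexpanded when it matches nothing.  A minimal sketch of that idiom in
isolation; the directory and file names are invented:]

```shell
#!/bin/sh
# The "unexpanded glob" idiom: if a pattern matches no files,
# sh leaves it as literal text, so comparing the expansion with
# the pattern itself detects an empty directory.
dir=${TMPDIR:-/tmp}/globdemo.$$
mkdir -p "$dir"

before=`echo $dir/*`          # no files yet: stays literally "$dir/*"
touch "$dir/orphan"
after=`echo $dir/*`           # now expands to the real file name

echo "before: $before"
echo "after:  $after"
rm -rf "$dir"
```

[When the directory is empty the comparison succeeds and no mail is
sent; as soon as anything appears in a lost+found, the glob expands and
the test fails, triggering the mail to root.]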
--
Walter Mecky [ walter@mecky.uucp or ...uunet!unido!mecky!walter ]
Dan_Jacobson@ATT.COM (08/03/90)
dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) writes:

>	cd /lost+found; find . -mtime +14 -exec rm -rf {} \;
>If the cd fails, no damage is done, because the rest of the line is not
>executed.  Any sensible shell ought to let at least this work, even if
>it doesn't abort the entire script.

Saying that there should be a special case just for the cd command, and
just for the rest of this line, is ripping up the whole uniformity and
generality of the shell [/bin/sh family of shells assumed].

If you want a failed cd to kill the script, then do "set -e" or
"cd dir || exit 1".  For just skipping the rest of the line:
"cd dir && bla bla bla".

[I'm speaking from a general UNIX view, and don't even read the i386
newsgroup; Followup-To: comp.unix.wizards]
--
Dan_Jacobson@ATT.COM  +1-708-979-6364
daveh@marob.masa.com (Dave Hammond) (08/04/90)
In article <2108@cirrusl.UUCP> dhesi%cirrusl@oliveb.ATC.olivetti.com writes:
>>-- cd /lost+found
>>-- find . -mtime +14 -exec rm -rf {} \;
>>If there's no lost and found directory in the root file system, this deletes
>>everything in the system that's older than 14 days.
>
>The last time I looked, it was an undocumented feature in sh and csh
>(and probably in ksh though I didn't check) that a cd that failed would
>abort the rest of the script.

The /bin/sh in both Xenix 386 and Altos Unix V/386 only aborts the script
on a failed cd if invoked as `sh script'.  If the script has been made
executable and is invoked as simply `script', then sh does not abort on a
failed cd:

Script started [typescript] at Fri Aug  3 17:27:24 1990
daveh$ cat >foo
cd /fred/ethel/wilma ; who
daveh$ sh foo
foo: /fred/ethel/wilma: bad directory
daveh$ chmod +x foo
daveh$ ./foo
./foo: /fred/ethel/wilma: not found
daveh    tty5E        Aug  3 17:27
clifford tty02        Aug  2 00:21
daveh$
Script ended [typescript] at Fri Aug  3 17:28:04 1990

BTW, I just checked the action taken when /bin/sh sources (as in
`. ./foo') the script -- there also, the script is not aborted on cd
failure.

--
Dave Hammond
daveh@marob.masa.com
uunet!masa.com!marob!daveh
guy@auspex.auspex.com (Guy Harris) (08/05/90)
>If you want a failed cd to kill the script, then do...
If you want a failed "cd" to kill the script, don't bother doing
anything. The SunOS 4.0.3 Bourne shell, based on the S5R3.1 one, will
kill the script if a "cd" fails; I checked the source code to the 4.3BSD
Bourne shell, based on the V7 one, and it appears as if it'll do the
same.
Given that, and given that (as far as I know) neither Sun nor Berkeley
introduced this feature, it's probably in most if not all UNIX Bourne
shells, going back at least as far as V7.  (It existed, at least within
Bell Labs, before V7 came out; I can't speak for those versions.)
davidsen@sixhub.UUCP (Wm E. Davidsen Jr) (08/07/90)
In article <3819@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:

| If you want a failed "cd" to kill the script, don't bother doing
| anything.  The SunOS 4.0.3 Bourne shell, based on the S5R3.1 one, will
| kill the script if a "cd" fails; I checked the source code to the 4.3BSD
| Bourne shell, based on the V7 one, and it appears as if it'll do the
| same.

Yes, only ksh gives you the choice of catching the failure.  In ksh you
can check status by doing something like

	cd $1 || break; do_more

where the cd will return bad status but still continue.  You learn to be
VERY careful about typing "cd xxx;rm *" and other dangerous things!
--
bill davidsen - davidsen@sixhub.uucp (uunet!crdgw1!sixhub!davidsen)
    sysop *IX BBS and Public Access UNIX
    moderator of comp.binaries.ibm.pc and 80386 mailing list
"Stupidity, like virtue, is its own reward" -me
idallen@watcgl.uwaterloo.ca (Ian! D. Allen [CGL]) (08/07/90)
In article <1438@sixhub.UUCP> davidsen@sixhub.UUCP (bill davidsen) writes:
> Yes, only ksh gives you the choice of catching the failure.

No, I think most any sh or csh shell will let you catch the failure, but
you have to put the failing command in a subshell.  Even if you're stuck
with a cd that kills your shell, you can get by using:

	( cd "$1" ) && cd "$1"

This puts the first cd in a subshell, which may well die, but you don't
care, since you're only interested in the return code.  Of course, this
has a small window between the first cd and the second, where things
might change, and so your shell may get blown away anyway; but you hope
that happens rarely.

The above trick is the only way to test for failure in various other
built-in shell commands.  I often use:

	( trap "" 18 22 ) >/dev/null 2>&1 && trap "" 18 22

because many sh shells don't handle signals above 16, but some do.
--
-IAN! (Ian! D. Allen) idallen@watcgl.uwaterloo.ca idallen@watcgl.waterloo.edu
[129.97.128.64] Computer Graphics Lab/University of Waterloo/Ontario/Canada
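[The probe-in-a-subshell idiom can be wrapped up in a function.  A
sketch; the function name try_cd is invented, not from the thread:]

```shell
#!/bin/sh
# Probe the cd in a throwaway subshell first; only if the probe
# succeeds do we cd in the current shell.  If the shell is one
# that dies on a failed cd, only the subshell dies.
try_cd() {
	( cd "$1" ) 2>/dev/null && cd "$1"
}

try_cd /no-such-dir-xyz || echo "probe failed, stayed in `pwd`"
try_cd / && echo "now in `pwd`"
```

[The same wrapping works for the trap probe in the post above: run the
built-in once in a subshell purely for its exit status, then repeat it
for real only when it is known to be safe.]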
chip@tct.uucp (Chip Salzenberg) (08/11/90)
According to davidsen@sixhub.UUCP (bill davidsen):
>Yes, only ksh gives you the choice of catching the failure.
Bash 1.05 also continues after a "cd" failure.
--
Chip Salzenberg at ComDev/TCT <chip@tct.uucp>, <uunet!ateng!tct!chip>
lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (08/12/90)
In article <26C2F1A0.205B@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
: According to davidsen@sixhub.UUCP (bill davidsen):
: >Yes, only ksh gives you the choice of catching the failure.
:
: Bash 1.05 also continues after a "cd" failure.
Likewise Perl. The idiom to catch the failure is
chdir $dir || die "Can't cd to $dir: $!\n";
Larry Wall
lwall@jpl-devvax.jpl.nasa.gov
les@chinet.chi.il.us (Leslie Mikesell) (08/14/90)
In article <9118@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
>In article <26C2F1A0.205B@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>: According to davidsen@sixhub.UUCP (bill davidsen):
>: >Yes, only ksh gives you the choice of catching the failure.
>:
>: Bash 1.05 also continues after a "cd" failure.
>Likewise Perl.  The idiom to catch the failure is
>	chdir $dir || die "Can't cd to $dir: $!\n";

This is reasonable behaviour for perl, since it doesn't claim any
compatibility with /bin/sh scripts.  The other two mentioned above will
cause serious problems when executing scripts that are perfectly valid
for /bin/sh.  They could (should) have required a "set" option to be done
to make them operate differently.

Les Mikesell
  les@chinet.chi.il.us
chip@tct.uucp (Chip Salzenberg) (08/17/90)
[ Discussion cross-posted to gnu.bash.bug. ] According to les@chinet.chi.il.us (Leslie Mikesell): >[Bash and ksh] will cause serious problems when executing scripts that are >perfectly valid for /bin/sh. They could (should) have required a "set" >option to be done to make them operate differently. What an excellent idea! I intend to change my bash sources to exit on a failed "cd" unless the shell is interactive or the variable "no_exit_on_failed_cd" is set. (Yes, the name is awkward, but it is a logical companion to the already-implemented "no_exit_on_failed_exec" variable.) -- Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip> "Most of my code is written by myself. That is why so little gets done." -- Herman "HLLs will never fly" Rubin