[comp.sys.sun] cron/find sting! SUN OS 3.5 HELP!!!

mike@uunet.uu.net (Mike Thompson) (03/11/89)

FOREGROUND: (See BACKGROUND below for info on the organization of our
system)

I came in this morning and discovered that every file that hadn't been
accessed in the past 2 days on /usr/mercury had been deleted (none of the
other file systems had been modified in this way). By examining the
lastcomm output on one of the clients, I discovered hundreds of "rm"s
starting at 4:30am and running through to 4:58am, followed by a find with
about 300 CPU seconds of time. The find had started at 4:30am, and it was
the only find run by root at or around 4:30am. Every system's crontab file
contains the following three lines:

	15 4 * * * find /usr/preserve/ -mtime +7 -a -exec rm -f - {} \;
	30 4 * * * find /tmp/ -atime +2 \! -type d -exec rm -f - {} \;
	45 4 * * * find /usr/tmp/ -atime +2 \! -type d -exec rm -f - {} \;

All systems run the same version of cron and find. None of the other
systems exhibited this; all their 4:30 finds executed in under a CPU
second, as is to be expected. /tmp is not a symbolic link. I suspect that
there is some kind of bug in cron, although it could be anything from a
virus or a malicious user to a bad block on the swap partition.

Help!!! I've managed to recover the files, but I don't want it to happen
again. Has anyone encountered this before? Does anyone know what is going
on? Thanks in advance. I am also going to call Sun software support to see
if they can help.

BACKGROUND:

We are running 1 diskful Sun 3/60 and 4 diskless Sun 3/50s here with OS
3.5. File systems are arranged thus:

			ON THE SERVER
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/sd0a               7608    5000    1847    73%    /
/dev/sd0f               7855    3800    3269    54%    /pub.MC68020
/dev/sd0h              77025   56997   12325    82%    /usr.MC68020
/dev/sd2h             188292  116019   53443    68%    /usr.MC68020/mercury
/dev/sd0d              89461   52574   27940    65%    /usr.MC68020/mercury/scratch
/dev/sd2g              53308   20545   27432    43%    /usr.MC68020/mercury/spool
/dev/sd2a               9693    5088    3635    58%    /broot

			ON THE CLIENTS
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/nd0                7608    2196    4651    32%    /
/dev/ndp1               7855    3800    3269    54%    /pub (ro)
mercury:/usr.MC68020   77025   56997   12325    82%    /usr (ro)
mercury:/usr/mercury  188292  116019   53443    68%    /usr/mercury
mercury:/usr/mercury/scratch
                       89461   52574   27940    65%    /usr/mercury/scratch
mercury:/usr/mercury/spool
                       53308   20545   27432    43%    /usr/mercury/spool
-- 

Michael A. Thompson, Iotek Inc,  E-Mail: mike@iotek.uucp
1127 Barrington St., Suite 100,  Fax:    (902)420-0674
Halifax, N.S., B3H 2P8, Canada   Phone:  (902)420-1890

rlk@think.com (Robert L. Krawitz) (03/24/89)

I suspect that someone on the client is/was using 'on'.  If the working
directory isn't actually mounted on the machine that 'on' is trying to run
on, it will mount it -- in /tmp!  Either a problem with 'on' or a
long-running command can leave the filesystem mounted for a long while
(and of course, if someone's running 'on' right when the find goes off,
you lose).

In any event, it certainly sounds like someone's mounting filesystems in
/tmp by whatever means.  This isn't necessarily wrong, but your find
scripts could be better.  Try putting -xdev in each find command before
the -exec rm; it stops find from descending across a filesystem boundary.
Like this:

	15 4 * * * find /usr/preserve/ -xdev -mtime +7 -a -exec rm -f - {} \;
	30 4 * * * find /tmp/ -xdev -atime +2 \! -type d -exec rm -f - {} \;
	45 4 * * * find /usr/tmp/ -xdev -atime +2 \! -type d -exec rm -f - {} \;
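
A cautious way to try a change like this is to swap the -exec rm for
-print first and read the list before letting cron delete anything.  A
minimal sketch, assuming a POSIX sh and a find that supports -xdev; the
function name preview_cleanup and its arguments are made up for
illustration and are not part of any crontab:

```shell
# preview_cleanup DIR DAYS
# List (do not delete) the non-directories under DIR whose last access is
# more than DAYS days old, without crossing onto any other filesystem
# mounted below DIR.  Hypothetical helper, for illustration only.
preview_cleanup() {
    dir=$1
    days=$2
    find "$dir" -xdev -atime +"$days" ! -type d -print
}
```

Running "preview_cleanup /tmp/ 2" should list only files that live on the
/tmp filesystem itself; if anything from another filesystem shows up, the
-xdev is not doing what you expect on that system.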

We got bitten by this once in 3.x.  Turns out that the standard crontab
distributed with at least some 3.x systems doesn't have the -xdev.  We
reported it back to Sun, and I don't know when they fixed it.

These automatic find scripts can be awfully dangerous if you have a weird
configuration.  When I was at Project Athena, a few of us got bitten by a
similar problem on private workstations (MicroVAX IIs).

ames >>>>>>>>>  |	Robert Krawitz <rlk@think.com>	245 First St.
bloom-beacon >  |think!rlk	(postmaster)		Cambridge, MA  02142
harvard >>>>>>  .	Thinking Machines Corp.		(617)876-1111

pvo1478@oce.orst.edu (Paul V. O'Neill) (03/24/89)

> I came in this morning and discovered that every file that hadn't been
> accessed in the past 2 days on /usr/mercury had been deleted (none of the

Gee..., you're not running ``rexd'' and ``on'' are you?  They can and will
mount other machines' file systems in /tmp.  See sunspots v6n31, v6n39,
v6n51.

"The most innocent looking hack can blow you head clean off."--Ron Hitchens
								v6n39
Paul O'Neill                 pvo@oce.orst.edu
Coastal Imaging Lab
OSU--Oceanography
Corvallis, OR  97331         503-754-3251

mike@uunet.uu.net (Mike Thompson) (03/24/89)

Thank you to those that responded to my call for help.

As was pointed out to me, this problem has been discussed in sun-spots
previously (v6n31, v6n39, v6n51). The problem arises because the
on/rpc.rexd client/server pair will NFS mount directories in /tmp, so if
someone happens to be running an on command at the same time that the
"find /tmp/ -atime +2 \! -type d -exec rm -f - {} \;" is run, the find
will quite happily go off through the NFS mounted file system. The
solution is to add the -xdev option to the find command to keep it on the
same file system.
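
To see whether anything is currently mounted below /tmp before the cleanup
runs, one could compare the filesystem that df reports for /tmp with the
one it reports for each subdirectory. A rough sketch, assuming a POSIX sh,
"df -P", and awk (so a present-day system rather than OS 3.5); fs_of and
foreign_mounts are names made up here, and the assumption that the mounts
sit directly under /tmp is mine, not something the rexd documentation
states:

```shell
# fs_of PATH: print the "Mounted on" column that df reports for PATH.
# Assumes mount point names contain no spaces.
fs_of() {
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

# foreign_mounts DIR: print each non-hidden directory directly under DIR
# that lives on a different filesystem than DIR itself.
# Hypothetical helper; it only looks one level down.
foreign_mounts() {
    top=$(fs_of "$1")
    for sub in "$1"/*/; do
        [ -d "$sub" ] || continue
        [ "$(fs_of "$sub")" = "$top" ] || printf '%s\n' "${sub%/}"
    done
}
```

An empty result from "foreign_mounts /tmp" means nothing one level down is
on another filesystem at that moment; it says nothing about a mount that
appears a minute later, so it is a diagnostic aid and not a substitute for
-xdev.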

I had contacted Sun about this problem, and they didn't seem to know
anything about it. They did suggest adding the -xdev option to the find
command, but they didn't seem to know what might be causing the problem in
the first place.

The only mention of this that I was able to find in the documentation was
"This daemon may use the NFS to mount file systems specified in the remote
execution request." in the documentation for rexd. There is no mention of
where the filesystems would be mounted and no comment about the dangers
that this implies.  I think that sentence should be down in the BUGS
section of the manual. What is in the BUGS section is also disturbing:
"Should be better access control". What is wrong with the access control?
One of the people that responded to my request for help mentioned this as
well; according to them there seems to be no host verification, only uid
verification (i.e. is this a valid uid on this machine, not is this a
valid host with a valid uid). Not having source code, I can't check this
out.

Michael A. Thompson, Iotek Inc, |*| E-Mail: mike@iotek.uucp
1127 Barrington St., Suite 100, |*| Fax:    (902)420-0674
Halifax, N.S., B3H 2P8, Canada  |*| Phone:  (902)420-1890