[comp.bugs.sys5] mkdir

wcs@skep2.ATT.COM (Bill.Stewart.[ho95c]) (12/22/88)

nice(-255);	/* always win race condition  - fixes security bug */
		/* nice(-255) is not very nice, but root has its privileges */
		/* works with official mkdir and Doug's */

The original articles in news.admin pointed out a race
condition in the /bin/mkdir used by V7 and System V
older releases.  (mkdir runs setuid root, mknod's a
directory then chowns it to getuid() - if you're fast you
can rmdir the directory and ln /etc/passwd, especially if
you nice -19 mkdir.)  Doug Davis posted a public-domain
mkdir which he claimed fixed the problem.  Other people have
pointed out that it helps, but doesn't prevent the race
condition, and it's free source for people who only have
binary systems.

Put a nice(-255) as the first line of your mkdir, and you'll
"never" lose the race!!  The race condition worked, because
on a loaded system, you could guarantee that sometimes your
rm-and-link hack would happen between the mknod() and the chown(),
as long as you ran the mkdir niced so it had a low priority.

But remember that you're setuid root, so you don't have to be
nice() just because some luser told you to be.
Nice(-255) truncates to the nastiest priority a user-level
process can have, no matter how nice() the mkdir had
originally been.

Under System V, nice() returns -1 if it can't get the nice value you
asked for; my 4.1BSD manual doesn't say what it does.  It's good
programming practice to do

	if (nice(-255) == -1) { perror("mkdir fails:"); exit(1); }

but it really can't happen since you're root.

====== commentary =======
The security bug was REAL, for systems using a setuid-root mkdir
command instead of mkdir() system call.  SVR3.0 and 4.2BSD both
have mkdir(), but earlier System V, V7, and 4.1BSD all use the
older approach.  The nice(-255) isn't a 100% cure, but I've been
unable to break any of our systems since we installed it, and
it doesn't break anything.

	Multiprocessor systems may still have some risk.
	
We looked at some alternatives like Doug's and Dan's that
might have the potential to cause trouble - mkdir is a critical
enough program that I'd want to thoroughly test anything that
wasn't really simple before I used it, but this change is as
transparent as any I could think of.  We may still try some more
hacks, but this is cheap, easy, and works for uniprocessors.

-- 
#				Thanks;
# Bill Stewart, AT&T Bell Labs 2G218 Holmdel NJ 201-949-0705 ho95c.att.com!wcs
#
#	News.  Don't ask me about News.

ddl@husc6.harvard.edu (Dan Lanciani) (12/22/88)

In article <379@skep2.ATT.COM>, wcs@skep2.ATT.COM (Bill.Stewart.[ho95c]) writes:
| nice(-255);	/* always win race condition  - fixes security bug */
| 		/* nice(-255) is not very nice, but root has its privileges */
| 		/* works with official mkdir and Doug's */
| 
| The original articles in news.admin pointed out a race
| condition in the /bin/mkdir used by V7 and System V
| older releases.  (mkdir runs setuid root, mknod's a
| directory then chowns it to getuid() - if you're fast you
| can rmdir the directory and ln /etc/passwd, especially if
| you nice -19 mkdir.)  Doug Davis posted a public-domain
| mkdir which he claimed fixed the problem.  Other people have
| pointed out that it helps, but doesn't prevent the race
| condition, and it's free source for people who only have
| binary systems.
| 
| Put a nice(-255) as the first line of your mkdir, and you'll
| "never" lose the race!!  The race condition worked, because
| on a loaded system, you could guarantee that sometimes your
| rm-and-link hack would happen between the mknod() and the chown(),
| as long as you ran the mkdir niced so it had a low priority.

	Unfortunately, this analysis is incorrect.  The real window
occurs while the mkdir process is blocked on disk I/O to, e.g., look
up elements of the path name of the file to chown().  No matter how
good a priority the process has, other processes will run while it
waits for the disk.  In fact, it is doubtful that you would ever
see the mkdir attack succeed if it actually required that the system
preempt the mkdir process "between" the mknod() and the chown().  Tests
with a well-crafted mkdir-breaker program showed that success was
nearly independent of process priority.
	Incidentally, the fix proposed by jfh@rpp386 (using dir/./.
as the target of the chown()) doesn't help either.  It was a good
try (and happened to be included in the mkdir test mentioned above)
but breaks down since link() itself is not atomic.

				Dan Lanciani
				ddl@harvard.*

jfh@rpp386.Dallas.TX.US (The Beach Bum) (12/22/88)

In article <379@skep2.ATT.COM> wcs@skep2.UUCP (46323-Bill.Stewart.[ho95c],2G218,x0705,) writes:
>nice(-255);	/* always win race condition  - fixes security bug */
>		/* nice(-255) is not very nice, but root has its privileges */
>		/* works with official mkdir and Doug's */

Nope, this fails.  Consider - nice() does not insure you are always first,
it only insures that you are preferred.  After some period of execution,
the priority of the process will drop low enough for the user to slip
in.  Instead of doing a single directory per mkdir, stuff the command line
FULL of directories.

Also, the lowest NICE is 0.  The default NICE is 20.  This only means
that proc.p_cpu for your mkdir process needs to be 20 more than p_cpu for
the bad guys process.  One full second of execution should do this.  Once
that is accomplished, the bad guy should be able to slip in between.  A
C program may be needed to get the timing information correct, but it
should be VERY possible.
-- 
John F. Haugh II                        +-Quote of the Week:-------------------
VoiceNet: (214) 250-3311   Data: -6272  |"Unix doesn't have bugs,
InterNet: jfh@rpp386.Dallas.TX.US       | Unix is a bug"
UucpNet : <backbone>!killer!rpp386!jfh  +--              -- author forgotten --

arosen@hawk.ulowell.edu (MFHorn) (12/22/88)

> | nice(-255);	/* always win race condition  - fixes security bug */
> 
> 	Unfortunately, this analysis is incorrect.
> The real window
> occurs while the mkdir process is blocked on disk I/O

Still not right.

> 	Incidentally, the fix proposed by jfh@rpp386 (using dir/./.
> as the target of the chown()) doesn't help either.

But it's the right idea.

A couple years ago, I had to fix this bug in one of our systems.  I had
source to mkdir.c, but not to the kernel, and was able to successfully
close the hole completely.


  mknod(dirname);       /* Irrelevant arguments omitted */
  link(".");
  link("..");
  chown(dirname);

The real problem is mkdir trusts dirname to be the directory it just
created, which is not necessarily the case.  Nicing the process only
shrinks the window of vunlerability, but it doesn't close it.

The proper fix is to change 'chown(dirname);' to 'chown(".");' and
add a chdir(dirname); in the right place (with proper error checking).

  mknod(dirname);
  link(".");
  link("..");
  chdir(dirname);
  chown(".");

Now, it doesn't matter where the rmdir/ln magic is performed.  If it's
between chdir and chown, you're too late.  If it's between mknod and
link, you're too early.  If it's between link and chdir, then the chdir
fails with ENOTDIR (remember only root can ln directories).

Andy Rosen           | arosen@hawk.ulowell.edu | "I got this guitar and I
ULowell, Box #3031   | ulowell!arosen          |  learned how to make it
Lowell, Ma 01854     |                         |  talk" -Thunder Road
		RD in '88 - The way it should've been

jfh@rpp386.Dallas.TX.US (The Beach Bum) (12/22/88)

In article <871@husc6.harvard.edu> ddl@husc6.harvard.edu (Dan Lanciani) writes:
>	Incidentally, the fix proposed by jfh@rpp386 (using dir/./.
>as the target of the chown()) doesn't help either.  It was a good
>try (and happened to be included in the mkdir test mentioned above)
>but breaks down since link() itself is not atomic.

It is time for this Dan Lanciani person to shut up, or produce proof that
these bug fixes do not work.  I challenge him to produce a test which will
break the mkdir Doug Davis provided with the patch I suggested.  Furthermore,
I am willing to let that test pound on my system for a day or more if needed.
Failing this, I suggest we all add Mr. Laniciani to our official list of
crackpots and throw him in the KILL file.

The basis for my patch is that the link() call is PRIVILEGED.  Since '.'
in the context of the above referenced chown() MUST be a directory, the
bad guy would have to be root.  If Mr. Lanciani is assuming one may
become root to break this program, then all bets are off, since Doug's
entire assumption is based on the bad guy not becoming root.  The bad guy
simply can't create a forged directory structure without first BEING root.
-- 
John F. Haugh II                        +-Quote of the Week:-------------------
VoiceNet: (214) 250-3311   Data: -6272  |"Unix doesn't have bugs,
InterNet: jfh@rpp386.Dallas.TX.US       | Unix is a bug"
UucpNet : <backbone>!killer!rpp386!jfh  +--              -- author forgotten --

lmb@vicom.COM (Larry Blair) (12/23/88)

In article <871@husc6.harvard.edu> ddl@husc6.harvard.edu (Dan Lanciani) writes:
=In article <379@skep2.ATT.COM>, wcs@skep2.ATT.COM (Bill.Stewart.[ho95c]) writes:
=| nice(-255);	/* always win race condition  - fixes security bug */
=| 		/* nice(-255) is not very nice, but root has its privileges */
=| 		/* works with official mkdir and Doug's */
=| 
=	Unfortunately, this analysis is incorrect.  The real window
=occurs while the mkdir process is blocked on disk I/O to, e.g., look
=up elements of the path name of the file to chown().

I don't what version of Unix you run, but I don't know of one that would
do ANY disk I/O for the chown, since the mknod would have brought all of
the elements into core and, being -255, no one else could have caused the
kernel to de-cache the inodes and buffers.  I'm also pretty sure that there
is no danger in this case of losing the CPU due to timeout.  I'm sure that
some kernel hacker can tell us whether nice(-255) can be preempted.
-- 
Larry Blair   ames!vsi1!lmb   lmb@vicom.com

ddl@husc6.harvard.edu (Dan Lanciani) (12/23/88)

In article <10845@swan.ulowell.edu>, arosen@hawk.ulowell.edu (MFHorn) writes:
| > | nice(-255);	/* always win race condition  - fixes security bug */
| > 
| > 	Unfortunately, this analysis is incorrect.
| > The real window
| > occurs while the mkdir process is blocked on disk I/O
| 
| Still not right.

	Eh?

| > 	Incidentally, the fix proposed by jfh@rpp386 (using dir/./.
| > as the target of the chown()) doesn't help either.
| 
| But it's the right idea.
| 
| A couple years ago, I had to fix this bug in one of our systems.  I had
| source to mkdir.c, but not to the kernel, and was able to successfully
| close the hole completely.
| 
| 
|   mknod(dirname);       /* Irrelevant arguments omitted */
|   link(".");
|   link("..");
|   chown(dirname);
| 
| The real problem is mkdir trusts dirname to be the directory it just
| created, which is not necessarily the case.  Nicing the process only
| shrinks the window of vunlerability, but it doesn't close it.

	Correct.

| The proper fix is to change 'chown(dirname);' to 'chown(".");' and
| add a chdir(dirname); in the right place (with proper error checking).
| 
|   mknod(dirname);
|   link(".");
|   link("..");
|   chdir(dirname);
|   chown(".");
| 
| Now, it doesn't matter where the rmdir/ln magic is performed.  If it's
| between chdir and chown, you're too late.  If it's between mknod and
| link, you're too early.  If it's between link and chdir, then the chdir
| fails with ENOTDIR (remember only root can ln directories).

	But consider that you make *two* switches, one between the lookup
of the arguments of the link(dirname, dirname/.) call and one immediately
thereafter.  (This is what I meant about link itself not being atomic.)
This is easier than it sounds because you can sit there switching links
between the file you want to chown and the directory.  About half the time
the link() will fail because it trys to link into your file instead
of a directory.  The other half the time you will have "." in the newly
made directory pointing to the file you want to chown.  The subsequent
chdir() will succeed about half the time also, for a total success rate
of about 1/4.  In reality, the granularity of the scheduler makes for
a much worse success rate, but it does work.  (Yes, I ignored the
order of the link() for ".." in your example; it would presumably cut
things down to 1/8.)

				Dan Lanciani
				ddl@harvard.*

jfh@rpp386.Dallas.TX.US (The Beach Bum) (12/23/88)

In article <1313@vsi1.COM> lmb@vicom.COM (Larry Blair) writes:
>I don't what version of Unix you run, but I don't know of one that would
>do ANY disk I/O for the chown, since the mknod would have brought all of
>the elements into core and, being -255, no one else could have caused the
>kernel to de-cache the inodes and buffers.

This is an unwarranted assumption, as we shall see later ...  If you want
to insure the inode will remain cached [ the inode is the only thing
you can really get a handle on, paths are very slippery as it were ] you
would have to open the directory.  The inode would remain resident so
long as the directory were open.  Or, you could chdir() to the new
directory as current directories are also kept in core.  Simply praying
you have enough inode table entries don't cut it.  Inodes are only kept
resident for open file, current directories, root directories, pure
text files currently under execution or with the sticky bit set, and
mounted on directories.  [ Plus the one or two others I forgot, like the
system accounting file ;-) ]

> I'm also pretty sure that there
>is no danger in this case of losing the CPU due to timeout.  I'm sure that
>some kernel hacker can tell us whether nice(-255) can be preempted.

There is obviously a VERY severe misunderstanding of the function of
nice() under System V.  NICE only makes you a preferred process.  The
actual selection for scheduling is based on your CPU usage, your NICE
value, and your priority.  Your priority is adjusted after event wakeups
and during execution every second or so by the clock interrupt.  Your
milage may vary.

First, some accounting numbers to illustrate the point -

  START: Thu Dec 22 18:19:08 1988
  END:   Thu Dec 22 18:22:06 1988
  COMMAND                      START    END          REAL     CPU    MEAN
  NAME         USER   TTYNAME  TIME     TIME       (SECS)  (SECS) SIZE(K)
  ogetty     root     ?        18:08:38 18:30:35  1317.12    0.62  369.80
  #ogetty    root     ?        18:15:23 18:27:43   740.16    0.70  363.66
  #sh        news     ?        18:10:00 18:26:22   982.40    0.08  699.50
  sh         news     ?        18:10:00 18:26:22   981.92    0.16  530.50
  rnews      news     ?        18:18:33 18:23:52   319.20   69.48 1433.70
  #csh       jfh      tty02    23:27:53 18:23:20 68126.72    4.92  623.34
  csh        jfh      tty01    18:15:45 18:23:17   452.32    1.10  745.59
  #csh       root     tty01    18:16:25 18:23:06   400.80    1.94  637.69
  #mkdir     root     tty01    18:19:08 18:22:06   177.60   82.08  386.98
. lc         jfh      tty02    18:21:54 18:21:57     2.94    1.12 1044.84
. #sh        news     ?        18:20:00 18:20:45    45.12    0.08  529.00
. sh         news     ?        18:20:01 18:20:45    44.28    0.26  446.30
. uux        news     ?        18:20:04 18:20:45    40.96    2.08  769.84
. sh         news     ?        18:20:04 18:20:23    18.72    0.02  727.88
. compress   news     ?        18:20:04 18:20:23    18.58    0.22 6068.36
. batch      news     ?        18:20:04 18:20:13     9.12    0.14  463.14
. sh         news     ?        18:20:10 18:20:12     2.04    0.12  393.83
. expr       news     ?        18:20:03 18:20:04     0.72    0.04  526.00

All of the commands marked with a '.' in the first column began and
ended while mkdir was running.  mkdir was running with a NICE of 0,
which for System V is as low as it gets.  You CAN'T set NICE < 0 under
System V [ maybe SVR3? SVR4??? ].  All of the other commands listed were
loaded and executing [ except maybe the getty's ] at the same time as
mkdir.  However, what is most important to notice is that the REAL TIME
and the CPU TIME for mkdir were not equal.  So long as this is true, a
window exists during which another command can enter the job mix and be
executed.

If you would like to know what arguments I used, try mkdir {A..Z}{A..Z},
that should do the trick [ modulo csh syntax ].  The 'lc' in there
should prove interesting as lc had to execute a filename lookup on the
300 or so directories which existed at the time.  Clearly if lc can
execute 300 stat() calls, some bad guy's program would have little
trouble slipping an unlink()/link() pair in there ...

Oh well, so much for the nice() mythology.  Another UNIX legend down the
tubes.  Now excuse me as I go run off and rmdir all 676 of those
directories I just created ;-(
-- 
John F. Haugh II                        +-Quote of the Week:-------------------
VoiceNet: (214) 250-3311   Data: -6272  |"Unix doesn't have bugs,
InterNet: jfh@rpp386.Dallas.TX.US       | Unix is a bug"
UucpNet : <backbone>!killer!rpp386!jfh  +--              -- author forgotten --

wcs@skep2.ATT.COM (Bill.Stewart.[ho95c]) (12/23/88)

I had suggested using nice(-255) to make sure "never" lose the
race condition, with the obvious implication that there's no
guarantee that you really never lose - the suggestion of
mkdiring a large number of files at once to make sure you run
for a long time is one I should have seen.  It was meant to be
a safe fix for most of the problem, and does seem to help the
immediate threat.

For what it's worth, I tested the breaking-in
script from the net for about two hours without it losing, vs.
the typical 30 seconds I got with the original mkdir.  I've
done less extensive with mkdir foo foo foo foo...
but the nice() gives it enough advantage in the race that
tests of less than a few million iterations aren't very convincing.

Andy Rosen's solution looks right - has anyone found holes in it?
  mknod(dirname); link("."); link(".."); chdir(dirname); chown(".");
To make it a bit more general, I assume you would pwd() at the
beginning, and chdir(origdir) before each call to the mkdir routine.

Several people have quibbled about the -255 value - my System V
and 4.1BSD manuals both say that nice() will truncate if the
nice value is out of bounds, so this value makes it truncate
at the least niceness, regardless of system-dependent
differences in what the values mean.

Don writes about my approach:
> =	Unfortunately, this analysis is incorrect.  The real window
> =occurs while the mkdir process is blocked on disk I/O to, e.g., look
> =up elements of the path name of the file to chown().

In article <1313@vsi1.COM> lmb@vicom.COM (Larry Blair) writes:
> I don't what version of Unix you run, but I don't know of one that would
> do ANY disk I/O for the chown, since the mknod would have brought all of
> the elements into core and, being -255, no one else could have caused the
> kernel to de-cache the inodes and buffers.  

Any comments on who's right?  I suppose that if you mkdir
enough files and there's a lot of disk thrashing on the
system, it's possible you'll age enough to occasionally
need disk i/o, which means there's some risk of losing the race.
Adding a sync() before each mknod may help.

From: arosen@hawk.ulowell.edu (MFHorn) (Andy Rosen)
> The real problem is mkdir trusts dirname to be the directory it just
> created, which is not necessarily the case.  Nicing the process only
> shrinks the window of vunlerability, but it doesn't close it.
	Of course, though it makes it much safer, at least for
	uniprocessors.
> The proper fix is to change 'chown(dirname);' to 'chown(".");' and
> add a chdir(dirname); in the right place (with proper error checking).
> Now, it doesn't matter where the rmdir/ln magic is performed.  If it's

From: ddl@husc6.harvard.edu (Dan Lanciani)
> 	Incidentally, the fix proposed by jfh@rpp386 (using dir/./.
> as the target of the chown()) doesn't help either.  It was a good
> try (and happened to be included in the mkdir test mentioned above)
> but breaks down since link() itself is not atomic.

In article <871@husc6.harvard.edu> ddl@husc6.harvard.edu (Dan Lanciani) writes:
>	Incidentally, the fix proposed by jfh@rpp386 (using dir/./.
>as the target of the chown()) doesn't help either.  ...
>but breaks down since link() itself is not atomic.

From: John F. Haugh II                        
> Failing this, I suggest we all add Mr. Laniciani to our official
> list of crackpots and throw him in the KILL file.

Let's not flame here - Don often expresses opinions contrary to the
conventional wisdom, but he knows what he's doing at least as
often as I do :-)

The chdir(dname); chown(".", getuid()) solution looks good -
I'd been looking at variations on 
	mknod(dname); link dname . ; link parent .. ;
	stat the new directory to get permissions
	chmod 000 directory
	create a subdirectory dname/temp, make . and ..
	chown dname/temp/..  /* *must* be a directory */
	chmod original-permissions dname

-- 
#				Thanks;
# Bill Stewart, AT&T Bell Labs 2G218 Holmdel NJ 201-949-0705 ho95c.att.com!wcs
#
#	News.  Don't ask me about News.

jfh@rpp386.Dallas.TX.US (The Beach Bum) (12/24/88)

In article <876@husc6.harvard.edu> ddl@husc6.harvard.edu (Dan Lanciani) writes:
>| The real problem is mkdir trusts dirname to be the directory it just
>| created, which is not necessarily the case.  Nicing the process only
>| shrinks the window of vunlerability, but it doesn't close it.
>
>	Correct.

In the case of the posted patch which I suggested, this is immaterial.
If the directory being chown()'d is NOT the directory which was just
created, then the person doing the spoofing must have created the
bogus directory with some help by becoming root, since only root could
have made the bogus directory links.

Given THAT piece of information, it SHOULD be obvious that either the
patch works, or the bad guy was root already in which case it doesn't
matter what the hell happens.  Only root can create arbitrary directory
structures.  Only a clever manipulation of the directory structure
could cause mkdir to chown the wrong directory.
-- 
John F. Haugh II                        +-Quote of the Week:-------------------
VoiceNet: (214) 250-3311   Data: -6272  |"Unix doesn't have bugs,
InterNet: jfh@rpp386.Dallas.TX.US       | Unix is a bug"
UucpNet : <backbone>!killer!rpp386!jfh  +--              -- author forgotten --

daveb@geaclib.UUCP (David Collier-Brown) (12/25/88)

From article <10845@swan.ulowell.edu>, by arosen@hawk.ulowell.edu (MFHorn):
> A couple years ago, I had to fix this bug in one of our systems.  I had
> source to mkdir.c, but not to the kernel, and was able to successfully
> close the hole completely.

  Just a sidebar on the presumed safety of Berkeley 4.2:  
while creating old-fashioned lock files on an Ultrix system, I noted
that at least the rename system call was non-atomic (ie, I could
observe both the old and new names for a few repetitions of ls -l),
and wonder if a similar situation applies to the mkdir operation.
  --dave (I'll try it on my sun next year (:-)) c-b
-- 
 David Collier-Brown.  | yunexus!lethe!dave
 Interleaf Canada Inc. |
 1550 Enterprise Rd.   | He's so smart he's dumb.
 Mississauga, Ontario  |       --Joyce C-B