[comp.unix.wizards] System V job control idea

brandon@tdi2.UUCP (Brandon Allbery) (04/13/87)

Recently it occurred to me that there exists a form of simple job control
under every version of UNIX since the Seventh Edition (at least).  It's
called ptrace().

A shell could fork a child which does a ptrace(0) then execs the intended
target program.  The parent (after continuing the child from the SIGTRAP
stop caused by the exec) then can wait() for the child; SIGQUIT (for example)
can be usurped as the stop character and used by the shell as a special
case for continuation.
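
A minimal sketch of that scheme, assuming the Linux flavor of ptrace(2)
(request names and argument conventions differ on V7 and System V).  The
helper names spawn_traced(), wait_job() and resume_job() are made up for
illustration, and every SIGTRAP is simply swallowed, which a real shell
might not want:

```c
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that asks to be traced and then execs argv[0].
 * Returns the child's pid, or -1 if fork() fails. */
pid_t spawn_traced(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);
    pid_t pid = fork();
    if (pid == 0) {
        /* The "ptrace(0)" of the post: request 0 is PTRACE_TRACEME. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(126);
        execvp(argv[0], argv);
        _exit(127);                   /* exec failed */
    }
    return pid;
}

/* Wait until the job exits or is "suspended" by SIGQUIT.
 * Returns the wait() status, or -1 on error. */
int wait_job(pid_t pid)
{
    int status;
    for (;;) {
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (!WIFSTOPPED(status))
            return status;            /* exited or killed */
        int sig = WSTOPSIG(status);
        if (sig == SIGQUIT)
            return status;            /* usurped stop character: stay stopped */
        if (sig == SIGTRAP)
            sig = 0;                  /* stop caused by the exec: do not deliver */
        /* Any other signal is handed back to the child unchanged. */
        if (ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig) < 0)
            return -1;
    }
}

/* Continue a suspended job, discarding the SIGQUIT, and wait again. */
int resume_job(pid_t pid)
{
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) < 0)
        return -1;
    return wait_job(pid);
}
```

A shell built this way would call wait_job() after starting each foreground
command and resume_job() from an fg-style builtin.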

Aside from the inhibition of setuid (which should be reconsidered for this
application, maybe; what kind of ``fraud'' is it designed to prevent?), are
there any reasons this won't work?  (I know, no SIGTTIN/SIGTTOU.  Oh well.)

++Brando
-- 
Brandon S. Allbery	           UUCP: cbatt!cwruecmp!ncoast!tdi2!brandon
Tridelta Industries, Inc.         CSNET: ncoast!allbery@Case
7350 Corporate Blvd.	       INTERNET: ncoast!allbery%Case.CSNET@relay.CS.NET
Mentor, Ohio 44060		  PHONE: +1 216 255 1080 (home) +1 216 974 9210

henry@utzoo.UUCP (Henry Spencer) (05/02/87)

> Aside from the inhibition of setuid (which should be reconsidered for this
> application, maybe; what kind of ``fraud'' is it designed to prevent?)...

The obvious kind:  modifying the code of a setuid program.

Note that being able to suspend a setuid program is in itself a security
defect (the program may be in the middle of updating a database, may have
things locked, etc.), so being unable to run setuid programs in such a
setup isn't necessarily a flaw.
-- 
"If you want PL/I, you know       Henry Spencer @ U of Toronto Zoology
where to find it." -- DMR         {allegra,ihnp4,decvax,pyramid}!utzoo!henry

hutch@sdcsvax.UCSD.EDU (Jim Hutchison) (05/03/87)

In article <7987@utzoo.UUCP> henry@utzoo.UUCP (Henry Spencer) writes:
>> Aside from the inhibition of setuid (which should be reconsidered for this
>> application, maybe; what kind of ``fraud'' is it designed to prevent?)...

>The obvious kind:  modifying the code of a setuid program.

Adb!  No wonder sysV does not come with it, it's a security hole! :-)

>Note that being able to suspend a setuid program is in itself a security
>defect (the program may be in the middle of updating a database, may have
>things locked, etc.), so being unable to run setuid programs in such a
>setup isn't necessarily a flaw.

On the other hand, setuid programs which allow you to suspend themselves
during a crucial period are already flawed.  If it is important to do something
without getting interrupted by something of a lesser nature, kerneloids put
in an spl() of the appropriate level.  In setuid programs you put in a signal
handler which says "Just a minute, I'm in the bathroom.", or some such, in
order to not get caught with your shorts down. :-)

Agreed, modifying the external environment does change the environment for
which the programs were originally targeted, and thus makes this an unfair
request. :-( The creature you see as a flaw is not really in the ability to
suspend a setuid program, just doing it at a bad time (ignoring adb &c.).
To nail it down, you can "suspend" an su'd shell; you aren't proposing rm'ing
that, are you?  Of course not.

-- 
    Jim Hutchison   		UUCP:	{dcdwest,ucbvax}!sdcsvax!hutch
		    		ARPA:	Hutch@sdcsvax.ucsd.edu
Disklame'r:
    One greater than the greatest signature representable with 184 symbols.

mouse@mcgill-vision.UUCP (05/06/87)

In article <337@tdi2.UUCP>, brandon@tdi2.UUCP (Brandon Allbery) writes:
> Recently it occurred to me that there exists a form of simple job
> control under every version of UNIX since the Seventh Edition (at
> least).  It's called ptrace().

A very interesting notion.  Probably worth at least following up
somewhat.

> Aside from the inhibition of setuid (which should be reconsidered for
> this application, maybe; what kind of ``fraud'' is it designed to
> prevent?),

Fraud like you run a setuid program, eg /bin/passwd, stop it before it
does anything, patch the text segment to do execl("/bin/csh"), and thus
get yourself a root shell.  The whole point of a setuid program is that
it is trusted to not abuse its privileges.  The ability to scribble on
the text segment, or, for most programs, the data and/or stack
segments, without inhibiting the setuid property, opens up security
holes you have to hang onto the edge to keep from falling through.

Possibly the shell could check for setuid and do the ptrace() trick
only for non-setuid programs.  There are comparatively few setuid
programs, and anyway, it's a lot better than nothing.  (If it works,
that is.)

					der Mouse

				(mouse@mcgill-vision.uucp)

jlo@elan.UUCP (Jeff Lo) (05/12/87)

In article <757@mcgill-vision.UUCP>, mouse@mcgill-vision.UUCP (der Mouse) writes:
> In article <337@tdi2.UUCP>, brandon@tdi2.UUCP (Brandon Allbery) writes:
> > Recently it occurred to me that there exists a form of simple job
> > control under every version of UNIX since the Seventh Edition (at
> > least).  It's called ptrace().
> 
> A very interesting notion.  Probably worth at least following up
> somewhat.

When I was working at HP, and was using HP-UX, a System V variant, I tried
this idea out. I didn't have time to work out all the bugs, but I could
stop running jobs by starting jobs from a shell with ptrace(). I would then
intercept any signals (the signal can be found by checking the status from
wait()) and restart the process with the same signal unless it was SIGQUIT.
In that case I would just leave the job stopped. It could then be restarted,
ignoring the signal. In case I really wanted a core dump, which couldn't be
generated from the keyboard without SIGQUIT, I added a function to the shell
which would restart a stopped job with SIGQUIT. I never had time to really clean
everything up, and now I'm back to a BSD system, so I'll probably never
finish it.
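
The per-signal decision described above can be sketched as a small pure
function.  This is an illustration only, not the HP-UX code; the enum and
function names are hypothetical, and the status is decoded with the POSIX
macros from <sys/wait.h>:

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/wait.h>

enum job_action {
    JOB_FINISHED,   /* child exited or was killed; nothing to restart */
    JOB_SUSPEND,    /* stopped by SIGQUIT: leave it stopped */
    JOB_FORWARD     /* stopped by another signal: restart with that signal */
};

/* Classify a status returned by wait() for a traced child.
 * *sig_out receives the signal to hand back when the action is
 * JOB_FORWARD, and 0 otherwise. */
enum job_action classify_status(int status, int *sig_out)
{
    assert(sig_out != NULL);
    *sig_out = 0;
    if (!WIFSTOPPED(status))
        return JOB_FINISHED;
    if (WSTOPSIG(status) == SIGQUIT)
        return JOB_SUSPEND;
    *sig_out = WSTOPSIG(status);
    return JOB_FORWARD;
}

/* Signal to pass when restarting a job that was suspended by SIGQUIT:
 * 0 to ignore the SIGQUIT, or SIGQUIT itself to force a core dump. */
int restart_signal(int want_core)
{
    return want_core ? SIGQUIT : 0;
}
```

The value left in *sig_out, or the result of restart_signal(), is what the
shell would pass as the signal argument when it continues the child through
ptrace().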

If anyone decides to try this themselves, I'd be interested in seeing how
it works out. Here are some things to watch out for. Just allowing a user
to run processes, stop them, and restart them in the foreground isn't that
hard. If you want to be able to move a job from running in the foreground
to running in the background there are some problems. There are no SIGTTOU
or SIGTTIN signals in SysV. The first one isn't really necessary; not having
it will just allow a background process to write all over the terminal while
something else is running, as with "stty -tostop". However, if a process
tries to read something from the terminal, what happens? Without SIGTTIN to
tell you this is happening, my guess is it will fight with the other
processes reading from the terminal for input. The one way I have thought of
to get around this is to use ptrace() again. Set a breakpoint at _read and
check the stack for file descriptor 0. You will also have to parse the
command line to see if stdin was redirected from somewhere else, and you
still won't know if fd 0 was closed and then reopened to something other than
the terminal. And this much works only if there is a symbol table in the
binary to find the address of _read from. Things like select() may also get
confused. Using ptrace() in this way may also slow down execution a lot.
There is no penalty to just run and intercept signals, but breakpoints may
slow it down considerably.

					Jeff Lo
					Elan Computer Group
					..!{ames,hplabs}!elan!jlo

Karl.Kleinpaste@cbstr1.UUCP (05/13/87)

I really wish the original article had gotten here; I haven't any
record of it at all in my history file.

I implemented this job control emulation scheme in a SysV-compatible
version of csh.  It's been running on my dept's machines for over 2
years now.  It is far from perfect; as has been mentioned, lack of
SIGTTIN/OUT makes life difficult for fg/bg jobs competing for terminal
input.  I have not attempted to resolve that (and breakpointing every
read(2) is not a thought which thrills me, sorry).  However, for most
general issues, it works quite well.

Some other limitations which I have found include: Lots of programs
don't cope with EINTR returns from system calls (especially read(2)
and wait(2)) well.  Cat(1) is the most obvious example - it just rolls
over dead if it ever gets a non-positive result from read(2).  If I
have cat(1) reading from stdin, and I stop it using SIGQUIT as an
emulated SIGTSTP on ^Z, it stops fine, but then on restart it just
dies.
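
The usual defence on the program side is to retry the call when it fails
with EINTR.  A minimal sketch, where read_retry() is a hypothetical wrapper
and not something cat(1) or any other program in this thread contains:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* read(2) that retries when the call is interrupted by a signal.
 * Returns what read() returns: bytes read, 0 at end of file,
 * or -1 with errno set for any error other than EINTR. */
ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;
    assert(buf != NULL);
    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);
    return n;
}
```

A filter written this way treats only a return of 0 as end of input, so being
stopped and restarted in the middle of a read does not kill it.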

Also, the entire job control concept in this implementation fails for
processes which fork subprocesses.  The shell can only control its
immediate children.  If those children create grandchildren, they are
susceptible to SIGQUIT core dumps just as any ordinary process might
be.  This makes the use of make(1) difficult; I turn job control
emulation off when using make(1).  (Make(1) is also a program which
doesn't cope with EINTR returns from wait(2) - "bad wait status.")

Lastly, SIGCLD is not generated properly.  There is a simple
modification which I have made to our VAXen kernels to correct this (I
consider it a bona fide bug), but the problem is that the stop()
routine in os/ptrace.c doesn't actually issue a signal; rather, it
just wakes up the parent, in the apparent hope that the parent is
already doing something wake-able (presumably wait(2)) and will
notice.  I changed it to a psignal(pp, SIGCLD) call, which is The
Right Way.

The question of fraud-defense on setuid programs is very easy to
conquer, though: When exec'ing the program, the last thing done before
exec is to check for setuid-ness.  If the setuid/setgid bits are on,
no ptrace(2) call is made to "turn on job control" for the process,
and then the process behaves just like any other setuid program, and
is not stoppable.  Su(1) and uucico(1) run just fine and don't lose
their setuid attributes.
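
A sketch of that check, assuming it is done with stat(2) on the resolved
path just before the exec.  The names mode_is_setid(), is_setid() and
exec_job() are hypothetical, the ptrace() call uses the Linux request name,
and the window between the stat() and the exec is ignored here:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* 1 if either the setuid or the setgid bit is set in mode, else 0. */
int mode_is_setid(mode_t mode)
{
    return (mode & (S_ISUID | S_ISGID)) != 0;
}

/* 1 if path is setuid/setgid, 0 if it is not, -1 if it cannot be stat'ed. */
int is_setid(const char *path)
{
    struct stat st;
    assert(path != NULL);
    if (stat(path, &st) < 0)
        return -1;
    return mode_is_setid(st.st_mode);
}

/* Child side of the shell's fork(): turn on tracing only for ordinary
 * programs, then exec.  Returns only if the exec fails. */
void exec_job(const char *path, char *const argv[])
{
    if (is_setid(path) == 0)
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);   /* "job control" on */
    execv(path, argv);
}
```

With this arrangement a setuid program is never traced, so it keeps its
privileges and simply cannot be stopped from the keyboard.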

The best part is that, when using GNU Emacs, when I hit ^Z, it doesn't
just fork a subshell like GNU Emacs is supposed to under SysV - for me
it actually stops.  I like that an awful lot.  (That's what the
USG_JOBCTRL #ifdef in sysdep.c is for.)

Karl

brandon@tdi2.UUCP (Brandon Allbery) (05/25/87)

I see the Great Net Timewarp has recycled a message of mine.  (sigh.)

Quoted from <165@elan.UUCP> ["Re: System V job control idea"], by jlo@elan.UUCP (Jeff Lo)...
+---------------
| In article <757@mcgill-vision.UUCP>, mouse@mcgill-vision.UUCP (der Mouse) writes:
| > In article <337@tdi2.UUCP>, brandon@tdi2.UUCP (Brandon Allbery) writes:
| > > Recently it occurred to me that there exists a form of simple job
| > > control under every version of UNIX since the Seventh Edition (at
| > > least).  It's called ptrace().
| > 
| > A very interesting notion.  Probably worth at least following up
| > somewhat.
| 
| tell you this is happening, my guess is it will fight with the other
| processes reading from the terminal for input. The one way I have thought of
| to get around this is to use ptrace() again. Set a breakpoint at _read and
| check the stack for file descriptor 0. You will also have to parse the
| command line to see if stdin was redirected from somewhere else, and you
| still won't know if fd 0 was closed and then reopened to something other than
| the terminal. And this much works only if there is a symbol table in the
| binary to find the address of _read from. Things like select() may also get
| confused. Using ptrace() in this way may also slow down execution a lot.
| There is no penalty to just run and intercept signals, but breakpoints may
| slow it down considerably.
+---------------

(1) I was informed in no uncertain terms that using ptrace() at all imposed
an execution penalty.  (I should take it with a grain of salt, as I was also
told in no uncertain terms that all signal handling would be screwed up; but
I did read the manual on ptrace(), and it can be made to handle all signals
very easily.)

(2) Your read trick is, to say the least, a kludge.  Unfortunately, short of
SIGTTIN there is no fix.

(3) No select().  Poll(), maybe, iff sVr3 (we're sVr2).

I have decided to skip sV job control in favor of pushing harder on a certain
computer company to get off its *ss and implement sxt's.  Also I am planning
to add sxts to Minix, and may add some form of job control IF I can work out
a clean way of doing it.  (If that certain computer company chooses not to
respond, we may end up doing business with their net-neighbor instead....)

++Brando
-- 
Brandon S. Allbery	           UUCP: cbatt!cwruecmp!ncoast!tdi2!brandon
Tridelta Industries, Inc.         CSNET: ncoast!allbery@Case
7350 Corporate Blvd.	       INTERNET: ncoast!allbery%Case.CSNET@relay.CS.NET
Mentor, Ohio 44060		  PHONE: +1 216 255 1080 (home +1 216 974 9210)

xsimon@its63b.UUCP (05/26/87)

In article <359@tdi2.UUCP> brandon@tdi2.UUCP (Brandon Allbery) writes:
>
>I have decided to skip sV job control in favor of pushing harder on a certain
>computer company to get off its *ss and implement sxt's.  Also I am planning
>to add sxts to Minix, and may add some form of job control IF I can work out
>a clean way of doing it.
>
>++Brando

Well, getting sxt's isn't really *skipping* job control as such, it's just
an alternative (to signals) way of implementing it. In fact, I did post a
set of context diffs to comp.sources.unix a while back for converting the
System V Bourne shell into an sxt job-control version (functionally compatible
with csh's BSD job control, almost), but it doesn't seem to have escaped
from Rich Salz yet. Perhaps it will sometime...

*=Simon


-- 
----------------------------------
| Simon Brown 		         | UUCP:  seismo!mcvax!ukc!{its63b,cstvax}!simon
| Department of Computer Science | JANET: simon@uk.ac.ed.{its63b,cstvax}
| University of Edinburgh,       | ARPA:  simon%{its63b,cstvax}.ed.ac.uk ...
| Scotland, UK.			 |				@cs.ucl.ac.uk
----------------------------------	 "Life's like that, you know"

kre@munnari.UUCP (05/30/87)

In article <434@its63b.ed.ac.uk>, xsimon@its63b.ed.ac.uk (Simon Brown) writes:
> Well, getting sxt's isn't really *skipping* job control as such, it's just
> an alternative (to signals) way of implementing it.

No, it's not.  What you can do with sxt's is implement poor man's
windows.  You can't implement job control.  You can use job
control to implement poor man's windows too, which is what csh
does, and because of that a lot of people equate job control
and poor man's windows.

If you have job control, a suitably authorised user (ie: root)
can pick a random process and stop it, to be continued later.

I do this from time to time when we're suffering from excessive
paging, or some such .. just find a big process, and stop it.
After the load diminishes, it can be continued without realizing
it was ever touched.  If it happened to be interactive, then
the user just gets a "stopped" message, probably continues it
immediately, and I go pick on someone else...
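
On systems with BSD-style signals, the stop and continue described here come
down to two kill(2) calls.  A minimal sketch, assuming SIGSTOP and SIGCONT
exist (they do not on the System V releases discussed earlier in the
thread); suspend_pid(), resume_pid() and child_stopped() are hypothetical
names:

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stop a process; SIGSTOP cannot be caught or ignored.  Returns 0 or -1. */
int suspend_pid(pid_t pid)
{
    assert(pid > 0);
    return kill(pid, SIGSTOP);
}

/* Let a stopped process run again.  Returns 0 or -1. */
int resume_pid(pid_t pid)
{
    assert(pid > 0);
    return kill(pid, SIGCONT);
}

/* For the parent of the process only: block until the child changes state.
 * Returns 1 if it is reported stopped, 0 if it terminated instead,
 * -1 on error. */
int child_stopped(pid_t child)
{
    int status;
    if (waitpid(child, &status, WUNTRACED) < 0)
        return -1;
    return WIFSTOPPED(status) ? 1 : 0;
}
```

The caller needs permission to signal the target, which is why this is
normally a job for root or for the owner of the process.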

This is job control; please don't confuse one application of
job control with the real thing.

kre

gnu@hoptoad.UUCP (06/04/87)

In article <1662@munnari.oz>, kre@munnari.oz (Robert Elz) writes:
> If you have job control, a suitably authorised user (ie: root)
> can pick a random process and stop it, to be continued later.

I wish this was true.  Trouble is, the parent of that process will
receive a report that it has stopped.  If the parent is a csh, there is
no problem; but if the parent is init, or a Bourne shell, I have seen
it end up killing the job that I had stopped.  I never ended up poking
around enough to find out why.  (This on various versions of SunOS.)
-- 
Copyright 1987 John Gilmore; you may redistribute only if your recipients may.
(This is an effort to bend Stargate to work with Usenet, not against it.)
{sun,ptsfa,lll-crg,ihnp4,ucbvax}!hoptoad!gnu	       gnu@ingres.berkeley.edu

gwyn@brl-smoke.ARPA (Doug Gwyn) (06/05/87)

In article <2245@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>In article <1662@munnari.oz>, kre@munnari.oz (Robert Elz) writes:
>> If you have job control, a suitably authorised user (ie: root)
>> can pick a random process and stop it, to be continued later.
>I wish this was true.  Trouble is, the parent of that process will
>receive a report that it has stopped.

Elz may have had in mind sensible job control facilities, not 4BSD.
E.g., /proc filesystem.
After several years of fighting more than a half-dozen small but
important changes to the way terminals, process groups, vhangup,
etc. operate and more importantly interact, I'm now convinced
that the 4BSD implementation of job control is not the right approach.

xsimon@its63b.UUCP (06/06/87)

In article <1662@munnari.oz>, kre@munnari.oz (Robert Elz) writes:
> If you have job control, a suitably authorised user (ie: root)
> can pick a random process and stop it, to be continued later.

Well, surely the phrase "job control" implies an ability to ``control'' jobs,
not just a simple stop/resume ability. In particular, any system call that
at present can affect only the current process should, in a "full" job-control
environment, be able to control any named process or job, for a suitably
authorized controlling process, of course. (I guess "process group" is the
closest thing to the concept of "job" that there is at the moment.)
For example, things like setuid(), signal(), open(), dup(), close(), etc.
would all have to have job-control variants to "induce" the system call in
the named process or job.
Of course, the complexity of implementing such a scheme would be incredible!
-- 
----------------------------------
| Simon Brown 		         | UUCP:  seismo!mcvax!ukc!{its63b,cstvax}!simon
| Department of Computer Science | JANET: simon@uk.ac.ed.{its63b,cstvax}
| University of Edinburgh,       | ARPA:  simon%{its63b,cstvax}.ed.ac.uk ...
| Scotland, UK.			 |				@cs.ucl.ac.uk
----------------------------------	 "Life's like that, you know"