[net.unix-wizards] generalized suspend wanted

budd@arizona.UUCP (02/27/84)

        The following problem seems conceptually easy, but subtle to
implement - before I consider it further - has anybody else done anything
similar?

        Generally speaking, the idea is to suspend a program and then
pick it up later, even if later is across login boundaries (three weeks
later, for example).  The general solution would be to hack the shell to
give you these capabilities.  Slightly less bothersome would be to
have some routine, eg, suspend_me, which when called produces a something
and then gracefully dies.  That something can somehow be
started up again, acting as if suspend_me returned and all was normal.
I originally thought suspend_me could produce a new a.out file, but
it appears one cannot initialize the stack and registers with that approach.
I believe suspend_me will have to produce a core dump, and to restart something
will have to rummage through that core dump and set everything up.  adb and
sdb do this using ptrace, which would mean creating a child in the image
that you would like, resetting the stack and malloc'ed
areas one word at a time, starting the child and going away.
- there must be an easier way.  Anybody have any leads?

mark@umcp-cs.UUCP (02/29/84)

The big problem with generalized suspend is what to do with the
file descriptors.  We have a hack here, written by Rehmi Post,
that can reach into the kernel and re-attach file descriptors
to previously suspended jobs.  But it's real dangerous and
no one uses it.  If you think of an elegant solution to the
file re-attachment problem, let's hear it.
-- 
Spoken: Mark Weiser 	ARPA:	mark@maryland
CSNet:	mark@umcp-cs 	UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!mark

dan%sri-tsc@sri-unix.UUCP (02/29/84)

From:  Dan Chernikoff <dan@sri-tsc>

It's a little more complicated than that.  What you want is a "detached job"
capability like tops-20.  The complexity comes in because many programs "know"
what tty you are on, and what the current modes associated with that tty are.
Probably the simplest way to get around this would be to use pty's (pseudo
tty's) on every login, in such a way that when a pty gets detached, you can
not assign it to anyone else until the rightful owner logs on again and does
an "attach" to it.  With this scheme, all you have to do is suspend all the
processes associated with that pty (assuming you have the Berkeley job control
code -- if not you are in big trouble), and leave all those processes hanging
around out there until the user reattaches the pty.  The problem with this,
obviously, is that it will eat up slots in your process table very fast, sigh.

It's a great idea, but might be beyond the scope of the UNIX environment, 
alas.

Good luck!

	-Dan Chernikoff

walsh%bbn-unix@sri-unix.UUCP (02/29/84)

From:  Bob Walsh <walsh@bbn-unix>


One can convert a core dump, image in the swap area, or the
current process into an executable with an initialized data
region.  However, such an executable cannot always be started
and simulate the continuation of the original job since the
meaning of file descriptors other than stdin, stdout, and stderr
will have been lost.  With shell i/o redirection, even the
meaning of those file descriptors may have changed.

So, the idea will not work for arbitrary programs, but will work
for programs prepared to deal with it.  I once wrote a cross
assembler that did just this in order to avoid re-running initialization
code for things like the symbol table each time it was started.
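
The same trick can be sketched at the application level.  This is only an
illustration in Python, not the assembler described above: instead of turning
the process image into an executable, it saves the initialized table to a file
and reloads it on later runs.  The names build_symbol_table and load_or_build,
the cache file, and the table contents are all made up for the example.

```python
import os
import pickle
import tempfile


def build_symbol_table():
    """Stand-in for slow initialization; the entries are invented."""
    return {"NOP": 1, "LOAD": 2, "JUMP": 3}


def load_or_build(cache_path):
    """Return (table, restored).

    If a saved table exists, reuse it and skip initialization;
    otherwise build the table and save it for the next run.
    """
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f), True
    table = build_symbol_table()
    with open(cache_path, "wb") as f:
        pickle.dump(table, f)
    return table, False


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "symtab.cache")
    print(load_or_build(path))  # first run builds and saves
    print(load_or_build(path))  # second run restores from the file
```

Only data the program itself chose to save comes back; open files and other
outside state are not part of the saved table, which is the limitation noted
above.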

bob walsh

sjc@mordor.UUCP (03/01/84)

While it does not provide a generalized suspend capability, a program
called "undump" sometimes suffices. To use it, you coredump the process
that you want to suspend, and then run "undump", converting the core file
back to "a.out" format. The disadvantages are:

	1. It takes time to coredump and convert a program, so one
	does not lightly resort to this.

	2. The reconstituted program starts at the beginning, no matter
	where it was executing when it coredumped. Thus, you must
	design the program with this in mind. It should catch the QUIT
	signal, continue running until it is in a well-defined state
	(e.g. with files flushed and closed), record in a static
	variable the information it will need to restart, and then
	coredump.  At the beginning of the program, one can check this
	static variable to see whether the program is actually being
	restarted and, if so, one can "branch forward" (e.g. reopen
	files, set flags, etc.) to resume.

	For an interactive program, another solution is to prohibit QUIT
	signals, but provide a user command which causes the program to
	put itself in a well-defined state and coredump. (This can make
	restarting particularly easy, if your program happens to be in
	a well-defined, fairly quiescent state when awaiting user input.)
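
The restart discipline in point 2 might look roughly like the Python sketch
below.  It is an assumption-laden stand-in, not the undump mechanism: a Python
process cannot undump itself, so a small state file plays the part of the core
image and its static variable.  The function run, its quit_at argument (which
simulates the user sending QUIT at a given step), and the state file are
hypothetical.

```python
import json
import os
import signal
import tempfile

pending = {"quit": False}


def on_quit(signum, frame):
    """Only note the request; the main loop acts on it at a safe point."""
    pending["quit"] = True


def run(state_file, total_steps, quit_at=None):
    """Count to total_steps, resuming from state_file if one exists.

    Returns ("suspended", step) or ("done", step).
    """
    pending["quit"] = False
    signal.signal(signal.SIGQUIT, on_quit)

    # At startup, check whether this is really a restart and, if so,
    # "branch forward" to where the previous run left off.
    step = 0
    if os.path.exists(state_file):
        with open(state_file) as f:
            step = json.load(f)["step"]
        os.remove(state_file)

    while step < total_steps:
        step += 1  # one unit of work
        if quit_at == step:
            signal.raise_signal(signal.SIGQUIT)  # simulate the user's QUIT
        if pending["quit"]:
            # Well-defined state: nothing is half done, so record what is
            # needed to resume and stop.
            with open(state_file, "w") as f:
                json.dump({"step": step}, f)
            return "suspended", step
    return "done", step


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "state.json")
    print(run(demo, 10, quit_at=4))
    print(run(demo, 10))
```

The handler does nothing but set a flag, so the program is never interrupted
in the middle of a step; the save happens only where the loop chooses to look.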

Despite these substantial restrictions, I know of at least two programs
which make profitable use of this scheme. One is the TeX text formatter
distributed by Richard Furuta (Furuta@WASHINGTON.ARPA, or
...decvax!uw-beaver!uw-june!furuta); the "undump" program comes with it.
To install the formatter, you run it, load the standard macro package
from a file, coredump it, and undump it. Then you give the undumped
version to users, who are spared the nuisance and delay of loading the
macro package each time they format a document.  Another example is a
program here which lets you load an enormous but rarely changed
database, format a display to your liking, and then type a command
called "bedtime" to coredump the result. Then you undump that to make a
customized version for routine use.--Steve

	(S-1 Project, Lawrence Livermore National Laboratory)
	MILNET: sjc@s1-c	UUCP: ...!decvax!decwrl!mordor!sjc

ed@unisoft.UUCP (03/06/84)

The problems with a generalized restart are not with setting up the stack
and such; that's pretty easy if you have a suspend routine to save it all.
The real problem is with the external state of the process - open
files and such.  They're difficult to recreate.

-- 
Ed Gould
ucbvax!mtxinu!ed

gwyn%brl-vld@sri-unix.UUCP (03/07/84)

From:      Doug Gwyn (VLD/VMB) <gwyn@brl-vld>

It's even worse than that.  The resources such as files being accessed
(especially /etc/passwd, which may be partially buffered in the process's
data space) will in general have changed by the time you restart the
process.  In other words, this idea has some merit in appropriate cases
but is not useful in the general case.

Imagine leaving a lock on a database for several days until the program
finishes making its update...
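
For a program that is prepared for restart, one partial measure is to record
each open file's name and offset at suspend time, and to refuse to resume if
the file looks different afterwards.  The Python sketch below assumes plain
files opened for binary reading; save_file_state and restore_file are
hypothetical helpers, and the size and modification time check is only a
heuristic.  It does nothing for pipes, terminals, or locks.

```python
import os
import tempfile


def save_file_state(f):
    """Record the path, read offset, and a fingerprint of an open file."""
    st = os.fstat(f.fileno())
    return {
        "path": os.path.abspath(f.name),
        "offset": f.tell(),
        "size": st.st_size,
        "mtime_ns": st.st_mtime_ns,
    }


def restore_file(state):
    """Reopen by name and seek back; refuse if the file has changed."""
    f = open(state["path"], "rb")
    st = os.fstat(f.fileno())
    if (st.st_size, st.st_mtime_ns) != (state["size"], state["mtime_ns"]):
        f.close()
        raise RuntimeError("file changed while suspended: " + state["path"])
    f.seek(state["offset"])
    return f


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "data.txt")
    with open(path, "wb") as w:
        w.write(b"one\ntwo\n")
    f = open(path, "rb")
    f.readline()                 # consume the first line
    saved = save_file_state(f)
    f.close()                    # the "suspend"
    g = restore_file(saved)      # the "restart"
    print(g.readline())          # continues with the second line
    g.close()
```

Data already buffered in the process when it was suspended is not covered by
this check; the program would have to discard and reread it.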

edhall%rand-unix@sri-unix.UUCP (03/07/84)

From:  Ed_Hall <edhall@rand-unix>

    From:     Doug Gwyn (VLD/VMB) <gwyn@brl-vld>
    To:       Bob Walsh <walsh@bbn-unix>
    cc:       arizona!budd@ucb-vax, Unix-Wizards@brl-vgr
    Subject:  Re:  generalized suspend wanted

    It's even worse than that.  The resources such as files being accessed
    (especially /etc/passwd, which may be partially buffered in the process's
    data space) will in general have changed by the time you restart the
    process.  In other words, this idea has some merit in appropriate cases
    but is not useful in the general case.

    Imagine leaving a lock on a database for several days until the program
    finishes making its update...

This is a problem with Berkeley-style job-control as it now stands.
Basically, a program that can't be safely stopped should be set up
to hold the stop signals during critical sections.  As for frequently-
changed files, I would suppose that something like an /etc/passwd
lookup should also hold stop signals until it reaches a stoppable
point.  And screen-oriented programs need to reset terminal modes
before stopping, and set them again prior to repainting when re-
started.
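
Holding the stop signals around a critical section can be sketched as follows.
This is Python on a POSIX system rather than the Berkeley calls themselves,
and hold_stop_signals is a hypothetical helper.  SIGSTOP cannot be blocked, so
only the catchable stop signals are held.

```python
import contextlib
import signal


@contextlib.contextmanager
def hold_stop_signals():
    """Block SIGTSTP, SIGTTIN and SIGTTOU for the duration of the block.

    A stop signal that arrives meanwhile stays pending and takes effect
    once the previous mask is restored.
    """
    held = {signal.SIGTSTP, signal.SIGTTIN, signal.SIGTTOU}
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, held)
    try:
        yield
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == "__main__":
    with hold_stop_signals():
        # Critical section, e.g. a lookup or an update that must not be
        # left half done while the job sits stopped.
        current = signal.pthread_sigmask(signal.SIG_BLOCK, set())
        print(signal.SIGTSTP in current)
```

Restoring the saved mask, rather than unblocking the three signals outright,
keeps the helper safe to use inside code that already had some of them held.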

Of course, this means that some programs can't be naive about job-
control (making it necessary for some programmers not to be naive
about job-control).

In general, it looks like a generalized suspend would be doable
assuming that the shell is made smart enough to store such things
as environment and history in a temporary file, and programs which
depend upon those environment variables are smart enough to look at
them again when restarted from such a suspend; restart would take
place from a new login shell with such things as terminal type and
so forth set appropriately for the new session, and the rest taken
from the temporary file.  Kernel mods would be fairly straight-
forward, and would essentially involve adding an attach() system
call along with new signals for disconnects and restarts.

A virtual terminal interface would simplify things enormously for
screen programs, but that is more kernel-hacking than most mortals
might wish to attempt.  The termcap/curses hacking needed is probably
easier.

		-Ed Hall
ARPA:           edhall@rand-unix
UUCP:           decvax!randvax!edhall