budd@arizona.UUCP (02/27/84)
The following problem seems conceptually easy but subtle to implement. Before I consider it further: has anybody else done anything similar?

Generally speaking, the idea is to suspend a program and then pick it up later, even if later is across login boundaries (three weeks later, for example). The general solution would be to hack the shell to give you these capabilities. Slightly less bothersome would be to have some routine, e.g. suspend_me, which when called produces a something and then gracefully dies. That something can somehow be started up again, acting as if suspend_me had returned and all was normal.

I originally thought suspend_me could produce a new a.out file, but it appears one cannot initialize the stack and registers with that approach. I believe suspend_me will have to produce a core dump, and to restart, something will have to rummage through that core dump and set everything up. adb and sdb do this using ptrace, which would mean creating a child in the image that you would like, resetting the stack and malloc'ed areas one word at a time, starting the child and going away. There must be an easier way. Anybody have any leads?
mark@umcp-cs.UUCP (02/29/84)
The big problem with generalized suspend is what to do with the file descriptors. We have a hack here, written by Rehmi Post, that can reach into the kernel and re-attach file descriptors to previously suspended jobs. But it's real dangerous and no one uses it. If you think of an elegant solution to the file re-attachment problem, let's hear it.

-- Spoken: Mark Weiser ARPA: mark@maryland CSNet: mark@umcp-cs UUCP: {seismo,allegra,brl-bmd}!umcp-cs!mark
dan%sri-tsc@sri-unix.UUCP (02/29/84)
From: Dan Chernikoff <dan@sri-tsc>

It's a little more complicated than that. What you want is a "detached job" capability like the one in TOPS-20. The complexity comes in because many programs "know" what tty you are on, and what the current modes associated with that tty are.

Probably the simplest way to get around this would be to use ptys (pseudo ttys) on every login, in such a way that when a pty gets detached, it cannot be assigned to anyone else until the rightful owner logs on again and does an "attach" to it. With this scheme, all you have to do is suspend all the processes associated with that pty (assuming you have the Berkeley job control code; if not, you are in big trouble), and leave all those processes hanging around out there until the user reattaches the pty.

The problem with this, obviously, is that it will eat up slots in your process table very fast, sigh. It's a great idea, but might be beyond the scope of the UNIX environment, alas. Good luck!

-Dan Chernikoff
walsh%bbn-unix@sri-unix.UUCP (02/29/84)
From: Bob Walsh <walsh@bbn-unix>

One can convert a core dump, an image in the swap area, or the current process into an executable with an initialized data region. However, such an executable cannot always be started to simulate the continuation of the original job, since the meaning of file descriptors other than stdin, stdout, and stderr will have been lost. With shell i/o redirection, even the meaning of those file descriptors may have changed. So the idea will not work for arbitrary programs, but will work for programs prepared to deal with it. I once wrote a cross assembler that did just this in order to avoid re-running initialization code for things like the symbol table each time it was started.

bob walsh
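One way a program might be "prepared to deal with it" is to remember, for each open file, enough to reopen it later. The following is only a minimal sketch in Python, not anything from the cross assembler mentioned above. The function names `save_file_state` and `restore_file_state` and the JSON state file are assumptions made for illustration, and the sketch only handles files opened for binary reading.

```python
import json
import os
import tempfile


def save_file_state(files, state_path):
    """Record the path and read offset of each open file, then close it."""
    records = []
    for f in files:
        records.append({"path": os.path.abspath(f.name), "offset": f.tell()})
        f.close()
    with open(state_path, "w") as out:
        json.dump(records, out)


def restore_file_state(state_path):
    """Reopen each recorded file for reading and seek to the saved offset."""
    with open(state_path) as inp:
        records = json.load(inp)
    files = []
    for rec in records:
        f = open(rec["path"], "rb")
        f.seek(rec["offset"])
        files.append(f)
    return files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = os.path.join(tmp, "data.bin")
        with open(data, "wb") as w:
            w.write(b"0123456789")

        f = open(data, "rb")
        f.read(4)                              # consume part of the file
        state = os.path.join(tmp, "state.json")
        save_file_state([f], state)            # "suspend": descriptors are gone

        (g,) = restore_file_state(state)       # "restart": reopen by name
        print(g.read())                        # prints b'456789'
        g.close()
```

Reopening by name recovers the position but not the identity of the file: if the path has been renamed or replaced in the meantime, the restarted program silently reads something else.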
sjc@mordor.UUCP (03/01/84)
While it does not provide a generalized suspend capability, a program called "undump" sometimes suffices. To use it, you coredump the process that you want to suspend, and then run "undump", converting the core file back to "a.out" format. The disadvantages are:

1. It takes time to coredump and convert a program, so one does not lightly resort to this.

2. The reconstituted program starts at the beginning, no matter where it was executing when it coredumped. Thus, you must design the program with this in mind. It should catch the QUIT signal, continue running until it is in a well-defined state (e.g. with files flushed and closed), record in a static variable the information it will need to restart, and then coredump. At the beginning of the program, one can check this static variable to see whether the program is actually being restarted and, if so, one can "branch forward" (e.g. reopen files, set flags, etc.) to resume.

For an interactive program, another solution is to prohibit QUIT signals, but provide a user command which causes the program to put itself in a well-defined state and coredump. (This can make restarting particularly easy, if your program happens to be in a well-defined, fairly quiescent state when awaiting user input.)

Despite these substantial restrictions, I know of at least two programs which make profitable use of this scheme. One is the TeX text formatter distributed by Richard Furuta (Furuta@WASHINGTON.ARPA, or ...decvax!uw-beaver!uw-june!furuta); the "undump" program comes with it. To install the formatter, you run it, load the standard macro package from a file, coredump it, and undump it. Then you give the undumped version to users, who are spared the nuisance and delay of loading the macro package each time they format a document.

Another example is a program here which lets you load an enormous but rarely changed database, format a display to your liking, and then type a command called "bedtime" to coredump the result. Then you undump that to make a customized version for routine use.

--Steve (S-1 Project, Lawrence Livermore National Laboratory) MILNET: sjc@s1-c UUCP: ...!decvax!decwrl!mordor!sjc
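The "start at the beginning, check whether this is a restart, and branch forward" discipline can be sketched without a core dump at all. The Python sketch below swaps in a pickled checkpoint file for the coredump-and-undump step, so it illustrates the control flow only and is not the undump mechanism itself. The names `expensive_setup`, `startup`, and `bedtime` (borrowed from the command mentioned above) and the state dictionary are hypothetical.

```python
import os
import pickle
import tempfile


def expensive_setup():
    """Stand-in for slow initialization, such as loading a macro package."""
    return {"macros": {"title": "bold", "em": "italic"}, "restarted": False}


def startup(checkpoint):
    """Always entered at the top; branch forward if a checkpoint exists."""
    if os.path.exists(checkpoint):
        with open(checkpoint, "rb") as f:
            state = pickle.load(f)
        state["restarted"] = True      # mark that setup was skipped
        return state
    return expensive_setup()


def bedtime(state, checkpoint):
    """Called only from a quiescent point: save what a restart will need."""
    with open(checkpoint, "wb") as f:
        pickle.dump(state, f)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ckpt = os.path.join(tmp, "demo.ckpt")
        first = startup(ckpt)          # no checkpoint yet: runs the setup
        bedtime(first, ckpt)
        second = startup(ckpt)         # checkpoint found: setup is skipped
        print(first["restarted"], second["restarted"])
```

As with undump, only state the program chose to save survives; open files and anything else external must be re-established in the branch-forward path.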
ed@unisoft.UUCP (03/06/84)
The problems with a generalized restart are not with setting up the stack and such; that's pretty easy if you have a suspend routine to save it all. The real problem is with the external state of the process: open files and such. They're difficult to recreate.

-- Ed Gould ucbvax!mtxinu!ed
gwyn%brl-vld@sri-unix.UUCP (03/07/84)
From: Doug Gwyn (VLD/VMB) <gwyn@brl-vld>

It's even worse than that. The resources such as files being accessed (especially /etc/passwd, which may be partially buffered in the process's data space) will in general have changed by the time you restart the process. In other words, this idea has some merit in appropriate cases but is not useful in the general case. Imagine leaving a lock on a database for several days until the program finishes making its update...
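A restarted program can at least detect that a file it had buffered has changed underneath it. The sketch below, in Python, records a small fingerprint of the file at suspend time and compares it at restart. The helper names `fingerprint` and `changed_since` are made up for illustration, and a matching fingerprint is only evidence, not proof, that the contents are the same.

```python
import os
import tempfile


def fingerprint(path):
    """Return (inode, size, modification time in ns) for the file at path."""
    st = os.stat(path)
    return (st.st_ino, st.st_size, st.st_mtime_ns)


def changed_since(path, saved):
    """True if the file is gone or no longer matches the saved fingerprint."""
    try:
        return fingerprint(path) != saved
    except FileNotFoundError:
        return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "passwd.copy")
        with open(path, "w") as f:
            f.write("root:x:0:0\n")

        saved = fingerprint(path)          # taken when the process suspends
        with open(path, "a") as f:         # someone edits the file meanwhile
            f.write("guest:x:100:100\n")

        if changed_since(path, saved):     # checked when the process restarts
            print("file changed; discard buffered data and re-read")
```

This addresses stale buffers only. It does nothing for the lock example: a lock held across a multi-day suspend still blocks everyone else.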
edhall%rand-unix@sri-unix.UUCP (03/07/84)
From: Ed_Hall <edhall@rand-unix>

    From: Doug Gwyn (VLD/VMB) <gwyn@brl-vld>
    To: Bob Walsh <walsh@bbn-unix>
    cc: arizona!budd@ucb-vax, Unix-Wizards@brl-vgr
    Subject: Re: generalized suspend wanted

    It's even worse than that. The resources such as files being accessed (especially /etc/passwd, which may be partially buffered in the process's data space) will in general have changed by the time you restart the process. In other words, this idea has some merit in appropriate cases but is not useful in the general case. Imagine leaving a lock on a database for several days until the program finishes making its update...

This is a problem with Berkeley-style job control as it now stands. Basically, a program that can't be safely stopped should be set up to hold the stop signals during critical sections. As for frequently changed files, I would suppose that something like an /etc/passwd lookup should also hold stop signals until it reaches a stoppable point. And screen-oriented programs need to reset terminal modes before stopping, and set them again prior to repainting when restarted.

Of course, this means that some programs can't be naive about job control (making it necessary for some programmers not to be naive about job control).

In general, it looks like a generalized suspend would be doable, assuming that the shell is made smart enough to store such things as environment and history in a temporary file, and that programs which depend upon those environment variables are smart enough to look at them again when restarted from such a suspend. Restart would take place from a new login shell, with such things as terminal type and so forth set appropriately for the new session, and the rest taken from the temporary file. Kernel mods would be fairly straightforward, and would essentially involve adding an attach() system call along with new signals for disconnects and restarts.

A virtual terminal interface would simplify things enormously for screen programs, but that is more kernel-hacking than most mortals might wish to attempt. The termcap/curses hacking needed is probably easier.

-Ed Hall ARPA: edhall@rand-unix UUCP: decvax!randvax!edhall
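Holding the stop signals during a critical section can be sketched as follows. This is a modern analogue in Python on a POSIX system, using `signal.pthread_sigmask` rather than the 4.2BSD calls of the period. The context manager name `hold_stop_signals` is an assumption for illustration. Only the keyboard stop signal SIGTSTP is held here; SIGSTOP cannot be blocked at all.

```python
import contextlib
import signal


@contextlib.contextmanager
def hold_stop_signals():
    """Block SIGTSTP while the body runs, then restore the previous mask."""
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGTSTP})
    try:
        yield
    finally:
        # A SIGTSTP that arrived meanwhile is delivered once unblocked here.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


def current_mask():
    """Return the set of currently blocked signals without changing it."""
    return signal.pthread_sigmask(signal.SIG_BLOCK, set())


if __name__ == "__main__":
    with hold_stop_signals():
        # Critical section: e.g. a lookup that must reach a stoppable point.
        print("held:", signal.SIGTSTP in current_mask())
    print("held:", signal.SIGTSTP in current_mask())
```

A stop request typed during the critical section is deferred, not lost, so the program stops as soon as it leaves the block. That is the behavior wanted for a lookup that must not be frozen halfway.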