[comp.unix.internals] Undeliverable mail

Postmaster%BRFAPESP.BITNET@uicvm.uic.edu (PMDF Mail Server) (09/04/90)

Your message could not be delivered to:

    Multiple recipients of list UNIX-WIZ <UNIX-WIZ@NDSUVM1.BITNET>

Your message has been enqueued and undeliverable for 12 days.
No further attempts will be made to deliver your message.

Your entire message follows:

Please add me to your user list.  I am with the US Army
Computer Science School.

Thank You
Info Center
USA Computer Science School
AV 780-3245 COMM(404)791-3245

Postmaster%BRFAPESP.BITNET@uicvm.uic.edu (PMDF Mail Server) (09/04/90)

Wizards:

I've felt the need for a new tool that Sun doesn't have.

Ever have a process that's been running for six weeks, and will
need another week to finish, just when you need to make level 0
backups or would like to shut the computer down for a bad storm?

I need a tool that would stop a running process and let it be
restarted at a later date.

I've thought of a couple of ways it might be done:

1.  Processes that you know will run a long time might be run
from a parent, or alongside another process, that periodically saves
an image of the program.  If the machine is stopped, the job can be
restarted from the last checkpoint.

2.  Similarly, a signal could be sent to a process (like SIGINT)
which would stop the process and save a current image.  When you
wished, you could load the saved image and let the process
continue.

3.  Or you could have a ROLLOUT instruction instead of issuing a
halt.  A "picture" of memory is taken (minus space needed to run
the process doing this).  At boot up, you would be given the option
of starting the machine where you stopped.



reply to:
                mth@rolf.stat.uga.edu

on the internet, thanx.

Postmaster%BRFAPESP.BITNET@uicvm.uic.edu (PMDF Mail Server) (09/04/90)

/*
 * This is being run on a Sun SS1 under SunOS 4.0.3.
 * Theoretically, according to The Design and Implementation of the 4.3BSD
 * UNIX Operating System, this should print out the ASCII version of each
 * process's current directory. Instead, it chokes on u->u_cwd->cw_dir;
 * u_cwd is in the u struct in sys/user.h. Any help, suggestions, etc. would
 * be greatly appreciated.
 */

/*
 * cc -o cwd cwd.c -lkvm
 */

#include <stdio.h>
#include <kvm.h>
#include <fcntl.h>
#include <ctype.h>
#include <pwd.h>
#include <sys/param.h>
#include <sys/time.h>
#include <sys/proc.h>
#include <sys/user.h>

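main (argc, argv)
        int argc;
        char **argv;
{
        kvm_t *kd;
        struct proc *proc;
        struct user *user;

        /* Declarations and the open call are reconstructed from the names
           used below; the message text is an assumption. */
        if (!(kd = kvm_open ((char *) 0, (char *) 0, (char *) 0, O_RDONLY, *argv))) {
                (void) fprintf(stderr, "%s: cannot open kernel image\n",
                        *argv);
                exit (1);
        }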

        (void) printf("Name\t\tDir\n");
        kvm_setproc (kd);
        while ((proc = kvm_nextproc (kd)))
                if (proc->p_stat != SZOMB && proc->p_uid) {
                        if (!(user = kvm_getu(kd, proc)))
                                continue;
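                        {
                                /* getpwuid returns NULL for a uid with no passwd entry */
                                struct passwd *pw = getpwuid(proc->p_uid);
                                (void) printf("%s\n", pw ? pw->pw_name : "(unknown)");
                        }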
/* Curtains & Slow Music */
                        (void) printf("%s\n", user->u_cwd->cw_dir);
/* It dies, but the user structure's fine (printing user->u_comm works). I
   stepped through it with gdb and discovered that the pointer user->u_cwd is
   off in never-never-land. Is it a valid entry in the user structure? */
                }
}
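
A likely cause, offered as an assumption rather than something confirmed
in this thread: kvm_getu copies the u-area into the caller's memory, but
a pointer stored inside it (such as u_cwd) still holds a kernel address.
Dereferencing it in user space reads garbage or faults; each hop has to
be copied in separately, which in libkvm would mean a kvm_read call per
pointer. The sketch below only simulates that pattern so it can run
anywhere. fake_kmem, fake_read, struct fake_cwd and fetch_cwd are all
hypothetical stand-ins, not the real kernel layout or the libkvm API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a foreign address space: "addresses" are offsets into
 * this array, never host pointers. */
static unsigned char fake_kmem[256];

/* Stand-in for a kvm_read-style call: copy nbytes from foreign address
 * addr into buf. Returns bytes copied, or -1 if the range is invalid. */
static long fake_read(unsigned long addr, void *buf, size_t nbytes)
{
    if (addr > sizeof fake_kmem || nbytes > sizeof fake_kmem - addr)
        return -1;
    memcpy(buf, fake_kmem + addr, nbytes);
    return (long) nbytes;
}

/* Hypothetical record living in the foreign space. dir_addr is itself a
 * foreign address, so reaching the string takes a second copy-in. */
struct fake_cwd {
    unsigned long dir_addr;
    unsigned long dir_len;      /* length including the trailing NUL */
};

/* Follow cwd_addr -> struct fake_cwd -> string, copying at each hop
 * instead of dereferencing. Returns 0 on success, -1 on any failure. */
static int fetch_cwd(unsigned long cwd_addr, char *out, size_t outlen)
{
    struct fake_cwd cw;

    assert(out != NULL);
    if (fake_read(cwd_addr, &cw, sizeof cw) != (long) sizeof cw)
        return -1;
    if (cw.dir_len == 0 || cw.dir_len > outlen)
        return -1;
    if (fake_read(cw.dir_addr, out, (size_t) cw.dir_len) != (long) cw.dir_len)
        return -1;
    out[cw.dir_len - 1] = '\0';     /* never trust foreign termination */
    return 0;
}
```

In the program above, the equivalent would be to copy the structure that
u_cwd points at into a local buffer, then copy the string that cw_dir
points at, checking the return value each time.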

Postmaster%BRFAPESP.BITNET@uicvm.uic.edu (PMDF Mail Server) (09/04/90)

Mark Holcomb <mth@ROLF.STAT.UGA.EDU> writes:

> I've felt the need for a new tool that Sun doesn't have.

> Ever have a process that's been running for six weeks, and will
> need another week to finish, just when you need to make level 0
> backups or would like to shut the computer down for a bad storm?

> I need a tool that would stop a running process and let it be
> restarted at a later date.

> I've thought of a couple of ways it might be done:

[A number of suggestions deleted]

...

A general solution would have to re-establish and re-position all open
files, sockets, message queues, pipes, semaphores, shared-memory segments,
environment variables, (add your favourite externally visible entity) to
exactly the same state as they were previously.

Once you've done this, it's a simple matter of re-constructing your memory
image.

Finally, you have to hope that none of the code in your program has
stashed the PID or date away in memory somewhere as these may be different
when you next restart the prog :-)
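
For a sense of scale, here is a rough sketch of the easiest item on that
list: recording where a plain file was positioned and getting back there
after a restart. struct saved_file, save_file and restore_file are
made-up names for illustration. It assumes the file still exists under
the same name and has not changed in the meantime, and it does nothing
for pipes or sockets, where lseek simply fails.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record of one open file, taken at checkpoint time. */
struct saved_file {
    char  path[256];    /* name the file was opened with */
    int   flags;        /* open(2) flags to reuse on restart */
    off_t offset;       /* position reported by lseek */
};

/* Record the current offset of fd. Returns 0 on success, -1 on error
 * (for example when fd is a pipe or socket and cannot seek). */
static int save_file(int fd, const char *path, int flags,
                     struct saved_file *sf)
{
    off_t pos = lseek(fd, (off_t) 0, SEEK_CUR);

    assert(path != NULL && sf != NULL);
    if (pos == (off_t) -1)
        return -1;
    if (strlen(path) >= sizeof sf->path)
        return -1;
    strcpy(sf->path, path);
    sf->flags = flags;
    sf->offset = pos;
    return 0;
}

/* Reopen the file and seek back to the saved position. Returns the new
 * descriptor, or -1 on error. Creation flags are masked off so that a
 * restart never truncates or recreates the file. */
static int restore_file(const struct saved_file *sf)
{
    int fd = open(sf->path, sf->flags & ~(O_CREAT | O_TRUNC | O_EXCL));

    if (fd < 0)
        return -1;
    if (lseek(fd, sf->offset, SEEK_SET) == (off_t) -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The descriptor that comes back may not have the same number as before,
so a restart routine would also need dup2 to put it where the program
expects it. Every other kind of entity in the list needs its own,
harder, version of this.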

Seriously, doing this in any substantive manner is difficult and I'm sure
it would be virtually impossible to bullet-proof it on UNIX.

When confronted with this array of problems, most people opt for
individualized, per-program solutions for those progs that run for long
periods.
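
A per-program solution of that kind might look roughly like the sketch
below: the program keeps everything it needs to resume in one small
structure, a signal asks it to save that structure and stop, and the
next run picks up from the saved copy. The job itself (summing 1..limit),
the choice of SIGUSR1, and all of the names (struct job_state,
save_state, load_state, run_job, install_checkpoint_signal) are
assumptions made for illustration.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>

/* Hypothetical job state: everything needed to resume the computation. */
struct job_state {
    long   next;    /* next value to process */
    double sum;     /* partial result so far */
};

static volatile sig_atomic_t checkpoint_requested = 0;

/* The handler only sets a flag; the main loop does the real work. */
static void on_signal(int signo)
{
    (void) signo;
    checkpoint_requested = 1;
}

static int install_checkpoint_signal(void)
{
    return signal(SIGUSR1, on_signal) == SIG_ERR ? -1 : 0;
}

/* Write to a temporary file, then rename it into place, so a crash in
 * mid-write never leaves a truncated checkpoint behind. */
static int save_state(const char *path, const struct job_state *st)
{
    char tmp[1024];
    FILE *fp;

    if (snprintf(tmp, sizeof tmp, "%s.tmp", path) >= (int) sizeof tmp)
        return -1;
    if ((fp = fopen(tmp, "wb")) == NULL)
        return -1;
    if (fwrite(st, sizeof *st, 1, fp) != 1) {
        fclose(fp);
        remove(tmp);
        return -1;
    }
    if (fclose(fp) != 0) {
        remove(tmp);
        return -1;
    }
    return rename(tmp, path) == 0 ? 0 : -1;
}

/* Returns 0 if a usable checkpoint was loaded, -1 otherwise. */
static int load_state(const char *path, struct job_state *st)
{
    FILE *fp = fopen(path, "rb");
    int ok;

    if (fp == NULL)
        return -1;
    ok = fread(st, sizeof *st, 1, fp) == 1;
    fclose(fp);
    return ok ? 0 : -1;
}

/* Sum 1..limit, resuming from a checkpoint if one exists. Returns 1 when
 * finished, 0 if it saved and stopped on request, -1 if the save failed. */
static int run_job(const char *path, long limit, double *result)
{
    struct job_state st;

    assert(path != NULL && result != NULL);
    if (load_state(path, &st) != 0) {
        st.next = 1;
        st.sum = 0.0;
    }
    for (; st.next <= limit; st.next++) {
        if (checkpoint_requested) {
            checkpoint_requested = 0;
            return save_state(path, &st) == 0 ? 0 : -1;
        }
        st.sum += (double) st.next;
    }
    *result = st.sum;
    remove(path);       /* finished: the checkpoint is no longer needed */
    return 1;
}
```

Before the backup or the storm, something like kill -USR1 on the job's
PID would make run_job save and return 0; running the program again
later continues from the saved state. Nothing here restores open files
or anything else outside the structure, which is exactly why it only
works program by program.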


Mark D.