Postmaster%BRFAPESP.BITNET@uicvm.uic.edu (PMDF Mail Server) (09/04/90)
Your message could not be delivered to:

    Multiple recipients of list UNIX-WIZ <UNIX-WIZ@NDSUVM1.BITNET>

Your message has been enqueued and undeliverable for 12 days. No further attempts will be made to deliver your message.

Your entire message follows:

Please add me to your user list. I am with the US Army Computer Science School.

Thank You

Info Center
USA Computer Science School
AV 780-3245 COMM(404)791-3245
Postmaster%BRFAPESP.BITNET@uicvm.uic.edu (PMDF Mail Server) (09/04/90)
Your message could not be delivered to:

    Multiple recipients of list UNIX-WIZ <UNIX-WIZ@NDSUVM1.BITNET>

Your message has been enqueued and undeliverable for 12 days. No further attempts will be made to deliver your message.

Your entire message follows:

Wizards:

I've felt the need for a new tool that Sun doesn't have. Ever have a process that's been running for six weeks, and will need another week to finish, when you need to make level 0 backups or would like to shut the computer down for a bad storm? I need a tool that would stop a running process and let it be restarted at a later date.

I've thought of a couple of ways it might be done:

1. Processes that you know will run a long time might be run from a parent, or with another process, that periodically saves an image of the program. If the machine is stopped, the job can be restarted from the last checkpoint.

2. Similarly, a signal could be sent to a process (like SIGINT) which would stop the process and save a current image. When you wished, you could start the process again by loading the image and continuing the process.

3. Or you could have a ROLLOUT instruction instead of issuing a halt. A "picture" of memory is taken (minus the space needed to run the process doing this). At boot up, you would be given the option of starting the machine where you stopped.

reply to: mth@rolf.stat.uga.edu on the internet, thanx.
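Idea 2 can at least be approximated inside a program that cooperates. The following is a minimal sketch, not a general process-image tool: it assumes the job can describe everything it needs to resume in one small struct, and all names (`job_state`, `install_checkpoint_handler`, `save_state`, `load_state`, `run_job`) are hypothetical. The signal handler only sets a flag; the main loop notices the flag at a safe point, writes the state to a file, and stops.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>

/* Hypothetical job state: everything this toy job needs in order to resume. */
struct job_state {
    long   next_step;   /* first step that has not been done yet */
    double partial;     /* running result so far */
};

/* Set by the signal handler, polled by the main loop. */
static volatile sig_atomic_t checkpoint_requested = 0;

static void on_checkpoint_signal(int signo)
{
    (void)signo;
    checkpoint_requested = 1;   /* only set a flag; do the real work later */
}

/* Ask for a checkpoint on SIGINT. Returns 0 on success, -1 on failure. */
int install_checkpoint_handler(void)
{
    return signal(SIGINT, on_checkpoint_signal) == SIG_ERR ? -1 : 0;
}

/* Write the state to path. Returns 0 on success, -1 on failure. */
int save_state(const char *path, const struct job_state *s)
{
    FILE *f = fopen(path, "wb");
    int ok;

    if (f == NULL)
        return -1;
    ok = fwrite(s, sizeof *s, 1, f) == 1;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read the state back from path. Returns 0 on success, -1 on failure. */
int load_state(const char *path, struct job_state *s)
{
    FILE *f = fopen(path, "rb");
    int ok;

    if (f == NULL)
        return -1;
    ok = fread(s, sizeof *s, 1, f) == 1;
    fclose(f);
    return ok ? 0 : -1;
}

/*
 * Run the job from s->next_step up to total_steps.
 * Returns 1 if the job finished, 0 if it saved a checkpoint and stopped,
 * -1 if a checkpoint was requested but could not be written.
 */
int run_job(const char *path, struct job_state *s, long total_steps)
{
    assert(path != NULL && s != NULL);
    while (s->next_step < total_steps) {
        s->partial += (double)s->next_step;   /* stand-in for real work */
        s->next_step++;
        if (checkpoint_requested) {           /* safe point between steps */
            checkpoint_requested = 0;
            return save_state(path, s) == 0 ? 0 : -1;
        }
    }
    return 1;
}
```

A driver would call `install_checkpoint_handler()` once, try `load_state()` to pick up an earlier checkpoint (starting from a zeroed `job_state` if there is none), and then call `run_job()`. This saves only what the program chooses to save, so it sidesteps, rather than solves, the problem of capturing an arbitrary process image.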
Postmaster%BRFAPESP.BITNET@uicvm.uic.edu (PMDF Mail Server) (09/04/90)
Your message could not be delivered to:

    Multiple recipients of list UNIX-WIZ <UNIX-WIZ@NDSUVM1.BITNET>

Your message has been enqueued and undeliverable for 12 days. No further attempts will be made to deliver your message.

Your entire message follows:

/*
 * This is being run on a Sun SS1 under 4.0.3.
 * Theoretically, according to the Design & Implementation of 4.3BSD Unix,
 * this should print out the ascii version of each process's current
 * directory...instead, it chokes on u->u_cwd->cw_dir, which is in the
 * u struct in sys/user.h .. any help, suggestions, etc would be greatly
 * appreciated.
 */

/*
 * cc -o cwd cwd.c -lkvm
 */

#include <stdio.h>
#include <kvm.h>
#include <fcntl.h>
#include <ctype.h>
#include <pwd.h>
#include <sys/param.h>
#include <sys/time.h>
#include <sys/proc.h>
#include <sys/user.h>

main (argc, argv)
{
    [...] *argv);
        exit (1);
    }

    (void) printf("Name\t\tDir\n");
    kvm_setproc (kd);
    while ((proc = kvm_nextproc (kd)))
        if (proc->p_stat != SZOMB && proc->p_uid) {
            if (!(user = kvm_getu(kd, proc)))
                continue;
            (void) printf("%s\n", (getpwuid(proc->p_uid))->pw_name);

            /* Curtains & Slow Music */
            (void) printf("%s\n", user->u_cwd->cw_dir);

            /* It dies, but the user structure's fine (printing user->u_comm
               works); I stepped thru it with gdb & discovered that the
               pointer user->u_cwd is off in never-never-land; is it a
               valid entry in the user structure? */
        }
}
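One plausible explanation, offered as an assumption rather than a confirmed diagnosis: the copy of the u struct is in the tool's own memory, but a pointer stored inside it, such as u_cwd, still holds a kernel address. Dereferencing it directly in a user program would then fault, and each hop would have to be copied out separately (on SunOS, presumably with kvm_read()). The sketch below does not use libkvm at all. It simulates the situation portably with a byte array standing in for kernel memory, and every name in it (`kmem`, `kaddr_t`, `sim_ucwd`, `sim_kread`, `sim_read_cwd`, `sim_setup`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define KMEM_SIZE 128
static unsigned char kmem[KMEM_SIZE];   /* simulated "kernel memory" */

typedef size_t kaddr_t;                 /* a "kernel address": offset into kmem */

/* Simulated counterpart of the structure u_cwd points at. */
struct sim_ucwd {
    kaddr_t cw_dir;                     /* kernel address of the path string */
};

/* Copy n bytes from a simulated kernel address. Returns 0, or -1 if out of range. */
int sim_kread(kaddr_t addr, void *buf, size_t n)
{
    if (addr > KMEM_SIZE || n > KMEM_SIZE - addr)
        return -1;
    memcpy(buf, kmem + addr, n);
    return 0;
}

/*
 * Follow u_cwd -> cw_dir by copying each hop instead of dereferencing it.
 * The result is truncated to fit outlen. Returns 0 on success, -1 on a bad address.
 */
int sim_read_cwd(kaddr_t u_cwd, char *out, size_t outlen)
{
    struct sim_ucwd cwd;
    size_t i;

    if (outlen == 0 || sim_kread(u_cwd, &cwd, sizeof cwd) != 0)
        return -1;
    for (i = 0; i + 1 < outlen; i++) {
        if (sim_kread(cwd.cw_dir + i, &out[i], 1) != 0)
            return -1;
        if (out[i] == '\0')
            return 0;
    }
    out[i] = '\0';
    return 0;
}

/* Fill the simulated memory with one path; returns the address to use as u_cwd. */
kaddr_t sim_setup(const char *path)
{
    const kaddr_t cwd_at = 16, str_at = 64;
    struct sim_ucwd cwd;

    assert(strlen(path) + 1 <= KMEM_SIZE - str_at);
    memset(kmem, 0, sizeof kmem);
    memcpy(kmem + str_at, path, strlen(path) + 1);
    cwd.cw_dir = str_at;
    memcpy(kmem + cwd_at, &cwd, sizeof cwd);
    return cwd_at;
}
```

The point of the sketch is the shape of `sim_read_cwd()`: one read for the intermediate structure, then further reads for the string it points to, with every address checked before use.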
Postmaster%BRFAPESP.BITNET@uicvm.uic.edu (PMDF Mail Server) (09/04/90)
Your message could not be delivered to:

    Multiple recipients of list UNIX-WIZ <UNIX-WIZ@NDSUVM1.BITNET>

Your message has been enqueued and undeliverable for 12 days. No further attempts will be made to deliver your message.

Your entire message follows:

Mark Holcomb <mth@ROLF.STAT.UGA.EDU> writes:

> I've felt the need for a new tool that Sun doesn't have.
> Ever have a process that's been running for six weeks, and will
> need another week to finish when you need to make level 0 backups
> or would like to shut the computer down for a bad storm.
> I need a tool that would stop a running process and let it be
> restarted at a later date.
> I've thought of a couple of ways it might be done:

[A number of suggestions deleted]

...

A general solution would have to re-establish and re-position all open files, sockets, message queues, pipes, semaphores, shared-memory segments, environment variables, (add your favourite externally visible entity) to exactly the same state as they were previously. Once you've done this, it's a simple matter of re-constructing your memory image.

Finally, you have to hope that none of the code in your program has stashed the PID or date away in memory somewhere, as these may be different when you next restart the prog :-)

Seriously, doing this in any substantive manner is difficult and I'm sure it would be virtually impossible to bullet-proof it on UNIX. When confronted with this array of problems, most people opt for individualized, per-program solutions for those progs that run for long periods.

Mark D.
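As an illustration of what such a per-program solution might look like for just one of the items listed above (an open input file), here is a minimal sketch. It assumes the program reads a regular file line by line, that lines are shorter than 255 bytes, and that the checkpoint record itself is persisted by some other means; the names `ckpt`, `resume_input`, and `process_some` are hypothetical. The program records its byte offset after each line and, on restart, reopens the file and seeks back to that offset.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-program checkpoint record. */
struct ckpt {
    long in_offset;    /* byte offset of the next unread line */
    long lines_done;   /* lines fully processed so far */
};

/* Reopen the input and reposition it to the checkpointed offset. */
FILE *resume_input(const char *path, const struct ckpt *ck)
{
    FILE *in;

    assert(path != NULL && ck != NULL);
    in = fopen(path, "rb");
    if (in != NULL && fseek(in, ck->in_offset, SEEK_SET) != 0) {
        fclose(in);
        return NULL;
    }
    return in;
}

/*
 * Process at most max_lines lines, updating the checkpoint after each one.
 * Returns the number of lines processed in this call.
 */
long process_some(FILE *in, struct ckpt *ck, long max_lines)
{
    char line[256];
    long n = 0;

    while (n < max_lines && fgets(line, sizeof line, in) != NULL) {
        /* real per-line work would go here */
        n++;
        ck->lines_done++;
        ck->in_offset = ftell(in);
    }
    return n;
}
```

Sockets, pipes, and the other entities in the list have no equivalent of a seek, which is one reason the general problem is so much harder than this single-file case.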