wgsiemel@praxis.cs.ruu.nl (Willem Siemelink) (10/03/90)
I have got a process that takes days to complete. However, the System
Administration does not want me to run it in daytime. So now I am looking for
a way to stop a process and later continue it. We have HP-UX 7.0 running on
the workstations here.

I can do this by hand by typing ^Z on the running process followed by 'bg' and
'fg' but that is only when I'm on the keyboard at the very moment. Obviously
that isn't good enough. I've had a suggestion using 'kill' but I couldn't
figure it out. ('kill -3 <pid>' gives a core dump but I can't get it started
again.)

Any (clear) suggestion would be much appreciated.

If responses are posted here or if you mail me at <wgsiemel@praxis.cs.ruu.nl>
I'll be able to do something with them. (I mean to say that I am not going to
track responses in different groups.)

If I happened to break some local courtesy I'm sorry, I didn't mean to.

Have a day, Willem.
-- 
The good thing about death is that it is preceded by life. <me>
gt0178a@prism.gatech.EDU (Jim Burns) (10/04/90)
in article <3940@ruuinf.cs.ruu.nl>, wgsiemel@praxis.cs.ruu.nl (Willem Siemelink) says:
> I have got a process that takes days to complete. However, the System
> Administration does not want me to run it in daytime. So now I am looking for

How about running it in a script that does something like this:

    user-process &
    kill -STOP $!                   # suspend last background job
    echo kill -CONT $! | at 1900    # resume job at 7pm
    echo kill -STOP $! | at 0900    # suspend next morning
                                    # repeat variations of above 2 lines
                                    # for as many days as estimated need
    echo $! > $HOME/.batch-job      # record pid in case still running
                                    # after above at's exhausted

Really long jobs could be handled similarly w/cron(1). Make sure you read
the 'man at' pages for admin files that must be set up to allow you to run
at jobs, and other gotchas.
-- 
BURNS,JIM
Georgia Institute of Technology, Box 30178, Atlanta Georgia, 30332
uucp:     ...!{decvax,hplabs,ncar,purdue,rutgers}!gatech!prism!gt0178a
Internet: gt0178a@prism.gatech.edu
dan@kfw.COM (Dan Mick) (10/04/90)
In article <3940@ruuinf.cs.ruu.nl> wgsiemel@praxis.cs.ruu.nl (Willem Siemelink) writes:
>I can do this by hand by typing ^Z on the running process followed by 'bg' and
>'fg' but that is only when I'm on the keyboard at the very moment. Obviously
>that isn't good enough. I've had a suggestion using 'kill' but I couldn't
>figure it out. ('kill -3 <pid>' gives a core-dump but I can't get it started
>again.)

The fact that you can ^Z and bg/fg means you must be on a BSD-derived
system with job control; you can send the same signals to the process
with

    kill -TSTP <pid>

to stop the process, and

    kill -CONT <pid>

to resume it.
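
A minimal sketch of the stop/continue idea, for anyone who wants to try it
before trusting it with a real job. Everything in it is an assumption made
for illustration: the loop is a stand-in for the long-running program, and
the log file name is hypothetical. It uses STOP rather than TSTP because a
program can catch or ignore TSTP, while STOP always takes effect.

```shell
#!/bin/sh
# Sketch only: the loop below is a stand-in for the real long-running job.
log=${TMPDIR:-/tmp}/ticks.$$        # hypothetical scratch file
: > "$log"
( while :; do echo tick >> "$log"; sleep 1; done ) &
pid=$!

sleep 2
kill -STOP "$pid"                   # suspend; STOP cannot be caught or ignored
sleep 1
before=$(wc -l < "$log" | tr -d ' ')
sleep 2
during=$(wc -l < "$log" | tr -d ' ')   # no new ticks while the job is stopped

kill -CONT "$pid"                   # resume where it left off
sleep 2
after=$(wc -l < "$log" | tr -d ' ')    # ticks grow again after CONT

echo "ticks: before=$before during=$during after=$after"
kill "$pid"                         # clean up the stand-in job
rm -f "$log"
```

The tick count should stay the same between the two readings taken while
the job is stopped, and rise again after CONT.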
wayne@teemc.UUCP (Michael R. Wayne) (10/04/90)
In article <3940@ruuinf.cs.ruu.nl> wgsiemel@praxis.cs.ruu.nl (Willem Siemelink) writes:
>I have got a process that takes days to complete. However, the System
>Administration does not want me to run it in daytime. So now I am looking for
>a way to stop a process and later continue it. We have HP UX 7.0 running on
>the workstations here.
>I can do this by hand by typing ^Z on the running process followed by 'bg' and
>'fg' but that is only when I'm on the keyboard at the very moment. Obviously
>that isn't good enough. I've had a suggestion using 'kill' but I couldn't
>figure it out. ('kill -3 <pid>' gives a core-dump but I can't get it started
>again.)

Follows a shar file which contains a program doing exactly what you
want to do without relying on BSD job control at all. A couple of commands
to cron will allow you to start and stop your process on command. You
might wish to remove the printf's once you understand what is going on (or
you might not). Commands can alternately be placed into your .login and
.logout to run jobs only when you are not logged in.

/\/\
\/\/

#! /bin/sh
# This is a shell archive. Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file". To overwrite existing
# files, type "sh file -c". You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g.. If this archive is complete, you
# will see the following message at the end:
#		"End of shell archive."
# Contents: signal_tst.c
# Wrapped by wayne@teemc.tmc.mi.org on Thu Oct 4 10:25:48 1990
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f signal_tst.c -a "${1}" != "-c" ; then
  echo shar: Will not over-write existing file \"signal_tst.c\"
else
echo shar: Extracting \"signal_tst.c\" \(1525 characters\)
sed "s/^X//" >signal_tst.c <<'END_OF_signal_tst.c'
X#include <stdio.h>
X#include <sys/signal.h>
X#include <time.h>
X
Xint resume(), suspend();
Xlong t;
X
Xchar *ctime();
Xlong time();
X
X/*
X * Simple program to demonstrate use of user signals to suspend
X * and resume a job which tends to run for a long time.
X *
X * Note that this really isn't a true "checkpoint" as the program
X * status is not preserved, execution is merely interrupted.
X *
X * To use:
X *	compile w/ cc
X *	run the executable in the background
X *	% kill -USR2 will suspend execution
X *	% kill -USR1 will resume execution
X */
Xmain()
X{
X	/*
X	 * Grab the signals
X	 */
X	if (((int) signal(SIGUSR1, resume) == -1) ||
X	    ((int) signal(SIGUSR2, suspend) == -1) ||
X	    ((int) signal(SIGHUP, SIG_IGN) == -1)) {
X		printf("can't catch signals.\n");
X		exit(-1);
X	}
X	/*
X	 * Normally we want processes that run for a long time to be
X	 * very, very nice.
X	 */
X	if (nice(19) == -1) {
X		time(&t);
X		printf("%.24s - can't be nice. continuing...\n", ctime(&t));
X	}
X	(void) time(&t);
X	printf("%.24s - STARTED\n", ctime(&t));
X	/*
X	 * Your code that runs a long time goes here
X	 */
X	for(;;)
X		;	/* Loop forever for example purposes */
X}
X
Xresume(i)
Xint i;
X{
X	if ((int) signal(SIGUSR1, resume) == -1) {
X		printf("can't recatch signal USR1 .\n");
X		exit(-1);
X	}
X	(void) time(&t);
X	printf("%.24s - RESUMED.\n", ctime(&t));
X	return(0);
X}
X
Xsuspend(i)
Xint i;
X{
X	if ((int) signal(SIGUSR2, suspend) == -1) {
X		printf("can't recatch signal USR2.\n");
X		exit(-1);
X	}
X	(void) time(&t);
X	printf("%.24s - SUSPENDED.\n", ctime(&t));
X	pause();
X	return(0);
X}
END_OF_signal_tst.c
if test 1525 -ne `wc -c <signal_tst.c`; then
    echo shar: \"signal_tst.c\" unpacked with wrong size!
fi
# end of overwriting check
fi
echo shar: End of shell archive.
exit 0
-- 
Michael R. Wayne --- TMC & Associates --- wayne@teemc.tmc.mi.org
Operator of the only 240 Horsepower UNIX machine in Michigan
decot@hpisod2.HP.COM (Dave Decot) (10/05/90)
The signal SIGSTOP can be sent to the process (or process group) from some
other process at any time in the future, and the process (group) will be
suspended. To get it going again later, send the SIGCONT signal to the
process (group).

You can have the job itself save the process (group) ID by having it open
some file and write its process ID there (or process group ID, from calling
getpgid()), and picking it up later.

From a shell script or cron job (see crontab(1)), this would be:

    kill -24 `cat /tmp/foopid`    # stop the foo job
    kill -26 `cat /tmp/foopid`    # resume the foo job

Dave Decot
(This is not necessarily HP's opinion and no warranty is expressed or implied.)
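
The pid-file approach above could be wrapped in a small helper along these
lines. This is a sketch under assumptions, not a tested HP-UX recipe: the
function name jobctl, the installation path in the crontab comment and the
schedule are all hypothetical. It uses signal names instead of numbers,
since the numbers differ between systems, and it checks that the recorded
process still exists before signalling it.

```shell
#!/bin/sh
# jobctl: hypothetical helper for the pid-file scheme described above.
# Example crontab entries (the path and times are assumptions):
#   0 9  * * 1-5  /usr/local/bin/jobctl stop
#   0 19 * * 1-5  /usr/local/bin/jobctl cont
PIDFILE=${PIDFILE:-/tmp/foopid}

jobctl() {
    # $1 is "stop" or "cont"
    if [ ! -r "$PIDFILE" ]; then
        echo "jobctl: cannot read $PIDFILE" >&2
        return 1
    fi
    pid=$(cat "$PIDFILE")
    # kill -0 delivers no signal; it only tests that the process exists
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "jobctl: process $pid is gone" >&2
        return 1
    fi
    case "$1" in
        stop) kill -STOP "$pid" ;;
        cont) kill -CONT "$pid" ;;
        *)    echo "usage: jobctl stop|cont" >&2; return 2 ;;
    esac
}

# When run as a script with an argument, act on it.
if [ $# -gt 0 ]; then
    jobctl "$1"
fi
```

Note that a stale pid file can name an unrelated process if the pid has
been reused, so the job should remove the file when it finishes.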
decot@hpisod2.HP.COM (Dave Decot) (10/05/90)
You and your "System Administration" should also find out about nice(1),
since this is probably much more appropriate to your situation than
stopping and starting your process. There should be no problem with
running your process during the day if it is set to run at a lower
priority.

To use nice(1) to run a program "prog" that takes arguments "arg1 arg2 ...",
do:

    nice -5 prog arg1 arg2 ...

This runs prog at a priority likely to be lower than most other processes.
It will only run when there is nothing more "important" to do.

Dave
tmh@prosun.first.gmd.de (Thomas Hoberg) (10/13/90)
I thought a bit about that problem too, some time ago. I never implemented
a solution, but I had the following ideas:

Programs like TeX, GNU Emacs and a Scheme interpreter I know use mechanisms
to either dump themselves (after having loaded some libraries) in a form
that can be restarted later, or use some tool to turn a core dump back into
an executable program. For the first variant, your program should install a
signal handler for some user-defined signal to dump itself; for the second,
send a SIGQUIT to the program to get a core dump, then manually 'undump' it
with that utility you hopefully found in the TeX or Emacs distribution, to
get an executable that can be restarted.

I guess this is a bit messy, and I would like to see a facility in UNIX to
force running processes out to disk, because it might generally make
automatic powerfail restarts possible. ;-)

Tom

PS. Tell me if you get it to work!
----
Thomas M. Hoberg tmh@prosun.first.gmd.de
GMD Berlin, Hardenbergplatz 2, 1000 Berlin 12, Germany
+49-30-254 99-160
tmh@prosun.first.gmd.de (Thomas Hoberg) (10/13/90)
Oops, rereading the original posting, I found that you merely wanted to
stop during the daytime to reduce the load. Workable solutions have been
presented, so I won't add any. Still, a facility of the kind I described
would be nice, in order to be able to shut down the system (for maintenance
or otherwise) and restart it without getting hate mail from users whose
batch monsters got trashed in the process...
----
Thomas M. Hoberg tmh@prosun.first.gmd.de
GMD Berlin, Hardenbergplatz 2, 1000 Berlin 12, Germany
+49-30-254 99-160
jkimble@bally.Bally.COM (The Programmer Guy) (10/16/90)
If your flavor of UNIX allows you to vary the "nice" values plus/minus, why
not just bump the priority way, way, way down during the day and then bring
it back up to normal during the evening?

This would allow the process to keep chugging along in the daytime without
getting too much in the way, and it might be able to grab some "free" time
during non-active hours (like lunch time).

This seems a lot less messy than starting/stopping; is there any badness
associated with dynamic changes to the NICE values that I've overlooked?
-- 
--Jim Kimble, jkimble@bally.bally.com     Consulting for Bally Systems
uunet!bally!jkimble
"ALPO is 99 cents a can. That's over SEVEN dog dollars!!"
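
A rough sketch of the renice idea, with the usual caveats: "sleep 30" is
only a stand-in for the real job, and renice(1) syntax varies between
systems, so the local man page should be checked first. One catch worth
knowing about: on most UNIX systems an ordinary user may raise a process's
nice value but only the superuser may lower it again, so the evening
"bring it back up" step may have to run from root's crontab.

```shell
#!/bin/sh
# Sketch only: "sleep 30" stands in for the real long-running job.
sleep 30 &
pid=$!

# Daytime: push the job to a high nice value (low priority).
# "renice <value> -p <pid>" is the traditional form and sets an absolute
# nice value; other systems use different options.
renice 19 -p "$pid"
renice_status=$?

# Evening: lower the nice value again, typically as root:
#   renice 0 -p "$pid"
# Left commented out because an unprivileged user is normally refused
# when trying to lower a nice value.

kill "$pid"    # clean up the stand-in job
```

Unlike STOP/CONT, the job keeps running all day and soaks up whatever idle
time the machine has.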