marwood@ncs.dnd.ca (Gordon Marwood) (02/03/90)
The following is a code fragment from get21.sh, which is part of the
autoftp software available from simtel20:

# Create two sub-shell processes, one for FTP, one for time-out.
# Each sub shell has two parts. If the first part is succeeded, a KILL
# action will be taken to abort either the TIME-OUT or the FTP process.
{ ftp "$RemoteHost" <ftp.get.$$ &&
  { ps -x > tmp0$$; grep sleep tmp0$$ > tmp$$; pd=`$_pid < tmp$$`; kill -9 $pd ; }
} &
{ sleep $ALARM &&
  { ps -x > tmp0$$; grep ftp tmp0$$ > tmp$$; pd=`$_pid < tmp$$`; kill -9 $pd ; }
}

The problem with this is that "grep sleep" does not necessarily select
the "sleep" which belongs to the autoftp process, if there are other
"sleep"s running at the same time (from some other unrelated process).
Similarly, "grep ftp" is ambiguous if other "ftp"s are running
concurrently.

At the moment I have put in a workaround which looks for the "sleep"
with the correct delay time, and an ftp using the numeric address rather
than the hostname.  However, this is not a very elegant way of doing
things and not 100% foolproof.

I would appreciate any assistance that anyone can offer to do these
"kill"s more specifically.

Gordon Marwood
marwood@ncs.dnd.ca
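One way to sidestep the ambiguity described above is to avoid ps and grep
altogether: the shell already records the PID of each background job in $!.
What follows is only a minimal sketch, assuming a POSIX sh. The function name
run_with_alarm and its argument order are made up for illustration and are
not part of autoftp.

```shell
#!/bin/sh
# Hypothetical helper (not part of autoftp): run a command with a time limit
# and signal only the processes started here. $! is the PID of the most
# recent background job, so no ps/grep lookup is needed.
# Note: POSIX sh has no local variables; limit, job, watchdog and status
# are globals.
run_with_alarm() {
    limit=$1; shift

    "$@" &                      # the real job (ftp in the original fragment)
    job=$!

    # Watchdog: sleep for $limit seconds, then terminate the job.
    # If the watchdog itself receives SIGTERM, it stops its own sleep and quits.
    (
        sleep "$limit" &
        nap=$!
        trap 'kill "$nap" 2>/dev/null; exit 0' TERM
        wait "$nap"
        kill "$job" 2>/dev/null
    ) >/dev/null 2>&1 &
    watchdog=$!

    wait "$job"                 # yields the job's own exit status
    status=$?

    kill "$watchdog" 2>/dev/null    # job ended first: cancel the watchdog
    return "$status"
}
```

A call might look like `run_with_alarm "$ALARM" sh -c 'ftp "$1" <"$2"' sh "$RemoteHost" ftp.get.$$`.
The input redirection is done inside the command because a background job in
a non-interactive shell does not inherit the script's standard input.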
maart@cs.vu.nl (Maarten Litmaath) (02/06/90)
In article <22332@adm.BRL.MIL>,
	marwood@ncs.dnd.ca (Gordon Marwood) wants to timeout ftp.

How about using the following general purpose script?  If that's out of the
question, the script might still give you a hint how to solve your problem.
----------8<----------8<----------8<----------8<----------8<----------
#!/bin/sh
# @(#)timeout 4.1 90/01/10 Maarten Litmaath

prog=`basename $0`
verbose=0

case $1 in
-v)
        verbose=1
        shift
esac

expr $# \< 2 \| 0"$1" : '.*[^0-9]' > /dev/null && {
        echo "Usage: $prog [-v] <timeout in seconds> <command>" >&2
        exit 2
}

timeout=$1
shift

exec 3>&1 4>&2 2> /dev/null

pid=`
        sh -c '
                (sleep '$timeout' > /dev/null & echo $!; exec >&-; wait; kill -9 $$) &
                exec "$@" >&3 3>&- 2>&4 4>&-
        ' $prog "$@"
`
kill -9 $pid && exit 0
test $verbose = 1 && echo TIMEOUT >&4
exit 1
--
  The meek get the earth, Henry the moon, the rest of us have other plans.  |
  Maarten Litmaath @ VU Amsterdam:  maart@cs.vu.nl,  uunet!mcsun!botter!maart
gwc@root.co.uk (Geoff Clare) (02/09/90)
In article <5312@star.cs.vu.nl> maart@cs.vu.nl (Maarten Litmaath) writes:
>In article <22332@adm.BRL.MIL>,
>	marwood@ncs.dnd.ca (Gordon Marwood) wants to timeout ftp.
>
>How about using the following general purpose script?  If that's out of the
>question, the script might still give you a hint how to solve your problem.

[ extremely complicated script deleted ]

There is a much simpler way than Maarten's.

The basic strategy is:

	(sleep $time; kill $$) & exec "$@"

Here is my "timeout" command, using this method with added frills.
An important feature which Maarten's script lacks, is that mine kills
the process with SIGTERM, allowing it to clean up.  It only goes for
SIGKILL if the SIGTERM doesn't do the job.  Maarten dives straight for
SIGKILL, which could leave a mess.

---------------------- cut here ----------------------
:
# execute a command with timeout
# Geoff Clare <gwc@root.co.uk>, Feb 1990

USAGE="usage: $0 [-v] [seconds] command args ..."
SIGKILL=9       # may be system dependent
time=10         # default 10 seconds
verbose=n

for i in 1 2
do
    case $1 in
    -v)     verbose=y
            shift ;;
    [0-9]*) time=$1
            shift ;;
    ""|-*)  echo >&2 "$USAGE"
            exit 1 ;;
    esac
done

pid=$$
(
    sleep "$time"
    # use SIGTERM first to allow process to clean up
    kill $pid >/dev/null 2>&1
    rc=$?
    sleep 2
    # if process hasn't died yet, use SIGKILL
    kill -$SIGKILL $pid >/dev/null 2>&1
    case "$rc$verbose" in
    0y)     echo >&2 " TIMED OUT \"$*\"" ;;
    esac
) &

exec "$@"
---------------------- cut here ----------------------
--
Geoff Clare, UniSoft Limited, Saunderson House, Hayne Street, London EC1A 9HH
gwc@root.co.uk  (Dumb mailers: ...!uunet!root.co.uk!gwc)  Tel: +44-1-315-6600
                                        (from 6th May 1990): +44-71-315-6600
maart@cs.vu.nl (Maarten Litmaath) (02/10/90)
In article <1212@root44.co.uk>,
	gwc@root.co.uk (Geoff Clare) writes:
)In article <5312@star.cs.vu.nl> maart@cs.vu.nl (Maarten Litmaath) writes:
)>In article <22332@adm.BRL.MIL>,
)>	marwood@ncs.dnd.ca (Gordon Marwood) wants to timeout ftp.
)>
)>How about using the following general purpose script?  If that's out of the
)>question, the script might still give you a hint how to solve your problem.
)
)[ extremely complicated script deleted ]
             ^^^^^^^^^^^
There were reasons for that, you know!

)There is a much simpler way than Maarten's.
)
)The basic strategy is:
)
)	(sleep $time; kill $$) & exec "$@"

Here's an example using your timeout command:

	$ timeout -v 0 cat
	Terminated
	$
	TIMED OUT "cat"

Sic!  The verbose message appears ASYNCHRONOUSLY!  That's not what I want!
To get a synchronous message I had to have a *child* execute the command
supplied, while the *parent* reports the status.
In fact your strategy is followed between the backquotes in my script.
Another bug in your script is shown by the following example:

	$ timeout 333 date
	Sat Feb 10 02:56:10 MET 1990
	$ sps
	Ty User     Proc#  Command
	[stuff deleted]
	p1.maart    1561   sh
	p1. |       1562   sleep 333
	[stuff deleted]
	$

You leave useless processes hanging around!

)Here is my "timeout" command, using this method with added frills.
)An important feature which Maarten's script lacks, is that mine kills
)the process with SIGTERM, allowing it to clean up.  It only goes for
)SIGKILL if the SIGTERM doesn't do the job.  [...]

How do you determine that the job has finished cleaning up?

Turning back to my script - the only problem with it: the given command
might have created children which keep hanging around after their parent
has died.  This problem could be solved if there were a variant of `kill(2)'
to destroy a complete `process tree'; such a tree is NOT the same as a
process *group*.
Killing the process group (kill -9 -$pid) doesn't work if your shell is sh, since it doesn't put a `job' into its own process group; you would destroy unrelated processes as well. -- The meek get the earth, Henry the moon, the rest of us have other plans. | Maarten Litmaath @ VU Amsterdam: maart@cs.vu.nl, uunet!mcsun!botter!maart
rock%warp@Sun.COM (Bill Petro (SunOS Marketing)) (02/13/90)
I don't know if this is appropriate, but here is a handy little tool I
use:

------------------- Cut here, valuable coupon, collect 'em all --------
#!/bin/sh
#
# killjobs
#
# Do our best to locate the PIDs to kill from the given
# process names.
#
MYNAME=`basename $0`
Ask=yes

case $1 in
"")     echo "Usage: $MYNAME process_name ..."
        exit 1
        ;;
-n)     Ask=no; shift
        ;;
esac

PSAX=`ps ax`

for pname
do
        line=
        PID=
        PID=`echo "$PSAX" | grep -w $pname | awk '!/grep|'$MYNAME'/ {print $1}'`
        for pid in $PID
        do
                if [ $Ask = yes ]
                then
                        line=`echo "$PSAX" | awk '$1 == '$pid' {print}'`
                        echo -n "Kill ($line)? "
                        read reply
                        case $reply in
                        [Yy]*)  kill -9 $pid
                                ;;
                        esac
                else
                        kill -9 $pid
                fi
        done
done
------------ end ----------------------------------------------

     Bill Petro  {decwrl,hplabs,ucbvax}!sun!Eng!rock
"UNIX for the sake of the kingdom of heaven" Matthew 19:12
gwc@root.co.uk (Geoff Clare) (02/14/90)
In article <5352@star.cs.vu.nl> maart@cs.vu.nl (Maarten Litmaath) writes: >In article <1212@root44.co.uk>, > gwc@root.co.uk (Geoff Clare) writes: > >)There is a much simpler way than Maarten's. >) >)The basic strategy is: >) >) (sleep $time; kill $$) & exec "$@" > >Here's an example using your timeout command: > > $ timeout -v 0 cat > Terminated > $ > TIMED OUT "cat" > >Sic! The verbose message appears ASYNCHRONOUSLY! That's not what I want! There is a big difference in the purpose of the '-v' option between our scripts. Yours provides the only message you get to say that the command timed out. With mine the shell has already informed me of that (synchronously) with the "Terminated" message seen above. The '-v' would not normally be used in this simple case. It is there to provide extra information about *which* command was timed out, in the event of several parallel "timeout" commands. Also, the delay is a direct result of allowing the process time to clean up. Using a straight 'kill -9' as your script did would mean the verbose message would appear before the prompt. The very slight delay in providing the additional information is a small price to pay for allowing the timed out process to tidy up. >To get a synchronous message I had to have a *child* execute the command >supplied, while the *parent* reports the status. But this has a very serious drawback - you lose the exit status of the executed command. The command could die horribly with no error message, and you would not know about it! All your script tells you is whether the command completed within the timeout period or not. >Another bug in your script is shown by the following example: > >[stuff deleted] > >You leave useless processes hanging around! Leaving a harmless sleep command behind will not usually cause any problems. It will get zapped when the user logs out, if it hasn't completed by then. There is no way round this if you want to have the parent process exec the command. 
This slight drawback is greatly outweighed by the advantage of getting the exit status passed back correctly. >)An important feature which Maarten's script lacks, is that mine kills >)the process with SIGTERM, allowing it to clean up. It only goes for >)SIGKILL if the SIGTERM doesn't do the job. [...] > >How do you determine that the job has finished cleaning up? The script allows 2 seconds. If this might not be enough it could be passed as an option to "timeout". Most commands only need to remove a few temporary files and maybe kill some children, which doesn't take very long. A chance to clean up in a short time is much better than no chance at all. >Turning back to my script - the only problem with it: the given command >might have created children which keep hanging around after their parent >has died. A well designed command will kill its children as part of the clean up procedure when it receives a SIGTERM. That is why it's important to use SIGTERM first rather than going straight for SIGKILL. -- Geoff Clare, UniSoft Limited, Saunderson House, Hayne Street, London EC1A 9HH gwc@root.co.uk (Dumb mailers: ...!uunet!root.co.uk!gwc) Tel: +44-1-315-6600 (from 6th May 1990): +44-71-315-6600
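Geoff's point about well designed commands tidying up on SIGTERM can be
illustrated with a trap handler.  This is a sketch only, assuming a POSIX sh;
careful_job and its temporary-file argument are hypothetical names, and the
exit code 143 assumes SIGTERM is signal 15.

```shell
#!/bin/sh
# Hypothetical example of a command that cleans up when it gets SIGTERM:
# it stops its child and removes its temporary file before exiting.
# Intended to be started in the background, e.g.  careful_job /tmp/work.$$ &
careful_job() {
    tmp=$1
    : > "$tmp"                  # create the temporary file

    sleep 30 &                  # stands in for a long-running child
    child=$!

    # On SIGTERM: kill the child, remove the file, report 128 + 15.
    trap 'kill "$child" 2>/dev/null; rm -f "$tmp"; exit 143' TERM

    wait "$child"               # interrupted as soon as the trap fires
    rm -f "$tmp"                # normal completion: tidy up as well
}
```

A SIGKILL cannot be trapped, so none of this cleanup would run if the timeout
went straight for `kill -9'.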
bph@buengc.BU.EDU (Blair P. Houghton) (02/16/90)
In article <1212@root44.co.uk> gwc@root.co.uk (Geoff Clare) writes:
>In article <5312@star.cs.vu.nl> maart@cs.vu.nl (Maarten Litmaath) writes:
>>In article <22332@adm.BRL.MIL> marwood@ncs.dnd.ca (Gordon Marwood)
>>>[wants to timeout ftp.]
>>
>>How about using the following general purpose script?  If that's out of the
>>question, the script might still give you a hint how to solve your problem.
>
>[ extremely complicated script deleted ]
>
>There is a much simpler way than Maarten's.

Edited for brevity...
Called with "foo (time in seconds) (command and arguments)"

>#! /bin/sh
>pid=$$
>(
>    sleep "$time"
>    # use SIGTERM first to allow process to clean up
>    kill $pid >/dev/null 2>&1

At this point, and I'll admit it's a rare possibility, but
not an impossibility, especially on multiprocessor machines,
what happens if the $command (see below) has already exited
and some other process (possibly on another processor) has
begun with the same pid?  Or won't that happen while this
backgrounded process is tying up the process-group number?

>    sleep 2
>    # if process hasn't died yet, use SIGKILL
>    kill -$SIGKILL $pid >/dev/null 2>&1
>) &
>
>exec "$command"

I'm trying to implement exactly this control structure in C,
now, and one of my least favorite problems is that of getting
the status of a process I may not own without having to do

	system("ps -l##### > /tmp/foo"); /* ##### is the pid */

(or fork-and-exec, etc...  Actually, the system() call wouldn't
be much of a diseconomy, because it only has to execute the
ps(1) once or twice after what may be several hours of
sleep()'ing.)

I mean, just how the heck does ps(1) do it?  Everything I've
seen implies it goes through /dev/mem one byte at a time.

				--Blair
				  "Oh, lovely..."
maart@cs.vu.nl (Maarten Litmaath) (02/17/90)
In article <1221@root44.co.uk>, gwc@root.co.uk (Geoff Clare) writes: )... The '-v' would )not normally be used in this simple case. It is there to provide extra )information about *which* command was timed out, in the event of several )parallel "timeout" commands. I still want the verbose info to appear synchronously (see below). )Also, the delay is a direct result of allowing the process time to )clean up. Using a straight 'kill -9' as your script did would mean )the verbose message would appear before the prompt. The very slight )delay in providing the additional information is a small price to pay )for allowing the timed out process to tidy up. In your script the delay is always 2 seconds; in the latest version of my script it's a parameter. )>To get a synchronous message I had to have a *child* execute the command )>supplied, while the *parent* reports the status. ) )But this has a very serious drawback - you lose the exit status of the )executed command. The command could die horribly with no error message, )and you would not know about it! All your script tells you is whether )the command completed within the timeout period or not. Right; fixed in the current version (easy). )>Another bug in your script is shown by the following example: )> )>[stuff deleted] )> )>You leave useless processes hanging around! ) )Leaving a harmless sleep command behind will not usually cause any )problems. It will get zapped when the user logs out, if it hasn't )completed by then. [...] What if the command wasn't started from a terminal? Not nice. Unnecessary too. Another plus of timeout 5.0: the signal is now a parameter too. Now it's your turn again, Geoff! 
:-)
--------------------cut here--------------------
#!/bin/sh
# @(#)timeout 5.0 90/02/17 Maarten Litmaath

prog=`basename $0`
verbose=0
SIG=-KILL
sigopt=0
sleep=:
timeout=10
usage="Usage: $prog [-v] [-signal] [timeout] [+delay] [--] <command>"

while :
do
        case $1 in
        --)
                shift
                break
                ;;
        -v)
                verbose=1
                ;;
        -*)
                SIG=$1
                sigopt=1
                ;;
        +*)
                EXPR='..\(.*\)'
                delay=`expr x"$1" : "$EXPR"`
                sleep="kill -0 \$\$ && sleep $delay && kill -KILL \$\$"
                case $sigopt in
                0)
                        SIG=-TERM
                esac
                ;;
        [0-9]*)
                timeout=$1
                ;;
        *)
                break
        esac
        shift
done

case $# in
0)
        echo "$usage" >&2
        exit 2
esac

exec 3>&1

pid=`
        sh -c '
                (sleep '$timeout' > /dev/null & echo $!; exec >&-; wait; kill '$SIG' $$ && '"$sleep"') 2> /dev/null &
                exec "$@" >&3 3>&-
        ' $prog "$@"
`
status=$?
kill -9 $pid 2> /dev/null || {
        test $verbose = 1 && echo "TIMEOUT: $*" >&2
}
exit $status
--------------------cut here--------------------
--
  "Ever since the discovery of domain addresses in the French cave paintings
  ..." (Richard Sexton)                | maart@cs.vu.nl, uunet!mcsun!botter!maart
maart@cs.vu.nl (Maarten Litmaath) (02/20/90)
In article <5382@buengc.BU.EDU>,
	bph@buengc.BU.EDU (Blair P. Houghton) writes:
)...
)>    kill $pid >/dev/null 2>&1
)
)At this point, and I'll admit it's a rare possibility, but
)not an impossibility, especially on multiprocessor machines,
)what happens if the $command (see below) has already exited
)and some other process (possibly on another processor) has
)begun with the same pid?  [...]

Possible if your `$command' has been running for a *long* time and/or new
processes have come and gone like crazy in the meantime...  Normally it
takes a few days for the pid to wrap around; conventionally MAXPID is 30,000.
This value may have to be raised.

)... one of my least favorite problems is that of getting
)the status of a process I may not own without having to do
)
)	system("ps -l##### > /tmp/foo"); /* ##### is the pid */
)...

How about `fp = popen("ps -l#####", "r")'?
To check if a process is alive use `kill(pid, 0)'.
--
  "Ever since the discovery of domain addresses in the French cave paintings
  [...]" (Richard Sexton)              | maart@cs.vu.nl, uunet!mcsun!botter!maart
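The shell counterpart of `kill(pid, 0)' is `kill -0', which sends no signal
and only reports whether one could be delivered.  A minimal sketch, assuming
a POSIX sh; is_alive is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a signal could be sent to process $1.
# Caveats: a failure can also mean the process exists but belongs to another
# user, and a success says nothing about whether the PID has been reused by
# an unrelated process in the meantime.
is_alive() {
    kill -0 "$1" 2>/dev/null
}

# Example use: only escalate to SIGKILL if the process is still there.
# (stop_hard is likewise a hypothetical name.)
stop_hard() {
    kill "$1" 2>/dev/null       # polite SIGTERM first
    sleep 2
    if is_alive "$1"; then
        kill -9 "$1" 2>/dev/null
    fi
}
```

This is the same test Maarten's timeout 5.0 uses in its `kill -0 $$` step.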
bph@buengc.BU.EDU (Blair P. Houghton) (02/21/90)
In article <5615@star.cs.vu.nl> maart@cs.vu.nl (Maarten Litmaath) writes: >In article <5382@buengc.BU.EDU> bph@buengc.BU.EDU (Blair P. Houghton) writes: >>what happens if the command has already exited >>and some other process (possibly on another processor) has >>begun with the same pid? > >Possible if your `$command' has been running for a *long* time and/or new >processes have come and gone like crazy in the meantime... Normally it >takes a few days for the pid to wrap around; I'm talking about timing-out someone's login, possibly 12-20 hours after the killer-process has been started. Plenty of time. >conventionally MAXPID is 30,000. This value may have to be raised. It's suspiciously low. I can live with it, though, as cave-sysadmins have had to do since the dawn of Unix. I've got another idea, though, which is rather specific to this problem. Basically, I'm writing Yet Another Access Scheduler, implementing it as a shell-wrapper to be placed as a user's default-shell in /etc/passwd. There's a dichotomy where the decision is to be made whether to fork-and-exec the process-killer (which then sleep()'s until it's time to kick the user off) and exec the shell, or to fork-and-exec the shell and exec the killer. So, if I fork the shell, it becomes the killer's child, and the killer will get SIGCLD if the shell is exited. If I do it the other way, I run the risk of zapping some other guy. I'll have to balance that fact against the three reasons I've switched this flow-of-control three times already, but it's somewhat stronger than the others, which involved convenience of I/O... (BTW, the killer has to run suid-root, so the user can't kill it first, so respecting interrupts is out.) >> system("ps -l##### > /tmp/foo"); /* ##### is the pid */ >How about `fp = popen("ps -l#####", "r")'? Alas, popen() ain't ANSI, but who am I kidding? Good idea. --Blair "I wonder if Stephen King has this sort of plotting trouble with his killers..."
maart@cs.vu.nl (Maarten Litmaath) (02/22/90)
In article <5405@buengc.BU.EDU>,
	bph@buengc.BU.EDU (Blair P. Houghton) writes:
)...
)I'm talking about timing-out someone's login, possibly 12-20 hours
)after the killer-process has been started.  Plenty of time.

I think you need a new system call like `killtree()' or `killsession()' to
solve your problem completely:

- you cannot kill all his processes, as you must leave background
  processes and processes on other terminals alone
- process groups aren't the answer either

Therefore there's a time window between determining which processes have to
be killed, and actually killing them.  During this interval the user may have
created new processes.
--
  "Ever since the discovery of domain addresses in the French cave paintings
  [...]" (Richard Sexton)              | maart@cs.vu.nl, uunet!mcsun!botter!maart
bph@buengc.BU.EDU (Blair P. Houghton) (02/22/90)
In article <5636@star.cs.vu.nl> maart@cs.vu.nl (Maarten Litmaath) writes: >In article <5405@buengc.BU.EDU>, > bph@buengc.BU.EDU (Blair P. Houghton) writes: >)... >)I'm talking about timing-out someone's login, possibly 12-20 hours >)after the killer-process has been started. Plenty of time. > >I think you need a new system call like `killtree()' or `killsession()' to >solve your problem completely: Sending a kill -HUP to a login-shell in the console window of a uVAX workstation Running Ultrix clobbers all of a login's processes and cycles the X server. I'm going to have to implement something to duplicate the essential portions of the globalized .logout procedure (we've got some buggy CAD sw that refuses to die unless you've closed it carefully or kill -KILL'ed it. Actually, it's not so much buggy as it's poorly designed by some less-than-proficient people at a UCalifornian school that's rather well known for its software but must be eating dirt due to these guys but that shall remain nameless...) >- you cannot kill all his processes, as you must leave background > processes and processes on other terminals alone In _this_ case, I can justifiably kill all his processes, and remote logins and backgrounded jobs are verboten as well. The scheduling is designed for lab purposes, and excluding someone from the scheduling restriction is the default. Anyone with the right to run as a normal user will get it. It's only the great masses of students who only need one piece of CAD software and a little email who get scheduled. The scheduler was designed to control lab access, rather than to massage terminal usage. Use of the CPU is the important thing. I may yet go looking for all processes with the user's pid in them and "kill -f" them (I'll have to write an implementation of the -f flag, though...:) >- process groups aren't the answer either And it seems from the documentation that they would be... 
:-( >Therefore there's a time window between determining which processes have to >be killed, and actually killing them. During this interval the user may have >created new processes. It's a matter of getting the right one and making sure it's the same user. Right now I'm checking to see that the pid is running, that it's owned by the correct uid, on the same tty, and has the same name (e.g., "-sh", "-csh", "-ksh", etc.) as when the killer-program was started (i.e., at login time.) It's still a crap-shoot, but it's a crap-shoot with probability of failure that has an *upper* bound of 1/30,000. That is one in 30,000 times when the same person has logged out and logged back in on the same workstation on the same day. Such a situation is not too common in a big lab with tons of students and lots of workstations. So I'm reduced from perfection to sufficiency and a balancing of the relative costs. I can handle one in 30,000 users' getting angry. I can't handle the eleven professorial harangues per day when students can't get into the lab on time... --Blair "If you can't do it right, at least be successful at it."
maart@cs.vu.nl (Maarten Litmaath) (02/23/90)
In article <5410@buengc.BU.EDU>,
	bph@buengc.BU.EDU (Blair P. Houghton) writes:
)...
)Sending a kill -HUP to a login-shell in the console window
)of a uVAX workstation Running Ultrix clobbers all of a
)login's processes and cycles the X server.

Ever heard of `signal(SIGHUP, SIG_IGN)'?

)...
)>Therefore there's a time window between determining which processes have to
)>be killed, and actually killing them.  During this interval the user may have
)>created new processes.
)
)It's a matter of getting the right one and making sure it's
)the same user.  Right now I'm checking to see that the pid
)is running, that it's owned by the correct uid, on the same
)tty, and has the same name (e.g., "-sh", "-csh", "-ksh",
)etc.) as when the killer-program was started (i.e., at
)login time.)

It seems one can easily bypass your little scheme:

	execl("/bin/sh", "Will Blair find me?", (char *) 0);

Furthermore: race conditions!

)It's still a crap-shoot, but it's a crap-shoot with
)probability of failure that has an *upper* bound of
)1/30,000.  That is one in 30,000 times when the same
)person has logged out and logged back in on the same
)workstation on the same day.  Such a situation is not
)too common in a big lab with tons of students and lots
)of workstations.  [...]

I was talking of events with much higher probability.
--
  "Ever since the discovery of domain addresses in the French cave paintings
  [...]" (Richard Sexton)              | maart@cs.vu.nl, uunet!mcsun!botter!maart
gwc@root.co.uk (Geoff Clare) (02/23/90)
In article <5448@star.cs.vu.nl> maart@cs.vu.nl (Maarten Litmaath) writes:

>I still want the verbose info to appear synchronously (see below).

I'll let you into a secret.  The original version never had a "verbose"
message at all - I never felt the need for it.  I added it because I was
posting my script as an alternative to yours, but using a better strategy,
so I thought it ought to have the same facilities.  As I said before,
normally the verbose message isn't needed - the shell will report the
termination of the process.

However, if you really don't like the delay in the message, I offer the
following change which causes it to be printed straight away (although
still after the prompt) in cases where the command dies immediately on
the first kill.

Change:

    sleep 2
    # if process hasn't died yet, use SIGKILL
    kill -$SIGKILL $pid >/dev/null 2>&1

to:

    if kill -0 $pid >/dev/null 2>&1
    then
        sleep 2
        # if process hasn't died yet, use SIGKILL
        kill -$SIGKILL $pid >/dev/null 2>&1
    fi

>In your script the delay is always 2 seconds; in the latest version of my
>script it's a parameter.

This isn't very useful - how do you predict how long the command will
need to clean up?  Two seconds is plenty for most commands.

>)>To get a synchronous message I had to have a *child* execute the command
>)>supplied, while the *parent* reports the status.
>)
>)But this has a very serious drawback - you lose the exit status of the
>)executed command.  The command could die horribly with no error message,
>)and you would not know about it!  All your script tells you is whether
>)the command completed within the timeout period or not.
>
>Right; fixed in the current version (easy).

So now you can tell when something went wrong, but you still aren't
getting the full picture.  If the command is terminated by a signal
your script will instead exit with a non-zero exit code (usually
128 + signal number).
Another big advantage in having the parent execute the command is that it is then a normal foreground process, so you can use the INTR and QUIT keys in the normal way. If the user interrupts your script, he gets a prompt back and may think he has killed the command, whereas in fact it's still running in the background. Face it Maarten, having the child execute the command is a total no-hoper. >)Leaving a harmless sleep command behind will not usually cause any >)problems. It will get zapped when the user logs out, if it hasn't >)completed by then. [...] > >What if the command wasn't started from a terminal? Not nice. >Unnecessary too. The script was designed for casual use from a terminal. If I ever wanted to put it to more serious use, I could get rid of the leftover sleep by adding another background process to monitor the other two. Or better still, I would use a C program. >Another plus of timeout 5.0: the signal is now a parameter too. Another unnecessary frill. SIGTERM is the right signal to use - that's why it's the default for the "kill" command. >Now it's your turn again, Geoff! :-) I'm not going to post my script again, because I think we've wasted enough net.bandwidth on this already. If anybody has saved a copy, they can apply the change I suggested above if they think it's worth it. (I don't think it is - I doubt if I will ever use the '-v' option). -- Geoff Clare, UniSoft Limited, Saunderson House, Hayne Street, London EC1A 9HH gwc@root.co.uk (Dumb mailers: ...!uunet!root.co.uk!gwc) Tel: +44-1-315-6600 (from 6th May 1990): +44-71-315-6600
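Geoff's remark about "128 + signal number" can be seen from the calling
shell's side.  The sketch below assumes a POSIX sh that reports
signal-terminated children as 128 + n (common, but not universal);
describe_status is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: interpret a wait status as seen in $?.
# A wrapper that itself exits with status 143 is indistinguishable here from
# a command that was really terminated by signal 15, which is the loss of
# information being discussed.
describe_status() {
    if [ "$1" -gt 128 ]; then
        echo "signal $(( $1 - 128 ))"
    else
        echo "exit $1"
    fi
}

# Example: a child that terminates itself with SIGTERM.
sh -c 'kill -TERM $$'
describe_status $?
```

When the parent execs the command, as in Geoff's script, the caller sees the
real termination signal rather than an exit code that merely encodes it.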
maart@cs.vu.nl (Maarten Litmaath) (02/24/90)
In article <1381@root44.co.uk>, gwc@root.co.uk (Geoff Clare) writes: )... )This isn't very useful - how do you predict how long the command will )need to clean up? Two seconds is plenty for most commands. For *most* commands (according to you, anyway) - why not let the user specify the delay? It's no trouble at all and leads to higher generality. You shouldn't say too quickly: "The user doesn't need it." That's the approach which leads to things like: $ set a b c d e f g h i j $ echo $10 # echo parameter 10 a0 # oops! "The user doesn't need more than 9 arguments." )... )So now you can tell when something went wrong, but you still aren't )getting the full picture. If the command is terminated by a signal )your script will instead exit with a non-zero exit code (usually )128 + signal number). You're right again! I've posted another script to alt.sources, which does things your way (at last! :-), but having a few extras too. )Another big advantage in having the parent execute the command is that )it is then a normal foreground process, so you can use the INTR and QUIT )keys in the normal way. If the user interrupts your script, he gets )a prompt back and may think he has killed the command, whereas in fact )it's still running in the background. Only the sleep (and its wait()ing parent) keep running, *just* like in your approach! )... )The script was designed for casual use from a terminal. Yours, not mine. )If I ever )wanted to put it to more serious use, I could get rid of the leftover )sleep by adding another background process to monitor the other two. )Or better still, I would use a C program. One thing that's clear from our discussion: sh isn't powerful enough! :-( )>Another plus of timeout 5.0: the signal is now a parameter too. ) )Another unnecessary frill. SIGTERM is the right signal to use - that's )why it's the default for the "kill" command. 
Again I don't agree; first there's the generality, then there's the fact that SIGHUP is used to signal exceptions too, and lastly both SIGALRM and SIGXCPU seem normal to send on a *timeout*. -- "Belfast: a sentimental journey to the Dark Ages - Crusades & Witchburning - Europe's Lebanon - Book Now!" | maart@cs.vu.nl, uunet!mcsun!botter!maart
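The `$10' surprise Maarten mentions above still exists: the shell reads it as
`$1' followed by a literal 0.  Shells that follow POSIX accept braces for
positional parameters beyond 9, which the V7 sh under discussion did not.
A small sketch, with made-up function names:

```shell
#!/bin/sh
# Hypothetical demo functions for positional parameters beyond $9.
naive_tenth() {
    echo "$10"      # parsed as ${1} followed by the character 0
}

tenth_arg() {
    echo "${10}"    # braces select the tenth argument
}

naive_tenth a b c d e f g h i j     # prints a0
tenth_arg   a b c d e f g h i j     # prints j
```

On a shell without `${10}`, the portable workaround is `shift' followed by
`$9'.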
bph@buengc.BU.EDU (Blair P. Houghton) (02/25/90)
In article <5650@star.cs.vu.nl> maart@cs.vu.nl (Maarten Litmaath) writes:
>In article <5410@buengc.BU.EDU>,
>	bph@buengc.BU.EDU (Blair P. Houghton) writes:
>)...
>)Sending a kill -HUP to a login-shell in the console window
>)of a uVAX workstation Running Ultrix clobbers all of a
>)login's processes and cycles the X server.
>
>Ever heard of `signal(SIGHUP, SIG_IGN)'?

I use HUP because it gives the honest processes a chance
to use `signal(SIGHUP, somethinguseful)'.

If you want, you can try this:

    stream_type fp;
    char line[WAYBIG];

    /* Already got user's uid via whatever means. */

    fp = popen("/bin/ps axl","r");
    while ( fgets(line,sizeof(line),fp) != (char *)NULL )
        if ( uid == parse_for_uid(line) )
            kill( parse_for_pid(line), SIGKILL );
    pclose(fp);

SIG_IGN that!

(BTW, most of the time in that loop is spent waiting for
the `ps axl' to get around to its first line of output,
so it's more economical than one would think, at first.)

Of course, this is overkill when the only people I plan to
force the schedule on are almost uniformly without any
knowledge of unix, much less esoterica like execl's.  I'll
know well anyone who knows what a SIG_IGN is, and I'll
trust them not to bust my stuff, or I won't let them log in
at all.

It's obvious that unless you're trying to implement it as a
convenience rather than a security measure you're going to
have to rewrite the kernel, or add a killsession() call, or
getcherself a Real Operating System (hah!)...

>Furthermore: race conditions!

Which never elaborate themselves...

>)It's still a crap-shoot, but it's a crap-shoot with
>)probability of failure that has an *upper* bound of
>)1/30,000.  That is one in 30,000 times when the same
>)person has logged out and logged back in on the same
>)workstation on the same day.  Such a situation is not
>)too common in a big lab with tons of students and lots
>)of workstations.  [...]
>
>I was talking of events with much higher probability.

Wot?  You can assign a PID to your own process?
I'd like to see that...

				--Blair
				  "Blair -v"
maart@cs.vu.nl (Maarten Litmaath) (03/01/90)
In article <5414@buengc.BU.EDU>,
	bph@buengc.BU.EDU (Blair P. Houghton) writes:
)...
)    fp = popen("/bin/ps axl","r");
)    while ( fgets(line,sizeof(line),fp) != (char *)NULL )
)	if ( uid == parse_for_uid(line) )
)	    kill( parse_for_pid(line), SIGKILL );
)    pclose(fp);
)
)SIG_IGN that!

Between the `ps' and the kill() the user might have created new processes.
Race conditions, babe.

)...
)>I was talking of events with much higher probability.
)
)Wot?  You can assign a PID to your own process?  I'd like
)to see that...

See above.
Talking about assigning your own pid: a friend of mine (Peter Valkenburg,
valke@psy.vu.nl) once wrote a program called `snatch_pid'; you fed it the
pid you wanted to grab, then waited an hour or so (ca. 30,000 fork()s on a
VAX).  The use: if some game initializes its pseudo random generator with
`srand(getpid())'...

)				--Blair
)				  "Blair -v"

"Maarten -SEGV 666 :667 +1 /vmunix"
--
  "Belfast: a sentimental journey to the Dark Ages - Crusades & Witchburning
  - Europe's Lebanon - Book Now!"      | maart@cs.vu.nl, uunet!mcsun!botter!maart
gwc@root.co.uk (Geoff Clare) (03/01/90)
In article <5669@star.cs.vu.nl> maart@cs.vu.nl (Maarten Litmaath) writes:
>
>You're right again!  I've posted another script to alt.sources, which does
>things your way (at last! :-), but having a few extras too.

Glad to hear you've seen the light (at last :-).

Your new script is the same as mine with one worthwhile addition and a
few rather less useful (IMHO) ones.  Thanks for saving me the effort of
implementing my suggested method for tidying up the leftover sleep.

>)>Another plus of timeout 5.0: the signal is now a parameter too.
>)
>)Another unnecessary frill.  SIGTERM is the right signal to use - that's
>)why it's the default for the "kill" command.
>
>Again I don't agree; first there's the generality, then there's the fact
>that SIGHUP is used to signal exceptions too, and lastly both SIGALRM and
>SIGXCPU seem normal to send on a *timeout*.

Sorry, all three signals you mention are not right for this purpose.

SIGHUP:  you might want to do a "nohup timeout somecommand ... &"

SIGALRM: is not for "timing out" a process, it's for use by a process, e.g.
         for timing out a system call or for sleeping.  If the process is
         using SIGALRM, all your "time out" will do is wake it up early.

SIGXCPU: is for limiting resource usage, and in any case is non-standard.

The phrase "time out" when applied to a process really means "terminate
before normal completion".  When you want to *TERM*inate a process you use
SIG*TERM*.  Need I say more?
-- 
Geoff Clare, UniSoft Limited, Saunderson House, Hayne Street, London EC1A 9HH
gwc@root.co.uk  (Dumb mailers: ...!uunet!root.co.uk!gwc)  Tel: +44-1-315-6600
                                        (from 6th May 1990): +44-71-315-6600
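
The case for SIGTERM rests on the receiving program being able to catch
it and tidy up, which it cannot do with SIGKILL.  A minimal sketch of that
behaviour, assuming a POSIX-style shell; the scratch file name and the
`worker' label are made up for illustration and appear nowhere in the
scripts under discussion:

```shell
#!/bin/sh
# Hypothetical demo: a worker that removes its scratch file on SIGTERM.
tmp=${TMPDIR:-/tmp}/term_demo.$$

sh -c '
    # Install the handler first, then create the file, so that the
    # file existing implies the handler is already in place.
    trap "rm -f \"$1\"; exit 0" TERM
    : > "$1"
    while :; do sleep 1; done
' worker "$tmp" &
pid=$!

# Wait until the worker has created its scratch file.
while [ ! -f "$tmp" ]; do sleep 1; done

kill -TERM "$pid"     # polite request: the trap runs and cleans up
status=0
wait "$pid" || status=$?

if [ -f "$tmp" ]; then
    echo "scratch file left behind"
else
    echo "worker cleaned up, exit status $status"
fi
```

Replacing -TERM with -KILL in the sketch would leave the scratch file in
place, since SIGKILL cannot be caught.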
maart@cs.vu.nl (Maarten Litmaath) (03/02/90)
In article <1813@root44.co.uk>,
 gwc@root.co.uk (Geoff Clare) writes:
)...
)Glad to hear you've seen the light (at last :-).

My original script explored the (portable/V7) sh boundaries.
I wanted to distinguish between someone *else* killing the job and
*timeout* killing it; this was the purpose of the `-v' option (I know,
there's a race condition); as `timeout -v' already told me the job had
timed out, I didn't want the shell's `Killed' message; therefore I diddled
with file descriptor 2; as I wanted a synchronous message I had to invoke
an `sh -c' to do the real work; to kill a leftover sleep I had to use the
backquote construct.

)Your new script is the same as mine with one worthwhile addition and a
)few rather less useful (IMHO) ones.  Thanks for saving me the effort of
)implementing my suggested method for tidying up the leftover sleep.

There's still a better way, something like:

    for t in $timeout $delay
    do
        while test $t -gt $interval
        do
            sleep $interval
            kill -0 $$ || exit
            t=`expr $t - $interval`
        done
        sleep $t
        kill $SIG $$ && kill -0 $$ || exit
        SIG=-KILL
    done

)...
)SIGHUP:  you might want to do a "nohup timeout somecommand ... &"
                                 ^^^^^
Indeed.

)SIGALRM: is not for "timing out" a process, it's for use by a process, e.g.
)         for timing out a system call or for sleeping.  If the process is
)         using SIGALRM, all your "time out" will do is wake it up early.

In general you're right; however, is it inconceivable that the process has
been especially configured to clean up on reception of a SIGALRM?

)SIGXCPU: is for limiting resource usage, and in any case is non-standard.

So what?  From `man init' on SunOS 4.0.3c:

     init catches the hangup signal (SIGHUP) and interprets it to
     mean that the file /etc/ttytab should be read again.

"Boo hiss!  SIGHUP is for signaling a hangup on a terminal line!"

)The phrase "time out" when applied to a process really means "terminate
)before normal completion".  When you want to *TERM*inate a process you use
)SIG*TERM*.  Need I say more?

"No.  You need to say less."
        -- Richard Sexton, richard@gryphon.COM

Couldn't resist! :-)

The phrase "time out" when applied to a process really means "kill
before normal completion".  When you want to *KILL* a process you use
SIG*KILL*.  Sic!
-- 
"Belfast: a sentimental journey to the Dark Ages - Crusades & Witchburning
 - Europe's Lebanon - Book Now!" | maart@cs.vu.nl, uunet!mcsun!botter!maart
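
The loop above leans on `kill -0', which sends no signal and only reports
whether the process could be signalled.  A small sketch of that probe in
isolation, assuming a POSIX-style shell; `is_alive' is a hypothetical
helper name, not part of either timeout script:

```shell
#!/bin/sh
# Hypothetical helper: succeed if process $1 exists and we may signal it.
is_alive() {
    kill -0 "$1" 2>/dev/null
}

sleep 30 &
pid=$!

if is_alive "$pid"; then before=alive; else before=dead; fi

kill "$pid"                        # default signal is SIGTERM
wait "$pid" 2>/dev/null || true    # reap it; status reflects the signal

if is_alive "$pid"; then after=alive; else after=dead; fi

echo "before: $before, after: $after"
```

The probe is only a snapshot: the process can exit, and in principle its
pid can be reused, between the `kill -0' and whatever is done next.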
bph@buengc.BU.EDU (Blair P. Houghton) (03/03/90)
In article <5724@star.cs.vu.nl> maart@cs.vu.nl (Maarten Litmaath) writes:
>So what?  From `man init' on SunOS 4.0.3c:
>
>     init catches the hangup signal (SIGHUP) and interprets it to
>     mean that the file /etc/ttytab should be read again.
>
>"Boo hiss!  SIGHUP is for signaling a hangup on a terminal line!"

Yeah.  It's crufty all right, but when was the last time
init made a phone call?

                                --Blair
                                  "Holee-- ALASKA!?"
alex@impch.imp.com (Alex Hanselmann) (03/12/90)
In article <5669@star.cs.vu.nl> maart@cs.vu.nl (Maarten Litmaath) writes:
>You shouldn't say too quickly: "The user doesn't need it."
>That's the approach which leads to things like:
>
>       $ set a b c d e f g h i j
>       $ echo $10              # echo parameter 10
>       a0                      # oops!
>
>"The user doesn't need more than 9 arguments."

If you have a ksh or csh you can type:

        $ echo ${10}

Alex
+-------------------------------------------------------------------------+
| Alex Hanselmann, Laengistr 15, 4133 Pratteln, EMAIL: alex@imp.com       |
| ( UNIX && C ) makes it possible - ImproWare +41-61-82171-19 / 44 (FAX)  |
+-------------------------------------------------------------------------+
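
A short sketch of the difference, assuming a shell that accepts the braced
form (ksh as mentioned above; POSIX-style shells also accept it).  The
variable names `naive', `tenth' and `shifted' are illustrative only.  The
last part shows the `shift' route for a shell without `${10}':

```shell
#!/bin/sh
set -- a b c d e f g h i j

naive="$10"       # parsed as ${1} followed by a literal 0
tenth="${10}"     # braces select the tenth positional parameter

echo "naive: $naive"
echo "tenth: $tenth"

# Without brace support, shift the list so the tenth argument
# falls within $1..$9.
shift 9
shifted="$1"
echo "shifted: $shifted"
```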
gwc@root.co.uk (Geoff Clare) (03/12/90)
In article <5724@star.cs.vu.nl> maart@cs.vu.nl (Maarten Litmaath) writes:
|
|)SIGXCPU: is for limiting resource usage, and in any case is non-standard.
|
|So what?  From `man init' on SunOS 4.0.3c:
|
|     init catches the hangup signal (SIGHUP) and interprets it to
|     mean that the file /etc/ttytab should be read again.
|
|"Boo hiss!  SIGHUP is for signaling a hangup on a terminal line!"

Of course a program can choose to use any signal for its own purposes.
But that's not really relevant to the point under discussion, which
was what signal should be used for terminating processes in general.
The correct signal for this job is SIGTERM, because any well designed
program will clean up and exit ASAP when it receives a SIGTERM.

|)The phrase "time out" when applied to a process really means "terminate
|)before normal completion".  When you want to *TERM*inate a process you use
|)SIG*TERM*.  Need I say more?
|
|The phrase "time out" when applied to a process really means "kill
|before normal completion".  When you want to *KILL* a process you use
|SIG*KILL*.  Sic!

Looks like we're going round in circles here.  This was one of my original
objections to your old method.  SIGKILL should only be used as a last
resort.  Going straight for SIGKILL does not allow the process to clean up.

"... we came in?   <Pink Floyd - The Wall>   Isn't this where ..."

I think we're probably talking to ourselves here, Maarten.  Everyone else
put this subject in their KILL file ages ago.
-- 
Geoff Clare, UniSoft Limited, Saunderson House, Hayne Street, London EC1A 9HH
gwc@root.co.uk  (Dumb mailers: ...!uunet!root.co.uk!gwc)  Tel: +44-1-315-6600
                                        (from 6th May 1990): +44-71-315-6600
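
The "SIGTERM first, SIGKILL as a last resort" position can be sketched as
a shell function.  This is not either poster's timeout script: the name
`run_with_timeout' and its argument order are assumptions made for the
example, and it assumes a POSIX-style shell.  It also shares the pid-reuse
race already raised in this thread.

```shell
#!/bin/sh
# Hypothetical helper: run_with_timeout LIMIT GRACE COMMAND [ARG...]
# Sends SIGTERM after LIMIT seconds, then SIGKILL after GRACE more.
run_with_timeout() {
    limit=$1 grace=$2
    shift 2

    "$@" &
    cmd_pid=$!

    # Watchdog.  Its output is sent to /dev/null so that a leftover
    # sleep cannot hold the caller's pipes open.
    (
        sleep "$limit"
        kill -TERM "$cmd_pid" 2>/dev/null || exit 0   # already gone
        sleep "$grace"
        kill -KILL "$cmd_pid" 2>/dev/null             # last resort
    ) >/dev/null 2>&1 &
    watchdog_pid=$!

    status=0
    wait "$cmd_pid" || status=$?

    # Stop the watchdog.  The sleep it started may linger until its
    # time runs out, but it is harmless by then.
    kill "$watchdog_pid" 2>/dev/null || true
    wait "$watchdog_pid" 2>/dev/null || true
    return "$status"
}

rc_fast=0
run_with_timeout 5 1 true || rc_fast=$?
echo "quick command: status $rc_fast"

rc_slow=0
run_with_timeout 1 1 sleep 30 || rc_slow=$?
echo "slow command: status $rc_slow"
```

A command that finishes in time returns its own status; one that is cut
short returns non-zero, with the exact value depending on how the shell
reports death by signal.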