bhoughto@hopi.intel.com (Blair P. Houghton) (02/06/91)
In article <BZS.91Feb5105600@world.std.com> bzs@world.std.com (Barry Shein) writes:
>Just write a shell script loop which uses awk to suck out the TIME
>field of the procs as reported by ps and see if it's changing or not.

Gack. Scripts for daemons. Barry, you should be ashamed. :-)

Write it in C and use popen(3). (Use YACC for the parsing, if you
want it done to a crisp).

(Uh-oh. My BSD is showing. Does SysV know popen?)

A stronger reason to use C is that ps(1) often runs fields together;
it can also munge columnar justification. You'll have to detect and
work around those situations. (If you can characterize the various
forms of run-together fields, lex(1) can tokenize them together and
then yacc(1) can parse them apart).

For example (output of "ps alx | egrep 'TIME|foo'"):

F UID PID PPID CP PRI NI ADDR SZ RSS WCHAN STAT TT TIME COMMAND
94080012010 11236 11235255 96 41eba211194 9660 R N pa143:58 (foo)
b0000013092 17289 244 7 1 017232 41 21 be598 S 10 0:00 egrep TIME|

PID 17289 is fine. PID 11236 is a mess. awk(1) and
perl(1) would barf up gnodes at this. (Those of you who
smell challenge, smell aright.)

>For a problem like this I'll bet you a nickel crafting the whole thing
>in C using /dev/kmem etc will not be much faster than the above
>described script, and will take a week to get right instead of 30 minutes.

Better, use popen(3) and top(1). Top usually gets the data
much faster than ps. Why? Who knows? Could be anything
from superior skills among public-domain software developers
to abuse of /dev/null.

--Blair
  "My every keyword is copyright
   someone else, these days..."
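
The "sample the TIME field and see if it's changing" idea quoted above can be sketched without awk or yacc. What follows is a minimal illustration in Python, not anything posted in the thread. It assumes a ps that accepts the POSIX form `ps -e -o pid= -o time=`, which prints one PID and one cumulative CPU time per line with no header. The function names (`cpu_seconds`, `parse_snapshot`, `idle_pids`, `take_snapshot`) are hypothetical.

```python
import subprocess

def cpu_seconds(field):
    """Convert a ps TIME field such as '43:58', '1-02:03:04' or '0:01.85' to seconds."""
    days, _, rest = field.rpartition("-")
    secs = 0.0
    for part in rest.split(":"):
        secs = secs * 60 + float(part)
    return secs + (int(days) if days else 0) * 86400

def parse_snapshot(text):
    """Map pid -> CPU seconds from lines of the form '<pid> <time>'."""
    snapshot = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            snapshot[int(fields[0])] = cpu_seconds(fields[1])
    return snapshot

def idle_pids(before, after):
    """PIDs present in both samples whose CPU time did not change."""
    return sorted(pid for pid in before
                  if pid in after and before[pid] == after[pid])

def take_snapshot():
    """Run ps once. Assumes the POSIX -o pid= / -o time= output format."""
    out = subprocess.run(["ps", "-e", "-o", "pid=", "-o", "time="],
                         capture_output=True, text=True, check=True).stdout
    return parse_snapshot(out)
```

Calling `take_snapshot()` twice, some seconds apart, and passing both results to `idle_pids()` gives the list of processes whose reported CPU time stood still. The result is only as fine-grained as the TIME field ps prints.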
emv@ox.com (Ed Vielmetti) (02/06/91)
In article <2304@inews.intel.com> bhoughto@hopi.intel.com (Blair P. Houghton) writes:
>For example (output of "ps alx | egrep 'TIME|foo'"):
>F UID PID PPID CP PRI NI ADDR SZ RSS WCHAN STAT TT TIME COMMAND
>94080012010 11236 11235255 96 41eba211194 9660 R N pa143:58 (foo)
>b0000013092 17289 244 7 1 017232 41 21 be598 S 10 0:00 egrep TIME|
>PID 17289 is fine. PID 11236 is a mess. awk(1) and
>perl(1) would barf up gnodes at this. (Those of you who
>smell challenge, smell aright.)
awk and sed (and perl for that matter) would do OK as long as they
didn't assume that whitespace was a field delimiter; break on absolute
columns with substr() or unpack(). That's not to say that ps doesn't
have an interesting idea of how to jam fields together...
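
Breaking on absolute columns might look like the sketch below. It is an illustration only: the original column spacing of the ps output quoted above did not survive, so `HEADER` and `LINE` are a made-up slice that mimics the run-together PPID and CP fields. It also assumes each field is right-justified so that it ends where its header label ends. The helper names are hypothetical.

```python
import re

# Hypothetical aligned slice of a ps listing; not the real layout.
HEADER = "  PID  PPID CP PRI"
LINE   = "11236 11235255  96"

def column_spans(header):
    """Return (name, start, end) per label, each column ending where its label ends."""
    spans = []
    start = 0
    for match in re.finditer(r"\S+", header):
        spans.append((match.group(), start, match.end()))
        start = match.end()
    return spans

def split_fixed(line, spans):
    """Cut a line at absolute column positions instead of at whitespace."""
    return {name: line[start:end].strip() for name, start, end in spans}

fields = split_fixed(LINE, column_spans(HEADER))
```

Splitting `LINE` on whitespace yields three fields, with PPID and CP fused into `11235255`. The column-based split recovers all four.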
If portability is not an issue I'd stick something into top (ps, sps,
etc.) to print out the fields you want in a nice, tagged, easy to parse
format.
--Ed
emv@ox.com
P.S. C vs. awk vs. perl isn't a wizards issue at all. Hacking ps is.
jfh@rpp386.cactus.org (John F Haugh II) (02/06/91)
In article <2304@inews.intel.com> bhoughto@hopi.intel.com (Blair P. Houghton) writes:
>In article <BZS.91Feb5105600@world.std.com> bzs@world.std.com (Barry Shein) writes:
>>For a problem like this I'll bet you a nickel crafting the whole thing
>>in C using /dev/kmem etc will not be much faster than the above
>>described script, and will take a week to get right instead of 30 minutes.
>
>Better, use popen(3) and top(1). Top usually gets the data
>much faster than ps. Why? Who knows? Could be anything
>from superior skills among public-domain software developers
>to abuse of /dev/null.

Hmmm. I'm half tempted to take that bet. One problem I envision with
the PS approach is that the CPU resolution is to the full second, and
there are many processes which lurk about in the background and don't
use much more than a second in a day's time. Here's the PS output from
this system with those processes selected -

     UID   PID  PPID  C    STIME  TTY  TIME COMMAND
    root    30     1  0   Jan 24  ?    0:00 /etc/syslogd /usr/adm/syslog
      lp    35     1  0   Jan 24  ?    0:00 /usr/lib/lpsched

I'd wager that it is fairly easy to write a program that would do a
considerable amount of work and never record a single CPU tick.

PER PROCESS USER AREA:
USER ID's:      uid: 0, gid: 0, real uid: 0, real gid: 0
PROCESS TIMES:  user: 6, sys: 17, child user: 0, child sys: 0
PROCESS MISC:   proc slot: 8, cntrl tty: maj(4) min(2)
IPC:            locks: unlocked
FILE I/O:       user addr: 25708343, file offset: 357, bytes: 0,
                segment: user, umask: 26, ulimit: 2097152
ACCOUNTING:     command: syslogd, memory: 931552, type: fork
                start: Thu Jan 24 19:30:02 1991
OPEN FILES:     file desc:  0  1  2  3
                file slot:  9  9  9  8

This is most of the user structure for the "syslogd" process which I
PS'd above. It has logged 23 ticks in 13 days of running, yet it at
a minimum produces a log record once an hour. You can't log a fraction
of a tick, so it would appear there is about a 1 in 13 chance of the
hourly timestamp logging a tick.

The implication of this is that a process which does work (even "does
work every hour") may still log clock ticks so slowly that even after
a month of furious light activity (interesting use of the word
"furious" ;-), it still has yet to log a single full second of time.
-- 
John F. Haugh II        UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832 Domain: jfh@rpp386.cactus.org
"I've never written a device driver, but I have written a device driver manual"
                -- Robert Hartman, IDE Corp.
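
The arithmetic behind the "1 in 13" figure and the one-month estimate can be checked with a few lines of Python. The tick count, uptime and hourly record rate come from the post. The clock rate `hz` is an assumption, since the post does not say how many ticks make a second on that system, so three common values are tried.

```python
TICKS_LOGGED = 23        # user 6 + sys 17, from the user structure above
DAYS_RUNNING = 13
RECORDS_PER_DAY = 24     # one log record per hour, the stated minimum

records = DAYS_RUNNING * RECORDS_PER_DAY
ticks_per_record = TICKS_LOGGED / records

def seconds_logged(days, hz):
    """CPU seconds accumulated after `days` at the observed tick rate.

    hz (clock ticks per second) is an assumption; the post does not state it.
    """
    return TICKS_LOGGED / DAYS_RUNNING * days / hz

print(f"{records} records, 1 tick per {1 / ticks_per_record:.1f} records")
for hz in (50, 60, 100):
    print(f"HZ={hz}: {seconds_logged(30, hz):.2f} s after 30 days")
```

With 312 hourly records and 23 ticks, that is about one tick per 13.6 records, in line with the estimate in the post. Thirty days at the same rate comes to about 53 ticks, which is under one second at 60 or 100 ticks per second and just over one second at 50.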
bzs@world.std.com (Barry Shein) (02/07/91)
From: jfh@rpp386.cactus.org (John F Haugh II)
>Hmmm. I'm half tempted to take that bet. One problem I envision with
>the PS approach is that the CPU resolution is to the full second, and
>there are many processes which lurk about in the background and don't
>use much more than a second in a day's time.

That's a good point (see, you can tell the "wizards", they're the ones
willing to admit they may be wrong...:-)

Looks like we need another option to ps...to increase clock res.

Seriously, grokking around in the kernel proc structures tends to be
fraught with peril unless you're really pre-disposed to that sort of
thing. The next best suggestion would be to try to find a reliable
program distributed with source to just modify for this task.

"Top" comes to mind, modifying top to do this, or even just rip out
the full-screenness (there's a flag for this, -b) and modify the
print-out to include more precision and then revert to the script
idea.

Whatever, at that point the rest of the code is probably pretty
simple, just sleep, sweep an array, and kill if desired (how easy this
is depends on how top is structured internally.)
-- 
        -Barry Shein

Software Tool & Die    | bzs@world.std.com    | uunet!world!bzs
Purveyors to the Trade | Voice: 617-739-0202  | Login: 617-739-WRLD
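
The "sleep, sweep an array, and kill if desired" loop could be sketched roughly as follows. This is a hypothetical Python outline, not code from top or from the thread. `sweep` and its arguments are invented names, and the CPU-time samples are passed in as plain dictionaries so the logic does not depend on where they come from (ps, a modified top, or anything else).

```python
def sweep(last_seen, snapshot, now, idle_limit):
    """Update last_seen in place and return PIDs idle for idle_limit seconds.

    last_seen maps pid -> (cpu_seconds, time_of_last_change).
    snapshot  maps pid -> cpu_seconds for the current sample.
    """
    # Forget processes that have exited since the previous sweep.
    for pid in list(last_seen):
        if pid not in snapshot:
            del last_seen[pid]

    idle = []
    for pid, cpu in snapshot.items():
        prev = last_seen.get(pid)
        if prev is None or prev[0] != cpu:
            last_seen[pid] = (cpu, now)   # new process, or CPU time moved
        elif now - prev[1] >= idle_limit:
            idle.append(pid)              # unchanged for long enough
    return sorted(idle)
```

A driver would call `sweep()` after each sleep and decide what to do with the returned PIDs, for instance passing them to `os.kill()`. That decision is left out here on purpose.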
jfh@rpp386.cactus.org (John F Haugh II) (02/07/91)
In article <BZS.91Feb6211712@world.std.com> bzs@world.std.com (Barry Shein) writes:
>Seriously, grokking around in the kernel proc structures tends to be
>fraught with peril unless you're really pre-disposed to that sort of
>thing.

Hmmm ;-)

> The next best suggestion would be to try to find a reliable
>program distributed with source to just modify for this task.

I posted a "crash" thing to alt.sources some time ago (or was it
comp.sources.misc? I forget ...) If enough people are interested,
I'll repost. It works on my SCO XENIX box, but that's all I can
say ...
-- 
John F. Haugh II        UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832 Domain: jfh@rpp386.cactus.org
"I've never written a device driver, but I have written a device driver manual"
                -- Robert Hartman, IDE Corp.
scott@convergent.com (Scott Lurndal) (02/08/91)
|> In article <2304@inews.intel.com> bhoughto@hopi.intel.com (Blair P. Houghton) writes:
|> >In article <BZS.91Feb5105600@world.std.com> bzs@world.std.com (Barry Shein) writes:
|> >>For a problem like this I'll bet you a nickel crafting the whole thing
|> >>in C using /dev/kmem etc will not be much faster than the above
|> >>described script, and will take a week to get right instead of 30 minutes.
|> >
|> >Better, use popen(3) and top(1). Top usually gets the data
|> >much faster than ps. Why? Who knows? Could be anything
|> >from superior skills among public-domain software developers
|> >to abuse of /dev/null.
|>

Actually the best solution would be to use the /proc file system code
if you have an SVR4.0 system. Use opendir(3)/readdir(3) to fetch each
process number, open it, issue a PIOCPSINFO ioctl(2), and close it.

The PIOCPSINFO ioctl(2) will return the system and user times in
seconds and nanoseconds (amongst other things). This should be
sufficient resolution to determine whether the process is really doing
anything. Be aware that although the resolution of the field is in
nanoseconds, some systems may only support microsecond or millisecond
resolution.

Scott.
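
The same walk-the-directory pattern can be illustrated in Python, with one loud caveat: the sketch below does not use the SVR4 PIOCPSINFO ioctl described above. It swaps in the Linux procfs text interface instead, reading `/proc/<pid>/stat`, where fields 14 and 15 hold user and system time in clock ticks. The function names and `SAMPLE_STAT` are made up for the example.

```python
import os

# Made-up example of the leading fields of a Linux /proc/<pid>/stat line.
SAMPLE_STAT = "30 (sys logd) S 1 30 30 0 -1 4194560 100 0 0 0 6 17 0 0 20 0 1 0"

def parse_stat(text):
    """Return (pid, comm, utime_ticks, stime_ticks) from a stat line."""
    lpar = text.index("(")
    rpar = text.rindex(")")        # comm may itself contain spaces or ')'
    pid = int(text[:lpar])
    comm = text[lpar + 1:rpar]
    rest = text[rpar + 1:].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11] and rest[12].
    return pid, comm, int(rest[11]), int(rest[12])

def cpu_times():
    """Walk /proc, yielding (pid, comm, user_seconds, system_seconds)."""
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    for name in os.listdir("/proc"):
        if not name.isdigit():
            continue               # skip non-process entries
        try:
            with open(f"/proc/{name}/stat") as stat_file:
                pid, comm, utime, stime = parse_stat(stat_file.read())
        except OSError:
            continue               # process exited between listdir and open
        yield pid, comm, utime / hz, stime / hz
```

The resolution here is one clock tick rather than nanoseconds, so this shows the directory walk but not the finer timing that the ioctl is said to offer.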
torek@elf.ee.lbl.gov (Chris Torek) (02/18/91)
>From: jfh@rpp386.cactus.org (John F Haugh II)
>>... One problem I envision with the PS approach is that the CPU
>>resolution is to the full second, and there are many processes which
>>lurk about in the background and don't use much more than a second
>>in a day's time.

In article <BZS.91Feb6211712@world.std.com> bzs@world.std.com (Barry Shein) writes:
>Looks like we need another option to ps...to increase clock res.

4.3reno++ (4.3785?) already gives time to 10 ms resolution:

% ps u
USER       PID %CPU %MEM   VSZ  RSS TT  STAT STARTED      TIME COMMAND
torek     9409 13.3  0.7   128   55 q1  R+    2:28AM   0:00.27 ps u
torek     9339  0.2  0.7   150   37 q1  Ss    1:10AM   0:01.85 -csh (csh)

>Seriously, grokking around in the kernel proc structures tends to be
>fraught with peril unless you're really pre-disposed to that sort of
>thing.

Actually, there is a more basic problem, at least under 4BSD. It is
not difficult to write a program that uses 80 to 90% of the total
available CPU time yet shows 0% CPU usage. This is due to a phase
interaction between the scheduler and the process accounting code.
They both run off the same clock, and you can rig things so that your
process is not running when the next clock interrupt fires.

I will not give out the details here, since defeating the accounting
code causes the scheduler to think that your process is being bullied
and thus it gets higher priority than all the others, allowing it to
continue in its evil ways. In other words, this goofs up the usual
resource sharing, and someone could use it to hog all the cycles.
(Fortunately, if someone *does*, and two people do it, the two
processes wind up exposing each other's tricks. The clever bad guy
will find a way around this as well [such a way does exist].)
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab EE div (+1 415 486 5427)
Berkeley, CA            Domain: torek@ee.lbl.gov
sears@cello.hpl.hp.com (Bart Sears) (02/22/91)
Anyone who is interested in the various problems one can run into
using statistical timing and ways to get around some of the problems
of statistical timing might want to take a look at the proceedings of
last year's USENIX Mach workshop. David Black presented a paper on
"The Mach Timing Facility: An Implementation of Accurate Low-Overhead
Usage Timing" where he described a timing facility using timestamps
instead of statistical timing. While it was implemented in Mach, it
was not really tied to any Mach features and would probably be fairly
easy to implement in any flavor of Unix.

This facility would take care of the problem Chris Torek pointed out
where under many current systems it is possible to write a program
which uses 80+% of the system yet is charged for 0% CPU usage.

Bart Sears
sears@hplabs.hpl.hp.com
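
The difference between the two accounting styles can be seen in a toy simulation. The Python sketch below is not taken from the paper or from any kernel. It models a hypothetical process that runs for a fixed slice of every clock period, and compares a charge made only when the clock interrupt finds the process running (statistical timing) with a charge based on start and stop timestamps. All names and numbers are illustrative.

```python
def simulate(tick_us=10_000, run_us=8_000, offset_us=1_000, periods=1_000):
    """Compare tick-sampled and timestamp-based CPU accounting.

    Hypothetical workload: in every clock period the process runs from
    offset_us to offset_us + run_us after the period starts.
    Returns (sampled_fraction, stamped_fraction) of total elapsed time.
    """
    sampled_us = 0
    stamped_us = 0
    for period in range(periods):
        start = period * tick_us + offset_us
        stop = start + run_us
        # Timestamp accounting: charge the exact interval spent on the CPU.
        stamped_us += stop - start
        # Statistical accounting: charge one whole tick, but only if the
        # process is on the CPU at the instant the interrupt fires.
        interrupt = (period + 1) * tick_us
        if start <= interrupt < stop:
            sampled_us += tick_us
    total_us = periods * tick_us
    return sampled_us / total_us, stamped_us / total_us

quiet_at_tick = simulate()                   # off the CPU at every interrupt
busy_at_tick = simulate(offset_us=5_000)     # on the CPU at every interrupt
```

With the default arguments the process is never running when the interrupt fires, so the sampled charge is zero while the timestamp charge is 80% of elapsed time. Shifting the offset so the run straddles the interrupt makes the sampled figure overshoot to 100% for the same 80% of real work.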