lijewski@theory.tn.cornell.edu (Mike Lijewski) (05/03/91)
Perl users,

Appended is a script called 'governor' which I'm working on.  The intent
is to monitor one of our frontend machines for heavy usage, nicing or
killing CPU-bound processes which should be running on our backend
machines.  While running, I've seen the perl process grow to roughly
10Mbytes on our IBM 3090 running AIX/370.  The version of perl is 3.44.
I would appreciate it if anyone could tell me why it is so memory
inefficient.  A typical 'ps -ef' returns 150 or so lines on the machine.
Thanks.

#!/usr/local/bin/perl
#
# Subroutine used with "sort" to do a numerical sort on the pid field
# of the 'ps -ef' output, which is the second field.  Note that even
# though the pid is in field #2, we are actually checking the "third"
# field here since Perl numbers arrays beginning at 0.  The first
# blank-delimited field returned by "split" is null in our case since
# there is always some whitespace preceding the user name.
#
sub pid {
    local(@a) = split(/[ \t]+/, $a);    # split on whitespace
    local(@b) = split(/[ \t]+/, $b);    # split on whitespace
    $a[2] <=> $b[2];
}

#
# This subroutine calls "ps -ef", deletes all processes owned by root,
# and then sorts it by pid.
#
sub get_sorted_ps {
    open (PS, 'ps -ef |') || die "Couldn't open ps pipe: $!";
    #open (PS, 'ps |') || die "Couldn't open ps pipe: $!";
    @ps2 = <PS>;                        # slurp up the ps output
    shift @ps2;                         # chop off the header
    # Delete root processes and sort by pid.
    @ps2 = sort pid grep(!/^ *root/, @ps2);
    close (PS);
}

#
# This subroutine finds those processes using "too much" CPU time.
#
sub find_bad_dudes {
    local(@merged_pids) = sort pid (@ps1, @ps2);        # merge old and new
    local(@line1) = split(/[ \t]+/, $merged_pids[0]);   # parse first element
    for ($i = 1; $i <= $#merged_pids; $i++) {           # loop through lines
        local(@line2) = split(/[ \t]+/, $merged_pids[$i]);
        # if pids are identical and time fields are different
        if (($line1[2] == $line2[2]) && ($line1[7] ne $line2[7])) {
            # found a potential bad dude
            local(@time1) = split(/:/, $line1[7]);
            local(@time2) = split(/:/, $line2[7]);
            local($cpu_rate) = ((60 * $time2[0] + $time2[1]) -
                                (60 * $time1[0] + $time1[1])) / $sleep_interval;
            # make sure cpu rate is positive
            if ($cpu_rate < 0) { $cpu_rate = -$cpu_rate; }
            if ($cpu_rate > $cpu_threshold) {
                # we've found a cpu burner
                print "BURNER: @line2";
            }
        }
        @line1 = @line2;                # update last line
    }
}

#
# ********** main routine **********
#

#
# global variables
#
$sleep_interval = 10;   # how long we sleep between checking process statistics
$cpu_threshold = 0.01;  # what we consider an unreasonable amount of cpu usage
@ps1 = ();              # contains old "ps" output
@ps2 = ();              # contains new "ps" output

print "starting...\n";
&get_sorted_ps;         # get most recent process statistics

for (;;) {              # do forever
    sleep $sleep_interval;
    print "waking up...\n";
    @ps1 = @ps2;        # save previous process statistics
    &get_sorted_ps;     # get most recent process statistics
    &find_bad_dudes;
    #last;
}
exit(0);
--
Mike Lijewski                 (H)607/272-0238  (W)607/254-8686
Cornell National Supercomputer Facility
ARPA: mjlx@eagle.cnsf.cornell.edu    BITNET: mjlx@cornellf.bitnet
SMAIL: 25 Renwick Heights Road, Ithaca, NY 14850
tchrist@convex.COM (Tom Christiansen) (05/03/91)
From the keyboard of lijewski@theory.tn.cornell.edu (Mike Lijewski):
:
:Perl users,
:
:Appended is a script called 'governor' which I'm working on.  The
:intent is to monitor one of our frontend machines for heavy usage,
:nicing or killing CPU-bound processes which should be running on our
:backend machines.  While running, I've seen the perl process grow to
:roughly 10Mbytes on our IBM 3090 running AIX/370.  The version of
:perl is 3.44.  I would appreciate it if anyone could tell me why it
:is so memory inefficient.  A typical 'ps -ef' returns 150 or so
:lines on the machine.  Thanks.

You're doing a few things that really slow you down, and a few things
that really gobble up memory.

First, the memory.  You're using local()s inside of loops.  That
gobbles up more and more memory until you finally exit that scope.
For example:

    for ($i = 1; $i <= $#merged_pids; $i++) {   # loop through lines
        local(@line2) = split(/[ \t]+/, $merged_pids[$i]);

This will make $#merged_pids copies of @line2, which is itself going
to use up a bunch of memory.  Declare the local() outside the loop in
cases like this.

Another thing that you're doing that sucks up memory and cpu is a lot
of splitting.  Splitting is expensive.  Instead of the split line
above, you could use:

    ($pid2, $time2) = $merged_pids[$i] =~ /^\s*\S+\s+(\d+)[^:]+(\d+:\d+)/;

This is also nice because it doesn't care how many fields or columns
over the times are.  This varies a lot on different machines.  (By the
way, to split without leading null fields, split on ' ' instead of on
/\s+/.)

The slowest thing of all is the way you're doing the sorting.  You
split each line many, many times.  This will take nearly forever.  You
should use the "sort the indices" trick so that you only pull out what
you need once.

Here's a hacked-up version of your program that runs fine on my
machine, pretty quickly, and without too much memory use.

--tom

#!/usr/local/bin/perl

$ps_opts = -f '/vmunix' ? 'axu' : '-ef';

sub sort_by_pid {
    local(@pids) = ();
    for (@_) {
        /^\s*\S+\s+(\d+)/ && push(@pids, $1);
    }
    @_[sort _by_pid 0..$#pids];
}

sub _by_pid { $pids[$a] <=> $pids[$b]; }

#
# This subroutine calls ps, deletes all processes owned by root, and
# then sorts it by pid.
#
sub get_sorted_ps {
    open (PS, "ps $ps_opts |") || die "Couldn't open ps pipe: $!";
    @ps2 = <PS>;                # slurp up the ps output
    shift @ps2;                 # chop off the header
    # Delete root processes and sort by pid.
    @ps2 = &sort_by_pid(grep(!/^ *root\b/, @ps2));
    close (PS);
}

#
# This subroutine finds those processes using "too much" CPU time.
#
sub find_bad_dudes {
    local(@merged_pids) = &sort_by_pid(@ps1, @ps2);     # merge old and new
    local($i, $min1, $min2, $sec1, $sec2, $cpu_rate,
          $pid1, $pid2, $time1, $time2);

    ($pid1, $time1) = $merged_pids[0] =~ /^\s*\S+\s+(\d+)[^:]+(\d+:\d+)/;

    for ($i = 1; $i <= $#merged_pids; $i++) {           # loop through lines
        ($pid2, $time2) = $merged_pids[$i] =~ /^\s*\S+\s+(\d+)[^:]+(\d+:\d+)/;
        # if pids are identical and time fields are different
        if (($pid1 == $pid2) && ($time1 ne $time2)) {
            # found a potential bad dude
            ($min1, $sec1) = $time1 =~ /(\d+):(\d+)/;
            ($min2, $sec2) = $time2 =~ /(\d+):(\d+)/;
            $cpu_rate = ((60 * $min2 + $sec2) -
                         (60 * $min1 + $sec1)) / $sleep_interval;
            # make sure cpu rate is positive
            if ($cpu_rate < 0) { $cpu_rate = -$cpu_rate; }
            if ($cpu_rate > $cpu_threshold) {
                # we've found a cpu burner
                print "BURNER: ", $merged_pids[$i];
            }
        }
        ($pid1, $time1) = ($pid2, $time2);              # update last line
    }
}

#
# ********** main routine **********
#

#
# global variables
#
$sleep_interval = 10;   # how long we sleep between checking process statistics
$cpu_threshold = 0.01;  # what we consider an unreasonable amount of cpu usage
@ps1 = ();              # contains old "ps" output
@ps2 = ();              # contains new "ps" output

print "starting...\n";
&get_sorted_ps;         # get most recent process statistics

for (;;) {              # do forever
    sleep $sleep_interval;
    print "waking up...\n";
    @ps1 = @ps2;        # save previous process statistics
    &get_sorted_ps;     # get most recent process statistics
    &find_bad_dudes;
    #last;
}
exit(0);
--
Tom Christiansen       tchrist@convex.com      convex!tchrist
 "So much mail, so little time."
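[The "sort the indices" trick Tom describes can be seen in isolation.
The following is a minimal sketch with made-up sample lines; like
Tom's regex, it assumes each line has optional leading whitespace, a
user field, and then a numeric pid.]

```perl
# "Sort the indices": extract each sort key exactly once, sort a list
# of array indices by those keys, then use the sorted index list as an
# array slice to reorder the original lines.
@lines = ("  mjlx   202  0:03 emacs\n",
          "  mjlx    13  1:27 cruncher\n",
          "  mjlx   150  0:00 sh\n");

# decorate: pull the pid out of each line just once
for (@lines) { /^\s*\S+\s+(\d+)/ && push(@pids, $1); }

sub by_pid { $pids[$a] <=> $pids[$b]; }

# sort the index list 0..$#lines by pid, then slice @lines with it
@sorted = @lines[sort by_pid 0..$#lines];
print @sorted;      # pids now come out in order 13, 150, 202
```

No matter how many comparisons the sort performs, each line is matched
only once; Tom's sort_by_pid wraps this same idea in a subroutine.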
usenet@carssdf.UUCP (John Watson) (05/04/91)
In article <1991May2.212216.24563@batcomputer.tn.cornell.edu>,
lijewski@theory.tn.cornell.edu (Mike Lijewski) writes:
>
> ... I've seen the perl process grow to roughly 10Mbytes on our
> IBM 3090 running AIX/370.  The version of perl is 3.44. .....
> --
> Mike Lijewski                 (H)607/272-0238  (W)607/254-8686
> Cornell National Supercomputer Facility

You owe it to yourself to try version 4.0 at patchlevel 3.  There were
several fixes that reduced memory leaks getting to 4.0, and then one
more fix in the patch to level 3.  All my programs used to do the same
thing and are all stable now.

John Watson      self-employed in New Jersey