[comp.realtime] Wanted: Performance Analysis Tools on PCs Info

rlk@telesoft.com (Bob Kitzberger @sation) (05/23/91)

[This is something that folks in comp.realtime can probably help out on, 
so I'm cross-posting there]

Del Gordon writes:

>    project where Ada and PCs are required, and run-time performance
>    analysis is required to show that a given program does not use more
>    than a given percentage of the CPU or memory at any time.  

"at any time" requires clarification.  For any given instruction cycle, your
program use will be either 0% (running in slack task) or 100% (running in
application code).  Of course, what is probably intended is "a given program
does not use more than a given percentage of the CPU over any 1 second span"
or somesuch.  This introduces a complexity -- where to measure the beginning
and ending of the 1 second span?  If you measure at every integral second
(i.e.  0 seconds, 1 sec, 2 secs, etc) then you may miss an overload during
the 0.5 seconds through 1.5 seconds interval, for example.  Getting clear on
this type of issue is essential before you start measuring things.  
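
To make the windowing question concrete, here's a rough sketch in Ada (the
names, the 10-second trace, and the per-millisecond resolution are all just
illustrative) that takes a recorded busy/idle trace and finds the worst-case
CPU use over _any_ window of a given width, rather than only over the windows
that begin on integral seconds:

    with Text_IO;
    procedure Window_Demo is

       --  One flag per millisecond of a recorded 10-second run;
       --  True means the CPU was busy during that millisecond.
       type Ms_Index is range 1 .. 10_000;
       type Trace    is array (Ms_Index) of Boolean;

       T : Trace := (others => False);

       --  Worst-case percent CPU use over ANY Window_Ms-millisecond
       --  span, computed with a sliding window over the trace.
       function Worst_Case_Percent (Tr : Trace; Window_Ms : Positive)
         return Natural
       is
          In_Window : Natural := 0;   --  busy ms in the current window
          Worst     : Natural := 0;
       begin
          for I in Tr'Range loop
             if Tr (I) then
                In_Window := In_Window + 1;
             end if;
             if Integer (I) > Window_Ms
               and then Tr (Ms_Index (Integer (I) - Window_Ms))
             then
                In_Window := In_Window - 1;   --  drop the oldest millisecond
             end if;
             if Integer (I) >= Window_Ms and then In_Window > Worst then
                Worst := In_Window;
             end if;
          end loop;
          return (Worst * 100) / Window_Ms;
       end Worst_Case_Percent;

    begin
       --  The overload from the text: busy from 0.5 sec through 1.5 sec.
       --  Both windows aligned on integral seconds see only ~50% load;
       --  the sliding window correctly reports 100%.
       for I in Ms_Index range 500 .. 1_500 loop
          T (I) := True;
       end loop;
       Text_IO.Put_Line ("worst 1-sec window ="
          & Natural'Image (Worst_Case_Percent (T, 1_000)) & "%");
    end Window_Demo;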

If you have cyclic tasks, then a reasonable interval to measure unspent
time may be on the period boundaries of the lowest frequency task.

>    			a call to that vendor indicates that they
>    have no performance analysis tools available for their compiler.  I
>    know other vendors have Ada performance analysis tools for other
>    platforms such as Sun.  However, this project requires PCs.

Vendor-provided performance analysis tools tend to be profilers, which are
based on periodic interrupts of the application.  This may or may not 
be appropriate for a given system's measurement requirements...  problems 
with profiling include poor measurement resolution, non-negligible
overhead, and the likelihood of missing worst-case situations.

For system tuning, a profiler is a great tool to help out in finding 'hot
spots'.  For verification of system timing correctness, a profiler is
just about useless.

> 	Finally, if all else fails and we have to "grow our own," does
>    anybody have any experience with performance analysis (or code ;)
>    they'd like to share?

To find the amount of unused CPU time on a system, I've used the following:
Implement a background task, with priority lower than all other tasks in the
system.  This task will 'eat up' any excess CPU time.  The background task
should do something repetitious and predictable, like incrementing several
global memory locations in a tight loop.  We'll call this the slack task.

Your next highest priority task should probably be the lowest frequency
task (if you are following Rate Monotonic Scheduling).  At each period
boundary, it can check the value of the global variables being updated
by the slack task.  If you know how quickly the slack task increases
these global counters when the system load is nil (nominal case), then you 
can calculate the amount of time spent in the slack task over the
measurement period.  100% minus the time spent in the slack task is your
system load.  Determination of the maximum rate at which the slack task can
update global counters is pretty straightforward, and only needs to be done
once.
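
In Ada terms, a skeleton might look like the following.  It's only a sketch:
the priorities, the one-second period, and Max_Counts are placeholder numbers
you'd calibrate on your own target, and counter wraparound and arithmetic
overflow are ignored for brevity.

    with Text_IO;
    procedure Slack_Demo is

       Slack_Count : Natural := 0;
       pragma Shared (Slack_Count);   --  written by Slack, read by Monitor
                                      --  (pragma Atomic in later Adas)

       task Slack is
          pragma Priority (1);        --  below every application task
       end Slack;

       task Monitor is
          pragma Priority (2);        --  stands in for the lowest-frequency task
       end Monitor;

       task body Slack is
       begin
          loop
             --  Eat up all otherwise-idle cycles, predictably.
             Slack_Count := Slack_Count + 1;
          end loop;
       end Slack;

       task body Monitor is
          Period     : constant Duration := 1.0;        --  its own period
          Max_Counts : constant Natural  := 1_000_000;  --  calibrate once, idle
          Last, Now  : Natural := 0;
          Slack_Pct  : Integer;
       begin
          loop
             delay Period;   --  a real cyclic executive would pace this;
                             --  a plain delay drifts
             Now       := Slack_Count;
             Slack_Pct := ((Now - Last) * 100) / Max_Counts;
             Text_IO.Put_Line ("CPU load ="
                & Integer'Image (100 - Slack_Pct) & "%");
             Last := Now;
          end loop;
       end Monitor;

    begin
       null;   --  the main program just elaborates the two tasks
    end Slack_Demo;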

This method is slightly intrusive.  Non-intrusive methods generally require
a logic analyzer with _deep_ measurement buffers, and the ability to do other
than simple statistical sampling.

As far as measuring the maximum memory usage of an application, some
compiler vendors provide high-water marks for heap usage...  it is really 
easy to implement from a vendor's perspective, and can be done with 
very little runtime overhead.  Stack usage measurement, on the other hand,
is expensive to implement at runtime, since each stack growth must be
burdened with code to conditionally set the high-water mark.
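
For what it's worth, the heap side can be as small as the sketch below
(Heap_Stats and its routine names are mine, not any vendor's interface); a
vendor would just bury the same few lines inside the runtime's allocator and
deallocator:

    package Heap_Stats is
       Current    : Natural := 0;   --  bytes allocated right now
       High_Water : Natural := 0;   --  most ever allocated at once
       procedure Note_Allocate (Bytes : in Natural);
       procedure Note_Free     (Bytes : in Natural);
    end Heap_Stats;

    package body Heap_Stats is

       procedure Note_Allocate (Bytes : in Natural) is
       begin
          Current := Current + Bytes;
          if Current > High_Water then
             --  This conditional is the entire runtime cost.
             High_Water := Current;
          end if;
       end Note_Allocate;

       procedure Note_Free (Bytes : in Natural) is
       begin
          Current := Current - Bytes;
       end Note_Free;

    end Heap_Stats;

Call them on every allocation and deallocation, then read High_Water after a
worst-case run.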

Hope this helps,

	.Bob.
-- 
Bob Kitzberger               Internet : rlk@telesoft.com
TeleSoft                     uucp     : ...!ucsd.ucsd.edu!telesoft!rlk
5959 Cornerstone Court West, San Diego, CA  92121-9891  (619) 457-2700 x163
------------------------------------------------------------------------------
"Wretches, utter wretches, keep your hands from beans!"	-- Empedocles

vestal@SRC.Honeywell.COM (Steve Vestal) (05/24/91)

In article <1991May23.063331.13782@telesoft.com> rlk@telesoft.com (Bob Kitzberger @sation) writes:

Bob> Your next highest priority task should probably be the lowest frequency
Bob> task (if you are following Rate Monotonic Scheduling).  At each period
Bob> boundary, it can check the value of the global variables being updated
Bob> by the slack task.  If you know how quickly the slack task increases
Bob> these global counters when the system load is nil (nominal case), then you 
Bob> can calculate the amount of time spent in the slack task over the
Bob> measurement period.  100% minus the time spent in the slack task is your
Bob> system load.  Determination of the maximum rate at which the slack task can
Bob> update global counters is pretty straightforward, and only needs to be done
Bob> once.

I've done something similar to this; it works reasonably well.  I got fairly
close agreement between the measured utilizations etc. and analytically
predicted values (inputs to the analytic model were obtained via a "standard"
benchmark suite to measure various scheduling overheads).  There's a Tri-Ada
'90 paper about this.

Steve Vestal
Mail: Honeywell S&RC MN65-2100, 3660 Technology Drive, Minneapolis MN 55418 
Phone: (612) 782-7049                    Internet: vestal@src.honeywell.com