[comp.sys.sun] Accurate time measurements on SS1+, 3/260, etc.

rcomr@koel.co.rmit.oz.au (Mark Rawling) (12/20/90)

I am trying to make some accurate timing measurements of function call
times (user times only). It is intended to have the code monitor itself
something like :-

    ...
    get_current_time(&start);
    function_to_be_timed();
    get_current_time(&stop);
    function_time = stop - start;
    ...

The problem is that getrusage is limited to 10 ms resolution (on the SS1+,
and worse elsewhere). Thus the rusage.ru_utime.tv_usec field is always
nn0000 microseconds. The only thing I can see that is potentially better
than this is "time (3V)" which returns a milliseconds field. But of
course, this is real time and would require running single user, disabling
interrupts, etc.
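
For readers following along, here is a minimal sketch of how the
get_current_time() placeholder above might be built on getrusage. The
choice of a double holding seconds is an assumption made so that
"stop - start" works as written; the original post does not say what type
start and stop have.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Store this process's accumulated user-mode CPU time, in seconds, in *t.
 * The double-in-seconds representation is an assumption for illustration.
 * On failure *t is set to -1.0.
 */
void get_current_time(double *t)
{
    struct rusage ru;

    assert(t != NULL);
    if (getrusage(RUSAGE_SELF, &ru) != 0) {
        *t = -1.0;
        return;
    }
    /* ru_utime is a struct timeval: whole seconds plus microseconds. */
    *t = (double)ru.ru_utime.tv_sec + (double)ru.ru_utime.tv_usec / 1e6;
}
```

The value returned is only as fine-grained as the kernel's accounting, so
on a system with a 10 ms tick two calls made close together may well
return the same number.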

Has anyone looked at this before? Is it possible to raise the system clock to
1 kHz or higher (temporarily) and still run multiuser? Failing this, is it
possible to get something more accurate than microseconds out of the real
time chip, and if so, how?

If all else fails I will have to resort to a sbus/vme - PC interface and
read real time from a PC card :-( Can someone *please* save me from this
fate!

Mark Rawling,    CSIRO Division of Information Technology,
	         High Performance Computation Group,
                 c/o Royal Melbourne Institute of Technology,
                 email: rcomr@koel.co.rmit.oz{.au}, phone: (+ 61 3) 660 2726

gemed.kohli@uwm.edu (Jim Kohli) (12/30/90)

In article <915@brchh104.bnr.ca>, rcomr@koel.co.rmit.oz.au (Mark
Rawling) writes:
<I am trying to make some accurate timing measurements of function call
<times (user times only). It is intended to have the code monitor itself
<something like :-
<
<    ...
<    get_current_time(&start);
<    function_to_be_timed();
<    get_current_time(&stop);
<    function_time = stop - start;
<    ...
<
<The problem is that getrusage is limited to 10 ms resolution (on the SS1+,
<and worse elsewhere). Thus the rusage.ru_utime.tv_usec field is always
<nn0000 microseconds. The only thing I can see that is potentially better
<than this is "time (3V)" which returns a milliseconds field. But of
<course, this is real time and would require running single user, disabling
<interrupts, etc.

It is not unusual to do something like this:

    ...
    get_current_time(&start);
    for (i = 0; i < 1000; i++) function_to_be_timed();
    get_current_time(&stop);
    function_time = (float)(stop - start)/1000.0;
    ...

Jim Kohli
GE Medical

oj@saber.com (01/01/91)

Mark wrote that he's trying to make function call timings by using code
like the following, and he can't get enough resolution out of the system
clock.

       ...
       get_current_time(&start);
       function_to_be_timed();
       get_current_time(&stop);
       function_time = stop - start;
       ...

I've done this sort of measurement several times, and I've always used a
method like this:

       get_current_time(&start);
       for (i = 0 ; i < COUNT ; i++ ) {
         ...
         function_to_be_timed();
         ...
       }
       get_current_time(&stop);
       experiment_time = stop - start;

       get_current_time(&start);
       for (i = 0 ; i < COUNT ; i++ ) {
         ...
         /* function_to_be_timed(); */
         ...
       }
       get_current_time(&stop);
       control_time = stop - start;
       total_function_time = experiment_time - control_time;
       function_time = total_function_time / COUNT;

I've gotten the best results when I chose a COUNT value which caused the
total_function_time to be at least 100 of whatever clock ticks the machine
provides.  This allows an accuracy of about 2% in the final function_time
values.
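
The experiment/control scheme above can be sketched as compilable C. This
is only an illustration under stated assumptions: user_seconds() is a
hypothetical helper built on getrusage, the body of function_to_be_timed()
is a made-up workload, and the "..." parts of the loops are left empty.
The 2% figure presumably follows from a reading being uncertain by about
one tick at each end of a 100-tick interval.

```c
#include <assert.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical helper: user-mode CPU time used so far, in seconds. */
static double user_seconds(void)
{
    struct rusage ru;

    getrusage(RUSAGE_SELF, &ru);
    return (double)ru.ru_utime.tv_sec + (double)ru.ru_utime.tv_usec / 1e6;
}

/* Made-up workload standing in for the real function under test. */
static volatile unsigned long sink;

void function_to_be_timed(void)
{
    unsigned long k;

    for (k = 0; k < 100; k++)
        sink += k;
}

/* Estimated user time per call of fn, in seconds. */
double function_time(void (*fn)(void), long count)
{
    volatile long i;  /* volatile keeps the empty control loop from being removed */
    double start, stop, experiment_time, control_time;

    assert(count > 0);

    start = user_seconds();
    for (i = 0; i < count; i++)
        fn();
    stop = user_seconds();
    experiment_time = stop - start;

    start = user_seconds();
    for (i = 0; i < count; i++)
        ;                         /* same loop with the call left out */
    stop = user_seconds();
    control_time = stop - start;

    return (experiment_time - control_time) / (double)count;
}
```

A call such as function_time(function_to_be_timed, 1000000L) gives the
per-call estimate; count should be raised until the experiment loop spans
at least 100 clock ticks.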

It's also a good idea to fill the ... parts of the code in with operations
which successfully flush the caches, to avoid falsely low readings caused
by hammering on the exact same code over and over.  You'll have to
experiment with this.  For example, try summing up the elements of a large
array, and keep making the array larger until the function_time stops
increasing.
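
One possible filler for the "..." parts, along the lines of the array
summing suggested above, is sketched here. FLUSH_WORDS and flush_caches()
are hypothetical names, and the 4M-element ceiling is an arbitrary
assumption; whether a given size really evicts the function's code and
data depends on the machine's caches, which is why the size has to be
found by experiment.

```c
#include <assert.h>
#include <stddef.h>

#define FLUSH_WORDS (4UL * 1024UL * 1024UL)   /* assumed upper bound on buffer size */

/* volatile forces every element to be read from memory on each pass. */
static volatile long flush_buf[FLUSH_WORDS];

/*
 * Walk the first nwords elements of the buffer and return their sum.
 * The buffer is zero-initialized, so the sum is 0; the memory traffic,
 * not the result, is the point.
 */
long flush_caches(size_t nwords)
{
    size_t i;
    long sum = 0;

    assert(nwords <= FLUSH_WORDS);
    for (i = 0; i < nwords; i++)
        sum += flush_buf[i];
    return sum;
}
```

The same call would go in both the experiment loop and the control loop so
that its cost cancels in the subtraction, with nwords increased between
runs until function_time stops increasing.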

Another good control is to run the measurements three or four times, and
make sure you're getting repeatable results.

Also, you could measure the individual function times, and compute a mean
and standard deviation from the individual times.  This is the most
accurate approach, but also the most painful.
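
A minimal sketch of the statistics step follows, assuming the individual
times have already been collected into an array of doubles. The names
timing_stats and summarize_times are hypothetical. The code returns the
sample variance; the standard deviation is its square root, which sqrt()
from <math.h> would supply.

```c
#include <assert.h>
#include <stddef.h>

struct timing_stats {
    double mean;
    double variance;   /* sample variance, n - 1 in the denominator */
};

/* Two-pass mean and sample variance of n individual timings (n >= 2). */
struct timing_stats summarize_times(const double *times, size_t n)
{
    struct timing_stats s;
    double sum = 0.0, sum_sq_dev = 0.0;
    size_t i;

    assert(times != NULL && n >= 2);

    for (i = 0; i < n; i++)
        sum += times[i];
    s.mean = sum / (double)n;

    for (i = 0; i < n; i++) {
        double d = times[i] - s.mean;
        sum_sq_dev += d * d;
    }
    s.variance = sum_sq_dev / (double)(n - 1);
    return s;
}
```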

Beware smart optimizers!  Make sure you get a decent "dose-response"
curve...that is, that a linear increase in COUNT causes a linear increase
in total_function_time.
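
One simple way to probe the dose-response behaviour is to time COUNT and
2*COUNT calls and look at the ratio, as in the sketch below. The helper
names and the workload are hypothetical, and for brevity the sketch
compares raw loop times rather than subtracting a control loop first. A
ratio near 2 is what a linear response looks like; a ratio near 1 suggests
the optimizer has removed the work.

```c
#include <assert.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical helper: user-mode CPU time used so far, in seconds. */
static double user_seconds(void)
{
    struct rusage ru;

    getrusage(RUSAGE_SELF, &ru);
    return (double)ru.ru_utime.tv_sec + (double)ru.ru_utime.tv_usec / 1e6;
}

/* Made-up workload standing in for the real function under test. */
static volatile unsigned long sink;

void function_to_be_timed(void)
{
    unsigned long k;

    for (k = 0; k < 100; k++)
        sink += k;
}

/* Total user time, in seconds, for count back-to-back calls of fn. */
double total_time(void (*fn)(void), long count)
{
    long i;
    double start = user_seconds();

    for (i = 0; i < count; i++)
        fn();
    return user_seconds() - start;
}

/* Ratio of the time for 2*count calls to the time for count calls. */
double doubling_ratio(void (*fn)(void), long count)
{
    double t1, t2;

    assert(count > 0);
    t1 = total_time(fn, count);
    t2 = total_time(fn, 2 * count);
    assert(t1 > 0.0);   /* otherwise count is too small for the clock */
    return t2 / t1;
}
```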

This methodology has been accurate enough in the past for me to discover
such things as a single extra machine cycle out of a couple of hundred in
various functions' code.

   If all else fails I will have to resort to a sbus/vme - PC interface and
   read real time from a PC card :-( Can someone *please* save me from this
   fate!

Shouldn't be any need.  Plus, even if you do this, you'll still have to
repeat the measurements several times and divide out the result to assess
your accuracy.

Ollie Jones             Saber Software, Inc.       oj@saber.com
Saber-C Project Leader  185 Alewife Brook Parkway  uunet!saber.com!oj
+1(617)876-7636         Cambridge, MA 02138-9887   fax +1(617)868-9205