rcomr@koel.co.rmit.oz.au (Mark Rawling) (12/20/90)
I am trying to make some accurate timing measurements of function call
times (user times only). It is intended to have the code monitor itself
something like :-

    ...
    get_current_time(&start);
    function_to_be_timed();
    get_current_time(&stop);
    function_time = stop - start;
    ...

The problem is that getrusage is limited to 10mS resolution (on the SS1+,
and worse elsewhere). Thus the rusage.ru_utime.tv_usec field is always
nn0000 micro-secs. The only thing I can see that is potentially better
than this is "time (3V)" which returns a milliseconds field. But of
course, this is real time and would require running single user, disabling
interrupts, etc.

Has anyone looked at this before? Is it possible to up the system clock
to 1kHz or higher (temporarily) and still run multiuser? Failing this, is
it possible to get something more accurate than uS out of the real time
chip and if so how?

If all else fails I will have to resort to a sbus/vme - PC interface and
read real time from a PC card :-( Can someone *please* save me from this
fate!

Mark Rawling, CSIRO Division of Information Technology,
High Performance Computation Group,
c/o Royal Melbourne Institute of Technology,
email: rcomr@koel.co.rmit.oz{.au}, phone: (+ 61 3) 660 2726
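The granularity problem described above can be observed directly by reading ru_utime through getrusage and spinning until the reported value changes. The following is only a sketch in C: the helper names user_time_usec and user_clock_step_usec are hypothetical, and the step it reports depends entirely on how the kernel accounts user time.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* User CPU time consumed by this process so far, in microseconds.
 * (Hypothetical helper; on a 32-bit long this overflows after
 * roughly 35 minutes of CPU time.) */
long user_time_usec(void)
{
    struct rusage ru;

    getrusage(RUSAGE_SELF, &ru);
    return (long)ru.ru_utime.tv_sec * 1000000L + (long)ru.ru_utime.tv_usec;
}

/* Spin until the reported user time changes and return the size of
 * that change in microseconds. On a tick-driven clock this is the
 * effective resolution of ru_utime. */
long user_clock_step_usec(void)
{
    long t0 = user_time_usec();
    long t1;

    do {
        t1 = user_time_usec();
    } while (t1 == t0);

    return t1 - t0;
}
```

Printing the result of user_clock_step_usec() a few times gives a rough idea of the smallest interval a single start/stop pair can resolve on a given machine.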
gemed.kohli@uwm.edu (Jim Kohli) (12/30/90)
In article <915@brchh104.bnr.ca>, rcomr@koel.co.rmit.oz.au (Mark Rawling) writes:
<I am trying to make some accurate timing measurements of function call
<times (user times only). It is intended to have the code monitor itself
<something like :-
<
<    ...
<    get_current_time(&start);
<    function_to_be_timed();
<    get_current_time(&stop);
<    function_time = stop - start;
<    ...
<
<The problem is that getrusage is limited to 10mS resolution (on the SS1+,
<and worse elsewhere). Thus the rusage.ru_utime.tv_usec field is always
<nn0000 micro-secs. The only thing I can see that is potentially better
<than this is "time (3V)" which returns a milliseconds field. But of
<course, this is real time and would require running single user, disabling
<interrupts, etc.

It is not unusual to do something like this:

    ...
    get_current_time(&start);
    for( i=0 ; i<1000 ; i++ )
        function_to_be_timed();
    get_current_time(&stop);
    function_time = (float)(stop - start)/1000.0;
    ...

Jim Kohli
GE Medical
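A minimal sketch of the loop-and-divide idea above, written in C on top of getrusage. The names user_seconds and time_per_call are hypothetical, and the body of function_to_be_timed is a stand-in workload, not anything from the thread.

```c
#include <sys/time.h>
#include <sys/resource.h>

static volatile long sink;   /* keeps the stand-in workload from being optimized away */

/* User CPU time of this process in seconds (hypothetical helper). */
static double user_seconds(void)
{
    struct rusage ru;

    getrusage(RUSAGE_SELF, &ru);
    return (double)ru.ru_utime.tv_sec + (double)ru.ru_utime.tv_usec / 1e6;
}

/* Stand-in for the function under test. */
void function_to_be_timed(void)
{
    int i;

    for (i = 0; i < 100; i++)
        sink += i;
}

/* Call fn `reps` times between two clock readings and return the
 * average user time per call, in seconds. Loop overhead is included
 * in the result. */
double time_per_call(void (*fn)(void), long reps)
{
    double start, stop;
    long i;

    start = user_seconds();
    for (i = 0; i < reps; i++)
        fn();
    stop = user_seconds();

    return (stop - start) / (double)reps;
}
```

Calling time_per_call(function_to_be_timed, 1000) mirrors the loop in the post; a larger repetition count spreads the clock's coarse step over more calls.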
oj@saber.com (01/01/91)
Mark wrote that he's trying to make function call timings by using
code like the following, and he can't get enough resolution out of
the system clock.

    ...
    get_current_time(&start);
    function_to_be_timed();
    get_current_time(&stop);
    function_time = stop - start;
    ...

I've done this sort of measurement several times, and I've always
used a method like this:

    get_current_time(&start);
    for (i = 0 ; i < COUNT ; i++ ) {
        ...
        function_to_be_timed();
        ...
    }
    get_current_time(&stop);
    experiment_time = stop - start;

    get_current_time(&start);
    for (i = 0 ; i < COUNT ; i++ ) {
        ...
        /* function_to_be_timed(); */
        ...
    }
    get_current_time(&stop);
    control_time = stop - start;

    total_function_time = experiment_time - control_time;
    function_time = total_function_time / COUNT;

I've gotten best results when I chose a COUNT value which caused the
total_function_time to be at least 100 of whatever clock ticks the
machine provides. This allows an accuracy of about 2% in the final
function_time values.

It's also a good idea to fill the ... parts of the code in with
operations which successfully flush the caches, to avoid falsely low
readings caused by hammering on the exact same code over and over.
You'll have to experiment with this. For example, try summing up the
elements of a large array, and keep making the array larger until the
function_time stops increasing.

Another good control is to run the measurements three or four times,
and make sure you're getting repeatable results.

Also, you could measure the individual function times, and compute a
mean and standard deviation from the individual times. This is most
accurate, but most painful.

Beware smart optimizers! Make sure you get a decent "dose-response"
curve...that is, that a linear increase in COUNT causes a linear
increase in total_function_time.

This methodology has been accurate enough in the past for me to
discover such things as a single extra machine cycle out of a couple
of hundred in various functions' code.
> If all else fails I will have to resort to a sbus/vme - PC interface and
> read real time from a PC card :-( Can someone *please* save me from this
> fate!

Shouldn't be any need. Plus, even if you do this, you'll still have
to repeat the measurements several times and divide out the result to
assess your accuracy.

Ollie Jones               Saber Software, Inc.         oj@saber.com
Saber-C Project Leader    185 Alewife Brook Parkway    uunet!saber.com!oj
+1(617)876-7636           Cambridge, MA 02138-9887     fax +1(617)868-9205
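The experiment-minus-control method from this post might be sketched in C as follows. Everything beyond the names used in the post is an assumption: user_seconds, run_loop and flush_caches are hypothetical helpers, FLUSH_WORDS is an arbitrary size that would need tuning as described above, and the body of function_to_be_timed is a stand-in workload.

```c
#include <stddef.h>
#include <sys/time.h>
#include <sys/resource.h>

#define FLUSH_WORDS (1L << 18)   /* assumed size; grow it until the result stops rising */

/* volatile so an optimizer cannot skip the reads or the workload */
static volatile long flush_area[FLUSH_WORDS];
static volatile long sink;

/* User CPU time of this process in seconds (hypothetical helper). */
static double user_seconds(void)
{
    struct rusage ru;

    getrusage(RUSAGE_SELF, &ru);
    return (double)ru.ru_utime.tv_sec + (double)ru.ru_utime.tv_usec / 1e6;
}

/* Fills the "..." slot: sum a large array to push the timed code
 * and its data out of the caches. */
static void flush_caches(void)
{
    long i, sum = 0;

    for (i = 0; i < FLUSH_WORDS; i++)
        sum += flush_area[i];
    sink = sum;
}

/* Stand-in for the function under test. */
void function_to_be_timed(void)
{
    int i;

    for (i = 0; i < 100; i++)
        sink += i;
}

/* Run the loop `count` times; a NULL fn gives the control loop. */
static double run_loop(long count, void (*fn)(void))
{
    double start = user_seconds();
    long i;

    for (i = 0; i < count; i++) {
        flush_caches();
        if (fn != NULL)
            fn();
    }
    return user_seconds() - start;
}

/* Per-call user time in seconds: (experiment - control) / count. */
double function_time(long count)
{
    double experiment_time = run_loop(count, function_to_be_timed);
    double control_time = run_loop(count, NULL);

    return (experiment_time - control_time) / (double)count;
}
```

With a small count the difference can come out noisy or even slightly negative, which is the reason for the advice to choose COUNT large enough and to repeat the run several times.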