[comp.os.minix] profile

brucee@runx.ips.oz (Bruce Evans) (11/17/88)

Profile(1) is a program to find and summarise where other programs spend
their time. It is easier to use and more accurate than the V7 profil(2).
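
A rough picture of the method, for anyone reading before unpacking: profile.c
in the archive below keeps an array of counters covering the profiled text
range, and the timer interrupt handler in profile1.s bumps the counter that
corresponds to the interrupted PC, or a separate OTHER count when the PC lies
outside the range. The C below is only a sketch of that mapping as read from
the source. The function names pick_scale, counter_count and counter_index are
invented here and do not appear in the archive, and the real work is done in
8088 assembler inside the interrupt handler.

```c
/* Sketch only: restates the counter mapping used by profile.c and
 * profile1.s. All three function names are hypothetical. */

/* Addresses covered by one counter, chosen from the range length
 * the same way main() in profile.c chooses "scale". */
unsigned pick_scale(unsigned length)
{
    if (length <= 0x4000)
        return 1;
    if (length <= 0x8000)
        return 2;
    return 4;
}

/* Number of counters needed for a range, rounding up as main() does. */
unsigned counter_count(unsigned length)
{
    unsigned scale = pick_scale(length);

    return (length + scale - 1) / scale;
}

/* Index of the counter hit by a sampled PC, or -1 when the sample falls
 * outside [startadr, startadr + length) and would be counted as OTHER. */
long counter_index(unsigned long pc, unsigned long startadr, unsigned length)
{
    if (pc < startadr || pc - startadr >= length)
        return -1;
    return (long) ((pc - startadr) / pick_scale(length));
}
```

With a range longer than 0x8000 bytes each counter covers 4 addresses, so PCs
0x104 through 0x107 in a range starting at 0x100 all land in counter 1.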

#! /bin/sh
# This is a shell archive.  Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file".  To overwrite existing
# files, type "sh file -c".  You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g.  If this archive is complete, you
# will see the following message at the end:
#		"End of shell archive."
# Contents:  README makefile profile.c profile1.s
# Wrapped by sys@besplex on Thu Nov 17 06:11:10 1988
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'README' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'README'\"
else
echo shar: Extracting \"'README'\" \(5887 characters\)
sed "s/^X//" >'README' <<'END_OF_FILE'
XProfile(1) is a program to find and summarise where other programs spend
Xtheir time. It is easier to use and more accurate than the V7 profil(2).
X
XUsage
X-----
X
XProfile(1) has defaults so that it can be used much like time(1):
X
X	profile program
X
Xprints some execution times for "program". The PATH variable is not used
Xso "program" must be in the current directory or have a full path name.
XIf "program" has a symbol table, a histogram of the times between the
Xlabels in the text segment is printed on stderr. Otherwise only totals
Xare printed so profile is reduced to a more accurate version of time(1).
X
XThere are options to change the accuracy, sample ranges, program segment
Xand amount of output. These are summarised when profile is run with no
Xarguments.
X
XExamples
X--------
X
X(1)	cc -Di8088 -o ls ls.c -s > ls.sym	# make an ls with symbols
X	ast -X ls ls.sym
X	profile ls -l /bin 2>ls.prof
X
XOn a 20MHz 386, this produces in ls.prof:
X
X   0.575     575  26.63%  _present  *******************************************
X   0.116     116   5.37%  _reverse  ********
X   0.077      77   3.56%  __doprin  *****
X   0.071      71   3.28%  _strlowe  *****
X   0.059      59   2.73%  _fputc    ****
X   0.053      53   2.45%  _sort     ****
X   0.038      38   1.76%  _date     **
X   0.032      32   1.48%  .cret     **
X   0.022      22   1.01%  .csb2     *
X   0.020      20   0.92%  .cmi4     *
X   0.016      16   0.74%  _getuidg  *
X   0.009       9   0.41%  .dvi4     
X   [others deleted]
X   1.167    1167  54.05%  TOTAL IN RANGE
X   0.992     992  45.94%  OTHER
X   2.159    2159 100.00%  TOTAL
X
XThe present() function is only used to check the options! These are encoded
Xas bits in a long and present() spends most of its time shifting this long.
XPresent() is called a lot because it is in the inner loop of a slow bubble
Xsort.
X
XOn the 386, removing this obstacle wouldn't help much (I haven't done it)
Xsince the i/o time is significant. On a PC with a hard disk though, ls
Xwould probably be 3 times as fast if present() was done better. I sped it
Xup by a factor of 2 by replacing the bubble sort by a shell sort. I didn't
Xsuspect until running profile that the slow comparisons caused by calling
Xpresent() were a more fundamental problem.
X
X(2)	profile -f 10000 -s /etc/system/atkernel -t 96 -v2 /bin/sleep 20 2>q&
X	time /usr/bruce/bin/wtest 300
X
XThis profiles the kernel doing mainly message passing. Just about all the
Xflags are illustrated:
X
X	-f 10000
X
XUse a tick frequency of 10000. This is for a 386. The 386 can sort of handle
X50000, and a PC 5000.
X
X	-s /etc/system/atkernel
X
XGet the symbol table from the kernel binary instead of the sleep program.
X
X	-t 96
X
XUse the kernel text segment 0x60. For FS and MM the segment has to be looked
Xup in the F1 dump.
X
X	-v2
X
XMore verbose output. -v1 is default, -v0 is just a 3 line summary.
X
X	/bin/sleep 20 &
X
XDo kernel profiling for next 20 sec.
X
X	2>q
X
XPut profile output in q.
X
X	time /usr/bruce/bin/wtest 300
X
XPrint 300 lines of 80 chars, 1 char at a time. This was known to exercise
Xmainly the kernel message passing, and take just over 20 sec (on the 386).
X
X   4.2295   42295  21.10%  sc_over_  ******************************************
X   1.4962   14962   7.46%  _do_writ  ***************
X   1.4730   14730   7.34%  _mini_se  **************
X   1.4688   14688   7.32%  _sys_cal  **************
X   1.0777   10777   5.37%  _out_cha  **********
X   0.5866    5866   2.92%  _tty_rep  *****
X   0.5254    5254   2.62%  _tty_tas  *****
X   0.4918    4918   2.45%  _console  ****
X   0.4364    4364   2.17%  _send     ****
X   0.3874    3874   1.93%  _mini_re  ***
X   0.2607    2607   1.30%  _umap     **
X   0.2451    2451   1.22%  over_swi  **
X   0.2447    2447   1.22%  _unready  **
X   0.2426    2426   1.21%  up_cp_me  **
X   0.2304    2304   1.14%  _printk   **
X   0.1829    1829   0.91%  _flush    *
X   0.1812    1812   0.90%  _receive  *
X   0.1785    1785   0.89%  _pick_pr  *
X   0.1377    1377   0.68%  over_sys  *
X   0.1176    1176   0.58%  _ready    *
X   0.1081    1081   0.53%  _finish   *
X   0.1052    1052   0.52%  _set_684  *
X   0.0879     879   0.43%  _port_ou  
X   [others deleted]
X  15.3042  153042  76.36%  TOTAL IN RANGE
X   4.7379   47379  23.63%  OTHER
X  20.0421  200421 100.00%  TOTAL
X
XThe results for a standard kernel will not be as instructive, because
Xthe message passing runs with interrupts disabled so will be invisible
Xto the profiler. My kernel reenables interrupts in the assembler code
Xbefore doing sys_call(), and the critical routines here have been
Xtweaked for speed. The large time against sc_over_ is from just after
Xthe interrupts are reenabled. It must result from interrupts being held
Xup during the first half of the context switch for a syscall, but the
Xsize of it is a big surprise.
X
XHistory
X-------
X
XI tried Dick Van Veen's profil(2). This was too much trouble without
Xofficial support. It didn't have scaling implemented so couldn't handle
Xtext sizes above 16K. The system clock tick of 60 Hz was far too slow.
XI already knew the benefits of a fast clock from a 6809 version of
Xprofile. 1000 Hz can show single cycle differences in instruction timing
Xat hot spots for slow processors. So I converted and improved the 6809
Xversion. It hasn't been used much on Minix since the bottlenecks in my
Xprograms were already painfully obvious.
X
XI have now got the 386 running in protected mode where the dirty tricks used
Xto implement profile no longer work. So it will have to go back into the
Xkernel. This is hard to make flexible enough for examples like (2).
X
XBugs
X----
X
XThe clock tick and clock vector are fiddled with, so sending an uncatchable
Xsignal to profile(1) will crash the system. Fix: implement profiling in the
Xkernel.
X
XThe method of determining the profiled program's segment is not reliable
X(but usually works if the system is not in heavy use). Fix: use the methods
Xin ps(1) or even run ps and read its output.
END_OF_FILE
if test 5887 -ne `wc -c <'README'`; then
    echo shar: \"'README'\" unpacked with wrong size!
fi
# end of 'README'
fi
if test -f 'makefile' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'makefile'\"
else
echo shar: Extracting \"'makefile'\" \(84 characters\)
sed "s/^X//" >'makefile' <<'END_OF_FILE'
XASMS = profile.s profile1.s
XCFLAGS = -F -O
X
Xprofile: $(ASMS)
X	cc -o profile $(ASMS)
END_OF_FILE
if test 84 -ne `wc -c <'makefile'`; then
    echo shar: \"'makefile'\" unpacked with wrong size!
fi
# end of 'makefile'
fi
if test -f 'profile.c' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'profile.c'\"
else
echo shar: Extracting \"'profile.c'\" \(14353 characters\)
sed "s/^X//" >'profile.c' <<'END_OF_FILE'
X/* profile.c - profile a command */
X
X/* Written by Bruce Evans Aug-Nov 1988.
X   Not copyright. No warranty. Please delete my name if changes are made.
X */
X
X#include <a.out.h>
X#include <signal.h>
X#include <stdio.h>
X
X#define DEF_FREQ        1000
X#define FALSE              0
X#define SQUARE_WAVE     0x36	/* mode for generating square wave */
X#define STANDARD_TICK_FREQUENCY 60
X#define TIMER0          0x40	/* port address for timer channel 0 */
X#define TIMER_FREQ  1193182L	/* PC-AT timer frequency */
X#define TIMER_MODE      0x43	/* port address for timer mode word */
X#ifdef BRUCES_KERNEL
X# define TIMER_VECTOR   0x40
X#else
X# define TIMER_VECTOR   0x08
X#endif
X#define TRUE               1
X#define VECTOR_SEG         0
X
Xtypedef unsigned bool_pt;
Xtypedef unsigned char bool_t;
X
Xstruct sym_s
X{
X  char *adr;
X  unsigned hits;
X  char name[1];			/* really variable length as required */
X};
X
Xchar *allocend;
Xchar *allocptr;
Xunsigned *count_base;		/* start of array of counters */
Xunsigned *count_end;		/* end of array of counters */
Xunsigned count_seg;		/* segment(s) for array of counters */
Xunsigned decplaces;		/* reasonably accurate decimal places */
Xunsigned long decpower[] = { 1, 10, 100, 1000, 10000 };
Xchar *endadr;			/* end of range being profiled (= 0) */
Xbool_t given_range;		/* (= FALSE) */
Xunsigned length;		/* length of range (= 0) */
Xint nsym;			/* number of text symbols */
Xunsigned long otherhits;	/* events not in code seg (= 0) */
Xunsigned overflow;		/* overflows from counters */
Xchar *progname;			/* name from argv[0] */
Xunsigned scale;			/* number of addresses per counter */
Xchar *startadr;			/* start of range being profiled (= 0) */
Xchar *symfile;			/* file with symbols (= NULL) */
Xstruct sym_s **symparray;	/* array of text symbols */
Xunsigned symwidth;		/* width of symbol name in formatting (= 0)*/
Xunsigned text_seg;		/* code segment being profiled */
Xunsigned tick_countdown;	/* counter for frequency divide */
Xunsigned long tick_freq;	/* frequency of clock tick in Hz */
Xunsigned tick_ratio;		/* tick_freq / standard frequency */
Xunsigned long tothits;		/* total hits */
Xunsigned verbosity = 1;		/* verbosity level */
Xunsigned long nztothits;	/* total hits except 0 changed to 1 */
X
Xextern long atol();
Xextern char *malloc();
Xextern old_tick_vector();  	/* in code segment */
Xextern tick_handler();  	/* in code segment */
Xextern trap();
X
Xchar *my_malloc();
X
Xint main( argc, argv )
Xint argc;
Xregister char **argv;
X{
X  register char *arg;
X  bool_t given_text_seg = FALSE;
X  unsigned ncounters;
X  int pid;
X  int pipefd[2];
X  int status;
X
X  progname = argv[0];
X  setbuf( stderr, malloc( BUFSIZ ) );
X  while ( TRUE )
X  {
X    if ( --argc == 0 )
X      usage();
X    arg = *++argv;
X    if ( arg[0] != '-' )
X      break;
X    if ( arg[1] == 'v' )
X    {
X      if ( arg[2] == 0 )
X        ++verbosity;
X      else if ( arg[2] >= '0' && arg[2] <= '9' && arg[3] == 0 )
X        verbosity = arg[2] - '0';
X      else
X        usage();
X    }
X    else if ( --argc == 0 || arg[2] != 0 )
X      usage();
X    else
X    {
X      ++argv;
X      switch ( arg[1] )
X      {
X      case 'f': tick_freq = atol( *argv ); break;
X      case 'r':
X        startadr = (char *) atoi( *argv );
X        if ( --argc == 0 )
X          usage();
X        endadr = (char *) atoi( *++argv );
X        if ( endadr < startadr )
X          usage();
X        length = endadr - startadr;
X        given_range = TRUE;
X        break;
X      case 's': symfile = *argv; break;
X      case 't': text_seg = atoi( *argv ); given_text_seg = TRUE; break;
X      case 'w': symwidth = atoi( *argv ); break;
X      default: usage();
X      }
X    }
X  }
X
X  if ( symfile == NULL )
X    symfile = argv[0];
X  if ( tick_freq == 0 )
X    tick_freq = DEF_FREQ;
X  if ( tick_freq <= 30 )
X    decplaces = 1;
X  else if ( tick_freq < 300 )
X    decplaces = 2;
X  else if ( tick_freq < 3000 )
X    decplaces = 3;
X  else
X    decplaces = 4;
X  tick_countdown = tick_ratio = tick_freq / STANDARD_TICK_FREQUENCY;
X  if ( symwidth == 0 )
X    symwidth = 8;
X
X  readsyms();
X  if ( !given_range )
X    endadr = startadr + length;
X  if ( length <= 0x4000 )
X    scale = 1;
X  else if ( length <= 0x8000 )
X    scale = 2;
X  else
X    scale = 4;
X  ncounters = (length + scale - 1) / scale;
X
X  if ( codeseg() != dataseg() )
X    fatal( "itself must not have separate I & D" );
X  count_seg = dataseg();
X  if ( (count_base = (unsigned *)
X                     my_malloc( ncounters * sizeof *count_base )) == NULL )
X    fatal( "out of memory allocating counters" );
X  count_end = count_base + ncounters;
X  fmemset( count_seg, count_base, 0, ncounters * sizeof *count_base );
X  if ( pipe( pipefd ) < 0 )
X    fatal( "error creating pipe" );
X  switch( (pid = fork()) )
X  {
X  case -1:
X    fatal( "fork failed" );
X  case 0:
X    if ( !given_text_seg )
X      text_seg = codeseg();	/* best guess, it doesn't always work */
X    write( pipefd[1], &text_seg, sizeof text_seg );
X    read( pipefd[0], "", 1 );	/* wait till profiler is ready */
X    close( pipefd[0] );
X    close( pipefd[1] );
X    execv( argv[0], argv );	/* assume NULL terminated */
X    fatal( "exec failed" );
X  }
X  read( pipefd[0], &text_seg, sizeof text_seg );
X  set_trap();
X  setclear_hooks( 1 );
X  write( pipefd[1], "", 1 );
X  close( pipefd[0] );
X  close( pipefd[1] );
X  while ( wait( &status ) != pid )
X    ;
X  setclear_hooks( 0 );
X  summarize();
X  exit( 0 );
X}
X
Xfatal( message )
Xchar *message;
X{
X  setclear_hooks( 0 );
X  fprintf( stderr, "%s: %s\n", progname, message );
X  exit( 1 );
X}
X
Xchar *my_malloc( nbytes )
Xunsigned nbytes;
X{
X  unsigned i;
X
X  if ( allocptr == NULL )
X  {
X    for ( i = 0xC000; i != 0; i -= 0x200 )
X      /* grrr, search should start at 0xFE00, but malloc returns a bad ptr */
X      if ( (allocptr = malloc( i )) != NULL )
X        break;
X      if ( allocptr != NULL )
X        allocend = allocptr + i;
X  }
X  {
X    register char *oldallocptr;
X    register char *newallocptr;
X
X    newallocptr = (oldallocptr = allocptr) + nbytes;
X    if ( newallocptr > allocend || newallocptr < allocptr )
X      return NULL;
X    allocptr = newallocptr;
X    return oldallocptr;
X  }
X}
X
Xnl()
X{
X  putc( '\n', stderr );
X}
X
Xreadsyms()
X{
X  struct sym_s **endsymparray;
X  FILE *fp;
X  struct exec header;		/* a.out header */
X  int len;
X  char *namep;
X  struct nlist newsym;
X  long remaining;
X  struct sym_s **psymparray;
X  struct sym_s *startsymptr;
X  struct sym_s *symptr;
X
X  if ( (fp = fopen( symfile, "r" )) == NULL )
X    return NULL;
X  if ( fread( &header, sizeof header, 1, fp ) != 1 || BADMAG( header ) )
X  {
X    fclose( fp );
X    return NULL;
X  }
X  if ( !given_range )
X  {
X    length = header.a_text;
X    if ( length != header.a_text || length == 0xFFFF )
X      length = 0xFFFE;
X  }
X  if ( header.a_syms == 0 || fseek( fp, A_SYMPOS( header ), 0 ) < 0 )
X  {
X    fclose( fp );
X    return NULL;
X  }
X  for ( startsymptr = NULL, remaining = header.a_syms;
X        remaining >= sizeof newsym; remaining -= sizeof newsym )
X  {
X    if ( fread( &newsym, sizeof newsym, 1, fp ) != 1 )
X      break;
X    if ( (newsym.n_sclass & N_SECT) == N_TEXT )
X    {
X      for ( namep = newsym.n_name, len = 0;
X            *namep != 0 && len < sizeof newsym.n_name; ++namep, ++len )
X        ;
X      if ( (symptr = (struct sym_s *) my_malloc( sizeof(struct sym_s) + len ))
X           == NULL )
X        break;
X      strncpy( symptr->name, newsym.n_name, len );
X      symptr->name[len] = 0;
X      symptr->adr = (char *) newsym.n_value;
X      symptr->hits = 0;
X      if ( startsymptr == NULL )
X        startsymptr = symptr;
X      ++nsym;
X    }
X  }
X  fclose( fp );
X
X  if ( (symparray = (struct sym_s **)
X                    my_malloc( (nsym + 1) * sizeof (*symparray) )) != NULL )
X  {
X    endsymparray = symparray + nsym;
X    for ( symptr = startsymptr, psymparray = symparray;
X          psymparray != endsymparray;
X          symptr = (struct sym_s *) ((char *) (symptr + 1) +
X                   strlen( symptr->name )) ) 
X      *psymparray++ = symptr;
X    *psymparray = NULL;
X    sortadr( symparray, nsym );
X  }
X}
X
Xsetclear_hooks( setflag )
Xbool_pt setflag;
X{
X  static bool_t hooksin;	/* = FALSE */
X  unsigned new_tick_vector[2];
X
X  lock();
X  if ( setflag )
X  {
X    new_tick_vector[0] = (unsigned) tick_handler;
X    new_tick_vector[1] = codeseg();
X    fmemcpy( codeseg(), old_tick_vector, VECTOR_SEG, TIMER_VECTOR * 4, 4 );
X    fmemcpy( VECTOR_SEG, TIMER_VECTOR * 4, codeseg(), new_tick_vector, 4 );
X    set_tick( tick_freq );
X    hooksin = TRUE;
X  }
X  else if ( hooksin )
X  {
X    /* system will crash if this is not done, e.g. after kill -9 */
X    set_tick( 60L );
X    fmemcpy( VECTOR_SEG, TIMER_VECTOR * 4, codeseg(), old_tick_vector, 4 );
X    hooksin = FALSE;
X  }
X  unlock();
X}
X
Xset_tick( freq )
Xunsigned long freq;
X{
X  unsigned count;
X
X  count = TIMER_FREQ / freq;
X  port_out( TIMER_MODE, SQUARE_WAVE );	/* set timer to run continuously */
X  port_out( TIMER0, count );		/* load timer low byte */
X  port_out( TIMER0, count >> 8 );	/* load timer high byte */
X}
X
Xset_trap()
X{
X  int signum;
X
X  for( signum = 0; signum <= NR_SIGS; ++signum )
X    signal( signum, trap );
X}
X
Xshowtime( hits )
Xunsigned long hits;
X{
X  /* BUG: the multiplications overflow when hits or nztothits is greater
X     than MAX_UNSIGNED_LONG / 100.
X     This is supposed to only affect the totals since the pieces are
X     limited to MAX_UNSIGNED.
X     We limited decplaces to 4 so the multiplication by decpower cannot
X     overflow with achievable tick_freq's (386 can handle 50000).
X   */
X
X  fprintf( stderr, "%4u.%0*u %7lu %3u.%02u%%",
X#define TIMEWIDTH   (4+1+   1+7+ 1+3+1+ 2+ 1+ decplaces)
X           (int) (hits / tick_freq), decplaces,
X           (int) (hits % tick_freq * decpower[decplaces] / tick_freq), hits,
X           (int) (100 * hits / nztothits),
X           (int) (100 * (100 * hits % nztothits) / nztothits) );
X}
X
Xsortadr( symparray, nsym )	/* shell sort on address */
Xstruct sym_s **symparray;
Xint nsym;
X{
X  register int gap;
X  register int i;
X  int j;
X  struct sym_s *temp;
X
X  gap = 1;
X  do
X    gap = 3 * gap + 1;
X  while ( gap <= nsym );
X  while ( gap != 1 )
X  {
X    gap /= 3;
X    for ( j = gap; j < nsym; ++j )
X      for ( i = j - gap;
X            i >= 0 && symparray[i]->adr > symparray[i + gap]->adr; i -= gap )
X      {
X        temp = symparray[i];
X        symparray[i] = symparray[i + gap];
X        symparray[i + gap] = temp;
X      }
X    }
X}
X
Xsorthits( symparray, nsym )	/* reverse shell sort on hits */
Xstruct sym_s **symparray;
Xint nsym;
X{
X  register int gap;
X  register int i;
X  int j;
X  struct sym_s *temp;
X
X  gap = 1;
X  do
X    gap = 3 * gap + 1;
X  while ( gap <= nsym );
X  while ( gap != 1 )
X  {
X    gap /= 3;
X    for ( j = gap; j < nsym; ++j )
X      for ( i = j - gap;
X            i >= 0 && symparray[i]->hits < symparray[i + gap]->hits; i -= gap )
X      {
X        temp = symparray[i];
X        symparray[i] = symparray[i + gap];
X        symparray[i + gap] = temp;
X      }
X    }
X}
X
Xsummarize()
X{
X  char *adr;
X  char col;
X  unsigned count;
X  unsigned long hits;
X  unsigned margin;
X  unsigned maxcount;
X  unsigned maxstars;
X  char *nextsymadr;
X  unsigned *pcount;
X  struct sym_s **psymparray;
X  char *symadr;
X  register struct sym_s *symptr;
X
X  if ( (psymparray = symparray) == NULL || (symptr = *psymparray++) == NULL )
X    symadr = nextsymadr = endadr;	/* address never reached by adr */
X  else
X  {
X    symadr = symptr->adr;
X    if ( *psymparray == NULL )
X      nextsymadr = endadr;
X    else
X      nextsymadr = (*psymparray)->adr;
X  }
X  for ( adr = startadr, pcount = count_base, col = 0, hits = 0, maxcount = 0;
X        pcount < count_end; adr += scale, ++pcount )
X    if ( (count = peekw( count_seg, pcount )) != 0 )
X    {
X      while ( adr >= nextsymadr )	/* implies symptr != NULL */
X      {
X        if ( (symptr = *psymparray++) == NULL )
X          symadr = nextsymadr = endadr;
X        else
X        {
X          symadr = symptr->adr;            
X          if ( *psymparray == NULL )
X            nextsymadr = endadr;
X          else
X            nextsymadr = (*psymparray)->adr;
X        }
X      }
X      if ( adr >= symadr )	/* implies symptr != NULL */
X      {
X        if ( verbosity >= 2 && symptr->hits == 0 )
X        {
X          fprintf( stderr, "\n%s %04x\n", symptr->name, symptr->adr );
X          col = 0;
X        }
X        if ( (symptr->hits += count) > maxcount )
X          maxcount = symptr->hits;
X      }
X      hits += count;
X      if ( verbosity >= 2 )
X      {
X        if ( ++col == 7 )
X        {
X          nl();
X          col = 1;
X        }
X        else if ( col != 1 )
X          fprintf( stderr, "   " );
X        fprintf( stderr, "%04x %5u", adr, count );
X      }
X    }
X
X  if ( verbosity >= 2 )
X    nl();
X  nztothits = tothits = hits + otherhits + overflow * 0x10000L;
X  if ( nztothits == 0 )
X    nztothits = 1;
X  if ( verbosity >= 1 && symparray != NULL )
X  {
X    sorthits( symparray, nsym );
X    margin = TIMEWIDTH + 2 + symwidth + 2;
X    maxstars = 80 - margin;	/* actually max + 1 */
X    while ( (symptr = *symparray++) != NULL )
X      if ( (count = symptr->hits) != 0 )
X      {
X        showtime( (unsigned long) count );
X        fprintf( stderr, "  %-*.*s  ", symwidth, symwidth, symptr->name );
X        if ( verbosity < 3 )
X          count = (unsigned long) count * maxstars / maxcount;
X        for ( col = maxstars; count-- != 0; )
X        {
X          if ( --col == 0 )
X          {
X            if ( verbosity < 3 )
X              break;
X            fprintf( stderr, "\n%*s", margin, "" );
X            col = maxstars - 1;
X          }
X          putc( '*', stderr );
X        }
X        nl();
X      }
X  }
X  showtime( hits );
X  fprintf( stderr, "  TOTAL IN RANGE\n" );
X  if ( overflow != 0 )
X  {
X    showtime( overflow * 0x10000L );
X    fprintf( stderr, "  OVERFLOW\n" );
X  }
X  showtime( otherhits );
X  fprintf( stderr, "  OTHER\n" );
X  showtime( tothits );
X  fprintf( stderr, "  TOTAL\n" );
X}
X
Xtrap( signum )
Xint signum;
X{
X  setclear_hooks( 0 );
X  exit( 2 );
X}
X
Xusage()
X{
X  fprintf( stderr,
X"%s: options:\n\
X-f N    timer frequency N (default %lu)\n\
X-r N M  range N <= PC < M\n",
X           progname, (unsigned long) DEF_FREQ );
X  /* grrr, cc requires the string to be split up */
X  fprintf( stderr, "\
X-s S    symbols from file S\n\
X-t N    text segment N\n" );
X  fprintf( stderr, "\
X-v      bump verbosity level\n\
X-vN     verbosity level N (0 to 9)\n" );
X  fprintf( stderr, "\
X-w N    format symbols with width N\n\
XThe first argument not starting with '-' begins the exec list.\n" );
X  exit( 1 );
X}
END_OF_FILE
if test 14353 -ne `wc -c <'profile.c'`; then
    echo shar: \"'profile.c'\" unpacked with wrong size!
fi
# end of 'profile.c'
fi
if test -f 'profile1.s' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'profile1.s'\"
else
echo shar: Extracting \"'profile1.s'\" \(3391 characters\)
sed "s/^X//" >'profile1.s' <<'END_OF_FILE'
X	.text			| must be common I&D
X	.define	_codeseg
X	.define	_dataseg
X	.define	_fmemcpy
X	.define	_fmemset
X	.define	_lock
X	.define	_old_tick_vector
X	.define	_peekw
X	.define	_port_out
X	.define	_tick_handler
X	.define	_unlock
X
X	.data
X	.extern	_count_base
X	.extern	_count_seg
X	.extern	_length
X	.extern	_otherhits
X	.extern	_overflow
X	.extern	_scale
X	.extern	_startadr
X	.extern	_text_seg
X	.extern	_tick_countdown
X	.extern	_tick_ratio
X	
X	.text
X
X| int codeseg();
X
X_codeseg:
X	mov	ax,cs
X	ret
X
X| int dataseg();
X
X_dataseg:
X	mov	ax,ds
X	ret
X
X| void fmemcpy( destseg, destadr, srcseg, srcadr, count );
X| efficient enough for word aligned source and target
X| does not handle overlapped moves
X
X.destseg	=	4+0
X.destadr	=	4+2
X.srcseg		=	4+4
X.srcadr		=	4+6
X.count		=	4+8
X
X_fmemcpy:
X	push	bp
X	mov	bp,sp
X	push	di
X	push	es
X	push	si
X	push	ds
X	mov	es,.destseg(bp)
X	mov	di,.destadr(bp)
X	mov	ds,.srcseg(bp)
X	mov	si,.srcadr(bp)
X	mov	cx,.count(bp)
X	shr	cx,*1
X	cld
X	rep
X	movw
X	rcl	cx,*1
X	rep
X	movb
X	pop	ds
X	pop	si
X	pop	es
X	pop	di
X	pop	bp
X	ret
X
X| void fmemset( destseg, destadr, value, count );
X
X| .destseg	=	4+0	| grrr, asld can't hack repeated definitions
X| .destadr	=	4+2	| even identical ones
X.value		=	4+4
X| .count	=	4+6	| this one is different
X.count1		=	4+6	| so change completely
X
X_fmemset:
X	push	bp
X	mov	bp,sp
X	push	di
X	push	es
X	mov	es,.destseg(bp)
X	mov	di,.destadr(bp)
X	movb	al,.value(bp)
X	movb	ah,al
X	mov	cx,.count1(bp)
X	shr	cx,*1
X	cld
X	rep
X	stow
X	rcl	cx,*1
X	rep
X	stob
X	pop	es
X	pop	di
X	pop	bp
X	ret
X
X| void lock();
X
X_lock:
X	cli
X	ret
X
X| int peekw( unsigned segment, int *offset );
X| returns the word at the far pointer  segment:offset
X| this should have been called peek() for compatibility with DOS compilers
X| but peek() is already in portio.s for byte peeks
X
X_peekw:
X	mov	cx,ds
X	pop	dx
X	pop	ds
X	pop	bx
X	sub	sp,*4
X	mov	ax,(bx)
X	mov	ds,cx
X|	jmp	dx		| asld does this non-portably
X	push	dx
X	ret
X
X| void port_out( int port, char value );
X| writes the byte  value  to the i/o port  port
X| included so this is independent of portio.s
X| this should have been called outportb() for compatibility with DOS compilers
X
X_port_out:
X	pop	bx
X	pop	dx
X	pop	ax
X	sub	sp,*4
X	out
X|	jmp	bx		| grrr
X	push	bx
X	ret
X
X| timer interrupt handler
X
XEOI		= 	0x20
XINT_CONTROLLER	=	0x20
XFAR_JUMP_OPCODE	=	0xEA
XOLD_CS		=	2+2	| offsets on stack after interrupt and push bp
XOLD_IP		=	2+0
X
X_tick_handler:
X	push	bp
X	mov	bp,sp
X	push	ax
X	push	bx
X	movb	al,#EOI
X	out	INT_CONTROLLER
X	mov	ax,OLD_CS(bp)
X	seg	cs
X	cmp	ax,_text_seg
X	jne	count_others
X	mov	bx,OLD_IP(bp)
X	seg	cs
X	sub	bx,_startadr
X	jb	count_others
X	seg	cs
X	cmp	bx,_length
X	jae	count_others
X	seg	cs
X	movb	al,_scale
X	shr	ax,*1
X	jnc	scale2or4
X	shl	bx,*1
X	j	scaled		| scale 1
X
Xscale2or4:
X	shr	ax,*1
X	jc	prescaled	| scale 2
X	shr	bx,*1		| scale 4
Xprescaled:
X	andb	bl,*0xFE	| round odd address to even
Xscaled:
X	seg	cs
X	add	bx,_count_base
X	push	ds
X	seg	cs
X	mov	ds,_count_seg
X	inc	(bx)
X	je	overflow	
Xpre_end_tick_handler:
X	pop	ds
Xend_tick_handler:
X	seg	cs
X	dec	_tick_countdown
X	je	to_old_tick_handler
X	pop	bx
X	pop	ax
X	pop	bp
X	iret
X
Xto_old_tick_handler:
X	seg	cs
X	mov	ax,_tick_ratio
X	seg	cs
X	mov	_tick_countdown,ax
X	pop	bx
X	pop	ax
X	pop	bp
X	.byte	FAR_JUMP_OPCODE
X_old_tick_vector:
X	.word	0,0		| is set up at run time
X
Xcount_others:
X	seg	cs
X	inc	_otherhits
X	jne	end_tick_handler
X	seg	cs
X	inc	_otherhits+2
X	j	end_tick_handler
X
Xoverflow:
X	inc	_overflow
X	j	pre_end_tick_handler
X
X| void unlock();
X
X_unlock:
X	sti
X	ret
END_OF_FILE
if test 3391 -ne `wc -c <'profile1.s'`; then
    echo shar: \"'profile1.s'\" unpacked with wrong size!
fi
# end of 'profile1.s'
fi
echo shar: End of shell archive.
exit 0
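
The -f option interacts with the timer hardware in two places in profile.c:
set_tick() divides the 1193182 Hz timer clock by the requested frequency to
get the channel 0 count, and main() divides the requested frequency by 60 to
decide how many fast ticks pass between calls to the original clock handler.
The following C sketch just repeats that integer arithmetic so the rounding
can be inspected. The names timer_count and ticks_per_system_tick are
hypothetical, not functions from the archive.

```c
/* Sketch only: the integer arithmetic behind -f, copied from the
 * constants and expressions in profile.c. Function names are made up. */

#define TIMER_FREQ              1193182L  /* timer input clock in Hz */
#define STANDARD_TICK_FREQUENCY 60        /* normal clock tick in Hz */

/* Count loaded into timer channel 0 for a requested tick frequency,
 * as computed in set_tick(). The original keeps it in a 16-bit unsigned. */
unsigned timer_count(unsigned long freq)
{
    return (unsigned) (TIMER_FREQ / freq);
}

/* Fast ticks between calls to the old tick handler, as computed in
 * main() for tick_ratio and tick_countdown. */
unsigned ticks_per_system_tick(unsigned long freq)
{
    return (unsigned) (freq / STANDARD_TICK_FREQUENCY);
}
```

For the default 1000 Hz this gives a count of 1193 and a ratio of 16, so the
original handler runs about every 16 ms (roughly 62.5 Hz rather than 60) while
profiling is active. At -f 10000 the values are 119 and 166.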

Bruce Evans
Internet: brucee@runx.ips.oz.au    UUCP: uunet!runx.ips.oz.au!brucee