brucee@runx.ips.oz (Bruce Evans) (11/17/88)
Profile(1) is a program to find and summarise where other programs spend
their time. It is easier to use and more accurate than the V7 profil(2).

#! /bin/sh
# This is a shell archive. Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file". To overwrite existing
# files, type "sh file -c". You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g.. If this archive is complete, you
# will see the following message at the end:
#               "End of shell archive."
# Contents: README makefile profile.c profile1.s
# Wrapped by sys@besplex on Thu Nov 17 06:11:10 1988
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'README' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'README'\"
else
echo shar: Extracting \"'README'\" \(5887 characters\)
sed "s/^X//" >'README' <<'END_OF_FILE'
XProfile(1) is a program to find and summarise where other programs spend
Xtheir time. It is easier to use and more accurate than the V7 profil(2).
X
XUsage
X-----
X
XProfile(1) has defaults so that it can be used much like time(1):
X
X        profile program
X
Xprints some execution times for "program". The PATH variable is not used
Xso "program" must be in the current directory or have a full path name.
XIf "program" has a symbol table, a histogram for the times between the
Xlabels in the text segment is printed on stderr. Otherwise only totals
Xare printed so profile is reduced to a more accurate version of time(1).
X
XThere are options to change the accuracy, sample ranges, program segment
Xand amount of output. These are summarised when profile is run with no
Xarguments.
X
XExamples
X--------
X
X(1)     cc -Di8088 -o ls ls.c -s > ls.sym       # make an ls with symbols
X        ast -X ls ls.sym
X        profile ls -l /bin 2>ls.prof
X
XOn a 20MHz 386, this produces in ls.prof:
X
X   0.575     575  26.63%  _present  *******************************************
X   0.116     116   5.37%  _reverse  ********
X   0.077      77   3.56%  __doprin  *****
X   0.071      71   3.28%  _strlowe  *****
X   0.059      59   2.73%  _fputc    ****
X   0.053      53   2.45%  _sort     ****
X   0.038      38   1.76%  _date     **
X   0.032      32   1.48%  .cret     **
X   0.022      22   1.01%  .csb2     *
X   0.020      20   0.92%  .cmi4     *
X   0.016      16   0.74%  _getuidg  *
X   0.009       9   0.41%  .dvi4
X   [others deleted]
X   1.167    1167  54.05% TOTAL IN RANGE
X   0.992     992  45.94% OTHER
X   2.159    2159 100.00% TOTAL
X
XThe present() function is only used to check the options! These are encoded
Xas bits in a long and present() spends most of its time shifting this long.
XPresent() is called a lot because it is in the inner loop of a slow bubble
Xsort.
X
XOn the 386, removing this obstacle wouldn't help much (I haven't done it)
Xsince the i/o time is significant. On a PC with a hard disk though, ls
Xwould probably be 3 times as fast if present() was done better. I sped it
Xup by a factor of 2 by replacing the bubble sort by a shell sort. I didn't
Xsuspect until running profile that the slow comparisons caused by calling
Xpresent() were a more fundamental problem.
X
X(2)     profile -f 10000 -s /etc/system/atkernel -t 96 -v2 /bin/sleep 20 2>q&
X        time /usr/bruce/bin/wtest 300
X
XThis profiles the kernel doing mainly message passing. Just about all the
Xflags are illustrated:
X
X        -f 10000
X
XUse a tick frequency of 10000. This is for a 386. The 386 can sort of handle
X50000, and a PC 5000.
X
X        -s /etc/system/atkernel
X
XGet the symbol table from the kernel binary instead of the sleep program.
X
X        -t 96
X
XUse the kernel text segment 0x60. For FS and MM the segment has to be looked
Xup in the F1 dump.
X
X        -v2
X
XMore verbose output. -v1 is the default, -v0 is just a 3 line summary.
X
X        /bin/sleep 20 &
X
XDo kernel profiling for next 20 sec.
X
X        2>q
X
XPut profile output in q.
X
X        time /usr/bruce/bin/wtest 300
X
XPrint 300 lines of 80 chars, 1 char at a time. This was known to exercise
Xmainly the kernel message passing, and take just over 20 sec (on the 386).
X
X   4.2295   42295  21.10%  sc_over_  ******************************************
X   1.4962   14962   7.46%  _do_writ  ***************
X   1.4730   14730   7.34%  _mini_se  **************
X   1.4688   14688   7.32%  _sys_cal  **************
X   1.0777   10777   5.37%  _out_cha  **********
X   0.5866    5866   2.92%  _tty_rep  *****
X   0.5254    5254   2.62%  _tty_tas  *****
X   0.4918    4918   2.45%  _console  ****
X   0.4364    4364   2.17%  _send     ****
X   0.3874    3874   1.93%  _mini_re  ***
X   0.2607    2607   1.30%  _umap     **
X   0.2451    2451   1.22%  over_swi  **
X   0.2447    2447   1.22%  _unready  **
X   0.2426    2426   1.21%  up_cp_me  **
X   0.2304    2304   1.14%  _printk   **
X   0.1829    1829   0.91%  _flush    *
X   0.1812    1812   0.90%  _receive  *
X   0.1785    1785   0.89%  _pick_pr  *
X   0.1377    1377   0.68%  over_sys  *
X   0.1176    1176   0.58%  _ready    *
X   0.1081    1081   0.53%  _finish   *
X   0.1052    1052   0.52%  _set_684  *
X   0.0879     879   0.43%  _port_ou
X   [others deleted]
X  15.3042  153042  76.36% TOTAL IN RANGE
X   4.7379   47379  23.63% OTHER
X  20.0421  200421 100.00% TOTAL
X
XThe results for a standard kernel will not be as instructive, because
Xthe message passing runs with interrupts disabled so will be invisible
Xto the profiler. My kernel reenables interrupts in the assembler code
Xbefore doing sys_call(), and the critical routines here have been
Xtweaked for speed. The large time against sc_over_ is from just after
Xthe interrupts are reenabled. It must result from interrupts being held
Xup during the first half of the context switch for a syscall, but the
Xsize of it is a big surprise.
X
XHistory
X-------
X
XI tried Dick Van Veen's profil(2). This was too much trouble without
Xofficial support. It didn't have scaling implemented so couldn't handle
Xtext sizes above 16K. The system clock tick of 60 Hz was far too slow.
XI already knew the benefits of a fast clock from a 6809 version of
Xprofile. 1000 Hz can show single cycle differences in instruction timing
Xat hot spots for slow processors. So I converted and improved the 6809
Xversion. It hasn't been used much on Minix since the bottlenecks in my
Xprograms were already painfully obvious.
X
XI have got the 386 running in protected mode where the dirty tricks used
Xto implement profile no longer work. So it will have to go back into the
Xkernel. This is hard to make flexible enough for examples like (2).
X
XBugs
X----
X
XThe clock tick and clock vector are fiddled with, so sending an uncatchable
Xsignal to profile(1) will crash the system. Fix: implement profiling in the
Xkernel.
X
XThe method of determining the profiled program's segment is not reliable
X(but usually works if the system is not in heavy use). Fix: use the methods
Xin ps(1) or even run ps and read its output.
END_OF_FILE
if test 5887 -ne `wc -c <'README'`; then
    echo shar: \"'README'\" unpacked with wrong size!
fi
# end of 'README'
fi
if test -f 'makefile' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'makefile'\"
else
echo shar: Extracting \"'makefile'\" \(84 characters\)
sed "s/^X//" >'makefile' <<'END_OF_FILE'
XASMS = profile.s profile1.s
XCFLAGS = -F -O
X
Xprofile: $(ASMS)
X	cc -o profile $(ASMS)
END_OF_FILE
if test 84 -ne `wc -c <'makefile'`; then
    echo shar: \"'makefile'\" unpacked with wrong size!
fi
# end of 'makefile'
fi
if test -f 'profile.c' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'profile.c'\"
else
echo shar: Extracting \"'profile.c'\" \(14353 characters\)
sed "s/^X//" >'profile.c' <<'END_OF_FILE'
X/* profile.c - profile a command */
X
X/* Written by Bruce Evans Aug-Nov 1988.
X   Not copyright. No warranty. Please delete my name if changes are made.
X */
X
X#include <a.out.h>
X#include <signal.h>
X#include <stdio.h>
X
X#define DEF_FREQ 1000
X#define FALSE 0
X#define SQUARE_WAVE 0x36 /* mode for generating square wave */
X#define STANDARD_TICK_FREQUENCY 60
X#define TIMER0 0x40 /* port address for timer channel 0 */
X#define TIMER_FREQ 1193182L /* PC-AT timer frequency */
X#define TIMER_MODE 0x43 /* port address for timer channel 3 */
X#ifdef BRUCES_KERNEL
X# define TIMER_VECTOR 0x40
X#else
X# define TIMER_VECTOR 0x08
X#endif
X#define TRUE 1
X#define VECTOR_SEG 0
X
Xtypedef unsigned bool_pt;
Xtypedef unsigned char bool_t;
X
Xstruct sym_s
X{
X    char *adr;
X    unsigned hits;
X    char name[1]; /* really variable length as required */
X};
X
Xchar *allocend;
Xchar *allocptr;
Xunsigned *count_base; /* start of array of counters */
Xunsigned *count_end; /* end of array of counters */
Xunsigned count_seg; /* segment(s) for array of counters */
Xunsigned decplaces; /* reasonably accurate decimal places */
Xunsigned long decpower[] = { 1, 10, 100, 1000, 10000 };
Xchar *endadr; /* end of range being profiled (= 0) */
Xbool_t given_range; /* (= FALSE) */
Xunsigned length; /* length of range (= 0) */
Xint nsym; /* number of text symbols */
Xunsigned long otherhits; /* events not in code seg (= 0) */
Xunsigned overflow; /* overflows from counters */
Xchar *progname; /* name from argv[0] */
Xunsigned scale; /* number of addresses per counter */
Xchar *startadr; /* start of range being profiled (= 0) */
Xchar *symfile; /* file with symbols (= NULL) */
Xstruct sym_s **symparray; /* array of text symbols */
Xunsigned symwidth; /* width of symbol name in formatting (= 0)*/
Xunsigned text_seg; /* code segment being profiled */
Xunsigned tick_countdown; /* counter for frequency divide */
Xunsigned long tick_freq; /* frequency of clock tick in Hz */
Xunsigned tick_ratio; /* tick_freq / standard frequency */
Xunsigned long tothits; /* total hits */
Xunsigned verbosity = 1; /* verbosity level */
Xunsigned long nztothits; /* total hits except 0 changed to 1 */
X
Xextern long atol();
Xextern char *malloc();
Xextern old_tick_vector(); /* in code segment */
Xextern tick_handler(); /* in code segment */
Xextern trap();
X
Xchar *my_malloc();
X
Xint main( argc, argv )
Xint argc;
Xregister char **argv;
X{
X    register char *arg;
X    bool_t given_text_seg = FALSE;
X    unsigned ncounters;
X    int pid;
X    int pipefd[2];
X    int status;
X
X    progname = argv[0];
X    setbuf( stderr, malloc( BUFSIZ ) );
X    while ( TRUE )
X    {
X        if ( --argc == 0 )
X            usage();
X        arg = *++argv;
X        if ( arg[0] != '-' )
X            break;
X        if ( arg[1] == 'v' )
X        {
X            if ( arg[2] == 0 )
X                ++verbosity;
X            else if ( arg[2] >= '0' && arg[2] <= '9' && arg[3] == 0 )
X                verbosity = arg[2] - '0';
X            else
X                usage();
X        }
X        else if ( --argc == 0 || arg[2] != 0 )
X            usage();
X        else
X        {
X            ++argv;
X            switch ( arg[1] )
X            {
X            case 'f': tick_freq = atol( *argv ); break;
X            case 'r':
X                startadr = (char *) atoi( *argv );
X                if ( --argc == 0 )
X                    usage();
X                endadr = (char *) atoi( *++argv );
X                if ( endadr < startadr )
X                    usage();
X                length = endadr - startadr;
X                given_range = TRUE;
X                break;
X            case 's': symfile = *argv; break;
X            case 't': text_seg = atoi( *argv ); given_text_seg = TRUE; break;
X            case 'w': symwidth = atoi( *argv ); break;
X            default: usage();
X            }
X        }
X    }
X
X    if ( symfile == NULL )
X        symfile = argv[0];
X    if ( tick_freq == 0 )
X        tick_freq = DEF_FREQ;
X    if ( tick_freq <= 30 )
X        decplaces = 1;
X    else if ( tick_freq < 300 )
X        decplaces = 2;
X    else if ( tick_freq < 3000 )
X        decplaces = 3;
X    else
X        decplaces = 4;
X    tick_countdown = tick_ratio = tick_freq / STANDARD_TICK_FREQUENCY;
X    if ( symwidth == 0 )
X        symwidth = 8;
X
X    readsyms();
X    if ( !given_range )
X        endadr = startadr + length;
X    if ( length <= 0x4000 )
X        scale = 1;
X    else if ( length <= 0x8000 )
X        scale = 2;
X    else
X        scale = 4;
X    ncounters = (length + scale - 1) / scale;
X
X    if ( codeseg() != dataseg() )
X        fatal( "itself must not have separate I & D" );
X    count_seg = dataseg();
X    if ( (count_base = (unsigned *)
X          my_malloc( ncounters * sizeof *count_base )) == NULL )
X        fatal( "out of memory allocating counters" );
X    count_end = count_base + ncounters;
X    fmemset( count_seg, count_base, 0, ncounters * sizeof *count_base );
X    if ( pipe( pipefd ) < 0 )
X        fatal( "error creating pipe" );
X    switch( (pid = fork()) )
X    {
X    case -1:
X        fatal( "fork failed" );
X    case 0:
X        if ( !given_text_seg )
X            text_seg = codeseg(); /* best guess, it doesn't always work */
X        write( pipefd[1], &text_seg, sizeof text_seg );
X        read( pipefd[0], "", 1 ); /* wait till profiler is ready */
X        close( pipefd[0] );
X        close( pipefd[1] );
X        execv( argv[0], argv ); /* assume NULL terminated */
X        fatal( "exec failed" );
X    }
X    read( pipefd[0], &text_seg, sizeof text_seg );
X    set_trap();
X    setclear_hooks( 1 );
X    write( pipefd[1], "", 1 );
X    close( pipefd[0] );
X    close( pipefd[1] );
X    while ( wait( &status ) != pid )
X        ;
X    setclear_hooks( 0 );
X    summarize();
X    exit( 0 );
X}
X
Xfatal( message )
Xchar *message;
X{
X    setclear_hooks( 0 );
X    fprintf( stderr, "%s: %s\n", progname, message );
X    exit( 1 );
X}
X
Xchar *my_malloc( nbytes )
Xunsigned nbytes;
X{
X    unsigned i;
X
X    if ( allocptr == NULL )
X    {
X        for ( i = 0xC000; i != 0; i -= 0x200 )
X        /* grrr, search should start at 0xFE00, but malloc returns a bad ptr */
X            if ( (allocptr = malloc( i )) != NULL )
X                break;
X        if ( allocptr != NULL )
X            allocend = allocptr + i;
X    }
X    {
X        register char *oldallocptr;
X        register char *newallocptr;
X
X        newallocptr = (oldallocptr = allocptr) + nbytes;
X        if ( newallocptr > allocend || newallocptr < allocptr )
X            return NULL;
X        allocptr = newallocptr;
X        return oldallocptr;
X    }
X}
X
Xnl()
X{
X    putc( '\n', stderr );
X}
X
Xreadsyms()
X{
X    struct sym_s **endsymparray;
X    FILE *fp;
X    struct exec header; /* a.out header */
X    int len;
X    char *namep;
X    struct nlist newsym;
X    long remaining;
X    struct sym_s **psymparray;
X    struct sym_s *startsymptr;
X    struct sym_s *symptr;
X
X    if ( (fp = fopen( symfile, "r" )) == NULL )
X        return NULL;
X    if ( fread( &header, sizeof header, 1, fp ) != 1 || BADMAG( header ) )
X    {
X        fclose( fp );
X        return NULL;
X    }
X    if ( !given_range )
X    {
X        length = header.a_text;
X        if ( length != header.a_text || length == 0xFFFF )
X            length = 0xFFFE;
X    }
X    if ( header.a_syms == 0 || fseek( fp, A_SYMPOS( header ), 0 ) < 0 )
X    {
X        fclose( fp );
X        return NULL;
X    }
X    for ( startsymptr = NULL, remaining = header.a_syms;
X          remaining >= sizeof newsym; remaining -= sizeof newsym )
X    {
X        if ( fread( &newsym, sizeof newsym, 1, fp ) != 1 )
X            break;
X        if ( (newsym.n_sclass & N_SECT) == N_TEXT )
X        {
X            for ( namep = newsym.n_name, len = 0;
X                  *namep != 0 && len < sizeof newsym.n_name; ++namep, ++len )
X                ;
X            if ( (symptr = (struct sym_s *) my_malloc( sizeof(struct sym_s) + len ))
X                 == NULL )
X                break;
X            strncpy( symptr->name, newsym.n_name, len );
X            symptr->name[len] = 0;
X            symptr->adr = (char *) newsym.n_value;
X            symptr->hits = 0;
X            if ( startsymptr == NULL )
X                startsymptr = symptr;
X            ++nsym;
X        }
X    }
X    fclose( fp );
X
X    if ( (symparray = (struct sym_s **)
X          my_malloc( (nsym + 1) * sizeof (*symparray) )) != NULL )
X    {
X        endsymparray = symparray + nsym;
X        for ( symptr = startsymptr, psymparray = symparray;
X              psymparray != endsymparray;
X              symptr = (struct sym_s *) ((char *) (symptr + 1) +
X                                         strlen( symptr->name )) )
X            *psymparray++ = symptr;
X        *psymparray = NULL;
X        sortadr( symparray, nsym );
X    }
X}
X
Xsetclear_hooks( setflag )
Xbool_pt setflag;
X{
X    static bool_t hooksin; /* = FALSE */
X    unsigned new_tick_vector[2];
X
X    lock();
X    if ( setflag )
X    {
X        new_tick_vector[0] = (unsigned) tick_handler;
X        new_tick_vector[1] = codeseg();
X        fmemcpy( codeseg(), old_tick_vector, VECTOR_SEG, TIMER_VECTOR * 4, 4 );
X        fmemcpy( VECTOR_SEG, TIMER_VECTOR * 4, codeseg(), new_tick_vector, 4 );
X        set_tick( tick_freq );
X        hooksin = TRUE;
X    }
X    else if ( hooksin )
X    {
X        /* system will crash if this is not done, e.g. after kill -9 */
X        set_tick( 60L );
X        fmemcpy( VECTOR_SEG, TIMER_VECTOR * 4, codeseg(), old_tick_vector, 4 );
X        hooksin = FALSE;
X    }
X    unlock();
X}
X
Xset_tick( freq )
Xunsigned long freq;
X{
X    unsigned count;
X
X    count = TIMER_FREQ / freq;
X    port_out( TIMER_MODE, SQUARE_WAVE ); /* set timer to run continuously */
X    port_out( TIMER0, count ); /* load timer low byte */
X    port_out( TIMER0, count >> 8 ); /* load timer high byte */
X}
X
Xset_trap()
X{
X    int signum;
X
X    for( signum = 0; signum <= NR_SIGS; ++signum )
X        signal( signum, trap );
X}
X
Xshowtime( hits )
Xunsigned long hits;
X{
X    /* BUG: the multiplications overflow when hits or nztothits is greater
X       than MAX_UNSIGNED_LONG / 100.
X       This is supposed to only affect the totals since the pieces are
X       limited to MAX_UNSIGNED.
X       We limited decplaces to 4 so the multiplication by decpower cannot
X       overflow with achievable tick_freq's (386 can handle 50000).
X     */
X
X    fprintf( stderr, "%4u.%0*u %7lu %3u.%02u%%",
X#define TIMEWIDTH (4+1+ 1+7+ 1+3+1+ 2+ 1+ decplaces)
X             (int) (hits / tick_freq), decplaces,
X             (int) (hits % tick_freq * decpower[decplaces] / tick_freq), hits,
X             (int) (100 * hits / nztothits),
X             (int) (100 * (100 * hits % nztothits) / nztothits) );
X}
X
Xsortadr( symparray, nsym ) /* shell sort on address */
Xstruct sym_s **symparray;
Xint nsym;
X{
X    register int gap;
X    register int i;
X    int j;
X    struct sym_s *temp;
X
X    gap = 1;
X    do
X        gap = 3 * gap + 1;
X    while ( gap <= nsym );
X    while ( gap != 1 )
X    {
X        gap /= 3;
X        for ( j = gap; j < nsym; ++j )
X            for ( i = j - gap;
X                  i >= 0 && symparray[i]->adr > symparray[i + gap]->adr; i -= gap )
X            {
X                temp = symparray[i];
X                symparray[i] = symparray[i + gap];
X                symparray[i + gap] = temp;
X            }
X    }
X}
X
Xsorthits( symparray, nsym ) /* reverse shell sort on hits */
Xstruct sym_s **symparray;
Xint nsym;
X{
X    register int gap;
X    register int i;
X    int j;
X    struct sym_s *temp;
X
X    gap = 1;
X    do
X        gap = 3 * gap + 1;
X    while ( gap <= nsym );
X    while ( gap != 1 )
X    {
X        gap /= 3;
X        for ( j = gap; j < nsym; ++j )
X            for ( i = j - gap;
X                  i >= 0 && symparray[i]->hits < symparray[i + gap]->hits; i -= gap )
X            {
X                temp = symparray[i];
X                symparray[i] = symparray[i + gap];
X                symparray[i + gap] = temp;
X            }
X    }
X}
X
Xsummarize()
X{
X    char *adr;
X    char col;
X    unsigned count;
X    unsigned long hits;
X    unsigned margin;
X    unsigned maxcount;
X    unsigned maxstars;
X    char *nextsymadr;
X    unsigned *pcount;
X    struct sym_s **psymparray;
X    char *symadr;
X    register struct sym_s *symptr;
X
X    if ( (psymparray = symparray) == NULL || (symptr = *psymparray++) == NULL )
X        symadr = nextsymadr = endadr; /* address never reached by adr */
X    else
X    {
X        symadr = symptr->adr;
X        if ( *psymparray == NULL )
X            nextsymadr = endadr;
X        else
X            nextsymadr = (*psymparray)->adr;
X    }
X    for ( adr = startadr, pcount = count_base, col = 0, hits = 0, maxcount = 0;
X          pcount < count_end; adr += scale, ++pcount )
X        if ( (count = peekw( count_seg, pcount )) != 0 )
X        {
X            while ( adr >= nextsymadr ) /* implies symptr != NULL */
X            {
X                if ( (symptr = *psymparray++) == NULL )
X                    symadr = nextsymadr = endadr;
X                else
X                {
X                    symadr = symptr->adr;
X                    if ( *psymparray == NULL )
X                        nextsymadr = endadr;
X                    else
X                        nextsymadr = (*psymparray)->adr;
X                }
X            }
X            if ( adr >= symadr ) /* implies symptr != NULL */
X            {
X                if ( verbosity >= 2 && symptr->hits == 0 )
X                {
X                    fprintf( stderr, "\n%s %04x\n", symptr->name, symptr->adr );
X                    col = 0;
X                }
X                if ( (symptr->hits += count) > maxcount )
X                    maxcount = symptr->hits;
X            }
X            hits += count;
X            if ( verbosity >= 2 )
X            {
X                if ( ++col == 7 )
X                {
X                    nl();
X                    col = 1;
X                }
X                else if ( col != 1 )
X                    fprintf( stderr, " " );
X                fprintf( stderr, "%04x %5u", adr, count );
X            }
X        }
X
X    if ( verbosity >= 2 )
X        nl();
X    nztothits = tothits = hits + otherhits + overflow * 0x10000L;
X    if ( nztothits == 0 )
X        nztothits = 1;
X    if ( verbosity >= 1 && symparray != NULL )
X    {
X        sorthits( symparray, nsym );
X        margin = TIMEWIDTH + 2 + symwidth + 2;
X        maxstars = 80 - margin; /* actually max + 1 */
X        while ( (symptr = *symparray++) != NULL )
X            if ( (count = symptr->hits) != 0 )
X            {
X                showtime( (unsigned long) count );
X                fprintf( stderr, " %-*.*s ", symwidth, symwidth, symptr->name );
X                if ( verbosity < 3 )
X                    count = (unsigned long) count * maxstars / maxcount;
X                for ( col = maxstars; count-- != 0; )
X                {
X                    if ( --col == 0 )
X                    {
X                        if ( verbosity < 3 )
X                            break;
X                        fprintf( stderr, "\n%*s", margin, "" );
X                        col = maxstars - 1;
X                    }
X                    putc( '*', stderr );
X                }
X                nl();
X            }
X    }
X    showtime( hits );
X    fprintf( stderr, " TOTAL IN RANGE\n" );
X    if ( overflow != 0 )
X    {
X        showtime( overflow * 0x10000L );
X        fprintf( stderr, " OVERFLOW\n" );
X    }
X    showtime( otherhits );
X    fprintf( stderr, " OTHER\n" );
X    showtime( tothits );
X    fprintf( stderr, " TOTAL\n" );
X}
X
Xtrap( signum )
Xint signum;
X{
X    setclear_hooks( 0 );
X    exit( 2 );
X}
X
Xusage()
X{
X    fprintf( stderr,
X"%s: options:\n\
X-f N timer frequency N (default %lu)\n\
X-r N M range N <= PC < M\n",
X             progname, (unsigned long) DEF_FREQ );
X    /* grrr, cc requires the string to be split up */
X    fprintf( stderr, "\
X-s S symbols from file S\n\
X-t N text segment N\n" );
X    fprintf( stderr, "\
X-v bump verbosity level\n\
X-vN verbosity level N (0 to 9)\n" );
X    fprintf( stderr, "\
X-w N format symbols with width N\n\
XThe first argument not starting with '-' begins the exec list.\n" );
X    exit( 1 );
X}
END_OF_FILE
if test 14353 -ne `wc -c <'profile.c'`; then
    echo shar: \"'profile.c'\" unpacked with wrong size!
fi
# end of 'profile.c'
fi
if test -f 'profile1.s' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'profile1.s'\"
else
echo shar: Extracting \"'profile1.s'\" \(3391 characters\)
sed "s/^X//" >'profile1.s' <<'END_OF_FILE'
X        .text                   | must be common I&D
X        .define _codeseg
X        .define _dataseg
X        .define _fmemcpy
X        .define _fmemset
X        .define _lock
X        .define _old_tick_vector
X        .define _peekw
X        .define _port_out
X        .define _tick_handler
X        .define _unlock
X
X        .data
X        .extern _count_base
X        .extern _count_seg
X        .extern _length
X        .extern _otherhits
X        .extern _overflow
X        .extern _scale
X        .extern _startadr
X        .extern _text_seg
X        .extern _tick_countdown
X        .extern _tick_ratio
X
X        .text
X
X| int codeseg();
X
X_codeseg:
X        mov ax,cs
X        ret
X
X| int dataseg();
X
X_dataseg:
X        mov ax,ds
X        ret
X
X| void fmemcpy( destseg, destadr, srcseg, srcadr, count );
X| efficient enough for word aligned source and target
X| does not handle overlapped moves
X
X.destseg = 4+0
X.destadr = 4+2
X.srcseg = 4+4
X.srcadr = 4+6
X.count = 4+8
X
X_fmemcpy:
X        push bp
X        mov bp,sp
X        push di
X        push es
X        push si
X        push ds
X        mov es,.destseg(bp)
X        mov di,.destadr(bp)
X        mov ds,.srcseg(bp)
X        mov si,.srcadr(bp)
X        mov cx,.count(bp)
X        shr cx,*1
X        cld
X        rep
X        movw
X        rcl cx,*1
X        rep
X        movb
X        pop ds
X        pop si
X        pop es
X        pop di
X        pop bp
X        ret
X
X| void fmemset( destseg, destadr, value, count );
X
X| .destseg = 4+0        | grrr, asld can't hack repeated definitions
X| .destadr = 4+2        | even identical ones
X.value = 4+4
X| .count = 4+6          | this one is different
X.count1 = 4+6           | so change completely
X
X_fmemset:
X        push bp
X        mov bp,sp
X        push di
X        push es
X        mov es,.destseg(bp)
X        mov di,.destadr(bp)
X        movb al,.value(bp)
X        movb ah,al
X        mov cx,.count1(bp)
X        shr cx,*1
X        cld
X        rep
X        stow
X        rcl cx,*1
X        rep
X        stob
X        pop es
X        pop di
X        pop bp
X        ret
X
X| void lock();
X
X_lock:
X        cli
X        ret
X
X| int peekw( unsigned segment, int *offset );
X| returns the word at the far pointer segment:offset
X| this should have been called peek() for compatibility with DOS compilers
X| but peek() is already in portio.s for byte peeks
X
X_peekw:
X        mov cx,ds
X        pop dx
X        pop ds
X        pop bx
X        sub sp,*4
X        mov ax,(bx)
X        mov ds,cx
X|       jmp dx                  | asld does this non-portably
X        push dx
X        ret
X
X| void port_out( int port, char value );
X| writes the byte value to the i/o port port
X| included so this is independent of portio.s
X| this should have been called outportb() for compatibility with DOS compilers
X
X_port_out:
X        pop bx
X        pop dx
X        pop ax
X        sub sp,*4
X        out
X|       jmp bx                  | grrr
X        push bx
X        ret
X
X| timer interrupt handler
X
XEOI = 0x20
XINT_CONTROLLER = 0x20
XFAR_JUMP_OPCODE = 0xEA
XOLD_CS = 2+2                    | offsets on stack after interrupt and push bp
XOLD_IP = 2+0
X
X_tick_handler:
X        push bp
X        mov bp,sp
X        push ax
X        push bx
X        movb al,#EOI
X        out INT_CONTROLLER
X        mov ax,OLD_CS(bp)
X        seg cs
X        cmp ax,_text_seg
X        jne count_others
X        mov bx,OLD_IP(bp)
X        seg cs
X        sub bx,_startadr
X        jb count_others
X        seg cs
X        cmp bx,_length
X        jae count_others
X        seg cs
X        movb al,_scale
X        shr ax,*1
X        jnc scale2or4
X        shl bx,*1
X        j scaled                | scale 1
X
Xscale2or4:
X        shr ax,*1
X        jc prescaled            | scale 2
X        shr bx,*1               | scale 4
Xprescaled:
X        andb bl,*0xFE           | round odd address to even
Xscaled:
X        seg cs
X        add bx,_count_base
X        push ds
X        seg cs
X        mov ds,_count_seg
X        inc (bx)
X        je overflow
Xpre_end_tick_handler:
X        pop ds
Xend_tick_handler:
X        seg cs
X        dec _tick_countdown
X        je to_old_tick_handler
X        pop bx
X        pop ax
X        pop bp
X        iret
X
Xto_old_tick_handler:
X        seg cs
X        mov ax,_tick_ratio
X        seg cs
X        mov _tick_countdown,ax
X        pop bx
X        pop ax
X        pop bp
X        .byte FAR_JUMP_OPCODE
X_old_tick_vector:
X        .word 0,0               | set up at run time
X
Xcount_others:
X        seg cs
X        inc _otherhits
X        jne end_tick_handler
X        seg cs
X        inc _otherhits+2
X        j end_tick_handler
X
Xoverflow:
X        inc _overflow
X        j pre_end_tick_handler
X
X| void unlock();
X
X_unlock:
X        sti
X        ret
END_OF_FILE
if test 3391 -ne `wc -c <'profile1.s'`; then
    echo shar: \"'profile1.s'\" unpacked with wrong size!
fi
# end of 'profile1.s'
fi
echo shar: End of shell archive.
exit 0

Bruce Evans
Internet: brucee@runx.ips.oz.au    UUCP: uunet!runx.ips.oz.au!brucee
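
As a reading aid for the arithmetic in profile.c and profile1.s, here is a minimal sketch in portable C of how a tick frequency becomes a timer divisor and how a sampled program counter is mapped to a counter. It is not part of the archive. The constants and thresholds are copied from profile.c; the function names (timer_count, tick_ratio, scale_for, ncounters_for, counter_index) are made up for illustration, and the 16-bit truncation is written out explicitly because the original relies on a 16-bit unsigned.

```c
#define TIMER_FREQ 1193182UL            /* PC timer input clock, from profile.c */
#define STANDARD_TICK_FREQUENCY 60UL    /* normal system tick rate, from profile.c */

/* Divisor loaded into timer channel 0, as computed in set_tick().
   The hardware counter is 16 bits wide. freq must be nonzero. */
unsigned timer_count(unsigned long freq)
{
    return (unsigned) ((TIMER_FREQ / freq) & 0xFFFFUL);
}

/* Profiling ticks per standard tick; the tick handler passes control to
   the old clock handler once every this many ticks. */
unsigned tick_ratio(unsigned long freq)
{
    return (unsigned) (freq / STANDARD_TICK_FREQUENCY);
}

/* Number of addresses sharing one counter, chosen from the range length. */
unsigned scale_for(unsigned long length)
{
    if (length <= 0x4000UL)
        return 1;
    if (length <= 0x8000UL)
        return 2;
    return 4;
}

/* Number of counters needed to cover the range, rounded up. */
unsigned long ncounters_for(unsigned long length)
{
    unsigned scale = scale_for(length);

    return (length + scale - 1) / scale;
}

/* Counter index for a sampled program counter, or -1 when the sample is
   outside the profiled range (the handler counts those as OTHER). */
long counter_index(unsigned long pc, unsigned long startadr,
                   unsigned long length)
{
    if (pc < startadr || pc - startadr >= length)
        return -1L;
    return (long) ((pc - startadr) / scale_for(length));
}
```

With the default -f 1000 this gives a divisor of 1193 and a ratio of 16, so the old clock handler would be reached about 62.5 times a second rather than exactly 60. With -f 10000 the divisor is 119 and the ratio 166.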