jsilva@cogsci.berkeley.edu (John Silva) (07/20/88)
Due to the number of requests I have received for these routines, I have
decided to post them.  Please keep in mind that they were written for 286
based SCO V2.2.0g, but they may work on your implementation of Xenix.  The
best way to find out is to try them and see if they crash anything.

Hacking around with adb, I have discovered quite a few library routines
which use 32 bit shifts, such as _doprint (the kernel of the printf
routines) and even the 32 bit math functions.

John P. Silva
Inova Products

-------------------- Cut Here ----------- Cut Here ----------------------
#! /bin/sh
# This is a shell archive.  Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file".  To overwrite existing
# files, type "sh file -c".  You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g..  If this archive is complete, you
# will see the following message at the end:
#               "End of shell archive."
# Contents:  Readme Copyright Lflshift.s Sflshift.s
# Wrapped by root@empire on Sun Jul 17 15:28:15 1988
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'Readme' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'Readme'\"
else
echo shar: Extracting \"'Readme'\" \(2454 characters\)
sed "s/^X//" >'Readme' <<'END_OF_FILE'
XIntroduction
X------------
X
XThe routines in the files Sflshift.s and Lflshift.s are high speed 32 bit
Xshift subroutines intended to replace the library routines _lshr and _lshl.
XThese routines will give a noticeable speed increase in those programs which
Xmake heavy use of bit shifts on long integers.
X
XOn 80x86 Xenix systems running the Microsoft compiler, and some others,
Xthe standard 32 bit shift routines are optimized for size rather than
Xspeed.  The algorithm Microsoft chose is essentially "shift one bit, loop".
XThis makes it dreadfully slow for long bit shifts.
X
XMy routines achieve their speed increase by replacing the loop structure with
Xtwo or three integer shift instructions and some minor calculations.  To enable
Xthe use of the 16 bit shift instructions, I had to break up the long into
Xtwo 16 bit chunks and operate on each separately.  Each routine consists
Xof two parts: a part for shifts of less than 16 bits, and one for shifts of
X16 bits or more.
X
XDue to the longer algorithm required for < 16 bit shifts, you will notice
Xthat these routines will be slower than the library routines for small bit
Xshifts, equivalent for about 5 bit shifts, and faster for 6 bit shifts and
Xlarger.  Past the 16 bit shift mark, the routines really gain in speed since
Xonly two integer shifts are required to achieve the shift as opposed to
X3 shifts and an or for < 16 bit shifts.
X
XI had originally written these routines to enhance the speed of compress in
X16 bit mode.  They did: I achieved a speed increase of about 24%.  (For
Xthose of you who have never hacked the compress sources, compress uses
Xa LOT of 32 bit shifts to get the job done.)
X
XTo install these routines, simply compile your original code and link these
Xroutines in with the rest of your code.  It's as simple as that.
X
XJohn P. Silva,
XInova Products
X
XUUCP:   ucbvax!cogsci!jsilva
XDOMAIN: jsilva@cogsci.berkeley.edu
X
XCopyright Notice
X----------------
X
X    This code is NOT in the public domain.  It is a copyrighted work,
X    and as such is protected by law.
X
X    Inova Products places no restrictions on distribution or noncommercial
X    use of this product, as long as this notice is left intact.
X    Commercial use is prohibited without express written permission of
X    the Author.
X
X    Inova Products IS NOT RESPONSIBLE for damages incurred through use
X    of this package.  We make no warranty of fitness for any particular
X    application, nor that the code actually works as intended.
X
X    Use at your own risk.
X
END_OF_FILE
if test 2454 -ne `wc -c <'Readme'`; then
    echo shar: \"'Readme'\" unpacked with wrong size!
fi
# end of 'Readme'
fi
if test -f 'Copyright' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'Copyright'\"
else
echo shar: Extracting \"'Copyright'\" \(638 characters\)
sed "s/^X//" >'Copyright' <<'END_OF_FILE'
X    Copyright Notice
X    ----------------
X
X    Written by John P. Silva
X    Copyright 1988 by Inova Products
X
X    This code is NOT in the public domain.  It is a copyrighted work,
X    and as such is protected by law.
X
X    Inova Products places no restrictions on distribution or noncommercial
X    use of this product, as long as this notice is left intact.
X    Commercial use is prohibited without express written permission of
X    the Author.
X
X    Inova Products IS NOT RESPONSIBLE for damages incurred through use
X    of this package.  We make no warranty of fitness for any particular
X    application, nor that the code actually works as intended.
X
X    Use at your own risk.
X
END_OF_FILE
if test 638 -ne `wc -c <'Copyright'`; then
    echo shar: \"'Copyright'\" unpacked with wrong size!
fi
# end of 'Copyright'
fi
if test -f 'Lflshift.s' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'Lflshift.s'\"
else
echo shar: Extracting \"'Lflshift.s'\" \(2894 characters\)
sed "s/^X//" >'Lflshift.s' <<'END_OF_FILE'
X; Faster 32 bit shift routines for 8086 family processors
X; Written by John P. Silva
X; Copyright 1988 by Inova Products
X;
X; This module is written for use with the Medium, Large and Huge
X; memory models.
X;
X; Copyright Notice
X; ----------------
X;
X; This code is NOT in the public domain.  It is a copyrighted work,
X; and as such is protected by law.
X;
X; Inova Products places no restrictions on distribution or noncommercial
X; use of this product, as long as this notice is left intact.
X; Commercial use is prohibited without express written permission of
X; the Author.
X;
X; Inova Products IS NOT RESPONSIBLE for damages incurred through use
X; of this package.  We make no warranty of fitness for any particular
X; application, nor that the code actually works as intended.
X;
X; Use at your own risk.
X;
X        TITLE   flshift
X
XFLSHIFT_TEXT    SEGMENT  BYTE PUBLIC 'CODE'
XFLSHIFT_TEXT    ENDS
X_DATA   SEGMENT  WORD PUBLIC 'DATA'
X_DATA   ENDS
XCONST   SEGMENT  WORD PUBLIC 'CONST'
XCONST   ENDS
X_BSS    SEGMENT  WORD PUBLIC 'BSS'
X_BSS    ENDS
XDGROUP  GROUP   CONST, _BSS, _DATA
X        ASSUME  CS: FLSHIFT_TEXT, DS: DGROUP, SS: DGROUP, ES: DGROUP
X_DATA   SEGMENT
X_DATA   ENDS
X_BSS    SEGMENT
X_BSS    ENDS
XFLSHIFT_TEXT    SEGMENT
X
X; register di = general temporary
X; register si = shift count save
X
X        PUBLIC  __lshr
X__lshr  PROC FAR
X        push    di
X        push    si
X        xor     ch,ch           ;Clear hi byte of cx register
X        mov     si,cx           ;Save shift count in si reg
X        cmp     cx,16           ;Should we use the 32bit shifter?
X        jge     SHORT LSHR_32
X        mov     di,dx           ;Figure bits to be moved to low word
X        mov     cx,16
X        sub     cx,si
X        shl     di,cl           ;di now contains bits to be ored later
X        mov     cx,si
X        sar     dx,cl           ;Perform arithmetic shift on high word
X        shr     ax,cl           ;Perform logical shift on low word
X        or      ax,di           ;Replace saved bits into low word
X        jmp     SHORT LSHR_ex
XLSHR_32:                        ;Shift routine for >= 16 bit shifts
X        xor     di,di           ;Calculate artificial sign extension
X        test    dh,80h          ;If dx is non-negative, di stays 0
X        je      SHORT LSHR_32a
X        dec     di              ;dx is negative: make di -1
XLSHR_32a:
X        lea     cx,[si-16]      ;Calculate amount to shift
X        sar     dx,cl           ;Arithmetically shift high word
X        mov     ax,dx           ;Place freshly shifted high word into low
X        mov     dx,di           ;And place artificial sign extension into high
XLSHR_ex:
X        pop     si
X        pop     di
X        ret
X__lshr  ENDP
X
X        PUBLIC  __lshl
X__lshl  PROC FAR
X        push    di
X        push    si
X        xor     ch,ch           ;Clear hi byte of cx register
X        mov     si,cx           ;Save shift count in si reg
X        cmp     cx,16           ;Should we use the 32bit shifter?
X        jge     SHORT LSHL_32
X        mov     di,ax           ;Figure bits to be moved to high word
X        mov     cx,16
X        sub     cx,si
X        shr     di,cl           ;di now contains bits to be ored later
X        mov     cx,si
X        shl     ax,cl           ;Shift low word
X        shl     dx,cl           ;Shift high word
X        or      dx,di           ;Replace saved bits into high word
X        jmp     SHORT LSHL_ex
XLSHL_32:                        ;Shift routine for >= 16 bit shifts
X        lea     cx,[si-16]      ;Calculate amount to shift
X        shl     ax,cl           ;Shift low word
X        mov     dx,ax           ;Place freshly shifted low word into high
X        xor     ax,ax           ;And zero low word
XLSHL_ex:
X        pop     si
X        pop     di
X        ret
X__lshl  ENDP
X
XFLSHIFT_TEXT    ENDS
XEND
END_OF_FILE
if test 2894 -ne `wc -c <'Lflshift.s'`; then
    echo shar: \"'Lflshift.s'\" unpacked with wrong size!
fi
# end of 'Lflshift.s'
fi
if test -f 'Sflshift.s' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'Sflshift.s'\"
else
echo shar: Extracting \"'Sflshift.s'\" \(2881 characters\)
sed "s/^X//" >'Sflshift.s' <<'END_OF_FILE'
X; Faster 32 bit shift routines for 8086 family processors
X; Written by John P. Silva
X; Copyright 1988 by Inova Products
X;
X; This module is written to be used only in the Small memory model.
X;
X; Copyright Notice
X; ----------------
X;
X; This code is NOT in the public domain.  It is a copyrighted work,
X; and as such is protected by law.
X;
X; Inova Products places no restrictions on distribution or noncommercial
X; use of this product, as long as this notice is left intact.
X; Commercial use is prohibited without express written permission of
X; the Author.
X;
X; Inova Products IS NOT RESPONSIBLE for damages incurred through use
X; of this package.  We make no warranty of fitness for any particular
X; application, nor that the code actually works as intended.
X;
X; Use at your own risk.
X;
X        TITLE   flshift
X
XFLSHIFT_TEXT    SEGMENT  BYTE PUBLIC 'CODE'
XFLSHIFT_TEXT    ENDS
X_DATA   SEGMENT  WORD PUBLIC 'DATA'
X_DATA   ENDS
XCONST   SEGMENT  WORD PUBLIC 'CONST'
XCONST   ENDS
X_BSS    SEGMENT  WORD PUBLIC 'BSS'
X_BSS    ENDS
XDGROUP  GROUP   CONST, _BSS, _DATA
X        ASSUME  CS: FLSHIFT_TEXT, DS: DGROUP, SS: DGROUP, ES: DGROUP
X_DATA   SEGMENT
X_DATA   ENDS
X_BSS    SEGMENT
X_BSS    ENDS
XFLSHIFT_TEXT    SEGMENT
X
X; register di = general temporary
X; register si = shift count save
X
X        PUBLIC  __lshr
X__lshr  PROC NEAR
X        push    di
X        push    si
X        xor     ch,ch           ;Clear hi byte of cx register
X        mov     si,cx           ;Save shift count in si reg
X        cmp     cx,16           ;Should we use the 32bit shifter?
X        jge     SHORT LSHR_32
X        mov     di,dx           ;Figure bits to be moved to low word
X        mov     cx,16
X        sub     cx,si
X        shl     di,cl           ;di now contains bits to be ored later
X        mov     cx,si
X        sar     dx,cl           ;Perform arithmetic shift on high word
X        shr     ax,cl           ;Perform logical shift on low word
X        or      ax,di           ;Replace saved bits into low word
X        jmp     SHORT LSHR_ex
XLSHR_32:                        ;Shift routine for >= 16 bit shifts
X        xor     di,di           ;Calculate artificial sign extension
X        test    dh,80h          ;If dx is non-negative, di stays 0
X        je      SHORT LSHR_32a
X        dec     di              ;dx is negative: make di -1
XLSHR_32a:
X        lea     cx,[si-16]      ;Calculate amount to shift
X        sar     dx,cl           ;Arithmetically shift high word
X        mov     ax,dx           ;Place freshly shifted high word into low
X        mov     dx,di           ;And place artificial sign extension into high
XLSHR_ex:
X        pop     si
X        pop     di
X        ret
X__lshr  ENDP
X
X        PUBLIC  __lshl
X__lshl  PROC NEAR
X        push    di
X        push    si
X        xor     ch,ch           ;Clear hi byte of cx register
X        mov     si,cx           ;Save shift count in si reg
X        cmp     cx,16           ;Should we use the 32bit shifter?
X        jge     SHORT LSHL_32
X        mov     di,ax           ;Figure bits to be moved to high word
X        mov     cx,16
X        sub     cx,si
X        shr     di,cl           ;di now contains bits to be ored later
X        mov     cx,si
X        shl     ax,cl           ;Shift low word
X        shl     dx,cl           ;Shift high word
X        or      dx,di           ;Replace saved bits into high word
X        jmp     SHORT LSHL_ex
XLSHL_32:                        ;Shift routine for >= 16 bit shifts
X        lea     cx,[si-16]      ;Calculate amount to shift
X        shl     ax,cl           ;Shift low word
X        mov     dx,ax           ;Place freshly shifted low word into high
X        xor     ax,ax           ;And zero low word
XLSHL_ex:
X        pop     si
X        pop     di
X        ret
X__lshl  ENDP
X
XFLSHIFT_TEXT    ENDS
XEND
END_OF_FILE
if test 2881 -ne `wc -c <'Sflshift.s'`; then
    echo shar: \"'Sflshift.s'\" unpacked with wrong size!
fi
# end of 'Sflshift.s'
fi
echo shar: End of shell archive.
exit 0
---
UUCP:   ucbvax!cogsci!jsilva
DOMAIN: jsilva@cogsci.berkeley.edu
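
For anyone who wants to check the split-word idea from the Readme without an assembler handy, here is a minimal C sketch of it. It is only a model, not part of the archive and not the code the library uses: the names sar16, split_sar and split_shl are made up for illustration, the 32 bit value is assumed to live in a high and a low 16 bit word (as in dx:ax), the right shift is assumed to be arithmetic, and the count is assumed to be in the range 0..31.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: arithmetic right shift of one 16 bit word
   (models the 8086 `sar`), written without relying on >> of negative
   signed values. */
static uint16_t sar16(uint16_t w, unsigned n)
{
    uint16_t r;
    assert(n < 16);
    r = (uint16_t)(w >> n);
    if ((w & 0x8000u) != 0 && n > 0)
        r |= (uint16_t)(0xFFFFu << (16 - n));  /* fill vacated bits with sign */
    return r;
}

/* Hypothetical model of a 32 bit arithmetic right shift done on halves. */
uint32_t split_sar(uint32_t value, unsigned count)
{
    uint16_t hi = (uint16_t)(value >> 16);
    uint16_t lo = (uint16_t)(value & 0xFFFFu);
    assert(count < 32);
    if (count < 16) {
        /* Bits leaving the high word must be ored into the low word. */
        uint16_t carry = 0;
        if (count > 0)
            carry = (uint16_t)((uint32_t)hi << (16 - count));
        lo = (uint16_t)((lo >> count) | carry);
        hi = sar16(hi, count);
    } else {
        /* The low word is discarded: shift the high word into its place
           and fill the high word with copies of the sign bit. */
        lo = sar16(hi, count - 16);
        hi = (hi & 0x8000u) ? 0xFFFFu : 0x0000u;
    }
    return ((uint32_t)hi << 16) | lo;
}

/* Hypothetical model of a 32 bit left shift done on halves. */
uint32_t split_shl(uint32_t value, unsigned count)
{
    uint16_t hi = (uint16_t)(value >> 16);
    uint16_t lo = (uint16_t)(value & 0xFFFFu);
    assert(count < 32);
    if (count < 16) {
        /* Bits leaving the low word must be ored into the high word. */
        uint16_t carry = 0;
        if (count > 0)
            carry = (uint16_t)(lo >> (16 - count));
        hi = (uint16_t)(((uint32_t)hi << count) | carry);
        lo = (uint16_t)((uint32_t)lo << count);
    } else {
        /* The high word is discarded: shift the low word into its place
           and zero the low word. */
        hi = (uint16_t)((uint32_t)lo << (count - 16));
        lo = 0;
    }
    return ((uint32_t)hi << 16) | lo;
}
```

Calling split_sar(0x80000000, 16) should give 0xFFFF8000 and split_shl(0x00001234, 20) should give 0x23400000, which are the results a correct 32 bit shift produces. Comparing this model against your own library's long shifts is one way to check whether a replacement routine behaves the same on your system.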