dillon@CORY.BERKELEY.EDU (Matt Dillon) (04/24/87)
(I better sign my name here... this is a long article)

    -Matt

Here is a good example of how to use a DBcc instruction, including how to extend it to handle counts larger than 16 bits (only two more instructions). I've tested all three of these routines.

The routines transfer information a byte at a time. Theoretically, transferring information a word or a long at a time would be about 2x and 4x as fast, but you need to add code to take care of initial and trailing alignment. If you made a restriction that the supplied arguments were on word or long boundaries, and the size a multiple of 2 or 4 bytes, you could implement such an improvement by simply shifting the count register right by 1 or 2 before beginning the loop, and using move.w/move.l instead of move.b in the code below.

The best example for the use of DBcc is the BCMP() routine, which employs an actual condition (it uses DBNE) to exit the loop on compare failure.

68000 specs: clock-cycles-per-loop/MBytes-per-second on the Amiga, where the MBsec figure is the total transfer rate. E.g. if you want to zero 396K, it will take a second. If you want to move 324K from one place to another, it will take a second.

68000 @ 7.14MHz           BSET/BZERO       BMOV             BCMP

as is:                    18/0.396 MBsec   22/0.324 MBsec   22/0.324 MBsec
mod for word at a time    18/0.793 MBsec   22/0.649 MBsec   22/0.649 MBsec
mod for long at a time    22/1.298 MBsec   30/0.952 MBsec   30/0.952 MBsec

Transferring a long at a time is about 3x faster than transferring a byte at a time. A 68010 can do memory fills/moves/compares from 1.3x to 1.6x faster than a 68000 can (1.6x for fills, 1.3x for moves/compares).

-----------------------------------------------------------

Oh yeah.. extending the count beyond 16 bits. Just look at the assembly. Since DBcc only affects the lower word of a data register, you can still use a single data register to hold the 32-bit 'count' you want, and simply add a two-instruction outer loop after the DBcc inner loop.
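For anyone who would rather read C than assembly, the inner DBF loop plus the two-instruction outer loop can be modeled roughly as follows. This is only a sketch, not one of the posted routines: model_bset and its variable names are made up for illustration, the control flow mirrors bset.asm (enter at the DBF, decrement only the low word, then sub.l #$10000 / bpl), and it assumes counts below 2^31 since BPL treats the count as signed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C model of the bset.asm loop structure (names are
 * illustrative). Returns the number of bytes stored, which should be n. */
uint32_t model_bset(uint8_t *buf, uint32_t n, uint8_t value)
{
    uint32_t d0 = n;                 /* 32-bit count held in one "register" */
    uint32_t stored = 0;

    assert(buf != NULL || n == 0);

    for (;;) {
        /* Inner loop, entered at the DBF: only the low word is decremented,
         * and the loop ends when it wraps from 0 to 0xFFFF. */
        for (;;) {
            uint16_t lo = (uint16_t)(d0 - 1);        /* dbf: D0.W -= 1      */
            d0 = (d0 & 0xFFFF0000u) | lo;            /* upper word untouched */
            if (lo == 0xFFFFu)
                break;                               /* low word exhausted  */
            buf[stored++] = value;                   /* move.b D1,(A0)+     */
        }

        /* Outer loop: sub.l #$10000,D0 then bpl loop. */
        d0 -= 0x10000u;
        if (d0 & 0x80000000u)
            break;                                   /* negative: all done  */

        /* bpl branches to the move, not to the DBF, so one byte is stored
         * before the low word (now 0xFFFF) starts counting down again. */
        buf[stored++] = value;
    }
    return stored;
}
```

Each trip around the outer loop therefore accounts for exactly 65536 bytes: one store on the way in, then 65535 more while the low word counts from FFFF down to 0.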
--- Just for kicks, this is what a 68010 would give you ---

Since the loops in all cases take two instructions, both a word in size, a 68010 will enter loop mode operation, which gives significantly faster results (clock cycle times are for the meat of the loop):

68010 @ 7.14MHz           BSET/BZERO       BMOV             BCMP

as is:                    10/0.714 MBsec   14/0.510 MBsec   14/0.510 MBsec
mod for word at a time    10/1.428 MBsec   14/1.020 MBsec   14/1.020 MBsec
mod for long at a time    14/2.040 MBsec   22/1.298 MBsec   22/1.298 MBsec

NOTE: All passed arguments are 32 bits each

#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create:
#	bcmp.asm
#	bmov.asm
#	bset.asm
# This archive created: Thu Apr 23 20:56:55 1987
export PATH; PATH=/bin:/usr/bin:$PATH
echo shar: "extracting 'bcmp.asm'" '(590 characters)'
if test -f 'bcmp.asm'
then
	echo shar: "will not over-write existing file 'bcmp.asm'"
else
cat << \!Funky!Stuff! > 'bcmp.asm'

;BCMP.ASM
;   using byte operations
;
;   BCMP(p1,p2,n)   return 0=failed, 1=compare ok

        xdef  _bcmp

_bcmp:  movem.l 4(A7),A0/A1     ;A0 = ptr1, A1 = ptr2
        move.l  12(A7),D1       ;# bytes
        clr.l   D0              ;def. return value is false, also sets Z bit
        bra     drop            ;drop into the DBNE loop
loop    cmpm.b  (A0)+,(A1)+
drop    dbne.w  D1,loop         ;until count exhausted or compare failed
        bne     end
        sub.l   #$10000,D1      ;for buffers >65535
        bpl     loop            ;branch to loop because D1.W now is FFFF
        addq.l  #1,D0           ;return TRUE
end     rts

!Funky!Stuff!
fi # end of overwriting check
echo shar: "extracting 'bmov.asm'" '(504 characters)'
if test -f 'bmov.asm'
then
	echo shar: "will not over-write existing file 'bmov.asm'"
else
cat << \!Funky!Stuff! > 'bmov.asm'

;BMOV.ASM
;     4   8    12
;BMOV(src,dest,bytes)
;

        xdef  _bmov

_bmov   move.l  4(A7),A0        ;source
        move.l  8(A7),A1        ;destination
        move.l  12(A7),D0       ;bytes
        cmp.l   A0,A1
        beq     end             ;trivial case
        ble     dropfwd         ;forward copy (dest < src)
        add.l   D0,A0           ;backward copy (dest > src)
        add.l   D0,A1
        bra     dropbck
loopfwd move.b  (A0)+,(A1)+
dropfwd dbf.w   D0,loopfwd
        sub.l   #$10000,D0
        bpl     loopfwd
        bra     end
loopbck move.b  -(A0),-(A1)
dropbck dbf.w   D0,loopbck
        sub.l   #$10000,D0
        bpl     loopbck
end     move.l  8(A7),D0
        rts

!Funky!Stuff!
fi # end of overwriting check
echo shar: "extracting 'bset.asm'" '(593 characters)'
if test -f 'bset.asm'
then
	echo shar: "will not over-write existing file 'bset.asm'"
else
cat << \!Funky!Stuff! > 'bset.asm'

;BSET.ASM
;BZERO.ASM
;

        xdef  _bset
        xdef  _bzero

_bzero  clr.l   D1
        bra     begin
_bset   move.b  15(A7),D1       ;12(A7)-> msb . . lsb   (D1.B = data)
begin   move.l  4(A7),A0        ;A0 = pointer to memory
        move.l  8(A7),D0        ;D0 = bytes to set
        bra     drop            ;drop into the DBF loop
loop    move.b  D1,(A0)+
drop    dbf.w   D0,loop         ;remember, only affects lower word
        sub.l   #$10000,D0      ;for buffers >65535
        bpl     loop            ;branch to loop because D0.W now is FFFF
        move.l  4(A7),D0        ;return pointer to buffer start
        rts

!Funky!Stuff!
fi # end of overwriting check
exit 0
#	End of shell archive
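
The direction test in bmov.asm (copy forward when dest < src, otherwise start past the end and copy backward with predecrement) is what makes overlapping moves safe. A rough C model of just that decision is sketched below. It is an illustration only: model_bmov is a made-up name, the pointer comparison goes through uintptr_t as an assumption about a flat address space, and the 16-bit DBF counting is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C model of bmov.asm's direction choice (names are
 * illustrative). Argument order follows BMOV(src,dest,bytes). */
void *model_bmov(const void *src, void *dst, size_t n)
{
    const uint8_t *s = src;
    uint8_t *d = dst;

    assert((s != NULL && d != NULL) || n == 0);

    if (d == s)
        return dst;                     /* beq end: trivial case          */

    if ((uintptr_t)d < (uintptr_t)s) {
        while (n--)                     /* forward: move.b (A0)+,(A1)+    */
            *d++ = *s++;
    } else {
        s += n;                         /* add.l D0,A0                    */
        d += n;                         /* add.l D0,A1                    */
        while (n--)                     /* backward: move.b -(A0),-(A1)   */
            *--d = *--s;
    }
    return dst;                         /* move.l 8(A7),D0: return dest   */
}
```

With a buffer holding "abcdefgh", moving 6 bytes from offset 0 to offset 2 takes the backward path and leaves "ababcdef"; moving 6 bytes from offset 2 to offset 0 takes the forward path and leaves "cdefghgh".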