[comp.sys.acorn] really FAST integer division

gcwillia@daisy.waterloo.edu (Graeme Williams) (06/29/91)

Following several posts regarding division routines, in particular one
with a lovely 3 instr. central loop I modified my old division routine
into the following (supersonic, lightning fast, ZR rated, :-) )
32bit routine:

;Essentially what's happening here is we are figuring out how far left
;we can shift the numerator before any actual dividing begins and thus save
;a significant number of useless trips through the (unrolled) 3 instr. loop.
; - Note we only go to the nearest 4 bits, trying to save another 2 bits
;   results in 3 extra instrs. for a saving of 6 instrs. half the time
;   and 0 instrs. the other half -  i.e. no nett benefit.

MOV cnt,#28 : MOV mod,num,LSR#4
CMP den,mod,LSR#12 : SUBLE cnt,cnt,#16 : MOVLE mod,mod,LSR#16
CMP den,mod,LSR#4 : SUBLE cnt,cnt,#8 : MOVLE mod,mod,LSR#8
CMP den,mod : SUBLE cnt,cnt,#4 : MOVLE mod,mod,LSR#4
MOV num,num,LSL cnt : RSB den,den,#0

ADDS num,num,num
;Now skip over cnt copies of the 3 instr. loop.
ADD cnt,cnt,cnt,LSL#1 : ADD pc,pc,cnt,LSL#2 : MOV R0,R0

{ADCS mod,den,mod,LSL#1 : SUBLO mod,mod,den : ADCS num,num,num} x 32 copies


The routine behaves as follows:
On Entry:  num - is a signed numerator A (but +ve)
           den - is a signed denominator B (but non-zero and +ve)
On Exit:   mod - contains A MOD B
           num - contains A DIV B

The routine requires 4 registers and 452 bytes of memory.

Execution time:
  Given numbers A and B (both +ve) chosen at random from a distribution in
  which the most significant bit of the numbers is uniformly distributed
  between bits 0 and 31 (i.e. the density of numbers goes as roughly log n)
  then the average execution time is:

       For A < B :  32 cycles      (no division actually required)
       For A >= B:  58.25 cycles

       The best and worst cases are 32 cycles and 116 cycles respectively.


Have fun. (Stuff like this ought to be worth good money!)

Graeme Williams
gcwillia@daisy.waterloo.edu

P.S. With any luck there's only a couple of typos :-).