radford@calgary.UUCP (Radford Neal) (04/02/85)
> sym.1:
>         movb    (r2)+,(r1)+
>         bneq    sym.1
> By the way, Colonel, this loop is not improved by unrolling.

WRONG! I timed the following two routines:

# String copy with ordinary loop.

_sc1:   .word   0
        movl    4(ap),r1
        movl    8(ap),r2
1:      movb    (r1)+,(r2)+
        bneq    1b
        ret

# String copy with unrolled loop.

_sc2:   .word   0
        movl    4(ap),r1
        movl    8(ap),r2
1:      movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        beql    2f
        movb    (r1)+,(r2)+
        bneq    1b
2:      ret

The first takes 120 microseconds to copy a thirty character string. The
second takes only 100 microseconds. Seems that branches not taken are
faster than branches which are taken.

    Radford Neal
    The University of Calgary
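
For anyone without a VAX to hand, here is a rough C rendering of the two routines, offered as a sketch rather than a reproduction of the measurement. It assumes the same argument order as the assembler (source first, destination second). The return value is a hypothetical counter added for illustration only: it records how many times each routine jumps back to the top of its loop, which is the taken branch the timings point at.

```c
#include <stddef.h>

/* Ordinary loop, mirroring _sc1: one backward branch per nonzero byte.
   Returns the number of backward branches taken (illustrative counter,
   not part of the original routine). */
size_t sc1(const char *src, char *dst)
{
    size_t back = 0;
    while ((*dst++ = *src++) != '\0')
        back++;                      /* bneq 1b taken */
    return back;
}

/* Unrolled by ten, mirroring _sc2: nine early-exit tests that normally
   fall through, then one backward branch per ten bytes copied. */
size_t sc2(const char *src, char *dst)
{
    size_t back = 0;
    for (;;) {
        if ((*dst++ = *src++) == '\0') break;   /* beql 2f */
        if ((*dst++ = *src++) == '\0') break;
        if ((*dst++ = *src++) == '\0') break;
        if ((*dst++ = *src++) == '\0') break;
        if ((*dst++ = *src++) == '\0') break;
        if ((*dst++ = *src++) == '\0') break;
        if ((*dst++ = *src++) == '\0') break;
        if ((*dst++ = *src++) == '\0') break;
        if ((*dst++ = *src++) == '\0') break;
        if ((*dst++ = *src++) == '\0') break;   /* tenth byte: bneq 1b not taken */
        back++;                                 /* bneq 1b taken */
    }
    return back;
}
```

On a thirty character string the ordinary loop jumps back thirty times, while the unrolled one jumps back three times and takes one forward exit. A modern compiler may well rearrange either loop, so the C says nothing about the VAX timings themselves.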