[net.lang.c] Unrolling string copy loop

radford@calgary.UUCP (Radford Neal) (04/02/85)
> 	sym.1:
> 		movb	(r2)+,(r1)+
> 		bneq	sym.1

> By the way, Colonel, this loop is not improved by unrolling.

WRONG! I timed the following two routines:

# String copy with ordinary loop.

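_sc1:	.word	0		# entry mask: save no registers
	movl	4(ap),r1	# r1 = first argument (source)
	movl	8(ap),r2	# r2 = second argument (destination)

1:	movb	(r1)+,(r2)+	# copy one byte; Z is set if it was the NUL
	bneq	1b		# taken for every byte except the NUL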

	ret

# String copy with unrolled loop.

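_sc2:	.word	0		# entry mask: save no registers
	movl	4(ap),r1	# r1 = first argument (source)
	movl	8(ap),r2	# r2 = second argument (destination)

1:	movb	(r1)+,(r2)+	# copy one byte; Z is set if it was the NUL
	beql	2f		# taken only when the NUL has been copied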
	movb	(r1)+,(r2)+
	beql	2f
	movb	(r1)+,(r2)+
	beql	2f
	movb	(r1)+,(r2)+
	beql	2f
	movb	(r1)+,(r2)+
	beql	2f
	movb	(r1)+,(r2)+
	beql	2f
	movb	(r1)+,(r2)+
	beql	2f
	movb	(r1)+,(r2)+
	beql	2f
	movb	(r1)+,(r2)+
	beql	2f
	movb	(r1)+,(r2)+	# tenth copy of the pass
	bneq	1b		# taken once per ten bytes to start the next pass

2:	ret

The first takes 120 microseconds to copy a thirty-character string. The
second takes only 100 microseconds.

Seems that branches not taken are faster than branches which are taken:
the unrolled loop executes just as many branches, but most of them fall
through.
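
To make the taken/not-taken bookkeeping concrete, here is a rough C model
of the two routines. It is only a sketch: the names copy_simple,
copy_unrolled and struct tally are made up for illustration, and the
counters track the VAX conditional branches in the listings, not whatever
branches a C compiler would generate for this code.

```c
/* Tally of modelled VAX conditional branches (hypothetical helper type). */
struct tally {
    int taken;      /* branches that jumped */
    int not_taken;  /* branches that fell through */
};

/* Model of _sc1: one bneq per byte, taken unless the byte was the NUL. */
static struct tally copy_simple(const char *src, char *dst)
{
    struct tally t = {0, 0};
    char c;

    for (;;) {
        c = *dst++ = *src++;        /* movb (r1)+,(r2)+ */
        if (c != 0) {
            t.taken++;              /* bneq 1b jumps back */
            continue;
        }
        t.not_taken++;              /* bneq 1b falls through to ret */
        return t;
    }
}

/* Model of _sc2: nine movb/beql pairs, then one movb/bneq pair per pass. */
static struct tally copy_unrolled(const char *src, char *dst)
{
    struct tally t = {0, 0};
    char c;
    int i;

    for (;;) {
        for (i = 0; i < 9; i++) {
            c = *dst++ = *src++;    /* movb (r1)+,(r2)+ */
            if (c == 0) {
                t.taken++;          /* beql 2f jumps to ret */
                return t;
            }
            t.not_taken++;          /* beql 2f falls through */
        }
        c = *dst++ = *src++;        /* tenth movb of the pass */
        if (c == 0) {
            t.not_taken++;          /* bneq 1b falls through to ret */
            return t;
        }
        t.taken++;                  /* bneq 1b starts the next pass */
    }
}
```

For a thirty-character string this model counts 30 taken and 1 not-taken
branch for the ordinary loop, against 4 taken and 27 not-taken for the
unrolled one. Both execute the same 31 byte moves and 31 conditional
branches, so in this model the only thing that differs is how many of the
branches are taken.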

    Radford Neal
    The University of Calgary