[comp.sys.intel] strcpy

m5@bobkat.UUCP (09/19/87)

How do clever iAPX 386 users (or iAPX *8[68] users for that matter) 
implement C-style string copies?  Seems to me that there are two
approaches:

    ; Set up ES, DS, SI, DI, CX, and the direction flag
loop:
        lodsb
        stosb
        cmp     AL, 0
        loopne  loop        ; loopne, not loope: keep going while AL != 0

The other approach would be to SCASB to the end of the source string,
then move that many bytes with REP MOVSB.  Neither way is particularly
appealing.  So like, how do the pros do it?  What's the deal?
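
For comparison, the two approaches look roughly like this in C.  This is
only a sketch: the function names copy_bytewise and copy_scan_then_move
are made up for illustration, and a real library routine would likely be
hand-tuned rather than written this way.

```c
#include <stddef.h>
#include <string.h>

/* Approach 1: copy one byte at a time and stop once the NUL has been
   stored (the LODSB/STOSB loop).  Hypothetical name. */
char *copy_bytewise(char *dst, const char *src)
{
    char *d = dst;
    while ((*d++ = *src++) != '\0')
        ;                          /* all the work is in the condition */
    return dst;
}

/* Approach 2: scan for the NUL first, then do a single block move
   (SCASB followed by REP MOVSB).  Hypothetical name. */
char *copy_scan_then_move(char *dst, const char *src)
{
    size_t n = strlen(src) + 1;    /* +1 so the terminator is copied too */
    memcpy(dst, src, n);
    return dst;
}
```

The first touches each byte once but tests it on every iteration; the
second touches each byte twice but lets a block move do the copying.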

-- 
Mike McNally, mercifully employed at Digital Lynx ---
    Where Plano Road the Mighty Flood of Forest Lane doth meet,
    And Garland fair, whose perfumed air flows soft about my feet...
uucp: {texsun,killer,infotel}!pollux!bobkat!m5 (214) 238-7474

perkins@bnrmtv.UUCP (Henry Perkins) (10/14/87)

In article <2804@bobkat.UUCP>, m5@bobkat.UUCP (Mike McNally ) writes:
> How do clever iAPX 386 users (or iAPX *8[68] users for that matter) 
> implement C-style string copies?  Seems to me that there are two
> approaches:
> 
> loop:   lodsb
>         stosb
>         cmp     AL, 0
>         loopne  loop
> 
> The other approach would be to SCASB to the end of the source string,
> then move that many bytes with REP MOVSB.  Neither way is particularly
> appealing.  So like, how do the pros do it?  What's the deal?

Here's a quick comparison of the two approaches above, using
8086 cycle counts.  I've left out the overhead of setting up
all the appropriate registers before getting to the loop itself.

Repeat_Here:  LODSW                ; 12 cycles
              OR      AL,AL        ; 3 cycles
              JZ      Last_Byte    ; 4 cycles to stay in the loop
              STOSW                ; 11 cycles
              OR      AH,AH        ; 3 cycles
              LOOPNZ  Repeat_Here  ; 19 cycles to stay in the loop
              JMP     SHORT Exit   ; terminator (or CX limit) reached
Last_Byte:    STOSB                ; NUL was in the low byte: store it
Exit:                              ; Total: 52 cycles per 2 characters

              REPNE   SCASB        ; 15 cycles per character
              REP     MOVSW        ; 17 cycles per 2 characters
                                   ; Total: 47 cycles per 2 characters
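
A word-at-a-time copy has to cope with the terminator landing in either
half of the word.  The C sketch below shows that logic; copy_wordwise is
a hypothetical name, and unlike LODSW it never reads the byte after the
NUL.

```c
/* Copy a NUL-terminated string two bytes per iteration.
   Hypothetical helper that mirrors the LODSW/STOSW loop. */
void copy_wordwise(char *dst, const char *src)
{
    for (;;) {
        char lo = src[0];
        if (lo == '\0') {          /* terminator in the low byte */
            dst[0] = '\0';
            return;
        }
        char hi = src[1];
        dst[0] = lo;               /* store the whole "word" */
        dst[1] = hi;
        if (hi == '\0')            /* terminator in the high byte */
            return;
        src += 2;
        dst += 2;
    }
}
```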

But why try to be so clever?  My approach is MUCH simpler: just
move a number of characters equal to the lesser of the declared
lengths of the two strings.  Since REP MOVSW is only 17 cycles
per 2 characters against 47 for scan-and-move, this is more
efficient whenever the string is longer than about 1/3 (17/47) of
that declared length.  If string lengths are spread evenly, this
wins about 2 out of 3 times, and the code is both smaller and
easier to read.
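
The fixed-length copy and its break-even point can be sketched in C as
follows.  The names copy_declared and fixed_move_wins are invented for
this example, and the cost model uses nothing but the two 8086 per-pair
cycle figures quoted above, so treat it as back-of-the-envelope only.

```c
#include <stddef.h>
#include <string.h>

/* 8086 cycle figures from the comparison above, per 2 characters. */
enum { SCAN_AND_MOVE_PER_2 = 47, BLOCK_MOVE_PER_2 = 17 };

/* Copy min(dst_size, src_size) bytes without looking for the NUL.
   Both sizes are the declared buffer sizes.  Hypothetical name. */
void copy_declared(char *dst, size_t dst_size,
                   const char *src, size_t src_size)
{
    size_t n = dst_size < src_size ? dst_size : src_size;
    memcpy(dst, src, n);
}

/* Nonzero when moving `declared` bytes blindly costs fewer cycles than
   scanning and then moving a string of actual length `len`. */
int fixed_move_wins(size_t len, size_t declared)
{
    return BLOCK_MOVE_PER_2 * declared < SCAN_AND_MOVE_PER_2 * len;
}
```

With a declared length of 100, a 40-character string favors the blind
move (1700 < 1880) and a 30-character string does not (1700 > 1410).
Note that if the source string is longer than the destination, the copy
is left without a terminator.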
-- 
{hplabs,amdahl,ames}!bnrmtv!perkins         --Henry Perkins

It is better never to have been born.  But who among us has such luck?
One in a million, perhaps.