m5@bobkat.UUCP (09/19/87)
How do clever iAPX 386 users (or iAPX *8[68] users for that matter)
implement C-style string copies?  Seems to me that there are two
approaches:

            ; Set up ES, DS, SI, DI, CX, and the direction flag
    loop:   lodsb
            stosb
            cmp     AL, 0
            loopne  loop        ; or maybe it's loope

The other approach would be to SCASB to the end of the source string,
then move that many bytes with REP MOVSB.  Neither way is particularly
appealing.  So like, how do the pros do it?  What's the deal?

--
Mike McNally, mercifully employed at Digital Lynx ---
    Where Plano Road the Mighty Flood of Forest Lane doth meet,
    And Garland fair, whose perfumed air flows soft about my feet...
uucp: {texsun,killer,infotel}!pollux!bobkat!m5 (214) 238-7474
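For readers who prefer to see the two approaches side by side, here is a rough C sketch of each. It is only an illustration of the control flow, not anyone's production routine: the function names and the `limit` parameter (standing in for the count in CX) are made up for this example. In the first routine the loop keeps going while the byte just copied is nonzero and the count has not run out, which is the condition LOOPNE tests after `cmp AL, 0`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Approach 1 (hypothetical name): copy one byte at a time and test each
 * byte for the terminator, like LODSB / STOSB / CMP / LOOPNE.
 * `limit` plays the role of CX.  Returns the number of bytes stored,
 * including the terminating zero if it was reached. */
static size_t copy_bytewise(char *dst, const char *src, size_t limit)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL);
    while (n < limit) {
        char c = src[n];        /* lodsb */
        dst[n] = c;             /* stosb */
        n++;
        if (c == '\0')          /* cmp AL, 0 ; leave the loop on zero */
            break;
    }
    return n;
}

/* Approach 2 (hypothetical name): first scan for the terminator, then
 * move the whole block in one go, like SCASB followed by REP MOVSB. */
static size_t copy_scan_then_move(char *dst, const char *src, size_t limit)
{
    const char *end;
    size_t n;

    assert(dst != NULL && src != NULL);
    end = memchr(src, '\0', limit);                 /* the scan */
    n = end ? (size_t)(end - src) + 1 : limit;      /* count includes the zero */
    memcpy(dst, src, n);                            /* the block move */
    return n;
}
```

Both routines store the same bytes. The difference is that the second one reads the source twice, but does the copy as a single block operation.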
perkins@bnrmtv.UUCP (Henry Perkins) (10/14/87)
In article <2804@bobkat.UUCP>, m5@bobkat.UUCP (Mike McNally) writes:
> How do clever iAPX 386 users (or iAPX *8[68] users for that matter)
> implement C-style string copies?  Seems to me that there are two
> approaches:
>
>     loop:   lodsb
>             stosb
>             cmp     AL, 0
>             loopne  loop
>
> The other approach would be to SCASB to the end of the source string,
> then move that many bytes with REP MOVSB.  Neither way is particularly
> appealing.  So like, how do the pros do it?  What's the deal?

Here's a quick comparison of the two approaches above, using 8086
cycle counts.  I've left out the overhead of setting up all the
appropriate registers before getting to the loop itself.

    Repeat_Here:
            LODSW                   ; 12 cycles
            OR      AL,AL           ;  3 cycles
            JZ      Exit            ;  4 cycles to stay in the loop
            STOSW                   ; 11 cycles
            OR      AH,AH           ;  3 cycles
            LOOPNZ  Repeat_Here     ; 19 cycles to stay in the loop
    Exit:                           ; Total: 52 cycles per 2 characters
                                    ; (if the JZ was taken, the zero byte
                                    ;  still has to be stored here)

            REPNE SCASB             ; 15 cycles per character
            REP MOVSW               ; 17 cycles per 2 characters
                                    ; Total: 47 cycles per 2 characters

But why try to be so clever?  My approach is MUCH simpler: just move a
number of characters equal to the lesser of the declared lengths of the
two strings.  Since REP MOVSW is only 17 cycles per 2 characters, against
47 for scanning and then moving, this is more efficient whenever the
string actually stored is longer than about 1/3 of the declared length
being moved.  On the average (if string lengths are spread evenly) this
wins 2 out of 3 times, and the code is both smaller and easier to read.

--
{hplabs,amdahl,ames}!bnrmtv!perkins        --Henry Perkins

It is better never to have been born.  But who among us has such luck?
One in a million, perhaps.
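The fixed-length idea and the break-even arithmetic can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the poster's actual code: the function names are invented here, and the cycle estimates simply reuse the per-pair figures quoted above (47 cycles per 2 characters for scan-then-move, 17 per 2 for the plain block move) while ignoring setup overhead. One caveat: if the destination is declared shorter than the string held in the source, the result is not zero-terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the lesser of the two declared sizes,
 * without looking for the terminator at all.  Returns bytes moved. */
static size_t copy_fixed(char *dst, size_t dst_size,
                         const char *src, size_t src_size)
{
    size_t n = dst_size < src_size ? dst_size : src_size;

    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);        /* analogue of a single REP MOVSW */
    return n;
}

/* Rough cycle estimates built from the figures quoted in the post.
 * `len` is the actual string length, `declared` the declared size. */
static double cycles_scan_then_move(size_t len)
{
    return 47.0 * (double)len / 2.0;
}

static double cycles_fixed(size_t declared)
{
    return 17.0 * (double)declared / 2.0;
}
```

With these estimates the fixed-length copy comes out ahead once `len` exceeds 17/47 of `declared`, roughly 36 percent, which is where the "about 1/3" figure comes from. For a declared size of 100, a 40-character string favours the fixed copy and a 30-character string favours scan-then-move.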