stergios@Jessica.stanford.edu (stergios marinopoulos) (09/08/89)
I wanted a faster bcopy, so I used Duff's device as the basis for one. It also copies an int at a time instead of a char, and the loop is unrolled a little too. It's been working well for me today, so it has to be perfect, right? I have been seeing 4X speedups, so I thought I would pass it along.

A potential problem is the char *'s not being aligned, but I have not run into it. Also, this probably will not copy strings smaller than 32 bytes (no problem for me, I wanted to copy megs-o-stuff.)

Let me know what you think. Of the code or anything else for that matter.

sm

**********************************************************************

#define IFACTOR 4

dcopy(chardest, charsrc, size)
char *chardest, *charsrc ;
int size ;
{
    register int *src, *dest, intcount ;
    int startcharcpy, intoffset, numints2cpy, i ;

    numints2cpy  = size >> 2 ;                   /* whole ints to copy (assumes 4-byte ints) */
    startcharcpy = numints2cpy << 2 ;            /* byte offset where the char copy starts */
    intcount     = numints2cpy & ~(IFACTOR-1) ;  /* ints that fall in full groups of IFACTOR */
    intoffset    = numints2cpy - intcount ;      /* ints in the partial group, 0 to 3 */

    if (numints2cpy > 0) {
        /* with no partial group, start at the last full group
           rather than one group past the end of the buffers */
        if (intoffset == 0)
            intcount -= IFACTOR ;

        src  = (int *) charsrc  + intcount ;
        dest = (int *) chardest + intcount ;

        /* copy the ints, highest group first; the switch jumps into the
           unrolled loop body so the partial group is done on the first pass */
        switch(intoffset)
            do {
                case 0: dest[3] = src[3] ;
                case 3: dest[2] = src[2] ;
                case 2: dest[1] = src[1] ;
                case 1: dest[0] = src[0] ;
                        intcount -= IFACTOR ;
                        dest -= IFACTOR ;
                        src -= IFACTOR ;
            } while (intcount >= 0) ;
    }

    /* copy the chars left over by the int copy at the end */
    for(i=startcharcpy ; i<size ; i++)
        chardest[i] = charsrc[i] ;
}
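The alignment caveat raised in the post can be handled by testing both pointers before taking the int-at-a-time path. What follows is a minimal sketch in ANSI C, not the poster's routine: the name dcopy_checked is hypothetical, it copies forward rather than from the top down, and it assumes that converting a pointer to uintptr_t yields an address whose remainder modulo sizeof(int) reflects alignment (true on mainstream platforms, not guaranteed by the standard). It also assumes the buffers may legitimately be accessed as int, for example int arrays or memory from malloc.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the original post: copy size bytes,
   using an int-wide Duff's device only when both pointers are aligned. */
void dcopy_checked(void *destp, const void *srcp, size_t size)
{
    unsigned char *cdest = destp;
    const unsigned char *csrc = srcp;
    size_t nints = 0;   /* whole ints handled by the word loop */
    size_t i;

    /* Assumption: address % sizeof(int) == 0 means int-aligned. */
    if ((uintptr_t)destp % sizeof(int) == 0 &&
        (uintptr_t)srcp  % sizeof(int) == 0) {
        int *dest = destp;
        const int *src = srcp;
        size_t passes;

        nints  = size / sizeof(int);
        passes = (nints + 3) / 4;   /* loop trips, counting a partial first trip */

        if (nints > 0) {
            /* Duff's device: enter the unrolled body part-way through
               so the nints % 4 leftover ints are copied on the first trip. */
            switch (nints % 4) {
            case 0: do { *dest++ = *src++;
            case 3:      *dest++ = *src++;
            case 2:      *dest++ = *src++;
            case 1:      *dest++ = *src++;
                    } while (--passes > 0);
            }
        }
    }

    /* Trailing bytes after the word loop, or every byte if unaligned. */
    for (i = nints * sizeof(int); i < size; i++)
        cdest[i] = csrc[i];
}
```

A call such as dcopy_checked(dst, src, nbytes) takes the word path for aligned buffers and quietly falls back to a byte loop otherwise, so short or odd-sized copies still work. Whether it beats the library memcpy or bcopy on a given machine is something to measure, not assume.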