[comp.unix.questions] bcopy

guy@auspex.UUCP (Guy Harris) (12/10/88)

>[Sorry if this isn't the "perfect" group for this, but y'all were
> talking about index vs strchr so this can't be too far off.]

[Except "bcopy" isn't part of any ANSI C draft, so it's definitely not a
"C Standard" issue, it's more of a UNIX issue.  (The ANSI C drafts specify
"memcpy", and the S5 man page explicitly says that it's
implementation-dependent - "overlapping moves may yield surprises".)]
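
To make the "surprises" concrete, here is a minimal sketch in C.  The name
naive_copy is hypothetical, not any vendor's routine; it assumes an
implementation that always copies one byte at a time, left to right, which
is only one of the things a real "memcpy" might do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical routine: always copies left to right, one byte at a time,
 * with no check for overlapping regions. */
void naive_copy(char *dst, const char *src, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = src[i];    /* may read a byte this loop already overwrote */
}
```

With char a[] = "abcdef", the call naive_copy(a + 1, a, 5) leaves "aaaaaa"
in a, because each byte read has already been overwritten by the previous
store.  A copy that handles overlap, such as memmove(a + 1, a, 5), leaves
"aabcde".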

>What does the following mean?  [What should it do?]

It *appears* that the 4.3BSD implementation is intended not to yield
surprises.  The 4.3-tahoe documentation doesn't appear to make any such
claim (although the C-language version also appears to be intended not
to yield surprises).

Keith?  Chris?  If being unsurprising is intended to be a defined
characteristic of the interface, the man page should really state that,
so that implementors don't screw up.

andrew@alice.UUCP (Andrew Hume) (12/12/88)

don't let the twerps responsible for the memcpy/memmove debacle blind
implementers. memcpy and memmove should be the same entry point;
the test for overlapping regions is only a few instructions and just doesn't
matter in any practical timing sense. If the bytecount is at all significant,
initial overhead is irrelevant and if the bytecount is small, then the
subroutine call overhead is probably 2-3 times more expensive than the
check.
	in any case, in a lot of hardware, overlapping doesn't matter;
what matters is left to right or right to left.
	does anyone know of any case where the above analysis fails?
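
a rough sketch of the single-entry-point idea, in C. copy_bytes is a
hypothetical name, and the sketch assumes a flat address space in which
pointers can be compared as integers (ISO C does not define "<" on pointers
into different objects). it is a byte loop for illustration, not a tuned
implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single entry point that could serve as both memcpy and
 * memmove.  Only the copy direction is chosen; no explicit overlap test
 * is needed, since copying toward lower addresses is always safe left to
 * right and copying toward higher addresses is always safe right to left. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Assumption: a flat address space, so integer comparison is meaningful. */
    if ((uintptr_t)d <= (uintptr_t)s) {
        while (n--)
            *d++ = *s++;        /* left to right */
    } else {
        d += n;
        s += n;
        while (n--)
            *--d = *--s;        /* right to left */
    }
    return dst;
}
```

the direction test is one compare and one branch, which is the "few
instructions" referred to above.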

mcdonald@uxe.cso.uiuc.edu (12/12/88)

>don't let the twerps responsible for the memcpy/memmove debacle blind
>implementers. memcpy and memmove should be the same entry point;
>the test for overlapping regions is only a few instructions and just doesn't
>matter in any practical timing sense. If the bytecount is at all significant,
>initial overhead is irrelevant and if the bytecount is small, then the
>subroutine call overhead is probably 2-3 times more expensive than the
>check.
>	in any case, in a lot of hardware, overlapping doesn't matter;
>what matters is left to right or right to left.
>	does anyone know of any case where the above analysis fails?

memmove and memcpy should NOT be the same entry point. On most
computers they shouldn't HAVE entry points - they should be inline.
On sufficiently CISC cpu's they should be single instructions
if the arguments are constants. If not, they should still be very small.
It seems to me the idea is that memcpy should be the most efficient
possible way to copy SMALL things - maybe even things like structs
containing four bytes, in which case a 32 bit integer move instruction
could be generated (if the alignment was right). When ANSI becomes
standard, there will be standard benchmarks that cruelly penalize
compilers with slow memcpy and memmoves.
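
As an illustration of the small, constant-size case, here is a minimal
sketch in C.  The name copy4 is hypothetical.  Whether a given compiler
expands this call inline, or into a single 32-bit move, depends entirely on
the compiler and on alignment; the sketch only shows the kind of call that
could be treated that way.

```c
#include <string.h>

/* Hypothetical helper: copy a four-byte array with a constant count.
 * A compiler that treats memcpy as an intrinsic may expand this inline;
 * one that does not will simply call the library function. */
void copy4(unsigned char *dst, const unsigned char *src)
{
    memcpy(dst, src, 4);    /* regions are assumed not to overlap */
}
```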

gwyn@smoke.BRL.MIL (Doug Gwyn ) (12/13/88)

In article <8522@alice.UUCP> andrew@alice.UUCP (Andrew Hume) writes:
>memcpy and memmove should be the same entry point;

I fully agree with this for implementation on systems with small virtual
address spaces.  In cases where there is good hardware support for memcpy
but not for memmove, I would recommend making memcpy an in-line intrinsic
(there are acceptable ways to do this; the library must still include an
actual function having that name, and memmove's code can be shared for
that).  Really fancy implementations could also inline memmove even if
only memcpy semantics are directly supported by the hardware, but the
generated code would be more complex than for the expansion of memcpy.

>	in any case, in a lot of hardware, overlapping doesn't matter;
>what matters is left to right or right to left.

Which affects behavior on block-copy of overlapping regions.

>	does anyone know of any case where the above analysis fails?

If memcpy and memmove are implemented as functions, then the savings from
avoiding the special-case tests are, as you say, normally negligible.  But
in the case of in-line expansion, it can make a significant difference
(if there is a really good blitter in the hardware and the typical block
being transferred is of modest size).

guy@auspex.UUCP (Guy Harris) (12/15/88)

 >It seems to me the idea is that memcpy should be the most efficient
 >possible way to copy SMALL things - maybe even things like structs
 >containing four bytes, in which case a 32 bit integer move instruction
 >could be generated (if the alignment was right).

The C compilers I know of generate inline code, not calls to "memcpy" or
any other subroutine, for structure assignments; they may even generate
such a move instruction if appropriate.

gwyn@smoke.BRL.MIL (Doug Gwyn ) (12/16/88)

In article <712@auspex.UUCP> guy@auspex.UUCP (Guy Harris) writes:
>The C compilers I know of generate inline code, not calls to "memcpy" or
>any other subroutine, for structure assignments; they may even generate
>such a move instruction if appropriate.

Actually, as part of a bug fix in Gould's UTX-32 C compiler made here,
it called on bcopy() to copy large structures.  Guess what happened in
our System V environment..  Right, no bcopy().

mcdonald@uxe.cso.uiuc.edu (12/17/88)

/* Written  2:11 pm  Dec 14, 1988 by guy@auspex.UUCP in uxe.cso.uiuc.edu:comp.unix.questions */

 >It seems to me the idea is that memcpy should be the most efficient
 >possible way to copy SMALL things - maybe even things like structs
 >containing four bytes, in which case a 32 bit integer move instruction
 >could be generated (if the alignment was right).

The C compilers I know of generate inline code, not calls to "memcpy" or
any other subroutine, for structure assignments; they may even generate
such a move instruction if appropriate.
/* End of text from uxe.cso.uiuc.edu:comp.unix.questions */

My brain was on coffee break when I said "struct". I meant "array".

larryd1@attctc.Dallas.TX.US (Larry Clark) (02/09/90)

	I'm porting sources from the bsd world into a Sys V environment
	and need an explanation of bcopy() and bzero().  Would someone
	in a bsd environment explain the parameters of each call and
	possibly map them to memcpy() and memset()?

	Thanks
	attct!larryd1

echarne@orion.oac.uci.edu (Eli B. Charne) (02/11/90)

larryd1@attctc.Dallas.TX.US (Larry Clark) writes:


>	I'm porting sources from the bsd world into a Sys V environment
>	and need an explanation of bcopy() and bzero().  Would someone
>	in a bsd environment explain the parameters of each call and
>	possibly map them to memcpy() and memset()?


     Well, here's how bcopy and bzero work:

	 bcopy(b1, b2, length)
	 char *b1, *b2;
	 unsigned int length;

     bcopy() copies length bytes from the "string" b1 into the string b2.
     I was under the impression this was exactly what memcpy() does, but
     the order of the arguments is reversed: memcpy() takes the
     destination first and the source second.

	 bzero(b, length)
	 char *b;
	 unsigned int length;
    
     bzero() just sets length bytes, starting at b, to the value zero.
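
     One possible mapping onto the mem*() routines, as a sketch in C.  The
     names compat_bcopy and compat_bzero are made up for illustration, and
     the sketch assumes memmove() is available; on a system that has only
     memcpy(), overlapping copies are not guaranteed to work.

```c
#include <string.h>

/* Hypothetical shim: bcopy() takes (source, destination, length), while
 * the mem*() routines take the destination first.  memmove() is used
 * here on the assumption that callers may pass overlapping regions. */
void compat_bcopy(const void *b1, void *b2, size_t length)
{
    memmove(b2, b1, length);
}

/* Hypothetical shim: bzero() is memset() with a fill value of zero. */
void compat_bzero(void *b, size_t length)
{
    memset(b, 0, length);
}
```

     So bcopy(b1, b2, length) becomes memmove(b2, b1, length), and
     bzero(b, length) becomes memset(b, 0, length).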

					       Hope this helps

						    -Eli

-- 
--------
Eli B. Charne              INTERNET: echarne@ORION.OAC.UCI.EDU
			     BITNET: echarne@UCI.BITNET