[comp.sys.ibm.pc] bug in Turbo C 1.5

tim@cit-vax.Caltech.Edu (Timothy L. Kay) (01/24/88)

To the best of my recollection, I don't recall having seen any bug
reports so far for Turbo C 1.5.  Am I the first person to report a
bug in Turbo C 1.5?

This program

	----------------------------------
	#include <stdio.h>
	#include <string.h>

	void main() {
	  char *s = "this is a test";
	  printf("%s\n", s);
	  memcpy(s + 1, s, strlen(s) - 1);
	  printf("%s\n", s);
	}
	----------------------------------

displays

	--------------
	this is a test
	tthis is a tes
	--------------

under Turbo C 1.0, and

	--------------
	this is a test
	tthhssii    ee
	--------------

under Turbo C 1.5.

The Turbo C 1.0 memcpy() detects if the source and destination
overlap and switches the direction of the copying.  Borland must have
decided that the memcpy() routine should not be so smart.  The Turbo
C 1.5 memcpy() generates the above (rather peculiar) pattern by doing
word-at-a-time copying, always in the positive direction.

To see how the pattern was generated, look at the copy one word (two
bytes) at a time.  Each ^^ marks the word about to be copied.  Its
destination starts at the second ^, one byte further along.

	--------------
	this is a test
	^^
	tths is a test
	  ^^
	tthhsis a test
	    ^^
	tthhssi a test
	      ^^
	tthhssii  test
	        ^^
	tthhssii   est
	          ^^
	tthhssii    et
	            ^^
	tthhssii    ee
	--------------
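
The trace above can be reproduced with a small model of a forward,
word-at-a-time copy.  This is only a sketch: forward_word_copy is a
hypothetical name, and the code is not Borland's routine.  It assumes
a "word" is two bytes and that each word is read in full before it is
written.

```c
#include <stddef.h>

/* Hypothetical model of a forward, word-at-a-time copy.  It is not
   Borland's code; it only mimics the behaviour described above. */
void forward_word_copy(char *dst, const char *src, size_t n)
{
    size_t i = 0;

    /* Copy whole 2-byte words: read both bytes, then write both. */
    while (i + 2 <= n) {
        char b0 = src[i];
        char b1 = src[i + 1];
        dst[i] = b0;
        dst[i + 1] = b1;
        i += 2;
    }
    /* An odd count leaves one trailing byte. */
    if (i < n)
        dst[i] = src[i];
}
```

Calling forward_word_copy(s + 1, s, strlen(s) - 1) on a writable array
holding "this is a test" leaves "tthhssii    ee" in the array, matching
the last line of the trace.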

I have worked around this bug by extracting the old version out of
the old library.

	tlib /C \tc10\lib\ch.lib *memcpy.obj

Then I added "memcpy.obj" to my project, and it works.

Tim

crh@hpcvmb.HP (Ron Henderson) (01/26/88)

>reports so far for Turbo C 1.5.  Am I the first person to report a
>bug in Turbo C 1.5?
 ^^^
A bug?  I don't think so.

>
>This program
>
>	----------------------------------
>	#include <string.h>
>
>	void main() {
>	  char *s = "this is a test";
>	  printf("%s\n", s);
>	  memcpy(s + 1, s, strlen(s) - 1);
>	  printf("%s\n", s);
>	}
>	----------------------------------
>

According to the README file, line 189, the memcpy function now conforms
to the ANSI standard. If you want the 'smart' move, you should use
memmove, not memcpy.
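
As a minimal sketch of that advice, the overlapping shift from the
test program can be written with memmove().  The helper name
shift_right_one is hypothetical, and the buffer is assumed to be a
writable array rather than a string literal.

```c
#include <string.h>

/* Shift the text in buf right by one byte, keeping buf[0].
   buf must be a writable array, not a string literal. */
void shift_right_one(char *buf)
{
    size_t len = strlen(buf);

    if (len > 1)
        memmove(buf + 1, buf, len - 1);  /* overlap is allowed here */
}
```

Applied to an array holding "this is a test", this gives
"tthis is a tes", the same result the Turbo C 1.0 memcpy() produced.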

Ron ...!hplabs!hp-pcd!crh

skl@sklpc.vnet.van-bc.UUCP (Samuel Lam) (01/26/88)

> To the best of my recollection, I don't recall having seen any 
> bug reports so far for Turbo C 1.5.  Am I the first person to 
> report a bug in Turbo C 1.5?  

No, since what you have observed is not a bug, but a *very* 
intentional feature put in by Borland in version 1.5.  For more 
details, open the README file that comes on disk #1 of your 
Turbo-C 1.5 distribution and search for the string "memcpy"; the 
information is in the errata section of the file.

> The Turbo C 1.0 memcpy() detects if the source and destination 
> overlap and switches the direction of the copying.  Borland 
> must have decided that the memcpy() routine should not be so 
> smart.  

Borland changed memcpy() in order to make its behaviour 
comply with the ANSI C draft standard.  

> I have worked around this bug by extracting the old version out 
> of the old library.  

The above README file points out that the library function 
memmove() will handle overlapping regions properly.

...Sam
-- 
Samuel Lam   {ihnp4!alberta,watmath,uw-beaver}!ubc-vision!van-bc!sklpc!skl

robf2@pyuxf.UUCP (robert fair) (01/27/88)

In article <5298@cit-vax.Caltech.Edu>, tim@cit-vax.UUCP writes:
# To the best of my recollection, I don't recall having seen any bug
# reports so far for Turbo C 1.5.  Am I the first person to report a
# bug in Turbo C 1.5?
# 
# This program
# 
# 	----------------------------------
# 	#include <string.h>
# 
# 	void main() {
# 	  char *s = "this is a test";
# 	  printf("%s\n", s);
# 	  memcpy(s + 1, s, strlen(s) - 1);
# 	  printf("%s\n", s);
# 	}
# 	----------------------------------
# 
# displays
# 
# 	--------------
# 	this is a test
# 	tthis is a tes
# 	--------------
# 
# under Turbo C 1.0, and
# 
# 	--------------
# 	this is a test
# 	tthhssii    ee
# 	--------------
# 
# under Turbo C 1.5.
This is precisely correct behaviour - the action of memcpy() on overlapping
regions is undefined in the draft ANSI standard. It looks like Borland took
advantage of this to do memcpy() faster, but it only works on non-overlapping 
areas.

Simply don't use memcpy() on overlapping regions. There should be a function
which does what you want if Turbo is a decent compiler.

Incidentally, MSC memcpy() changed in *exactly* the same way from 4.0 to 5.0 -
but at least Microsoft documented the change and provided a function which 
worked in the old way for compatibility [ memmove() ]

Personally I think the entire thing sucks - the function should be
able to detect overlapping areas and take special action in such cases :(
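
A rough sketch of what detecting overlap and taking special action
could look like follows.  The name overlap_safe_copy is hypothetical
and this is not any vendor's code.  It assumes a flat address space in
which pointers can be compared as integers, which does not hold for
segmented far pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical overlap-aware copy; a sketch, not any vendor's code. */
void *overlap_safe_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Compare addresses as integers; assumes a flat address space. */
    if ((uintptr_t)d <= (uintptr_t)s || (uintptr_t)d >= (uintptr_t)s + n) {
        /* Destination below the source (or no overlap): copy forward. */
        while (n--)
            *d++ = *s++;
    } else {
        /* Destination inside the source region: copy backward. */
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    }
    return dst;
}
```

The extra comparison and second loop are the cost that a plain
forward-only copy avoids.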
-- 
Robert L. Fair
Bell Communications Research/CHC
Piscataway, NJ
{ihnp4,allegra}!pyuxww!pyuxf!robf2

tim@cit-vax.Caltech.Edu (Timothy L. Kay) (01/27/88)

>... ("error" in mem_copy)
>
>Please read your Readme file...

I'd like to thank all the people who explained to me that the change
in memcpy was noted in the readme file that came with the disks.  Somehow
I had missed this even though I had read the readme file.

I'd like to make two points.  First, why was it necessary for Borland
to *change* memcpy to conform to the standard?  It seems to me that the
old version of memcpy already conformed to the standard.  In changing
memcpy, they only caused people like me inconvenience.  (Please don't
suggest that some people want to use memcpy to clear memory.)

The second point is this.  After I posted my (not-a-)bug report, I received
several notes via mail that pointed out my error.  Then, six days later,
I am still seeing postings that tell me the same thing.  I think
followups are an inappropriate way to answer such questions.  In the
future, if there are questions such as mine, please mail the answer.  The
original poster should then summarize.

Tim

scjones@sdrc.UUCP (Larry Jones) (01/27/88)

In article <2490@emory.uucp>, platt@emory.uucp (Dan Platt) writes:
> 
> The version 1.5 fixes a "bug" in the 1.0 version by NOT checking for overwrites
> in the memcopy function.  The memmove function preserves the old (safe) 
> technique.  [by following a standard like ANSI a lot of inferiorities are
> introduced]

Sorry, but I get tired of reading statements like these.  For the last time,
ANSI did not REQUIRE anyone to change the way memcpy works.  ANSI does not say
that memcpy can't check for overlapping moves and do them correctly.  It merely
acknowledges that many implementations don't check and that quite a lot of
people expect memcpy to be blindingly fast, and so it says that not checking
is OK.  For those who need a function to handle overlapping moves correctly,
ANSI added memmove.  So quit blaming ANSI and start blaming { Microsoft,
Borland, ... }
'cause THEY'RE the ones that decided to change their implementations.

chip@ateng.UUCP (Chip Salzenberg) (01/28/88)

In article <1596@imagen.UUCP> mark@imagen.UUCP (Mark Peek) writes:
}
}If you'd like a real bug, try compiling a program in huge model and do
}pointer arithmetic. Unless you specify "huge" in the pointer declaration,
}you get a "far" pointer. This can cause a lot of problems because the offset
}arithmetic does not roll over into the segment. Maybe in 1.6 ???

This is a documented feature.  :-)

"Huge model" and "huge pointers" have nothing to do with each other.
It's too bad that the same word is used to describe them.

Actually, it's good that far pointers are the default in huge model, since
huge pointers are _slow_.  And I do mean _slow_.
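
The difference can be modelled in portable C.  This is a sketch under
stated assumptions: the struct and function names are hypothetical, it
uses the 8086 real-mode rule that a linear address is segment * 16 +
offset, and it assumes huge pointers are kept normalized with an
offset of 0..15.  It ignores wraparound at the top of the 1 MB address
space.

```c
#include <stdint.h>

/* Hypothetical model of an 8086 segment:offset pointer. */
struct seg_ptr {
    uint16_t seg;
    uint16_t off;
};

/* Real-mode linear address: segment * 16 + offset. */
uint32_t linear(struct seg_ptr p)
{
    return ((uint32_t)p.seg << 4) + p.off;
}

/* Far-style arithmetic: only the 16-bit offset changes, so it wraps. */
struct seg_ptr far_add(struct seg_ptr p, uint16_t n)
{
    p.off = (uint16_t)(p.off + n);
    return p;
}

/* Huge-style arithmetic: work on the linear address, then renormalize
   so the offset stays in 0..15.  The extra work is why it is slower. */
struct seg_ptr huge_add(struct seg_ptr p, uint16_t n)
{
    uint32_t lin = linear(p) + n;

    p.seg = (uint16_t)(lin >> 4);
    p.off = (uint16_t)(lin & 0xF);
    return p;
}
```

Starting from 1000:FFFF and adding 1, far_add yields 1000:0000 (the
offset wrapped, so the address went backward by 64K less one), while
huge_add yields 2000:0000.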
-- 
Chip Salzenberg                 UUCP: "{codas,uunet}!ateng!chip"
A T Engineering                 My employer's opinions are a trade secret.
       "Anything that works is better than anything that doesn't."

hsu@santra.UUCP (Heikki Suonsivu) (01/28/88)

In article <5298@cit-vax.Caltech.Edu> tim@cit-vax.UUCP (Timothy L. Kay) writes:
>The Turbo C 1.0 memcpy() detects if the source and destination
>overlap and switches the direction of the copying.  Borland must have
>decided that the memcpy() routine should not be so smart.  The Turbo

'man memcpy' and the other C manuals I have around state: "Because
character movement is performed differently in different
implementations, overlapping moves (memcpies) may yield unexpected
results". Relying on such a feature, even if it exists in a particular
compiler/library, will certainly make code unportable.

I haven't read the Turbo C manuals, so I have no idea whether they say
anything about this, but I hope they do.  Otherwise there will be
lots of nice code that runs on every machine with Turbo C, while the
rest of the world has lots of fun porting it, especially since
bugs caused by this kind of thing are sometimes quite hard to find.

Inet: hsu@santra ................. Kuutamokatu 5 A 7 
Uucp: ...!mcvax!santra!hsu ....... 02210 Espoo .....
Fido: Heikki Suonsivu at 2:504/1 . FINLAND .........

carlj@hpcvmb.HP (Carl Johnson) (01/29/88)

>The Turbo C 1.0 memcpy() detects if the source and destination
>overlap and switches the direction of the copying.  Borland must have
>decided that the memcpy() routine should not be so smart.  The Turbo

I think Borland is right not to bother making the memcpy() so smart.
If you think about what a _COPY_ operation implies, an overlapping copy
makes no sense whatsoever.  In general it is impossible to have
overlapping strings be identical.  A move operation on the other hand
must be able to handle overlaps, so it is important to have an
intelligent move function such as memmove().

-----
Carl Johnson
...!hp-labs!hp-pcd!carlj

rbradbur@oracle.UUCP (Robert Bradbury) (02/14/88)

In article <5331@cit-vax.Caltech.Edu> tim@cit-vax.UUCP (Timothy L. Kay) writes:
>
>I'd like to make two points.  First, why was it necessary for Borland
>to *change* memcpy to conform to the standard?  It seems to me that the
>old version of memcpy already conformed to the standard.  In changing
>memcpy, they only caused people like me inconvenience.  (Please don't
>suggest that some people want to use memcpy to clear memory.)

Ah, a question I can answer finally :-).  As one of the prime motivators
of the addition of memmove() to the standard I can explain the reasons
for this.  The problem revolves around trying to serve 2 masters in the
C runtime library: efficiency and portability.  The original (to the best
of my knowledge) definition for memcpy() (Unix System V memory(3C)) is:

  "memcpy copies *n* characters from memory area *s2* to *s1*".

It is undefined what happens if the memory areas overlap.  On the VAX
memcpy() was implemented using movc3/c5 instructions which have clever
microcode which handles overlapping copies correctly.  On a variety of
other machines implementors had less smart instructions and chose to
implement memcpy() efficiently using instructions which did not handle
overlapping areas.  This gets particularly important as compiler vendors
are now providing in-line versions of memcpy() which *usually* do not
handle overlapping copies.  (Supporting overlapping copies involves
adding extra code on most machines.)  We (at Oracle) build large DBMS
which run on a variety of machines and desire that functions which move
bytes of data around should be both efficient AND portable.

In order to have efficient in-line functions we recommended the C standard
support memcpy() as not having to handle overlapping copies.  To handle
copies of overlapping areas portably (and efficiently on machines with
clever microcode) the function memmove() was added.

Borland simply changed memcpy to be the more efficient implementation
allowed by the current draft of the C standard.

I'll admit we could have gone the other way on the names (adding memecpy()
for efficient copies) but it was felt that the majority of current memcpy()
uses did not involve overlapping copies so "standardizing" the existing practice
of machines where memcpy() did not handle overlapping copies would not break
a lot of code on those machines where memcpy() did handle overlapping copies.

Most people have no idea of the effort which goes into standardizing something
like C.  Suffice to say that when something is done there are usually good
reasons behind it and that in many cases the compromises required are going
to upset someone.

On another note: does everyone realize that the current standard allows
the results of the strcmp()/memcmp() functions to be implementation-defined
if the characters being compared have the high-bit set?  The net result
is to prevent portable comparisons of unsigned chars or European/EBCDIC
character sets.  This has been pointed out to the committee but they
didn't feel it was "significant".  So if you are writing portable code
and plan to compare anything other than standard ASCII text you should
define your own comparison functions and not rely on the runtime library.
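
A minimal sketch of such a comparison function is shown here.  The
name unsigned_memcmp is hypothetical.  It treats every byte as an
unsigned value, so the ordering does not depend on whether plain char
is signed on a given compiler.

```c
#include <stddef.h>

/* Hypothetical replacement for memcmp() that always compares bytes
   as unsigned values, whatever the signedness of plain char. */
int unsigned_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    size_t i;

    for (i = 0; i < n; i++) {
        if (pa[i] != pb[i])
            return pa[i] < pb[i] ? -1 : 1;
    }
    return 0;
}
```

With this function a byte of 0x80 always sorts after 0x7F, which is
the property needed for high-bit character sets.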

-- 
Robert Bradbury
Oracle Corporation
(206) 784-9726                            hplabs!oracle!rbradbur