[comp.unix.questions] Strcpy on SysV vs. BSD.

hsw@sparta.com (Howard Weiss) (08/31/90)

I've run into an interesting problem trying to use strcpy on an i386 machine
running Interactive 386/ix (SysV 3.2).  At first I thought the library
routine for strcpy was blown away, but when I substituted my own string
copy routine and got the same results (a core dump), I knew something
strange was going on.  Here is a short C program that demonstrates the problem:

main(){
  char *TTx = "/dev/";
  char tty[10]; /* works on both SysV and BSD */
/*  char *tty;	/* works only on BSD */
  strcpy(tty,TTx);
  printf("what's in tty now is %s\n",tty);
}

When I tried using the above program on SysV with the 'char *tty;'
declaration, it compiles fine, but core dumps when run.  The same
thing occurs if I substitute 'while (*tty++ = *TTx++)' in place of the
library strcpy.  Yet, the 'char *tty' compiles and runs fine on BSD!
To get this to work on SysV, I used the 'char tty[10]' declaration.

Any words of wisdom as to why this is the case would be appreciated!
I've worked on UNIX systems since V6 (in 1976) and I've never seen
this before.

Thanks,

Howard Weiss

Sparta,Inc.
Columbia, Md. 21046
hsw@sparta.com

hunt@dg-rtp.dg.com (Greg Hunt) (09/01/90)

In article <24351@adm.BRL.MIL>, hsw@sparta.com (Howard Weiss) writes:
> Here is a short C program that demonstrates the problem:
> 
> main(){
>   char *TTx = "/dev/";
>   char tty[10]; /* works on both SysV and BSD */
> /*  char *tty;	/* works only on BSD */
>   strcpy(tty,TTx);
>   printf("what's in tty now is %s\n",tty);
> }
> 
> When I tried using the above program on SysV with the 'char *tty;'
> declaration, it compiles fine, but core dumps when run.  The same
> thing occurs if I substitute 'while (*tty++ = *TTx++)' in place of the
> library strcpy.  Yet, the 'char *tty' compiles and runs fine on BSD!
> To get this to work on SysV, I used the 'char tty[10]' declaration.
> 
> Howard Weiss
> 

The problem isn't with strcpy, SysV, or BSD, there is an error in the
program.

When you use 'char *tty;', you've built a 'pointer to a char', which
is how you refer to a string in C.  However, the pointer hasn't been
initialized to anything; it doesn't point to any allocated memory.

When you then try the 'strcpy (tty, TTx);', you're trying to copy
information using an uninitialized pointer.  Apparently on the BSD
system you used, the pointer had 'good enough garbage' in it that
it was pointing to valid memory.  In that case, the program
destructively overwrote some part of its address space with the
string.  Ouch!

On the other systems you tried, the pointer had bad garbage in it
(possibly null).  When the program tried to dereference the pointer
it took a validity trap, causing the core dump.  This is the result
I got on my machine when I tried it.

This didn't occur, as you noted, when you used 'char tty [10];',
because tty in that case is a pointer to an array of characters and
the compiler initialized the pointer, tty, to point to the allocated
area of memory that it created to hold the 10 elements of the array.

You could also solve the problem by using malloc to allocate an area
of memory and assign the pointer returned by malloc to tty.  It will
then point to valid memory and the strcpy will work.

Some compilers can catch the unintentional use of uninitialized
variables like this if you use some of their warning switches.  Lint
may also be able to detect things like this (never having used lint,
I don't know; the compiler I use generates nice warnings for
uninitialized variables).

Aren't pointers fun?  I hope my explanation is clear.  Enjoy!

--
Greg Hunt                        Internet: hunt@dg-rtp.dg.com
DG/UX Kernel Development         UUCP:     {world}!mcnc!rti!dg-rtp!hunt
Data General Corporation
Research Triangle Park, NC       These opinions are mine, not DG's.

avg@hq.demos.su (Vadim G. Antonov) (09/02/90)

In article <24351@adm.BRL.MIL> hsw@sparta.com (Howard Weiss) writes:

> Here is a short C program that demonstrates the problem:
>
>main(){
>  char *TTx = "/dev/";
>  char tty[10]; /* works on both SysV and BSD */
>/*  char *tty;	/* works only on BSD */
>  strcpy(tty,TTx);
>  printf("what's in tty now is %s\n",tty);
>}
>
>Yet, the 'char *tty' compiles and runs fine on BSD!

	Generally speaking, it SHOULD NOT work, because the only
	thing this program does (in the commented-out version) is
	write the bytes "/dev/" to some undefined place pointed to
	by the uninitialized pointer `tty'. You're lucky: Sys V caught
	you and did not let you make a serious mistake.
	Errors of this sort are very easy to put in and very hard
	to get out of the program.

>I've worked on UNIX systems since V6 (in 1976) and I've never seen
>this before.

	I've also worked on Unix V6; but I've run across
	uninitialized pointers many times! :-)

>Howard Weiss

	Vadim Antonov,
	DEMOS, Moscow, USSR
	(it is NOT a joke)

chris@mimsy.umd.edu (Chris Torek) (09/02/90)

>In article <24351@adm.BRL.MIL> hsw@sparta.com (Howard Weiss) writes:
[Judging from your address, you probably sent this to a mailing list.
 Nonetheless, this is the wrong place for it.  The proper place is
 comp.lang.c, aka INFO-C, although I do not recall the list redistribution
 site.]
>>main(){
>>  char *TTx = "/dev/", *tty;
>>  strcpy(tty,TTx);
>>}

In article <1990Aug31.202707.14353@dg-rtp.dg.com>
hunt@dg-rtp.dg.com (Greg Hunt) writes:
>The problem isn't with strcpy, SysV, or BSD, there is an error in the
>program.

Correct.

>This didn't occur, as you noted, when you used 'char tty [10];',
>because tty in that case is a pointer to an array of characters and ....

Not quite.  When declared as an array, `tty' *IS* an array.  (The
only exception to this occurs when declaring formal parameters.)  This
means that sizeof(tty) is 10*sizeof(char), and `&tty' is a value
of type `pointer to array 10 of char' (or, in Classic C, simply an
error).  In other contexts, `tty' *ACTS LIKE* a pointer.  That does
not make it one.

Whenever an array object is used in a value (aka `rvalue') context, the
compiler changes the triple <object, array N of T, foo> to the triple
<value, pointer to T, &foo[0]>.  Here, if you change the program to

	main() {
		char *strcpy();
		char tty[10];
		strcpy(tty, "/dev/");
	}

the call to strcpy is originally:

	[CALL] <object, function returning pointer to char, strcpy>
	  [ARGUMENT] <object, array 10 of char, tty>
	  [ARGUMENT] <object, array 6 of char, "/dev/">

The two arguments are both in value contexts, so they both undergo the
usual conversion, and the compiler comes up with

	[CALL] <object, function returning pointer to char, strcpy>
	  [ARGUMENT] <value, pointer to char, &tty[0]>
	  [ARGUMENT] <value, pointer to char, '/' in { / d e v / \000 }>

In the CALL context the function object also converts (to a pointer to
the corresponding function); and the compiler then arranges for the
converted arguments to be stuffed into an envelope, sealed, stamped,
and mailed off to strcpy() ... or whatever else is a convenient way to
get them there.

(In New C, strcpy as declared above is `function (unknown args)
returning pointer to char', rather than `function returning pointer to
char'; using `#include <string.h>' would get `pointer to function
(pointer to char, pointer to readonly char) returning pointer to
char'.  This allows the compiler to type-check the arguments.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

gwyn@smoke.BRL.MIL (Doug Gwyn) (09/02/90)

In article <24351@adm.BRL.MIL> hsw@sparta.com (Howard Weiss) writes:
>main(){
>  char *TTx = "/dev/";
>  char tty[10]; /* works on both SysV and BSD */
>/*  char *tty;	/* works only on BSD */
>  strcpy(tty,TTx);
>  printf("what's in tty now is %s\n",tty);
>}
>When I tried using the above program on SysV with the 'char *tty;'
>declaration, it compiles fine, but core dumps when run.  The same
>thing occurs if I substitute 'while (*tty++ = *TTx++)' in place of the
>library strcpy.  Yet, the 'char *tty' compiles and runs fine on BSD!
>To get this to work on SysV, I used the 'char tty[10]' declaration.
>Any words of wisdom as to why this is the case would be appreciated!
>I've worked on UNIX systems since V6 (in 1976) and I've never seen
>this before.

Heavens, you've made one of the most common blunders made by novice C
programmers.  In the pointer form of the example, the "tty" variable
is not made to point to valid storage and in fact could contain
arbitrary garbage, since it has never been initialized.  When you
supply this garbage-valued pointer to strcpy(), strcpy() attempts to
use it to point into the target for the copied characters, with
unpredictable results.  Apparently one of the systems you tried this
on happens to not fail very miserably, while the other one does.  The
program is, however, not correct for any system.

The following rewrite contains some instructive comments:

	#include <stdio.h>	/* required for printf() */
	#include <string.h>	/* to properly declare strcpy() */
	int			/* (making this explicit is optional) */
	main( argc, argv )	/* ANSI C permits main() to optionally
				   have no arguments, but old C doesn't */
		int	argc;	/* (making this explicit is optional) */
		char	*argv[];/* or char **argv; which is equivalent
				   for function parameters, but NOT the
				   same thing elsewhere */
	{
		static		/* might as well let the linker init it */
		char	TTx[] =	/* no sense in using a pointer where the
				   array name itself could be used */
			"/dev/";/* initial contents of the TTx[6] array */
		char	tty[10];/* allocates auto storage for 10 chars */
		char	*tp = tty;	/* METHOD 2: points to tty[0] */

		(void)		/* (making this explicit is optional) */
		strcpy( tty, TTx );	/* copy 6 chars into tty[.] */
		(void)strcpy( tp, TTx );/* METHOD 2; same result */
		(void)		/* (making this explicit is optional) */
		printf( "What's in the initial part of tty[] is \"%s\"\n",
			tty	/* or tp */
		      );
		return 0;	/* return "successful" exit status */
	}

cpcahil@virtech.uucp (Conor P. Cahill) (09/02/90)

In article <24351@adm.BRL.MIL> hsw@sparta.com (Howard Weiss) writes:
>main(){
>  char *TTx = "/dev/";
>  char tty[10]; /* works on both SysV and BSD */
>/*  char *tty;	/* works only on BSD */

While you might get away with the *tty stuff for small strings on BSD systems,
it is very bad code.  What is happening is that *tty is getting a default
value (probably zero in this case) and you are then writing data to that
location (which has not been allocated to you and which *should* result in
a core dump).

The fact that you didn't get a core dump probably means that the startup 
code on your BSD system happens to leave some non-zero data on the stack
that just happens to point to an area that you can write to.  This is not
expected behavior and will probably not work in all cases (even under your
BSD system).  

Remember, you can NEVER NEVER NEVER write to an area pointed to by a pointer
that you have not yet initialized (set to point to a data area that you 
can write to) and expect the code to work.

>I've worked on UNIX systems since V6 (in 1976) and I've never seen
>this before.

I don't know what systems you have tried this on before, but it should
fail on almost every system (or at least it should clobber some other
variables).

-- 
Conor P. Cahill            (703)430-9247        Virtual Technologies, Inc.,
uunet!virtech!cpcahil                           46030 Manekin Plaza, Suite 160
                                                Sterling, VA 22170 

guy@auspex.auspex.com (Guy Harris) (09/03/90)

>Heavens, you've made one of the most common blunders made by novice C
>programmers.

Or, to put it another way, "just because routine 'foobar()' is defined
as taking a 'char *' as an argument doesn't mean:

1) what you pass to it has to be a variable declared as a 'char *'

or

2) that you can just pass it any random variable declared as a 'char
   *'."

I suspect the reason some novice C programmers make this blunder is that
they see something like

	int stat(path, buf)
	char *path;
	struct stat *buf;

in the manual, and think "ok, I have to declare a 'struct stat *' to
pass to 'stat()'," and do something such as

	struct stat *buf;

	...

	if (stat("my_file", buf) < 0)
		...

ott@guug.guug.de (Joachim Ott Munich-Germany) (09/03/90)

In article <24351@adm.BRL.MIL>, hsw@sparta.com (Howard Weiss) writes:
> ...
> main(){
>   char *TTx = "/dev/";
>   char tty[10]; /* works on both SysV and BSD */
> /*  char *tty;	/* works only on BSD */
> 
> ...

As Greg Hunt already said, the pointer 'char *tty' contains a random
value. lint would have said 'warning: tty may be used before set'.
Obviously you do not use lint; try to use it from now on.

Joachim Ott	ott@guug.guug.de