martin@prodix.liu.se (Martin Wendel) (08/31/90)
Can anyone explain to me why this piece of code is OK to run:

    #include <stdio.h>
    #include <strings.h>
    main()
    {
        char line[];
        char *tmp = "1234";
        strcpy(line, tmp);
        printf("%s\n", line);
    }

when this produce a segmentation fault:

    #include <stdio.h>
    #include <strings.h>
    main()
    {
        char *line;
        char *tmp = "1234";
        strcpy(line, tmp);
        printf("%s\n", line);
    }

I have a sparcstation 1+ and run SUNOS 4.03 and I have tried the Sun C compiler and the GNU C compiler with and without the -ansi flag set, but they all behave the same.

Thanks in advance.

Martin Wendel, Postmaster at UDAC
Uppsala University Data Centre
Sysslomansgatan 21, S-750 02 Uppsala, SWEDEN
martin@solix.udac.uu.se, Martin.Wendel@UDAC.UU.SE, Postmaster@UDAC.UU.SE
Phone: 018 - 18 77 80, Int: +46 18 18 77 80
vu0310@bingvaxu.cc.binghamton.edu (R. Kym Horsell) (09/01/90)
In article <163@prodix.liu.se> martin@prodix.liu.se (Martin Wendel) writes:
\\\
> #include <stdio.h>
> #include <strings.h>
> main()
> {
>     char line[];
>     char *tmp = "1234";
>     strcpy(line, tmp);
>     printf("%s\n", line);
> }
\\\

Strictly speaking neither of these programs is ok.

In the first one you have allocated how many bytes for ``line''? (It is supposed to be an array.) In the second, which uses a char * instead of an array, how many bytes have you allocated to store them? None.

If you try:

    char line[5];

things will look a bit better -- remember to allocate 1 extra byte to store that pesky null at the end of every (normal) C string.

Alternatively,

    char *line;
    line = malloc(5);

should also work.

-Kym Horsell
gld2@clutx.clarkson.edu (E.W.D, ,0,0) (09/01/90)
From article <163@prodix.liu.se>, by martin@prodix.liu.se (Martin Wendel):
> Can anyone explain to me why this piece of code is OK to run:
> main()
> {
>     char line[];
>     char *tmp = "1234";
>     strcpy(line, tmp);
>     printf("%s\n", line);
> }

    main ()
    {
        char line[];
        char *tmp = "1234\336\255\276\357";
        int innocent = 030073335276;
        printf("wanted %08X\n", innocent);
        strcpy(line, tmp);
        printf("got %08X\n", innocent);
    }

    wanted C0EDBABE
    got DEADBEEF

(in certain endian contexts).

Eliot W. Dudley                 edudley@rodan.acs.syr.edu
RD 1 Box 66
Cato, New York 13033            315 437 0215
schlake@nmt.edu (William Colburn) (09/01/90)
In article <163@prodix.liu.se> martin@prodix.liu.se (Martin Wendel) writes:
>
>Can anyone explain to me why this piece of code is OK to run:
>
> #include <stdio.h>
> #include <strings.h>
> main()
> {
>     char line[];
>     char *tmp = "1234";
>     strcpy(line, tmp);
>     printf("%s\n", line);
> }
>
>when this produce a segmentation fault:
>
> #include <stdio.h>
> #include <strings.h>
> main()
> {
>     char *line;
>     char *tmp = "1234";
>     strcpy(line, tmp);
>     printf("%s\n", line);
> }
>

It seems to me that they should BOTH fail. You are copying a string to a pointer, and not having the pointer point anyplace. The fact that the first program works is pure luck.

    #include <stdio.h>
    #include <strings.h>
    char *malloc();
    int strlen();
    main()
    {
        char *line;
        char *tmp = "1234";
        int strsize;
        strsize=strlen(tmp);
        line=malloc(strsize+1);
        strcpy(line,tmp);
        printf("%s\n",line);
    }

Schlake
userAKDU@mts.ucs.UAlberta.CA (Al Dunbar) (09/01/90)
In article <163@prodix.liu.se>, martin@prodix.liu.se (Martin Wendel) writes:
>
>Can anyone explain to me why this piece of code is OK to run:
>
> #include <stdio.h>
> #include <strings.h>
> main()
> {
>     char line[];
>     char *tmp = "1234";
>     strcpy(line, tmp);
>     printf("%s\n", line);
> }
>
>when this produce a segmentation fault:
>
> #include <stdio.h>
> #include <strings.h>
> main()
> {
>     char *line;
>     char *tmp = "1234";
>     strcpy(line, tmp);
>     printf("%s\n", line);
> }
>

Simple. In the first case, strcpy receives the address of array line, and copies the string to it, clobbering whatever variables happen to follow the array in memory. The program has a bug, but it does not result in an exception.

In the second case the address that strcpy tries to use is the value of the pointer line. Since it has not been initialized you should not be surprised that it happens to point somewhere illegal.

-------------------+-------------------------------------------
Alastair Dunbar    | Edmonton: a great place, but...
Edmonton, Alberta  | before Gretzky trade: "City of Champions"
CANADA             | after Gretzky trade: "City of Champignons"
-------------------+-------------------------------------------
mccaugh@sunb0.cs.uiuc.edu (09/02/90)
Hmmmmm: "pure luck" that the first program works, comments the previous note. But the question originally posed did not address correctness of either program, but rather why the second version precipitates a segmentation fault where the first one does not. So what is the key difference in the two programs? It would appear to be in the declaration of variable 'line' which is a null (length = 0) string in the first declaration (char []) and a char-ptr in the second.

We are not informed as to whether the assignment (via 'strcpy') caused the problem or the subsequent 'printf' but I would suspect the latter. (If the former caused the problem in the second version, why not in the first?) Perhaps "%s" allows for "normal" execution in the first program -- since 'line' is technically a string -- but not in the second.

I, too, have encountered a similar problem with C compilers on VAXen. My point is that certain compilers MAY draw some serious distinction between char-ptrs and "true" strings (char [*]) even when the string is null.

Since un-initialized ptrs so often lead to segmentation faults, here is my guess as to what happened. The first declaration (char line [];) must have initialized variable 'line' as a char-ptr to some 0-length area, while the second declaration (char *line;) left 'line' un-initialized. Hence the value of 'line' in the first case was legitimate -- even if it addressed 0-length space -- while the un-initialized "value" of 'line' in the second case could not even be considered legitimate.

Scott McCaughrin
vu0310@bingvaxu.cc.binghamton.edu (R. Kym Horsell) (09/03/90)
In article <24700008@sunb0.cs.uiuc.edu> mccaugh@sunb0.cs.uiuc.edu writes:
\\\
> Since un-initialized ptrs so often lead to segmentation faults, here is my
>guess as to what happened. The first declaration (char line [];) must have
>initialized variable 'line' as a char-ptr to some 0-length area, while the
>second declaration (char *line;) left 'line' un-initialized. Hence the value
>of 'line' in the first case was legitimate -- even if it addressed 0-length
\\\

Your analysis seems substantially correct -- but why guess? Try running:

    main(){
        char a[];
        char *b;
        printf("%d\n",sizeof(a));
        printf("%d\n",sizeof(b));
    }

Output:

    0
    4

The *compiler* sure thinks ``a'' is zero length -- if any string copy is done there you will essentially be copying ``above the stack pointer'' (if nothing else is declared local). This *may*, but not necessarily, cause a trap (depends on the hardware).

-Kym Horsell
chris@mimsy.umd.edu (Chris Torek) (09/03/90)
In article <24700008@sunb0.cs.uiuc.edu> mccaugh@sunb0.cs.uiuc.edu writes:
>... the question originally posed did not address correctness of either
>program, but rather why the second version precipitates a segmentation
>fault where the first one does not. So what is the key difference in
>the two programs? It would appear to be in the declaration of variable
>'line' ...

This much is correct. Quick recap: the program that caused a `segmentation fault, core dumped' result was of the form

prog2>  main() { char *line; strcpy(line, "foo"); }

while the program that appeared to work was of the form

prog1>  main() { char line[]; strcpy(line, "foo"); }

>which is a null (length = 0) string in the first declaration (char [])

This is a slightly peculiar definition for `null string'. The program labelled `prog1' has a constraint violation: the subscript brackets in the declaration must not be empty. A buggy compiler allowed the empty declaration, and---since I happen to know the internal implementation of this compiler, I know what it did---treated it as `char line[0];', reserving zero bytes for the array `line'.

>and a char-ptr in the second. We are not informed as to whether the
>assignment (via 'strcpy') caused the problem or the subsequent 'printf'
>but I would suspect the latter.

You would suspect incorrectly.

>(If the former caused the problem in [prog2], why not in [prog1]?)

The actual generated code on a VAX for prog1 is (unoptimized but slightly simplified):

    _main:  .word   0               # save no registers
            subl2   $0,sp           # allocate 0 bytes of stack for line[]
            pushab  L1              # push &"foo"[0]
            pushab  (fp)            # push &line[0]
            calls   $2,_strcpy      # call strcpy()
            ret                     # return from main, no value
    L1:     .ascii  "foo\0"         # C string {f,o,o,\0}

Compare this with a correct program in which line[] is declared as `char line[4];':

    _main:  .word   0               # save no registers
            subl2   $4,sp           # allocate 4 bytes of stack for line[]
            pushab  L1              # push &"foo"[0]
            pushab  -4(fp)          # push &line[0]
            calls   $2,_strcpy      # call strcpy()
            ret                     # return from main, no value
    L1:     .ascii  "foo\0"         # C string {f,o,o,\0}

The only difference between these two programs at run time is what goes on the stack. Assume that the stack pointer sp in main() is 0x7fffeb80. (At the entry to a subroutine, the VAX makes sp==fp; fp is later used to mean `sp before we adjusted it with a subl2 or push instruction'.)

In the first program, the `subl2 $0' does not affect sp at all; then we have a pushab that pushes, say, 0x1000 on the stack, and in the process changes sp to 0x7fffeb7c. Then we have a `pushab (fp)'; this pushes 0x7fffeb80 on the stack, in the process changing it to 0x7fffeb78. The `calls $2' then pushes 2, and then a register save mask, and some other stuff. strcpy() then copies {foo\0} (four bytes) to locations 0x7fffeb80 through 0x7fffeb83, and returns to main().

At this point the word `foo\0' has overwritten whatever used to be at 0x7fffeb80. The only question then is: what was there, and was it any use? As it happens, on the VAX, what was there was a 0, and it does not get used.

In the corrected program, strcpy() copies into four bytes at 0x7fffeb7c..0x7fffeb7f, which were set aside for that purpose by the `subl2 $4,sp'.

Program 2, on the other hand, compiles to code something like this:

    _main:  .word   0               # save no registers
            subl2   $4,sp           # make space for `line'
            pushab  L1              # push &"foo"[0]
            pushl   -4(fp)          # push value of `line'
            calls   $2,_strcpy
            ret
    L1:     .ascii  "foo\0"

Again, on the VAX, this might start with sp=fp=0x7fffeb80. The subl2 would then set sp=0x7fffeb7c. The `pushl' instruction would then push the contents of locations 0x7fffeb7c..0x7fffeb7f. This is, on the VAX, normally preset to 0 (newly allocated stack pages are cleared so that programs cannot search memory for passwords, or unencrypted files from the last invocation of the editor, or whatever). Thus, this asks strcpy to copy {foo\0} into location 0. Location 0 is not writable, and the program gets a segmentation violation signal and crashes.

>My point is that certain compilers MAY draw some
>serious distinction between char-ptrs and "true" strings (char [*]) even
>when the string is null.

*Every* C compiler *must* draw a serious distinction between a pointer and an array. See the Frequently Asked Questions list for some of the differences. The two are not and never have been equivalent, and an array is never a pointer. An array *object* is *converted to* a pointer *value* in some places, and in one very special place an array *declaration* is *rewritten as* a pointer declaration. But an array never `is' a pointer.

> Since un-initialized ptrs so often lead to segmentation faults, here is my
>guess as to what happened. The first declaration (char line [];) must have
>initialized variable 'line' as a char-ptr to some 0-length area, while the
>second declaration (char *line;) left 'line' un-initialized. Hence the value
>of 'line' in the first case was legitimate -- even if it addressed 0-length
>space -- while the un-initialized "value" of 'line' in the second case could
>not even be considered legitimate.

Not quite.
The empty brackets incorrectly passed through the compiler, leaving line[] as a zero-length area but *NOT* a `pointer to' a zero-length area. The pointer aspect only comes in when the array name (`line') is used in a value context (argument to strcpy); then the compiler changes the object to a value by computing the address of element 0 of the array.

In other words, the analysis above derives one correct conclusion from false premises.
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain: chris@cs.umd.edu        Path: uunet!mimsy!chris