romwa@gpu.utcs.toronto.edu (Mark Dornfeld) (10/11/88)
I've been reading a book on /rdb recently. In the book they
have a small example where they assign strings.
char *p1="first";
char *p2;
main( argc, argv )
int argc;
char *argv[];
{
p2 = " is:";
}
The assignment to p1 makes sense to me, because the compiler
could set aside the size of the string being assigned. The
second case baffles me. I always thought that you had to give
a string a "maximum size" and then use strcpy or sprintf for
assignment. Isn't the assignment of p2 a dangerous thing to
do since the compiler has (presumably) only left enough space
for the pointer and not for the string. I tried this example
out on QuickC and everything worked. i.e. printf'ing p2 gives
'is:'.
Could someone please shed some light on this for me. Could
you also please respond via e-mail, since I am borrowing someone's
account in order to post this.
advTHANKSance
Pavneet Arora
...!utgpu!rom!pavneet
Royal Ontario Museum
100 Queen's Park
Toronto, Ontario
M5S 2C6
(416) 585-5626
john@chinet.chi.il.us (John Mundt) (10/13/88)
In article <1988Oct11.143728.28627@gpu.utcs.toronto.edu> romwa@gpu.utcs.toronto.edu (Mark Dornfeld) writes:
> char *p1="first";
> char *p2;
> main( )
> {
> p2 = " is:";
> }
>Isn't the assignment of p2 a dangerous thing to
>do since the compiler has (presumably) only left enough space
>for the pointer and not for the string.

The two are the same. Each is a pointer to char. Each string,
"first" and " is:" are reserved by the compiler as unnamed strings
somewhere in memory. Both p1 and p2 are pointers that are set to point
to these strings. You could reassign p1 or p2 to any other string
as well. In other words, you could say p1 = p2 or p2 = (char *) 0.
Try running sizeof() on either of them and both will return an integer
equal to sizeof(char *).

Now, this would be different:

char p[] = { 't','h','i','s',' ','a',' ','s','t','r','i','n','g','\n' };

main()
{
	printf(p);
}

Here, p is a fixed array of characters and cannot be reassigned.
Trying to say p = (char *) 0 would be illegal. Further, sizeof(p) would
be the length of the string "this a string\n" rather than the size of a
character pointer.
--
---------------------
John Mundt Teachers' Aide, Inc. P.O. Box 1666 Highland Park, IL
(312) 432-8860 -998-5007 Voice || -432-5386 Modem
kyriazis@rpics (George Kyriazis) (10/13/88)
In article <6777@chinet.chi.il.us> john@chinet.chi.il.us (John Mundt) writes:
> ... stuff deleted ...
>Each string,
>"first" and " is:" are reserved by the compiler as unnamed strings
>somewhere in memory. ......

My question is: Are strings like " is:" volatile or not?
When you say p2 = " is:", are you sure that the string will remain in
memory or the optimizer will decide to put something else there since
the string is basically a constant used only once??

I also have the idea that if you say
	p1 = "abc";
	p2 = "abc";
p1 and p2 will have different value, since the strings are not the same
(they have the same contents, but physically should be different).
Is that a right assumption?

George Kyriazis
kyriazis@turing.cs.rpi.edu
------------------------------
peter@ficc.uu.net (Peter da Silva) (10/14/88)
In article <6777@chinet.chi.il.us>, john@chinet.chi.il.us (John Mundt) writes:
> Now, this would be different:
> char p[] = { 't','h','i','s',' ','a',' ','s','t','r','i','n','g','\n' };
> main()
> {
> printf(p);
> }

Different all right. You forgot to terminate your string. Your mileage
will vary, but I got:

% a.out | vcat
this a string
l&^Z'.^R% []

[] is the cursor.
--
Peter da Silva `-_-' Ferranti International Controls Corporation.
"Have you hugged U your wolf today?" peter@ficc.uu.net
levy@ttrdc.UUCP (Daniel R. Levy) (10/16/88)
In article <6777@chinet.chi.il.us>, john@chinet.chi.il.us (John Mundt) writes:
> char p[] = { 't','h','i','s',' ','a',' ','s','t','r','i','n','g','\n' };
>
> main()
> {
> printf(p);
> }

Be careful with this kind of declaration: without the terminating '\0'
there is no guarantee where the "string" p[] ends. In fact, try this on
a vax, 3b, or machine of similar architecture:

char p[] = { 't','h','i','s',' ','a',' ','s','t','r','i','n','g',' ',
	' ','\n' }; /* 16 bytes, NO NULL TERMINATOR */
char q[] = { 't','h','i','s',' ','g','a','r','b','a','g','e','\n','\0' };

main()
{
	printf(p);
}

The output of the program will be:

this a string
this garbage

p[] is made 16 bytes long so that it ends just before the word boundary
on which q[] begins. Otherwise the system will probably pad p[] with
nulls and you won't notice the lack of an explicit null terminator.
--
|------------Dan Levy------------| THE OPINIONS EXPRESSED HEREIN ARE MINE ONLY
| Bell Labs Area 61 (R.I.P., TTY)| AND ARE NOT TO BE IMPUTED TO AT&T.
| Skokie, Illinois |
|-----Path: att!ttbcad!levy-----|
nagel@paris.ics.uci.edu (Mark Nagel) (10/17/88)
In article <1414@imagine.PAWL.RPI.EDU>, kyriazis@rpics (George Kyriazis) writes:
|In article <6777@chinet.chi.il.us> john@chinet.chi.il.us (John Mundt) writes:
|>Each string,
|>"first" and " is:" are reserved by the compiler as unnamed strings
|>somewhere in memory. ......
|
|My question is: Are strings like " is:" volatile or not?
|When you say p2 = " is:", are you sure that the string will remain in
|memory or the optimizer will decide to put something else there since
|the string is basically a constant used only once??

Of course not! That would be analogous to having the 2 in:

	x = 2;

change at some point in the program since it is "basically a constant."
Also, *all* code has the potential for being executed more than once
through function calls (well, not global variable initialization, but
all code in functions).

|I also have the idea that if you say
|	p1 = "abc";
|	p2 = "abc";
|p1 and p2 will have different value, since the strings are not the same
|(they have the same contents, but physically should be different).
|Is that a right assumption?

No. Some compilers will treat them as different objects (i.e. you get
two distinct pointer values for the two string constants). Others will
optimize the program's space usage by sorting and uniq'ing all of the
strings so that different string constant references all have the same
physical address. This is fine since constant strings should be
read-only objects.

Mark D. Nagel UC Irvine - Dept of Info and Comp Sci | The probability of someone
nagel@ics.uci.edu (ARPA) | watching you is proportional to
{sdcsvax|ucbvax}!ucivax!nagel (UUCP) | the stupidity of your action.
gandalf@csli.STANFORD.EDU (Juergen Wagner) (10/17/88)
In article <1414@imagine.PAWL.RPI.EDU> George Kyriazis writes:
...
>	p1 = "abc";
>	p2 = "abc";
>p1 and p2 will have different value, since the strings are not the same
>(they have the same contents, but physically should be different).

Hmmm....

>Is that a right assumption?

I don't think so. Look: what is the same is not the pointers to those
strings. The contents of the respective memory locations are the same
(they happen to be). There is a difference between e.g. ints and those
strings: ints fit into a register and can be 'in-line' coded, strings
can't. If the compiler finds a line
	foo = 6;
and another line
	bar = 6;
then these values might be transformed into instructions loading the
value 6 directly into the locations of foo and bar (i.e. without
evaluating a lot). On the other hand, the lines
	p1 = "abc";
	p2 = "abc";
do not allow that in general. The optimization is to assign a fixed
memory location to each of those strings, and optimize the use of their
addresses. Usually, strings like these are stored in the static area of
the data space. They have to be distinct unless the compiler can make
assumptions like "static data are read-only and can therefore be merged
into the text space".

If you want to share them, use xstr(1) to get shared strings.
--
Juergen "Gandalf" Wagner, gandalf@csli.stanford.edu
Center for the Study of Language and Information (CSLI), Stanford CA
gwyn@smoke.ARPA (Doug Gwyn ) (10/17/88)
In article <790@paris.ics.uci.edu> nagel@paris.ics.uci.edu (Mark Nagel) writes:
>|	p1 = "abc";
>|	p2 = "abc";
>|p1 and p2 will have different value, since the strings are not the same
>optimize the program's space usage by sorting and uniq'ing all of the
>strings so that different string constant references all have the same
>physical address. This is fine since constant strings should be
>read-only objects.

The key is that you are allowed to portably compare pointers only
in two cases: at least one pointer is a null pointer, or both
pointers are pointers into the same object. This means that the
fact that p1==p2 for pointers to distinct objects is not a problem,
since such comparison is "undefined". (Otherwise, the two string
objects would have to be given unique addresses.)

As it stands, if p1 and p2 are pointers to const char, the storage
may be shared, but not if they are pointers to char (unless the
compiler can determine that the pointers and all aliases to them
are not used to modify the contents of the string literals). Thus
the automatic "dumb" crunching together of string literals is not
permitted in a standard-conforming implementation.

You can accomplish this coalescing yourself in your source code:

	static char abc_str[] = "abc";
	...
	char *p1 = abc_str;
	char *p2 = abc_str;

Whether it is a good idea or not depends on how the pointers are used.
knudsen@ihlpl.ATT.COM (Knudsen) (10/18/88)
In article <1414@imagine.PAWL.RPI.EDU>, kyriazis@rpics (George Kyriazis) writes:
> My question is: Are strings like " is:" volatile or not?
> When you say p2 = " is:", are you sure that the string will remain in
> memory or the optimizer will decide to put something else there since
> the string is basically a constant used only once??

Nope. p2 will continue to point to " is:" unless you reassign p2,
which the compiler has no idea if or when you might do.
So as long as p2 exists, that constant string can't be recycled.
Worse yet, even if p2 is an auto variable, the " is:" has to
stick around in the text/code segment so that p2 can be
initialized again if that function is re-entered.

> I also have the idea that if you say
>	p1 = "abc";
>	p2 = "abc";
> p1 and p2 will have different value, since the strings are not the same
> (they have the same contents, but physically should be different).

Some compilers are fancy enough to detect identical string constants
and merge them, so indeed p1==p2 there.
I doubt many compilers go this far, however.
--
Mike Knudsen Bell Labs(AT&T) att!ihlpl!knudsen
"Lawyers are like handguns and nuclear bombs. Nobody likes them,
but the other guy's got one, so I better get one too."
karl@haddock.ima.isc.com (Karl Heuer) (10/20/88)
In article <8696@smoke.ARPA> gwyn@brl.arpa (Doug Gwyn) writes:
>... As it stands, if p1 and p2 are pointers to const char, the storage may be
>shared, but not if they are pointers to char.... Thus the automatic "dumb"
>crunching together of string literals is not permitted in a standard-
>conforming implementation.

True in K&R C, but fixed in dpANS: "Identical string literals ... need not be
distinct. If the program attempts to modify a string literal ... the behavior
is undefined." [3.1.4] "This specification allows implementations to share
copies of [identical] strings." [R3.1.4]

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
gwyn@smoke.BRL.MIL (Doug Gwyn ) (10/20/88)
In article <9710@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
-True in K&R C, but fixed in dpANS: "Identical string literals ... need not be
-distinct. If the program attempts to modify a string literal ... the behavior
-is undefined." [3.1.4] "This specification allows implementations to share
-copies of [identical] strings." [R3.1.4]
True. I was thinking of string-literal initialized char arrays,
not pointers to string literals. Sorry.