ccdn@levels.sait.edu.au (DAVID NEWALL) (08/07/89)
I want to scan a string with fields separated by commas. To do this, I wrote the following:

        do
                ...
        while ((s = strchr(s, ',') + 1) - 1);

I've been told that this is not valid C because, in the case that there are no more fields (commas), strchr() returns NULL; and NULL + 1 is not valid.

Comments, anyone?

David Newall                     Phone:  +61 8 343 3160
Unix Systems Programmer          Fax:    +61 8 349 6939
Academic Computing Service       E-mail: ccdn@levels.sait.oz.au
SA Institute of Technology       Post:   The Levels, South Australia, 5095
cpcahil@virtech.UUCP (Conor P. Cahill) (08/08/89)
In article <1043@levels.sait.edu.au>, ccdn@levels.sait.edu.au (DAVID NEWALL) writes:
> I've been told that this is not valid C because, in the case that there
> are no more fields (commas), strchr() returns NULL; and NULL + 1 is not
> valid.

NULL, in this case, is just a pointer that has the value 0. NULL + 1 is a valid operation; however, *(NULL+1) is not.

I wouldn't code the loop as you have displayed, because one has to spend time thinking about what the while is trying to do. A "better" method would be something like:

        do {
                ...;
                s = strchr(s,',');
        } while ( s++ != NULL );
gwyn@smoke.BRL.MIL (Doug Gwyn) (08/09/89)
In article <1043@levels.sait.edu.au> ccdn@levels.sait.edu.au (DAVID NEWALL) writes:
-I've been told that this is not valid C because, in the case that there
-are no more fields (commas), strchr() returns NULL; and NULL + 1 is not
-valid.
-Comments, anyone?
You were told right. You're not allowed to perform pointer arithmetic
involving null pointers.
gwyn@smoke.BRL.MIL (Doug Gwyn) (08/09/89)
In article <961@virtech.UUCP> cpcahil@virtech.UUCP (Conor P. Cahill) writes:
>NULL + 1 is a valid operation, ...

No!
wjr@ftp.COM (Bill Rust) (08/09/89)
In article <10684@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <961@virtech.UUCP> cpcahil@virtech.UUCP (Conor P. Cahill) writes:
>>NULL + 1 is a valid operation, ...
>
>No!

In my experience, NULL is always defined using the preprocessor line "#define NULL 0" (or 0L). Since the while construct is relying on the fact NULL is, in fact, 0, doing NULL + 1 - 1 is ok. I certainly wouldn't recommend using it as a reference to memory. But, unless NULL is a reserved word to your compiler, the compiler sees 0 + 1 - 1 and that is ok.

Bill Rust (wjr@ftp.com)
gwyn@smoke.BRL.MIL (Doug Gwyn) (08/10/89)
In article <696@ftp.COM> wjr@ftp.UUCP (Bill Rust) writes:
-In article <10684@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
->In article <961@virtech.UUCP> cpcahil@virtech.UUCP (Conor P. Cahill) writes:
->>NULL + 1 is a valid operation, ...
->No!
-In my experience, NULL is always defined using the preprocessor line
-"#define NULL 0" (or 0L).

That's not always true, but anyway it's irrelevant...

-Since the while construct is relying on the fact NULL is, in fact, 0,
-doing NULL + 1 - 1 is ok.

The code example was adding 1 to the return value from strchr(). strchr() does not return a preprocessor macro; it returns a null macro (when it doesn't return a pointer to a valid char object). You are not allowed to add 1 to a null pointer. If you happen to get away with it, you're just lucky; it's not correct code.

In any event, if you rely on NULL being defined (for example in <stdio.h>) as the source character string "0", then you're asking for trouble, since it can be defined as any valid form of null pointer constant, including for example "((void*)0)". Indeed, it's rather expected that standard-conforming implementations are more likely to choose the latter form. Your program may suddenly stop working when a new release of the compiler is installed, or when you port it to another environment.
gwyn@smoke.BRL.MIL (Doug Gwyn) (08/10/89)
In article <10691@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>strchr() does not return a preprocessor macro; it returns a null macro

That was supposed to say:

        strchr() does not return a preprocessor macro; it returns a null pointer ...

Our stupid news system software wouldn't let me cancel the article so that I could send a corrected version. I hope that this slip-up didn't cause anybody too much confusion.

strchr(/*...*/)+1 is WRONG when strchr() returns a null pointer.
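[A minimal sketch, not taken from any poster in the thread, of one way to walk the fields while testing the result of strchr() before doing any arithmetic on it. The helper name split_fields and its parameters are hypothetical, chosen only for illustration.]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): records the start and length
 * of each comma-separated field of s, up to max fields, and returns the
 * number of fields recorded. */
size_t split_fields(const char *s, const char **starts, size_t *lens, size_t max)
{
    size_t n = 0;

    assert(s != NULL && starts != NULL && lens != NULL);

    while (n < max) {
        const char *comma = strchr(s, ',');

        starts[n] = s;
        /* A field runs to the next comma, or to the end of the string. */
        lens[n] = comma ? (size_t)(comma - s) : strlen(s);
        n++;

        if (comma == NULL)
            break;       /* test the result before any arithmetic */
        s = comma + 1;   /* safe: comma points at a real character */
    }
    return n;
}
```

Called on "foo, bar,baz" with room for eight fields, this records three fields of lengths 3, 4 and 3; the second field keeps its leading space. The only pointer ever incremented is one that is known to point at a comma inside the string.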
chris@mimsy.UUCP (Chris Torek) (08/10/89)
In article <696@ftp.COM> wjr@ftp.COM (Bill Rust) writes:
>In my experience, NULL is always defined using the preprocessor line
>"#define NULL 0" (or 0L).

NULL may correctly (by the pANS) be defined as `(void *)0'.

>Since the while construct is relying on the fact NULL is, in fact, 0,
>doing NULL + 1 - 1 is ok.

It is *if* two conditions hold:

 0. NULL is `#define'd as an integral constant zero rather than (void *)0, and
 1. the loop actually reads `while (NULL + 1 - 1)'.

The latter did not hold in the original example, which was

        do
                ...
        while ((s = index(s, ',') + 1) - 1);

The result of

        <expression yielding non nil character pointer> + 1

is a pointer to the character `beyond the one returned', so that

        s = index("foo, bar", ',') + 1

winds up making s point to the space in "foo, bar"; but the result of

        <expression yielding nil character pointer> + 1

is not defined.% On many machines it `just happens' to give the address of byte number 1 in the machine; loading this into a machine pointer register (e.g., for assignment to s) may cause a runtime trap. In any case, its being undefined gives the system license to do arbitrarily annoying things at this point.

The `-1' after this is thus irrelevant: like Humpty Dumpty, once a pointer is broken, not all the King's horses nor all the King's persons%% can put it back together again.
-----
% So *that* is how you get a butterfly! :-)
%% non-sexist noun :-) [too bad about `King']
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
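[One alternative that sidesteps the question entirely, offered only as a sketch and not proposed by anyone in the thread: strcspn() returns a length rather than a pointer, so no nil pointer is ever produced. The helper name count_fields is hypothetical.]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): returns the number of
 * comma-separated fields in s, using strcspn() so that no null pointer
 * is ever produced, let alone incremented. */
size_t count_fields(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (;;) {
        /* Length of the run of characters up to the next ',' or '\0'. */
        size_t len = strcspn(s, ",");

        n++;
        if (s[len] == '\0')
            break;        /* no comma left: this was the last field */
        s += len + 1;     /* step past the comma; still inside the string */
    }
    return n;
}
```

The design choice here is to decide whether to continue by looking at the character that ended the field, which is always a valid object, instead of by doing arithmetic on a pointer that may be nil.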
chad@lakesys.UUCP (D. Chadwick Gibbons) (08/10/89)
In article <696@ftp.COM> wjr@ftp.UUCP (Bill Rust) writes:
|In my experience, NULL is always defined using the preprocessor line
|"#define NULL 0" (or 0L). Since the while construct is relying on the
|fact NULL is, in fact, 0, doing NULL + 1 - 1 is ok. I certainly wouldn't
|recommend using it as a reference to memory. But, unless NULL is a
|reserved word to your compiler, the compiler sees 0 + 1 - 1 and that is
|ok.

Bad assumption. In most decent implementations, NULL is indeed defined as either 0 or 0L. But this can't be, and isn't, true in all implementations, which immediately prohibits use of it, if for nothing else than portability reasons. In many current implementations, NULL is often defined as ((char *)0) since it is the only "safe" thing to do--since many programmers are not safe. As defined by K&R2, NULL is an expression "with value 0, or such cast to type void *." (A6.6, p.198) This allows implementations to define NULL as (void *)0, which would cause your NULL +1 -1 to fail.

In spite of all that, why the hell would you want to use something designed to designate a _nil pointer_ as an integer expression?! All of the above is moot; NULL should not be used as an integer in an integer expression! If using a symbolic name is that important to you, do the ASCII thing:

        #define NUL (0)

Or perhaps something a little more readable:

        #define ZERO (0)

(Don't you feel sorry for those who don't know what a "0" means when they see it inside code? I know I sure do.)
-- 
D. Chadwick Gibbons, chad@lakesys.lakesys.com, ...!uunet!marque!lakesys!chad
wolfgang@ruso.UUCP (Wolfgang Deifel) (08/10/89)
ccdn@levels.sait.edu.au (DAVID NEWALL) writes:
>       do
>               ...
>       while ((s = strchr(s, ',') + 1) - 1);
>I've been told that this is not valid C because, in the case that there
>are no more fields (commas), strchr() returns NULL; and NULL + 1 is not
>valid.

Why should NULL + 1 not be valid ??? NULL is a pointer with the value 0 and you can add the integer 1 to it (but you cannot access *s in the case strchr returns NULL, of course).

Wolfgang.
rae98@wash08.UUCP (Robert A. Earl) (08/10/89)
In article <696@ftp.COM> wjr@ftp.UUCP (Bill Rust) writes:
>In article <10684@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>>In article <961@virtech.UUCP> cpcahil@virtech.UUCP (Conor P. Cahill) writes:
>>>NULL + 1 is a valid operation, ...
>>No!
>In my experience, NULL is always defined using the preprocessor line
>"#define NULL 0" (or 0L). Since the while construct is relying on the
>fact NULL is, in fact, 0, doing NULL + 1 - 1 is ok.
>Bill Rust (wjr@ftp.com)

I have to disagree with Bill here. The NULL being returned was from a string manipulation function, i.e. not just a NULL but a (char *) NULL. I believe it is illegal (or at least unportable) to add (char *)NULL + 1.
-- 
===========================================================
Name:   Bob Earl        Phone: (202) 872-6018 (wk)
UUCP:   ...!uunet!wash08!rae98
BITNET: ...rae98@CAS    (At least, that is what I'm told)
chris@mimsy.UUCP (Chris Torek) (08/11/89)
In article <940@lakesys.UUCP> chad@lakesys.UUCP (D. Chadwick Gibbons) writes:
>... In most decent implementations, NULL is indeed defined as either
>0 or 0L.

Right.

>But this can't be, and isn't, true in all implementations,

No and yes: it could be, but it is not.

>... In many current implementations, NULL is often defined as ((char *)0)
>since it is the only "safe" thing to do [meaning `the only way the vendor
>can keep the authors of bad code happy'].

This is both unsafe and wrong, even if it does keep such authors happy.

Consider: If we write

        char *cp;
        int *ip;

        ip = cp;

the compiler must issue some kind of diagnostic (it says so in the proposed ANSI C specification, and it says in K&R-1 that this operation is machine-dependent, and all quality compilers do indeed generate a warning). This situation does not change if we write

        ip = (char *)ip;

It does change if we write instead

        ip = (int *)(char *)ip;

which puts the value in ip through two transformations (from pointer-to-int to pointer-to-char, then from pointer-to-char to pointer-to-int), and these two together are required to reproduce the original value (this is something of a special case).

So: consider what happens if some implementer has wrongly put the line

        #define NULL ((char *)0)

in <stdio.h> and <stdarg.h> and so forth, and we write

        ip = NULL;

The compiler sees

        ip = ((char *)0);

which, as far as the type system is concerned, is identical to

        ip = cp;

---that is, it is machine dependent, and requires a warning. We can (probably) eliminate the warning% by adding a cast:

        ip = (int *)NULL;

which expands to

        ip = (int *)((char *)0);

On *most* machines, this `just happens' to work. But if we look very closely at the language definition, we find that it is not *required* to work.
The version of this that is required to work is instead

        ip = (int *)(char *)(int *)0;

We are not allowed (outside of machine-dependent code) to change a pointer-to-char into a pointer-to-int unless the pointer-to-char itself came into existence as the result of a cast from a pointer-to-int. The only way to *create* a nil-pointer-to-int in the first place is to write (int *)0, or (in the proposed ANSI C) (int *)(void *)0.

Of course, the actual definition of NULL in <stdio.h> and <stdarg.h> and so on is provided per machine, so if

        int *ip = (int *)(char *)0;

`just happens' to work on that machine, the vendor could get away with it. But

        int *ip = NULL;

is guaranteed to work *without* generating warnings on any machine where NULL is correctly defined, and one should not have to write

        int *ip = (int *)NULL;

just to avoid getting warnings---nor should the compiler be silent about code like

        int *ip;
        char *cp;
        ip = cp;

The rest of <940@lakesys.UUCP> is correct.
-----
% The (probably) in eliminating warnings refers to the fact that a compiler can warn about anything it pleases:

        % cc -o foo foo.c
        cc: Warning: relative humidity and barometer pressure indicate that
                thunderstorms are likely
        cc: Warning: your shoelace is untied
        cc: Warning: this code looks ugly
        cc: Warning: your mother wears army boots
        cc: Warning: Hey! Keep away from me with that axe!
        cc: Warning: Ack! No, wait, I di(*&1to01llk
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
chris@mimsy.UUCP (Chris Torek) (08/11/89)
>> while ((s = strchr(s, ',') + 1) - 1) In article <826@ruso.UUCP> wolfgang@ruso.UUCP (Wolfgang Deifel) writes: >Why should NULL + 1 not be valid ??? NULL is a pointer with the value 0 >and you can add the integer 1 to it .... NULL is not a pointer with the value 0, and 1 is not being added to NULL here, but rather to a nil-pointer-to-char in the case in question. NULL is a preprocessor macro; it expands to either an integral constant zero (whose type is one of the integral types, e.g., int or short or long, and whose value is zero) or to such a value cast to pointer-to-void (whose type is pointer-to-void and whose value is unknowable). A nil-pointer-to-char has type pointer-to-char and an ineffable value. There is no way to talk about its value other than to say `it is a nil pointer to char'. In particular, you cannot say what happens when you add one to it. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163) Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
dfp@cbnewsl.ATT.COM (david.f.prosser) (08/12/89)
In article <18996@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>
>       ip = (int *)((char *)0);
>
>On *most* machines, this `just happens' to work. But if we look very
>closely at the language definition, we find that it is not *required*
>to work. The version of this that is required to work is instead
>
>       ip = (int *)(char *)(int *)0;
>
>We are not allowed (outside of machine-dependent code) to change a
>pointer-to-char into a pointer-to-int unless the pointer-to-char itself
>came into existence as the result of a cast from a pointer-to-int. The
>only way to *create* a nil-pointer-to-int in the first place is to
>write (int *)0, or (in the proposed ANSI C) (int *)(void *)0.

The pANS does guarantee that, for example,

        0 == (void *)(int *)(char *)0

[3.2.2.3: "Two null pointers, converted through possibly different sequences of casts to pointer types, shall compare equal."]

Therefore, I interpret the pANS as requiring (int *)(char *)0 to have the same value as (int *)0 (the nil-pointer-to-int, in your terminology)--not `just happening' to work.

Dave Prosser ...not an official X3J11 answer...
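[The rule quoted from 3.2.2.3 can be written down as a small check. This is only a sketch; the function name null_casts_agree and the variable names are hypothetical, and it says nothing about what bit pattern any of these pointers has.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check (names are illustrative): returns 1 if null pointers
 * reached through different sequences of casts all compare equal. */
int null_casts_agree(void)
{
    int *direct   = (int *)0;           /* nil pointer to int, written directly */
    int *via_char = (int *)(char *)0;   /* the same, by way of char *           */
    int *via_void = (int *)(void *)0;   /* the same, by way of void *           */
    int *from_mac = NULL;               /* whichever form the NULL macro takes  */

    /* A null pointer also compares equal to the constant 0 itself. */
    assert(direct == 0);

    return direct == via_char
        && direct == via_void
        && direct == from_mac;
}
```

Note that only comparison and assignment appear here; none of these pointers is dereferenced or used in arithmetic.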
gwyn@smoke.BRL.MIL (Doug Gwyn) (08/12/89)
In article <18996@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>       cc: Warning: this code looks ugly

I've seen this one (actually I think it was "expression too complex")

>       cc: Warning: your mother wears army boots

You must be using our compiler..
bph@buengc.BU.EDU (Blair P. Houghton) (08/12/89)
In article <10709@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <18996@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>>      cc: Warning: this code looks ugly
>
>I've seen this one (actually I think it was "expression too complex")
>
>>      cc: Warning: your mother wears army boots
>
>You must be using our compiler..

I got this one the other day:

        cc: Warning: this is a riot; posting to comp.lang.c

                                --Blair
                                  "Y'see, I was trying to
                                   add two NULL pointers
                                   together, and..."
bengsig@oracle.nl (Bjorn Engsig) (08/14/89)
Article <3726@buengc.BU.EDU> by bph@buengc.bu.edu (Blair P. Houghton) says:
|                               --Blair
|                                 "Y'see, I was trying to
|                                  add two NULL pointers
|                                  together, and..."

Can you add two pointers ? [ No replies or followups please :-) ]
-- 
Bjorn Engsig, ORACLE Europe     \ /    "Hofstadter's Law: It always takes
Path:   mcvax!orcenl!bengsig     X      longer than you expect, even if you
Domain: bengsig@oracle.nl       / \     take into account Hofstadter's Law"
wolfgang@ruso.UUCP (Wolfgang Deifel) (08/15/89)
ccdn@levels.sait.edu.au (DAVID NEWALL) writes:
>       while ((s = strchr(s, ',') + 1) - 1);
>I've been told that this is not valid C because, in the case that there
>are no more fields (commas), strchr() returns NULL; and NULL + 1 is not
>valid.

I think it's a difference if you write " NULL + 1 " ( which is non-portable C, NULL is a machine dependent macro ) and " strchr(...) + 1 ". strchr() is a function that returns always a legal value. If strchr fails it will return (char*)0 ( regardless of the machine or the compiler ), and here it's legal to add '1' ( the result is (char*)1 ).

----------------------------------------------------------------------------
Wolfgang Deifel
Dr. Ruff Software GmbH, 5100 Aachen, Juelicherstr. 65-67, W-Germany
uucp: ...!uunet{!mcvax}!unido!rwthinf!ruso!wolfgang - phone : +49 241 156038
bph@buengc.BU.EDU (Blair P. Houghton) (08/15/89)
In article <474.nlhp3@oracle.nl> bengsig@oracle.nl (Bjorn Engsig) writes:
>Article <3726@buengc.BU.EDU> by bph@buengc.bu.edu (Blair P. Houghton) says:
>>                                "Y'see, I was trying to
>>                                 add two NULL pointers
>>                                 together, and..."
>
>Can you add two pointers ? [...] :-)

Well, _I_ can. :-)

                                --Blair
                                  "But not in this dump."
gwyn@smoke.BRL.MIL (Doug Gwyn) (08/16/89)
In article <828@ruso.UUCP> wolfgang@ruso.UUCP (Wolfgang Deifel) writes:
>it will return (char*)0 ( regardless of the machine or the compiler ),
>and here it's legal to add '1' ( the result is (char*)1 ).

What is this, a time-shift phenomenon? We keep getting a sprinkling of articles making this incorrect claim. It is NOT legal to perform arithmetic on a null pointer.
chris@mimsy.UUCP (Chris Torek) (08/16/89)
In article <828@ruso.UUCP> wolfgang@ruso.UUCP (Wolfgang Deifel) writes:
>I think it's a difference if you write " NULL + 1 " ( which is non-
>portable C, NULL is a machine dependent macro ) and " strchr(...) + 1 ".

There is a difference. However:

>strchr() is a function that returns always a legal value. If strchr fails
>it will return (char*)0 ( regardless of the machine or the compiler ),
>and here it's legal to add '1' ( the result is (char*)1 ).

this is what we just got finished saying is false: it is NOT legal to add 1 to (char *)0; indeed, it is not legal to add 1 to any of the various infinite varieties of nil pointer.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
ruud@targon.UUCP (Ruud Harmsen) (08/16/89)
In article <18996@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>Consider: If we write
>       char *cp;
>       int *ip;
>       ip = cp;
>
>the compiler must issue some kind of diagnostic (it says so in the
>proposed ANSI C specification, and it says in K&R-1 that this operation
>is machine-dependent, ...

I suppose this is machine-dependent because of alignment: char-pointers can point to just about anywhere, but int-pointers on many machines have to be aligned properly. My question is: can I make sure in my program, that though generally non-portable this IS portable? I tried this once in the following way:

The char-pointer gets its value from malloc, which the manual says gives pointers properly aligned for any type. I never change that char-pointer other than by adding multiples of sizeof(int) to it.

Is a "ip = cp" guaranteed safe under these conditions, so can I ignore the compiler-warning?
ccdn@levels.sait.edu.au (DAVID NEWALL) (08/17/89)
A while ago, ccdn@levels.sait.edu.au (That's me!) wrote:
>       do
>               ...
>       while ((s = strchr(s, ',') + 1) - 1);
>
> I've been told that this is not valid C

Thanks, everyone, for your opinions. I'll remember the rule in future: (Offsets to NULL are non-portable, and should never be used).

David Newall                     Phone:  +61 8 343 3160
Unix Systems Programmer          Fax:    +61 8 349 6939
Academic Computing Service       E-mail: ccdn@levels.sait.oz.au
SA Institute of Technology       Post:   The Levels, South Australia, 5095
dfp@cbnewsl.ATT.COM (david.f.prosser) (08/18/89)
In article <597@targon.UUCP> ruud@targon.UUCP (Ruud Harmsen) writes:
>I suppose this is machine-dependent because of alignment: char-pointers can
>point to just about anywhere, but int-pointers on many machines have to be
>aligned properly. My question is: can I make sure in my program, that
>though generally non-portable this IS portable? I tried this once in the
>following way:
>The char-pointer gets its value from malloc, which the manual says gives
>pointers properly aligned for any type. I never change that char-pointer
>other than by adding multiples of sizeof(int) to it.
>Is a "ip = cp" guaranteed safe under these conditions, so can I ignore
>the compiler-warning?

Almost. Strictly speaking, malloc must return a pointer to an object that can be accessed by a type commensurate with its size in bytes. For example, ``malloc(1)'' need not return a pointer that is appropriately aligned for a pointer-to-int. Moreover, it may well be possible to argue that unless the requested size is a multiple of the size of an int, the returned pointer need not be aligned appropriately for an int. For example, ``malloc(5)''.

However, the rest of your conditions are sufficient for the guarantee of correct behavior.

Dave Prosser ...not an official X3J11 answer...
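[A sketch of the conditions described above, under the assumption that the requested size is a multiple of sizeof(int): the char pointer comes from malloc(), moves only in steps of sizeof(int), and is converted with an explicit cast. The helper name make_iota is hypothetical and not from the thread.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): allocates room for n ints,
 * stores i in slot i by walking a char pointer in steps of sizeof(int),
 * and returns the buffer as int *. The caller frees it.
 * Returns NULL if the allocation fails. */
int *make_iota(size_t n)
{
    char *base, *cp;
    size_t i;

    assert(n > 0);
    base = malloc(n * sizeof(int));   /* size is a multiple of sizeof(int) */
    if (base == NULL)
        return NULL;

    cp = base;
    for (i = 0; i < n; i++) {
        int *ip = (int *)cp;          /* explicit cast, not plain "ip = cp" */
        *ip = (int)i;
        cp += sizeof(int);            /* stays int-aligned relative to base */
    }
    return (int *)base;
}
```

The cast matters on machines where a char pointer and an int pointer to the same location have different representations; the plain assignment without it still calls for a diagnostic.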
gwyn@smoke.BRL.MIL (Doug Gwyn) (08/19/89)
In article <597@targon.UUCP> ruud@targon.UUCP (Ruud Harmsen) writes:
>Is a "ip = cp" guaranteed safe under these conditions, so can I ignore
>the compiler-warning?

If you use a cast it is.
casper@betty.fwi.uva.nl (Casper H.S. Dik) (08/20/89)
In article <597@targon.UUCP> ruud@targon.UUCP (Ruud Harmsen) writes:
>
>I suppose this is machine-dependent because of alignment: char-pointers can
>point to just about anywhere, but int-pointers on many machines have to be
>aligned properly. My question is: can I make sure in my program, that
>though generally non-portable this IS portable? I tried this once in the
>following way:
>The char-pointer gets its value from malloc, which the manual says gives
>pointers properly aligned for any type. I never change that char-pointer
>other than by adding multiples of sizeof(int) to it.
>Is a "ip = cp" guaranteed safe under these conditions, so can I ignore
>the compiler-warning?

No. It is not safe. If you ever want to run your program on a Data General MV, among others, you should use "ip = (int *) cp".

Since pointers to anything except char are word aligned on MV machines, they decided that they could drop the last bit of the address and shift it. A char pointer pointing to the second byte of memory is represented with 0x2. A word pointer to the same location is represented by 0x1.

This gave problems when porting programs. Most programmers write "newp = (type *) malloc (sizeof (type))" but many forget the cast to char * with free: "free(oldp)" instead of "free((char *) oldp)". This works fine in most cases, but not on machines that shift pointers when casting.

--cd
Casper H.S. Dik                         VCP/HIP: +31205922022
University of Amsterdam         |       casper@fwi.uva.nl
The Netherlands                 |       casper%fwi.uva.nl@hp4nl.nluug.nl
ruud@targon.UUCP (Ruud Harmsen) (08/22/89)
In article <781@janus.UUCP> casper@fwi.uva.nl (Casper H.S. Dik) writes:
>> The char-pointer gets its value from malloc, and I never change that char-
>> pointer other than by adding multiples of sizeof(int) to it. Is a "ip =
>> cp" guaranteed safe under these conditions, so can I ignore the compiler-
>> warning?
>No. It is not safe. If you ever want to run your program on a Data General
>MV, among others, you should use "ip = (int *) cp".

You're right, of course. As a matter of fact, I did use the cast in my program. Sorry I didn't mention that in the original article.

Ruud Harmsen