ark@alice.UUCP (Andrew Koenig) (07/26/85)
I am collecting examples of how C can bite the unwary user.

Consider, for example, the following program fragment:

    int i, a[10];
    for (i = 0; i <= 10; i++)
        a[i] = 0;

On many implementations, this will result in an infinite loop.

If you have any good examples, and you don't mind my re-publishing
them (with attribution), send them along!

    Andrew Koenig
    AT&T Bell Laboratories
    600 Mountain Avenue
    Murray Hill NJ 07974
    research!ark (or alice!ark)
matt@prism.UUCP (07/30/85)
> /* Written 8:36 pm Jul 25, 1985 by ark@alice in prism:net.lang.c */
> /* ---------- "how has C bitten you?" ---------- */
>
> I am collecting examples of how C can bite the unwary user.
>
> Consider, for example, the following program fragment:
>
>     int i, a[10];
>     for (i = 0; i <= 10; i++)
>         a[i] = 0;
>
> On many implementations, this will result in an infinite loop.
>
> /* End of text from prism:net.lang.c */

This looks to me like it will simply overwrite one int's worth of
memory beyond the end of the array "a" with the value 0. Granted,
depending on what happens to be after "a", this can have disastrous
results, but is there really an implementation in which it will
(reliably) lead to infinite looping?

On the other hand, in an implementation where char's are unsigned,
this common construct WILL lead to an unterminating loop. I have been
bitten by this several times porting code that assumed signed
characters to implementations of C without them.

    char x;
    while (--x)
    {
        do anything...
        and then some...
    }

I sure wish that while the ANSI committee was adding "signed" to the
language, they had standardized whether the default for "char" was
signed or unsigned. As long as compilers have to provide them both
anyway, what's the harm in choosing one as the default?

(Well, maybe the C programming community will eschew the use of "char"
and always use either "signed char" or "unsigned char" as appropriate.
Wanna bet?)

-----------------------------------------------------------------------------
 Matt Landau            {cca, datacube, ihnp4, inmet, mit-eddie, wjh12}...
 Mirror Systems, Inc.   ...mirror!prism!matt
 Cambridge, MA          (617) 661-0777
-----------------------------------------------------------------------------
 "Replace this mandolin with your wombat..."
lcc.niket@LOCUS.UCLA.EDU (Niket K. Patwardhan) (07/30/85)
Andrew:
Regarding

    int i,a[10];
    for(i=0; i<=10; i++)
        a[i] = 0;

you should have expected some problems, as you are writing one past the
end of the array! The correct test to use is < not <=!
ark@alice.UUCP (Andrew Koenig) (07/31/85)
> Andrew:
> Regarding
>     int i,a[10];
>     for(i=0; i<=10; i++)
>         a[i] = 0;
> you should have expected some problems, as you are writing one past the end of
> the array! The correct test to use is < not <=!

Yes, I know that! The point is that this is an example of something
that looks reasonable at first glance and isn't, because of a property
that C does not share with many other languages (in most languages, a
10-element array has an element #10).

Read my article again. I am looking for examples for my collection,
not asking for advice on how to solve this particular problem.
john@frog.UUCP (John Woods) (07/31/85)
My favorite ouch is the following:

    if ( thingy_bits & TEST_ME == 0) {
    }

"When in doubt, parenthesize." -Kernighan and Plauger
--
John Woods, Charles River Data Systems, Framingham MA, (617) 626-1101
...!decvax!frog!john, ...!mit-eddie!jfw, jfw%mit-ccc@MIT-XX.ARPA
peter@kitty.UUCP (Peter DaSilva) (08/01/85)
> I am collecting examples of how C can bite the unwary user.
>
> Consider, for example, the following program fragment:
>
>     int i, a[10];
>     for (i = 0; i <= 10; i++)
>         a[i] = 0;
>
> On many implementations, this will result in an infinite loop.

I assume you mean that auto's are allocated on the stack so
&a[10]==&i. I don't see an easy solution to this, except for built-in
range checking. I think "Safe/C" has this...

Anyone who uses "<=" in a for(;;) loop to initialise an array should be
strung up by their index(3) fingers and forced to listen to Sonny Bono
chanting "Zero Origin Arrays" until their ears fall off [:->].
david@ecrhub.UUCP (David M. Haynes) (08/02/85)
One of my all time favourites is the non-orthogonality between
scanf and printf. Especially the following:

    scanf("%D %F", long, double); or
    scanf("%ld %lf", long, double);
vs.
    printf("%ld %f", long, double);

Why no %F or %D on printf?
And why %lf vs %f? fun!

--
--------------------------------------------------------------------------
David M. Haynes
Exegetics Inc.
..!utzoo!ecrhub!david

"I am my own employer, so I guess my opinions are my own and that of
my company."
chris@umcp-cs.UUCP (Chris Torek) (08/02/85)
>>     int i, a[10];
>>     for (i = 0; i <= 10; i++)
>>         a[i] = 0;
>>
>> On many implementations, this will result in an infinite loop.

>This looks to me like it will simply overwrite one int's worth of
>memory beyond the end of the array "a" with the value 0. Granted,
>depending on what happens to be after "a", this can have disastrous
>results, but is there really an implementation in which it will
>(reliably) lead to infinite looping?

How does "every PCC implementation" grab you? (Actually, I suspect
there may be three or four PCC implementations in which it won't run
forever, but it *will* run forever on 4BSD Vaxen.)

>On the other hand, in an implementation where char's are unsigned,
>this common construct WILL lead to an unterminating loop.
>
>    char x;
>    while (--x)

I assume you mean "while (--x >= 0)". I only use this on "register
int"s (especially since it generates a sobgeq if the loop's small
enough).
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 4251)
UUCP: seismo!umcp-cs!chris
CSNet: chris@umcp-cs    ARPA: chris@maryland
bright@dataio.UUCP (Walter Bright) (08/03/85)
In article <5400010@prism.UUCP> matt@prism.UUCP writes:
>I sure wish that while the ANSI committee was adding "signed" to the
>language, they had standardized whether the default for "char" was
>signed or unsigned. As long as compilers have to provide them both
>anyway, what's the harm in choosing one as the default?

The reason why some compilers default to signed chars and others
default to unsigned can be found in the instruction set of the
underlying machine. Some machines support signed chars easier than
unsigned ones, and vice versa. Some examples:

The 8088 can only sign extend a byte to a word from the AL register,
whereas a zero extension can cheaply be done from the AL, BL, CL or DL
register. Thus, most 8088 C compilers default to unsigned chars.

The PDP-11, when reading a byte from memory, automatically sign extends
the byte. Thus, implementing unsigned chars costs an extra mask
instruction for each char read. Not surprisingly, chars default to
being signed.

For most applications, it doesn't matter whether a char is signed or
not, and so it is appropriate for the compiler to select that which
can be implemented most efficiently. When it does matter, the
programmer should take care (by explicitly declaring it as signed or
unsigned).
mjs@eagle.UUCP (M.J.Shannon) (08/03/85)
Then of course is the obvious:

    if (fd = open("/etc/passwd", 0) == -1)
        panic("No password file?");

This, too goes in the category of "when in doubt, parenthesize".
--
    Marty Shannon
UUCP: ihnp4!eagle!mjs
Phone: +1 201 522 6063

Warped people are throwbacks from the days of the United Federation of Planets.
qwerty@drutx.UUCP (Brian Jones) (08/05/85)
> One of my all time favourites is the non-orthogonality between
> scanf and printf. Especially the following:
>
>     scanf("%D %F", long, double); or
>     scanf("%ld %lf", long, double);
> vs.
>     printf("%ld %f", long, double);
>
> Why no %F or %D on printf?
> And why %lf vs %f? fun!
>
> --
> --------------------------------------------------------------------------
> David M. Haynes
> Exegetics Inc.
> ..!utzoo!ecrhub!david
>
> "I am my own employer, so I guess my opinions are my own and that of
> my company."

scanf can be given a pointer to any data type:

    char (string)
    int,
    long,
    float,
    double;

When you put arguments on stack, expansion rules are followed.

    char => int
    float => double

So, printf can never get a float as an argument, it always gets a double.
Therefore, %lf or %F are meaningless to printf.

Note that printf does support %d and %ld, and will happily screw up if
there is a disagreement between the args and their specification in the
format string. ie. %d given a long arg, or %ld given a short. (machine
dependent!!).
--
Brian Jones aka {ihnp4,}!drutx!qwerty @ AT&T-IS
preece@ccvaxa.UUCP (08/05/85)
> >     int i, a[10];
> >     for (i = 0; i <= 10; i++)
> >         a[i] = 0;
> >
> This looks to me like it will simply overwrite one int's worth of
> memory beyond the end of the array "a" with the value 0. Granted,
> depending on what happens to be after "a", this can have disastrous
> results, but is there really an implementation in which it will
> (reliably) lead to infinite looping?
----------
Yes. Any implementation that allocates the space for i following the
space for a.
tim@callan.UUCP (Tim Smith) (08/06/85)
> > Consider, for example, the following program fragment:
> >
> >     int i, a[10];
> >     for (i = 0; i <= 10; i++)
> >         a[i] = 0;
> >
> > On many implementations, this will result in an infinite loop.
>
> This looks to me like it will simply overwrite one int's worth of
> memory beyond the end of the array "a" with the value 0. Granted,
> depending on what happens to be after "a", this can have disastrous
> results, but is there really an implementation in which it will
> (reliably) lead to infinite looping?
>

The UniSoft System V C compiler for the 68k will reliably produce an
infinite loop here. Note that i and a[] are both on the stack. This
is what you get:

( high address higher up on page )

    i:      4 bytes
    a[9]:   4 bytes
      .
      .
      .
    a[0]    4 bytes

a[10] will overwrite i.
--
Tim Smith ihnp4!{cithep,wlbr!callan}!tim 661
bet@ecsvax.UUCP (Bennett E. Todd III) (08/06/85)
In article <243@ecrhub.UUCP> david@ecrhub.UUCP (David M. Haynes) writes:
>One of my all time favourites is the non-orthogonality between
>scanf and printf. Especially the following:
>
>    scanf("%D %F", long, double); or
>    scanf("%ld %lf", long, double);
>vs.
>    printf("%ld %f", long, double);

Interesting. The mismatch in formatting arguments was something I had
never noticed; I was always amused by this one instance where C's
call-by-value catches every C programmer, at least once. (I have never
heard anybody claim to have never been bitten by this one -- and it's
worst for those who had heard of it before it bit them.)

    printf("%d", i);

seems to cause people to want to try

    scanf("%d", i);

After you have been bitten once or twice you get really paranoid about
making sure you pass the *address* of i, not its value:

    scanf("%d", &i);

I am certain that this belongs on the list of all-time most popular
blunders.

-Bennett
--
"Some people are lucky; the rest of us have to work at it."

Bennett Todd -- Duke Computation Center, Durham, NC 27706-7756; (919) 684-3695
...{decvax,seismo,philabs,ihnp4,akgua}!mcnc!ecsvax!bet or dbtodd@tucc.BITNET
david@ecrhub.UUCP (David M. Haynes) (08/07/85)
>> One of my all time favourites is the non-orthogonality between
                                        ^^^^^^^^^^^^^^^^^
>> scanf and printf. Especially the following:
>>
>>     scanf("%D %F", long, double); or
>>     scanf("%ld %lf", long, double);
>> vs.
>>     printf("%ld %f", long, double);
>>
>> Why no %F or %D on printf?
>> And why %lf vs %f? fun!
>>
>
>scanf can be given a pointer to any data type:
>    char (string)
>    int,
>    long,
>    float,
>    double;
>
>When you put arguments on stack, expansion rules are followed.
>
>    char => int
>    float => double
>
>So, printf can never get a float as an argument, it always gets a double.
>Therefore, %lf or %F are meaningless to printf.
>
>Note that printf does support %d and %ld, and will happily screw up if
>there is a disagreement between the args and their specification in the
>format string. ie. %d given a long arg, or %ld given a short. (machine
>dependent!!).
>Brian Jones aka {ihnp4,}!drutx!qwerty @ AT&T-IS

Yes, I did realize that, but (and this is where the show really
starts..) the problem I reported is one of NON-ORTHOGONALITY not
implementation. Your explanation is quite correct, but why should I
(as a programmer) have to worry about translation to stack? Why
doesn't printf take %F and %D and translate for me so that the
orthogonality of the two system calls (which are considered by most to
be related functions) is the same?

B.T.W. This originally was in response to the "How has C bitten you?"
question but has digressed at this point. Apologies and I'll mail
further discussion directly.

--
--------------------------------------------------------------------------
David M. Haynes
Exegetics Inc.
..!utzoo!ecrhub!david

"I am my own employer, so I guess my opinions are my own and that of
my company."
mbarker@BBNZ.ARPA (Michael Barker) (08/08/85)
...omitted
>So, printf can never get a float as an argument, it always gets a double.
>Therefore, %lf or %F are meaningless to printf.
>
>Brian Jones aka {ihnp4,}!drutx!qwerty @ AT&T-IS

Brian (et al) - the reasoning is correct, but printf could easily be
changed to accept %lf or %F (or any useful convention) as formatting
directions for a value with the knowledge that the value will
*actually* be a double. Let's try to avoid letting the implementation
details run rough-shod over the abstraction. In this case, the
original poster indicated that the mnemonics are incomplete (you can't
match up the type of variable and the formatting string in all cases).
I think this is a very valid point. The fact that the implementation
of printf will receive both types of variables as double shouldn't stop
us from providing a complete set of mnemonics.

"The sleep of reason produces monsters"

mike
ARPA: mbarker@bbnz.ARPA
UUCP: harvard!bbnccv!mbarker
roy@phri.UUCP (Roy Smith) (08/10/85)
Here's one that just got me:

    if (sv > score);        <----- note extraneous semi-colon
        score = sv;

This was in a series of computations which gave various scores; the
fragment above was repeated in various places to pick out the maximum.
Of course, the test is a no-op and the assignment was always done.
Naturally, this passes lint (even with the -h flag which uses
"heuristic tests to attempt to intuit bugs") without any complaint.
--
Roy Smith <allegra!phri!roy>
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016
atbowler@watmath.UUCP (Alan T. Bowler [SDG]) (08/11/85)
In article <505@brl-tgr.ARPA> mbarker@BBNZ.ARPA (Michael Barker) writes:
>>So, printf can never get a float as an argument, it always gets a double.
>>Therefore, %lf or %F are meaningless to printf.
>>
>>Brian Jones aka {ihnp4,}!drutx!qwerty @ AT&T-IS
>
>Brian (et al) - the reasoning is correct, but printf could easily be changed to
>accept %lf or %F (or any useful convention) as formatting directions for a
>value with the knowledge that the value will *actually* be a double. Let's try

I thought that the implicit promotion of float to double on passing
an argument was one of the things that was going away with the
new C standard. It certainly has been high on my personal hit list.
I grant that there was a reasonable case for it when C was just for
PDP-11's. But these days when there is a good possibility that
floating point is being handled by a software implementation of the
IEEE standard, it is a loser. In this situation the conversion between
float and double is a reasonably expensive operation, and really should
only be done when the programmer explicitly asks for it.
ken@turtlevax.UUCP (Ken Turkowski) (08/12/85)
In a previous article Brian Jones writes:
>>So, printf can never get a float as an argument, it always gets a double.
>>Therefore, %lf or %F are meaningless to printf.

PLEASE, don't use %F, when you can use %lf, and similarly for %E, %G,
%X, etc.

The biggest mistake in the implementation of printf is a disregard to
the standard in outputting hexadecimal and e-type output. In the rest
of the programming world, hexadecimal is output as (for example):

    10AD rather than 10ad

and floating-point e-type output as:

    3.1415926E+00 rather than 3.141592654e+00

Some implementations of printf interpret %E and %G to mean "use 'E'
rather than 'e'". Similarly, %X means "use the character set
[0123456789ABCDEF] rather than [0123456789abcdef] to print hexadecimal
numbers." If you want to print out a long using cap hex, you would
use the format specifier "%lX".

Does anyone know what the proposed ANSI standard says about this?
--
Ken Turkowski @ CADLINC, Menlo Park, CA
UUCP: {amd,decwrl,hplabs,nsc,seismo,spar}!turtlevax!ken
ARPA: turtlevax!ken@DECWRL.ARPA
mouse@mcgill-vision.UUCP (der Mouse) (08/13/85)
>     scanf("%D %F", long, double);
>     scanf("%ld %lf", long, double);
[should be &long, &double in both cases]
> vs.
>     printf("%ld %f", long, double);
> Why no %F or %D on printf?

Good question. Belongs there. So don't use it in scanf and there's no
problem.

> And why %lf vs %f? fun!

Disclaimer first: What I say here is based on my hacking on a VAX.
Lots of my comments may well be invalid elsewhere.

The C compiler produces exactly the same code for

    printf(format,long,double)

as

    printf(format,long,float)

Remember in K&R how all floats are converted to doubles all the time?
This also happens in function calls. Printf may support %lf; I haven't
checked. But it would necessarily be treated exactly the same as %f
because of this extension. Scanf does not have the same problem
(feature?) because you pass a pointer, you don't pass the value
directly.

By the way (this is very VAX-dependent), you can scanf into a double
and tell scanf it's a float (use %f rather than %lf). This works
because the first 4 bytes of a double form a valid float. The extra
precision will be unchanged, but for user input, the data generally
isn't that precise anyway.
--
der Mouse
System hacker and general troublemaker
CVaRL, McGill University

Hacker: One responsible for destroying /
Wizard: One responsible for recovering it afterward
guy@sun.uucp (Guy Harris) (08/13/85)
> Does anyone know what the proposed ANSI standard says about (%X meaning
> "print hexadecimal with capital A-F" instead of "print a "long" in
> hexadecimal with lower-case a-f", and likewise for %E and %G)?

It agrees with Systems III and V, Sun 4.2BSD, and, I believe, 4.3BSD -
%X means print an "int" in hex with capital A-F.

(Note that if you use %D, put a number followed by something else using
a %<something> format, and put your code under SCCS, you get a *big*
surprise - %D% gets expanded into the date...)

    Guy Harris
jmoore@mips.UUCP (Jim Moore) (08/13/85)
> > Here's one that just got me:
> >
> >     if (sv > score);        <----- note extraneous semi-colon
> >         score = sv;
> ....
> --
> Roy Smith <allegra!phri!roy>
> System Administrator, Public Health Research Institute
> 455 First Avenue, New York, NY 10016

I have seen this bug many times, especially in code written by people
who routinely switch programming languages. It does seem that the
compiler should warn that that test is a no-operation. The problem in
general is that there are 2 copies of the same information: the control
flow of the program. The compiler's copy is contained strictly in the
syntax of the program, while the programmer's copy is more loosely
defined by program layout conventions. It is strictly up to the
programmer to keep the 2 copies in sync in some situations.

There was a paper given at a USENIX (Toronto?) describing an experiment
with different program layout techniques. The programs were written
without any explicit grouping brackets, and were specified by the
layout and indentation. A program filter would add all the required
brackets and buzzard wings before feeding it to the compiler.

Jim Moore
MIPS Computer Systems
Mountain View, Ca
[ucbvax | decvax]!decwrl!mips!jmoore
mash@mips.UUCP (John Mashey) (08/13/85)
> > Here's one that just got me:
> >
> >     if (sv > score);        <----- note extraneous semi-colon
> >         score = sv;
> I have seen this bug many times, especially in code written by people
> who routinely switch programming languages.... There was a paper given at a
> USENIX (Toronto?) describing an experiment with different program layout
> techniques. The programs were written without any explicit grouping brackets,
> and were specified by the layout and indentation. A program filter would
> add all the required brackets and buzzard wings before feeding it to the
> compiler.

As I recall, there was a related bug in MERT, way back, of the form:

    if (something)
        stmt1;
        stmt2;
    stmt3;

where the {}'s were "invisible".

The one I always remember most of the C bites was the truly infamous
bug in chksum in uucp/pk0.c. (This was actually a code bug, masked by
bug in VAX compiler and irrelevant on 16-bit machines; it caused almost
every 68K port (that used the MIT C compiler, anyway) to break uucp, in
that the 68Ks could talk to each other, but not to VAXEn or 16-bit
machines). The bug was in lines of code that looked like:

    short s;
    unsigned short t;
    ...
    if ((unsigned) s <= t) ...

where they really meant if ((unsigned short)s <= t). The VAX did
(incorrectly) a 16-bit compare, rather than all of the correct
conversions. I'd call this a C bite, simply because psychologically,
it "feels" like (unsigned) type should mean (unsigned type) type,
although it clearly does not.
--
-john mashey
UUCP: {decvax,ucbvax,ihnp4}!decwrl!mips!mash
DDD: 415-960-1200
USPS: MIPS Computer Systems, 1330 Charleston Rd, Mtn View, CA 94043
henry@utzoo.UUCP (Henry Spencer) (08/14/85)
> I thought that the implicit promotion of float to double on passing
> an argument was one of the things that was going away with the
> new C standard. ...

Not quite. Making it go away would break many, many programs. What
has actually happened is a bit more complex. Implicit float->double in
most contexts is now at the compiler's discretion, i.e. a compiler for
a Cray would probably opt to do it only if asked. Function calls are
messier. If there's a function prototype in scope, then conversions
get done to the types in the prototype, so your function prototypes can
all just say "float" for the parameters in question and there will be
no implicit widening to double. If there is *no* function prototype in
scope, or if the prototype ends with the "..." syntax for a
variable-length argument list, then the old behavior still applies and
floats widen to double.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
tps@sdchema.UUCP (Tom Stockfisch) (08/14/85)
<>
roy@phri.UUCP (Roy Smith) writes:
> Here's one that just got me:
>
>     if (sv > score);        <----- note extraneous semi-colon
>         score = sv;

This type of error is easy to find with cb(1), which indents your code
according to its logic. The above fragment is turned by cb into

    if (sv > score);
    score = sv;

cb is particularly useful if you have macro functions, as these can
easily cause unexpected control-of-flow problems and are expanded on
one long line. I often do

    cc -E prog.c | cb | cat -s

The -E flag just runs the preprocessor, and the cat -s is to get rid of
the masses of white space which lines like "#include <stdio.h>" cause.

-- Tom Stockfisch
meissner@rtp47.UUCP (Michael Meissner) (08/15/85)
In article <860@turtlevax.UUCP> ken@turtlevax.UUCP (Ken Turkowski) writes:
>
>Some implementations of printf interpret %E and %G to mean "use 'E'
>rather than 'e'". Similarly, %X means "use the character set
>[0123456789ABCDEF] rather than [0123456789abcdef] to print hexadecimal
>numbers." If you want to print out a long using cap hex, you would
>use the format specifier "%lX".
>
>Does anyone know what the proposed ANSI standard says about this?
>

ANSI requires this behavior (as does system III, V, V.2, IEEE P1003,
and /usr/group).

-- Michael Meissner
Data General
...{ ihnp4, decvax }!mcnc!rti-sel!rtp47!meissner
mike@whuxl.UUCP (BALDWIN) (08/15/85)
> The biggest mistake in the implementation of printf is a disregard to
> the standard in outputting hexadecimal and e-type output. In the rest
> of the programming world, hexadecimal is output as (for example):
>
>     10AD rather than 10ad
>
> and floating-point e-type output as:
>
>     3.1415926E+00 rather than 3.141592654e+00
>
> Some implementations of printf interpret %E and %G to mean "use 'E'
> rather than 'e'". Similarly, %X means "use the character set
> [0123456789ABCDEF] rather than [0123456789abcdef] to print hexadecimal
> numbers." If you want to print out a long using cap hex, you would
> use the format specifier "%lX".
>
> Does anyone know what the proposed ANSI standard says about this?

April 30 X3J11C uses

    %x -> "abcdef", %X -> "ABCDEF"
    %e -> "e", %E -> "E", %g -> "e", %G -> "E".
--
Michael Baldwin
AT&T Bell Labs
harpo!whuxl!mike
conrad@ucsfcca.UUCP (Conrad Huang) (08/15/85)
This one got me:
foo(a, b)
int a[16], b[16];
{
bcopy((char *) a, (char *) b, sizeof a);
...
}
'sizeof a' is, of course, 4 (here).
Eric
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (08/15/85)
>     if (sv > score);        <----- note extraneous semi-colon
>         score = sv;

This sort of thing makes me think that a few extra keywords
are called for in programming languages like this. E.g.

    if <bool_expr> then <stmt> fi
    while <bool_expr> do <stmt> od

Something to keep in mind when you design an Algol-like language.
greenber@timeinc.UUCP (Ross M. Greenberg) (08/15/85)
One that has bitten me on more occasions than I'm willing to admit is
difference in precedence operations:

Imagine transmitting a two byte checksum:

    putc(highbyte, fd);
    putc(lowbyte, fd);

and then reading it on the other side:

    crc = (getch(fd) * 256) + getch(fd);

Different machines (compilers) do the two getch's in different orders.
--
------------------------------------------------------------------
Ross M. Greenberg @ Time Inc, New York
--------->{vax135 | ihnp4}!timeinc!greenber<---------

I highly doubt that Time Inc. would make me their spokesperson.
---
rbp@investor.UUCP (Bob Peirce) (08/16/85)
Here's one that trapped me this week. It took much head scratching and
debug prints to figure it out.

    int dial(telno)
    char *telno;
    {
        if(telno){      /* should be if(*telno) */
            dial it;
        }
        else{
            hang up;
        }
    }

Print statements showed the telno was being handed to the routine, but
the if said nothing was there. Turns out, on my system, the address of
telno is NULL. I needed to check the contents not the address!
--
Bob Peirce, Pittsburgh, PA
uucp: ...!{allegra, bellcore, cadre, idis} !pitt!darth!investor!rbp
412-471-5320

NOTE: Mail must be < 30,000 bytes/message
michael@python.UUCP (M. Cain) (08/16/85)
This one didn't bite me directly, but my wife spent most of a day
finding a more complicated instance of it in someone else's code.
Start with two source files:

foo.c:
    main()
    {
        sub2(1);
    }
    sub1()
    {
    }

bar.c:
    extern sub1(a,b);
    sub2(x)
    int x;
    {
        printf("a = %d, b = %d, x = %d\n",a,b,x);
    }

Compiling with "cc foo.c bar.c" produced no error messages at all. But
when a.out is executed, the output was

    a = 1, b = junk, x = junk

This was all done under XENIX on a Sritek 68000 board. Same kind of
screw-up in both AT&T and Berkeley universes on a Pyramid. Lint on the
Pyramid complains that sub2() has a variable number of arguments. Two
different 68000 cross-compilers make the same mistake. Our VAX running
System V correctly tagged the extern statement as incorrect. My 6809
OS-9 system missed the extern statement, but at least pointed out that
a and b are undefined within sub2().

Michael Cain
Bell Communications Research
..!bellcore!python!michael
peter@baylor.UUCP (Peter da Silva) (08/18/85)
> >     if (sv > score);        <----- note extraneous semi-colon
> >         score = sv;
>
> This sort of thing makes me think that a few extra keywords
> are called for in programming languages like this. E.g.
>     if <bool_expr> then <stmt> fi
>     while <bool_expr> do <stmt> od
> Something to keep in mind when you design an Algol-like language.

ICK ICK ICK! I hate languages that do that. Ever considered using "cb"
as a debugging tool? I have an MS-DOS version if anyone wants it...
--
Peter da Silva (the mad Australian werewolf)
UUCP: ...!shell!neuro1!{hyd-ptd,baylor,datafac}!peter
MCI: PDASILVA; CIS: 70216,1076
ark@alice.UUCP (Andrew Koenig) (08/18/85)
> int dial(telno)
> char *telno;
> {
>     if(telno){      /* should be if(*telno) */
>         dial it;
>     }
>     else{
>         hang up;
>     }
> }

Bob Peirce says that this program failed because it should have been
checking *telno instead of telno.

If telno is NULL, you had better not look at *telno; it's illegal.

If the address of a legal character item is NULL, your compiler is not
implementing the language properly.
ludemann@ubc-cs.UUCP (Peter Ludemann) (08/18/85)
Here's my favourite bite in the neck (apologies if I've made any
typos - this is just an example):

    typedef union {
        int u1;
        char u2;
    } union_type;

    typedef struct {
        int f1;
        union_type f2;
    } struct_type;

    struct_type s;

    s.u1 = 0;    /* should be: s.f2.u1 = 0 */

This has the effect of "s.f1 = 0" with no complaint from the compiler
(lint, of course, is another matter). Truly spectacular results can
occur if "f1" is a pointer to another area.

The really annoying thing is that K&R (page 186) says:

    A primary expression followed by a dot followed by an identifier
    is an expression. The first expression must be an lvalue naming
    a structure or union, and the identifier must name a member of
    the structure or union.

In other words, type checking almost as strong as Pascal's (yes, I know
about the case where two structures have the first fields declared the
same). However, K&R (page 209) says "... this restriction is not
firmly enforced by the compiler."

It is sad that the defects of the original C compiler have been
slavishly copied by subsequent implementations. If backward
compatibility were important a "don't check structures strictly" switch
could have been added to the compilers.
--
ludemann%ubc-vision@ubc-cs.uucp (ubc-cs!ludemann@ubc-vision.uucp)
ludemann@cs.ubc.cdn
ludemann@ubc-cs.csnet
Peter_Ludemann@UBC.mailnet
rbp@investor.UUCP (Bob Peirce) (08/19/85)
> What's worse, the optimiser has in this case hidden a program bug!!! > > Thus the moral: > > "Don't just test your code once. Test it again, this time > turn the optimiser OFF first". and vice versa! -- Bob Peirce, Pittsburgh, PA uucp: ...!{allegra, bellcore, cadre, idis} !pitt!darth!investor!rbp 412-471-5320 NOTE: Mail must be < 30,000 bytes/message
peter@baylor.UUCP (Peter da Silva) (08/19/85)
> cc -E prog.c | cb | cat -s
ANOTHER FLAG FOR CAT!?!?!? How many places have cat -s?
--
Peter (Made in Australia) da Silva
UUCP: ...!shell!neuro1!{hyd-ptd,baylor,datafac}!peter
MCI: PDASILVA; CIS: 70216,1076
levy@ttrdc.UUCP (Daniel R. Levy) (08/19/85)
In article <389@phri.UUCP>, roy@phri.UUCP (Roy Smith) writes:
> Here's one that just got me:
>
>     if (sv > score);        <----- note extraneous semi-colon
>         score = sv;
>
> This was in a series of computations which gave various scores; the
>fragment above was repeated in various places to pick out the maximum. Of
>course, the test is a no-op and the assignment was always done. Naturally,
>this passes lint (even with the -h flag which uses "heuristic tests to
>attempt to intuit bugs") without any complaint.
>--
>Roy Smith <allegra!phri!roy>

Sounds like a question of style hiding function. Why not stick to
something like

    if (sv > score) score = sv;

? I can't think of anything much more straightforward than that.
--
 -------------------------------     Disclaimer: The views contained herein are
|       dan levy | yvel nad      |   my own and are not at all those of my em-
|        an engihacker @         |   ployer, my pets, my plants, my boss, or the
| at&t computer systems division |   s.a. of any computer upon which I may hack.
|        skokie, illinois        |
|          "go for it"           |   Path: ..!ihnp4!ttrdc!levy
 --------------------------------      or: ..!ihnp4!iheds!ttbcad!levy
ark@alice.UUCP (Andrew Koenig) (08/21/85)
> typedef union {
>     int u1;
>     char u2;
> } union_type;
>
> typedef struct {
>     int f1;
>     union_type f2;
> } struct_type;
>
> struct_type s;
>
> s.u1 = 0;    /* should be: s.f2.u1 = 0 */

Gee, our compiler certainly complains about this one.
cdshaw@watmum.UUCP (Chris Shaw) (08/22/85)
In article <372@ttrdc.UUCP> levy@ttrdc.UUCP (Daniel R. Levy) writes: >In article <389@phri.UUCP>, roy@phri.UUCP (Roy Smith) writes: >> Here's one that just got me: >> if (sv > score); <----- note extraneous semi-colon >> score = sv; > >Sounds like a question of style hiding function. Why not stick to something >like > if (sv > score) score = sv; >? >| dan levy ...because if(sv>score||this==that+the_other||fopen("crap","r"))save=the+whales+fur+christ++; is the kind of statement where bugs really happen. Can you seriously spend less than two seconds reading that to comprehend what's going on ? If you answered yes, how about this (more important) question: Can you read a whole FILE of this kind of crap and then be able to find a variable at will ? I doubt it. I can think of more straightforward ways of producing code, some of which include programming while awake, so that the errors like the one in the original posting don't happen. Others include using a self-consistent style, which Mr Levy's is not. Compound if statements should look the same as simple if statements. Mr Levy's style of if statement has an equivalent in English called "the run-on-sentence". What's silly about the whole thing is that a program formatter can make this stuff QUITE readable, and will probably find the bug that "bit" Mr Smith. The most important element of a readable programming style is the use of white space. I personally can't stand the K&R style because I get visually confused when I read it. It's similar to an English paragraphing that doesn't use indenting or spaces between paragraphs. In the C book itself, this isn't bad, because the program fragments are small and the structures are simple. In real programs, however, there are lots of programs which are unreadable until passed through "indent" (on 4.2). Chris Shaw watmath!watmum!cdshaw or cdshaw@watmath University of Waterloo In doubt? Eat hot high-speed death -- the experts' choice in gastric vileness !
rlk@chinet.UUCP (Richard L. Klappal) (08/22/85)
In article <471@baylor.UUCP> peter@baylor.UUCP (Peter da Silva) writes:
>> cc -E prog.c | cb | cat -s
>
>ANOTHER FLAG FOR CAT!?!?!? How many places have cat -s?
>--
> Peter (Made in Australia) da Silva

The Fortune 32:16 has it. Means force single spacing on output
(kinda like uniq) to get rid of excessive blank lines.

PS: Peter: Could you post the MSDOS version of cb (if legal to do so)?
A friend uses the Idiot/Barely Moron with Lattice, and would
appreciate having cb. (+vi + ... UN*X :-)).

Richard Klappal

UUCP:       ..!ihnp4!chinet!uklpl!rlk  | "Money is truthful. If a man
MCIMail:    rklappal                   |  speaks of his honor, make him
Compuserve: 74106,1021                 |  pay cash."
USPS:       1 S 299 Danby Street       |
            Villa Park IL 60181        |      Lazarus Long
TEL:        (312) 620-4988             |         (aka R. Heinlein)
-------------------------------------------------------------------------
alan@drivax.UUCP (Alan Fargusson) (08/22/85)
> Here's one that just got me:
>
>	if (sv > score);	<----- note extraneous semi-colon
>		score = sv;
>
> This was in a series of computations which gave various scores; the
>fragment above was repeated in various places to pick out the maximum. Of
>course, the test is a no-op and the assignment was always done. Naturally,
>this passes lint (even with the -h flag which uses "heuristic tests to
>attempt to intuit bugs") without any complaint.
>--
>Roy Smith <allegra!phri!roy>

I have to tell you that I got bit the same way in PASCAL when I was a
student. This is not just a C problem. I think that all of the structured
languages I have seen (except Modula-2 and Algol 68) have this problem.
--
Alan Fargusson.
{ ihnp4, amdahl, mot }!drivax!alan
mouse@mcgill-vision.UUCP (der Mouse) (08/23/85)
[ ... ] > if(telno){ /* should be if(*telno) */ [ ... ] > Print statements showed the telno was being handed to the routine, > but the if said nothing was there. Turns out, on my system, the > address of telno is NULL. I needed to check the contents not the > address! Gee....and I thought a zero pointer was guaranteed not to point to anything valid (K&R says this). Or is NULL not a zero?! No, you are comparing to 0 not NULL. -- der Mouse {ihnp4,decvax,akgua,etc}!utcsri!mcgill-vision!mouse philabs!micomvax!musocs!mcgill-vision!mouse Hacker: One responsible for destroying / Wizard: One responsible for recovering it afterward
lam@btnix.UUCP (lam) (08/23/85)
[*** The Phantom Article Gobbler Strikes Again ***]

> > >	int i, a[10];
> > >	for (i = 0; i <= 10; i++)
> > >		a[i] = 0;
> > >
> > This looks to me like it will simply overwrite one int's worth of
> > memory beyond the end of the array "a" with the value 0. Granted,
> > depending on what happens to be after "a", this can have disastrous
> > results, but is there really an implementation in which it will
> > (reliably) lead to infinite looping?
> ----------
> Yes. Any implementation that allocates the space for i following the
> space for a.

The infinite loop is caused by the storage allocation: &i == &a[10],
so i is overwritten with 0 when i is 10.

The more interesting thing is that on some compilers, the infinite loop
does NOT occur. Lo and behold, the OPTIMISER comes into play. If i is
put in a register at the start of the for(), a[10] = 0 will indeed
overwrite i in memory but not the register !!! and the loop terminates
normally.

What's worse, the optimiser has in this case hidden a program bug!!!

Thus the moral:

	"Don't just test your code once. Test it again, this time
	 turn the optimiser OFF first".

------------------------------------------------------------------
Onward Lam
CAP Group, Reading, England.
root@bu-cs.UUCP (Barry Shein) (08/24/85)
Not really a bite, but I remember when I was first learning C I was quite bewildered by the fact that you couldn't really declare your own 'argv', that is, you couldn't declare an array of pointers to fixed length buffers except perhaps by: char *myargv[] = { "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", etc I mean, argv seemed kinda holy to me, disturbing. -Barry Shein, Boston University P.S. I know argv is var length, but that would be even harder to declare!
guy@sun.uucp (Guy Harris) (08/25/85)
> [ ... ] > > if(telno){ /* should be if(*telno) */ > [ ... ] > > > Print statements showed the telno was being handed to the routine, > > but the if said nothing was there. Turns out, on my system, the > > address of telno is NULL. I needed to check the contents not the > > address! > > Gee....and I thought a zero pointer was guaranteed not to point to > anything valid (K&R says this). All valid implementations of C guarantee this. Obviously, the implementation of C that this was done on is not valid. He should complain to the vendor. (Yes, there have been such implementations; one well-known chip maker's first UNIX release didn't put the necessary shim at data location 0 on a separate I&D space program. They fixed it shortly afterwards.) > Or is NULL not a zero?! No, you are comparing to 0 not NULL. If you compare a pointer against 0, the actual code compiled compares it against a null pointer. NULL *is* 0, if you're talking from the standpoint of "what does the '#define' in <stdio.h> and other places say": /* @(#)stdio.h 1.2 85/01/21 SMI; from UCB 1.4 06/30/83 */ ... #define NULL 0 (and you'll find the same thing in V7, 4.2, 4.3, S3, S5, ...). In any context where it is known to the compiler that something is supposed to be a pointer to a specific data type, any zero that appears there is treated as a null pointer of the type "pointer to that data type" (obviously, not a null pointer to an object of that data type, since a null pointer can't point to anything). These contexts include comparisons and assignments, so the two assignments in register struct frobozz *p; p = 0; p = (struct frobozz *)0; are equivalent and the two comparisons in if (p == 0) foo(); if (p == (struct frobozz *)0) foo(); are equivalent. Procedure calls, however, are not such a context, so the two procedure calls in bar(0); bar((struct frobozz *)0); are very definitely *not* equivalent. 
In ANSI Standard C, there is a syntax to specify that "bar" takes an argument of type "struct frobozz *"; if you declared "bar" in such a manner, the two procedure calls would be equivalent. Guy Harris
peters@cubsvax.UUCP (Peter S. Shenkin) (08/26/85)
I've had several bugs involving code hidden in macro definitions which have
been very difficult to find. One I recall offhand went something like this:
/* OPEN MOUTH *****************************************************************/
#define Coords(I) (complicated.structure.redirection[I].x, \
complicated.structure.redirection[I].y, \
complicated.structure.redirection[I].z )
main()
{
...
subr(Coords(i)); /* BITE */
...
}
/***************************************************************************/
subr(x,y,z)
float x,y,z;
{...}
/* SWALLOW ******************************************************************/
Problem is, when expanded, the call to subr looks like
subr((exp1,exp2,exp3));
The comma operator is applied, and subr() gets only exp3 !!! The interesting
thing is that if anyone had asked me, whether (something), ((something)),
and (((something))) mean the same in C, I would have said "Yes," without
thinking. Obviously, I would have been wrong.
Peter S. Shenkin philabs!cubsvax!peters Columbia Univ. Biology
mab@druca.UUCP (BlandMA) (08/28/85)
I was amused when I realized why this statement didn't print anything: printf("toggle ">" verbosity\n"); -- Alan Bland {ihnp4|allegra}!druca!mab AT&T Information Systems, Denver CO
lee@eel.UUCP (08/29/85)
>>Gee....and I thought a zero pointer was guaranteed not to point to >>anything valid (K&R says this). >All valid implementations of C guarantee this. Obviously, the >implementation of C that this was done on is not valid. He should complain >to the vendor. (Yes, there have been such implementations; one well-known >chip maker's first UNIX release didn't put the necessary shim at data >location 0 on a separate I&D space program. They fixed it shortly >afterwards.) Speaking of issues that have been beaten to death! K&R says only that the value 0 is distinguishable from pointers that point to objects, and that therefore the value zero is not a "valid" pointer. It certainly does not say that the 0 pointer will give you the "null" or empty value of any object, and in particular it does not promise that there will be an integer zero if you dereference (int*)0, or a character zero if you dereference (char*)0, nor a memory fault if you reference (foo*)0. NO, you cannot depend upon the value obtained by dereferencing ANY pointer that has been assigned the value zero. It does not point to any object; the implementation of C does not guarantee to protect you from erroneously trying to access that object and the result is unpredictable over various implementations.
darryl@ISM780.UUCP (08/29/85)
[] One final, subtle, point. K&R does not guarantee that the *value* 0 is distinguishable from all other pointers, but rather, that the *constant* 0 is. That is to say, you may compare against 0 to determine the validity of a pointer (or assign to guarantee invalidity), but you may not assume that comparison against (or assignment of) an int variable whose value is 0 will have the same result. This picky distinction probably doesn't affect any of the better known chips, but might be important on a machine where a null pointer is not a bit string of 0s. --Darryl Richman, INTERACTIVE Systems Corp. ...!cca!ima!ism780!darryl The views expressed above are my opinions only. P.S.: I know that this sounds amazing, so look at the top of K&R p190, under the section 7.7, equality operators (second paragraph), and again on top of p192, section 7.14, assignment operators.
dave@lsuc.UUCP (David Sherman) (08/30/85)
> >ANOTHER FLAG FOR CAT!?!?!? How many places have cat -s? > > The Fortune 32:16 has it. Means force single spacing on output > (kinda like uniq) to get rid of excessive blank lines. For those with BSD systems, or (as in our case) systems with some BSD utilities, the ssp(1) program does this. (It's used by man(1) for output to a terminal.) Dave Sherman The Law Society of Upper Canada Toronto -- { ihnp4!utzoo pesnta utcs hcr decvax!utcsri } !lsuc!dave
ark@alice.UucP (Andrew Koenig) (08/30/85)
>>All valid implementations of C guarantee this. Obviously, the
>>implementation of C that this was done on is not valid. He should complain
>>to the vendor. (Yes, there have been such implementations; one well-known
>>chip maker's first UNIX release didn't put the necessary shim at data
>>location 0 on a separate I&D space program. They fixed it shortly
>>afterwards.)

>Speaking of issues that have been beaten to death! K&R says only that the
>value 0 is distinguishable from pointers that point to objects, and that
>therefore the value zero is not a "valid" pointer. It certainly does not
>say that the 0 pointer will give you the "null" or empty value of any
>object, and in particular it does not promise that there will be an integer
>zero if you dereference (int*)0, or a character zero if you dereference
>(char*)0, nor a memory fault if you reference (foo*)0.

>NO, you cannot depend upon the value obtained by dereferencing ANY pointer
>that has been assigned the value zero. It does not point to any object;
>the implementation of C does not guarantee to protect you from erroneously
>trying to access that object and the result is unpredictable over various
>implementations.

I think the "necessary shim" referred to in the first note quoted
above has nothing to do with a value intended to ensure that *(int*)0
gives a defined value. Rather, it is a dummy variable located at
location 0 designed to ensure that NOTHING ELSE finds itself at location 0
by accident!

The trouble with putting a variable at location 0 is that its address
will then erroneously appear to be NULL.
guy@sun.uucp (Guy Harris) (08/31/85)
> Not really a bite, but I remember when I was first learning C
> I was quite bewildered by the fact that you couldn't really
> declare your own 'argv', that is, you couldn't declare an
> array of pointers to fixed length buffers except perhaps by:
>
>	char *myargv[] = {
>		"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0",
>		"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0",
>
>	etc
>
> I mean, argv seemed kinda holy to me, disturbing.

If you want an array of pointers to fixed-length buffers, you can declare it
as long as the number of such pointers can be determined at the time you
write the code.

	char bufs[3][20];

	char *bufps[3] = {
		bufs[0],
		bufs[1],
		bufs[2],
	};

If the number can't be fixed when you write the code, you can set up "bufps"
at run time.

Also note that "argv" isn't a pointer to an array of pointers to
fixed-length buffers, it's a pointer to an array of pointers to strings,
which you *can* declare.

> P.S. I know argv is var length, but that would be even harder to declare!

The secret is that "argv" (or, more correctly, what "argv" points to)
*isn't* declared. Pointers need not point to things which have been
declared; "malloc" returns pointers to objects fabricated on the fly. If
you have "n" arguments ("n" is a variable here), just do

	register char **argv;

	argv = (char **)malloc(n * sizeof(char *));

And you can fill them in.

	Guy Harris
gostas@kuling.UUCP (G|sta Simil{/ml) (09/01/85)
In article <2702@sun.uucp> guy@sun.uucp (Guy Harris) writes: >Procedure calls, however, are not such a context, so the >two procedure calls in > > bar(0); > bar((struct frobozz *)0); > >are very definitely *not* equivalent. In ANSI Standard C, there is a syntax >to specify that "bar" takes an argument of type "struct frobozz *"; if you >declared "bar" in such a manner, the two procedure calls would be equivalent. > > Guy Harris Is it also possible to give a NULL-pointer to a procedure as a parameter, if for example the procedure would return several values, and we are not interested in all of them? wait(0) works at least here (4.2BSD), but something like this does not: skip(fd, n) /* skip n bytes om streams that don't allow lseek() */ int fd, n; { (void)read(fd, 0, n); } G|sta Simil{ gostas@kuling.UUCP
lee@eel.UUCP (09/02/85)
> One final, subtle, point. K&R does not guarantee that the *value* 0
> is distinguishable from all other pointers, but rather, that the
> *constant* 0 is. That is to say, you may compare against 0 to
> determine the validity of a pointer (or assign to guarantee
> invalidity), but you may not assume that comparison against (or
> assignment of) an int variable whose value is 0 will have the same
> result. This picky distinction probably doesn't affect any of the
> better known chips, but might be important on a machine where a null
> pointer is not a bit string of 0s.

While the quotation is true, I think that it refers to the automatic
coercion that is required to give the constant 0 the proper distinguishable
pattern in the appropriate pointer type. I think we all fairly assume that

	char *p=0, *q="a";
	main() {if (p==q) printf("bogus");}

will fail to print because one of the pointers has been assigned the
constant 0 and one has been assigned a pointer to a real object. Therefore
the value 0 does persist after assignment to any pointer type and is
distinguishable from the values in other pointers as well. And two such
pointers to the same type both of which have been assigned the value 0 will
compare equal.

I don't see why the restriction applies to non-pointer variables. As long
as type coercions are explicit, this should apply to all values of zero,
whether encountered as a literal in the program or as the value of a
variable of integral type.

I think it is not unreasonable, tho it is certainly not covered anywhere,
that coercions between pointers of different types should map the 0 value
properly so that, for example,

	int *p=0; char *q=0;
	main() {if (p==(int *)q) printf("this is right");}

should produce output. We all know that 0 cannot be interpreted as a
pointer without knowing what it is a pointer to, but given that we know the
types of the pointers involved, the "I don't point to anything" values
should be considered equivalent in assignments and comparisons.
guy@sun.uucp (Guy Harris) (09/02/85)
> >>All valid implementations of C guarantee (that a null pointer doesn't > >>point to anything valid). ... (Yes, there have been (invalid) > >>implementations; one well-known chip maker's first UNIX release didn't > >>put the necessary shim at data location 0 on a separate I&D space program. > > >Speaking of issues that have been beaten to death! K&R says only that the > >value 0 is distinguishable from pointers that point to objects, and that > >therefore the value zero is not a "valid" pointer. It certainly does not > >say that the 0 pointer will give you the "null" or empty value of any > >object ... > > I think the "necessary shim" referred to in the first note quoted > above has nothing to do with a value intended to ensure that *(int*)0 > give a defined value. Yes, that is exactly what I was referring to. Ideally, if possible, location zero should literally have nothing there - i.e., your program should get a segmentation violation if it tries to use the contents of location 0. (This hits errant programs upside the head at a nice early stage in their lives.) If not, however, you must ensure that it doesn't have any code or data there - you have to stick a shim in there to prevent this on separate I&D systems (the startup code acts as a shim in most non-separate I&D systems). Guy Harris
peter@graffiti.UUCP (Peter da Silva) (09/03/85)
> > Not really a bite, but I remember when I was first learning C > > I was quite bewildered by the fact that you couldn't really > > declare your own 'argv', that is, you couldn't declare an > > array of pointers to fixed length buffers except perhaps by: > > > > char *myargv[] = { > > "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", > > "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", > > What are you talking about? char *myargv[5] = { "/bin/sh", "sh", "-c", "echo 'well it worked'", NULL }; What's so holy about this?
guy@sun.uucp (Guy Harris) (09/04/85)
> One final, subtle, point. K&R does not guarantee that the *value* 0
> is distinguishable from all other pointers, but rather, that the
> *constant* 0 is. That is to say, you may compare against 0 to
> determine the validity of a pointer (or assign to guarantee
> invalidity), but you may not assume that comparison against (or
> assignment of) an int variable whose value is 0 will have the same
> result.
>
> I don't see why the restriction applies to non-pointer variables. As long
> as type coercions are explicit, this should apply to all values of zero,
> whether encountered as a literal in the program or as the value of a
> variable of integral type.

("Oh no, Mabel! Here comes another K&R quote!")

	7.7 Equality operators

	A pointer may be compared to an integer, but the result is
	machine dependent unless the integer is *the constant* 0.
	(Italics mine)

	7.13 Conditional operator

	...otherwise, one must be a pointer and the other *the constant* 0,
	and the result has the type of the pointer. (Italics mine)

	7.14 Assignment operators

	...However, it is guaranteed that assignment of *the constant* 0
	to a pointer will produce a null pointer distinguishable from a
	pointer to any object. (Italics mine)

I'd say the intent of K and R was pretty clear here, wouldn't you?

As for "why" - think of a machine where a null pointer *didn't* have the
same bit pattern as the integer 0. Every time you assigned an integer to a
pointer, you'd have to check whether the integer was zero or not and assign
a null pointer instead (unless the computation you had to do to convert an
integer to a pointer did this anyway). Why penalize those assignments
solely to make assigning a 0 other than a constant 0 set the pointer to a
null pointer?

	Guy Harris
henry@utzoo.UUCP (Henry Spencer) (09/06/85)
> Is it also possible to give a NULL-pointer to a procedure as a parameter, > if for example the procedure would return several values, and we are not > interested in all of them? > > wait(0) works at least here (4.2BSD), but something like this does not: > ... > (void)read(fd, 0, n); Passing NULL only works if the function is prepared for the possibility and explicitly checks for it. wait() does; read() does not. See the documentation. By the way, that should be "wait( (int *)0 )", to make sure the type is right; the Unix documentation is often sloppy about this particular detail, since it originated on machines where the sloppiness didn't cause any problems. -- Henry Spencer @ U of Toronto Zoology {allegra,ihnp4,linus,decvax}!utzoo!henry
mouse@mcgill-vision.UUCP (der Mouse) (09/07/85)
>> ... so the two procedure calls in >> bar(0); >> bar((struct frobozz *)0); >>are very definitely *not* equivalent. >> Guy Harris >Is it also possible to give a NULL-pointer to a procedure as a parameter, >if for example the procedure would return several values, and we are not >interested in all of them? > >wait(0) works at least here (4.2BSD), but something like this does not: > >skip(fd, n) /* skip n bytes om streams that don't allow lseek() */ >int fd, n; >{ > (void)read(fd, 0, n); >} Let me be the 18th of 69 netters (we are a leaf node so there's a k-day delay, for some small integer k, between you and us) to point out that.... Wait(0) works because the wait code *specifically* checks for a 0 argument. I believe the code reads something like wait(stpointer) struct status *stpointer; { .... if (stpointer) { *stpointer = ststruct; } .... } A lot of code (sigvec, for instance) works this way. However, for calls like read(), where the lack of interest is a *very* exceptional case, this check is omitted. Some machines, notably the 68K family, will catch a zero pointer because there's no memory there. Some, notably VAXen, will not. However, for syscalls involving writing into memory, for most (-z format, see ld(1)) executable files, attempting to write into address 0 will fault (syscalls return EFAULT, user code get SEGV errors). Read(fd,0,n) *should* give you a memory error (EFAULT returned). The only case I know of in which it won't is when you are running on a VAX, so there is memory at address 0 (it's usually the C startup code from crt0.o), and the executable file is in the old old old format which doesn't do sharing of text segments, so the text segment is writeable. In this case, read will happily overwrite the first n bytes of the text segment. Normally (because that *is* the crt0 code, which doesn't get reentered), you won't notice unless n is big. 
-- der Mouse {ihnp4,decvax,akgua,etc}!utcsri!mcgill-vision!mouse philabs!micomvax!musocs!mcgill-vision!mouse Hacker: One responsible for destroying / Wizard: One responsible for recovering it afterward
henry@utzoo.UUCP (Henry Spencer) (09/08/85)
> Exactly, but also consider what K&R says in section 7.14: > > The compilers currently allow a pointer to be assigned to an integer, an > integer to a pointer, and a pointer to a pointer of another type. The > assignment is a pure copy operation, with no conversion. Note that they do not say that this is a legitimate feature of the language! All they say is that the current compilers will let you get away with it. This is no longer generally true, by the way. K&R is quite old. > Also, in section 14.4: > > A pointer may be converted to any of the integral types large enough to > hold it. [...] The mapping function is also machine dependent, but is > intended to be unsurprising to those who know the addressing structure > of the machine. > > Although this does not seal it up completely, it seems that K&R had it in > mind that putting pointers into integers (and taking them back again) would > have no overhead.... True, but there is a subtle point here: they say you can convert pointers to (sufficiently large) integers, they may say that you can convert the result back, but they don't say what the integer will look like. A NULL pointer will not necessarily show up as an integer zero. The equality between NULL pointers and 0 works only when 0 is a literal constant, in which case it is (potentially) treated specially by the compiler when encountered in a "pointer" context. The conversion of literal 0 to the NULL pointer is *not* an instance of the general "putting pointers into integers (and taking them back again)" conversion. -- Henry Spencer @ U of Toronto Zoology {allegra,ihnp4,linus,decvax}!utzoo!henry
darryl%ism780.uucp@BRL.ARPA (09/08/85)
>> K&R does not guarantee that the *value* 0 >> is distinguishable from all other pointers, but rather, that the >> *constant* 0 is. >> >> I don't see why the restriction applies to non-pointer variables. As long >> as type coercions are explicit, this should apply to all values of zero, >> whether encountered as a literal in the program or as the value of a >> variable of integral type. > >As for "why" - think of a machine where a null pointer *didn't* have the >same bit pattern as the integer 0. Every time you assigned an integer to a >pointer, you'd have to check whether the integer was zero or not and assign >a null pointer instead (unless the computation you had to do to convert an >integer to a pointer did this anyway). Exactly, but also consider what K&R says in section 7.14: The compilers currently allow a pointer to be assigned to an integer, an integer to a pointer, and a pointer to a pointer of another type. The assignment is a pure copy operation, with no conversion. Also, in section 14.4: A pointer may be converted to any of the integral types large enough to hold it. [...] The mapping function is also machine dependent, but is intended to be unsurprising to those who know the addressing structure of the machine. Although this does not seal it up completely, it seems that K&R had it in mind that putting pointers into integers (and taking them back again) would have no overhead. Checking for a 0 *value* probably is more overhead than they had in mind. --Darryl Richman, INTERACTIVE Systems Corp. ...!cca!ima!ism780!darryl The views expressed above are my opinions only.
darryl@ISM780.UUCP (09/10/85)
>> Although this does not seal it up completely, it seems that K&R had it in >> mind that putting pointers into integers (and taking them back again) would >> have no overhead.... > >True, but there is a subtle point here: they say you can convert pointers >to (sufficiently large) integers, they may say that you can convert the >result back, but they don't say what the integer will look like. Henry, you and I are NOT arguing; I agree that the implicit conversion of 0 to a null pointer only happens for constant 0s. Perhaps I was less than completely clear, but I wanted to be sure (hah!) that the netters would understand that 0 and an int variable containing the value 0 are (may be) treated differently here. --Darryl Richman, INTERACTIVE Systems Corp. ...!cca!ima!ism780!darryl The views expressed above are my opinions only.
peterc@ecr2.UUCP (Peter Curran) (09/11/85)
Although the topic of Null pointers has been beaten to death many times,
there is one point that I have never seen discussed. External variables
are to be initialized to 0, according to the C Reference Manual (I don't
have a copy of K&R handy, but I'm pretty sure it says the same thing.)
This means that integers get 0, and pointers get NULL. (I don't know what
is supposed to happen to variables for which 0 is not valid - what really
happens is they get 0 anyhow, of course).

Since this includes all variables not explicitly initialized, it includes
unions. It is hard to imagine an implementation of this that allows a block
of memory representing simultaneously one or more pointers and one or more
integers to be initialized correctly unless the bit pattern for a null
pointer is identical to the bit pattern for a 0 integer of the same size
(assuming one exists - otherwise concatenations of integers, or whatever
else is required).

I can think of at least two ways it could happen. First, I believe a
compiler is free to treat 'union' as equivalent to 'struct' - i.e. ignore
the intended overlaying of memory. It could then initialize the two sets of
variables entirely independently. Second, I can imagine some form of tagged
memory architecture in which the tags are only used in conjunction with
instructions that use the memory as an address, so the non-tag (i.e. the
integer) is zero, but the entire location (including the tag) is non-zero.
I don't know enough about tagged memory architectures to pursue this very
far, but it seems too complex to be really credible.

Therefore, unless you accept a brain-damaged compiler that treats 'union'
as equivalent to 'struct,' it seems hard to avoid the conclusion that C
requires that the bit-pattern for a "null" pointer be identical to the bit
pattern of "(int) 0" (except possibly in length).
gwyn@BRL.ARPA (VLD/VMB) (09/19/85)
The last X3J11 draft that I have a copy of states that objects with static storage duration that are not initialized explicitly are initialized implicitly as if every scalar member were assigned the integer constant 0. This does not imply anything about bit patterns for null pointers.