hokey@plus5.UUCP (Hokey) (06/18/87)
Is there a good or overriding reason why the sizeof operator must return an explicitly unsigned value?

We have an application which stores strings in a "local" array if the strings are small enough, otherwise it malloc()s space and saves the string in the malloc()ed space. In any event, we keep track of the string length.

There are 3 interesting cases: no string, a zero-length string, and >0 length strings. We thought it would be a swell idea to denote the "no string" case with a length counter of -1. Normally, this is a neat idea. However, there are cases in which we only want to mess around with strings which are bigger than the local array size. For these cases, we can not use the expression

    length > sizeof local_string_buffer

because the -1 length (no string available) becomes slightly larger than the number of bytes in the local string buffer when converted to unsigned.

For what it is worth: This is happening in a speed-critical section of code. We are very thorough and very lazy. We *could* add 1 to all our lengths and solve this problem (and create some others). I can think of several other ways to "get around" the problem (presently, we cast most sizeof operators to int).

None of these issues are key - it seems to me having sizeof return an explicitly unsigned value violates "the principle of least astonishment".

One of the primary uses of sizeof is to make code more readable, portable, and maintainable. If I am on a machine with 4 byte character pointers and I want to write nonportable code by not using sizeof(char *), I am more likely to use "4" than "4U" (which will only work on "new" compilers anyway).

Then again, I think the size_t is pretty useless and that the proposed C standard contains far too many typedefs that work against the good programmer.

--
Hokey
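The comparison described in this post can be reproduced with a minimal sketch. The buffer size and the function names are hypothetical, and the sketch assumes a compiler where sizeof yields an unsigned type at least as wide as int (as ANSI C compilers do); it is not taken from the poster's application.

```c
#include <stddef.h>

#define LOCAL_BUF_SIZE 32              /* hypothetical local buffer size */

char local_string_buffer[LOCAL_BUF_SIZE];

/* The test as written in the post: length is converted to the unsigned
   type of the sizeof result before the comparison takes place. */
int exceeds_local_naive(int length)
{
    return length > sizeof local_string_buffer;
}

/* The workaround mentioned in the post: cast the sizeof result to int,
   so the comparison is done in signed arithmetic. */
int exceeds_local_cast(int length)
{
    return length > (int)sizeof local_string_buffer;
}
```

With a length of -1 (the "no string" marker), the naive form reports that the string is larger than the buffer, while the cast form does not. For ordinary non-negative lengths the two agree.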
gwyn@brl-smoke.ARPA (Doug Gwyn) (06/19/87)
In article <1748@plus5.UUCP> hokey@plus5.UUCP (Hokey) writes:
>None of these issues are key - it seems to me having sizeof return an
>explicitly unsigned value violates "the principle of least astonishment".

(a) AT&T "sizeof" has been this way for many years now.

(b) Since sizeof(thing) inherently cannot be negative, an unsigned integer value for the sizeof operator seems exactly right.

(c) Your example wasn't very convincing (to me at least).
karl@haddock.UUCP (Karl Heuer) (06/19/87)
In article <1748@plus5.UUCP> hokey@plus5.UUCP (Hokey) writes:
>[In our application] there are 3 interesting cases: no string, a zero-length
>string, and >0 length strings. We thought it would be a swell idea to denote
>the "no string" case with a length counter of -1. [But we sometimes want to
>make the test] "length > sizeof local_string_buffer" [which causes problems
>because the special-case value of length is coerced to (unsigned)-1]. It
>seems to me having sizeof return an explicitly unsigned value violates "the
>principle of least astonishment".

I think the crux of the problem is that your variable "length" is logically a union of "size_t" and a single out-of-band value which has nothing to do with sizes. It is not all that astonishing that you get incorrect results if you use an untested variable containing the OOB value as if it were a size_t.

To answer the title question, size_t is unsigned because (a) it's always nonnegative, and (b) the corresponding signed datatype may not be wide enough. Having it be unsigned only on those machines where it's necessary would cause even more astonishment when trying to port code.

>Then again, I think the size_t is pretty useless and that the proposed C
>standard contains far too many typedefs that work against the good programmer.

If size_t were not part of the standard, what type would you use for, say, the argument to malloc()?

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
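One way to read the "union of size_t and an out-of-band value" remark is that the out-of-band value should be tested for before the variable is used as a size. The following is only a sketch of that idea; NO_STRING, needs_heap and the choice of long for the length are assumptions made for illustration, not names from the thread.

```c
#include <stddef.h>

#define NO_STRING (-1L)   /* hypothetical out-of-band "no string" marker */

/* Returns 1 if a string of the given length would not fit in a local
   buffer of local_size bytes, 0 otherwise.  The out-of-band value is
   handled first, so only genuine (non-negative) lengths are ever
   converted to size_t for the comparison. */
int needs_heap(long length, size_t local_size)
{
    if (length == NO_STRING)
        return 0;                       /* no string: nothing to store */
    return (size_t)length > local_size; /* safe: length is non-negative here */
}
```

The cost is one extra test, which may matter in the speed-critical code the original post mentions.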
wesommer@athena.mit.edu (William Sommerfeld) (06/21/87)
In article <6001@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <1748@plus5.UUCP> hokey@plus5.UUCP (Hokey) writes:
>>None of these issues are key - it seems to me having sizeof return an
>>explicitly unsigned value violates "the principle of least astonishment".
>
>(a) AT&T "sizeof" has been this way for many years now.

But Berkeley's hasn't.

>(b) Since sizeof(thing) inherently cannot be negative, an unsigned
>integer value for the sizeof operator seems exactly right.

This can run into problems in existing code. For example, a couple of months ago, I spent several hours helping someone track down what appeared to be a DBM bug on the IBM RT/PC under IBM's BSD port. I finally tracked it down to the following statement:

    if (i1 <= (i2+3) * sizeof(short)) return (0);

(in additem(), in lib/libc/gen/ndbm.c).

By examining the assembler output by the compiler, I determined that it was probably generating an unsigned compare for that (I say "probably" because I can't read RT assembler very well), and the behavior of the program was as if an unsigned test was being generated there.

It turns out that the compiler was an ANSI C compiler. I mistakenly reported it as a compiler bug, and was corrected on it by someone who had actually done work on an ANSI C interpreter.

Bill Sommerfeld
wesommer@athena.mit.edu
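The effect described for the ndbm statement can be sketched as follows. The post does not say which operand went negative, so the sketch assumes i1 is a signed int that can drop below zero; the function names are hypothetical, and only the comparison itself comes from the post.

```c
/* The comparison as quoted: the right-hand side has the unsigned type of
   sizeof, so i1 is converted to unsigned before the compare. */
int no_room_unsigned(int i1, int i2)
{
    if (i1 <= (i2 + 3) * sizeof(short))
        return 1;
    return 0;
}

/* The same comparison with sizeof cast to int, which is how a compiler
   whose sizeof yields int would have evaluated it. */
int no_room_signed(int i1, int i2)
{
    if (i1 <= (i2 + 3) * (int)sizeof(short))
        return 1;
    return 0;
}
```

Under that assumption, a negative i1 satisfies the signed test but fails the unsigned one, because the negative value is converted to a very large unsigned number. For non-negative operands both forms give the same answer, which is why such code can work for years under one compiler and misbehave under another.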
hokey@plus5.UUCP (Hokey) (06/25/87)
sizeof is an operator, not a function. It doesn't "return" something in the classic sense. I see no reason to have this operator have a "side effect" of an explicit cast of its "value", especially when this cast is hidden and arguably unnecessary.

One can write subroutines like malloc without a size_t. Use function prototypes. This has the added benefit of permitting the compiler to warn you if you will lose precision as part of the cast.

If one wishes, one could say our desire to treat a "length" counter of -1 is out of band data. Using that logic, a nil pointer is out of band data, too.

Perhaps we should get rid of the nil pointer, and use a structure of a flag value and a pointer, and only use the pointer if the flag value is true.

--
Hokey
mpl@sfsup.UUCP (M.P.Lindner) (06/26/87)
In article <1750@plus5.UUCP>, hokey@plus5.UUCP writes:
> sizeof is an operator, not a function.

Correct, go on...

> It doesn't "return" something in the classic sense.

Au contraire, operators return values, otherwise 1+1 wouldn't be 2. However, if you mean they don't have an explicit "return" statement, granted (although this is not relevant).

> I see no reason to have this operator have a "side effect" of an explicit
> cast of its "value", especially when this cast is hidden and arguably
> unnecessary.

No "side effect" is necessary. Just as a == b returns an int regardless of the type of its operands, sizeof returns an unsigned. No cast is involved, just as you don't have to say (int) (a == b).

> One can write subroutines like malloc without a size_t. Use function
> prototypes. This has the added benefit of permitting the compiler to
> warn you if you will lose precision as part of the cast.

Right on!

> If one wishes, one could say our desire to treat a "length" counter of -1
> is out of band data. Using that logic, a nil pointer is out of band data, too.

The fact is, a "length" of -1 is not just out of band data, it's data that can't be held by the type of the sizeof operator. "nil" is out of band, but can still be represented in a pointer. For an analogy, "not a number" is out of band for a float, but is still representable.

> Perhaps we should get rid of the nil pointer, and use a structure of a flag
> value and a pointer, and only use the pointer if the flag value is true.
> --
> Hokey

I'm not sure I follow this, but I will assume it's sarcasm.
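The point that an operator's result type is fixed by the operator, not by its operands, can be checked with a small sketch. The function names are hypothetical, and the second function assumes a compiler where sizeof yields an unsigned type at least as wide as int.

```c
/* a == b compares two doubles, yet the result has the size of an int,
   consistent with the comparison yielding an int whatever the operands. */
int comparison_has_int_size(double a, double b)
{
    return sizeof(a == b) == sizeof(int);
}

/* sizeof(char) is 1.  Subtracting 2 in an unsigned type cannot give -1;
   it wraps around to a very large positive value instead. */
int sizeof_wraps_below_zero(void)
{
    return (sizeof(char) - 2) > 0;
}
```

Both functions return 1 under the stated assumption. No cast appears anywhere, which matches the claim that the unsignedness is simply the type of the result.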
guy%gorodish@Sun.COM (Guy Harris) (06/27/87)
> The fact is, a "length" of -1 is not just out of band data, it's data
> that can't be held by the type of the sizeof operator.

You're missing the point. The fact is, there are C compilers where the type of "sizeof" is "int", not "unsigned int", and thus -1 is NOT "data that can't be held by the type of the 'sizeof' operator".

(See the document "The C Environment of UNIX/TS", supplied with the System III documentation. It states:

    3.2.2 Unsigned numbers

    The value returned by "sizeof" is now "unsigned" rather than "int", so care must be exercised in the use of "sizeof" in a few strange cases.

I seem to remember seeing this change mentioned elsewhere, but I don't remember where. I don't know when this change was made; was it made in V7 or afterwards? The odd thing is that the 4BSD C compiler is based on the System III VAX C compiler, but has "sizeof" yield an "int". I don't know if 1) the VAX C compiler hadn't been changed as of System III, 2) the change antedated V7, and Berkeley changed the compiler back to the V7 rules for backward compatibility, or 3) something else happened. I seem to remember seeing *something* about such a change in or before V7, but I also seem not to remember noticing the type of "sizeof" differing between the System III and 4BSD C compiler.)

The complaint being made is that having "sizeof" yield a value of type "unsigned int", rather than "int", precludes having -1 as an out-of-band value for routines that normally take a "sizeof". Saying "'sizeof' yields a value of type 'unsigned int', so you can't use -1 as an out-of-band value anyway" doesn't argue that the complaint is invalid, it just points out why the complaint is being made in the first place! (I have no strong opinion either way on this. I merely point out that "the type of 'sizeof' is 'unsigned int'" is not, by itself, a valid argument against the proposition that it should be "int".)

Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy@sun.com
franka@mmintl.UUCP (Frank Adams) (07/02/87)
In article <22242@sun.uucp> guy%gorodish@Sun.COM (Guy Harris) writes:
|The complaint being made is that having "sizeof" yield a value of
|type "unsigned int", rather than "int", precludes having -1 as an
|out-of-band value for routines that normally take a "sizeof". Saying
|"'sizeof' yields a value of type 'unsigned int', so you can't use -1
|as an out-of-band value anyway" doesn't argue that the complaint is
|invalid, it just points out why the complaint is being made in the
|first place! (I have no strong opinion either way on this. I merely
|point out that "the type of 'sizeof' is 'unsigned int'" is not, by
|itself, a valid argument against the proposition that it should be
|"int".)

The point nobody seemed to notice is that on some machines, sizeof *has* to be unsigned -- "int" isn't big enough. Thus code which uses -1 as an out-of-band value in an integer holding a sizeof result is not portable. A standardization which causes non-portable code to no longer compile is a good thing.

--
Frank Adams ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate 52 Oakland Ave North E. Hartford, CT 06108
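The "int isn't big enough" argument can be illustrated with fixed-width types standing in for a machine's int and unsigned int. The post names no particular machine, so the 16-bit width, the 40000-byte object and the function names below are all assumptions chosen for illustration.

```c
#include <stdint.h>

/* Would an object size of nbytes fit in a 16-bit signed int? */
int fits_signed16(unsigned long nbytes)
{
    return nbytes <= INT16_MAX;    /* largest value is 32767 */
}

/* Would it fit in a 16-bit unsigned int? */
int fits_unsigned16(unsigned long nbytes)
{
    return nbytes <= UINT16_MAX;   /* largest value is 65535 */
}
```

On such a hypothetical machine, an array of 40000 bytes has a size that the unsigned type can hold and the signed type cannot, so a signed sizeof could not describe every object the machine can hold.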