kyle@xanth.UUCP (04/07/87)
I would like to see the sizes of C integral types standardized.  It would be
much easier to write portable code if, when I define a variable as an 'int',
I could automatically know that its range is -128 to 127, or -32768 to 32767,
etc.  One proposal might be:

        char     8 bits
        short   16 bits
        int     32 bits
        long    64 bits

Having compilers on 16-bit machines generate code to handle 64-bit long's may
be cumbersome, but just knowing how "large" each type is would (probably) make
programmers more conscientious about which types they use.  Thus those "huge"
long's won't be needlessly used as often.

Besides savaging quite a few C implementations, what are the other drawbacks
to this?  Has this already been proposed?

kyle@xanth.cs.odu.edu  (kyle jones @ old dominion university, norfolk, va)
dlnash@ut-ngp.UUCP (04/08/87)
In article <791@xanth.UUCP>, kyle@xanth.UUCP (kyle jones) writes:
> I would like to see the sizes of C integral types standardized.
> [...]
>
> One proposal might be:
>
>         char     8 bits
>         short   16 bits
>         int     32 bits
>         long    64 bits

This is a good idea in theory, but not in practice.  What do you do about
machines which don't have a word size which is some power of two (like CDC
Cybers or DEC-20s)?  Doing 64-bit arithmetic on a machine with a 60-bit word
size (Cybers) or a 36-bit word size (DEC-20s) would be extremely difficult.

Another idea would be something like this:

        char    at least  8 bits (possibly more)
        short   at least 16 bits (possibly more)
        int     at least 32 bits (possibly more)
        long    at least 64 bits (possibly more)

Then you would have at least the range you expected.  The compiler could then
choose the size most advantageous for your machine.  If you need to construct
bit masks which depend on the operand size (not usually necessary, usually a
bad idea), you can use our old friend sizeof.

Don Nash

UUCP:    ...!{ihnp4, allegra, seismo!ut-sally}!ut-ngp!dlnash
ARPA:    dlnash@ngp.UTEXAS.EDU
BITNET:  CCEU001@UTADNX, DLNASH@UTADNX
TEXNET:  UTADNX::CCEU001, UTADNX::DLNASH
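A minimal sketch of the sizeof idea above (the macro names here are invented,
and CHAR_BIT comes from the draft standard's <limits.h>):

        #include <limits.h>     /* CHAR_BIT: bits per char */

        /* number of bits in an unsigned int, derived from sizeof */
        #define UINT_BITS       (sizeof(unsigned int) * CHAR_BIT)

        /* mask with only the high-order bit of an unsigned int set */
        #define UINT_HIGH_BIT   ((unsigned) 1 << (UINT_BITS - 1))

        /* mask covering the low n bits, for 0 < n <= UINT_BITS */
        #define LOW_MASK(n)     (~(unsigned) 0 >> (UINT_BITS - (n)))

These work whatever size the implementation picks for int, which is the point
of writing the mask in terms of sizeof rather than hard-coding a width.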
msb@sq.UUCP (04/09/87)
Johnathan Tainter (jtr485@umich.UUCP) writes:
> short, int, long is the dumbest part of C.  The language should have said
> there will be a class of types int<ext> where <ext> is the number of bits.
> [You should] add macros ... so you CANNOT use int, long etc.

And Kyle Jones (kyle@xanth.UUCP) writes:
] I would like to see the sizes of C integral types standardized.  It would be
] much easier to write portable code if when I define a variable as an 'int' I
] could automatically know... its range...  [e.g.] char 8 bits, short 16, ...

This comes up all the time, but perhaps it is worth rebutting it again.

The first reason these notions are bad is the presumption that there are only
a very small number of word sizes.  What do you do on a 36-bit machine, for
instance?

The second thing is, yes, efficiency.  The Draft Standard DOES specify
MINIMUM ranges for the different types.  In effect it guarantees...

        char <= short <= int <= long
        char >= 8 bits, short >= 16 bits, int >= 16 bits, long >= 32 bits

... with the further presumption, which has been part of C for a long time,
that int operations are at least as efficient as other types.

What this means is that if you always use

        if you may need more than 16 bits,                            long
        else if time efficiency matters more than space efficiency,   int
        else                                                          short or char

then your compiler will give you what you really need in the way most suited
to WHATEVER machine you may run on.  Now how can you do better than that?

Well, the world is not quite as perfect as I am implying here.  If you
require variables exceeding 32 bits, your code is certainly nonportable
whatever you do, because no variable is guaranteed such accuracy.  Also, the
Draft is not yet a Standard, and I've heard of compilers where short = 8 bits.
(No, I don't remember which ones.)  Finally, char may be signed or unsigned
on different machines.  (The Draft takes care of this by defining a
"signed char"; then "char" may be like "signed char" or like "unsigned char"
depending on the machine, and should only be used for actual character
values.)

But for the usual range of sizes and portability issues, following the above
guidelines works just fine.  Thinking you have to know the range of values of
a type is a holdover from other languages; knowing the MINIMUM range is quite
sufficient.

Oh, and by the way...

> > casting.  The assignment operator is pretty forgiving...it
> > knows what the types on both sides need be.
> Assignment had better be forgiving since type casts are defined in
> terms of it.

Not in the Draft Standard.  They're both defined in terms of type conversion.
Indeed, the Draft REQUIRES certain conversions involving pointers to be done
by casting, e.g.

        char *cp;
        int  *ip;

        ip = (int *) cp;

Mark Brader
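A small sketch of those guidelines in practice; the character-counting
program here is invented purely as an illustration:

        #include <stdio.h>

        int main()
        {
                long nchars = 0;  /* count may need more than 16 bits: long */
                int c;            /* fits in 16 bits, and time matters more
                                     than space here, so int; getchar()
                                     returns an int in any case */

                while ((c = getchar()) != EOF)
                        nchars++;

                printf("%ld\n", nchars);
                return 0;
        }

On a 16-bit machine the compiler gives nchars 32 bits because it must; on a
32-bit machine both variables are likely one word each, with no change to
the source.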
greg@utcsri.UUCP (04/11/87)
In article <791@xanth.UUCP> kyle@xanth.UUCP writes:
>One proposal might be:
>        char     8 bits
>        short   16 bits
>        int     32 bits
>        long    64 bits
>
>Having compilers on 16-bit machines generate code to handle 64-bit long's may
>be cumbersome...

Not anywhere near as cumbersome as having those same compilers generate code
to handle 32-bit ints.
--
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...
mwm@eris.UUCP (04/14/87)
In article <1987Apr9.155110.28398@sq.uucp> msb@sq.UUCP (Mark Brader) writes:
>The second thing is, yes, efficiency.  The Draft Standard DOES specify
>MINIMUM ranges for the different types.  In effect it guarantees...
>
>        char <= short <= int <= long
>        char >= 8 bits, short >= 16 bits, int >= 16 bits, long >= 32 bits
>
>... with the further presumption, which has been part of C for a long
>time, that int operations are at least as efficient as other types.

Unfortunately, these are only specified by implication (so far as I can
tell).  If someone can provide a paragraph number where all of this is
specified (or anywhere where "int operations are at least..." is specified),
I'd appreciate it.

>then your compiler will give you what you really need in the way most
>suited to WHATEVER machine you may run on.  Now how can you do better
>than that?

Something that did what you said.  For instance, consider a hypothetical
Queer Machine for C (QM/C), which has 18 bit words (word addressed), and
instructions for dealing with double words.  The obvious implementation has
int = short = 18 bits, and long = 36 bits.

Now, suppose I need 16 bits of magnitude on a signed value.  For this
machine, declaring things as int works just fine.  But the program will not
work on "standard" machines, because it really wants longs for that value.
But if I declare things as "long," I chew up twice the space for storage,
and presumably more time.  In other words, following your advice doesn't get
me what I really need, and does it in a way incredibly inappropriate for QM/C.

>Well, the world is not quite as perfect as I am implying here.  If you
>require variables exceeding 32 bits, your code is certainly nonportable
>whatever you do, because no variable is guaranteed such accuracy.

True.  It would be nice if I could declare things so that the program would
break at _compile_ time.  There's actually an easy way to do this, at zero
cost to those who don't need bit-level control of their data types.

        <mike
--
Here's a song about absolutely nothing.         Mike Meyer
It's not about me, not about anyone else,       ucbvax!mwm
Not about love, not about being young.          mwm@berkeley.edu
Not about anything else, either.                mwm@ucbjade.BITNET
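One way to make this dilemma concrete is a machine-conditional typedef;
QM_C is of course a made-up symbol, and the fragment only illustrates the
per-machine knowledge that the int-versus-long choice forces on the
programmer:

        /* a signed value needing 16 bits of magnitude */
        #ifdef QM_C
        typedef int  val16;     /* QM/C's 18-bit int already holds it */
        #else
        typedef long val16;     /* portable fallback: guaranteed at least
                                   32 bits, but possibly wasteful here */
        #endif

        val16 position;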
manis@ubc-cs.UUCP (04/14/87)
In article <3162@jade.BERKELEY.EDU> mwm@eris.BERKELEY.EDU (Mike (My watch
has windows) Meyer) writes:
>Unfortunately, these are only specified by implication (so far as I
>can tell).  If someone can provide a paragraph number where all of
>this is specified (or anywhere where "int operations are at least..."
>is specified), I'd appreciate it.

This is a good place for an appendix which is not part of the standard, much
like the "common extensions" appendix.

>Something that did what you said.  For instance, consider a
>hypothetical Queer Machine for C (QM/C), which has 18 bit words (word
>addressed), and instructions for dealing with double words.  The
>obvious implementation has int = short = 18 bits, and long = 36 bits.

If the QM/C uses one's complement, you have the PDP-9.  I never heard of a
PDP-9 C compiler...

This isn't a standardisation issue, but if I wrote a C compiler for a machine
with short words, you would have a pragma which set the precision of 'int'.
-----
Vincent Manis                 {seismo,uw-beaver}!ubc-vision!ubc-cs!manis
Dept. of Computer Science     manis@cs.ubc.cdn
Univ. of British Columbia     manis%ubc.csnet@csnet-relay.arpa
Vancouver, B.C.  V6T 1W5      manis@ubc.csnet
(604) 228-6770 or 228-3061

"Long live the ideals of Marxism-Lennonism!  May the thoughts of Groucho and
 John guide us in word, thought, and deed!"
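The pragma idea might look something like the fragment below; the directive
name and its argument are entirely hypothetical, and the draft only promises
that a #pragma an implementation doesn't recognize is ignored:

        /* hypothetical: ask a short-word compiler for 18-bit ints; */
        /* other compilers would simply ignore the line             */
        #pragma int_precision 18

        int counter;    /* now known (on that compiler) to span 18 bits */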
rbutterworth@watmath.UUCP (04/15/87)
In article <3162@jade.BERKELEY.EDU>, mwm@eris.BERKELEY.EDU (Mike (My watch
has windows) Meyer) writes:
> Something that did what you said.  For instance, consider a
> hypothetical Queer Machine for C (QM/C), which has 18 bit words (word
> addressed), and instructions for dealing with double words.  The
> obvious implementation has int = short = 18 bits, and long = 36 bits.
> Now, suppose I need 16 bits of magnitude on a signed value.  For this
> machine, declaring things as int works just fine.  But the program will
> not work on "standard" machines, because it really wants longs for
> that value.  But if I declare things as "long," I chew up twice the
> space for storage, and presumably more time.

I've never actually done this myself, but if you are really worried about
specifying minimum integer size and having the source still portable, you
could set up a header file for each different architecture/compiler
something like this:

        typedef int  int1;
        typedef int  int2;
        ...
        typedef int  int18;
        typedef long int19;
        typedef long int20;
        ...
        typedef long int36;

Then when you have a variable that needs at least 12 bits, you can declare it
as "int12 var;".  The typedef in the header file will give you the most
efficient type for the variable.  "int5" would be typedefed to "int" on most
machines, to "short" on those machines whose short instructions are as fast
and as small as for ints, and to "signed char" on those machines that have
fast char arithmetic.

In those applications where you ask for "int35", it would not compile on
machines that can't handle 35 bit integers.  This is of course exactly what
you want.

Note that "extern char x,y,z; x=y+z;" can generate several times the amount
of code as "extern int x,y,z; x=y+z;" on some machines.  Thus if you want
5 bit integers, "int5" would give you "int" on those machines.

Note that a parallel set of typedefs (e.g. pack8) would be needed when the
concern is for saving space (e.g. large arrays) and not code efficiency.
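A sketch of how such a header might be used; the file name "intsize.h" and
the variable names are invented here:

        #include "intsize.h"    /* the per-machine typedefs above */

        int12 line_count;       /* needs at least 12 bits; the header maps
                                   it to int, short, or signed char,
                                   whichever is cheapest on this machine */

        int35 file_offset;      /* simply fails to compile on a machine
                                   whose header leaves int35 undefined */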
msb@sq.UUCP (04/15/87)
Mike (My watch has windows) (!) Meyer (mwm@eris.BERKELEY.EDU) writes:
> In article <1987Apr9.155110.28398@sq.uucp> msb@sq.UUCP (Mark Brader) writes:
> >... The Draft Standard ... In effect ... guarantees ...
> >
> >        char <= short <= int <= long
> >        char >= 8 bits, short >= 16 bits, int >= 16 bits, long >= 32 bits
> >
> >... with the further presumption, which has been part of C for a long
> >time, that int operations are at least as efficient as other types.
>
> Unfortunately, these are only specified by implication (so far as I
> can tell).  If someone can provide a paragraph number ...

Well, I did say "in effect".  I guess you want the actual wording.
Section 3.1.2.5 reads in part...

# There are four types of signed integers, called signed char, short int,
# int, and long int.  ...
#
# A signed char occupies the same amount of storage as a "plain" char.
# A "plain" int has the natural size suggested by the architecture of the
# execution environment.  ...  The set of values of each signed integral
# type is a subset of the values of the next type in the list above.

This covers the first set of inequalities and the efficient-ints rule.
The actual minimum sizes are implied by the values of SCHAR_MAX, SHRT_MAX,
INT_MAX, and LONG_MAX, and the corresponding _MIN values, tabulated in
Section 2.2.4.2.  (Notice, incidentally, that the values are chosen in such
a way that sign-magnitude or 1's complement arithmetic is okay within the
word lengths mentioned; thus the smallest known-to-be-a-valid-int value is
-32767 and not -32768.)

> ... consider a
> hypothetical Queer Machine for C (QM/C), which has 18 bit words ...
> obvious implementation has int = short = 18 bits, and long = 36 bits.
> Now, suppose I need 16 bits of magnitude on a signed value.

Right, you have to declare it long for portability even though int would
work on such a machine.  I think I alluded to this in my own posting.

But the thing is, this is a RARE CASE.  If you have variables that you KNOW
will need 17 bits of sign and magnitude but not as many as 19, AND you have
enough of them or they are frequently enough used that efficiency on QM/C's
is a problem, THEN by all means do tricks with ifdefs and typedefs.  For
normal cases, the guidelines I suggested before will work.

> It would be nice if I could declare things so that the program
> would break at _compile_ time [if you need variables >32 bits].

Try:

        #if (1UL << 35 == 0)
        sorry, machine must have at least 36-bit longs
        #endif

(I'm not positive whether the U is necessary -- the Draft Standard doesn't
mention what << does in case of overflow when the left operand is signed,
and I don't want to think about it.)

Mark Brader
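Another way to get the same compile-time failure, for anyone leery of relying
on preprocessor shifts of long constants, is the old negative-array-size
trick.  This sketch assumes CHAR_BIT from the draft's <limits.h>, and since
sizeof counts storage bits rather than value bits it is only an approximate
check:

        #include <limits.h>     /* CHAR_BIT */

        /* A negative array size is a compile-time error, so this
         * declaration refuses to compile unless long occupies at
         * least 36 bits of storage.  No space is allocated for it.
         */
        extern char long_is_wide_enough[
                (sizeof(long) * CHAR_BIT >= 36) ? 1 : -1];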
henry@utzoo.UUCP (Henry Spencer) (04/16/87)
> I would like to see the sizes of C integral types standardized.  It would be
> much easier to write portable code if when I define a variable as an 'int' I
> could automatically know that its range is -128 to 127, or -32768 to 32767,
> etc.
> ...
> Besides savaging quite a few C implementations, what are the other drawbacks
> to this?...

It loses us the current semantics of "int", which are "the form of integer
that is most efficient on the machine in question".  Since most well-written
code doesn't care whether int is 16 or 32 bits (N.B. there is a lot of
badly-written code in the world), current C compilers are free to pick the
one that runs faster.  This can make quite a difference in performance.

All the world is *not* a VAX.  To quote Dennis Ritchie: "if you want PL/I,
you know where to find it".
--
"We must choose: the stars or     Henry Spencer @ U of Toronto Zoology
the dust.  Which shall it be?"    {allegra,ihnp4,decvax,pyramid}!utzoo!henry
doug@edge.UUCP (Doug Pardee) (04/16/87)
> The second thing is, yes, efficiency.  The Draft Standard DOES specify
> MINIMUM ranges for the different types.  In effect it guarantees...
>
>        char <= short <= int <= long
>        char >= 8 bits, short >= 16 bits, int >= 16 bits, long >= 32 bits
>
> ... with the further presumption, which has been part of C for a long
> time, that int operations are at least as efficient as other types.

I dunno about this last presumption having been part of C for a long time.
(I take that to mean K&R spec).  The closest I can find is in K&R sec. 2.2,
which says (twice) that "int" reflects the "natural size of integers on the
host machine."

I bring this up because on the C compilers I've used on the 68000, "int" has
always been a 32-bit quantity.  This is almost a necessity, because of the
well-known bad habit of assuming that a pointer will fit in an int.  But
32-bit ints on the 68000 are nowhere near as efficient as 16-bit ints.  They
require twice as many memory accesses, and multiplication and division have
to be performed with subroutines.  This latter point can turn a simple
subscripting operation into a performance catastrophe.
--
Doug Pardee -- Edge Computer Corp. -- Scottsdale, Arizona
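To make the subscripting point concrete: indexing an array of structures
compiles into an address computation that multiplies the index by the element
size, roughly as in this invented fragment, and with 32-bit ints on a plain
68000 that multiply becomes a subroutine call:

        struct rec { char name[20]; long balance; };
        struct rec table[100];

        long nth_balance(i)
        int i;
        {
                /* table[i] means, in effect,
                 *   *(struct rec *)((char *)table + i * sizeof(struct rec))
                 * and that multiply is done at int width -- 32 bits here,
                 * hence a library routine rather than one instruction.
                 */
                return table[i].balance;
        }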
msb@sq.UUCP (04/20/87)
> > The second thing is, yes, efficiency.  The Draft Standard DOES specify
> > MINIMUM ranges for the different types.  In effect it guarantees...
> > ... with the further presumption, which has been part of C for a long
> > time, that int operations are at least as efficient as other types.
>
> I dunno about this last presumption having been part of C for a long time.
> (I take that to mean K&R spec).  The closest I can find is in K&R sec. 2.2,
> which says (twice) that "int" reflects the "natural size of integers on
> the host machine."

This is what I meant; much the same language is in the Draft (sec. 3.1.2.5).
I think there's a general presumption that "natural" implies "most efficient".

> I bring this up because on the C compilers I've used on the 68000, "int"
> has always been a 32-bit quantity.  This is almost a necessity, because
> of the well-known bad habit of assuming that a pointer will fit in an int.
> But 32-bit ints on the 68000 are nowhere near as efficient as 16-bit ints.

If this is accurate, it means that the compiler writers had to choose between
making existing "well-known" badly-written code run at all without being
fixed, and making well-written code run more slowly than it should.  The
decision strikes me as -- no pun intended -- short-sighted.

But my assumption is that proper typing will become more widespread in the
future, which may be wrong; and, as someone else pointed out recently,
getting the right answer certainly beats getting the wrong answer fast.

Mark Brader
guy%gorodish@Sun.COM (04/21/87)
>I bring this up because on the C compilers I've used on the 68000, "int"
>has always been a 32-bit quantity.  This is almost a necessity, because
>of the well-known bad habit of assuming that a pointer will fit in an int.

"Almost".  I worked on a 68000-based machine that had a C implementation with
16-bit "int"s and 32-bit pointers; I didn't find it particularly painful to
use, except when I had to fix other people's code to be type-correct (and I
knew enough to blame *that* on the other people who wrote that code, not on
the C implementation).

>But 32-bit ints on the 68000 are nowhere near as efficient as 16-bit ints.
>They require twice as many memory accesses, and multiplication and division
>have to be performed with subroutines.  This latter point can turn a simple
>subscripting operation into a performance catastrophe.

Nope.  Multiplication by a constant can be done in-line, with a sequence of
shifts and adds, and the most common type of multiplication in subscripting
operations is multiplication by a constant.
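A sketch of the shift-and-add expansion: multiplying by 24 (a plausible
structure size, chosen here only for illustration) is i*16 + i*8, so a
compiler can emit the equivalent of the expression below instead of calling
a multiply routine:

        /* what "i * 24" can compile into */
        long times24(i)
        long i;
        {
                return (i << 4) + (i << 3);
        }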