[comp.lang.c] Sizes and types

mwm@eris.UUCP (10/28/87)

[Followups have been pointed to net.lang.c.]

In article <3626@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes:
<In article <2262@sfsup.UUCP> mpl@sfsup.UUCP (M.P.Lindner) writes:
<>3. as for the people who say "Never use the basic types", I say the following:
<>	1. use "char" to mean a character or a byte (guaranteed in K&R)
<
<Use char to mean a character.  Use "typedef char applicationtype" when the best
<implementation for the intended use happens to be a char on your machine.  My
<six bit char may not give the range you assumed because your chars are
<twelve bits.
<

Sigh. This seems to come up every so often, and generates the same
discussion each time. Why not go right to the end this time?

First, let's discuss sizes.  K&R says that shorts are no bigger than
ints, and longs are no shorter than ints. dpANS adds that shorts are
at least 16 bits, and longs at least 32. Since K&R alone allows a
simple 2-bit type for short, int, and long, it's kind of useless. So
we'll assume the dpANS.

Given those sizes, the "simple" rules are as follows (M.P. Lindner got
them right).

	1. Use "char" to mean characters, or bytes.
	2. Use "short" for small integers which need to be small.
	3. Use "int" for small objects for which size doesn't matter.
	4. Use "long" for anything that won't fit in 16 bits.

Now, some notes:

1) Never use "char" to hold something being used as an integer. You
have no idea how big it may be on some other machine. Of course, if
something external requires a byte instead of a short for such, you
have no choice.

2) "Small" integers fit in 16 bits. If they don't fit in 16 bits, use
long, period.

3) ints should be the fastest type that holds at least 16 bits.
Unfortunately, dpANS kept the K&R "most natural" wording, which means
there may be environments where ints *aren't* the fastest type.

Following the above means your code ports to dpANS compilers with no
problem, unless it depends on your machine's longs being bigger than
32 bits. Also, if your machine doesn't look like an "ansi" machine
(18-bit shorts, say), then you may not be getting the best type for
the machine. To avoid that, you can add a level of artificial types
between the compiler and your code.

To wit, you create typedefs for "int#" and "uint#". These types are
signed (unsigned) ints with # bits of magnitude. This kind of thing
is trivial to do. For instance, the following works for most
byte-addressed machines with eight-bit bytes:

typedef signed char 	int1, int2, int3, int4, int5, int6, int7;
typedef unsigned char	uint1, uint2, uint3, uint4, uint5, uint6, uint7, uint8;
typedef signed short	int8, int9, int10, int11, int12, int13, int14, int15;
typedef unsigned short	uint9, uint10, uint11, uint12, uint13, uint14, uint15,
			uint16;
typedef signed long	int16, int17, int18, int19, int20, int21, int22, int23,
			int24, int25, int26, int27, int28, int29, int30, int31;
typedef unsigned long	uint17, uint18, uint19, uint20, uint21, uint22, uint23,
			uint24, uint25, uint26, uint27, uint28, uint29, uint30,
			uint31, uint32;

Include the file (which should be tuned for your machine), and then
declare everything that needs to be larger than 16 bits, or everything
for which being small is important, with the type appropriate for its
size.  For things that fit in 16 bits for which size doesn't matter,
use int.  You get portability to any architecture that supports the
limits you've chosen, and (if the file is set up right) the smallest
size that works correctly for your application.

Notes again:

1) I chose the smallest size for each that worked. Things for which
size matters win this way. Small things that need to be fast wind up
as int. So all that's left is things larger than 16 bits that need to
be fast. Doing "fint#" and "fuint#" would be possible, but it's not
clear how useful such is.

2) Note that an int# is # bits of *magnitude*. Thus, there's no int32
on a machine with 32 bit longs. Likewise int16 needs to be a long, and
int8 needs to be a short.

3) I violated the rule about using chars as small integers. But the
file is set up on a per-machine basis, and so should be right for the
machine you're on.

4) This obviously takes care of machines with odd-sized words not
getting the smallest size that works under the simple rules. However,
it also helps with the problem of applications that really need longs
larger than the dpANS minimum, because those types won't exist on
machines that don't support them. The resulting code dies at compile
time when moved to such a machine, and names the variables that need
more bits than they can get. While not as good as running correctly
and dying on overflows, this is much better than producing incorrect
results when the "long" would have overflowed.

Finally, as Lawrence Crowl notes, you should use typedefs for
application types whenever it's appropriate. If you're following the
simple rules, this makes it easier to change the code to take
advantage of odd-sized words on machines that have them. It leads to
more readable code in either case.

The sized int rules are still a win even if everything is typedef'ed
to some application name, because 1) the size used by the object is
always documented, and 2) all the size-specific types can be fixed by
editing one short file.

	<mike
--
[Our regularly scheduled .signature preempted.]		Mike Meyer
The Amiga 1000: Let's build _the_ hackers machine.	mwm@berkeley.edu
The Amiga 500: Let's build one as cheaply as possible!	ucbvax!mwm
The Amiga 2000: Let's build one inside an IBM PC!	mwm@ucbjade.BITNET

gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/30/87)

In article <5657@jade.BERKELEY.EDU> mwm@eris.BERKELEY.EDU (Mike (My watch has windows) Meyer) writes:
>1) Never use "char" to hold something being used as an integer. You
>have no idea how big it may be on some other machine.

You know it's at least 8 bits.  A more serious problem is that a plain
"char" may be signed or unsigned, depending on the implementation, so
if more than the range 0..127 is needed, portability suffers.  If one
has a modern C compiler, explicit "signed char" or "unsigned char" can
be specified, but not all compilers support that yet.