[net.lang.c] Uses of "short" ?

dwb@houxh.UUCP (D.BECK) (09/05/85)


What are the reasons for using the type short in C?  On
machines that have a different size for "int" and "short"
the reason seems obvious: space.  However, I would appreciate any thoughts
on its usefulness there also.  Specifically, I am interested
in its use on machines where sizeof (int) == sizeof (short).

Thx
..... ! [allegra | ihnp4 ]! hru3c!peb

guy@sun.uucp (Guy Harris) (09/06/85)

> What are the reasons for using the type short in C?  On
> machines that have a different size for "int" and "short"
> the reason seems obvious: space.  However, I would appreciate any thoughts
> on its usefulness there also.  Specifically, I am interested
> in its use on machines where sizeof (int) == sizeof (short).

AIEEEEEEEEEEEEEEEEEEEEEE!

You don't choose "int" vs. "short" or "int" vs. "long" based on how big
"int" is on your machine.  Doing that is not writing C code; it is writing
code for that particular machine that just happens to be written in C.  If
you write code that uses "int" instead of "long" because it happens to work
on your machine, sure as Goddess made little green apples *somebody* is
going to be on your doorstep bitching because your code breaks on their
machine.

The penalty for using "int" instead of "short" may not be as high (unless
you're describing some externally-defined data object, like something
recorded on a file or sent over a network, but in those circumstances the
size of data objects is the least of your worries - there's byte order, 1's
complement vs. 2's complement, character code, floating point format,
structure padding, etc.), but if you use "int" instead of "short" on a
machine where sizeof(int) == sizeof(short), when that code gets moved to a
machine where sizeof(int) != sizeof(short), your program is going to waste
some space.  If the object being declared is a huge array, it is going to
waste a huge amount of space.
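
To make that concrete, here is a minimal sketch (the names and sizes are
invented; the figures assume a machine with 16-bit "short"s and 32-bit
"int"s):

	short	samples[100000];	/* values fit in 16 bits: ~200,000
					   bytes on such a machine */
	int	samples2[100000];	/* the same values: ~400,000 bytes,
					   twice the space for no gain */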

There is an unfortunate tendency for C programmers to think in terms of a
concrete machine that they're programming for, rather than an abstract
machine - or, even better, an abstract model of the particular computation
they're performing.  Thinking of data objects not as lumps of machine words
but as abstractions will, I suspect, improve the quality of your code in
general, and specifically its portability.

	Guy Harris

henry@utzoo.UUCP (Henry Spencer) (09/06/85)

> What are the reasons for using the type short in C ?  ...
> Specifically... on machines where sizeof (int) == sizeof (short).

Sloppiness in this is common (although not nearly as disastrous as
being sloppy on machines where sizeof(int) == sizeof(long)!).  The
main reason for distinguishing between the two is portability, i.e.
making your programs work on machines which violate this property
as well as machines which satisfy it.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

nather@utastro.UUCP (Ed Nather) (09/07/85)

> There is an unfortunate tendency for C programmers to think in terms of a
> concrete machine that they're programming for, rather than an abstract
> machine - [...]
> 
> 	Guy Harris

The only concrete machine I ever programmed just let you dial in how long
it was supposed to run before the stuff got all mixed up.  :-)
-- 
Ed Nather
Astronomy Dept, U of Texas @ Austin
{allegra,ihnp4}!{noao,ut-sally}!utastro!nather
nather@astro.UTEXAS.EDU

franka@mmintl.UUCP (Frank Adams) (09/10/85)

Or to put it a little differently, use 'int' to optimize access time for the
variable, and 'short' to optimize space used.

laura@l5.uucp (Laura Creighton) (09/10/85)

Remember that any code that you write is likely to be ported to machines
you never dreamed of. [There are gremlins who wait until the crack of
7 a.m. when all hackers are blissfully asleep to scrounge copies of your
source and distribute it to 50 random net sites all over the world.] If
your int is the same size as your short it may make no difference to you
whether you use int or short, but it may matter to someone else a great
deal. [The worse problem is people who think that sizeof(int) == sizeof(long)
mostly because that is the way it works on their vax. Move their code to a
pdp 11 and watch it dump core when things just aren't big enough.]

There are days when I think that the declaration ``int'' should be banned
to see if the code quality would improve.


-- 
Laura Creighton		(note new address!)
sun!l5!laura		(that is ell-five, not fifteen)
l5!laura@lll-crg.arpa

preece@ccvaxa.UUCP (09/10/85)

> There is an unfortunate tendency for C programmers to think in terms of
> a concrete machine that they're programming for, rather than an
> abstract machine - or, even better, an abstract model of the particular
> computation they're performing.  Thinking of data objects not as lumps
> of machine words but as abstractions will, I suspect, improve the
> quality of your code in general, and specifically its portability.  /*
> Written 12:58 am  Sep  6, 1985 by guy@sun.uucp in ccvaxa:net.lang.c */
----------
But 'int' is a perfectly good abstraction; more abstract than 'short'
or 'long.'  The restriction of certain values to certain ranges CAN
be part of an abstraction, but it can also be an incidental factor
that is only useful because some machines make a distinction that
makes it useful.  That says to me that use of 'short' or 'long'
instead of 'int' shows more attention to machine specificity.

It may be the case that in a certain piece of code it is possible
to prove that a variable's value must lie in a particular range.
If the programmer specifies that range somehow, compilers for
languages that support that distinction can produce code taking
advantage of it.  From the programmer's point of view, however,
that provable range is probably not significant to her view of
the process.  Often it doesn't matter to the programmer whether the
variable is real or integer, either, but that abstraction is more
deeply ingrained.

-- 
scott preece
gould/csd - urbana
ihnp4!uiucdcs!ccvaxa!preece

bc@cyb-eng.UUCP (Bill Crews) (09/10/85)

> What are the reasons for using the type short in C?  On
> machines that have a different size for "int" and "short"
> the reason seems obvious: space.  However, I would appreciate any thoughts
> on its usefulness there also.  Specifically, I am interested
> in its use on machines where sizeof (int) == sizeof (short).
> 
> Thx

The percentage of machines in the universe that are either 16-bit or 32-bit
machines is very high.  Therefore, one can get a great degree of (but
certainly not total) compatibility of communicated information by refraining from
declaring ints in structures, but instead declaring either shorts or longs.
All compilers for such machines of which I am aware implement short as 16 bits
and long as 32 bits, although the definition of int may go either way.  So, for
my data files and communications protocols, this is what I do.  If I run into
a 36-bit or 48-bit machine, I may have some work to do.
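
As a sketch of that practice (the structure and field names are invented;
it assumes the 16-bit-"short"/32-bit-"long" machines described above):

	struct file_header {
		short	version;	/* 16 bits on any such machine */
		short	nrecords;
		long	data_offset;	/* 32 bits likewise; no "int"s, so
					   the layout doesn't depend on the
					   compiler's choice for "int" */
	};

Byte order and structure padding can still differ between machines, as
noted elsewhere in this thread.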

By the way, the uint16-type stuff mentioned lately on the net can help here
also.
-- 
  /  \    Bill Crews
 ( bc )   Cyb Systems, Inc
  \__/    Austin, Texas

[ gatech | ihnp4 | nbires | seismo | ucbvax ] ! ut-sally ! cyb-eng ! bc

bc@cyb-eng.UUCP (Bill Crews) (09/10/85)

> There is an unfortunate tendency for C programmers to think in terms of a
> concrete machine that they're programming for, rather than an abstract
> machine - or, even better, an abstract model of the particular computation
> they're performing.  Thinking of data objects not as lumps of machine words
> but as abstractions will, I suspect, improve the quality of your code in
> general, and specifically its portability.
> 
> 	Guy Harris

This sounds great.  I agree with it to a point.  But doesn't it depend upon
what one is trying to accomplish?  Certainly those implementing communications
protocols DO care.  Kernel hackers probably care in lots of places, too,
especially those writing device drivers.

The net is that sometimes one wants to get close to the machine, and sometimes
one wants to use C as a high(er)-level language.  That's the beauty of C; it
CAN be used either way.  But it is up to the programmer.

If you really believe what you say, would you support the abolition of the
short and long data types?  Double as well?  I'd be interested.
-- 
  /  \    Bill Crews
 ( bc )   Cyb Systems, Inc
  \__/    Austin, Texas

[ gatech | ihnp4 | nbires | seismo | ucbvax ] ! ut-sally ! cyb-eng ! bc

guy@sun.uucp (Guy Harris) (09/12/85)

> > Thinking of data objects not as lumps of machine words
> > but as abstractions will, I suspect, improve the quality of your code in
> > general, and specifically its portability.
> 
> This sounds great.  I agree with it to a point.  But doesn't it depend upon
> what one is trying to accomplish?  Certainly those implementing
> communications protocols DO care.  Kernel hackers probably care in lots
> of places, too, especially those writing device drivers.

"Do care" about what?  Even device drivers, protocol modules, etc. are
probably much more likely to be correct if you think of the objects they
manipulate abstractly.  Some bits of code in device drivers on machines with
memory-mapped I/O might have to treat structures which represent device
registers, say, as low-level objects, but 99% of the data the OS manipulates
- and probably a high percentage of the data the device driver manipulates -
are not such low-level objects.

> The net is that sometimes one wants to get close to the machine, and
> sometimes one wants to use C as a high(er)-level language.  That's the
> beauty of C; it CAN be used either way.  But it is up to the programmer.

Even when writing grubby device driver code, you should use C as a
higher-level language.  There's no benefit to be gained from thinking of,
say, the I/O operation/buffer header queue of a disk driver as a bunch of
words, some of which contain addresses, some of which contain counts, etc.
The ability of C to get "close to the machine" is vastly overemphasized; 99%
of the code people write, even in OSes and the like, doesn't need to get
"close to the machine" in the same sense as an assembler language gets
"close to the machine" and, in most cases, *doesn't* get "close to the
machine" in that sense.

> If you really believe what you say, would you support the abolition of the
> short and long data types?  Double as well?  I'd be interested.

No, I don't support the abolition of those data types.  "int" means "most
convenient integral type, guaranteed to hold numbers in the range -32767 to
32767 but not guaranteed to hold anything outside".  Needless to say, this
is an extremely inappropriate type to represent "number in the range -1
million to 1 million".  The type "long" is needed for this (in ANSI C,
"long"s are guaranteed to hold values between -(2^31-1) to (2^31-1)).  "int"
is "closer to the machine" than "long", so if anything "int" should go if
you're trying to increase the distance of code from the machine, not "long"
or "short".

"int" should not be abolished, though, because in a lot of cases "short" and
"long" overspecify the type and don't allow the language implementation
enough freedom to choose the most appropriate type.  If you have some
variable which can use any reasonable amount of space without any serious
effect on the space requirements of the program, and which is *never* going
to be outside the range -32767 to 32767, "int" is the appropriate choice.
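
A sketch of the resulting choices (the names are invented):

	int	i;		/* loop index: always within -32767..32767,
				   and speed matters more than space */
	short	deltas[30000];	/* small values in a big array: space
				   matters more than speed */
	long	nbytes;		/* can exceed 32767, so "long" is the only
				   safe choice */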

	Guy Harris

bc@cyb-eng.UUCP (Bill Crews) (09/13/85)

> Even when writing grubby device driver code, you should use C as a
> higher-level language.  There's no benefit to be gained from thinking of,
> say, the I/O operation/buffer header queue of a disk driver as a bunch of
> words, some of which contain addresses, some of which contain counts, etc.
> The ability of C to get "close to the machine" is vastly overemphasized; 99%
> of the code people write, even in OSes and the like, doesn't need to get
> "close to the machine" in the same sense as an assembler language gets
> "close to the machine" and, in most cases, *doesn't* get "close to the
> machine" in that sense.
> 
> 	Guy Harris

You are invited to write a 3Com Ethernet driver that works on a PC
(16-bit int) and on a Cyb machine (32-bit int) without referring to longs
or shorts.  I.e., a 16-bit hardware register is a 16-bit hardware register,
despite your desire for abstraction.

I doubt we disagree fundamentally; I just think you are stating the case too
strongly.  My goal is to introduce environment dependencies only as needed.
I nevertheless believe, for instance, that when Internet protocol specifies
that the header checksum is a 16-bit 2's-complement number, it is wise to
comply, if you want the packets to fly properly.
-- 
  /  \    Bill Crews
 ( bc )   Cyb Systems, Inc
  \__/    Austin, Texas

[ gatech | ihnp4 | nbires | seismo | ucbvax ] ! ut-sally ! cyb-eng ! bc

guy@sun.uucp (Guy Harris) (09/14/85)

> But 'int' is a perfectly good abstraction; more abstract than 'short'
> or 'long.'  The restriction of certain values to certain ranges CAN
> be part of an abstraction, but it can also be an incidental factor
> that is only useful because some machines make a distinction that
> makes it useful.  That says to me that use of 'short' or 'long'
> instead of 'int' shows more attention to machine specificity.

The use of "long" instead of "int" shows more attention to machine
specificity?  OK, we have the object "Internet address".  This object can be
represented as, among other things, a 32-bit quantity.  (We neglect the
problem of non-binary machines for the nonce.) Implementing this object with
an "int" shows a hell of a lot of attention to machine specificity, since it
won't work worth a damn on a PDP-11, or any other machine with "int"s less
than 32 bits (like machines based on current 8086-family chips, or
68000/68010/68008 machines with 16-bit-"int" compilers).  Implementing it
with a "long" shows a lot less machine specificity, since (according to the
ANSI C standard) a "long" can hold numbers in the range -2147483647 to
2147483647.  (On a two's complement machine, or a one's complement machine,
or even a sign-magnitude machine, this requires 32 bits.)  The same argument
applies to "unsigned int" vs. "unsigned long".

If the 4.xBSD networking code had been written with less implicit knowledge
of the machines it would work on - i.e., if the type "long" had been used
where the C specification says it should be (the information explicitly
described in the ANSI C standard is here considered part of the "implicit"
specification of C - yes, it's folklore, but UNIX is still dominated by
folklore) - it would have moved to 2.9BSD more easily.  I believe a certain
popular news-reading system had much the same problem; it stored the length
of an article in an "int" instead of a "long".  Earlier versions of Berkeley
Mail had the same problem (and "mailx" is based on one of those earlier
versions, alas).

"int" is to be used to implement "integer" objects whose value will *never*
be outside the range -32767 to 32767, and where the amount of space taken up
by the object is less important than the amount of time required to
manipulate it.  (Well, modulo machines with 16-bit data paths and large
address spaces, where "int"s are often 32 bits even though it takes more
time to manipulate them than it does to manipulate 16-bit quantities.)
"short" is to be used to manipulate "integer" objects whose value will never
be outside that range but where the amount of space taken up by the object
is more important than the amount of time required to manipulate it (either
because there's a limit on the address space or physical memory available,
or because the object's representation must conform to some
externally-imposed restrictions).  "long" is to be used to manipulate
"integer" objects whose value can be outside the aforementioned range, or
whose representation must conform to some externally-imposed restriction
that requires the use of "long".

Code that uses "int" to implement objects known to have values outside the
range -32767 to 32767 is incorrect C.  The ANSI standard explicitly
indicates this.  Even in the absence of such an explicit indication in an
official language specification document, this information should be
imparted to all people being taught C.

If you removed "long" from the C language, you would either

	1) have a language incapable of talking about numbers outside the
	   range -32767 to 32767

or

	2) have a language which requires at least an 18-bit machine and
	   probably at least a 32-bit machine.

"short" is less commonly used, since it provides no guarantees about the
range of integral values it can represent that "int" doesn't provide.
However, anyone who is aware that correct C code can, in most if not all
cases, be moved from one machine to another (assuming no operating system
dependencies) simply by recompiling it should find it obvious that there
*is* a reason to use "short" instead of "int" even if
sizeof(short) == sizeof(int) and even if the data doesn't have to conform to
some external specification.  Thinking of "short" as a compact form of
"int" and using it wherever space-efficiency is *or might be* of primary
importance will yield code that is more likely to run happily on a variety
of machines (and is less likely to piss off the guy who has to get the
program running efficiently on a computer other than one of the ones the
original programmer had in their shop).

> It may be the case that in a certain piece of code it is possible
> to prove that a variable's value must lie in a particular range.
> If the programmer specifies that range somehow, compilers for
> languages that support that distinction can produce code taking
> advantage of it.  From the programmer's point of view, however,
> that provable range is probably not significant to her view of
> the process.

In a lot of cases, I damn well hope that the provable range is significant
to the programmer's view of the process.  In the code

	int a[10];
	int i;

	i = <some value>;
	a[i]++;

if the programmer's view of the process does not include the (provable)
condition that "i" will never have a value outside the range 0 to 9, this
code is incorrect and, by Murphy's Law, will proceed to demonstrate that
fact at the worst possible moment.  Plenty of code demonstrates its
incorrectness in similar fashion; the code

	FILE *foo;
	char buf[SIZE];

	foo = fopen(<some_file_name>, "r");
	fgets(buf, SIZE, foo);

will so demonstrate on a Sun (or a CCI Power 5/20 or a lot of other
machines) simply by being run after ensuring that the file to be opened and
read does not exist.  Replacing it with

	foo = fopen(...);
	if (foo != NULL)
		fgets(...);

will probably render it provable that the "fgets" call won't screw up - or,
at least, won't screw up by reading from an unopened file.

I don't know whether there are proof techniques which are powerful enough to
prove the correctness of code involving subrange types in all interesting
cases and which are practical.  If there are, I'd like to see them
incorporated into compilers and have the compiler refuse to generate code
unless 1) the necessary checks are put in or maybe 2) explicit directives
are inserted *into the code* to tell the compiler that you know what you're
doing and it should trust you.  (I don't want it to be a compiler option; I
want the code to explicitly indicate that it's being unsafe.)

	Guy Harris

throopw@rtp47.UUCP (Wayne Throop) (09/15/85)

> > Even when writing grubby device driver code, you should use C as a
> > higher-level language.
> >         Guy Harris

> You are invited to write a 3Com Ethernet driver that works on a PC
> (16-bit int) and on a Cyb machine (32-bit int) without referring to longs
> or shorts.  I.e., a 16-bit hardware register is a 16-bit hardware register,
> despite your desire for abstraction.
>           Bill Crews

Well, it surely can't be done without referring to C's primitive types,
but I think that it should still be done abstractly.  You need a module
that "knows" about the physical-to-C-types mappings that you need to
use, and there you say

    typedef short    packet_id_type;   /* network needs 16 bits signed */
    typedef unsigned packet_len_type;  /* network needs 32 bits unsigned */
    typedef long     io_register_type; /* a 32 bit signed register */
    ... etc, etc

and everywhere else in the code, you refer to the abstract types
packet_id_type, packet_len_type, io_register_type, and so on.  Thus, the
machine-dependent part of things can be kept to a very small part of the
world, by abstracting the machine (or specification) dependent types.

Why is this an advantage?  Because it allows you to have a "handle" on
things that are, say, io_register_type.  If you just declared them
"long" everywhere, you couldn't tell them from file positions, or
other things that might be declared "long".

In this sense, using "long" or "short" or "int" directly is often "too
low level" a use of C.  If you want "a small integer that won't get too
large" use "short".  But if you want "a thing that must be mapped to a
specific size and shape in bits", use an abstract type, not the
(changeable) primitive type.  In "real" code for large systems, I'd
hope to almost *never* see *anything* declared to be any primitive
type.
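
For instance (a sketch; the variable names are invented):

    packet_id_type   id;    /* not "short": the physical mapping lives
                               in the one machine-dependent module */
    io_register_type *csr;  /* and it can't be confused with a file
                               position or any other "long" */
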
-- 
Wayne Throop at Data General, RTP, NC
<the-known-world>!mcnc!rti-sel!rtp47!throopw

guy@sun.uucp (Guy Harris) (09/16/85)

> You are invited to write a 3Com Ethernet driver that works on a PC
> (16-bit int) and on a Cyb machine (32-bit int) without referring to longs
> or shorts.  I.e., a 16-bit hardware register is a 16-bit hardware register,
> despite your desire for abstraction.

The object being described is a 16-bit hardware register.  As such,
technically, there's no abstract way to describe it in C except *maybe* a
bitfield.  What happens on an 18-bit machine?  One hopes that 1) the
register is an 18-bit register and 2) "short" is 18 bits on the machine (or
that both are 36 bits).

For that matter, what if the board has memory-mapped I/O on one machine, but
not on another (i.e., you have to issue I/O instructions to read the device
registers)?  If you truly want a portable driver, you'll have to pick a
level of abstraction *above* the device registers and have 99% of the code
in the driver deal with that abstraction rather than the device registers.

The Ethernet driver has to deal with a lot of objects that aren't part of
the board - a 4.2 driver, for instance, has to talk to the various protocol
modules above it.  That code doesn't have to get up-close and personal with
the machine to the degree that code that fiddles hardware registers does.

Also, if you can find *any* statement where I say that the use of "short"
and "long" violates proper use of abstractions, and should be avoided, I'd
really like to see it.  In fact, I believe quite the opposite.  In some
cases, "int" better fits the abstract object being described than "short" or
"long" does, and in some cases "short" or "long" better fit it.
"short" and "int" capture a requirement that -32767 to 32767 be the range of
values of this (integral) object equally well.
However, "short" specifies the amount of space required more stringently
than "int" does, and "int" (sort of) specifies the required efficiency for
instructions that manipulate the object better than "short" does.  Choose
the one that specifies the part you *do* care about (and in the case of a
16-bit device register, "short", while *not* perfectly specifying this, does
so better than "int" does).  If you want something that has a range of
values of -2147483647 to 2147483647 (*not* -2147483648 - the Sperry 100
users will get a bit annoyed if you assume that all UNIX machines can
represent -2147483648), you should use "long", even if you "know" that
you'll be running on a machine with 32-bit (or wider) "int"s.

> I nevertheless believe, for instance, that when Internet protocol specifies
> that the header checksum is a 16-bit 2's-complement number, it is wise to
> comply, if you want the packets to fly properly.

So, if your language has a primitive integral data type that specifies that
objects of the type are 16 bits and that arithmetic is done in 1's
complement (not 2's complement - check the IP spec again), use it.  There is
no way in C to say "I want a 16-bit 1's complement number."  (Note that the
4.2BSD VAX and Sun IP checksum routines both drop into assembler language at
times.)

	Guy Harris

tp@ndm20 (09/18/85)

>I doubt we disagree fundamentally; I just think you are stating the case too
>strongly.  My goal is to introduce environment dependencies only as needed.
>I nevertheless believe, for instance, that when Internet protocol specifies
>that the header checksum is a 16-bit 2's-complement number, it is wise to
>comply, if you want the packets to fly properly.

But what type is that in C?  On a Harris H-series machine, a short is
24 bits, as is an int.  A long is  48 bits.   Obviously  this kind of
code is machine dependent.  I believe that the point Guy is trying to
make is that you CAN NOT assume that  a short  is 16  bits on *every*
machine, because it isn't, so you should recognize that  what you are
doing is machine dependent, no matter HOW you code it  in C.   If the
exact number of bits is mandated, it is a machine dependence how that
will be handled.  If it is not, then we should deal with abstractions
to produce more portable code.  

I always use the guideline that if a number is  less than  16K, I use
short, and if it is known to be greater than that I use long.  If the
range of values is not well known, or space efficiency is not a great
concern, I use int, as that is presumably the  most efficient integer
type.  These aren't the best guidelines in the  world, and  I am open
to better suggestions.   It  would be  nice to  be able  to declare a
value  range  and  let the  compiler pick  the type  based on machine
characteristics (a la Pascal).

Terry Poot
Nathan D. Maier Consulting Engineers
(214)739-4741
Usenet: ...!{allegra|ihnp4}!convex!smu!ndm20!tp
CSNET:  ndm20!tp@smu
ARPA:   ndm20!tp%smu@csnet-relay.ARPA

preece@ccvaxa.UUCP (09/19/85)

> The use of "long" instead of "int" shows more attention to machine
> specificity?  /* Written  8:49 pm  Sep 13, 1985 by guy@sun.uucp in
> ccvaxa:net.lang.c */
----------
Well, yes, to my mind. The programmer, thinking abstractly about a
particular process, is likely to think of a variable quantity as
an integer or as a real.  Precision is a secondary consideration
that, generally, only becomes a primary consideration when the
simpler assumption fails.  The range of an integer is not likely to
be a concern until the programmer is faced with a case in which the
default assumption ("it's an integer") fails.  So I submit that
as a default, "int" is more abstract than "short" or "long." That's
why the white book says an int is "the natural size suggested by
the host machine architecture."  In practice, the programmer usually
will have a pretty good idea when a quantity is likely to violate
that default assumption on a particular machine and work around it
accordingly (whether by changing "int" to "long" or by providing
a specialized data type if "long" isn't long enough).

Now we're going to have a C standard pretty soon, in all probability,
and that may well change the default assumptions.  If "int" is
truly defined as a 16-bit quantity, I will probably change my
default working habits to use long -- otherwise my default
abstraction would be violated too often.  Up until the present,
however, we have been working with a language in which the
definition of short, int, and long were specifically machine
dependent, and anyone porting software simply had to be aware
of the obvious places where machine dependencies showed up.

There are other languages where the purer abstraction is quite
acceptable.  In Common Lisp an integer can get to be any size
it needs to be; the system will take care of keeping it in an
appropriate kind of object for the size it currently has.  On
the other hand, there is a certain amount of overhead in that
approach that you might not want to swallow.  But it IS machine
independent.
----------
> Code that uses "int" to implement objects known to have values outside
> the range -32767 to 32767 is incorrect C.  The ANSI standard explicitly
> indicates this.  Even in the absence of such an explicit indication in
> an official language specification document, this information should be
> imparted to all people being taught C.
----------
Now that there is a reasonably firm draft standard, this is a reasonable
statement.  Not very long ago it was religious dogma.
----------
> In a lot of cases, I damn well hope that the provable range is
> significant to the programmer's view of the process.  [gives
> example of variable used as index into fixed size array]
----------
Well, yes and no.  It is significant to the programmer that the
value be a legal index into the array, but that may not be a fixed
range in the programmer's mental model of the process.  That is, it
may be temporarily fixed, simply because it is necessary to provide
a value for the declaration, but that size may be incidental.  In
such cases the well-bred programmer will have provided a #define,
with a suggestive name, for the range of the array and anything
that checks against it, but may not be able to provide a range for
the index other than "fitting within the array," which may not be
well modeled by architecturally convenient number sizes.  An array
with a dynamic size is another example.  The point is that "fitting
within the array" is a good and sufficient abstraction for the
programmer's view of the process.

-- 
scott preece
gould/csd - urbana
ihnp4!uiucdcs!ccvaxa!preece

peter@graffiti.UUCP (Peter da Silva) (09/25/85)

Would it be a horrid assault on the spirit of 'C' to allow the following:

int x:24;

Which will be allocated the appropriate number of words for the machine
involved? If this looks too much like a bit-field, and you're allergic
to that for some reason, how about:

int:24 x;

Then you can define

float foo:16;

if you really think you can do something useful with 16-bit floats. Someone
must use them for something...

jss@sjuvax.UUCP (J. Shapiro) (10/03/85)

I am inclined to prefer the use of int16, int32, int64, int8, char. These
leave it entirely unambiguous which one you wanted and can be typedefed by
a standard header file to avoid a lot of machine dependency. It ain't perfect,
but it's pretty damn close.

Anyone with a 48 bit int deserves what he gets;-)

Jon Shapiro
-- 
Jonathan S. Shapiro
Haverford College

	"It doesn't compile pseudo code... What do you expect for fifty
		dollars?" - M. Tiemann

guy@sun.uucp (Guy Harris) (10/05/85)

> Well, yes, to my mind. The programmer, thinking abstractly about a
> particular process, is likely to think of a variable quantity as
> an integer or as a real.

Which is a dangerous pattern of thought.  "int"s don't behave like integers
- you were never guaranteed that an "int" could hold numbers outside the
range -32768 to 32767 (the first C implementation, remember, was on a 16-bit
machine) and "float"s and "double"s definitly don't behave like real numbers
(otherwise, there wouldn't be a discipline called "numerical analysis").
Given that the maximum absolute value which an "int" can hold is fairly
small, it's not particularly safe to ignore considerations of range (it's
not safe to ignore considerations of precision in "real"s either, but
integers are always precise).

Furthermore, if this variable quantity is to be used as, say, an array
subscript, and the range of valid values for that quantity is not a primary
consideration, that program stands a good chance of getting a subscript
range exception or a wild reference.

> In practice, the programmer usually will have a pretty good idea when a
> quantity is likely to violate that default assumption on a particular
> machine

I don't want them to have a pretty good idea when it's going to violate that
default assumption on a particular machine.  I want them to have a pretty
good idea when it's going to violate that default assumption on a
16-bit-"int" machine; then there won't be so much d*mn code out there that
needs a good going-over by "lint" before it'll run on PDP-11s and
8086-family-chip based machines (or before it'll handle files, or other
objects not constrained by the address space, bigger than 64KB on such
machines).

> Now we're going to have a C standard pretty soon, in all probability,
> and that may well change the default assumptions.  If "int" is
> truly defined as a 16-bit quantity, I will probably change my
> default working habits to use long -- otherwise my default
> abstraction would be violated too often.

Sorry, "there wasn't a rigorous enough standard for C until recently" is not
an excuse for using "int" to hold quantities which could, conceivably, be
outside the range -32768 to 32767.  K&R states quite clearly that "int"
takes up 16 bits on PDP-11s, so there exists at least one (very popular)
machine on which the "default assumption" about "int" would be violated.
Given the number of PDP-11s (and IBM PCs) out there, that assumption is
going to be violated quite often.  Even before the ANSI C standard
explicitly stated that you cannot count on an "int" holding values outside
that range (it does *not* "define it as a 16-bit quantity"; it defines it as
a quantity which *may* hold values outside the range that fits in a 16-bit
quantity, but which is not *guaranteed* to be able to hold values outside
that range), it was unsafe and bad practice to so use "int".  (Consider all
the postings that say "our news system truncates items with more than 64KB,
so could you please repost XXX" for an example of why it is a bad practice.)

> Up until the present, however, we have been working with a language in
> which the definition of short, int, and long were specifically machine
> dependent, and anyone porting software simply had to be aware
> of the obvious places where machine dependencies showed up.

They're *still* machine-dependent; however, some of the constraints on them
that were implicitly stated by the enumeration of C implementations in K&R
are now stated explicitly.  People *porting* software shouldn't have to be
aware of those places.  People *writing* software - even if they "know" that
it'll never be ported - should be aware of them.  Unfortunately, the people
doing the porting get stuck with cleaning up after the mess left by the
original author, and those people may have no way to force the original
author to fix the problem for them.

> > Code that uses "int" to implement objects known to have values outside
> > the range -32767 to 32767 is incorrect C.  The ANSI standard explicitly
> > indicates this.  Even in the absence of such an explicit indication in
> > an official language specification document, this information should be
> > imparted to all people being taught C.
> ----------
> Now that there is a reasonably firm draft standard, this is a reasonable
> statement.  Not very long ago it was religious dogma.

Hogwash.  If you write code that assumes an "int" can hold things outside
that range, you should put something like

	#ifdef pdp11
		FORGET IT, NO WAY, GIVE UP AND GO HOME
	#endif

to emphasise that this code will not run on a PDP-11.  Not very long ago K&R
stated that "int" was a 16-bit quantity on the PDP-11.  At the very least,
people writing that code should acknowledge the fact that it won't work on a
PDP-11 (or IBM PC, or...).  If they'd read the C Reference Manual in K&R,
they'd have known that even before there was an ANSI standard.

	Guy Harris

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (10/06/85)

> I am inclined to prefer the use of int16, int32, int64, int8, char.

int16 => short
int32 => long
int64 => not a primitive data type on all implementations
int8  => signed char
char  => char

Why add more symbols when you already have what is needed in the language?

brooks@lll-crg.ARPA (Eugene D. Brooks III) (10/06/85)

>I am inclined to prefer int8, int16, int32, ....

If that is your inclination then #define or typedef them.
You can even put it in "handy.h".  Why bother the net with it?
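
A sketch of what such a header might contain (the mappings shown assume a
32-bit-"int" machine whose "char" is signed; each port supplies its own):

	/* handy.h - machine-dependent integer-width typedefs */
	typedef char	int8;
	typedef short	int16;
	typedef long	int32;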

preece@ccvaxa.UUCP (10/07/85)

> /* Written  1:46 am  Oct  5, 1985 by guy@sun.uucp in ccvaxa:net.lang.c
> */ Given that the maximum absolute value which an "int" can hold is
> fairly small, it's not particularly safe to ignore considerations of
> range (it's not safe to ignore considerations in precision - in "real"s
> -either, but integers are always precise).
----------
Well, when I say "ignoring considerations in precision" I'm speaking
in ball-park terms.  I know when I have to worry about a value possibly
not fitting in 32 bits.  Most of the quantities most of us deal with
most of the time are small integers.
----------
> I don't want them to have a pretty good idea when it's going to violate
> that default assumption on a particular machine.  I want them to have a
> pretty good idea when it's going to violate that default assumption on
> a 16-bit-"int" machine;
----------
Well, I can see how that would make life easier for you, but it's not
really my problem.  The project I work on would have saved a lot of
time if the code we're porting hadn't been written for a system using
memory-mapped files, but I don't curse the authors for writing for the
environment they had.
----------
> K&R states quite clearly that "int" takes up 16 bits on PDP-11s, so
> there exists at least one (very popular) machine on which the "default
> assumption" about "int" would be violated.
----------
It also states quite clearly that on PDP-10s an int is 36 bits.  I
don't see why I should give more weight to one than the other.  The
definition of short, int, and long is EXPLICITLY machine dependent
in K&R.  The key phrase, in my opinion, is "'Plain' integers have the
natural size suggested by the host machine architecture; the others
are provided to meet special needs."  That is pretty explicit.
----------
> (Consider all the postings that say "our news system truncates items
> with more than 64KB, so could you please repost XXX" for an example of
> why it is a bad practice.)
----------
What has that to do with anything?  Somebody failed to anticipate
future needs and used a short when she should have used a long.
There are people who rely on two-digit year codes, too.
----------
> People *porting* software shouldn't have to be aware of those places.
> People *writing* software - even if they "know" that it'll never be
> ported - should be aware of them.
----------
Portability is one of many factors to be considered in setting local
coding standards.  I have spent a lot of time recently understanding
code written for a very different environment and converting it to C.
It had lots of size and byte-ordering problems.  That's the breaks.
It's not the authors' fault that I had different requirements than they.

Now, if I worked for a company in the business of writing software to
be marketed across a wide variety of machines and architectures,
we would have local conventions designed to make those transitions
easy.  But it is not our business to produce code that runs on PDP-11s,
let alone (as you requested in a previous posting) code that runs
efficiently on PDP-11s.
----------
> Hogwash.  If you write code that assumes an "int" can hold things
> outside that range, you should put something like
> 
> 	#ifdef pdp11
> 		FORGET IT, NO WAY, GIVE UP AND GO HOME
> 	#endif
>
> to emphasise that this code will not run on a PDP-11.
----------
I don't consider the PDP-11 to be a sacred special case.  We have code
that would break on a PDP-10, too, because it would NOT get arithmetic
exceptions where we expect them.  So?

Here's a concrete example.  I don't blame Gosling for alignment
problems we had in making Emacs run on our machines.  It wasn't
a consideration he should have had to worry about.  I DO fault
Unipress for having the same problem in the latest versions,
because that IS something they should have to worry about.

I have no objection to the principle that we should try, other things
being equal, to write portable code.  But the FIRST consideration of
good professional practice is to write code that is clear,
maintainable, and efficient in the environment for which we are paid
to produce it.  It is not bad practice to put that environment first.

-- 
scott preece
gould/csd - urbana
ihnp4!uiucdcs!ccvaxa!preece

seifert@hammer.UUCP (Snoopy) (10/07/85)

In article <1924@brl-tgr.ARPA> gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) writes:
>> I am inclined to prefer the use of int16, int32, int64, int8, char.
>
>int16 => short
>int32 => long
>int64 => not a primitive data type on all implementations
>int8  => signed char
>char  => char
>
>Why add more symbols when you already have what is needed in the language?

For clarity and portability.

Snoopy
tektronix!tekecs!doghouse.TEK!snoopy

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (10/09/85)

> >Why add more symbols when you already have what is needed in the language?
> 
> For clarity and portability.

Neither of those has been demonstrated.

mwm@ucbopal.BERKELEY.EDU (Mike (I'll be mellow when I'm dead) Meyer) (10/09/85)

In article <1924@brl-tgr.ARPA> gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) writes:
>> I am inclined to prefer the use of int16, int32, int64, int8, char.
>
>int16 => short
>int32 => long
>int64 => not a primitive data type on all implementations
>int8  => signed char
>char  => char
>
>Why add more symbols when you already have what is needed in the language?

Doug,

If someone on a machine that supports 60+ bit ints uses one in their code,
and later you have to port it, you should hope they did:

typedef	long	int60 ;		/* or whatever the type is for 60 bit ints */

and then used int60 instead of long.

You see, if they do that, then when you compile the program and notice it
giving you funny numbers (assuming, of course, that you notice :-), a single
grep will find all the places where variables you need to worry about are
declared.

Of course, if there were some standard place to look for those typedefs, and
they had included that, then when you compiled the program, it would give
you the same list as the grep.

Likewise, if *your* code uses int8/int16/whatever correctly, specifying the
number of bits needed, then their file of typedefs will get them a
reasonable type for those variables.

	<mike

hamilton@uiucuxc.CSO.UIUC.EDU (10/09/85)

too bad you can't do something like:
	#define INT(max)	\	/* /usr/include/int.h */
	#if max<32768		\	/* machine-dependent */
		short		\
	#else			\
		long		\
	#endif

	INT(20000) x;		/* -> short x; */
	INT(50000) y;		/* -> long y; */
with cpp.  maybe m4?  not real pretty, but then neither is "int16", etc.
i think it makes more sense to declare value range(s) than significant
bits.  at the least, it provides an extra degree of self-documentation.
(quick, somebody make me shut up before i say something nice about pascal!)
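
the closest a standard-conforming preprocessor can come may be one typedef
per range class, keyed off an ansi-style <limits.h> (the typedef name is
invented):

	#include <limits.h>

	#if INT_MAX >= 50000
	typedef int	big_count;	/* int is already big enough here */
	#else
	typedef long	big_count;	/* otherwise fall back to long */
	#endif

	big_count y;			/* -> long y; on a 16-bit machine */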

	wayne hamilton
UUCP:	{ihnp4,pur-ee,convex}!uiucdcs!uiucuxc!hamilton
ARPA:	hamilton@uiucuxc.cso.uiuc.edu
CSNET:	hamilton%uiucuxc@uiuc.csnet
USMail:	Box 476, Urbana, IL 61801
Phone:	(217)333-8703

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (10/10/85)

> typedef	long	int60 ;		/* or whatever the type is for 60 bit ints */

My point is, if more than 32 bits are needed then this code is
not going to port anyway, no matter what typedefs you use.

> Likewise, if *your* code uses int8/int16/whatever correctly, specifying the
> number of bits needed, then their file of typedefs will get them a
> reasonable type for those variables.

If you use the proper C data types, there will be no need to worry
about this at all; the code will work on all standard-conforming
implementations without any change whatsoever.  There is no need
to invent system-specific typedefs for any integer type through 32
bits, and for longer integer types typedefs are not sufficient.

mwm@ucbopal.BERKELEY.EDU (Mike (I'll be mellow when I'm dead) Meyer) (10/12/85)

In article <2032@brl-tgr.ARPA> gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) writes:
>> typedef	long	int60 ;		/* or whatever the type is for 60 bit ints */
>
>My point is, if more than 32 bits are needed then this code is
>not going to port anyway, no matter what typedefs you use.

If it's gotta be ported, then a port will consist of chasing down all the
really long variables, and replacing expressions that use them with the
appropriate calls on the mp library. It'd be nice if those variables were
tagged for you.

If it doesn't *have* to be ported, it would be nice if it wouldn't compile,
so that you avoid the possibility of getting bogus answers. Admittedly, this
problem could be solved by a C compiler that trapped integer overflows.
Anybody got one? :-)

>> Likewise, if *your* code uses int8/int16/whatever correctly, specifying the
>> number of bits needed, then their file of typedefs will get them a
>> reasonable type for those variables.
>
>If you use the proper C data types, there will be no need to worry
>about this at all; the code will work on all standard-conforming
>implementations without any change whatsoever.  There is no need
>to invent system-specific typedefs for any integer type through 32
>bits, and for longer integer types typedefs are not sufficient.

You missed the key word, "reasonable." It's unreasonable to declare a large
array with types twice as large as they need to be. It's non-portable to use
the builtin name for the type of reasonable size. Ergo, a typedef that says
what size it is, and a standard file that turns those typedefs into the
correct builtin type for that machine. This allows for both reasonable and
portable code.

	<mike

guy@sun.uucp (Guy Harris) (10/13/85)

> > I don't want them to have a pretty good idea when it's going to violate
> > that default assumption on a particular machine.  I want them to have a
> > pretty good idea when it's going to violate that default assumption on
> > a 16-bit-"int" machine;

> Well, I can see how that would make life easier for you, but it's not
> really my problem.  The project I work on would have saved a lot of
> time if the code we're porting hadn't been written for a system using
> memory-mapped files, but I don't curse the authors for writing for the
> environment they had.

One can write:

	int	size_of_UNIX_file;

or one can write

	long	size_of_UNIX_file;

The former is incorrect, and the latter is correct.  The two are equivalent
on 32-bit machines, so there is NO reason to write the former rather than
the latter on a 32-bit machine.  If one can write code for a more general
environment with NO extra effort other than a little thought, one should
curse an author who didn't make that extra effort.

> > (Consider all the postings that say "our news system truncates items
> > with more than 64KB, so could you please repost XXX" for an example of
> > why it is a bad practice.)

> What has that to do with anything?  Somebody failed to anticipate
> future needs and used a short when she should have used a long.

The code in question uses an "int" where it should have used a "long".
Using a "short" would have been *more* acceptable; the documentation for
this system says

	<items> can be up to 65535 bytes long (2^32 bytes in 4.1c BSD),

Since the only *real* constraint on the size of items is the amount of disk
space available and the time taken to transmit the items, neither of which
is significantly affected by the width of a processor's ALU and registers,
the system should not make the maximum item size dependent on either of
those two factors.  The ideal would have been to use "long" instead of
"int"; however, if the cost of converting item databases on PDP-11s would
have been too high, using "short" would have been acceptable.  The ideal
would have been to do something like

	#ifdef BACKCOMPAT
	typedef unsigned int itemsz_t;
	#else
	typedef unsigned long itemsz_t;
	#endif

and *not* restrict items to 65535 bytes by default; if it's really too much
trouble for a site to convert its database, then they can build a version
which is backwards-compatible with older versions.

> There are people who rely on two-digit year codes, too.

Yes, but how many of them rely on two-digit year codes on 16-bit machines
and four-digit year codes on 32-bit machines?  Not planning for future needs
may be regarded as a misfortune; having a system like the aforementioned
meet future needs or not depending on the width of a machine's registers
looks like carelessness.  (Sorry, Oscar, but I didn't think you'd mind...)
There are cases where the difference between a 16-bit machine and a 32-bit
machine *is* relevant; an example would be a program which did FFTs of large
data sets.  I have no problem with

	1) the program being written very differently for a PDP-11, which
	would have to do overlaying, or provide a software virtual memory
	system, or perform some other technique to do the FFTing on disk,
	and for a VAX, where you could (assuming you could keep the entire
	data set in *physical* memory) write it in a more straightforward
	fashion (although, if it *didn't* all fit in physical memory,
	it would have to use some techniques similar to the PDP-11
	techniques to avoid thrashing)

or

	2) saying "this program needs a machine with a large address
	space".

> Portability is one of many factors to be considered in setting local
> coding standards.  I have spent a lot of time recently understanding
> code written for a very different environment and converting it to C.
> It had lots of size and byte-ordering problems.  That's the breaks.
> It's not the authors' fault that I had different requirements than they.

In many of these cases, there is little if any gain to be had by writing
software in a non-portable fashion.  Under those circumstances, it *is* the
authors' fault that they did something one way when they could have done it
another way with little or no extra effort.  In the case of byte ordering,
it takes more effort to write something so that data is portable between
machines.  If it's a question of a program which *doesn't* try to exchange
data between machines and *still* fails on machines with a different byte
order than the machine for which it was written, there'd better have been a
significant performance improvement gained by not writing it portably.  And
in the case of using "long" vs. "int", there is NOTHING to be gained from
using "int" instead of "long" on a 32-bit machine (on a truly 32-bit
machine, "long"s and "int"s will both be 32-bit quantities unless the
implementor of C was totally out to lunch), so it SHOULD NOT BE DONE.
Period.

> But it is not our business to produce code that runs on PDP-11s,
> let alone (as you requested in a previous posting) code that runs
> efficiently on PDP-11s.

I made no such request, but I'll let that pass.  If you can get code that
runs on PDP-11s with no effort other than getting people to use C properly,
it *is* your business to get them to so use C and write portable code
whenever possible.  If your system permits code to reference location 0 (or
whatever location a null pointer points to, assuming it doesn't have a
special "invalid pointer" bit pattern), it *is* your business not to write
code which dereferences null pointers - such code is NOT valid C.
Programmer X can get away with writing code like that, if they have such a
system; programmers Y, Z, and W who work for a company which does not permit
code to get away with dereferencing null pointers have every right to stick
it to programmer X when their company's customers stick it to them because
"your machine is broken and won't run this program".

Saying "programmer X is not at fault" is blaming the victim, not the
perpetrator.

> I have no objection to the principle that we should try, other things
> being equal, to write portable code.  But the FIRST consideration of
> good professional practice is to write code that is clear,
> maintainable, and efficient in the environment for which we are paid
> to produce it.  It is not bad practice to put that environment first.

If all other things are not equal, or close to it, I have no objection to
unportable code.  The trouble is that people don't even seem to try to write
portable code when they *are* equal.  It *is* bad practice to blindly assume
that the environment you're writing for is the only interesting environment.
Some minimum amount of thought should be given to portability, even if
portability concerns are rejected.  Can you absolutely guarantee that the
people who paid you to write that code won't ever try to build it in a
different environment?  If not, by writing non-portable code you may end up
costing them *more* money in the long run; it's more expensive to
retroactively fix non-portable code than to write it portably in the first
place.

If somebody says that, now that ANSI C finally "defines 'int's as 16-bit
quantities", they'll start thinking about when it's appropriate to use
"long" and when it's appropriate to use "int", they haven't given the proper
minimum amount of thought to portability.

	Guy Harris

henry@utzoo.UUCP (Henry Spencer) (10/13/85)

> I have no objection to the principle that we should try, other things
> being equal, to write portable code.  But the FIRST consideration of
> good professional practice is to write code that is clear,
> maintainable, and efficient in the environment for which we are paid
> to produce it.  It is not bad practice to put that environment first.

It must be nice to be so confident that your environment will never,
ever, ever, change radically.  Situations where such confidence is
justified are rare; perhaps your situation is such, but this is unusual.
One major advantage of Unix is that it does *not* tie you to any single
environment... but that advantage is wasted if your own code does.  We
may be especially conscious of this because our environment is scheduled
to change radically sometime in the next year or so, but the principle
is valid in general:  making your code machine-dependent limits its
lifetime.  This is sometimes appropriate... but only sometimes.  Less
often than most people think.

With certain specific exceptions (e.g. device drivers, the insides of hot
RasterOp implementations, the insides of strcpy(), etc.), portable C code
is portably efficient as well.  Clarity and maintainability are fairly
orthogonal to portability; if anything, there is a positive correlation,
because machine-dependent microsecond-grubbing tends to be unclear and
hard to maintain too.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

bc@cyb-eng.UUCP (Bill Crews) (10/14/85)

> I have no objection to the principle that we should try, other things
> being equal, to write portable code.  But the FIRST consideration of
> good professional practice is to write code that is clear,
> maintainable, and efficient in the environment for which we are paid
> to produce it.  It is not bad practice to put that environment first.
> -- 
> scott preece

Yeah, I know, but logic and rationality aren't near as much fun as religion!
It is MUCH better to tie up our phone lines and disk space with religious
ranting and raving than with . . . (yuck!) . . . rationality.            :-)
-- 
- bc -

..!{seismo,topaz,gatech,nbires,ihnp4}!ut-sally!cyb-eng!bc  (512) 458-6609

ado@elsie.UUCP (Arthur David Olson) (10/21/85)

> If you use the proper C data types, there will be no need to worry
> . . .at all; the code will work on all standard-conforming
> implementations without any change whatsoever. . .

Hmmm. . .last time I looked there were no (as in zero) standard-conforming
implementations (a small side effect of the standard not yet having
been agreed to, no doubt).
--
C is a Jack Benny/Mel Blanc trademark.
--
	UUCP: ..decvax!seismo!elsie!ado    ARPA: elsie!ado@seismo.ARPA
	DEC, VAX and Elsie are Digital Equipment and Borden trademarks
	(Is ARPA a DARPA trademark?)

jsdy@hadron.UUCP (Joseph S. D. Yao) (10/29/85)

In article <2883@sun.uucp> guy@sun.uucp (Guy Harris) writes:
>One can write:
>	int	size_of_UNIX_file;
>or one can write
>	long	size_of_UNIX_file;
>The former is incorrect, and the latter is correct.  ...

Actually, if you are trying to write portable code, NEITHER is correct.
This particular problem is exactly why we have the typedef, off_t.

	off_t size_of_UNIX_file;

is correct for portable code.
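
For instance (a minimal sketch; the function name is invented):

	#include <sys/types.h>
	#include <sys/stat.h>

	off_t
	file_size(name)		/* returns -1 if the file can't be stat'ed */
	char *name;
	{
		struct stat st;

		if (stat(name, &st) < 0)
			return (off_t)-1;
		return st.st_size;	/* st_size is an off_t */
	}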

However, either of the above is correct for throwaway code on machines
for which each one happens to be true.  The trouble is, by not
developing good (read innocuous but portable) habits in throwaway code,
if you suddenly decide that you are an Implementor of Portable Code,
you will have a lot of trouble getting used to the "new" way of writing
code.
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}

franka@mmintl.UUCP (Frank Adams) (11/02/85)

[Not food]

In article <48@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
>The trouble is, by not
>developing good (read innocuous but portable) habits in throwaway code,
>if you suddenly decide that you are an Implementor of Portable Code,
>you will have a lot of trouble getting used to the "new" way of writing
>code.

The trouble is, you don't know what pieces of code aren't going to be
thrown away.  You may suddenly find that you *were* an Implementor of
[Not Very] Portable Code.  Better to do it right the first time.

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Multimate International    52 Oakland Ave North    E. Hartford, CT 06108

meier@srcsip.UUCP (Christopher M. Meier) (11/05/85)

In article <48@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
>
>Actually, if you are trying to write portable code, NEITHER is correct.
>This particular problem is exactly why we have the typedef, off_t.
>
>	off_t size_of_UNIX_file;
>
>is correct for portable code.
>
>However, either of the above is correct for throwaway code on machines
>for which each one happens to be true.  The trouble is, by not
>developing good (read innocuous but portable) habits in throwaway code,
>if you suddenly decide that you are an Implementor of Portable Code,
>you will have a lot of trouble getting used to the "new" way of writing
>code.
>-- 
>
>	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}

Can someone suggest a good reference (or references) for developing
good portable code?  We are writing code that will eventually be used
on machines other than our current 750 Vax running 4.2, and I would
like to make sure we won't have to spend time rewriting code.

Christopher Meier	{ihnp4!umn-cs,philabs}!srcsip!meier
Honeywell Systems & Research Center
Signal & Image Processing / AIT

guy@sun.uucp (Guy Harris) (11/11/85)

> Can someone suggest a good reference (or references) for developing
> good portable code?  We are writing code that will eventually be used
> on machines other than our current 750 Vax running 4.2, and I would
> like to make sure we won't have to spend time rewriting code.

No, but I think Harbison and Steele mentions some things in passing.
Laura Creighton is thinking of writing such a book.  Until it comes out,
here are some rules:

	Rule 1.  Run your code through "lint".
	Rule 2.  Be careful about using "int".
	Rule 3.  Run your code through "lint".
	Rule 4.  Be careful about declaring functions which return
		 things other than "int", like "long" or pointers.
	Rule 5.  Run your code through "lint".
	Rule 6.  There is NO rule 6.
	Rule 7.  Run your code through "lint".
	Rule 8.  Be careful about casting values to their proper type
		 when passing them as arguments - if a routine expects
		 a "long", or a pointer to a particular type, make sure
		 it gets it.  For instance, never pass 0 or NULL if the
		 routine expects a pointer - *always* cast it to a pointer
		 of the appropriate type (a sketch follows this list).
	Rule 9.  Run your code through "lint".
	Rule 10. Never ever ever ever ever assume that you can dereference
		 a pointer which may be null.
	Rule 11. Run your code through "lint".
	Rule 12. Never assume that the bytes of a word or a longword are
		 in a particular order.
	Rule 13. Run your code through "lint".
	Rule 14. Never assume that "char" is signed.
	Rule 15. Run your code through "lint".
	Rule 16. Never assume that you can turn a "char *" which points
		 into the middle of a word or longword into a "short *",
		 "int *", or "long *" and use the pointer in question;
		 VAXes don't impose boundary alignment restrictions, but
		 lots and lots of other machines do.
	Rule 17. Run your code through "lint".
	Rule 18. Never assume what the padding between structure members
		 is - it's machine-dependent.
	Rule 19. Run your code through "lint".
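
As an illustration of Rule 8, the sketch promised above (execl() is the
classic case: on a machine where pointers are wider than "int"s, a bare 0
would pass the wrong number of bits):

	/* execl()'s argument list must end with a null "char *" */
	execl("/bin/echo", "echo", "hello", (char *)0);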

	Guy Harris

jsdy@hadron.UUCP (Joseph S. D. Yao) (11/14/85)

Here are some more.
	>  Make as much static as possible.  (No, not electricity.)
	Rephrase: restrict the scope of all variables and functions
	as much as possible.  Use auto's and static's in preference
	to extern's.
	>  Declare all functions which return a value as such.
	>  If possible, declare non-value-returning functions as void.
	>  After (not if) you use lint, do as little type-casting as
	possible.  Instead, take a long look at what you're doing.
	Are you forgetting to check return values?  Passing more bits
	than you can use? ...  THEN cast types.
	>  If you are using an extern in exactly the same way in code
	in different functions in different modules, perhaps you can
	make a single function to do all of this, and reduce the
	scope of the extern.
	>  Do not use constants in functions.  Well, maybe 0.
	MAYBE 1 or -1.  Never 0 for a null pointer, or end-of-string.
	NEVER strings.
	>  NUL is not NULL.  ('\0' and (char *)0 may well be the
	same -- but it's not saying what you mean.  Or, did you mean
	to say that a nul character was the same object as a null
	pointer?)  See the sketch after this list.
	>  Always check.  Your return values.  Your pointers.  Your
	data, before you divide.  This is not so much "portability"
	as "defensive programming," but what the hey.

More abounds.  Want more bounds?  Keep asking.	;-)
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}