[comp.lang.c] What breaks?

chu@acsu.buffalo.edu (john c chu) (01/15/91)

In article <1991Jan13.220958.16568@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:

[model deleted]
>It is intuitively appealing, but I would be surprised to see anyone
>implementing it:  it would break far too much badly-written software.

Can someone please tell me what would break under that model and why?
It's not that I don't believe it, or that I want to write code that
will break. It's that I want to avoid making the unwarranted
assumptions that would lead to my code breaking.

Reply via E-mail unless this is of general interest. (I have no idea.)

					john
				chu@autarch.acsu.buffalo.edu

henry@zoo.toronto.edu (Henry Spencer) (01/15/91)

In article <54379@eerie.acsu.Buffalo.EDU> chu@acsu.buffalo.edu (john c chu) writes:
>>It is intuitively appealing, but I would be surprised to see anyone
>>implementing it:  it would break far too much badly-written software.
>
>Can someone please tell me what would break under that model and why?

There is an awful lot of crufty, amateurish code -- notably the Berkeley
kernel networking stuff, but it's not alone -- which has truly pervasive
assumptions that int, long, and pointers are all the same size:  32 bits.

At least one manufacturer of 64-bit machines has 32-bit longs and 64-bit
long longs for exactly this reason.

The problem can largely be avoided if you define symbolic names for your
important types (say, for example, net32_t for a 32-bit number in a TCP/IP
header) and consistently use those types, with care taken when converting
between them, moving them in and out from external storage, and passing
them as parameters.  This is a nuisance.  It's a lot easier to just treat
all your major types as interchangeable, but God will get you for it.
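
A sketch of what the disciplined version might look like (the names and the
particular underlying types are only examples, to be adjusted per machine):

        /* hypothetical "nettypes.h" -- edited for each machine */
        typedef unsigned long net32_t;   /* exactly 32 bits on *this* machine */
        typedef unsigned short net16_t;  /* exactly 16 bits on *this* machine */

        /* elsewhere: parameters say what they carry, and conversions
           are done with care, in one place */
        void send_seq(net32_t seq);

        net32_t
        to_net32(unsigned long n)
        {
                return (net32_t)(n & 0xFFFFFFFF);
        }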
-- 
If the Space Shuttle was the answer,   | Henry Spencer at U of Toronto Zoology
what was the question?                 |  henry@zoo.toronto.edu   utzoo!henry

adeboer@gjetor.geac.COM (Anthony DeBoer) (01/16/91)

In article <1991Jan15.053356.2631@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>In article <54379@eerie.acsu.Buffalo.EDU> chu@acsu.buffalo.edu (john c chu) writes:
>>>It is intuitively appealing, but I would be surprised to see anyone
>>>implementing it:  it would break far too much badly-written software.
>>
>>Can someone please tell me what would break under that model and why?
>
>There is an awful lot of crufty, amateurish code -- notably the Berkeley
>kernel networking stuff, but it's not alone -- which has truly pervasive
>assumptions that int, long, and pointers are all the same size:  32 bits.
>
>At least one manufacturer of 64-bit machines has 32-bit longs and 64-bit
>long longs for exactly this reason.
>
>The problem can largely be avoided if you define symbolic names for your
>important types (say, for example, net32_t for a 32-bit number in a TCP/IP
>header) and consistently use those types, with care taken when converting
>between them, moving them in and out from external storage, and passing
>them as parameters.  This is a nuisance.  It's a lot easier to just treat
>all your major types as interchangeable, but God will get you for it.

It seems to me that there really isn't any _portable_ way to declare a 32-bit
long, for example.  Not that I would want to advocate changing the syntax of C
[again], but for most software the key thing is that the integer has at least
enough bits, rather than a precise number of them, so perhaps if there was
some declaration sequence like "int[n] variable", where n was the minimum
number of bits needed, and the compiler substituted the appropriate integer
size that met the requirement (so an int[10] declaration would get 16-bit
integers, for example), then the language might take a step toward
portability.  A bitsizeof() operator that told you how many bits you actually
had to play with might help too, but even then you'd have to allow for
machines that didn't use two's complement representation.

I suppose it's only a minor pain now: use a type called net32_t or int32_t and
define all your types in a short header file you rewrite on each new machine
on which you're going to compile.  There are enough funny situations possible
that it's probably best for a human programmer to look at the application and
the architecture and call the shots.  We've got to earn our salaries once in a
while :-)
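
For what it's worth, such a header can at least be made to complain at compile
time when a guess goes wrong on a new machine (a sketch; the type names and
the underlying types chosen are only examples):

        /* hypothetical "machtypes.h" -- rewritten for each target machine */
        #include <limits.h>

        typedef short         int16;    /* meant to be at least 16 bits */
        typedef long          int32;    /* meant to be at least 32 bits */
        typedef unsigned long uint32;

        /* crude tripwires: the array size becomes -1, and the compiler
           complains, if a typedef turns out narrower than intended
           (padding bits aside, sizeof * CHAR_BIT is the storage width) */
        typedef char check16[sizeof(int16) * CHAR_BIT >= 16 ? 1 : -1];
        typedef char check32[sizeof(int32) * CHAR_BIT >= 32 ? 1 : -1];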
-- 
Anthony DeBoer - NAUI #Z8800                           adeboer@gjetor.geac.com
Programmer, Geac J&E Systems Ltd.             uunet!jtsv16!geac!gjetor!adeboer
Toronto, Ontario, Canada             #include <std.random.opinions.disclaimer>

gwyn@smoke.brl.mil (Doug Gwyn) (01/18/91)

In article <1991Jan15.202123.14223@gjetor.geac.COM> adeboer@gjetor.geac.COM (Anthony DeBoer) writes:
>It seems to me that there really isn't any _portable_ way to declare a 32-bit
>long, for example.

There is no portable way to declare any integral type constrained to use
precisely 32 bits in its representation.  However, "long" portably declares
one that has AT LEAST 32 bits in its representation (or, you could express
this in terms of the guaranteed range of representable values).  net32_t
is hopeless for the first case and unnecessary for the second.
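
A small illustration in terms of those guaranteed ranges (a sketch; the
guarantees themselves live in <limits.h>):

        #include <limits.h>

        /* the Standard requires LONG_MAX >= 2147483647 and
           LONG_MIN <= -2147483647, so this can never trigger: */
        #if LONG_MAX < 2147483647
        #error "long narrower than the Standard permits"
        #endif

        long total;            /* safe wherever a 32-bit range is enough */
        unsigned long bits;    /* at least 32 value bits, possibly more  */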

henry@zoo.toronto.edu (Henry Spencer) (01/18/91)

In article <14890@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>There is no portable way to declare any integral type constrained to use
>precisely 32 bits in its representation.  However, "long" portably declares
>one that has AT LEAST 32 bits in its representation (or, you could express
>this in terms of the guaranteed range of representable values).  net32_t
>is hopeless for the first case and unnecessary for the second.

Uh, Doug, please don't confuse the two slightly different threads of
discussion here.  You're thinking of int32 ("I want a way to ask for ints
of at least 32 bits"), not net32_t ("I'll adjust the definition of this
typedef so it gives me exactly 32 bits").

There is no portable way to declare a type with *exactly* 32 bits, and
a TCP/IP sequence number (for example) is exactly 32, no more.  Life with
64-bit longs would be a whole lot easier if the authors of certain kernel
networking software -- for example -- had consistently used a net32_t
typedef rather than int and long.
-- 
If the Space Shuttle was the answer,   | Henry Spencer at U of Toronto Zoology
what was the question?                 |  henry@zoo.toronto.edu   utzoo!henry

datangua@watmath.waterloo.edu (David Tanguay) (01/18/91)

In article <1991Jan18.044948.27943@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>In article <14890@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>>There is no portable way to declare any integral type constrained to use
>>precisely 32 bits in its representation.
>
>There is no portable way to declare a type with *exactly* 32 bits, and
>a TCP/IP sequence number (for example) is exactly 32, no more.

How about: (Standard C only)

typedef struct { long it:32; } net32_t;
#define net32_t(var) var.it

and then all accesses to variables are wrapped in net32_t(var).

	void
happy( void )
{
	net32_t it;

	...
	net32_t(it) = 123456;
	++net32_t(it);
	net32_t(some_global) = net32_t(it) + 42;
	...
}

I don't like sticking the 32 in the type name, since that may change,
so just consider the above as an illustration of a technique.
Ugly, I think, but it should accomplish what you want.
-- 
David Tanguay            Software Development Group, University of Waterloo

gerst@ecs.umass.edu (01/18/91)

Reply-To: lloyd@ucs.umass.edu

>Subject: Re: What breaks? (was Re: 64 bit longs?)
>From: adeboer@gjetor.geac.COM (Anthony DeBoer)
>

>It seems to me that there really isn't any _portable_ way to declare a 32-bit
>long, for example.  Not that I would want to advocate changing the syntax of C
>[again], but for most software the key thing is that the integer has at least
>enough bits, rather than a precise number of them, so perhaps if there was
>some declaration sequence like "int[n] variable", where n was the minimum
>number of bits needed, and the compiler substituted the appropriate integer
>size that met the requirement (so an int[10] declaration would get 16-bit
>integers, for example), then the language might take a step toward
>portability.  A bitsizeof() operator that told you how many bits you actually
>had to play with might help too, but even then you'd have to allow for
>machines that didn't use two's complement representation.

gack-o-matic! PL/1! run away! run away! :)

IMHO, the ideal language would have two forms of scalar values:

     1) ranges (min..max) 
     2) bitfields on a non-struct basis.

This would solve sooooooooooooooo many headaches.  Of course C has neither
of these, thus giving me serious migraines :)  

[ stuff deleted ]

>-- 
>Anthony DeBoer - NAUI #Z8800                           adeboer@gjetor.geac.com
>Programmer, Geac J&E Systems Ltd.             uunet!jtsv16!geac!gjetor!adeboer
>Toronto, Ontario, Canada             #include <std.random.opinions.disclaimer>

Chris Lloyd - lloyd@ucs.umass.edu
"The more languages I learn, the more I dislike them all" - me

henry@zoo.toronto.edu (Henry Spencer) (01/19/91)

In article <1991Jan18.094133.16879@watmath.waterloo.edu> datangua@watmath.waterloo.edu (David Tanguay) writes:
>>There is no portable way to declare a type with *exactly* 32 bits, and
>>a TCP/IP sequence number (for example) is exactly 32, no more.
>
>How about: (Standard C only)
>
>typedef struct { long it:32; } net32_t;
>#define net32_t(var) var.it

Unfortunately, there is no guarantee that padding won't get added to the
end of that struct to bring it up to a size that the hardware likes.
The number will be only 32 bits -- modulo all the fuzziness of bitfields,
which are quite implementation-dependent -- but you won't be able to
declare headers using this.
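
A sketch of the sort of tripwire that catches the padding problem early (and
note that a bit-field of type long, rather than plain int, is not something
every implementation is obliged to accept):

        #include <assert.h>
        #include <limits.h>

        typedef struct { long it:32; } net32_t;   /* the suggestion above */

        void
        check_net32(void)
        {
                /* an implementation is free to pad this struct out to the
                   size it likes (8 bytes, say), in which case an array of
                   net32_t no longer lines up with the 32-bit fields on the
                   wire; a run-time tripwire at least catches that early */
                assert(sizeof(net32_t) * CHAR_BIT == 32);
        }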

>I don't like sticking the 32 in the type name, since that may change,

It would be better to use something like seq_t for a TCP/IP sequence
number, for graceful handling of later extensions.  However, the whole
point of this example is that the TCP/IP specification guarantees that
the field in the header structure is exactly, precisely 32 bits, and that
is not going to change.  (Extensions are by options that add data later,
not by changes to the basic structure.)  You would use net32_t in cases
where you really did want exactly 32 bits and no backtalk.
-- 
If the Space Shuttle was the answer,   | Henry Spencer at U of Toronto Zoology
what was the question?                 |  henry@zoo.toronto.edu   utzoo!henry

gwyn@smoke.brl.mil (Doug Gwyn) (01/19/91)

In article <1991Jan18.044948.27943@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>In article <14890@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>>There is no portable way to declare any integral type constrained to use
>>precisely 32 bits in its representation.  However, "long" portably declares
>>one that has AT LEAST 32 bits in its representation (or, you could express
>>this in terms of the guaranteed range of representable values).  net32_t
>>is hopeless for the first case and unnecessary for the second.
>Uh, Doug, please don't confuse the two slightly different threads of
>discussion here.

But the discussion has bifurcated, and I was responding specifically to that
(which is why there are two distinct cases considered in my posting).

>There is no portable way to declare a type with *exactly* 32 bits, ...

And there is no guarantee that such a type even exists, which is why I
disagree to some extent with ...

>Life with 64-bit longs would be a whole lot easier if the authors of
>certain kernel networking software -- for example -- had consistently
>used a net32_t typedef rather than int and long.

Life would only be easier for systems whose C implementations provided
an integral type of exactly 32 bits.  On the other hand, if the code had
been designed to work anywhere that AT LEAST 32 bits were supported, it
would have made life even easier.  To do the latter it would have
sufficed to use (unsigned) long.  The main advantage of a typedef would
simply be for purposes of encapsulating an abstract data type (netaddr_t,
for example).  While that is important, I want to be sure that nobody
mistakenly believes that it legitimizes building in assumptions about
EXACT sizes for integral representations.
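
In other words, something along these lines (names are illustrative only):

        /* the typedef documents what the value is, not how wide it is */
        typedef unsigned long netaddr_t;   /* an internet address   */
        typedef unsigned long netseq_t;    /* a TCP sequence number */

        netseq_t
        seq_add(netseq_t seq, unsigned long len)
        {
                /* behave as 32-bit arithmetic even if unsigned long is
                   wider -- wrap explicitly rather than assuming overflow */
                return (seq + len) & 0xFFFFFFFF;
        }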

bright@nazgul.UUCP (Walter Bright) (01/19/91)

In article <54379@eerie.acsu.Buffalo.EDU> chu@acsu.buffalo.edu (john c chu) writes:
/In article <1991Jan13.220958.16568@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
/>It is intuitively appealing, but I would be surprised to see anyone
/>implementing it:  it would break far too much badly-written software.
/Can someone please tell me what would break under that model and why?
/It's not that I don't believe it, or that I want to write code that
/will break. It's that I want to avoid making the unwarranted
/assumptions that would lead to my code breaking.

Code that will break:
	char *p;
	unsigned long x;
	...
	/* Read 4 bytes out of buffer */
	x = *(unsigned long*)p;  /* assume bytes are in correct order */
Also:
	x >>= 24;
	/* Now assume that x < 256 */
Also:
	x &= 0xFFFFFFFE;	/* clear bit 0 */
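
Roughly how the same three operations can be written so that they keep
working when unsigned long is wider than 32 bits (a sketch):

	unsigned char *p;
	unsigned long x;
	...
	/* read exactly 4 bytes, in a stated (here big-endian) order */
	x = ((unsigned long)p[0] << 24) | ((unsigned long)p[1] << 16) |
	    ((unsigned long)p[2] <<  8) |  (unsigned long)p[3];

	x = (x >> 24) & 0xFF;    /* mask; don't assume only 8 bits remain */

	x &= ~(unsigned long)1;  /* clear bit 0 only; with 64-bit longs,
				    x &= 0xFFFFFFFE would also zero bits
				    32 and up */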

henry@zoo.toronto.edu (Henry Spencer) (01/21/91)

In article <14896@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>>There is no portable way to declare a type with *exactly* 32 bits, ...
>
>And there is no guarantee that such a type even exists...

In that case, that machine is going to have real trouble declaring, say,
a TCP header structure.  It is always possible to find machines so badly
broken that you can't cope with them.
-- 
If the Space Shuttle was the answer,   | Henry Spencer at U of Toronto Zoology
what was the question?                 |  henry@zoo.toronto.edu   utzoo!henry

datangua@watmath.waterloo.edu (David Tanguay) (01/21/91)

In article <1991Jan21.025706.7152@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
|>>There is no portable way to declare a type with *exactly* 32 bits, ...
|>And there is no guarantee that such a type even exists...
|In that case, that machine is going to have real trouble declaring, say,
|a TCP header structure.  It is always possible to find machines so badly
|broken that you can't cope with them.

What about 36 bit machines? Or did you really mean "at least" 32 bits,
and not "exactly"? If the TCP software requires an (non-bitfield) integral
type of exactly 32 bits then it is the TCP software that is broken.
There should be no problem with slop bits in the internal representation
of the structure.
-- 
David Tanguay            Software Development Group, University of Waterloo

henry@zoo.toronto.edu (Henry Spencer) (01/22/91)

In article <1991Jan21.113942.24379@watmath.waterloo.edu> datangua@watmath.waterloo.edu (David Tanguay) writes:
>|>>There is no portable way to declare a type with *exactly* 32 bits, ...
>|In that case, that machine is going to have real trouble declaring, say,
>|a TCP header structure.
>
>What about 36 bit machines? Or did you really mean "at least" 32 bits,
>and not "exactly"? If the TCP software requires an (non-bitfield) integral
>type of exactly 32 bits then it is the TCP software that is broken.

The TCP header, being part of a network protocol, is defined down to the bit
independently of any specific machine.  Its address fields, for example,
are 32 bits.  Not 31, not 33, and certainly not 36.  32 and only 32.
A machine with a sufficiently weird structure may have trouble declaring
a TCP header using a C struct.  That is the machine's problem.  The 36-bit
machines have a long history of difficulty in dealing with protocols and
devices built around 8/16/32 bit conventions; they cope using various
kludges.
-- 
If the Space Shuttle was the answer,   | Henry Spencer at U of Toronto Zoology
what was the question?                 |  henry@zoo.toronto.edu   utzoo!henry

gwyn@smoke.brl.mil (Doug Gwyn) (01/22/91)

In article <1991Jan21.025706.7152@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>In article <14896@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>>>There is no portable way to declare a type with *exactly* 32 bits, ...
>>And there is no guarantee that such a type even exists...
>In that case, that machine is going to have real trouble declaring, say,
>a TCP header structure.  It is always possible to find machines so badly
>broken that you can't cope with them.

I don't agree that a machine whose integer types all have sizes different
from 32 bits is "broken".  I've implemented programs that had similar data
packing requirements on architectures that didn't match the packing
boundaries; it's a solvable problem, although one has to take more care
than most hackers feel they can be bothered with.

gwyn@smoke.brl.mil (Doug Gwyn) (01/22/91)

In article <1991Jan21.130932.2982@odin.diku.dk> thorinn@diku.dk (Lars Henrik Mathiesen) writes:
>The TCP _protocol_ defines certain fields as exactly 32 bits in a
>specific order on the wire. For any machine the network interface will
>define a mapping between bits-on-the-wire and bits-in-machine-words;
>in a 36-bit-word machine this mapping cannot be as simple as in a
>byte-addressable machine.

It should involve roughly the same amount of effort.  For example, define
the system's internal representation of packet data as an array of words
with the data contained in the low-order 32 bits of each word (also define
whether big- or little-endian).  At some point the network hardware
interface is going to have to be spoon-fed the 32 bits in some form that
it can cope with; it is easy enough to take care of the unpacking at that
point.
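
A sketch of that spoon-feeding step, under the stated assumptions (the data
sits in the low-order 32 bits of each internal word, big-endian on the wire):

        /* hand the interface n 32-bit quantities as 4 octets each,
           in network (big-endian) order, regardless of word size */
        void
        pack32(unsigned char *octet, const unsigned long *word, int n)
        {
                int i;

                for (i = 0; i < n; i++) {
                        octet[0] = (word[i] >> 24) & 0xFF;
                        octet[1] = (word[i] >> 16) & 0xFF;
                        octet[2] = (word[i] >>  8) & 0xFF;
                        octet[3] =  word[i]        & 0xFF;
                        octet += 4;
                }
        }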

I've had a fair amount of experience in dealing with communication between
36-bit and 8-bit addressable architectures (DEC mainframes vs. minis);
these issues have been solved long ago, and it wasn't very difficult.

datangua@watmath.waterloo.edu (David Tanguay) (01/22/91)

In article <1991Jan21.190008.1291@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>The TCP header, being part of a network protocol, is defined down to the bit
>independently of any specific machine.  Its address fields, for example,
>are 32 bits.  Not 31, not 33, and certainly not 36.  32 and only 32.
>A machine with a sufficiently weird structure may have trouble declaring
>a TCP header using a C struct.

32 bits is the size of the transmitted info. Why does the C structure
representing that header have to match it exactly? Only the routines
that read and write the header should have to care about whether the
C structure matches the transmitted packet bit for bit, and adjust
accordingly.  A bitfield might be required to get proper arithmetic on
the field.
-- 
David Tanguay            Software Development Group, University of Waterloo

henry@zoo.toronto.edu (Henry Spencer) (01/23/91)

In article <1991Jan22.111903.28782@watmath.waterloo.edu> datangua@watmath.waterloo.edu (David Tanguay) writes:
>32 bits is the size of the transmitted info. Why does the C structure
>representing that header have to match it exactly? Only the routines
>that read and write the header should have to care about whether the
>C structure matches the transmitted packet bit for bit...

In an orthodox TCP/IP implementation, those "routines" are all of TCP.
One of the things the funny machines end up doing is putting kludge
routines in between the network representation and the internal
representation.  You still have to write those routines, however.
That can be hard; C bitfields can't span words in most implementations,
so you may end up doing shift-and-mask instead.
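
For example, a shift-and-mask extractor for a field that may straddle the
machine's storage units (a sketch; it assumes the header is held as an array
of 16-bit units, with bit 0 the most significant bit of the first unit, and
it goes one bit at a time for clarity rather than speed):

        /* return "width" bits starting "bit" bits into the buffer; a C
           bitfield could not span a unit boundary, but this can */
        unsigned long
        getfield(const unsigned short *buf, int bit, int width)
        {
                unsigned long v = 0;
                int i;

                for (i = 0; i < width; i++, bit++)
                        v = (v << 1) | ((buf[bit / 16] >> (15 - bit % 16)) & 1);
                return v;
        }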
-- 
If the Space Shuttle was the answer,   | Henry Spencer at U of Toronto Zoology
what was the question?                 |  henry@zoo.toronto.edu   utzoo!henry