[comp.std.c] representation of integers

karl@ima.isc.com (Karl Heuer) (11/05/90)

In article <1990Nov3.204327.19057@NCoast.ORG> catfood@NCoast.ORG (Mark W. Schumann) writes:
>You are simply not allowed to assume *anything* about the internal
>representation of integers if your programs are to be 100% portable ANSI.

Actually, the Standard guarantees that bitstrings with the sign bit clear
will have the obvious interpretation as a nonnegative integer.  When the sign
bit is set things get a bit murkier: apparently the Committee intended that
two's complement, one's complement, and sign-magnitude representations are all
legal.

The relevant rule says something about "strict binary except for the sign
bit".  One interpretation would be that this says nothing at all when the sign
bit is set, and so you could have something silly like normal binary for
positives and Gray code for negatives.

Alternatively, it could mean that a bitstring b where b&HIBIT is set has the
value M*(b&~HIBIT)+A for some constants M and A.  This, together with the
required limit values (short int must include the range [-32767,+32767], etc.)
would imply that the only legal representations are the above three and a
skewed sign-magnitude (M=-1, A=-1).

A related issue is the question of whether it's required for U{type}_MAX+1 to
be a power of two, and if so, whether it must be 1<<sizeof(type)*CHAR_BIT
(false on some Cray machines, I'm told).

These are on my list of questions to send X3J11 for interpretation someday.
If somebody else gets around to it first, that's fine.

Karl W. Z. Heuer (karl@ima.isc.com or uunet!ima!karl), The Walking Lint
Followups to comp.std.c.

gwyn@smoke.brl.mil (Doug Gwyn) (11/05/90)

In article <5242@ima.ima.isc.com> karl@ima.isc.com (Karl Heuer) writes:
>Actually, the Standard guarantees that bitstrings with the sign bit clear
>will have the obvious interpretation as a nonnegative integer.  When the sign
>bit is set things get a bit murkier: apparently the Committee intended that
>two's complement, one's complement, and sign-magnitude representations are all
>legal.

Certainly those three representations are intended to be conformant.

>The relevant rule says something about "strict binary except for the sign
>bit".  One interpretation would be that this says nothing at all when the sign
>bit is set, and so you could have something silly like normal binary for
>positives and Gray code for negatives.

A "pure binary numeration system", as defined in the American National
Dictionary for Information Processing Systems, is required.  This is
elaborated in a footnote.  (This dictionary is in effect incorporated
into the C standard near the end of section 1.6.)  The bit with the
highest position need not represent a power of two, but the other bits
must represent successive powers of two, starting with 1.

This means that bitwise arithmetic on non-negative numbers is well defined
and portable (so long as no representation limit is exceeded).  Indeed,
that was the main reason for this requirement.

>A related issue is the question of whether it's required for U{type}_MAX+1 to
>be a power of two, and if so, whether it must be 1<<sizeof(type)*CHAR_BIT
>(false on some Cray machines, I'm told).

While there is no such explicit constraint, it is a logical consequence
of the integral representation requirements that the largest representable
value of any unsigned integer type must be one less than a power of two,
even on a non-binary (e.g. decimal) machine (which could consequently
use less than its "natural" range of integers).

There is no requirement that all bit patterns in a representation have a
meaning.  Thus, Cray is correct to have UINT_MAX+1 not be a power of
UCHAR_MAX+1.