[comp.std.c] sizeof in 36-bits machines

dolf@idca.tds.PHILIPS.nl (Dolf Grunbauer) (10/12/89)

Assuming a 36-bit integer (e.g. DEC-10 :-) and 8-bit bytes, what should
`sizeof(int)' return: 4, 4.5 or 5 ?
I know 4.5 is not valid, because pANSI states that the type of `sizeof' is
`size_t' (unsigned integral), but on the other hand 4 is too small and
5 too big.
Or is it simply impossible to make a compliant ANSI C compiler for such
machine/memory configuration ?
-- 
Dolf Grunbauer          Tel: +31 55 432764  Internet dolf@idca.tds.philips.nl
Philips Telecommunication and Data Systems  UUCP ....!mcvax!philapd!dolf
Dept. SSP, P.O. Box 245, 7300 AE Apeldoorn, The Netherlands

flaps@dgp.toronto.edu (Alan J Rosenthal) (10/13/89)

dolf@idca.tds.PHILIPS.nl (Dolf Grunbauer) writes:

>Assuming a 36-bit integer (e.g. DEC-10 :-) and 8-bit bytes, what should
>`sizeof(int)' return: 4, 4.5 or 5 ?
...
>Or is it simply impossible to make a compliant ANSI C compiler for such
>machine/memory configuration ?

If your integers are 36 bits, you cannot choose 8 bits as a char size.  On the
dec-10, the natural choice would be 36 bit ints and 9 bit chars.  The byte
operations on the dec-10 make 9 bit and 8 bit chars equally easy, although the
native size of choice is 7 for packing reasons.

I think you'll find that word-addressed machines tend to have sophisticated
byte-grabbing operations which allow specification of the byte size.
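
A minimal sketch of such a variable-size byte-grabbing operation, with a
36-bit word simulated in the low bits of a uint64_t.  The names load_byte
and char_at are hypothetical, and the position/size convention is only
loosely modelled on a PDP-10 byte pointer, not copied from the hardware.

```c
#include <assert.h>
#include <stdint.h>

/* A 36-bit word is simulated in the low 36 bits of a uint64_t. */
#define WORD_BITS 36u

/*
 * Extract a field of `size` bits whose lowest bit sits `pos` bits above
 * the low end of the word.  Both values are chosen by the caller, which
 * is what makes 7-, 8- and 9-bit characters equally easy to fetch.
 */
static uint64_t load_byte(uint64_t word, unsigned pos, unsigned size)
{
    assert(size >= 1 && size <= WORD_BITS);
    assert(pos + size <= WORD_BITS);
    uint64_t mask = (UINT64_C(1) << size) - 1;
    return (word >> pos) & mask;
}

/*
 * Fetch the index-th character counting from the high end of the word,
 * assuming characters are packed left to right.  Any leftover bits
 * (1 bit for 7-bit chars, 4 bits for 8-bit chars) end up at the low end.
 */
static uint64_t char_at(uint64_t word, unsigned index, unsigned size)
{
    assert((index + 1) * size <= WORD_BITS);
    return load_byte(word, WORD_BITS - (index + 1) * size, size);
}
```

With size 9 a word holds four characters and no bits are left over; with
size 7 it holds five characters and one bit is left over.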

ajr

karl@haddock.ima.isc.com (Karl Heuer) (10/13/89)

In article <272@ssp1.idca.tds.philips.nl> dolf@idca.tds.PHILIPS.nl (Dolf Grunbauer) writes:
>Assuming a 36-bit integer (e.g. DEC-10 :-) and 8-bit bytes...

No.  Objects must be composed of bytes; if the word size is 36, the only valid
byte sizes are {9, 12, 18, 36}.  In fact, this is an explicit example in the
Rationale document.

(An alternative would be to emulate a 32-bit implementation by always
discarding the upper four bits.)
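
A rough sketch of that alternative, again simulating the 36-bit word in a
uint64_t.  The helper names are hypothetical; the only point is that every
stored value is reduced to 32 bits, so that with 8-bit bytes sizeof(int)
comes out as exactly 4.

```c
#include <assert.h>
#include <stdint.h>

#define VALUE_BITS 32u                      /* bits the implementation uses */
#define VALUE_MASK UINT64_C(0xFFFFFFFF)     /* low 32 bits of the word      */

/* Discard the upper four bits of a simulated 36-bit word. */
static uint64_t wrap_to_32(uint64_t word36)
{
    return word36 & VALUE_MASK;
}

/* Unsigned addition as the emulated 32-bit implementation would see it. */
static uint64_t add32_on_36(uint64_t a, uint64_t b)
{
    return wrap_to_32(wrap_to_32(a) + wrap_to_32(b));
}
```

The four discarded bits are never visible to the program, so arithmetic
wraps at 2^32 just as it would on a genuine 32-bit machine.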

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (10/14/89)

In article <272@ssp1.idca.tds.philips.nl>, dolf@idca.tds.PHILIPS.nl (Dolf Grunbauer) writes:
|  Assuming a 36-bit integer (e.g. DEC-10 :-) and 8-bit bytes, what should
|  `sizeof(int)' return: 4, 4.5 or 5 ?

  The answer is that you have made contradictory assumptions. The size
of a byte on a 36 bit machine is 9 bits (at least on the Honeywell it
was).

  There ARE machines on which the size of a word in bits is not a
multiple of the hardware addressable byte, but all of the 36 bit
machines I've used did 9 bit bytes. There may well be exceptions.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon

gwyn@smoke.BRL.MIL (Doug Gwyn) (10/14/89)

In article <272@ssp1.idca.tds.philips.nl> dolf@idca.tds.PHILIPS.nl (Dolf Grunbauer) writes:
>Assuming a 36-bit integer (e.g. DEC-10 :-) and 8-bit bytes, what should
>`sizeof(int)' return: 4, 4.5 or 5 ?

C requires that all data types have sizes that are integral multiples
of the size of a "char" (aka "byte").  This means that a 36-bit
implementation would have to make "char" 9, 12, 18, or 36 bits.
(The minimum allowed size is 8 bits, but 8 does not divide 36 evenly.)
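
The arithmetic behind that list can be checked with a short sketch.  It
assumes an int fills one whole word with no unused bits; the function
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_CHAR_BITS 8u   /* smallest char width the Standard permits */

/* A char width is usable only if it is at least 8 bits and divides the word. */
static int is_valid_char_width(unsigned word_bits, unsigned char_bits)
{
    return char_bits >= MIN_CHAR_BITS
        && char_bits <= word_bits
        && word_bits % char_bits == 0;
}

/*
 * Store up to `cap` valid widths in `out`, in increasing order, and
 * return how many valid widths exist in total.
 */
static size_t valid_char_widths(unsigned word_bits, unsigned *out, size_t cap)
{
    size_t n = 0;
    for (unsigned w = MIN_CHAR_BITS; w <= word_bits; w++) {
        if (is_valid_char_width(word_bits, w)) {
            if (n < cap)
                out[n] = w;
            n++;
        }
    }
    return n;
}
```

For a 36-bit word this yields 9, 12, 18 and 36, which correspond to
sizeof(int) values of 4, 3, 2 and 1 respectively.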

I don't say that I LIKE this; I would have preferred permitting such
implementations to express sizeof in terms of bits, not bytes.  However,
that's what was officially decided.

meissner@dg-rtp.dg.com (Michael Meissner) (10/14/89)

In article <272@ssp1.idca.tds.philips.nl> dolf@idca.tds.PHILIPS.nl
(Dolf Grunbauer) writes:

>  Assuming a 36-bit integer (e.g. DEC-10 :-) and 8-bit bytes, what should
>  `sizeof(int)' return: 4, 4.5 or 5 ?
>  I know 4.5 is not valid, because pANSI states that the type of `sizeof' is
>  `size_t' (unsigned integral), but on the other hand 4 is too small and
>  5 too big.
>  Or is it simply impossible to make a compliant ANSI C compiler for such
>  machine/memory configuration ?

The above implementation is not legal ANSI C, since ANSI C (in the
environment section, I think, but I don't have a copy of the draft at
home) mandates that each object must be an integral number of bytes.
In the case of the DEC-10, you would use 9-bit bytes.
--

Michael Meissner, Data General.				If compiles where much
Uucp:		...!mcnc!rti!xyzzy!meissner		faster, when would we
Internet:	meissner@dg-rtp.DG.COM			have time for netnews?

rhg@cpsolv.UUCP (Richard H. Gumpertz) (10/15/89)

I see no reason that char could not be 8 bits on a PDP-10.  sizeof(long) might
then be 4.  There is nothing in C that talks about the bit-size of any type
being a multiple of the bit-size of char.  There is only stuff that talks about
sizeof.  There is nothing that prohibits extra bits between chars (e.g. the
low-order 4 bits for PDP-10 style 8-bit chars) as long as the addressing
mechanism skips over them.  Hence, as long as adding 4 to a char * really adds
1 to the word-address portion of a PDP-10 byte pointer, all should work fine.
Note that char * might be implemented as a PDP-10 style (LDB) byte-pointer
while int * might be a word pointer.  Conversion would happen upon type-casting.
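
A sketch of the pointer arithmetic described above, assuming four 8-bit
chars per 36-bit word.  The struct and function names are hypothetical,
and a real PDP-10 byte pointer is laid out differently; only the
word/slot bookkeeping is illustrated.

```c
#include <assert.h>
#include <stddef.h>

#define CHARS_PER_WORD 4u   /* four 8-bit chars per word, 4 bits unused */

/* A char pointer modelled as a word address plus a slot within the word. */
struct char_ptr {
    size_t   word;   /* word-address portion   */
    unsigned slot;   /* 0 .. CHARS_PER_WORD-1  */
};

/* Advance a char pointer by n chars, carrying into the word address. */
static struct char_ptr char_ptr_add(struct char_ptr p, size_t n)
{
    size_t linear = p.word * CHARS_PER_WORD + p.slot + n;
    struct char_ptr r;
    r.word = linear / CHARS_PER_WORD;
    r.slot = (unsigned)(linear % CHARS_PER_WORD);
    return r;
}

/* Conversion on a cast to int *: only word-aligned char pointers qualify. */
static size_t char_ptr_to_word_ptr(struct char_ptr p)
{
    assert(p.slot == 0);
    return p.word;
}

/* Conversion on a cast from int * back to char *. */
static struct char_ptr word_ptr_to_char_ptr(size_t word)
{
    struct char_ptr r;
    r.word = word;
    r.slot = 0;
    return r;
}
```

Adding 4 to a char pointer in this model moves the word address by exactly
1 and leaves the slot unchanged.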

I forget whether char must hold at least 8 bits and don't have a standard in
front of me.  If 7 bits is legal, the above discussion applies there as well,
with sizeof(int) being 5!  This would match traditional PDP-10 style ASCII!

Alternatively, you could just make sizeof(int)=sizeof(char)=1 and then have a
more traditional addressing scheme (even though char[...] might be inefficient).

I prefer the former implementation.
-- 
==========================================================================
| Richard H. Gumpertz    rhg@cpsolv.UUCP -or- ...uunet!amgraf!cpsolv!rhg |
| Computer Problem Solving, 8905 Mohawk Lane, Leawood, Kansas 66206-1749 |
==========================================================================

gwyn@smoke.BRL.MIL (Doug Gwyn) (10/16/89)

In article <398@cpsolv.UUCP> rhg@cpsolv.uucp (Richard H. Gumpertz) writes:
>I see no reason that char could not be 8 bits on a PDP-10.  sizeof(long) might
>then be 4.  There is nothing in C that talks about the bit-size of any type
>being a multiple of the bit-size of char.

Wrong.  "Except for bit-fields, objects are composed of contiguous
sequences of one or more bytes, ..." (Section 1.6).  And "byte",
"character", and "char" are defined as denoting essentially the same
thing (differing only in the property being emphasized).

>There is nothing that prohibits extra bits between chars (e.g. the low-order
>4 bits for PDP-10 style 8-bit chars) as long as the addressing mechanism
>skips over them.

This is allowed ONLY if the "extra" bits are also skipped in accessing
integer, etc. data.  In other words, the implementor need not use all the
bits in a word, but if he's going to ignore some of them in the char array
context, he must ignore them also in other contexts.  It would seem to be
a pretty dumb implementation decision to do that.

>I forget whether char must hold at least 8 bits ...

Yes, the Standard in effect requires that a char/byte have at least 8 bits.

>Alternatively, you could just make sizeof(int)=sizeof(char)=1

An implementor can do that, and it might be a reasonable choice on a word-
oriented machine where byte extraction is horribly expensive and there is
plenty of memory available for applications.

karl@haddock.ima.isc.com (Karl Heuer) (10/17/89)

In article <11300@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>[a 36-bit machine having 8-bit char would be] allowed ONLY if the "extra"
>bits are also skipped in accessing integer, etc. data.  In other words, the
>implementor need not use all the bits in a word, but if he's going to ignore
>some of them in the char array context, he must ignore them also in other
>contexts.  It would seem to be a pretty dumb implementation decision to do
>that.

I dunno, it might be useful for porting ATWAV code to a PDP-10.

>>Alternatively, you could just make sizeof(int)=sizeof(char)=1

It remains to be seen whether this is legal, and if so, what happens when the
input stream contains a bit pattern that compares equal to the value of EOF.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
#define ATWAV All-The-World's-A-Vax.

gwyn@smoke.BRL.MIL (Doug Gwyn) (10/17/89)

In article <14904@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
>>>Alternatively, you could just make sizeof(int)=sizeof(char)=1
>It remains to be seen whether this is legal, and if so, what happens when the
>input stream contains a bit pattern that compares equal to the value of EOF.

There wouldn't be a problem for the characters corresponding to the C
source character set.  There certainly might be for other characters.
I think our tentative resolution of this was that it IS a permitted
implementation, although there was some question as to whether X3J11
had actually intended for it to be.  Thus it should be raised as a
formal request for interpretation, in order to get a definitive ruling.

diamond@csl.sony.co.jp (Norman Diamond) (10/18/89)

Some poster whose name was deleted:

>>>Alternatively, you could just make sizeof(int)=sizeof(char)=1

Doug Gwyn replied to the posting but did not comment on this sentence.

In article <14904@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:

>It remains to be seen whether this is legal, and if so, what happens when the
>input stream contains a bit pattern that compares equal to the value of EOF.

There is a reason why Mr. Gwyn did not comment on that particular
sentence.  If sizeof(int)==sizeof(char), indeed it is possible that the
input stream might contain a bit pattern that compares equal to the
value of EOF.  The programmer must test feof().  I believe Mr. Gwyn
once remarked that he found this distasteful but got used to it.
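
A minimal sketch of a read loop written that way, using only standard
stdio calls.  The function name count_chars is hypothetical.  On an
ordinary implementation, where int is wider than char, the extra tests
are redundant but harmless.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Count the characters in a stream without trusting EOF alone.
 * Returns the count, or -1 if a read error occurred.
 */
static long count_chars(FILE *fp)
{
    long n = 0;

    assert(fp != NULL);
    for (;;) {
        int c = getc(fp);
        if (c == EOF) {
            if (feof(fp))
                break;          /* genuine end of input */
            if (ferror(fp))
                return -1;      /* genuine read error   */
            /*
             * Neither indicator is set: c is a data character whose
             * value happens to equal EOF.  This can only occur where
             * sizeof(int) == sizeof(char).
             */
        }
        n++;
    }
    return n;
}
```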

-- 
Norman Diamond, Sony Corp. (diamond%ws.sony.junet@uunet.uu.net seems to work)
  Should the preceding opinions be caught or     |  James Bond asked his
  killed, the sender will disavow all knowledge  |  ATT rep for a source
  of their activities or whereabouts.            |  licence to "kill".

gwyn@smoke.BRL.MIL (Doug Gwyn) (10/19/89)

In article <10969@riks.csl.sony.co.jp> diamond@ws.sony.junet (Norman Diamond) writes:
->>>Alternatively, you could just make sizeof(int)=sizeof(char)=1
-Doug Gwyn replied to the posting but did not comment on this sentence.

Yes, I did.

-In article <14904@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
->It remains to be seen whether this is legal, and if so, what happens when the
->input stream contains a bit pattern that compares equal to the value of EOF.
-There is a reason why Mr. Gwyn did not comment on that particular
-sentence.  If sizeof(int)==sizeof(char), indeed it is possible that the
-input stream might contain a bit pattern that compares equal to the
-value of EOF.  The programmer must test feof().  I believe Mr. Gwyn
-once remarked that he found this distasteful but got used to it.

I also responded to Karl's followup that you cited.  Your last sentence
does not reflect what I said about the situation.