[comp.lang.c] Bit Addressable Architectures

crowl@cs.rochester.edu (Lawrence Crowl) (03/05/88)

In article <1988Mar3.182645.703@utzoo.uucp> henry@utzoo.uucp
(Henry Spencer) writes:
>I once had the opportunity to ask Bill Wulf what he thought of bit-oriented
>machines; his answer was "I wish they weren't so damned slow".  I'm afraid
>I haven't seen anything since that invalidates that assessment.  There is
>something to be said for providing bit addressability, but one must realize
>that actually exploiting it will be slow and that there will still be a
>large payoff for trying to work on byte or word boundaries whenever possible.

It seems to me that aligned access to all items larger than a bit would allow
a bit addressable machine to be every bit as fast as a byte or word addressable
machine.  Am I missing something?

A bit addressable machine would allow us to use single bits, nibbles, BCD, etc.
with much greater ease.  Besides, bit addressability seems "right".  (I know,
"right" isn't a rational statement!)
-- 
  Lawrence Crowl		716-275-9499	University of Rochester
		      crowl@cs.rochester.edu	Computer Science Department
...!{allegra,decvax,rutgers}!rochester!crowl	Rochester, New York,  14627

bcase@Apple.COM (Brian Case) (03/05/88)

In article <7374@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes:
>In article <1988Mar3.182645.703@utzoo.uucp> henry@utzoo.uucp
>(Henry Spencer) writes:
>It seems to me that aligned access to all items larger than a bit would allow
>a bit addressable machine to be every bit as fast as a byte or word addressable
>machine.  Am I missing something?

Yes, the alignment network is always there whether an instruction uses it or
not.

henry@utzoo.uucp (Henry Spencer) (03/06/88)

> >I once had the opportunity to ask Bill Wulf what he thought of bit-oriented
> >machines; his answer was "I wish they weren't so damned slow".  I'm afraid
> >I haven't seen anything since that invalidates that assessment.  There is
> >something to be said for providing bit addressability, but one must realize
> >that actually exploiting it will be slow and that there will still be a
> >large payoff for trying to work on byte or word boundaries whenever possible.
> 
> It seems to me that aligned access to all items larger than a bit would allow
> a bit addressable machine to be every bit as fast as a byte or word addressable
> machine.  Am I missing something?

No and yes.

No, in that this is exactly what I said in the last sentence of my comments,
although somewhat obscurely.  (Note that "bit-oriented" and "bit-addressable"
aren't the same thing in the terminology I was using.)  As an extreme case,
one can envision a bit-addressable machine -- that is, one whose pointers
use the low-order three bits to indicate a bit within a byte -- that traps
whenever those bits aren't zero, leaving the actual use of bit pointers
entirely up to the software.  When all accesses were in fact aligned, this
would incur essentially no overhead except the reduction in address space.

Yes, in that almost any attempt to make bit-aligned objects easier to handle
is going to mean extra hardware, quite possibly in a critical path where
every added gate slows the whole machine down.  Even if it's not in a
critical path, it will steal chip area from other things that could boost
performance.  The tradeoffs depend on the design details.
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry

earl@mips.COM (Earl Killian) (03/08/88)

In article <7374@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes:

   A bit addressable machine would allow us to use single bits,
   nibbles, BCD, etc.  with much greater ease.  Besides, bit
   addressability seems "right".  (I know, "right" isn't a rational
   statement!)

It's more right in certain environments.  For example the TI 34010
graphics processor is bit-addressed, which is a good match for pixel
operations.  Also, when we have 64-bit addresses, using bit addresses
will make sense (this is independent of whether you have bit
load/stores).

bcase@Apple.COM (Brian Case) (03/09/88)

In article <1799@gumby.mips.COM> earl@mips.COM (Earl Killian) writes:
>It's [BIT ADDRESSABILITY] more right in certain environments.
>For example the TI 34010
>graphics processor is bit-addressed, which is a good match for pixel
>operations.  Also, when we have 64-bit addresses, using bit addresses
>will make sense (this is independent of whether you have bit
>load/stores).

Just as a point of interest, bit addressability does not win in certain
graphics environments; there are planar, chunky, and chunky-planar graphics
organizations (probably there are more, but I am not a graphics type),
and in chunky, bit addressability gains very little.  For 8-bits per
pixel in chunky, byte addressability is wonderful.  For 24-bits plus
alpha, 32-bit word addressability is great.  This is according to the
graphics guys here.  BTW, the TI 34010 is none too fast.

franka@mmintl.UUCP (Frank Adams) (03/09/88)

In article <1988Mar6.002518.945@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>As an extreme case,
>one can envision a bit-addressable machine -- that is, one whose pointers
>use the low-order three bits to indicate a bit within a byte -- that traps
>whenever those bits aren't zero, leaving the actual use of bit pointers
>entirely up to the software.  When all accesses were in fact aligned, this
>would incur essentially no overhead except the reduction in address space.

This may sound like an off the wall idea, but it makes a lot of sense to me.
This would mean that arithmetic on bit pointers can be done using the
standard arithmetic operations; and no special format is required for them.
Note that the software need not wait for a trap to deal with unaligned data
-- if it knows it is dealing with a bit pointer, it can extract and deal
with the low order bits itself.
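
A minimal sketch of that software handling, assuming modern C (C99 <stdint.h>), a simulated memory array, and bit 0 meaning the least significant bit of a byte. The names bitptr, bitptr_byte, bitptr_bit and load_bit are hypothetical and appear nowhere in the thread:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit pointer: an ordinary integer whose low-order three
   bits select a bit within a byte, as in the quoted description. */
typedef uint64_t bitptr;

/* Byte index: discard the three low-order bits. */
static size_t bitptr_byte(bitptr p)
{
    return (size_t)(p >> 3);
}

/* Bit offset within that byte (0..7).  Treating bit 0 as the least
   significant bit is an assumption of this sketch. */
static unsigned bitptr_bit(bitptr p)
{
    return (unsigned)(p & 7u);
}

/* Load one bit from simulated memory without waiting for a trap. */
static int load_bit(const unsigned char *mem, size_t nbytes, bitptr p)
{
    assert(bitptr_byte(p) < nbytes);    /* stay inside the simulation */
    return (mem[bitptr_byte(p)] >> bitptr_bit(p)) & 1;
}
```

Because a bitptr is a plain integer, p + 1 moves to the next bit and p + 8 to the same bit of the next byte, using nothing but the standard add.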

As for the address space issue: I personally believe that 32 bit addresses
are too short, and that this will become apparent fairly quickly.  With a 64
bit address, one can afford to use 3 bits this way.
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

ccplumb@watmath.waterloo.edu (Colin Plumb) (03/14/88)

franka@mmintl.UUCP (Frank Adams) wrote:
>henry@utzoo.uucp (Henry Spencer) writes:
>>... a bit-addressable machine ...
>
>This may sound like an off the wall idea, but it makes a lot of sense to me.

How is this off the wall?  I think it's a wonderful idea.  It seems
more sensible than having pointers on 32-bit machines count 8-bit hunks
of memory.  We have already observed that pointers based on the
machine's word length are a lose - we want to be able to address bytes,
at least.  With 64-bit machines, you want to use something smaller than
a full word for most things.  The other logical extreme is bit
addressability.  With 64-bit pointers, this reduces our address space
from 18,446,744,073,709,551,616 bytes to 2,305,843,009,213,693,952.
Big deal.  Call me a pessimist, but I don't think a single processor,
in any sense we use today, will be able to use this much memory.
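
The arithmetic is 2^64 / 2^3 = 2^61. As a quick check, here is a sketch in modern C (the function name is hypothetical; 2^64 itself does not fit in a 64-bit integer, so only the reduced figure is computed):

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole bytes reachable when addr_bits address bits count
   individual bits: three of them are spent on the bit offset. */
static uint64_t bytes_with_bit_addressing(unsigned addr_bits)
{
    assert(addr_bits >= 3 && addr_bits <= 64);
    return UINT64_C(1) << (addr_bits - 3);
}
```

With addr_bits of 64 this gives 2^61; with 32 it gives 2^29, that is, 512 megabytes, which is where the reduction starts to hurt.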

Backwards compatibility?  With a C compiler insulating the user, the
only change is that sizeof(char) is now 8.  Pointer arithmetic still
works fine.  And, as all the processor architects here have based their
expectations of success on, if it doesn't uncover bugs in existing C
code, everybody likes it.

Theoretically, we want a pointer to be able to address any object we
can manipulate.  Even if the architecture does not directly support bit
operations, we can twiddle single bits.

>Note that the software need not wait for a trap to deal with unaligned data
>-- if it knows it is dealing with a bit pointer, it can extract and deal
>with the low order bits itself.

... In the spirit of RISC processors which ignore the low two bits of
the address in a word access.

>As for the address space issue: I personally believe that 32 bit addresses
>are too short, and that this will become apparent fairly quickly.  With a 64
>bit address, one can afford to use 3 bits this way.

Well, I think 32 bits will hold single-user computers for a while yet,
but I can see some uses for such a huge virtual address space, and I'm
sure its availability will spark more.  (I remember someone from the
New OED project pointing out that he was manipulating a 550 Meg
database, soon to grow to 800 and beyond, and having a pointer as an
atomic unit which could address any byte in this huge string simplified
many algorithms significantly.)

Of course, MMU designers will hate us for needing more tag bits. :-}


This can't be a new idea.  Why has no one implemented it before, when
32-bit pointers seemed infinite?  Perhaps that will uncover a flaw in
my reasoning.

>-- 
>Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
>Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108
--
	-Colin (watmath!ccplumb)

Zippy says:
Everywhere I look I see NEGATIVITY and ASPHALT...

ram@lscvax.UUCP (Ric Messier) (03/14/88)

In article <2760@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
>In article <1988Mar6.002518.945@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>[stuff deleted]
While this is really rather fascinating, I just have one question. What
the hell is it doing in comp.lang.c?? I don't C the relevance here.


-- 
- Kilroy                                                 ram@lscvax.UUCP
'Just what cowpatch is Lyndonville, Vermont in anyway?'

                                                         *** Can't deal, &CRASH

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/14/88)

In article <17458@watmath.waterloo.edu> ccplumb@watmath.waterloo.edu (Colin Plumb) writes:
>This can't be a new idea.  Why has no one implemented it before, when
>32-bit pointers seemed infinite?  Perhaps that will uncover a flaw in
>my reasoning.

It's occasionally been tried, and there is nothing fundamentally wrong with
the idea.  The biggest reason for lack of popularity is that it doesn't help
much with the code generated for typical existing high-level languages; they
often don't provide convenient access to bit-level data, so applications are
coded to access data in larger chunks and pick it apart themselves.

If direct bit-operation support is not built into some popular systems
programming language (such as a C successor), there will be little
incentive for manufacturers to provide the underlying hardware support.

The main categories of applications I've been involved in that would benefit
from being able to access bits as conveniently as words/bytes are:
	bit-map graphics (especially black-and-white)
	data compression
	encryption (also cryptanalysis of machine ciphers)
	bottom-up parsing (e.g. transitive closure of Boolean matrices)
	simulation
I'm sure there are others.

henry@utzoo.uucp (Henry Spencer) (03/15/88)

> Backwards compatibility?  With a C compiler insulating the user, the
> only change is that sizeof(char) is now 8...

Actually, even that incompatibility isn't necessary.  A C compiler is
perfectly free to decide that it still counts in bytes.  (This may in fact
be desirable, given that the hypothetical machine we are discussing does
not have bit operations, just bit addressing.)  The only situation in
which the compiler can't completely hide what is going on is if pointers
are converted to integers and examined, which is already an implementation-
dependent area.

Best news of all (heh, heh) is that on such a machine one would probably
want to print pointers in octal, so that the bit offset was cleanly broken
out in the low-order digit.  Since octal is the way God meant programmers
to count (the thumbs are parity bits) :-), this is clearly a Good Thing.
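
To make both points concrete, here is a sketch assuming modern C (C99 <inttypes.h>) and a bit address held in a 64-bit integer. The helper names are hypothetical:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a hypothetical bit address in octal.  Since 8 == 2^3, the last
   octal digit is exactly the bit offset within the byte. */
static int format_bit_address(char *buf, size_t len, uint64_t bit_addr)
{
    return snprintf(buf, len, "%" PRIo64, bit_addr);
}

/* Advance a char-granular address by n chars.  A compiler that "still
   counts in bytes" would scale by 8 behind the scenes, so the low
   octal digit of a char pointer stays 0. */
static uint64_t char_addr_add(uint64_t bit_addr, uint64_t n)
{
    assert((bit_addr & 7u) == 0);       /* char pointers are byte-aligned */
    return bit_addr + 8 * n;
}
```

For example, byte 0100 (octal) with bit offset 5 formats as 1005: the byte address followed by the bit number.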
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry

franka@mmintl.UUCP (Frank Adams) (03/16/88)

In article <7452@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>It's occasionally been tried, and there is nothing fundamentally wrong with
>the idea.  The biggest reason for lack of popularity is that it doesn't help
>much with the code generated for typical existing high-level languages; they
>often don't provide convenient access to bit-level data, so applications are
>coded to access data in larger chunks and pick it apart themselves.

Of course, high-level languages which provide convenient access to bit-level
data have been tried occasionally, and haven't been very popular.  The
biggest reason for this is that popular machine architectures don't provide
efficient access to bit-level data, so applications are coded to access data
in larger chunks and pick it apart themselves.
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

lm@arizona.edu (Larry McVoy) (03/17/88)

In article <1988Mar14.193330.488@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>> Backwards compatibility?  With a C compiler insulating the user, the
>> only change is that sizeof(char) is now 8...
>
>Actually, even that incompatibility isn't necessary.  A C compiler is
>perfectly free to decide that it still counts in bytes.  
>............. given that the hypothetical machine we are discussing does

Hypothetical, my foot.  The ETA-10 compiler does exactly what you described.
Crazy thing also converts pointers into bit addresses (p<<3) when you 
put them into an int.  So think about what code this generates:

foo()
{
    register char* bar = (char*)malloc(123);
}

And then get out lint.

>out in the low-order digit.  Since octal is the way God meant programmers
>to count (the thumbs are parity bits) :-), this is clearly a Good Thing.
>-- 
>Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
>condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry

Jeez, Henry, I finally found something to date you by :-)  Doncha know
that hex is the wave of the future?  (Actually, hex is really nice when you
do network debugging: it's easy to see when the byte order is ``wrong''.)
-- 

Larry McVoy	lm@arizona.edu or ...!{uwvax,sun}!arizona.edu!lm

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/20/88)

In article <2767@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
-In article <7452@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
->It's occasionally been tried, and there is nothing fundamentally wrong with
->the idea.  The biggest reason for lack of popularity is that it doesn't help
->much with the code generated for typical existing high-level languages; they
->often don't provide convenient access to bit-level data, so applications are
->coded to access data in larger chunks and pick it apart themselves.
-Of course, high-level languages which provide convenient access to bit-level
-data have been tried occasionally, and haven't been very popular.  The
-biggest reason for this is that popular machine architectures don't provide
-efficient access to bit-level data, so applications are coded to access data
-in larger chunks and pick it apart themselves.

No, the IMPLEMENTATION would do the work in that case.  Although it
amounts to the same thing at the nitty-gritty level, assuming the
particular hardware doesn't support bit operations, it makes
application programming much nicer.  And, when compiled on a machine
that DOES have bit operations, the object code runs much faster.

This vicious circle of cause-and-effect needs to be broken somehow.
The fact that there are several application areas that could benefit
(as I listed earlier) should be sufficient reason to try.

jk3k+@andrew.cmu.edu (Joe Keane) (03/24/88)

In article <1988Mar14.193330.488@utzoo.uucp>, henry@utzoo.uucp (Henry Spencer)
writes:
> Since octal is the way God meant programmers
> to count (the thumbs are parity bits) :-), this is clearly a Good Thing.
Right reason, wrong answer.  Your hands can of course hold 10 bits.  Since you
say the thumbs are parity bits, that means they hold a byte.  That means each
hand stores - get this - a hex digit.  Down with octal!

--Joe

cudcv@daisy.warwick.ac.uk (Rob McMahon) (03/27/88)

Newsgroups: comp.lang.c,comp.arch
Subject: Re: Bit Addressable Architectures
References: <11702@brl-adm.ARPA> <243@eagle_snax.UUCP> <2245@geac.UUCP> <1988Mar6.002518.945@utzoo.uucp> <2760@mmintl.UUCP> <17458@watmath.waterloo.edu>
Reply-To: cudcv@titania.warwick.ac.uk (Rob McMahon)
Organization: Computing Services, Warwick University, UK

In article <17458@watmath.waterloo.edu> ccplumb@watmath.waterloo.edu (Colin Plumb) writes:
>Backwards compatibility?  With a C compiler insulating the user, the
>only change is that sizeof(char) is now 8.

Only change ?  Sounds like a big "only" to me.  I wonder how much code
out there assumes that sizeof(char) == 1, that sizeof("constant string"), 
or sizeof(initialised_char_array) is the same as strlen(xx)+1 ?  Does
malloc now take number of bits required, or char's ?  It's going to
break either "malloc(n * sizeof(s))" or "malloc(strlen(s) + 1)", or does
everybody but me write "malloc((strlen(s)+1)*sizeof(char))" ?

Rob
-- 
UUCP:   ...!mcvax!ukc!warwick!cudcv	PHONE:  +44 203 523037
JANET:  cudcv@uk.ac.warwick.cu          ARPA:   cudcv@cu.warwick.ac.uk
Rob McMahon, Computing Services, Warwick University, Coventry CV4 7AL, England

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/29/88)

In article <504@sol.warwick.ac.uk> cudcv@cu.warwick.ac.uk (Rob McMahon) writes:
>I wonder how much code out there assumes that ...
>sizeof("constant string"), or sizeof(initialised_char_array)
>is the same as strlen(xx)+1 ?

There's a lot of code like that, no question.  It would continue to
work if sizeof(char) were allowed to be other than 1, on most current
systems, although it might not be portable to other systems or to
future compiler releases.

>Does malloc now take number of bits required, or char's ?

malloc() would be told the number of "bytes" required, where
sizeof(byte)==1.  By "byte" I mean the smallest addressable storage
unit, not necessarily 8 bits in size, nor 1 bit, nor big enough to
represent a character.  (In my proposal this was a "short char".)

Your concerns are legitimate, but so are those of programmers
who have to deal with so-called multi-byte character representations.
Anyway, X3J11 did not buy into the "short char" idea and I doubt they
will be willing to change to it now.

karl@haddock.ISC.COM (Karl Heuer) (03/30/88)

In article <504@sol.warwick.ac.uk> cudcv@cu.warwick.ac.uk (Rob McMahon) writes:
>... or does everybody but me write "malloc((strlen(s)+1)*sizeof(char))" ?

I do.  It automatically makes the argument the right type$ for malloc()
(assuming malloc's argument and sizeof's result are both unsigned, or size_t
in ANSI C); and it makes it easier to convert when you later decide that you
want to use some type other than char.  And, of course, it makes the code no
longer dependent on the questionable% sizeof(char)==1.
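
A sketch of the idiom, using a hypothetical helper dup_string. It sizes the allocation with sizeof *copy rather than sizeof(char), a variation that keeps the expression correct if the element type is changed later:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate a string, sizing the allocation by element type instead of
   assuming sizeof(char) == 1.  Returns NULL if malloc() fails. */
static char *dup_string(const char *s)
{
    char *copy;

    assert(s != NULL);
    copy = malloc((strlen(s) + 1) * sizeof *copy);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}
```

The caller owns the result and releases it with free().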

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
Followups to comp.lang.c only.
________
$ Yes, strlen() is already a size_t in ANSI C, but this situation can also
occur with int-valued expressions.
% I don't question that it's true, just whether it's a good idea.

greg@csanta.UUCP (Greg Comeau) (04/01/88)

Hmm, I'm new on the net here, so excuse me for jumping into the middle of a
discussion, but sizeof(char) is always 1.  The number of bits in a char
is a whole 'nuther story.  This is usually 8, but need not be.  This is
true of both dpANSI C and K&R C.
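
In ANSI C the bit count is available as CHAR_BIT from <limits.h>, so the two quantities can be kept apart explicitly. A small sketch (bits_in is a hypothetical helper):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Size of an object in bits: sizeof counts chars, and CHAR_BIT says how
   many bits each char holds on this implementation. */
static size_t bits_in(size_t size_in_chars)
{
    assert(size_in_chars <= (size_t)-1 / CHAR_BIT);   /* no overflow */
    return size_in_chars * CHAR_BIT;
}
```

bits_in(sizeof(char)) is CHAR_BIT by construction; bits_in(sizeof(int)) varies from machine to machine.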

gp@picuxa.UUCP (Greg Pasquariello X1190) (04/04/88)

In article <113@csanta.UUCP> greg@csanta.UUCP (Root) writes:
>Hmm, I'm new on the net here, so excuse me for jumping into the middle of a
>discussion, but sizeof(char) is always 1.  The number of bits in a char
>is a whole 'nuther story.  This is usually 8, but need not be.  This is
>true of both dpANSI C and K&R C.

This is true of the dpANSI C and K&R C _implementation_, but it is not
necessarily true of the C definition.  Sizeof yields "the size, in bytes, of its
operand" (K&R pg 188).   The fundamental type char, is "large enough to store
any member of the implementation's character set" (K&R pg 182).  This _could_
be multiple bytes!


(God I hope what I just said is true :-))

Greg Pasquariello
ihnp4!picuxa!gp

mouse@mcgill-vision.UUCP (der Mouse) (04/08/88)

In article <0WG23wy00W07M9LkhH@andrew.cmu.edu>, jk3k+@andrew.cmu.edu (Joe Keane) writes:
> In article <1988Mar14.193330.488@utzoo.uucp>, henry@utzoo.uucp (Henry Spencer) writes:
>> Since octal is the way God meant programmers to count (the thumbs
>> are parity bits) :-), this is clearly a Good Thing.
> Right reason, wrong answer.  Your hands can of course hold 10 bits.
> Since you say the thumbs are parity bits, that means they hold a
> byte.  That means each hand stores - get this - a hex digit.  Down
> with octal!

That was my reaction too, until I thought about it.  When we count
normally on our fingers, we count to ten, not 1024 (or at least I do; I
don't know how many fingers you have :-).  So Henry would have us count
to eight, and the parity bit is just confusing in that it suggests
that each finger represents one bit.

					der Mouse

			uucp: mouse@mcgill-vision.uucp
			arpa: mouse@larry.mcrcim.mcgill.edu

mouse@mcgill-vision.UUCP (der Mouse) (04/10/88)

In article <7578@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
> In article <504@sol.warwick.ac.uk> cudcv@cu.warwick.ac.uk (Rob McMahon) writes:
>> I wonder how much code out there assumes that ...
>> sizeof("constant string"), or sizeof(initialised_char_array)
>> is the same as strlen(xx)+1 ?
> There's a lot of code like that, no question.  It would continue to
> work if sizeof(char) were allowed to be other than 1,

How could it?

char foo[] = "This is foo";

strlen(foo) is 11.  sizeof(foo) is 12*sizeof(char).  Or are you
redefining strlen() as well?
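
The relationship can be written down as a check. This sketch wraps it in hypothetical helper functions; on any implementation where sizeof(char) is 1 the two counts differ only by the terminating null:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static char foo[] = "This is foo";

/* Number of elements in foo, including the terminating null. */
static size_t foo_elements(void)
{
    return sizeof foo / sizeof foo[0];
}

/* Number of characters before the terminating null. */
static size_t foo_length(void)
{
    return strlen(foo);
}

/* The identity under discussion: it holds for any value of
   sizeof(char), because the factor is written out. */
static void check_foo(void)
{
    assert(sizeof foo == (strlen(foo) + 1) * sizeof(char));
}
```

Code that writes sizeof foo where it means strlen(foo) + 1 silently drops the sizeof(char) factor, which is the case being asked about.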

Personally, I tend towards the sizeof returning size in bits rather
than bytes.  And making bits full objects.  And lots of other
things...but this belongs in comp.lang.d.

					der Mouse

			uucp: mouse@mcgill-vision.uucp
			arpa: mouse@larry.mcrcim.mcgill.edu

gwyn@brl-smoke.ARPA (Doug Gwyn ) (04/10/88)

In article <1040@mcgill-vision.UUCP> mouse@mcgill-vision.UUCP (der Mouse) writes:
-In article <7578@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
-> There's a lot of code like that, no question.  It would continue to
-> work if sizeof(char) were allowed to be other than 1,
-How could it?

Naturally, if you truncate the explanation by stopping at the "," then
you missed it.  Why do people narrow their focus to such small contexts?

major@eleazar.Dartmouth.EDU (Lou Major) (04/13/88)

*ahem*
 
char foo[]="This is a test.";
 
sizeof (foo) == sizeof (char *)

NOT the number of machine bytes/words those characters take up. (16, for most
typical installations)

gwyn@brl-smoke.ARPA (Doug Gwyn ) (04/14/88)

In article <8646@eleazar.Dartmouth.EDU> major@eleazar.Dartmouth.EDU (Lou Major) writes:
>char foo[]="This is a test.";
>sizeof (foo) == sizeof (char *)

Since when?

I know that Gould had a bug in their UTX-32 compiler that made it think
sizeof"......"==sizeof(char *), but they fixed that and in any case
it's not the same as your example.  So what gives?

(I don't think the array name is turned into a pointer just because it's
surrounded by parentheses.)

ok@quintus.UUCP (Richard A. O'Keefe) (04/14/88)

In article <8646@eleazar.Dartmouth.EDU>, major@eleazar.Dartmouth.EDU (Lou Major) writes:
> *ahem*
> char foo[]="This is a test.";
> sizeof (foo) == sizeof (char *)
> NOT the number of machine bytes/words those characters take up. (16, for most
> typical installations)

Wrong.  The answer *is* 16.  This is one of the few cases where
foo and &(foo[0]) are different.  I _tried_ this to make sure I was right.
That's always a good idea.

edward@ucbarpa.Berkeley.EDU (Edward Wang) (04/14/88)

 In article <8646@eleazar.Dartmouth.EDU> major@eleazar.Dartmouth.EDU (Lou Major) writes:
 >
 >*ahem*
 > 
 >char foo[]="This is a test.";
 > 
 >sizeof (foo) == sizeof (char *)
 >
 >NOT the number of machine bytes/words those characters take up. (16, for most
 >typical installations)

 This is just plain false.

davidsen@steinmetz.ge.com (William E. Davidsen Jr) (04/14/88)

In article <8646@eleazar.Dartmouth.EDU> major@eleazar.Dartmouth.EDU (Lou Major) writes:
...
| char foo[]="This is a test.";
|  
| sizeof (foo) == sizeof (char *)
| 
| NOT the number of machine bytes/words those characters take up. (16, for most
| typical installations)

Excuse me? Not on any machine I've ever seen, or in any standard I've
ever read. foo is an array, not a pointer. What you said would be true
if the declaration was:
	char *foo = "This is a test.";

I've moved followup to comp.lang.c
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

emiller@bbn.com (ethan miller) (04/14/88)

In article <877@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
=>In article <8646@eleazar.Dartmouth.EDU>, major@eleazar.Dartmouth.EDU (Lou Major) writes:
=>> char foo[]="This is a test.";
=>> sizeof (foo) == sizeof (char *)
=>> NOT the number of machine bytes/words those characters take up. (16, for most
=>> typical installations)
=>
=>Wrong.  The answer *is* 16.  This is one of the few cases where
=>foo and &(foo[0]) are different.  I _tried_ this to make sure I was right.
=>That's always a good idea.

Sure is.  What did you try?  _I_ just tried printing foo and &(foo[0]), and
they are the same.  BTW, I also tried sizeof (foo), and it is 16.

ethan
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
Ethan Miller			   | "If they don't keep on exercising
BBN Laboratories (Cambridge, MA)   |  their lips, he thought, their brains
ARPAnet: emiller@bbn.com           |  start working."
Disclaimer: BBN didn't write this. |   -- The Hitchhiker's Guide to the Galaxy
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

chris@mimsy.UUCP (Chris Torek) (04/15/88)

In article <8646@eleazar.Dartmouth.EDU> major@eleazar.Dartmouth.EDU
(Lou Major) writes:
>*ahem*
>char foo[]="This is a test.";
>sizeof (foo) == sizeof (char *)
>NOT the number of machine bytes/words those characters take up.

Quite wrong.

There *are* compilers that produce the wrong answer for

	sizeof("string")

(the `correct' number is 7), but any compiler that gets sizeof(foo)
above wrong (2 or 4 rather than 16) is so egregiously broken that
it is not worth discussing.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

root@mfci.UUCP (SuperUser) (04/15/88)

In article <8646@eleazar.Dartmouth.EDU> major@eleazar.Dartmouth.EDU (Lou Major) writes:
}
}*ahem*
}
}char foo[]="This is a test.";
}
}sizeof (foo) == sizeof (char *)
}
}NOT the number of machine bytes/words those characters take up. (16, for most
}typical installations)

Not on this planet.  From K&R:  "When applied to an array, the result is the
total number of bytes in the array."  The parentheses don't alter this fact.
The example above is confusing because the number of characters in foo is
exactly 16 (15 in the quotes plus the final null character).

scjones@sdrc.UUCP (Larry Jones) (04/15/88)

In article <7684@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
> In article <8646@eleazar.Dartmouth.EDU> major@eleazar.Dartmouth.EDU (Lou Major) writes:
> >char foo[]="This is a test.";
> >sizeof (foo) == sizeof (char *)
> 
> Since when?
> 
> I know that Gould had a bug in their UTX-32 compiler that made it think
> sizeof"......"==sizeof(char *), but they fixed that and in any case
> it's not the same as your example.  So what gives?
> 
> (I don't think the array name is turned into a pointer just because it's
> surrounded by parentheses.)

If it ain't, the compiler's broke!  The sizeof operator can be applied to a
parenthesized type name or to an expression.  Since "foo" isn't a type name,
the operand of sizeof is an expression.  When an array name appears in an
expression and it's not the operand of & or sizeof (whose operand is the
parenthesized expression, remember), it's converted into a pointer to the first
element.
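
A sketch of the rule, with hypothetical function names: parentheses around the array name change nothing, while an operand such as foo + 0 is a pointer expression because the array is converted before the addition.

```c
#include <assert.h>
#include <stddef.h>

static char foo[] = "This is a test.";

/* Operand is the array itself. */
static size_t size_bare(void)
{
    return sizeof foo;
}

/* Still the array: the parentheses are part of the expression. */
static size_t size_parenthesized(void)
{
    return sizeof (foo);
}

/* Here foo is an operand of +, so it is converted to char * first. */
static size_t size_decayed(void)
{
    return sizeof (foo + 0);
}

static void check_sizes(void)
{
    assert(size_bare() == size_parenthesized());
    assert(size_decayed() == sizeof(char *));
}
```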

----
Larry Jones                         UUCP: uunet!sdrc!scjones
SDRC                                MAIL: 2000 Eastman Dr., Milford, OH  45150
                                    AT&T: (513) 576-2070
"When all else fails, read the directions."

swarbric@tramp.Colorado.EDU (Frank Swarbrick) (04/15/88)

To Ethan Miller:

  He meant that sizeof(foo) and sizeof(&foo[0]) are not the same when foo
is declared 'char foo[] = "This is a test.";'.

  sizeof(foo) == 16

  sizeof(&foo[0]) == 2

Frank Swarbrick (and his cat)    p.s.  --Mal says hi.
swarbric@tramp.UUCP               swarbric@tramp.Colorado.EDU
...!{ncar|nbires}!boulder!tramp!swarbric
"Timothy Leary is dead..." 

ok@quintus.UUCP (Richard A. O'Keefe) (04/15/88)

In article <23396@bbn.COM>, emiller@bbn.com (ethan miller) writes:
> In article <877@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
> =>In article <8646@eleazar.Dartmouth.EDU>, major@eleazar.Dartmouth.EDU (Lou Major) writes:
> =>> sizeof (foo) == sizeof (char *)
> =>Wrong.  The answer *is* 16.  This is one of the few cases where
> =>foo and &(foo[0]) are different.  I _tried_ this to make sure I was right.
> =>That's always a good idea.
> Sure is.  What did you try?  _I_ just tried printing foo and &(foo[0]), and
> they are the same.  BTW, I also tried sizeof (foo), and it is 16.

foo and &(foo[0]) as expressions have different types:
	foo	  : array-of-16-chars
	&(foo[0]) : pointer-to-char
sizeof notices this difference.  In almost any other context, there is an
implicit conversion to pointer-to-char form.  In particular, printf() is not
going to reveal the difference.  Consider
	short x = 1; int y = 1;
	printf("%d %d\n", x, y);
Printing obscures the difference between x and y, and in just the same way
it obscures the difference between foo and &(foo[0]).
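
The same point as a sketch (function names are hypothetical): the two expressions compare equal as addresses, yet sizeof reports different sizes because their types differ.

```c
#include <assert.h>
#include <stddef.h>

static char foo[] = "This is a test.";

/* sizeof sees the array type: 15 characters plus the null. */
static size_t size_of_array(void)
{
    return sizeof foo;
}

/* The operand is already a pointer expression. */
static size_t size_of_pointer(void)
{
    return sizeof &foo[0];
}

/* In a comparison, foo is converted to a pointer to its first element,
   so the values are identical. */
static int same_address(void)
{
    return foo == &foo[0];
}

static void check_types(void)
{
    assert(same_address());
    assert(size_of_pointer() == sizeof(char *));
}
```

size_of_pointer() is 2, 4 or 8 depending on the machine, which accounts for the different pointer sizes reported in this thread.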

mike@turing.UNM.EDU (Michael I. Bushnell) (04/15/88)

In article <8646@eleazar.Dartmouth.EDU> major@eleazar.Dartmouth.EDU (Lou Major) writes:

>*ahem*

>char foo[]="This is a test.";

>sizeof (foo) == sizeof (char *)

>NOT the number of machine bytes/words those characters take up. (16, for most
>typical installations)


Perhaps people have had the POINTER == ARRAY thing hammered into
their skull too hard.  According to the 4.3 BSD "C Programming
Language Reference Manual, page 8 [PS1:1-8]", I find:

  The sizeof operator yields the size in bytes of its operand.  (...)
  When applied to an array, the result is the total number of bytes in
  the array.  The size is determined from the declarations of the
  objects in the expression....


But what about the compiler?  Here are the results.

for the code

char foo1[]="This is a test.";
int size1=sizeof foo1;
char *foo2="This is a test.";
int size2=sizeof foo2;

I get:

[4.3 BSD pcc]:

LL0:
	.data
	.data
	.globl	_foo1
_foo1:
	.long	0x73696854
	.long	0x20736920
	.long	0x65742061
	.long	0x2e7473
	.data
	.align	2
	.globl	_size1
_size1:
	.long	16		# NOTE: 16 for the array

	.align	2
	.globl	_foo2
_foo2:
	.data	2
L14:
	.ascii	"This is a test.\0"
	.data
	.long	L14
	.align	2
	.globl	_size2
_size2:
	.long	4		# NOTE: 4 for the pointer




[GNU C Compiler 1.18]:

#NO_APP
.globl _foo1
.data
	.align 0
_foo1:
	.ascii "This is a test.\0"
.globl _size1
.data
	.align 2
_size1:
	.long 16		# NOTE: 16 for the array

.globl _foo2
.text
	.align 0
LC0:
	.ascii "This is a test.\0"
.data
	.align 2
_foo2:
	.long LC0
.globl _size2
.data
	.align 2
_size2:
	.long 4			# NOTE: 4 for the pointer





Plug: Note how much easier it is to read the gcc stuff too...

                N u m q u a m   G l o r i a   D e o 

			Michael I. Bushnell
			HASA - "A" division
14308 Skyline Rd NE				Computer Science Dept.
Albuquerque, NM  87123		OR		Farris Engineering Ctr.
	OR					University of New Mexico
mike@turing.unm.edu				Albuquerque, NM  87131
{ucbvax,gatech}!unmvax!turing.unm.edu!mike

gwyn@brl-smoke.ARPA (Doug Gwyn ) (04/16/88)

In article <259@sdrc.UUCP> scjones@sdrc.UUCP (Larry Jones) writes:
-In article <7684@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
-> (I don't think the array name is turned into a pointer just because it's
-> surrounded by parentheses.)
-If it ain't, the compiler's broke!  The sizeof operator can be applied to a
-parenthesized type name or to an expression.  Since "foo" isn't a type name,
-the operand of sizeof is an expression.  When an array name appears in an
-expression and it's not the operand of & or sizeof (whose operand is the
-parenthesized expression, remember), it's converted into a pointer to the first
-element.

My problem was that I couldn't find where in the dpANS the effect of
the parentheses operator was defined.  Is it in there somewhere?

Your explanation sounds right to me but I do want to see what we
say about the effect of parentheses.  Perhaps we removed too much!

tainter@ihlpg.ATT.COM (Tainter) (04/18/88)

In article <8646@eleazar.Dartmouth.EDU>, major@eleazar.Dartmouth.EDU (Lou Major) writes:

Mr. Major asserts that given:
> char foo[]="This is a test.";

then:
> sizeof (foo) == sizeof (char *)
> NOT the number of machine bytes/words those characters take up. (16, for most
> typical installations)

Which is patently absurd.

--j.a.tainter