[comp.arch] Bit Addressable Architectures

crowl@cs.rochester.edu (Lawrence Crowl) (03/05/88)

In article <1988Mar3.182645.703@utzoo.uucp> henry@utzoo.uucp
(Henry Spencer) writes:
>I once had the opportunity to ask Bill Wulf what he thought of bit-oriented
>machines; his answer was "I wish they weren't so damned slow".  I'm afraid
>I haven't seen anything since that invalidates that assessment.  There is
>something to be said for providing bit addressability, but one must realize
>that actually exploiting it will be slow and that there will still be a
>large payoff for trying to work on byte or word boundaries whenever possible.

It seems to me that aligned access to all items larger than a bit would allow
a bit addressable machine to be every bit as fast as a byte or word addressable
machine.  Am I missing something?

A bit addressable machine would allow us to use single bits, nibbles, BCD, etc.
with much greater ease.  Besides, bit addressability seems "right".  (I know,
"right" isn't a rational statement!)
-- 
  Lawrence Crowl		716-275-9499	University of Rochester
		      crowl@cs.rochester.edu	Computer Science Department
...!{allegra,decvax,rutgers}!rochester!crowl	Rochester, New York,  14627

lamaster@ames.arpa (Hugh LaMaster) (03/05/88)

In article <7374@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes:
>In article <1988Mar3.182645.703@utzoo.uucp> henry@utzoo.uucp
>(Henry Spencer) writes:
>>I once had the opportunity to ask Bill Wulf what he thought of bit-oriented
>>machines; his answer was "I wish they weren't so damned slow".  I'm afraid

>
>It seems to me that aligned access to all items larger than a bit would allow
>a bit addressable machine to be every bit as fast as a byte or word addressable
>machine.  Am I missing something?
>

The Cyber 205 is bit addressable and supports bit, byte (8 bit), halfword
(32 bit), and fullword (64 bit) data accesses.

(As far as I can see, they somehow left out direct 16 bit support for
memory-to-register and memory-to-memory transfers, although there are some
instructions for 16 bit data types in registers.)  The actual memory word
is the "sword"
(super word) which is 8 words (512 bits) in length.  The Load/Store
unit extracts the required number of bits from the sword.

Anyway, the Cyber 205 does require ALIGNMENT on data length boundaries.
Bit strings must begin on bit addresses (obvious, right?), bytes on byte
addresses, halfwords on halfword addresses, etc.  I expect most "fast"
machines in the future will require
alignment because it is faster.  Arbitrary alignment would require a
shifter in the data path which is a significant performance degradation.
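
To make the cost concrete, here is a minimal C sketch of what software has to
do to read a field at an arbitrary bit offset on a byte-addressed machine.
The function name extract_bits and the most-significant-bit-first numbering
are assumptions made for illustration; they do not describe the Cyber 205 or
any other real machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Read `width` bits (1..32) starting at absolute bit offset `bitpos`
 * from a byte array.  Bits are numbered most-significant-first within
 * each byte; that numbering is an assumption of this sketch. */
uint32_t extract_bits(const uint8_t *mem, size_t bitpos, unsigned width)
{
    uint64_t acc = 0;
    size_t byte = bitpos / 8;        /* first byte touched */
    unsigned skip = bitpos % 8;      /* leading bits to discard */
    unsigned need = skip + width;    /* bits that must be gathered */
    unsigned got = 0;

    /* Gather whole bytes until the field is covered (at most 5 bytes). */
    while (got < need) {
        acc = (acc << 8) | mem[byte++];
        got += 8;
    }

    /* This shift and mask play the role of the hardware shifter. */
    acc >>= (got - need);
    return (uint32_t)(acc & (width == 32 ? 0xFFFFFFFFu : (1u << width) - 1u));
}
```

An aligned byte read degenerates to a single load with a zero shift, which is
the case that alignment rules let the hardware assume all the time.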

Code which is dependent on the bytes
of data structures being contiguous is not "portable" and all the 
C manuals I have read point this out.  I see nothing heinous in Sun or
MIPS or whoever requiring alignment.  There are some machines out there
(e.g. Cray) which don't even have byte addressing, let alone arbitrary
alignment.

bcase@Apple.COM (Brian Case) (03/05/88)

In article <7374@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes:
>In article <1988Mar3.182645.703@utzoo.uucp> henry@utzoo.uucp
>(Henry Spencer) writes:
>It seems to me that aligned access to all items larger than a bit would allow
>a bit addressable machine to be every bit as fast as a byte or word addressable
>machine.  Am I missing something?

Yes, the alignment network is always there whether an instruction uses it or
not.

henry@utzoo.uucp (Henry Spencer) (03/06/88)

> >I once had the opportunity to ask Bill Wulf what he thought of bit-oriented
> >machines; his answer was "I wish they weren't so damned slow".  I'm afraid
> >I haven't seen anything since that invalidates that assessment.  There is
> >something to be said for providing bit addressability, but one must realize
> >that actually exploiting it will be slow and that there will still be a
> >large payoff for trying to work on byte or word boundaries whenever possible.
> 
> It seems to me that aligned access to all items larger than a bit would allow
> a bit addressable machine to be every bit as fast as a byte or word addressable
> machine.  Am I missing something?

No and yes.

No, in that this is exactly what I said in the last sentence of my comments,
although somewhat obscurely.  (Note that "bit-oriented" and "bit-addressable"
aren't the same thing in the terminology I was using.)  As an extreme case,
one can envision a bit-addressable machine -- that is, one whose pointers
use the low-order three bits to indicate a bit within a byte -- that traps
whenever those bits aren't zero, leaving the actual use of bit pointers
entirely up to the software.  When all accesses were in fact aligned, this
would incur essentially no overhead except the reduction in address space.

Yes, in that almost any attempt to make bit-aligned objects easier to handle
is going to mean extra hardware, quite possibly in a critical path where
every added gate slows the whole machine down.  Even if it's not in a
critical path, it will steal chip area from other things that could boost
performance.  The tradeoffs depend on the design details.
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry

earl@mips.COM (Earl Killian) (03/08/88)

In article <7374@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes:

   A bit addressable machine would allow us to use single bits,
   nibbles, BCD, etc.  with much greater ease.  Besides, bit
   addressability seems "right".  (I know, "right" isn't a rational
   statement!)

It's more right in certain environments.  For example the TI 34010
graphics processor is bit-addressed, which is a good match for pixel
operations.  Also, when we have 64-bit addresses, using bit addresses
will make sense (this is independent of whether you have bit
load/stores).
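
As a rough illustration of why bit addresses suit pixel work, the C sketch
below computes the address of a pixel with one linear formula that holds for
any pixel depth.  The function pixel_bit_address and its parameters are
hypothetical; it is not a model of how the TI 34010 computes addresses.

```c
#include <stdint.h>

/* Bit address of pixel (x, y) in a packed frame buffer.
 *   base_bits  : bit address of the first pixel
 *   pitch_bits : distance in bits between the starts of consecutive rows
 *   depth      : bits per pixel (1, 2, 4, 8, ...)                        */
uint64_t pixel_bit_address(uint64_t base_bits, uint64_t pitch_bits,
                           unsigned depth, unsigned x, unsigned y)
{
    return base_bits + (uint64_t)y * pitch_bits + (uint64_t)x * depth;
}
```

With byte addresses the same computation needs a separate shift-and-mask step
whenever depth is smaller than 8.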

bcase@Apple.COM (Brian Case) (03/09/88)

In article <1799@gumby.mips.COM> earl@mips.COM (Earl Killian) writes:
>It's [BIT ADDRESSABILITY] more right in certain environments.
>For example the TI 34010
>graphics processor is bit-addressed, which is a good match for pixel
>operations.  Also, when we have 64-bit addresses, using bit addresses
>will make sense (this is independent of whether you have bit
>load/stores).

Just as a point of interest, bit addressability does not win in certain
graphics environments; there are planar, chunky, and chunky-planar graphics
organizations (probably there are more, but I am not a graphics type),
and in chunky, bit addressability gains very little.  For 8-bits per
pixel in chunky, byte addressability is wonderful.  For 24-bits plus
alpha, 32-bit word addressability is great.  This is according to the
graphics guys here.  BTW, the TI 34010 is none too fast.

franka@mmintl.UUCP (Frank Adams) (03/09/88)

In article <1988Mar6.002518.945@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>As an extreme case,
>one can envision a bit-addressable machine -- that is, one whose pointers
>use the low-order three bits to indicate a bit within a byte -- that traps
>whenever those bits aren't zero, leaving the actual use of bit pointers
>entirely up to the software.  When all accesses were in fact aligned, this
>would incur essentially no overhead except the reduction in address space.

This may sound like an off the wall idea, but it makes a lot of sense to me.
This would mean that arithmetic on bit pointers can be done using the
standard arithmetic operations; and no special format is required for them.
Note that the software need not wait for a trap to deal with unaligned data
-- if it knows it is dealing with a bit pointer, it can extract and deal
with the low order bits itself.
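
A minimal C sketch of that software-only scheme follows, assuming a bit
pointer is an ordinary integer holding (byte address * 8 + bit offset).  The
type bitptr_t, the helper names, and the choice of bit 0 as the
least-significant bit are all assumptions for illustration.

```c
#include <stdint.h>

typedef uint64_t bitptr_t;  /* hypothetical: byte index * 8 + bit offset */

bitptr_t make_bitptr(uint64_t byte_index, unsigned bit)
{
    return (byte_index << 3) | (bit & 7u);
}

uint64_t bitptr_byte(bitptr_t p) { return p >> 3; }             /* high bits */
unsigned bitptr_bit(bitptr_t p)  { return (unsigned)(p & 7u); } /* low 3 bits */

/* Read one bit from a byte array standing in for memory.
 * Bit 0 is taken to be the least-significant bit (an assumption). */
unsigned load_bit(const uint8_t *mem, bitptr_t p)
{
    return (mem[bitptr_byte(p)] >> bitptr_bit(p)) & 1u;
}
```

Because the pointer is a plain integer, p + 1 steps to the next bit and
carries into the next byte automatically, and p + 8 steps one whole byte.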

As for the address space issue: I personally believe that 32 bit addresses
are too short, and that this will become apparent fairly quickly.  With a 64
bit address, one can afford to use 3 bits this way.
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

aglew@ccvaxa.UUCP (03/13/88)

..> Bit addressability

One problem with bit addressability is literal fields for indexed
addressing - they are usually of limited size. Can you afford to
give up 3 bits?
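
To put a number on it, the C sketch below computes how far a signed
displacement field reaches when it counts bytes versus bits.  The 16-bit
field width used in the note after the code is an assumed example, not a
property of any particular instruction set.

```c
#include <stdint.h>

/* Largest positive offset, in whole bytes, reachable by a signed
 * displacement field of `field_bits` bits (2..32) whose unit is
 * `unit_bits` bits (8 = byte-addressed, 1 = bit-addressed). */
int64_t max_reach_bytes(unsigned field_bits, unsigned unit_bits)
{
    int64_t max_units = ((int64_t)1 << (field_bits - 1)) - 1;
    return max_units * unit_bits / 8;
}
```

For a 16-bit field this gives 32767 bytes with byte displacements but only
4095 bytes with bit displacements.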

ccplumb@watmath.waterloo.edu (Colin Plumb) (03/14/88)

franka@mmintl.UUCP (Frank Adams) wrote:
>henry@utzoo.uucp (Henry Spencer) writes:
>>... a bit-addressable machine ...
>
>This may sound like an off the wall idea, but it makes a lot of sense to me.

How is this off the wall?  I think it's a wonderful idea.  It seems
more sensible than having pointers on 32-bit machines count 8-bit hunks
of memory.  We have already observed that pointers based on the
machine's word length are a lose - we want to be able to address bytes,
at least.  With 64-bit machines, you want to use something smaller than
a full word for most things.  The other logical extreme is bit
addressability.  With 64-bit pointers, this reduces our address space
from 18,446,744,073,709,551,616 bytes to 2,305,843,009,213,693,952.
Big deal.  Call me a pessimist, but I don't think a single processor,
in any sense we use today, will be able to use this much memory.
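
The arithmetic is simply 2^64 / 8 = 2^61.  A small C sketch, with a
hypothetical helper name, that reproduces the second figure (2^64 itself does
not fit in a 64-bit integer, so only differences below 64 are supported):

```c
#include <stdint.h>

/* Bytes addressable by a pointer of `ptr_bits` bits when the low
 * `offset_bits` bits select a bit within the byte.
 * Valid only while ptr_bits - offset_bits < 64. */
uint64_t addressable_bytes(unsigned ptr_bits, unsigned offset_bits)
{
    return (uint64_t)1 << (ptr_bits - offset_bits);
}
```

addressable_bytes(64, 3) is 2,305,843,009,213,693,952; the same trade on a
32-bit pointer leaves 536,870,912 bytes (512 megabytes).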

Backwards compatibility?  With a C compiler insulating the user, the
only change is that sizeof(char) is now 8.  Pointer arithmetic still
works fine.  And, as all the processor architects here have based their
expectations of success on, if it doesn't uncover bugs in existing C
code, everybody likes it.

Theoretically, we want a pointer to be able to address any object we
can manipulate.  Even if the architecture does not directly support bit
operations, we can twiddle single bits.

>Note that the software need not wait for a trap to deal with unaligned data
>-- if it knows it is dealing with a bit pointer, it can extract and deal
>with the low order bits itself.

... In the spirit of RISC processors which ignore the low two bits of
the address in a word access.

>As for the address space issue: I personally believe that 32 bit addresses
>are too short, and that this will become apparent fairly quickly.  With a 64
>bit address, one can afford to use 3 bits this way.

Well, I think 32 bits will hold single-user computers for a while yet,
but I can see some uses for such a huge virtual address space, and I'm
sure its availability will spark more.  (I remember someone from the
New OED project pointing out that he was manipulating a 550 Meg
database, soon to grow to 800 and beyond, and having a pointer as an
atomic unit which could address any byte in this huge string simplified
many algorithms significantly.)

Of course, MMU designers will hate us for needing more tag bits. :-}

This can't be a new idea.  Why has no one implemented it before, when
32-bit pointers seemed infinite?  Perhaps that will uncover a flaw in
my reasoning.

>-- 
>Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
>Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108
--
	-Colin (watmath!ccplumb)

Zippy says:
Everywhere I look I see NEGATIVITY and ASPHALT...

ram@lscvax.UUCP (Ric Messier) (03/14/88)

In article <2760@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
>In article <1988Mar6.002518.945@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>[stuff deleted]
While this is really rather fascinating, I just have one question. What
the hell is it doing in comp.lang.c?? I don't C the relevance here.


-- 
- Kilroy                                                 ram@lscvax.UUCP
'Just what cowpatch is Lyndonville, Vermont in anyway?'

                                                         *** Can't deal, &CRASH

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/14/88)

In article <17458@watmath.waterloo.edu> ccplumb@watmath.waterloo.edu (Colin Plumb) writes:
>This can't be a new idea.  Why has no one implemented it before, when
>32-bit pointers seemed infinite?  Perhaps that will uncover a flaw in
>my reasoning.

It's occasionally been tried, and there is nothing fundamentally wrong with
the idea.  The biggest reason for lack of popularity is that it doesn't help
much with the code generated for typical existing high-level languages; they
often don't provide convenient access to bit-level data, so applications are
coded to access data in larger chunks and pick it apart themselves.

If direct bit-operation support is not built into some popular systems
programming language (such as a C successor), there will be little
incentive for manufacturers to provide the underlying hardware support.

The main categories of applications I've been involved in that would benefit
from being able to access bits as conveniently as words/bytes are:
	bit-map graphics (especially black-and-white)
	data compression
	encryption (also cryptanalysis of machine ciphers)
	bottom-up parsing (e.g. transitive closure of Boolean matrices)
	simulation
I'm sure there are others.

henry@utzoo.uucp (Henry Spencer) (03/15/88)

> Backwards compatibility?  With a C compiler insulating the user, the
> only change is that sizeof(char) is now 8...

Actually, even that incompatibility isn't necessary.  A C compiler is
perfectly free to decide that it still counts in bytes.  (This may in fact
be desirable, given that the hypothetical machine we are discussing does
not have bit operations, just bit addressing.)  The only situation in
which the compiler can't completely hide what is going on is if pointers
are converted to integers and examined, which is already an implementation-
dependent area.

Best news of all (heh, heh) is that on such a machine one would probably
want to print pointers in octal, so that the bit offset was cleanly broken
out in the low-order digit.  Since octal is the way God meant programmers
to count (the thumbs are parity bits) :-), this is clearly a Good Thing.
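
A small C sketch of the octal point, assuming a bit address is held in a
64-bit integer as (byte address * 8 + bit offset); the function name is
hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a bit address in octal into buf.  Returns the number of
 * characters written, as reported by snprintf. */
int format_bitptr_octal(char *buf, size_t n, uint64_t bitptr)
{
    return snprintf(buf, n, "%" PRIo64, bitptr);
}
```

Byte address 0x1000 (octal 10000) with bit offset 5 formats as 100005: the
last digit is the bit offset and the digits before it are the byte address.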
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry

franka@mmintl.UUCP (Frank Adams) (03/16/88)

In article <7452@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>It's occasionally been tried, and there is nothing fundamentally wrong with
>the idea.  The biggest reason for lack of popularity is that it doesn't help
>much with the code generated for typical existing high-level languages; they
>often don't provide convenient access to bit-level data, so applications are
>coded to access data in larger chunks and pick it apart themselves.

Of course, high-level languages which provide convenient access to bit-level
data have been tried occasionally, and haven't been very popular.  The
biggest reason for this is that popular machine architectures don't provide
efficient access to bit-level data, so applications are coded to access data
in larger chunks and pick it apart themselves.
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

lm@arizona.edu (Larry McVoy) (03/17/88)

In article <1988Mar14.193330.488@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>> Backwards compatibility?  With a C compiler insulating the user, the
>> only change is that sizeof(char) is now 8...
>
>Actually, even that incompatibility isn't necessary.  A C compiler is
>perfectly free to decide that it still counts in bytes.  
>............. given that the hypothetical machine we are discussing does

Hypothetical, my foot.  The ETA-10 compiler does exactly what you described.
Crazy thing also converts pointers into bit addresses (p<<3) when you 
put them into an int.  So think about what code this generates:

foo()
{
    register char* bar = (char*)malloc(123);
}

And then get out lint.

>out in the low-order digit.  Since octal is the way God meant programmers
>to count (the thumbs are parity bits) :-), this is clearly a Good Thing.
>-- 
>Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
>condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry

Jeez, Henry, I finally found something to date you by :-)  Doncha know
that hex is the wave of the future?  (Actually, hex is really nice when you
do network debugging: it's easy to see when the byte order is ``wrong''.)
-- 

Larry McVoy	lm@arizona.edu or ...!{uwvax,sun}!arizona.edu!lm

aglew@ccvaxa.UUCP (03/18/88)

>Best news of all (heh, heh) is that on such a machine one would probably
>want to print pointers in octal, so that the bit offset was cleanly broken
>out in the low-order digit.  Since octal is the way God meant programmers
>to count (the thumbs are parity bits) :-), this is clearly a Good Thing.
>
>Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
>condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry

Naw. By the time we have bit addressable machines, we'll have 16 bit bytes,
and will be reading our on-line manual pages in Kanji.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/20/88)

In article <2767@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
-In article <7452@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
->It's occasionally been tried, and there is nothing fundamentally wrong with
->the idea.  The biggest reason for lack of popularity is that it doesn't help
->much with the code generated for typical existing high-level languages; they
->often don't provide convenient access to bit-level data, so applications are
->coded to access data in larger chunks and pick it apart themselves.
-Of course, high-level languages which provide convenient access to bit-level
-data have been tried occasionally, and haven't been very popular.  The
-biggest reason for this is that popular machine architectures don't provide
-efficient access to bit-level data, so applications are coded to access data
-in larger chunks and pick it apart themselves.

No, the IMPLEMENTATION would do the work in that case.  Although it
amounts to the same thing at the nitty-gritty level, assuming the
particular hardware doesn't support bit operations, it makes
application programming much nicer.  And, when compiled on a machine
that DOES have bit operations, the object code runs much faster.

This vicious circle of cause-and-effect needs to be broken somehow.
The fact that there are several application areas that could benefit
(as I listed earlier) should be sufficient reason to try.

cudcv@daisy.warwick.ac.uk (Rob McMahon) (03/27/88)

Newsgroups: comp.lang.c,comp.arch
Subject: Re: Bit Addressable Architectures
References: <11702@brl-adm.ARPA> <243@eagle_snax.UUCP> <2245@geac.UUCP> <1988Mar6.002518.945@utzoo.uucp> <2760@mmintl.UUCP> <17458@watmath.waterloo.edu>
Reply-To: cudcv@titania.warwick.ac.uk (Rob McMahon)
Organization: Computing Services, Warwick University, UK

In article <17458@watmath.waterloo.edu> ccplumb@watmath.waterloo.edu (Colin Plumb) writes:
>Backwards compatibility?  With a C compiler insulating the user, the
>only change is that sizeof(char) is now 8.

Only change ?  Sounds like a big "only" to me.  I wonder how much code
out there assumes that sizeof(char) == 1, that sizeof("constant string"), 
or sizeof(initialised_char_array) is the same as strlen(xx)+1 ?  Does
malloc now take number of bits required, or char's ?  It's going to
break either "malloc(n * sizeof(s))" or "malloc(strlen(s) + 1)", or does
everybody but me write "malloc((strlen(s)+1)*sizeof(char))" ?

Rob
-- 
UUCP:   ...!mcvax!ukc!warwick!cudcv	PHONE:  +44 203 523037
JANET:  cudcv@uk.ac.warwick.cu          ARPA:   cudcv@cu.warwick.ac.uk
Rob McMahon, Computing Services, Warwick University, Coventry CV4 7AL, England

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/29/88)

In article <504@sol.warwick.ac.uk> cudcv@cu.warwick.ac.uk (Rob McMahon) writes:
>I wonder how much code out there assumes that ...
>sizeof("constant string"), or sizeof(initialised_char_array)
>is the same as strlen(xx)+1 ?

There's a lot of code like that, no question.  It would continue to
work if sizeof(char) were allowed to be other than 1, on most current
systems, although it might not be portable to other systems or to
future compiler releases.

>Does malloc now take number of bits required, or char's ?

malloc() would be told the number of "bytes" required, where
sizeof(byte)==1.  By "byte" I mean the smallest addressable storage
unit, not necessarily 8 bits in size, nor 1 bit, nor big enough to
represent a character.  (In my proposal this was a "short char".)

Your concerns are legitimate, but so are those of programmers
who have to deal with so-called multi-byte character representations.
Anyway, X3J11 did not buy into the "short char" idea and I doubt they
will be willing to change to it now.

karl@haddock.ISC.COM (Karl Heuer) (03/30/88)

In article <504@sol.warwick.ac.uk> cudcv@cu.warwick.ac.uk (Rob McMahon) writes:
>... or does everybody but me write "malloc((strlen(s)+1)*sizeof(char))" ?

I do.  It automatically makes the argument the right type$ for malloc()
(assuming malloc's argument and sizeof's result are both unsigned, or size_t
in ANSI C); and it makes it easier to convert when you later decide that you
want to use some type other than char.  And, of course, it makes the code no
longer dependent on the questionable% sizeof(char)==1.
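
A minimal sketch of the idiom in C; the function name dup_string is
hypothetical and the code is an illustration rather than anyone's actual
library routine.

```c
#include <stdlib.h>
#include <string.h>

/* Duplicate a string, sizing the allocation as count * sizeof(element). */
char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;               /* characters plus the NUL */
    char *copy = malloc(n * sizeof(char));  /* element size written out */

    if (copy != NULL)
        memcpy(copy, s, n * sizeof(char));
    return copy;                            /* caller frees; NULL on failure */
}
```

Writing the factor as sizeof *copy instead of sizeof(char) goes one step
further, since the size then follows the pointer's type if it is changed.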

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
Followups to comp.lang.c only.
________
$ Yes, strlen() is already a size_t in ANSI C, but this situation can also
occur with int-valued expressions.
% I don't question that it's true, just whether it's a good idea.

greg@csanta.UUCP (Greg Comeau) (04/01/88)

Hmm, I'm new on the net here, so excuse me for jumping into the middle of a
discussion, but sizeof(char) is always 1.  The number of bits in a char
is a whole 'nuther story.  This is usually 8, but need not be.  This is
true of both dpANSI C and K&R C.
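
A short C sketch of the distinction.  sizeof counts in units of char, and the
number of bits in a char is available separately as CHAR_BIT from <limits.h>
in standard C.  The macro BITS_OF and the two helper functions are
hypothetical names for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Size of an object or type in bits: sizeof counts chars,
 * CHAR_BIT gives the number of bits in each char. */
#define BITS_OF(x) (sizeof(x) * CHAR_BIT)

size_t char_size(void)    { return sizeof(char); }  /* 1 by definition */
size_t bits_in_char(void) { return CHAR_BIT; }      /* implementation-defined */
```

On an implementation with 8-bit chars and a 4-byte int, BITS_OF(int) is 32,
while sizeof(char) stays 1 regardless of CHAR_BIT.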

gp@picuxa.UUCP (Greg Pasquariello X1190) (04/04/88)

In article <113@csanta.UUCP> greg@csanta.UUCP (Root) writes:
>Hmm, I'm new on the net here, so excuse me for jumping into the middle of a
>discussion, but sizeof(char) is always 1.  The number of bits in a char
>is a whole 'nuther story.  This is usually 8, but need not be.  This is
>true of both dpANSI C and K&R C.

This is true of the dpANSI C and K&R C _implementation_, but it is not
necessarily true of the C definition.  Sizeof yields "the size, in bytes, of its
operand" (K&R pg 188).  The fundamental type char is "large enough to store
any member of the implementation's character set" (K&R pg 182).  This _could_
be multiple bytes!

(God I hope what I just said is true :-))

Greg Pasquariello
ihnp4!picuxa!gp

mouse@mcgill-vision.UUCP (der Mouse) (04/10/88)

In article <7578@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
> In article <504@sol.warwick.ac.uk> cudcv@cu.warwick.ac.uk (Rob McMahon) writes:
>> I wonder how much code out there assumes that ...
>> sizeof("constant string"), or sizeof(initialised_char_array)
>> is the same as strlen(xx)+1 ?
> There's a lot of code like that, no question.  It would continue to
> work if sizeof(char) were allowed to be other than 1,

How could it?

char foo[] = "This is foo";

strlen(foo) is 11.  sizeof(foo) is 12*sizeof(char).  Or are you
redefining strlen() as well?
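
The two quantities can be checked directly.  A minimal C sketch (the helper
function names are hypothetical):

```c
#include <string.h>

static const char foo[] = "This is foo";

/* Characters before the terminating NUL. */
size_t foo_length(void) { return strlen(foo); }

/* Storage occupied by the array, including the NUL, in units of char. */
size_t foo_size(void) { return sizeof(foo); }
```

With sizeof(char) equal to 1 these are 11 and 12.  The relation
sizeof(foo) == (strlen(foo) + 1) * sizeof(char) is the one that would still
hold if sizeof(char) were something else.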

Personally, I tend towards the sizeof returning size in bits rather
than bytes.  And making bits full objects.  And lots of other
things...but this belongs in comp.lang.d.

					der Mouse

			uucp: mouse@mcgill-vision.uucp
			arpa: mouse@larry.mcrcim.mcgill.edu

gwyn@brl-smoke.ARPA (Doug Gwyn ) (04/10/88)

In article <1040@mcgill-vision.UUCP> mouse@mcgill-vision.UUCP (der Mouse) writes:
-In article <7578@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
-> There's a lot of code like that, no question.  It would continue to
-> work if sizeof(char) were allowed to be other than 1,
-How could it?

Naturally, if you truncate the explanation by stopping at the "," then
you missed it.  Why do people narrow their focus to such small contexts?

major@eleazar.Dartmouth.EDU (Lou Major) (04/13/88)

*ahem*
 
char foo[]="This is a test.";
 
sizeof (foo) == sizeof (char *)

NOT the number of machine bytes/words those characters take up. (16, for most
typical installations)

ok@quintus.UUCP (Richard A. O'Keefe) (04/14/88)

In article <8646@eleazar.Dartmouth.EDU>, major@eleazar.Dartmouth.EDU (Lou Major) writes:
> *ahem*
> char foo[]="This is a test.";
> sizeof (foo) == sizeof (char *)
> NOT the number of machine bytes/words those characters take up. (16, for most
> typical installations)

Wrong.  The answer *is* 16.  This is one of the few cases where
foo and &(foo[0]) are different.  I _tried_ this to make sure I was right.
That's always a good idea.

edward@ucbarpa.Berkeley.EDU (Edward Wang) (04/14/88)

 In article <8646@eleazar.Dartmouth.EDU> major@eleazar.Dartmouth.EDU (Lou Major) writes:
 >
 >*ahem*
 > 
 >char foo[]="This is a test.";
 > 
 >sizeof (foo) == sizeof (char *)
 >
 >NOT the number of machine bytes/words those characters take up. (16, for most
 >typical installations)

 This is just plain false.

davidsen@steinmetz.ge.com (William E. Davidsen Jr) (04/14/88)

In article <8646@eleazar.Dartmouth.EDU> major@eleazar.Dartmouth.EDU (Lou Major) writes:
...
| char foo[]="This is a test.";
|  
| sizeof (foo) == sizeof (char *)
| 
| NOT the number of machine bytes/words those characters take up. (16, for most
| typical installations)

Excuse me? Not on any machine I've ever seen, or in any standard I've
ever read. foo is an array, not a pointer. What you said would be true
if the declaration was:
	char *foo = "This is a test";

I've moved followup to comp.lang.c
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

emiller@bbn.com (ethan miller) (04/14/88)

In article <877@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
=>In article <8646@eleazar.Dartmouth.EDU>, major@eleazar.Dartmouth.EDU (Lou Major) writes:
=>> char foo[]="This is a test.";
=>> sizeof (foo) == sizeof (char *)
=>> NOT the number of machine bytes/words those characters take up. (16, for most
=>> typical installations)
=>
=>Wrong.  The answer *is* 16.  This is one of the few cases where
=>foo and &(foo[0]) are different.  I _tried_ this to make sure I was right.
=>That's always a good idea.

Sure is.  What did you try?  _I_ just tried printing foo and &(foo[0]), and
they are the same.  BTW, I also tried sizeof (foo), and it is 16.

ethan
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
Ethan Miller			   | "If they don't keep on exercising
BBN Laboratories (Cambridge, MA)   |  their lips, he thought, their brains
ARPAnet: emiller@bbn.com           |  start working."
Disclaimer: BBN didn't write this. |   -- The Hitchhiker's Guide to the Galaxy
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

chris@mimsy.UUCP (Chris Torek) (04/15/88)

In article <8646@eleazar.Dartmouth.EDU> major@eleazar.Dartmouth.EDU
(Lou Major) writes:
>*ahem*
>char foo[]="This is a test.";
>sizeof (foo) == sizeof (char *)
>NOT the number of machine bytes/words those characters take up.

Quite wrong.

There *are* compilers that produce the wrong answer for

	sizeof("string")

(the `correct' number is 7), but any compiler that gets sizeof(foo)
above wrong (2 or 4 rather than 16) is so egregiously broken that
it is not worth discussing.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

root@mfci.UUCP (SuperUser) (04/15/88)

In article <8646@eleazar.Dartmouth.EDU> major@eleazar.Dartmouth.EDU (Lou Major) writes:
}
}*ahem*
}
}char foo[]="This is a test.";
}
}sizeof (foo) == sizeof (char *)
}
}NOT the number of machine bytes/words those characters take up. (16, for most
}typical installations)

Not on this planet.  From K&R:  "When applied to an array, the result is the
total number of bytes in the array."  The parentheses don't alter this fact.
The example above is confusing because the number of characters in foo is
exactly 16 (15 in the quotes plus the final null character).

ok@quintus.UUCP (Richard A. O'Keefe) (04/15/88)

In article <23396@bbn.COM>, emiller@bbn.com (ethan miller) writes:
> In article <877@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
> =>In article <8646@eleazar.Dartmouth.EDU>, major@eleazar.Dartmouth.EDU (Lou Major) writes:
> =>> sizeof (foo) == sizeof (char *)
> =>Wrong.  The answer *is* 16.  This is one of the few cases where
> =>foo and &(foo[0]) are different.  I _tried_ this to make sure I was right.
> =>That's always a good idea.
> Sure is.  What did you try?  _I_ just tried printing foo and &(foo[0]), and
> they are the same.  BTW, I also tried sizeof (foo), and it is 16.

foo and &(foo[0]) as expressions have different types:
	foo	  : array-of-16-chars
	&(foo[0]) : pointer-to-char
sizeof notices this difference.  In almost any other context, there is an
implicit conversion to pointer-to-char form.  In particular, printf() is not
going to reveal the difference.  Consider
	short x = 1; int y = 1;
	printf("%d %d\n", x, y);
Printing obscures the difference between x and y, and in just the same way
it obscures the difference between foo and &(foo[0]).
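
A minimal C sketch that shows both halves of the point: the two expressions
compare equal as addresses, yet sizeof sees different types.  The helper
function names are hypothetical.

```c
#include <stddef.h>

static char foo[] = "This is a test.";

/* sizeof sees the array type: 15 characters plus the NUL. */
size_t size_of_array(void) { return sizeof(foo); }

/* &foo[0] has type pointer-to-char, so this is sizeof(char *). */
size_t size_of_pointer(void) { return sizeof(&foo[0]); }

/* In a comparison the array converts to a pointer to its first element. */
int same_address(void) { return foo == &foo[0]; }
```

size_of_array() is 16 on any conforming compiler; size_of_pointer() is
whatever a char pointer occupies on the machine at hand.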

mike@turing.UNM.EDU (Michael I. Bushnell) (04/15/88)

In article <8646@eleazar.Dartmouth.EDU> major@eleazar.Dartmouth.EDU (Lou Major) writes:

>*ahem*

>char foo[]="This is a test.";

>sizeof (foo) == sizeof (char *)

>NOT the number of machine bytes/words those characters take up. (16, for most
>typical installations)

Perhaps people have had the POINTER == ARRAY thing hammered into
their skull too hard.  According to the 4.3 BSD "C Programming
Language Reference Manual, page 8 [PS1:1-8]", I find:

  The sizeof operator yields the size in bytes of its operand.  (...)
  When applied to an array, the result is the total number of bytes in
  the array.  The size is determined from the declarations of the
  objects in the expression....

But what about the compiler?  Here are the results.

for the code

char foo1[]="This is a test.";
int size1=sizeof foo1;
char *foo2="This is a test.";
int size2=sizeof foo2;

I get:

[4.3 BSD pcc]:

LL0:
	.data
	.data
	.globl	_foo1
_foo1:
	.long	0x73696854
	.long	0x20736920
	.long	0x65742061
	.long	0x2e7473
	.data
	.align	2
	.globl	_size1
_size1:
	.long	16		# NOTE: 16 for the array

	.align	2
	.globl	_foo2
_foo2:
	.data	2
L14:
	.ascii	"This is a test.\0"
	.data
	.long	L14
	.align	2
	.globl	_size2
_size2:
	.long	4		# NOTE: 4 for the pointer

[GNU C Compiler 1.18]:

#NO_APP
.globl _foo1
.data
	.align 0
_foo1:
	.ascii "This is a test.\0"
.globl _size1
.data
	.align 2
_size1:
	.long 16		# NOTE: 16 for the array

.globl _foo2
.text
	.align 0
LC0:
	.ascii "This is a test.\0"
.data
	.align 2
_foo2:
	.long LC0
.globl _size2
.data
	.align 2
_size2:
	.long 4			# NOTE: 4 for the pointer

Plug: Note how much easier it is to read the gcc stuff too...

                N u m q u a m   G l o r i a   D e o 

			Michael I. Bushnell
			HASA - "A" division
14308 Skyline Rd NE				Computer Science Dept.
Albuquerque, NM  87123		OR		Farris Engineering Ctr.
	OR					University of New Mexico
mike@turing.unm.edu				Albuquerque, NM  87131
{ucbvax,gatech}!unmvax!turing.unm.edu!mike

tainter@ihlpg.ATT.COM (Tainter) (04/18/88)

In article <8646@eleazar.Dartmouth.EDU>, major@eleazar.Dartmouth.EDU (Lou Major) writes:

Mr. Major asserts that given:
> char foo[]="This is a test.";

then:
> sizeof (foo) == sizeof (char *)
> NOT the number of machine bytes/words those characters take up. (16, for most
> typical installations)

Which is patently absurd.

--j.a.tainter