[comp.arch] 64 bits--why stop there?

ccc_ldo@waikato.ac.nz (Lawrence D'Oliveiro, Waikato University) (08/18/90)

I remember reading a very old book by M V Wilkes, called "Basic Machine
Principles". I was only a school student at the time, and I didn't
understand many of his arguments. But one thing that stuck in my mind
was his concept of structuring the virtual (i.e. programmer-visible)
address space on a tree basis, so that regions--at any level--could
grow and shrink independently. The advantage was that you could
have resizable objects that didn't need to have their addresses changed.

What do people think of other ways of addressing memory, besides
using a fixed-length integer? Would it make certain kinds of programs
easier to write? Or would it just move the overhead from the software
into the hardware?

Lawrence D'Oliveiro                       fone: +64-71-562-889
Computer Services Dept                     fax: +64-71-384-066
University of Waikato            electric mail: ldo@waikato.ac.nz
Hamilton, New Zealand    37^ 47' 26" S, 175^ 19' 7" E, GMT+12:00
Man who failed to pay bill for vasectomy is sent a final demand--
and threatened with reconnection.

wf@cs.glasgow.ac.uk (Mr Bill Findlay) (08/20/90)

In article <1263.26cdaecc@waikato.ac.nz> ccc_ldo@waikato.ac.nz (Lawrence D'Oliveiro, Waikato University) writes:
>I remember reading a very old book by M V Wilkes, called "Basic Machine
>Principles". I was only a school student at the time, and I didn't
>understand many of his arguments. But one thing that stuck in my mind
>was his concept of structuring the virtual (i e programmer-visible)
>address space on a tree basis, so that regions--at any level--could
>grow and shrink independently. The advantage was that you could
Actually, the book was by J. K. Iliffe.
He prototyped a machine while working for ICL about 20 years ago,
but the architecture was rejected as the basis for a range of machines.

mhjohn@aspen.IAG.HP.COM (Mark H Johnson) (08/20/90)

Clearing large memory can be done with tricks as noted in an 
earlier response.  However, often memory is tested before it
is used.  Trying a few simple patterns to identify pages to map
out can defer maintenance.   The only portion
of physical memory that must be initialized is that needed to boot
the system.  The rest can be done in background or "pay as you go".
At least one mini vendor (Prime) has done this for some time.

Mark H. Johnson  IAG Hewlett-Packard, mhjohn@iag.hp.com

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (08/22/90)

  While we're all talking about 64 bits, where is it writ' that word
size shall be a power of two bits? Outside of the prevalence of the
eight bit byte, is there a good technical reason for it? Certainly the
old Honeywells I used to use, with their lovely nine bit bytes and 36
bit words... Conversion to IBM was *REAL* painful. 

    bytes - never had a problem personally, or heard of one
    float - in many cases converted to double
    double - about 10% of the programs were partially rewritten to
        insure valid results. A few were moved to Cray, some dropped.
    int - if it couldn't justify running on the Cray, it got dropped.
        Faking multiple precision in FORTRAN is ugly and not worth it.

  Now this was going from 36 to 32 bits. If four bits can hurt that
much, how much would 48 bits help, instead of 64? I know some 48-bit
machines have been made (or were); what gains were there?

  A 48-bit int is +/- 10^14; assuming a 12-bit exponent and 36-bit
mantissa, floating-point range is 10^616 and accuracy is 1 in 10^10.8.
That seems to offer pretty good single precision, with double precision
even better. The addressing would be 262144 GB, enough for memory-mapped
databases for a decade or so.
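
  (For reference, a quick check of that arithmetic, assuming the 12-bit
exponent / 36-bit mantissa split above; just a back-of-the-envelope
sketch, not a proposed format:)

        #include <stdio.h>
        #include <math.h>

        int main(void)
        {
            /* 48-bit two's-complement integer: +/- 2^47 */
            printf("int range:   +/- %.2g\n", pow(2.0, 47.0));        /* ~1.4e14  */
            /* 12-bit exponent: magnitudes up to about 2^2048 */
            printf("fp range:    10^%.0f\n", 2048.0 * log10(2.0));    /* ~10^616  */
            /* 36-bit mantissa: relative accuracy about 2^-36 */
            printf("fp accuracy: 1 in 10^%.1f\n", 36.0 * log10(2.0)); /* ~10^10.8 */
            /* 48-bit byte addresses: 2^48 bytes */
            printf("addressing:  %.0f GB\n", pow(2.0, 48.0 - 30.0));  /* 262144   */
            return 0;
        }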

  Maybe some of the chip designers can give an idea of what the cost
ratio is for 48 vs 64 bits. Obviously a lot of new applications could be
tackled with 48 bits; is the saving over 64 significant, or should we
expect a leap to 64? 48 bits soon is more useful than 64 bits eventually,
perhaps.

  Points have been made for going to more than 64 bits, now I've put out
a few thoughts on going for less. Since we have some ideas about 64 bit
systems in actual practice, can someone tell us about 48?
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
    VMS is a text-only adventure game. If you win you can use unix.

sysmgr@KING.ENG.UMD.EDU (Doug Mohney) (08/22/90)

In article <2437@crdos1.crd.ge.COM>, davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) writes:
>
>  While we're all talking about 64 bits, where is it writ' that word
>size shall be a power of two bits? Outside of the prevalence of the
>eight bit byte, is there a good technical reason for it? 

A good point here. Won't things change with different technologies down
the pipe (optical)? Maybe you'll want to do things in terms of red, blue,
and yellow....I dunno.

mash@mips.COM (John Mashey) (08/22/90)

In article <2437@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:

>  While we're all talking about 64 bits, where is it writ' that word
>size shall be a power of two bits? Outside of the prevalence of the
>eight bit byte, is there a good technical reason for it? Certainly the
>old Honeywells I used to use, with their lovely nine bit bytes and 36
>bit words... Conversion to IBM was *REAL* painful. 
 ....
>  Now this was going from 36 to 32 bits. If four bits can hurt that
>much, how much would 48 bits help, instead of 64? I know some 48-bit
>machines have been made (or were); what gains were there?
.....
>  Maybe some of the chip designers can give an idea of what the cost
>ratio is for 48 vs 64 bits. Obviously there can be a lot of new
>applications tackled with 48 bits, is the saving over 64 significant, or
>should we expect a leap to 64? 48 bits soon is more useful than 64 bits
>eventually, perhaps.

1) As bill notes, computing history is filled with machines that have/had
word sizes that weren't powers of two. Some examples include:
36:	IBM 7090, DEC PDP-10, GE 635, Univac 1108
48:	Burroughs B5000
51:	Burroughs B6700
60:	CDC 6600
(and of course, the successors to these)
And of course, there were plenty of minis with 12, 18, or 24.
If you look around, you can probably find that somebody has built some
machine somewhere with almost any number of bits/word from 8 to 64, especially
given that tagged-architecture machines often have unusual sizes.

2) However, at this point, most people build general-purpose machines with
power-of-two wordsizes, and it seems likely that this will continue,
with the possible exception that tagged-architecture machines might
have power-of-two space for data, plus bits for tags.
Why?
	These days, you would have to think long and hard before creating
	a new general-purpose architecture to which it is difficult to:
	port C
	port UNIX
	port FORTRAN, COBOL, PL/1, PASCAL, etc, etc.
and
	you would want to think real hard before introducing architectures
whose character-size is inconvenient and poorly matched with existing
peripherals, support chips, etc.

Note: I did not say you'd never do this, I just said you'd better have
pretty good reasons for it.

3) Note that with 8-bit chars, 16-bit shorts, and words of 32 or 64,
addressing is simple (low order bits select sub-unit within the word),
everything is packed 100% full, and there are no weirdly-special pointers
to different kinds of objects.
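
(As a small illustration of why power-of-two packing keeps addressing
simple, compare the address split for 4-byte words with the split a
6-byte word would need; the function names are made up for illustration:)

        /* 32-bit words, 8-bit bytes: the split is a shift and a mask. */
        unsigned long word_of(unsigned long byte_addr)  { return byte_addr >> 2; }
        unsigned long byte_in(unsigned long byte_addr)  { return byte_addr & 3;  }

        /* 48-bit words, 6 bytes/word: the same split needs a divide and a
           modulo, which is exactly the hardware nobody wants in the
           address path. */
        unsigned long word_of_48(unsigned long byte_addr) { return byte_addr / 6; }
        unsigned long byte_in_48(unsigned long byte_addr) { return byte_addr % 6; }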

4) Unfortunately, 48 doesn't work very well under these circumstances:
	Assume char = 8 bits, short = 16, and int = 48.
	6 chars/word, 3 shorts.  Ugh.
Now, you get several very unpleasant choices:
	a) The machine is byte-addressed, and to obtain the address of
	the word containing a byte, you get to divide by 6, something
	for which hardware designers show scant enthusiasm. :-)
	b) The machine is word addressed, with some kind of special byte
	pointer (the solution adopted by most of the non-power-of-two
	machines; see the sketch after this list).  A typical mechanism
	would use the low-order 3 bits to select the byte within the word,
	with special string instructions that increment the word address,
	and reset the byte number to 0, whenever the byte count exceeds
	the number of bytes in a word.  Likewise, you will probably do
	something for shorts.
	In this case, the hardware folks may be happier, but the compiler
	people are not.  C has certainly been ported to such machines,
	and so have many UNIX commands, so it is possible. But it is
	not fun, and even worse, if you'd like to get lots of third-party
	software, things will not be so easy. (People may recall that
	the Stanford MIPS used word-addressing with byte pointers, whereas
	none of the MIPS Computer Systems chips do so....there's a reason :-)
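
Here is the sketch referred to above: one plausible shape for the byte
pointer in option b), a word address plus a byte number, with the wrap a
string instruction would handle.  (The struct layout and names are
hypothetical, for illustration only.)

        /* Hypothetical byte pointer for a word-addressed machine with
           6 bytes per 48-bit word. */
        struct byte_ptr {
            unsigned long word;     /* word address           */
            unsigned int  byte;     /* byte within word, 0..5 */
        };

        /* Advance to the next byte, as a string instruction might. */
        struct byte_ptr next_byte(struct byte_ptr p)
        {
            if (++p.byte == 6) {    /* ran past the last byte in the word */
                p.byte = 0;         /* reset the byte number              */
                p.word++;           /* and bump the word address          */
            }
            return p;
        }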

5) Well, maybe 8-bit bytes are bad, and a 48-bit machine should have
12-bit bytes, 24-bit shorts.  This is probably easier for porting software,
but there will still be problems.  It will be easier to make the code on
a single machine consistent, but it will be worse talking to the outside
world.  Networking code will be exciting, and you're on your own when it
comes to busses, peripheral chips, SIMMs, etc.  Finally, 12-bit bytes have the
awkwardness of using 50% more space than 8-bit ones, without even having
the advantage of improving language coverage much (i.e., as for some
Asian languages that really need about 16 bits/character, or more).

SUMMARY:
1) Software inertia strongly impels people to build machines whose
words contain 2**n bytes, for C especially, but also for other languages.
2) (Some) software inertia and (much) hardware inertia impels people
to use 8-bit characters.
3) So, I'd be amazed if a new general-purpose architecture would likely
be viable at 48 bits.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

usenet@nlm.nih.gov (usenet news poster) (08/22/90)

In article <41004@mips.mips.COM> mash@mips.COM (John Mashey) writes:

    [Why not 48-bit processors?]

>1) Software inertia strongly impels people to build machines whose
>words contain 2**n bytes, for C especially, but also for other languages.
>2) (Some) software inertia and (much) hardware inertia impels people
>to use 8-bit characters.
>3) So, I'd be amazed if a new general-purpose architecture would likely
>be viable at 48 bits.

In scientific work 64-bit floating point has become standard.  Hybrid
processors with some 32 and some 64-bit paths/registers are viable, but
increasing the 32-bit paths to 48 is not going to help much in 64-bit
transfers, and you would still need two 48-bit registers to store one
64-bit number.  Finally, I don't think 48-bit FP could totally replace
64-bit, although it could do a lot better than 32-bit FP.

David States

mash@mips.COM (John Mashey) (08/22/90)

In article <41004@mips.mips.COM>, mash@mips.COM (John Mashey) writes:
....
> 3) So, I'd be amazed if a new general-purpose architecture would likely
> be viable at 48 bits.

Of course, several people wrote to point out various 24/48 bit machines.
So I must not have made the point strongly enough.

1) Computer design is driven by the tradeoffs
	a) At that time of design
	b) Expected over the life of the design, which of course,
	could easily be as long as 25-30 years (S/360, so far :-)
	c) With a proper balance between a) and b), i.e., if you are
	a startup, you'd better weight a) enough that the first product
	out the door makes sense.  Bigger organizations might make
	some kinds of longer-term tradeoffs that are non-optimal in first
	round, but better over the life.  Obviously the best is to have
	things that are terrific at each round :-)
	-As an example, suppose you wanted to build the best thing you could
	in a given silicon technology, in a given amount of space.
	You'd put things as close together as possible.
	If you did this, the chip may well not shrink easily into the
	next technology, especially because you don't get to shrink the
	wires as fast as the transistors. An alternative (which is what
	MIPS does), is to make each base design as fast as it can,
	subject to allowing for 1 or more straightforward shrinks,
	so the point of optimization is more like the 2nd or 3rd
	technology, not the first. 

2) But the tradeoffs change over time, and they can change a lot.
What was a good idea 10 or 20 years ago may be the wrong choice now.
A point of the earlier posting was that the software-related tradeoffs
have changed radically in the last 10 years, such that anyone doing
an 8-bit-byte, 32-bit word (or 64, sometime), two's-complement,
byte-addressed architecture,
gets a "free ride" with a huge amount of fairly portable software,
compared to someone who does something much different.
(This is not necessarily good, but I claim that it is true, and not likely
to change for at least 5-10 years, because something different has got
to be compelling.)
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

tgg@otter.hpl.hp.com (Tom Gardner) (08/22/90)

|  While we're all talking about 64 bits, where is it writ' that word
|size shall be a power of two bits? Outside of the prevalence of the
|eight bit byte, is there a good technical reason for it? Certainly the
|old Honeywells I used to use, with their lovely nine bit bytes and 36
|bit words... Conversion to IBM was *REAL* painful.

Who says they have to have an _even_ number of bits? The first machine
I used had 39 bits. Why? Simple, when you know how: each word contained
two 19 bit instructions plus a "modifier" bit.

Drift on: most coax cables are 50 or 75 ohm. Why was TAT-7 (Trans-Atlantic
Telephone cable 7) 61.8 ohms?

I'll give a piece of wedding cake to the person who gives the nearest answer.

brian@ncrorl.Orlando.NCR.COM (brian) (08/22/90)

As John indicated in synopsis #3, he'd doubt if anyone would come
up with a 48-bit machine.  Well.....  Take yea ole Harris Vulcan
and you will find that this beast (yup, that's the word) has
a 24-bit word (and bus and......)  SOOOO all they have to do is
double it, right???

Sorry, just couldn't resist... :-)

brian
"His job is to shed light, not to master!!!"

richard@aiai.ed.ac.uk (Richard Tobin) (08/22/90)

In article <2437@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
>  While we're all talking about 64 bits, where is it writ' that word
>size shall be a power of two bits? Outside of the prevalence of the
>eight bit byte, is there a good technical reason for it?

As you note, the existence of eight bit bytes makes a difference.  A
risc-like machine on which the number of bytes (by which I mean the
smallest addressable unit) in a word was not a power of two would be
very strange.  How big would the pages be?  What would the alignment
restrictions be? ("doubles must start at an address which is a multiple 
of six"?).

-- Richard

-- 
Richard Tobin,                       JANET: R.Tobin@uk.ac.ed             
AI Applications Institute,           ARPA:  R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.                UUCP:  ...!ukc!ed.ac.uk!R.Tobin

kevinw@portia.Stanford.EDU (Kevin Rudd) (08/23/90)

In article <3259@skye.ed.ac.uk> richard@aiai.UUCP (Richard Tobin) writes:
>In article <2437@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
>>  While we're all talking about 64 bits, where is it writ' that word
>>size shall be a power of two bits? Outside of the prevalence of the
>>eight bit byte, is there a good technical reason for it?
>
>As you note, the existence of eight bit bytes makes a difference.  A
>risc-like machine on which the number of bytes (by which I mean the
>smallest addressable unit) in a word was not a power of two would be
>very strange.  How big would the pages be?  What would the alignment
>restrictions be? ("doubles must start at an address which is a multiple 
>of six"?).

And who says that machines must have byte addressable memory?  There would
then be no need for alignment restrictions based on bytes.  Of course, as has
been mentioned already, machines with odd sized sub-words (most common
is the byte) can have problems with SW conversion.  But if there is no address
access for a byte except through instructions (such as "get byte n" or
"set byte n") then this shouldn't be a problem.  Page size is only a
problem in terms of mapping the machine page size in words into a
backing store block size.  Since most peripherals are byte or word oriented
it seems that this is where the real problem would lie.  But if there is
a reason for such a machine in a major marketplace, a major manufacturer
could certainly design custom hardware to match this design.  But
there'd better be a *good* reason for having such an oddball scheme.  For
most applications it doesn't seem too practical...  At least, considering
the marketplace.  Besides, who buys single source, anyway...  386+ and 68k
excepted...

  -- Kevin

seanf@sco.COM (Sean Fagan) (08/23/90)

In article <2437@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
>  While we're all talking about 64 bits, where is it writ' that word
>size shall be a power of two bits? Outside of the prevalence of the
>eight bit byte, is there a good technical reason for it?

None.  Example:  CDC Cyber 170-state (ok:  everyone who knew I was going to
say that, raise your hand 8-)).  Very fast machine, RISC-type architecture,
designed by Seymour ``God'' Cray, etc.

It had 60-bit words (one's complement, even!).  Its OS, NOS, used 6-bit
characters, and could pack 10 of them to a word (btw:  word-addressing
*only*).

Addresses were 18 bits, and this, plus the above, allowed a nice little
thing:  a symbol name and its address could take up one word:  7 6-bit
``bytes'' for the name (since FORTRAN only allows 6, and NOS allowed 7), and
18 bits for the address.  Nice.
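
(A sketch of that packing, assuming 6-bit characters and the 18-bit
address in the low end of the word; the exact field order is a guess,
and the 60-bit word is held in a 64-bit integer purely for illustration,
since 7*6 + 18 = 60:)

        unsigned long long pack_symbol(const unsigned char name[7],
                                       unsigned long addr)
        {
            unsigned long long w = 0;
            int i;

            for (i = 0; i < 7; i++)
                w = (w << 6) | (name[i] & 077);    /* seven 6-bit characters */
            return (w << 18) | (addr & 0777777);   /* 18-bit address         */
        }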

The successor to the 7600, the 8600, was to be a 64-bit machine (don't know
how many address bits, though), partially because the 60-bit word was, even at
the time (late 60's to early 70's), causing some problems.  Also because
people wanted it, of course 8-).

-- 
Sean Eric Fagan  | "let's face it, finding yourself dead is one 
seanf@sco.COM    |   of life's more difficult moments."
uunet!sco!seanf  |   -- Mark Leeper, reviewing _Ghost_
(408) 458-1422   | Any opinions expressed are my own, not my employers'.

dricejb@drilex.UUCP (Craig Jackson drilex1) (08/23/90)

In article <1990Aug22.031911.7376@nlm.nih.gov> states@tech.NLM.NIH.GOV (David States) writes:
>In article <41004@mips.mips.COM> mash@mips.COM (John Mashey) writes:
>
>    [Why not 48-bit processors?]
>
>>1) Software inertia strongly impels people to build machines whose
>>words contain 2**n bytes, for C especially, but also for other languages.

I think C would be the chief offender here--few other languages expose
the characters/word ratio quite as much.  Note, however, that at least
one C compiler has been written for a 48-bit word machine.
>>2) (Some) software inertia and (much) hardware inertia impels people
>>to use 8-bit characters.

The nice thing about 24 and 48 bit words, in years past, was that you
could straddle the six-bits-per-char/eight-bits-per-char argument.

>>3) So, I'd be amazed if a new general-purpose architecture would likely
>>be viable at 48 bits.

I'd agree with this.  In this and several other regards, the industry
has significantly calcified in the last few years.  Similarly, I doubt
if a new general-purpose architecture would be viable if a Unix port
(not just a POSIX interface) wasn't almost trivial.

>In scientific work 64-bit floating point has become standard.  Hybrid
>processors with some 32 and some 64-bit paths/registers are viable, but
>increasing the 32-bit paths to 48 is not going to help much in 64-bit
>transfers, and you would still need two 48-bit registers to store one
>64-bit number.  Finally, I don't think 48-bit FP could totally replace
>64-bit, although it could do a lot better than 32-bit FP.

We have been quite successful doing econometric calculations in 48-bit
floating point on a Burroughs for years.  (Econometrics  may not be
scientific, but it's certainly numerically intensive.)  We only rarely
have had to resort to 96-bit double precision.
-- 
Craig Jackson
dricejb@drilex.dri.mgh.com
{bbn,axiom,redsox,atexnet,ka3ovk}!drilex!{dricej,dricejb}

mustard@sdrc.UUCP (Sandy Mustard) (08/23/90)

48 bits for addressing is already used in the IBM AS/400.
This can easily be expanded to 64 bits without impacting
the applications.  The applications address objects using
a 64-bit virtual address, and at execution time this is converted
to 48 bits.

Sandy Mustard
SDRC

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (08/24/90)

In article <1990Aug22.031911.7376@nlm.nih.gov> states@tech.NLM.NIH.GOV (David States) writes:

| 64-bit number.  Finally, I don't think 48-bit FP could totally replace
| 64-bit, although it could do a lot better than 32-bit FP.

  True, but 96 bit f.p. would certainly be nice for some things.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
    VMS is a text-only adventure game. If you win you can use unix.

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (08/24/90)

In article <15249@drilex.UUCP> dricejb@drilex.UUCP (Craig Jackson drilex1) writes:

| I think C would be the chief offender here--few other languages expose
| the characters/word ratio quite as much.  Note, however, that at least
| one C compiler has been written for a 48-bit word machine.
| >>2) (Some) software inertia and (much) hardware inertia impels people
| >>to use 8-bit characters.

  Having run C on a 36 bit machine, I can say that there are programs
which break, but there are a lot which break on a Cray (64 bit) too, so
the assumption that "all the world's a VAX" causes problems in any case.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
    VMS is a text-only adventure game. If you win you can use unix.

firth@sei.cmu.edu (Robert Firth) (08/24/90)

In article <1263.26cdaecc@waikato.ac.nz> ccc_ldo@waikato.ac.nz (Lawrence D'Oliveiro, Waikato University) writes:
>I remember reading a very old book by M V Wilkes, called "Basic Machine
>Principles". 

Here it is

	J K Iliffe: Basic Machine Principles
	Macdonald/Elsevier Computer Monographs
	Aylesbury, 1968

Sigh! And to think I bought it in 1969.

Yes, it contains some good discussions of addressing principles.
However, I still think Bell & Newell's 'Computer Structures' -
the old 1971 edition - has the best analysis of the key issues.

meissner@osf.org (Michael Meissner) (08/27/90)

In article <1990Aug23.015636.506@portia.Stanford.EDU>
kevinw@portia.Stanford.EDU (Kevin Rudd) writes:

| In article <3259@skye.ed.ac.uk> richard@aiai.UUCP (Richard Tobin) writes:
| >In article <2437@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
| >>  While we're all talking about 64 bits, where is it writ' that word
| >>size shall be a power of two bits? Outside of the prevalence of the
| >>eight bit byte, is there a good technical reason for it?
| >
| >As you note, the existence of eight bit bytes makes a difference.  A
| >risc-like machine on which the number of bytes (by which I mean the
| >smallest addressable unit) in a word was not a power of two would be
| >very strange.  How big would the pages be?  What would the alignment
| >restrictions be? ("doubles must start at an address which is a multiple 
| >of six"?).
| 
| And who says that machines must have byte addressable memory?  There would
| then be no need for alignment restrictions based on bytes.  Of course, as has
| been mentioned already, machines with odd sized sub-words (most common
| is the byte) can have problems with SW conversion.  But if there is no address
| access for a byte except through instructions (such as "get byte n" or
| "set byte n") then this shouldn't be a problem.  Page size is only a
| problem in terms of mapping the machine page size in words into a
| backing store block size.  Since most peripherals are byte or word oriented
| it seems that this is where the real problem would lie.  But if there is
| a reason for such a machine in a major marketplace, a major manufacturer like
| could certainly design custom hardware to match this design.  But
| there'd better be a *good* reason for having such an oddball scheme.  For
| most applications it doesn't seem too practical...  At least, considering
| the marketplace.  Besides, who buys single source, anyway...  386+ and 68k
| excepted...

Obviously you've never had the 'fun' of porting to a machine with
different types of pointers.  I supported a C compiler on such a
machine for 7 years (the Data General MV/Eclipse computers), and if I
never have to see such a beast again, it will be too soon.

C programmers are notorious for thinking that all pointers look the
same.  I had to put in several options to either flag when one type of
pointer was used in the wrong context, or silently add extra
instructions so that programmers who were too lazy to type things
correctly could get their programs to work.

IMHO, the 64 bit machine should represent all addresses in bits, not
bytes (and yes this will probably break those programs which do int
arithmetic on pointers -- but those are probably in the minority).
Before people lynch me, let me explain that I think that the
addresses that are not appropriately aligned should trap.
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Do apple growers tell their kids money doesn't grow on bushes?

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (08/29/90)

In article <MEISSNER.90Aug27105232@osf.osf.org> meissner@osf.org (Michael Meissner) writes:

| Obviously you've never had the 'fun' of porting to a machine with
| different types of pointers.  I supported a C compiler on such a
| machine for 7 years (the Data General MV/Eclipse computers), and if I
| never have to see such a beast again, it will be too soon.

  Actually I never ported C, but I have worked on a GE600/6000/DPS
language, and it certainly had every type of pointer you could wish.
 
| C programmers are notorious for thinking that all pointers look the
| same.  I had to put in several options to either flag when one type of
| pointer was used in the wrong context, or silently add extra
| instructions so that programmers who were too lazy to type things
| correctly could get their programs to work.

  Actually the ANSI compilers I've used are very good about complaining
until you use casts or clean up your logic. Far better code is the
result. In truth the biggest problem I've seen is people putting
addresses into ints.
 
| IMHO, the 64 bit machine should represent all addresses in bits, not
| bytes (and yes this will probably break those programs which do int
| arithmetic on pointers -- but those are probably in the minority).
| Before people lynch me, let me explain, that I think that the
| addresses that are not appropriately aligned should trap.

  As I recall the iAPX 432 had bit addresses, didn't it? Lord, I can't
remember... time *does* heal all wounds. 
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
    VMS is a text-only adventure game. If you win you can use unix.

colin@array.UUCP (Colin Plumb) (08/29/90)

In article <MEISSNER.90Aug27105232@osf.osf.org> meissner@osf.org (Michael Meissner) writes:
> IMHO, the 64 bit machine should represent all addresses in bits, not
> bytes (and yes this will probably break those programs which do int
> arithmetic on pointers -- but those are probably in the minority).
> Before people lynch me, let me explain, that I think that the
> addresses that are not appropriately aligned should trap.

Strong agreement.  There's something fundamentally ugly about byte addressing
when the bit is the "true" least addressable unit.  The fact that we've gotten
used to it (e.g. the bitfield kludge in C) doesn't necessarily make it good.
-- 
	-Colin

rpeglar@csinc.UUCP (Rob Peglar) (08/29/90)

In article <631@array.UUCP>, colin@array.UUCP (Colin Plumb) writes:
> In article <MEISSNER.90Aug27105232@osf.osf.org> meissner@osf.org (Michael Meissner) writes:
> > IMHO, the 64 bit machine should represent all addresses in bits, not
> > bytes (and yes this will probably break those programs which do int
> > arithmetic on pointers -- but those are probably in the minority).
> > Before people lynch me, let me explain, that I think that the
> > addresses that are not appropriately aligned should trap.
> 
> Strong agreement.  There's something fundamentally ugly about byte addressing
>when the bit is the "true" least addressable unit.  The fact that we've gotten

Even stronger agreement.

Speaking from experience on such a machine (true bit addresses), one has
to deeply consider the relationship between compiler and OS code where
byte addressing is taken for granted.  Sure, the compiler generates
the address (bit) of byte n as 0x100 (say) and byte n+1 as 0x108 (for
example); the real pain reveals itself in the numerous occasions where
either the OS code or an application assumes integer = pointer.  You
just can't add one to the above example (0x100+1 = 0x101) and assume
byte n+1 lives at that address (0x101).  This kind of subtlety caused
us the most pain in the development cycle.
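
(A minimal sketch of the difference, assuming char pointers hold bit
addresses and that a pointer/integer cast simply copies the bits, as on
the machine described:)

        void example(void)
        {
            char buf[16];
            char *p  = buf;
            char *q1 = p + 1;                 /* fine: the compiler adds 8 bits  */
            char *q2 = (char *)((long)p + 1); /* broken: adds 1 bit, not 1 byte  */
            (void)q1; (void)q2;
        }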

Rob


Rob Peglar
uunet!csinc!rpeglar

-- 
Rob Peglar	Comtrol Corp.	2675 Patton Rd., St. Paul MN 55113
		A Control Systems Company	(800) 926-6876

...uunet!csinc!rpeglar

rcpieter@svin02.info.win.tue.nl (Tiggr) (08/30/90)

rpeglar@csinc.UUCP (Rob Peglar) writes:

|Speaking from experience on such a machine (true bit addresses), one has
|to deeply consider the relationship between compiler and OS code where
|byte addressing is taken for granted.  sure, the compiler generates
|the address (bit) of byte n as 0x100 (say) and byte n+1 as 0x108 (for
|example);  the real pain reveals itself in the numerous occasions where
|either the OS code or an application assumes integer = pointer.  you
|just can't add one to the above example (0x100+1 = 0x101) and assume
|byte n+1 lives at that address (0x101).  this kind of subtlety caused
|us the most pain in the development cycle.

The person adding 1 to the int expecting to have the (char *) casted int point
to the next byte shouldn't call himself a programmer ('cos he isn't producing
a program but pure garbage (eh?)).

Tiggr

mash@mips.COM (John Mashey) (08/30/90)

In article <1372@svin02.info.win.tue.nl> rcpieter@svin02.info.win.tue.nl (Tiggr) writes:
>rpeglar@csinc.UUCP (Rob Peglar) writes:

>|Speaking from experience on such a machine (true bit addresses), one has
...
>|byte n+1 lives at that address (0x101).  this kind of subtlety caused
>|us the most pain in the development cycle.

>The person adding 1 to the int expecting to have the (char *) casted int point
>to the next byte shouldn't call himself a programmer ('cos he isn't producing
>a program but pure garbage (eh?)).

This area clearly needs discussion, not just because of this comment, but
also because of another email that came to me, which I accidentally lost
and couldn't reply to.  (The email told me that wanting bit-addressing
hardware to have
a language expression that should be portable meant that if everyone had
my attitude, calculus and most other important mathematics
never would have been invented. (??)  (As I was a physicist & mathematician
by original background, and have studied science/technology history a lot,
I don't quite understand this, but maybe whoever wrote me the mail will send
me more info to educate me.))

Anyway, consider computer architecture, language design, standards, and
practice (as opposed to purist theory):

IF you are designing computers for any market in which noticable amounts
of code already exist

THEN anything you do that makes it harder to port such software
must be justified by something that people want; the more difficulty,
the more absolutely compelling that something must be.

This fact is what causes computer companies to maintain upward compatibility,
up to the point where there is too much competitive disadvantage in
doing so. 

These days, the domain of possibilities includes:
	a) It's upward-compatible from anything that has a large installed
	base, and then it's faster, cheaper,  or smaller.
	b) It's faster, cheaper, or smaller, and although it's different,
	its architecture+software make it easy to recompile lots of
	existing code with zero hassle.
	c) It's faster, cheaper, or smaller, and although it's different,
	its architecture+software make it easy to recompile lots of
	existing code with minimum hassle.  For example, minor portability
	flaws will need to get fixed.  EX: most of the RISCs pass some arguments
	in registers, and they like you to take varargs.h seriously.
	d) It's (better).... and with modest work, it's a lot faster.
	Ex: recoding to get to vector libraries, or slight rework of
	algorithms, or a few extra declarations to help.
	(Note: Perfect Club has nice methodology in this area.)
	Ex: profile-driven compiling, which requires hacking of makefiles.
	e) It's (better), but only when you redesign the code
	substantially.
	f) It's (better), but it doesn't even work until you redesign it,
	or recode it in a special language available nowhere else.
	g) It's not any better, just different, and hard to port to.

Now:
	a) is anybody with serious installed base.
	c) is anybody with hot new general-purpose technology, like RISCs.
	It's fairly hard to get b).  Doing c) well requires good judgement
	about 1) How people really use languages, and 2) The realistic
	progress of portability standards and 3) if necessary, the realistic
	practicality of requiring standards a little earlier than people
	would actually get there naturally.
	d) includes many scientific machines, like vector machines
	e) often includes fine-grain parallel processors
	f) might include the state of Transputers when they first came
	out, when Occam was really the preferred language.
	g) is what happened to vendors who tried to move UNIX to some
	superminis or mainframes that C didn't really fit very well,
	i.e., I believe relatively few machines were sold this way.

NOTE: THERE IS NOTHING WRONG WITH e) or f), IF you can find people for
whom the value of the improvement is worth the hassle, and there are
plenty of such applications that demand highest performance or lowest
cost.  Progress sometimes happens this way, and this is good.
Also, compelling reasons cause people to clean up their act.  For
instance, recall that once upon a time, UNIX was pretty much Little-Endian,
with a lot of 16-bit stuff that wasn't very portable.  Things improved
when there was a good reason to do it.

However, it is still clear that bit-addressing would break more programs
than byte-addressing, so all I wanted was for someone to work through
a complete example to show why it would be A Good Thing on balance.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

lm@snafu.Sun.COM (Larry McVoy) (08/30/90)

In article <1372@svin02.info.win.tue.nl> rcpieter@svin02.info.win.tue.nl (Tiggr) writes:
>rpeglar@csinc.UUCP (Rob Peglar) writes:
>
>|Speaking from experience on such a machine (true bit addresses), one has
>|to deeply consider the relationship between compiler and OS code where
>|byte addressing is taken for granted.  sure, the compiler generates
>|the address (bit) of byte n as 0x100 (say) and byte n+1 as 0x108 (for
>|example);  the real pain reveals itself in the numerous occasions where
>|either the OS code or an application assumes integer = pointer.  you
>|just can't add one to the above example (0x100+1 = 0x101) and assume
>|byte n+1 lives at that address (0x101).  this kind of subtlety caused
>|us the most pain in the development cycle.
>
>The person adding 1 to the int expecting to have the (char *) casted int point
>to the next byte shouldn't call himself a programmer ('cos he isn't producing
>a program but pure garbage (eh?)).

Ahem.  Two points:

(1)  Unix has reached the age where it has what can be called dusty deck code.
     And this code frequently does stuff like

	 char *bar = (char*)malloc(100);
    
    which doesn't work under Rob's machine.  Do we want to support this code?

(2) Even so, it doesn't turn out to be so bad.  I hacked over lint to catch
    these sorts of problems and ported most of /usr/src/cmd to the ETA in a
    week.  ``Work smarter, not harder.''
---
Larry McVoy, Sun Microsystems     (415) 336-7627       ...!sun!lm or lm@sun.com

Chuck.Phillips@FtCollins.NCR.COM (Chuck.Phillips) (08/30/90)

John> However, it is still clear that bit-addressing would break more programs
John> than byte-addressing, so all I wanted was for someone to work through
John> a complete example to show why it would be A Good Thing on balance.

Yes, it would break a _few_ programs.  (e.g. incrementing a pointer cast to
an int or otherwise directly twiddling pointer bits.)  Adding a new C
type called "bit" (or boolean or logical etc.) should be orthogonal to
existing correctly written C code, except for the added keyword.  (void *)s
and (char *)s _can maintain the old semantics_, and old correct code need
only be recompiled after a character substitution, if necessary.  I think
this would fall under your category "b".

John> 	b) It's faster, cheaper, or smaller, and although it's different,
John> 	its architecture+software make it easy to recompile lots of
John> 	existing code with zero hassle.

If implemented in hardware, bit addressing could provide faster bit access
(bit read vs. byte read + mask + shift), and (at last!) a syntactically and
semantically consistent way to access multi-word bit arrays.
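
(For comparison, this is the byte-read-plus-mask-plus-shift idiom that
such hardware would replace; a generic sketch, not any particular
machine's code:)

        /* Read bit n of a multi-word bit array today: index, shift, mask. */
        int get_bit(const unsigned char *array, unsigned long n)
        {
            return (array[n >> 3] >> (n & 7)) & 1;
        }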

I'm _not_ suggesting this be added to ANSI C, but this could become a de
facto standard as 64 bit machines become more common.

#include <std/disclaimer.h>
--
Chuck Phillips  MS440
NCR Microelectronics 			Chuck.Phillips%FtCollins.NCR.com
2001 Danfield Ct.
Ft. Collins, CO.  80525   		uunet!ncrlnk!ncr-mpd!bach!chuckp

peter@ficc.ferranti.com (Peter da Silva) (08/30/90)

In article <141569@sun.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:
> (1)  Unix has reached the age where it has what can be called dusty deck code.
>      And this code frequently does stuff like

> 	 char *bar = (char*)malloc(100);

>     which doesn't work under Rob's machine.

It won't work on *any* machine where sizeof(int) != sizeof(char*), or where
pointers are returned differently than integers (as in some 68000 compilers
that return ints in D0 and pointers in A0 for the very good reason that it's
more efficient that way when the result is going to be used immediately), or
on any number of other environments that don't look like VAXes.

>     Do we want to support this code?

No.
-- 
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
peter@ferranti.com

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (08/30/90)

In article <141569@sun.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:

| (1)  Unix has reached the age where it has what can be called dusty deck code.
|      And this code frequently does stuff like
| 
| 	 char *bar = (char*)malloc(100);
|     
|     which doesn't work under Rob's machine.  Do we want to support this code?

  This should work under an ANSI compliant C implementation; what's the
problem?  malloc() takes an argument in bytes and returns a (void *)
pointer, which may be cast to (char *) correctly.

  Neither malloc() nor (void *) works in "smallest addressable unit"
increments, and a byte may be anything as long as (a) it's at least
eight bits, and (b) every printable character in the native character
set fits in a char as a positive value (char may be unsigned by default
to allow this).

  You are correct that there is old code around, but not as much as you
would think, since a lot of it has been through compilers which will
catch stuff which is not portable, or through the great portability
testing software SCO calls xenix/286 (no smiley).
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
    VMS is a text-only adventure game. If you win you can use unix.

henry@zoo.toronto.edu (Henry Spencer) (08/30/90)

In article <141569@sun.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:
>(1)  Unix has reached the age where it has what can be called dusty deck code.
>     And this code frequently does stuff like
>
>	 char *bar = (char*)malloc(100);
>    
>    which doesn't work under Rob's machine...

Uh, why not?  C definitely requires that sizes be expressed in bytes; you
cannot fool with that without breaking great masses of code.  That doesn't
mean that addressing can't be to the bit.
-- 
TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology
OSI: handling yesterday's loads someday|  henry@zoo.toronto.edu   utzoo!henry

rcpieter@svin02.info.win.tue.nl (Tiggr) (08/31/90)

lm@sun.UUCP (Larry McVoy) writes:

|     And this code frequently does stuff like
|
|	 char *bar = (char*)malloc(100);
|    
|    which doesn't work under Rob's machine...

peter@ficc.ferranti.com (Peter da Silva) writes:

|It won't work on *any* machine where sizeof(int) != sizeof(char*), or where
|pointers are returned differently than integers (as in some 68000 compilers
|that return ints in D0 and pointers in A0 for the very good reason that it's
|more efficient that way when the result is going to be used immediately), or
|on any number of other environments that don't look like VAXes.

Suppose for one moment that at least the header file has been included
(or fix this in the same moment (John Mashey's b category)).  Then malloc
will have been declared as returning a void * or (non-ANSI) char *, and
nothing is wrong with the declaration of bar.  Not even on fancy addressing
machines.

Followups to comp.lang.c since this hasn't got anything to do with
computer architecture.

Tiggr

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (08/31/90)

In article <141569@sun.Eng.Sun.COM>, lm@snafu.Sun.COM (Larry McVoy) writes:
> (1)  Unix has reached the age where it has what can be called dusty deck code.
>      And this code frequently does stuff like
> 
> 	 char *bar = (char*)malloc(100);

>     which doesn't work under Rob's machine.  Do we want to support this code?

But it *does* work under Rob's machine!  C's malloc() is defined to take an
argument in *bytes*, not in hardware addressable units.  If malloc() has to
multiply its argument by 8 (or for that matter, by 137) to determine the
number of hardware addressable units to allocate, that's _malloc's_
business.  The C code doesn't notice.

What's more, code that does
	long *w; char *p; ...
	p = (char *)w + 1; ...
	w = (long *)(p - 1); ...
will work exactly as before.  Adding or subtracting 1 to a char pointer
will _really_ add or subtract 8 to a bit pointer, but the C code won't know.
	
The main thing that will break is
	long w; char *p;
	w = (long)p;
	w++;
	p = (char *)w;
which will no longer have the same effect as p++;
On the other hand, this never _was_ portable.  Consider a machine which
normally uses pointers to 16-bit words, where byte pointers use the top
(sign) bit to indicate even/odd byte within word.  On such a machine (I
have a real machine in mind) this C fragment would increment p by 2.

But even that can be fixed.  There is no way for existing C code to get
its hands on a bit pointer.  So if a C compiler for a bit-addressable
machine implements			as
	w = (long)p;			movw	p,w;	shli	$-3,w
	p = (char *)w;			movw	w,p;	shli	$3,p
then everything will continue to work as before.  The only case which will
_still_ break is
	union { char *p; long w; } pun;
	pun.p = p;
	pun.w++;
	p = pun.p;
but that is _already_ non-portable.

Note that this approach allows all C pointers to be bit pointers, and
it even allows pointers to objects smaller than chars -- which would have
to be an extension to C.  The only thing that bends is casting between
pointers and ints, which is already known to be non-portable; casting
between char*/void* and other pointers would be no problem.
-- 
You can lie with statistics ... but not to a statistician.

lm@snafu.Sun.COM (Larry McVoy) (08/31/90)

In article <1990Aug30.165552.3875@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>In article <141569@sun.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:
>>(1)  Unix has reached the age where it has what can be called dusty deck code.
>>     And this code frequently does stuff like
>>
>>	 char *bar = (char*)malloc(100);
>>    
>>    which doesn't work under Rob's machine...
>
>Uh, why not?  C definitely requires that sizes be expressed in bytes; you
>cannot fool with that without breaking great masses of code.  That doesn't
>meant that addressing can't be to the bit.
>-- 
>TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology
>OSI: handling yesterday's loads someday|  henry@zoo.toronto.edu   utzoo!henry

(Love the signature, henry.)

Anyway, the reason why not is so weird you'll die laughing.  Get this:
On the ETA machine, rather than fixing the compiler so that a ++ on a char*
incremented by 8 (bits) instead of one, they shifted bit addresses (the 
natural address of the machine) into byte addresses when they stored them in
pointers.  But they shifted back to bit addresses when shoving a pointer into
an int.  Why, you ask?  God only knows, it certainly seemed weird to me.

So, look at that code again.  From the compiler's point of view, we have
malloc(), a function returning an int, being assigned into a pointer
(the cast doesn't do squat, the assignment does the same thing).
The assignment causes the int to pointer shift.  Too bad, because malloc()
actually returned a pointer, i.e., the shift had already happened.  The
second shift gives you a la la land address.  Bummer.  Braindead system.
The fix is, of course, to declare malloc() before using it.
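
(That is, something like the following; under ANSI C the declaration
comes from <stdlib.h>, while pre-ANSI code would write "extern char
*malloc();".  A sketch of the repaired idiom, not the actual code in
question:)

        #include <stdlib.h>

        int main(void)
        {
            /* With malloc() properly declared, the compiler knows it
               returns a pointer, so no bogus int-to-pointer conversion
               is generated and the already-shifted address survives. */
            char *bar = (char *)malloc(100);

            free(bar);
            return 0;
        }
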
---
Larry McVoy, Sun Microsystems     (415) 336-7627       ...!sun!lm or lm@sun.com

rcpieter@svin02.info.win.tue.nl (Tiggr) (08/31/90)

lm@snafu.Sun.COM (Larry McVoy) writes:

|Anyway, the reason why not is so weird you'll die laughing.  Get this:
|On the ETA machine, rather than fixing the compiler so that a ++ on a char*
|incremented by 8 (bits) instead of one, they shifted bit addresses (the 
|natural address of the machine) into byte addresses when they stored them in
|pointers.  But they shifted back to bit addresses when shoving a pointer into
|an int.  Whay you ask?  God only knows, it certainly seemed weird to me.

So it is not a problem of a bit-addressing machine but of a braindead compiler.
Case solved.

Tiggr

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (08/31/90)

In article <141569@sun.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:
>  char *bar = (char*)malloc(100);
>    
>  which doesn't work under Rob's machine...

In article <1990Aug30.165552.3875@zoo.toronto.edu> henry@zoo.toronto.edu
(Henry Spencer) writes:
>  Uh, why not?

In article <141658@sun.Eng.Sun.COM>, lm@snafu.Sun.COM (Larry McVoy) writes:
> So, look at that code again.  From the compiler's point of view, we have
> malloc(), a function returning an int, being assigned into a pointer
> (the cast doesn't do squat, the assignment does the same thing).

The trouble was that the original posting failed to make it clear that
there *wasn't* a declaration
	extern char *malloc();
in scope.  I naturally assumed there was, because I thought the point
was "here is something which works now but would be broken by bit addressing".
Well,
	/* default: extern int malloc(); */
	char *p = (char*)malloc(100);
is ALREADY broken and has been broken for some time.
    * 	sizeof (char*) == sizeof (int)	THIS IS NOT A LAW OF C!  IT BREAKS!
Such code already broke about 10 years ago when you moved it to a
machine with 32-bit pointers but 16-bit ints, and yes there _were_ UNIX
machines around that did that.  Losing the top 16 bits of a pointer is a
pretty serious way to break...

-- 
You can lie with statistics ... but not to a statistician.

mash@mips.COM (John Mashey) (08/31/90)

In article <CHUCK.PHILLIPS.90Aug30120102@halley.FtCollins.NCR.COM> Chuck.Phillips@FtCollins.NCR.COM (Chuck.Phillips) writes:
>Yes, it would break a _few_ programs.  (e.g. incrementing a pointer cast to
>an int or otherwise directly twiddling pointer bits) Adding a new type C
>type called "bit" (or boolean or logical etc.) should be orthagonal to
>existing correctly written C code, except for the added keyword.  (void *)s
>and (char *)s _can maintain the old semantics_ and old correct code need
>only be recompiled after a character substitution, if necessary.  I think
>this would fall under your category "b".

>If implemented in hardware, bit addressing could provide faster bit access
>(bit read vs. byte read + mask + shift), and (at last!) a syntactically and
>semantically consistent way to access multi-word bit arrays.

Let me try the exercise again:  start with an architecture, modify it for
bit-addressing, add instructions, see how it fits with the languages,
and see what happens.  The above has at least some parts of such a proposal,
so let's try it and look at the other pieces.
(This is long, but has a short moral at the end.)

The comments seem to propose addition of 1 or more instructions
that load a bit-field into a register.  Let's look at that, and consider
adding onto any of the architectures similar to H&P's DLX (but if necessary,
I'll use an R3000 as a concrete example).
I'll assume that one really wants the following 2 instructions:
	LFS:	load-bitfield-signed	register,address,fieldsize
	LFU:	load-bitfield-unsigned	register,address,fieldsize
and that within a single word, they are the logical equivalents of:
	load-word	register,address
	shift-left	register,register,32-offset
	shift-right-X	register,register,32-fieldsize
			X = arithmetic (LFS) or logical (LFU)
OR, on machines that have one:
	load-word	register,address
	extract-field	register,register, offset, fieldsize

Of course, this is ignoring any setup required if everything is dynamic,
i.e., it assumes that a compiler is extracting a bitfield from something
whose byte address is known.  If the bitfield is 1-32 bits, and crosses
a word boundary, then longer sequences are needed, including two
loads, and then some shifting & masking.  (on an R3000, one might use
LWLeft, LWRight to get the right 4-bytes back together).
(working thru all these cases is left as an exercise :-)
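
(For concreteness, a sketch of the within-one-word case of that sequence
in C.  Here "offset" is taken to be the bit number just above the field's
high-order bit, which is my reading of the sequence above; the code
assumes 32-bit ints, and the final right shift in the signed case is
implementation-defined in C, though it matches the arithmetic shift the
instruction would perform:)

        /* LFU: load bitfield unsigned; word already fetched, field within
           one word, offset >= fieldsize >= 1. */
        unsigned int lfu(unsigned int word, int offset, int fieldsize)
        {
            /* shift-left 32-offset, then shift-right-logical 32-fieldsize */
            return (word << (32 - offset)) >> (32 - fieldsize);
        }

        /* LFS: load bitfield signed; the final shift is arithmetic, so the
           field is sign-extended. */
        int lfs(unsigned int word, int offset, int fieldsize)
        {
            return (int)(word << (32 - offset)) >> (32 - fieldsize);
        }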

Now, consider some low-level consequences:

1) The format of LFS/LFU doesn't fit into the typical 32-bit RISC
instruction, because it has an extra 5-bit operand ("fieldsize").
Where do you encode it?
	a) If you have base+index addressing, use the index register
	as the fieldsize.
	b) Steal the high-order 5 bits of the displacement.
	c) Steal the target register field, so that you always load the
	result into an implicitly-chosen register.
Now, compiler writers will gag on c) (which ends up requiring you to move
the result to the desired register some of the time anyway).
MIPS would have to use b), some others might use c).
The compiler writers must deal with the fact that these instructions
have different displacement-limits (b) or are non-indexing, unlike
other load instructions.
Either b) or c) mean that you've almost certainly created a new format:
	More hardware to decode the instruction (maybe, maybe not)
	More hardware to execute the instruction (almost certainly),
		and this might (or might not) be on the critical path.
It is unlikely that either of these is a big deal, but you never know.
(Given existing instruction formats, I'd guess HP PA might accommodate
the new instruction format easiest.)

2) Adding a bitfield extraction and arbitrary shift amount to the
load instruction may well increase the critical path (since load is usually
a critical path item).
	Most existing machines fetch a word, then run it through a network
	that: extracts the relevant byte, halfword, (and in MIPS case,
	tribytes), sign-extends or zerofills to 32-bits, and then
	loads it into the register.  (Except for MIPS LWL/LWR, which
	are more complicated.)
Either:
	a) You can add the complete variable bit-field extract into
	the load-path, without time penalty (but unlikely to be without
	chip space penalty),
Or
	b) You can't.

What do you do with case b)?
	b1) Lengthen the cycle time
	b2) Add an additional cycle of load-latency to whatever was there,
	for   b2a) these instructions
	      b2b) all load instructions
None of these are attractive, but it is hard to tell which is least bad.
b2b is probably awful.  b2a may cause additional irregularities in the
pipeline design.  As usual, serious simulation is needed to figure out
which might be best.   Note that if you add a cycle of load-latency,
you've lost back a cycle of your improvement whenever you can't fill
the load delay slots, that is:
	load
	(stall)	assuming 1-cycle delay
	extract
	use bitfield

consumes 4 cycles, as does:
	LFU
	(stall, stall)
	use bitfield

3) Alignment.  If you let LFU/LFS cross word boundaries in memory
(which is needed to achieve the express goal of convenient access to
arbitrary bitfields), then you for sure have references that cross word
boundaries, i.e., you're back with the unaligned-data problem that
most of the RISC machines don't want.  Solutions:
	3a) Bite the bullet, and let everything be unaligned,
	with known issues of complexity and possible critical path
	problems & more complex MMUs and maybe exception-handling.
	3b) use the IBM RS/6000 approach, and trap if it crosses
	a cache-line boundary, only.
	3c) Add 4 more instructions, a la R3000:
		LFULeft, LFURight, LFSLeft, LFSRight
		that put the pieces together via 2 separate fetches.
		Assume this does not run you out of opcodes....

4) Stores: we've done the easy part, but it seems like anything that
worked this way at all, would want bitfield stores.
Here, it is easier: there's only 1 store (SF), or maybe 3
(SF, plus SFLeft, SFRight) if 3c above is used.
On the other hand.....
	Perhaps some of hardware folks out there would like to
	show typical implementations of:
	a) Byte-oriented memory with parity
	b) Byte-oriented memory with ECC
	versus the bitfield-oriented versions of these things,
	i.e., ones that permit the processor to issue bitfield
	stores.
Extra signal pins?
Extra places where write turns into read-modify-write?
(Many possible configurations: best bet is to restrict these instructions
to cached memory, use on-chip write-back caches, and never let anything
else see it; that way, you only pay the read-modify-write price into
the on-chip data cache (assuming byte parity on the cache words).
Of course, if you restrict these to cached memory, you'd better be
careful what sort of code your compiler generates to handle bitfields
in structures .... that might be i/o device descriptions, and thus
not cached...)

MORAL OF THIS TRIVIAL EXAMPLE:

What sounds simple (have some instructions to access bitfields)
can have all sorts of ramifications, and the only way to figure them out
is to track down ALL the details.  Simple things can surprise you in
cycle time or latency hits.  

COMMENTS?  (esp. maybe some HP PA person would talk about critical-path
timing for load+extract, for example)
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

richard@aiai.ed.ac.uk (Richard Tobin) (08/31/90)

In article <2477@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:

>| 	 char *bar = (char*)malloc(100);

>  This should work under an ANSI compliant C implementation, what's the
>problem. malloc() takes an arguments in bytes, returns a (void *)
>pointer which may be cast to (char *) correctly.

But many programs that cast malloc() to char * do so because they
don't declare malloc() or #include a file that does.  Thus the result
is assumed to be an integer, and if integer <-> pointer casting isn't
a no-op you lose.

-- Richard
-- 
Richard Tobin,                       JANET: R.Tobin@uk.ac.ed             
AI Applications Institute,           ARPA:  R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.                UUCP:  ...!ukc!ed.ac.uk!R.Tobin

paulh@cimage.com (Paul Haas/1000000) (09/01/90)

In article <41188@mips.mips.COM> you write:
>However, it is still clear that bit-addressing would break more programs
>than byte-addressing, so all I wanted was for someone to work through
>a complete example to show why it would be A Good Thing on balance.

I think you are proposing an impossible task, since the evidence is
that bit-addressing is not "A Good Thing" on balance at the present
time.

If bit addressing is "A Good Thing", i.e. someone has a real need for
it, they will add it to their compiler or cause their compiler vendor to
add it.  For example, some people have a real need for 64-bit integers,
so many C compilers support the type "long long".

Adding bit pointers to C compilers would be easy on machines which
already have multiple incompatible pointer types.  Bit-pointers would
be just one more type.  On nicer machines, you have to teach the
compiler about multiple pointer types, but that has been done before.
In either case, emulating bit-addressing hardware in software is
straightforward.  A few months ago, I needed to emulate an array of
bits.  It took half a day to write the appropriate bits of code.

Actually, having bit-pointers would not have saved me much time, as
most of the time was spent trying to speed up my find_next_set_bit()
function, and figuring out how to handle malloc() or realloc()
failures, neither of which is affected by bit-pointers.  Assume,
though, that bit-pointers could save me half a day's work five times a
year.  That works out to about a one percent productivity improvement.
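
Roughly, the emulation amounts to something like this (a sketch with
invented names, not the actual code I wrote):

#include <limits.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

static void bit_set(unsigned long *v, unsigned long i)
{
    v[i / WORD_BITS] |= 1ul << (i % WORD_BITS);
}

static int bit_get(const unsigned long *v, unsigned long i)
{
    return (int)((v[i / WORD_BITS] >> (i % WORD_BITS)) & 1ul);
}

/* Index of the first set bit at or after 'from', or nbits if none.
   Skipping whole zero words is where most of the tuning effort goes. */
static unsigned long find_next_set_bit(const unsigned long *v,
                                       unsigned long nbits,
                                       unsigned long from)
{
    unsigned long i;
    for (i = from; i < nbits && i % WORD_BITS != 0; i++)  /* partial word */
        if (bit_get(v, i)) return i;
    for (; i < nbits; i += WORD_BITS)                     /* skip zero words */
        if (v[i / WORD_BITS] != 0)
            for (unsigned long j = i; j < nbits && j < i + WORD_BITS; j++)
                if (bit_get(v, j)) return j;
    return nbits;
}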

I used to use PR1ME 50 series processors from about 1979 to about
1985.  They had many pointer formats, including a bit pointer.  It was
not supported by any of their compilers except as a character pointer.
In case you haven't had the pleasure of using a Prime 750 or similar
processor, this format of pointer used three sixteen-bit words.  The
first word held the segment number, a 2-bit ring number, and a bit
indicating whether it was a two-word or three-word pointer.  The second
word was the word offset within the segment (segments held 64K 16-bit
words).  The third word held the bit offset.  Later processors in that
line added yet another pointer format that put an even/odd-byte bit into
the segment word, producing a 4-byte pointer which could point to
arbitrary characters.  If they had felt a need, they could have easily
improved the bit pointer support, but they did not bother.

Actually, I like the concept of bit-addressing.  I just can't think
of a justification.  I certainly wouldn't spend extra money on it
at this time.
---
Paul Haas  paulh@cimage.com
I am not a spokesperson for Cimage Corporation.

daveg@near.cs.caltech.edu (Dave Gillespie) (09/01/90)

>>>>> On 31 Aug 90 17:49:57 GMT, paulh@cimage.com (Paul Haas/1000000) said:
> If bit addressing is "A Good Thing" aka. someone has a need for it,
> they will add it to their compiler or cause their compiler vendor to
> add it.  For example, some people have a real need for 64 bit integers,
> thus many C compilers support type "long long".

I think John's original question was about using bit addressing across
the board in hardware, which is a little different.  You wouldn't need
to add a bit-pointer type to the compiler, because all pointers would
be bit-pointers.  So the question is, can we switch to using bit-pointers
without sacrificing performance, portability, low cost, etc.

> Actually, I like the concept of bit-addressing.  I just can't think
> of a justification.  I certainly wouldn't spend extra money on it
> at this time.

There's always the "pure research" justification.  The extra cost is
probably small, and there is promise for great benefits down the road,
maybe even benefits none of us have foreseen.  So it could be worth
trying "just in case."

Back when byte addressing was invented, it was probably a radical and
dubious idea.  After all, you can get along fine with word-pointers
for most uses and byte-pointers for specifically byte-oriented work.
Using byte-pointers for everything complicates the hardware, raises
tricky issues like what to do about unaligned accesses, and so on.
All you get for it is a pleasing internal consistency.  Do you want
to spend extra money on elegance?  Now that we're used to byte
addressing, the answer appears to be yes.  In fact we take it for
granted as one of the properties a serious modern architecture must
have.

Maybe years from now we'll all laugh (or curse) at those old clunkers
that had different formats for bit- and byte-pointers.

[Lest I become known as the net's great champion for bit-addressing, I
think I should point out that I myself doubt it will ever really be
"justified."  I just think we should keep an open mind.]

								-- Dave
--
Dave Gillespie
  256-80 Caltech Pasadena CA USA 91125
  daveg@csvax.cs.caltech.edu, ...!cit-vax!daveg

preston@titan.rice.edu (Preston Briggs) (09/01/90)

In article <DAVEG.90Aug31222343@near.cs.caltech.edu> daveg@near.cs.caltech.edu (Dave Gillespie) writes:

>[Lest I become known as the net's great champion for bit-addressing, I
>think I should point out that I myself doubt it will ever really be
>"justified."  I just think we should keep an open mind.]

Bits are sort of useful as flags and such.
However, I usually want to manage my bit-vectors in large chunks
(getting that 32-way parallelism when ANDing and ORing integers).
But what will we do with pairs and nybbles?
And will we have 1, 2, 4, 8, 16, 32, and 64 bit registers?
We could perhaps manage them like the common idea of using register
pairs for holding double-precision floats.
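
The word-at-a-time version is trivial; a rough C sketch of what I mean
by managing bit-vectors in large chunks:

#include <stddef.h>

static void bitvec_and(unsigned long *dst, const unsigned long *a,
                       const unsigned long *b, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        dst[i] = a[i] & b[i];   /* one operation covers a whole word of bits */
}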

So, if we have 64 x 64 bit registers, we'll need a lot of bits to
specify the registers (but of course we'll have a 64 bit instruction!).

	6 bits for 64   x 64 bit registers
	7	   128  x 32
	8	   256  x 16
	9	   512  x  8
	10	   1024 x  4
	11	   2048 x  2
	12	   4096 x  1 bit registers

For a 3-address bit-wise instruction, then, we'll need 36 bits to specify the
registers, leaving 28 bits to specify the operator (and how many opcodes do
we really need for operating on 2 bits?).

Imagine the interference graph I could build for register allocation!
We'll keep track of every bit in every live value and all their interferences.

Alternatively, we could take a Cray-style approach.  64 x 64 bit registers.
That's it, no subdivisions.  Memory is addressable in 64-bit chunks.
Saves addressing bits, shifts, unaligned access traps, register 
specification bits, massive hardware, and massive software.  Of course, 
it costs memory (64 bits per char), but that's cheaper every day.  
And it would certainly crunch floating-point, big integers, and
long bit-vectors (though perhaps not AWK programs).

Even without my exaggeration (did anyone notice?) of the fun of
bit-addressable memory, I'd still prefer the second approach.

-- 
Preston Briggs				looking for the great leap forward
preston@titan.rice.edu

dricejb@drilex.UUCP (Craig Jackson drilex1) (09/01/90)

In article <141569@sun.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:
>In article <1372@svin02.info.win.tue.nl> rcpieter@svin02.info.win.tue.nl (Tiggr) writes:
>>rpeglar@csinc.UUCP (Rob Peglar) writes:

>>>(Stuff about how the ETA, as a bit addressed machine, did transformations
>>>converting ints to pointers & vice-versa.)

>>The person adding 1 to the int expecting to have the (char *) casted int point
>>to the next byte shouldn't call himself a programmer ('cos he isn't producing
>>a program but pure garbage (eh?)).
>
>Ahem.  Two points:
>
>(1)  Unix has reached the age where it has what can be called dusty deck code.
>     And this code frequently does stuff like
>
>	 char *bar = (char*)malloc(100);
>    
>    which doesn't work under Rob's machine.  Do we want to support this code?

An amazing point was how many people didn't pick up the error here.
Sure, Rob did not mention that there was no proper declaration of malloc
in scope.  But the cast of (char *) on malloc should have been a hint--
a good compiler would complain about a 'redundant cast' unless you turned
the warning off, though Unix people wouldn't buy it.

>(2) Even so, it doesn't turn out to be so bad.  I hacked over lint to catch
>    these sorts of problems and ported most of /usr/src/cmd to the ETA in a
>    week.  ``Work smarter, not harder.''

One real shame is how often 'Unix' has been worked over by many programmers
working for many companies, each fixing the same bugs one by one.  Since fixed
bugs are a proprietary advantage, a bug persists until one of the 'master'
programmers (who work on the porting bases) happens upon it.

(Sorry for the comp.arch digressions.  To put my two cent's worth in about
bit addressing, I don't think there is a language out there today which
would care, as long as bit *fields* were reasonably cheap.  Sure, PL/I
had bit strings, but I don't think they would be significantly sped up
by bit addressing.)
-- 
Craig Jackson
dricejb@drilex.dri.mgh.com
{bbn,axiom,redsox,atexnet,ka3ovk}!drilex!{dricej,dricejb}

dhinds@portia.Stanford.EDU (David Hinds) (09/02/90)

In article <1990Sep1.062535.7541@rice.edu> preston@titan.rice.edu (Preston Briggs) writes:
>Bits are sort of useful as flags and such.
>However, I usually want to manage my bit-vectors in large chunks
>(getting that 32-way parallelism when ANDing and ORing integers).
>But what will we do with pairs and nybbles?
>And will we have 1, 2, 4, 8, 16, 32, and 64 bit registers?
>We could perhaps manage them like the common idea of using register
>pairs for holding double-precision floats.

   Isn't it obvious how you would manage registers?  The register store
would also be bit-addressable.  So, a 64x64-bit block of registers would
be just like a 4096-bit block of memory, and an instruction specifying
a register would just need to give a 12-bit short address and a size field.
You might require that the alignment of any pseudo-register be at least
its own size, but you wouldn't have to.
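
For illustration, a C sketch of what reading such a pseudo-register
involves if no alignment is required (the names and layout here are
invented, not a real machine): without an alignment rule the field may
straddle two 64-bit words and needs two reads and a merge.

#include <stdint.h>

/* Read a 'size'-bit pseudo-register starting at bit address 'addr'
   (0..4095) out of a 64 x 64-bit register file viewed as one 4096-bit
   store.  Caller must keep addr + size within the file.               */
static uint64_t read_pseudo_reg(const uint64_t regfile[64],
                                unsigned addr, unsigned size)
{
    unsigned word = addr / 64, off = addr % 64;
    uint64_t lo = regfile[word] >> off;
    if (off + size > 64)                 /* field crosses a word boundary */
        lo |= regfile[word + 1] << (64 - off);
    return size < 64 ? lo & ((UINT64_C(1) << size) - 1) : lo;
}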

 -David Hinds
  dhinds@popserver.stanford.edu

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (09/02/90)

In article <1990Aug31.174957.9612@cimage.com>, paulh@cimage.com (Paul Haas/1000000) writes:
> Adding bit pointers to C compilers would be easy on machines which
> already have multiple incompatible pointer types.  Bit-pointers would
> be just one more type.  On nicer machines, you have to teach the
> compiler about multiple pointer types

Not at all.  Nobody was proposing a new kind of pointer.  The suggestion
was to have *one* kind of address *only*, namely a bit address.  Only the
size of addressable units was to change.

> I used to use PR1ME 50 series processors from about 1979 to about
> 1985.  They had many pointer formats, including a bit pointer.  It was
> not supported by any of their compilers except as a character pointer.

Forget the *compilers*.  The point was that bit pointers weren't
supported by the *hardware*.  The 3-word pointer format had a 4-bit
"bit number within word" field, but there were no instructions to
fetch or store bits, and there were no instructions to do arithmetic
on these pointers in bit units (you _could_ fetch and store characters
and adjust by characters).

Then there were fun things like this:  you could only fetch and store
characters through these pointers when they were in certain special
registers, and one of those registers overlapped the double-precision
floating-point register.  (I wondered for a long time how they managed
to fit 67 bits of pointer into a 64-bit register until it dawned on me
that they dropped the 3 bit-within-byte bits.)

The main point of character pointers in the 50 series was to support
COBOL editing.

Didn't the Burroughs 1700/1800 machines support bit-level addressing?

-- 
You can lie with statistics ... but not to a statistician.

cik@l.cc.purdue.edu (Herman Rubin) (09/02/90)

In article <1990Sep2.015030.4135@portia.Stanford.EDU>, dhinds@portia.Stanford.EDU (David Hinds) writes:
> In article <1990Sep1.062535.7541@rice.edu> preston@titan.rice.edu (Preston Briggs) writes:
| >Bits are sort of useful as flags and such.
| >However, I usually want to manage my bit-vectors in large chunks
| >(getting that 32-way parallelism when ANDing and ORing integers).
| >But what will we do with pairs and nybbles?
| >And will we have 1, 2, 4, 8, 16, 32, and 64 bit registers?
| >We could perhaps manage them like the common idea of using register
| >pairs for holding double-precision floats.
> 
>    Isn't it obvious how you would manage registers?  The register store
> would also be bit-addressable.  So, a 64x64-bit block of registers would
> be just like a 4096-bit block of memory, and an instruction specifying
> a register would just need to give a 12-bit short address and a size field.
> You might require that the alignment of any pseudo-register be at least
> its own size, but you wouldn't have to.

Registers should be addressable both as registers and as memory.  When
addressing the registers as memory, the above operations would be easily
available.  Also, registers could be indexed, etc. (the Univac 1108/1110
and some others had this), which I am sure others besides me would have
liked to have available.  Why not be able to use a short vector in
registers for looping purposes?
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)

preston@titan.rice.edu (Preston Briggs) (09/03/90)

>In article <1990Sep1.062535.7541@rice.edu> I wrote:
>>Bits are sort of useful as flags and such.
>>However, I usually want to manage my bit-vectors in large chunks
>>(getting that 32-way parallelism when ANDing and ORing integers).
>>But what will we do with pairs and nybbles?
>>And will we have 1, 2, 4, 8, 16, 32, and 64 bit registers?
>>We could perhaps manage them like the common idea of using register
>>pairs for holding double-precision floats.

In article <1990Sep2.015030.4135@portia.Stanford.EDU> dhinds@portia.Stanford.EDU (David Hinds) writes:
>   Isn't it obvious how you would manage registers?  The register store
>would also be bit-addressable.  So, a 64x64-bit block of registers would
>be just like a 4096-bit block of memory, and an instruction specifying
>a register would just need to give a 12-bit short address and a size field.
>You might require that the alignment of any pseudo-register be at least
>its own size, but you wouldn't have to.

You've over-trimmed my post.  I suggested a bit addressable register set, 
but wanted alignment.  Without alignment, I expect the hardware becomes 
far more complex (and I'm certain the register allocator does).

Hinds also suggests a size field to accompany each register number.
Why?  The instruction usually implies the operand size.
For example, we don't say 

	FADD r0.double, r2.double, r4.double

We say instead

	FADD.double r0, r2, r4

effectively saving 2 size specifications.

And I still think the whole thing is a mess.  Too much hardware
and software expended for very little return.  Design should proceed top-down.
You decide what problem you need to solve (or class of problems)
and you build a solution.  If you over-generalize, your solution
will be slower and/or more expensive than necessary.

If all you need is to shift left 1 bit, you don't build a multiplier.
If all you need is to find the smallest element in an array,
you don't bother building a quicksort.  If you need to run
C fast, you don't support bit, pair, or nibble addressing.  If you
intend to run Fortran, you don't bother with 8 or 16 bit addressing
either.

Of course, this kind of thinking is the sort Rubin rails against.
Languages are (often) designed to run on available machines, and then
later machines are designed to support the languages *and nothing else*.
This leads to certain ideas or paradigms being unsupported.

I guess I'd like to see high-level expressions of the problems that aren't
well supported.  If an adequate form of expression can be invented (this
is probably hard -- good notation is very hard), then we can see about
supporting it on different machines.

For example, a "typical Rubin problem" (based on limit observations)
is usually expressed in terms of random bit strings.
But what's really going on?  Can we get a little more abstract,
away from the "bit string" part?  How about "a stream of random
boolean values" or something?  As far as implementation goes, why use
a bit string?  Why not a linked list, or some combination of
linked list and array?  There are a lot of choices of structure that
significantly influence the algorithmic complexity of solutions.
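
For instance, here is one rough C sketch of such a stream; buffering a
word of bits underneath is just one possible implementation, and rand()
merely stands in for a real generator:

#include <stdlib.h>

struct bitstream {
    unsigned long buf;   /* buffered random bits        */
    int           left;  /* how many bits remain in buf */
};

static int next_random_bool(struct bitstream *s)
{
    if (s->left == 0) {
        s->buf  = (unsigned long)rand(); /* refill: one call per word      */
        s->left = 15;                    /* rand() guarantees only 15 bits */
    }
    int bit = (int)(s->buf & 1);
    s->buf >>= 1;
    s->left--;
    return bit;
}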

-- 
Preston Briggs				looking for the great leap forward
preston@titan.rice.edu

paulh@cimage.com (Paul Haas/1000000) (09/03/90)

In article <3656@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>In article <1990Aug31.174957.9612@cimage.com>, paulh@cimage.com (Paul Haas/1000000) writes:
>> Adding bit pointers to C compilers would be easy on machines which
>> already have multiple incompatible pointer types.  Bit-pointers would
>> be just one more type.  On nicer machines, you have to teach the
>> compiler about multiple pointer types
>
>Not at all.  Nobody was proposing a new kind of pointer.  The suggestion
>was to have *one* kind of address *only*, namely a bit address.  Only the
>>size of addressable units was to change.

Bit address pointers seem like a new kind of pointer to me (-:.
In C-speak the new type of pointer would be "pointer to bitfield",
"pointer to bit", or "pointer to short short char" (look ma, no new
keywords).

If bit-addressing is used, it can happen in several orders:
   1) Hardware and programmer interface at the same time.
         A computer vendor (MIPS, Sun, HP, IBM, DEC, etc...) decides that
         bit addressing is the wave of the future and designs it into their
         next generation architecture at considerable cost.  They enhance
         C, FORTRAN, Cobol and/or Ada, so the programmers can actually
         take advantage of the bit addressing.
   2) Hardware first.
         The computer vendor adds bit-addressing to the hardware, but
         doesn't provide an interface in their higher level languages.
         (Prime Computer used a variant of this in the seventies, where
         they didn't ever finish the hardware support, as Richard pointed out).
         The cost is still considerable, nothing is gained.
   3) Programmer interface first.
         Someone (compiler maintainer for a computer vendor, independent
         compiler writer, anyone with the source for GCC, etc...) decides
         that bit addressing is the wave of the future and adds a way
         to use bit addressing to an existing compiler on existing hardware.
         This does imply multiple pointer types.  The cost would be measured
         in person weeks.  (Could it be done in a header file in C++ ?)

Does anyone seriously believe that a sane computer vendor will create
a new flavor of hardware in the nineties, with no evidence of any
customers?
---
I'm not a C++ expert, but it seems to me that it would be possible to
define bit pointers as classes, i.e. to add a bit type and a
pointer-to-bit type in a header file.

peter@ficc.ferranti.com (Peter da Silva) (09/03/90)

In article <1990Aug31.174957.9612@cimage.com> paulh@dgsi.UUCP (Paul Haas/1000000) writes:
> If bit addressing is "A Good Thing" aka. someone has a need for it,
> they will add it to their compiler or cause their compiler vendor to
> add it. 

Intel's cross-compiler for the 8051, which supports bit addresses (albeit
in a special on-chip RAM with special instructions to access it), supports
a "bit" data type and bit-granular arrays.

So the embedded systems market already has some use for this.
-- 
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
peter@ferranti.com

usenet@nlm.nih.gov (usenet news poster) (09/03/90)

preston@titan.rice.edu (Preston Briggs) writes:
>Bits are sort of useful as flags and such.
>However, I usually want to manage my bit-vectors in large chunks
>(getting that 32-way parallelism when ANDing and ORing integers).

Or doing long string comparisons by building a finite state machine
where state transitions are taken several letters at a time.

>But what will we do with pairs and nybbles?
>And will we have 1, 2, 4, 8, 16, 32, and 64 bit registers?

Why not just support subdivided instructions (ADD_BY_BYTES ...)?
All of the logical operations can be viewed as arbitrarily divided
into subfields.  Adding a set of condition registers instead of
a single condition code would be minimal overhead.  The result
would be a machine that could parallelize small integer operations.
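
To make the idea concrete, here is a rough C sketch of ADD_BY_BYTES done
in software on an ordinary 64-bit word (constants and names are mine);
real hardware support would simply cut the carry chain at byte
boundaries instead of masking:

#include <stdint.h>

/* Add eight bytes in parallel inside one 64-bit word, keeping carries
   from spilling between byte lanes.                                    */
static uint64_t add_by_bytes(uint64_t a, uint64_t b)
{
    const uint64_t LOW7 = UINT64_C(0x7f7f7f7f7f7f7f7f);
    uint64_t sum_low = (a & LOW7) + (b & LOW7);  /* add low 7 bits per byte   */
    return sum_low ^ ((a ^ b) & ~LOW7);          /* fold in top bit, no carry */
}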

There are limits to how fast you can push clock speed.  If you want
to process character strings and small integer operations faster,
parallelization seems like the way to go.

>[...]
>Alternatively, we could take a Cray-style approach.  64 x 64 bit registers.
>That's it, no subdivisions.  Memory is addressable in 64-bit chunks.
>Saves addressing bits, shifts, unaligned access traps, register 
>specification bits, massive hardware, and massive software.  Of course, 
>it costs memory (64 bits per char), but that's cheaper every day.  

If you parallelize, it doesn't cost memory and you could potentially
win big on performance. 

>Preston Briggs				looking for the great leap forward

David States

cik@l.cc.purdue.edu (Herman Rubin) (09/03/90)

In article <1990Sep2.201943.3670@rice.edu>, preston@titan.rice.edu (Preston Briggs) writes:

			........................

> If all you need is to find the smallest element in an array,
> you don't bother building a quicksort.  If you need to run
> C fast, you don't support bit, pair, or nibble addressing.  If you
> intend to run Fortran, you don't bother with 8 or 16 bit addressing
> either.

But what if your problem is best done by a combination of these?

> Of course, this kind of thinking is the sort Rubin rails against.
> Languages are (often) designed to run on available machine and then
> later machines are designed to supprot the languages *and nothing else*.
> This leads to certain ideas or paradigms being unsupported.

The problem is clearly expressed.

> I guess I'd like to see high-level expressions of the problems that aren't
> well supported.  If an adequate form of expression can be invented (this
> is probably hard -- good notation is very hard), then we can see about
> supporting it on different machines.

If one wants to get a notation which is in some sense optimal, this is very
hard indeed.  But if the language were fully extensible, including operator
symbols, etc., this would not be difficult.  This happens all the time in
mathematics and other sciences.  The one creating the expression form suggests
a syntax, and usage decides.  It is not that difficult to have different
expressions used by different authors for many years.

> For example, a "typical Rubin problem" (based on limit observations)
> is usually expressed in terms of random bit strings.
> But what's really going on?  Can we get a little more abstract,
> away from the "bit string" part.  How about "a stream of random
> boolean values" or something.  As far as implementation, why use 
> a bit string?  Why not a linked list or some combination of
> linked list and array.  There are a lot of choices of structure that
> significantly influence the algorithmic complexity of solutions.

Random boolean values would be just as good as random bit strings.  On
all vector processors about which I know, bit strings are used as such,
although in many they have to be aligned in registers.  It is not at all
difficult to come up with situations where the bit strings will not be
aligned in memory.

These are not the only types of hardware inadequacy I have pointed out.
But the procedures which use these operations arise quite naturally in
methods of generating random variables, processing (other than copying)
only a few bits.

Not a single one of the hardware suggestions I have made was made for the
purpose of coming up with esoteric hardware.  Every one arose from a 
"natural" problem.  I suggest that others do the same. and that an attempt
be made to incorporate them, rather than dismiss them as "why do we need
such things."
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)

njk@diku.dk (Niels J|rgen Kruse) (09/03/90)

preston@titan.rice.edu (Preston Briggs) writes:

>Alternatively, we could take a Cray-style approach.  64 x 64 bit registers.
>That's it, no subdivisions.  Memory is addressable in 64-bit chunks.
>Saves addressing bits, shifts, unaligned access traps, register
>specification bits, massive hardware, and massive software.  Of course,
>it costs memory (64 bits per char), but that's cheaper every day.

Ehrrrm.   Memory *bytes* are certainly getting cheaper, but are
memories?

If you are considering wasting X bytes of memory, the price of
a memory byte is what matters.  If you are considering wasting
X % of memory, the price of all of memory is what should
concern you.
-- 
Niels J|rgen Kruse 	DIKU Graduate 	njk@diku.dk

rcpieter@svin02.info.win.tue.nl (Tiggr) (09/03/90)

paulh@cimage.com (Paul Haas/1000000) writes:

>In article <3656@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>>
>>Not at all.  Nobody was proposing a new kind of pointer.  The suggestion
>>was to have *one* kind of address *only*, namely a bit address.  Only the
>>size of addressable units was to change.

>Bit address pointers seem like a new kind of pointer to me (-:.
>In C-speak the new type of pointer would be "pointer to bitfield",
>"pointer to bit", or "pointer to short short char" (look ma, no new
>keywords).

Richard was right.  There will not be any new type of pointer, since
all pointers will point to a bit.  It is a mere coincidence that a
pointer to an int will not only affect the bit it actually points
to, but also the following 31 (or 63) bits.  But this is already the
case on current machines (substitute byte for bit, 3 for 31 and 7 for
63 in the previous sentence).

Talking of C, no new key(buzz)words will be needed.  Imagine declaring
a 3-bit variable i that you will access as an integer subrange, and the
corresponding pointer to this bitfield:

int i:3, (*p):3;
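
Today the same effect has to be faked in software, since you cannot take
the address of a bit-field in standard C; a rough sketch, with names and
layout invented purely for illustration:

#include <stdint.h>

struct bitptr {
    uint32_t *word;   /* which word the field lives in     */
    unsigned  off;    /* bit offset of the field within it */
    unsigned  width;  /* field width, e.g. 3               */
};

static unsigned bitptr_get(struct bitptr p)
{
    return (*p.word >> p.off) & ((1u << p.width) - 1u);
}

static void bitptr_set(struct bitptr p, unsigned val)
{
    uint32_t mask = ((1u << p.width) - 1u) << p.off;
    *p.word = (*p.word & ~mask) | ((val << p.off) & mask);
}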

>Does anyone seriously believe that a sane computer vendor will create
>a new flavor of hardware in the nineties, with no evidence of any
>customers?

Who claims there is no evidence of customers?  And if there are
customers, somebody will jump into the market with something to suit
their needs.  I wouldn't mind having a nice machine with 64-bit
registers, data bus and address bus, where the address bus addresses
bits, and not bytes, giving 2^64 BITS of memory.  What I don't
really like is thinking about alignment restrictions...

Tiggr

preston@titan.rice.edu (Preston Briggs) (09/04/90)

In article <1990Sep3.045353.19321@nlm.nih.gov> states@artemis.NLM.NIH.GOV (David States) writes:
>preston@titan.rice.edu (Preston Briggs) writes:
>>Bits are sort of useful as flags and such.
>>However, I usually want to manage my bit-vectors in large chunks
>>(getting that 32-way parallelism when ANDing and ORing integers).

>Why not just support subdivided instructions (ADD_BY_BYTES ...)?
>All of the logical operations can be viewed as arbitrarily divided
>into subfields.

>There are limits to how fast you can push clock speed.  If you want
>to process character strings and small integer operations faster,
>parallelization seems like the way to go.

>If you parallelize, it doesn't cost memory and you could potentially
>win big on performance. 

Well, it's been done.
A Connection Machine does all these things and more.

-- 
Preston Briggs				looking for the great leap forward
preston@titan.rice.edu

hp@vmars.tuwien.ac.at (Peter Holzer) (09/04/90)

Chuck.Phillips@FtCollins.NCR.COM (Chuck.Phillips) writes:

>Yes, it would break a _few_ programs.  (e.g. incrementing a pointer cast to

Agreed. It should break no program that is portable now.

>an int or otherwise directly twiddling pointer bits) Adding a new type C
>type called "bit" (or boolean or logical etc.) should be orthagonal to

What should sizeof (bit) be? 0.125?
Sizeof (char) is defined to be 1, and I don't think this will change in the
near future. Of course size_t will stay an integral type, too ...

Of course, gcc could get another switch -fbitsize, so that sizeof () gives the 
size of an object in bits, and if everybody starts to use it, it could make it
into C2001 ...

>existing correctly written C code, except for the added keyword.  (void *)s
>and (char *)s _can maintain the old semantics_ and old correct code need

Yes, if all pointers have the same representation as bit pointers.

>only be recompiled after a character substition, if necessary.  I think
>this would fall under your category "b".

You would have to insert lots of sizeof (char) in mallocs (or use a compiler switch,
see above).

--
|    _	| Peter J. Holzer			| Think of it	|
| |_|_)	| Technische Universitaet Wien		| as evolution	|
| | |	| hp@vmars.tuwien.ac.at			| in action!	|
| __/  	| ...!uunet!mcsun!tuvie!vmars!hp	|     Tony Rand	|

jeremy@sinope.socs.uts.edu.au (Jeremy Fitzhardinge) (09/05/90)

rcpieter@svin02.info.win.tue.nl (Tiggr) writes:

>>Does anyone seriously believe that a sane computer vendor will create
>>a new flavor of hardware in the nineties, with no evidence of any
>>customers?
>
>Who claims there is no evidence of customers?  And if there are
>customers, somebody will jump in the market with something to suit
>there needs.  I wouldn't mind to have a nice machine with 64bit
>registers, databus and addressbus, where the addressbus adresses
>bits, and not bytes, giving 2^64 BITS of memory.  What I don't
>really like is thinking about alignment restrictions...

There are already processors in use that have full bit addressing - the
TI340x0 GSP (graphics system processor) series. All pointers are bit
aligned, and all memory operations are 1 to 32 bits wide. The exception
to this is that instructions must be word (16 bit) aligned, and performance
drops greatly if the stack isn't aligned. Obviously, since it is 32 bit,
it doesn't have the large, general-purpose capability of a 64-bit processor,
but as a co-processor in a system it is really useful. Compression algorithms
such as LZW come out very cleanly, and of course graphics work (supported
in microcode) is very efficient. No doubt there are many algorithms that can
benefit from reading arbitrary word widths at arbitrary bit offsets.
The assembly language is similar to something like a 680x0, and is thus
quite usable by a compiler, with lots of general-purpose registers
(15 that a compiler can safely use, for both addresses and data).

The C compiler I use on it is a subset of C that doesn't take full advantage
of the hardware - it doesn't even support bit fields, let alone an extension
to specify the size of variables. Naturally, pointers to objects are to their
bit addresses, and sizeof(char)==1 even though a char is 8 memory locations wide.

Not surprisingly, the bus interface is a mass of barrel shifters, with a
very complex set of timing calculations. This is because the bus interface
is semi-autonomous with respect to the ALU/processor core, and because of
the range of operations it can perform. The memory interface is only 16
bits wide, so other-sized operations on memory result in read-modify-write
cycles.
operations on memory bit fields as support for the graphics instruction
modes.
--
---
Jeremy Fitzhardinge: jeremy@ultima.socs.uts.edu.au  | My hovercraft is full of
No Comment.          jeremy@utscsd.csd.uts.edu.au   | eels.

rogers@iris.ucdavis.edu (Brewski Rogers) (09/05/90)

Enough of this! It's obvious the computer of the future will have a word
the same size as the total memory of the computer. The registers will
all be bit addressable. Just think - you could read ALL the memory of
the computer in *1* cycle!

-bruce

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (09/05/90)

rogers@iris.ucdavis.edu (Brewski Rogers) writes:
| Enough of this! It's obvious the computer of the future will have a word
| the same size as the total memory of the computer. The registers will
| all be bit addressable. Just think - you could read ALL the memory of
| the computer in *1* cycle!

  Only if you have the BIG CACHE option ;-)


  "What's the bandwidth of the bus on that?"

  "About 120 cm"

  "What's that in American?"

  "2.98 mili-furlongs"
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
    VMS is a text-only adventure game. If you win you can use unix.

bmiller@rt2.cs.wisc.edu (Brian Miller) (09/07/90)

    Just off hand, would any of you familiar with past
Burroughs machines care to provide some info on how they
dealt with bit-oriented data?

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (09/07/90)

In article <5063@daffy.cs.wisc.edu>, bmiller@rt2.cs.wisc.edu (Brian Miller) writes:
>     Just off hand, would any of you familiar with past
> Burroughs machines care to provide some info on how they
> dealt with bit-oriented data?

B6700:
	48 bit data + 3 bit tag + parity&c

Pointers said "element I of such-and-such an array" (somewhat indirectly,
but you could figure out which pointers pointed into what array) and gave
the element size:  4 bit unit (think packed decimal), 6 bit unit (think
BCD/BCL 6-bit chars, later dropped), 8 bit unit (think EBCDIC char),
48 bit unit (think "int"), 96 bit unit (think "double").

No bit pointers.

There were bit *tables*, so-called TRUTHTABLEs, used to represent
sets of characters (IBM fans, think "TRT").  They were 32 bits per
word, if I remember correctly.

Bit *fields* were supported directly:
	field := word.[offset_from_lsb:size_in_bits];
	... word & field[source_offset:destination_offset:size_in_bits] ...
-- I probably have the argument order wrong here --
This was on the B5500 as well, but for some reason that machine numbered
the bits the other way.  The bit field instructions weren't all that
different from those the M68020 and M88k provide.

So to set bit I of A[J] (where 0 <= I < 48), you'd do
	A[J].[I:1] := 1;

(The DEC-10, now, that _did_ have bit pointers...)

-- 
You can lie with statistics ... but not to a statistician.

dricejb@drilex.UUCP (Craig Jackson drilex1) (09/09/90)

In article <5063@daffy.cs.wisc.edu> bmiller@rt2.cs.wisc.edu (Brian Miller) writes:
>    Just off hand, would any of you familiar with past
>Burroughs machines care to provide some info on how they
>dealt with bit-oriented data?

Assuming that you're referring to the old Burroughs Large Systems (B6700, et
al), the basic bit-diddling operations were bit-field-extract and bit-field-
insert.  These only operated at a word level.  That is, you couldn't deal
with more than 48 bits at a time.

Examples:
A.[47:8]   Refers to the high-order byte of A.
A.[47:11]  Refers to the high-order 11 bits of A
A.[7:8]    Refers to the low-order byte of A.
A.[7:11]   Refers to the 8 low-order bits of A, plus the 3 high-order bits.

All of these cause extraction as an Rvalue, or insertion as an Lvalue.

A.[7:48]   Refers to the contents of A, end-around shifted 8 bits to the right,
	   (or 40 bits to the left)

This is most meaningful as an Rvalue.

There also was a combined operation called 'concat', which extracted
a field from one word and inserted it into a different place in a second
word.
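
For anyone who wants to play with the semantics, here is a rough C
sketch (mine, not Burroughs code) of the Rvalue case, using a uint64_t
to hold a 48-bit word; it reproduces the wrap-around behaviour of the
A.[7:11] and A.[7:48] examples above:

#include <stdint.h>

/* field(a, p, n) plays the role of A.[p:n]: n bits taken downward from
   bit p, wrapping from bit 0 back up to bit 47.                        */
static uint64_t field(uint64_t a, unsigned p, unsigned n)
{
    const uint64_t mask48 = (UINT64_C(1) << 48) - 1;
    unsigned rot = 47 - p;                 /* rotate left so bit p lands at bit 47 */
    a &= mask48;                           /* keep only the 48-bit word            */
    uint64_t rotated = ((a << rot) | (a >> (48 - rot))) & mask48;
    return rotated >> (48 - n);            /* take the top n bits of the rotation  */
}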

Note that although I have used the past tense, all of this architecture
is still sold as the Unisys A-Series.
-- 
Craig Jackson
dricejb@drilex.dri.mgh.com
{bbn,axiom,redsox,atexnet,ka3ovk}!drilex!{dricej,dricejb}

throopw@sheol.UUCP (Wayne Throop) (09/10/90)

> From: hp@vmars.tuwien.ac.at (Peter Holzer)
> What should sizeof (bit) be? 0.125? [...]
> You would have to insert lots of sizeof (char) in mallocs [...]

It would probably be a better idea to preserve backwards compatibility,
and have "sizeof" remain ambiguous as to the bit size, introduce a new
operator (say) "bitsize" and a new function "bitalloc".

Further, extend the ":" bitfield syntax to apply to arrays'n'such.

This would all be backwards compatible, and would allow C to do the
sorts of things the Pascal can now do with packed subrange types.
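
As a rough sketch of what those could reduce to on today's byte-addressed
machines (the definitions below are illustrative only, not an existing
extension):

#include <limits.h>
#include <stdlib.h>

#define BITSIZE(x)  (sizeof(x) * CHAR_BIT)   /* size of an object in bits */

static void *bitalloc(size_t nbits)          /* allocate room for nbits bits */
{
    return calloc((nbits + CHAR_BIT - 1) / CHAR_BIT, 1);
}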

Just a thought.
--
Wayne Throop <backbone>!mcnc!rti!sheol!throopw or sheol!throopw@rti.rti.org

bbeckwit@next.com (Bob Beckwith) (09/10/90)

In article <3656@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>In article <1990Aug31.174957.9612@cimage.com>, paulh@cimage.com (Paul Haas/1000000) writes:
>> I used to use PR1ME 50 series processors from about 1979 to about
>> 1985.  They had many pointer formats, including a bit pointer.  It was
>> not supported by any of their compilers except as a character pointer.
>
>Forget the *compilers*.  The point was that bit pointers weren't
>supported by the *hardware*.  The 3-word pointer format had a 4-bit
>"bit number within word" field, but there were no instructions to
>fetch or store bits, and there were no instructions to do arithmetic
>on these pointers in bit units (you _could_ fetch and store characters
>and adjust by characters).
>
>Then there were fun things like this:  you could only fetch and store
>characters through these pointers when they were in certain special
>registers, and one of those registers overlapped the double-precision
>floating-point register.  (I wondered for a long time how they managed
>to fit 67 bits of pointer into a 64-bit register until it dawned on me
>that they dropped the 3 bit-within-byte bits.)
>
>The main point of character pointers in the 50 series was to support
>COBOL editing.
---------------------

  Well... you're almost correct. ALFA (Add L to Field Address Register in Vmode)
& ARFA (it's Imode equivalent) will adjust the Field Address Registers by bits.
So, there are indeed instructions that perform arithmetic on the pointers in
bit units. In addition, PCL argument templates (APs) will address to the bit 
level. Also, the field address registers are 36 bits in length. The field length 
registers are 21 bits in length. 36+21 = 57 (not 67). That means there were
7 bits left over (instead of 3 too many). Cobol uses bit pointers to support
decimal arithmetic which requires addressing to the nibble for "packed" data
(i.e. BCD). Arithmetic can be performed on up to 64 (ASCII or BCD) digits. The
PL/1 (style) compilers use the field address registers to support operations
on the "bit" data type. Finally, when the Prime 400 was being designed (back 
around 1975), there was a set of bit operations defined. These were never
implemented due to lack of space in the control store and the fact that these
operations could be performed in software at roughly the same speed.

  As an aside, it's somewhat funny that you mentioned "Hardware First", since 
most of these operations (instructions) were defined by the compiler and OS 
folks. In fact, one of Prime's marketing slogans used to be "Software First".

---------------------
Bob_Beckwith@NeXT.COM
900 Chesapeake Drive
Redwood City, CA 94063

bdg@tetons.UUCP (Blaine Gaither) (09/11/90)

>    Just off hand, would any of you familiar with past
>Burroughs machines care to provide some info on how they
>dealt with bit-oriented data?

In the 1986 time frame (when Burroughs found a pill, swallowed it, and
became UNISYS), Burroughs had 5 main types of systems in the field.

1:  B80/90 Based systems - 
    These were essentially a home-grown 8-bit minicomputer
    to compete with 80** systems.  I am not very familiar with these, but
    I don't think they were of much academic interest.

2.  Convergent tech -  8086/286 and moto 680xx based systems
    
3.  B1700/1800/1900 systems - 24 bit machines as I recall.
    These had bit-addressed memories.  They were designed originally
    as multisystem emulation platforms, so it was thought
    that a bit-addressed memory was a plus for flexibility.  At the time of
    the development of the 1700, the other Burroughs machines were
    digit (4 bit decimal) addressed or 48/52 bit word-addressed machines.
    I don't recall that bit addresses were a big win in that environment,
    at that time.  Since most languages wanted 8 bit addressability, wasting
    3 of 24 address bits was a problem.
    Cobol, Fortran, and Pascal provide little opportunity to
    exploit bit addressing.  The biggest win was probably the ability to address
    4 bit decimal quantities in COBOL.

4.  The B2500-4900 -> PSeries?  This was a decimal-addressed, memory-to-memory
    Cobol machine with a very elegant, simple instruction set.
    The decimal addressing was neither a big win nor a big loss.

The above two machines really became outmoded not because they were
CISC but because of the advent of cache memories, and good optimizing
compilers on medium scale computers.  When there was a huge difference
between the clock rate at which you could run a CPU and the speed of
memory, the impact of multicycle instructions was minimal.

5) B5000-B7900 -> A1-A19?  These are essentially the old stack
machines.  They use a 48 bit word with 4 additional tag bits that
determine the type of the operand.  The machines have a word-addressed
memory with 8 six-bit characters per word or 6 eight-bit characters.
Handling words with a non-power-of-two number of characters is a disaster
which can only be overcome with lots of hardware.  As far as tagged
memories are concerned, I love the idea, but there should still
be separate operators for integer, rational and double.  I would
rather trap if the type of the data I was fetching was guessed wrong
than not know what FUs to reserve, or how long the operand was.  The
addressing of these machines is in 6MB segments.  I think they allow
each program to have at least 1M segments.  Because all indexed
memory operations took several ops to form a SIRW (stuffed indirect
reference word) descriptor on the top of the stack before loading, addressing
arrays was painfully slow.

The architectural lessons to learn from Burroughs are:

You lose if:
 
1) Religion takes over
2) A project is "secret"
3) You depend on the advance of technology to improve your system design
   as opposed to evolution of instruction sets and architecture.
4) A general purpose machine is optimized for one language at the expense 
   of all others.

My Opinions not ACs

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (09/11/90)

In article <8599@tetons.UUCP>, bdg@tetons.UUCP (Blaine Gaither) writes:
> 5) B5000-B7900 -> A1-A19?  Because all indexed
> memory operations took several ops to form a SIRW (stuffed indirect
> reference word) descriptor on the top of the stack before loading, addressing 
> arrays was painfully slow.

At least on the B6700, array indexing not only didn't build SIRWs, it
*couldn't*.  The instruction sequence for
	X := A[I];
was
	VALC I; VALC A; NAMC X; STOD
	(value call I; value call A; name call X; store destructive)
The instruction sequence for
	A[I] := X;
was
	VALC I; NAMC A; INDX; VALC X; STOD;
	(INDX takes a number and a pointer to a descriptor
	and produces an indexed descriptor, *not* an SIRW)


-- 
Heuer's Law:  Any feature is a bug unless it can be turned off.

smryan@garth.UUCP (Steven Ryan) (09/20/90)

For an example of an existing 48-bit virtual bit address machine,
see the Star-100/Cyber203/Cyber205.

There is a C (from Purdue?) on the 205 called VectorC. See that for
an example of C on a bit address machine.
-- 
...!uunet!ingr!apd!smryan                                       Steven Ryan
...!{apple|pyramid}!garth!smryan              2400 Geng Road, Palo Alto, CA