wilson@uicbert.eecs.uic.edu (Paul Wilson) (02/14/91)
I don't know if 64-bit addressing is a good idea in general, and I suspect
it's a *bad* idea for languages like Lisp and Smalltalk, where the standard
implementation technique is to make everything the size of a pointer, so
that you can store either a pointer or a tagged "immediate" value anywhere.
Going to 64-bit addresses would *double* your memory requirements. There
are much better ways to spend transistors.

It's possible to implement 64-bit (or 100-bit, or whatever) addresses
efficiently on standard 32-bit hardware by performing address translation
_at_page_fault_time_ -- you "swizzle" (translate) big addresses into small
ones so that running programs only ever see "normal" hardware-supported
addresses.

This scheme *can* support more RAM than the addresses specify, if you use a
kind of page-switched architecture. (A form of bank switching, where every
RAM page knows whether it is mapped into the address space, and where.)
Programs (and programmers) don't have to see this page-switching. Pointers
are swizzled and pages are switched at page-fault time. (This scheme is a
recent development, and wasn't in the tech report on swizzling that I sent
out to a lot of people who requested it.) [By the way, everybody who asked
more than a week ago -- 75 people! -- should have gotten it by now.]

With swizzling, you run out of address space every now and then (after
touching around a gigabyte or so of pages), and you have to use faulting to
re-map memory. There's some overhead at page fault time, but it's not like
a real disk seek. It's probably not a lot, especially if there's any
reasonable locality, since then you don't really do it that often compared
to the huge number of instructions you're executing in between.

This can be seen as the RISC philosophy applied to address translation.
Usually there's enough locality that a 32-bit address can specify plenty of
locations to keep the processor happy.
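The translation table Wilson describes can be sketched in a few lines. This
is a toy model under assumed details (the page size, the table layout, and
the `Swizzler` class itself are invented for illustration; real
implementations translate every pointer on a page when it is first faulted
in, not one reference at a time):

```python
# Toy sketch of pointer swizzling: wide "persistent" addresses are mapped
# into a narrow 32-bit space the first time their page is touched (the
# simulated page fault).  All names and sizes here are illustrative.
PAGE = 4096

class Swizzler:
    def __init__(self):
        self.wide_to_narrow = {}   # persistent page -> local page
        self.narrow_to_wide = {}   # local page -> persistent page
        self.next_local_page = 0   # next unused slot in the 32-bit space

    def swizzle(self, wide_addr):
        """Map a wide address into the 32-bit space, 'faulting' if needed."""
        page, offset = divmod(wide_addr, PAGE)
        if page not in self.wide_to_narrow:        # simulated page fault
            local = self.next_local_page
            if local * PAGE >= 2**32:
                raise MemoryError("32-bit space exhausted; must re-map")
            self.next_local_page += 1
            self.wide_to_narrow[page] = local
            self.narrow_to_wide[local] = page
        return self.wide_to_narrow[page] * PAGE + offset

    def unswizzle(self, narrow_addr):
        """Recover the wide address (e.g. when a page is written out)."""
        local, offset = divmod(narrow_addr, PAGE)
        return self.narrow_to_wide[local] * PAGE + offset

s = Swizzler()
big = 0xDEADBEEFCAFE                   # a 48-bit address, say
small = s.swizzle(big)
assert small < 2**32                   # running code only sees narrow ones
assert s.unswizzle(small) == big       # the mapping is reversible
```

The `MemoryError` branch corresponds to Wilson's "every now and then" case:
when the narrow space fills up, the mapping must be rebuilt in software.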
Every now and then, you have to adjust the address space seen by the 32-bit
processor (so that it can actually see more memory than that). You *don't*
do it by just making the processor bigger and fatter, you do it by handling
the unusual case in software.

For Lisp & Smalltalk, even if you have a 64-bit bus, I suspect you'd be
*much* better off fetching two 32-bit words (e.g., whole cons cells) at a
time. As for type tags, using the two low-order bits as a tag works fine if
you put the rest of the type information in object headers; that's how most
people do it now. Big-bag-of-pages schemes are not so popular, partly
because they make a generational garbage collector a lot more complicated.

Even if you can afford 64-bit chips, you may not want them. A 32-bit chip
can probably be made faster, if only by using faster materials with a lower
level of integration. The extra overhead for the occasional swizzling and
page-switching is probably well worth it.

I'm not really clear on what the "optimal" hardware address size is, but
I'd bet that for Lisp and Smalltalk, it's *not* 64 bits. Of course, if you
want a bunch of bits to specify a huge number of addresses across a
parallel machine, you may want larger addresses; but then you're really
using a lot of those bits as processor numbers.

  -- Paul

Paul R. Wilson
Software Systems Laboratory               lab ph.: (312) 996-9216
U. of Illin. at C. EECS Dept. (M/C 154)   wilson@bert.eecs.uic.edu
Box 4348, Chicago, IL 60680
lamaster@pioneer.arc.nasa.gov (Hugh LaMaster) (02/14/91)
In article <1991Feb13.170045.16864@uicbert.eecs.uic.edu> wilson@uicbert.eecs.uic.edu (Paul Wilson) writes:

>I don't know if 64-bit addressing is a good idea in general, and I suspect
 : : :
>anywhere. Going to 64-bit addresses would *double* your memory
>requirements. There are much better ways to spend transistors.

OK. Since you will need more than 32 bits to address the physical memory in
your 1998-era personal computer, you haven't lost anything! 1/2 :-)

>This scheme *can* support more RAM than the addresses specify, if you
>use a kind of page-switched architecture. (A form of bank switching,
 : : :
>page-fault time. (This scheme is a recent development, and wasn't
>in the tech report on swizzling that I sent out to a lot of people
>who requested it.) [By the way, everybody who asked more than a
>week ago -- 75 people! -- should have gotten it by now.]

As far as I can tell, from your description and from the papers which you
sent (thanks!), the maximum size of any particular object is, however, the
maximum amount of virtual memory which you can map. This is fine for your
applications, apparently, which consist of zillions of little objects, most
of which are less than a page in size. But it isn't fine for my users'
applications, which require a few large objects, namely linearly addressed
arrays of 10-20% of the size of whatever physical memory is available. So,
once physical memories are, say, > 10 GB, you artificially limit the size
of the application that I can run. Of course, there are techniques for
addressing really large objects, but these always run into an efficiency
problem when accessing arrays in multiple directions. (In other words,
useless :-)

>Even if you can afford 64-bit chips, you may not want them. A 32-bit
>chip can probably be made faster, if only by using faster materials

MIPSCo claims that it will do 64-bit address generation without penalty.
Whether or not this turns into a real hardware product, there are machines
in existence which require 3 cycles just to generate a *32*-bit address.
The penalty for 64 bits will probably be manageable.

>I'm not really clear on what the "optimal" hardware address size is,
>but I'd bet that for Lisp and Smalltalk, it's *not* 64 bits.

For Smalltalk, it could waste some memory, based on your description of
memory allocation, compared to the swizzled address scheme. The question to
answer is: do you expect *only* a large number of small objects, or do you
need to handle the case of a small number of large objects efficiently,
too? The history of computing is littered with the bones of segmented
architectures, which could handle the first case just fine, but not the
second, by the current standards of their day. Are *you* going to tell your
users that they can buy a machine with 128 GBytes of memory, but the
biggest array they can use is 4 GBytes? (128 GBytes is what I expect to see
on a large machine built out of 64-Mbit chips.) I wouldn't want to tell my
users that! :-)

>Of course, if you want a bunch of bits to specify a huge number of
>addresses across a parallel machine, you may want larger addresses;

Or a "huge" number of physical memory addresses in a single machine. :-)
What seems huge to one person is just a factor of two increase in
resolution in a 3-D model to someone else...

******
P.S.: - Quote of the Day -

"By 1995, a common desktop machine will have 1,000 MBytes of main memory,
1,000 SPECmarks of performance, 10,000 MBytes of secondary storage, a
100 MByte network connection, a homogeneous 64-bit address space, wide-area
shared memory, and a vastly richer interface environment."

  John Gage and Bill Joy, Unix Today, Feb. 4, 1991 issue.

Hugh LaMaster, M/S 233-9,      UUCP: ames!lamaster
NASA Ames Research Center      Internet: lamaster@ames.arc.nasa.gov
Moffett Field, CA 94035        With Good Mailer: lamaster@george.arc.nasa.gov
Phone: 415/604-6117
eliot@cs.qmw.ac.uk (Eliot Miranda) (02/15/91)
In article <1991Feb13.170045.16864@uicbert.eecs.uic.edu> wilson@uicbert.eecs.uic.edu (Paul Wilson) writes:

>I don't know if 64-bit addressing is a good idea in general, and I suspect
>it's a *bad* idea for languages like Lisp and Smalltalk, where the standard
>implementation technique is to make everything the size of a pointer,
>so that you can store either a pointer or a tagged "immediate" value
>anywhere. Going to 64-bit addresses would *double* your memory
>requirements. There are much better ways to spend transistors.

In a rather pedantic mode (apologies all) this is not quite so! In typical
Smalltalk-80 systems many objects are bit objects (e.g. strings, symbols,
bytecodes, bitmaps). When Smalltalk-80 implementations went from 16-bit
pointers to 32-bit pointers, the standard V2.0 image file grew from 598016
bytes (seriously folks!) to about 1 megabyte.

I've just measured my image (whilst reading your message), and for a system
with 40854 objects, 899884 bytes were taken up by 32-bit pointers, 702028
bytes by bit objects, and 163416 bytes by 4-byte object headers (40854*4).
The average length in bytes of bit objects was 58.639 excluding the display
screen's bitmap (62.3648 bytes including it). Each bit object is padded at
the end to a 32-bit boundary; the space wasted by the padding was 23737
bytes. If this were a 64-bit system padding to a 64-bit boundary, the
wasted space would be 56447 bytes.

So my image file should grow from about
    40854 * 8 (8-byte object table entry per object)
      + 899884 + 702028 + 163416 bytes = 2092160
to
    40854 * 16 + 899884 * 2 + 702028 + 56447 - 23737
      + 163416 * 2 bytes = 3515002 bytes,
an expansion of 1.68. (Which is depressingly close to 2.)

But think of the new immediate values you could have! With 8-bit tags:
Points (3529), Floats (291), <= 7-byte symbols (1067 symbols of size <= 7
vs 4886 of size >= 8 in this image),
<= 7-byte immutable strings (770 vs 1389), and <= 7-byte bytecode arrays
(also immutable) (1753 vs 4302). So that's at least
    3529 * (16 (OTEntry) + 16 (x&y oops) + 8 (header))
      + 291 * (16 (OTEntry) + 8 (value) + 8 (header))
      + (1067+770+1753) * (16 (OTEntry) + 8 (data) + 8 (header))
    fewer bytes = 265352
So now the expansion is from 2092160 to (3515002 - 265352), which is a
factor of 1.55.

That was fun :-)
--
Eliot Miranda                   email: eliot@cs.qmw.ac.uk
Dept of Computer Science        Tel: 071 975 5229 (+44 71 975 5229)
Queen Mary Westfield College    ARPA: eliot%cs.qmw.ac.uk@nsf.ac.uk
Mile End Road                   UUCP: eliot@qmw-cs.uucp
LONDON E1 4NS
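Eliot's arithmetic checks out. A few lines of Python reproduce his figures
(the object counts and byte totals are his measurements from the post; only
the variable names are invented here):

```python
# Re-deriving Eliot Miranda's image-size figures from his measurements.
objects     = 40854      # object count in his image
pointers    = 899884     # bytes held in 32-bit pointers
bit_objects = 702028     # bytes held in bit objects (strings, bitmaps, ...)
headers     = 163416     # 40854 * 4-byte object headers
pad32, pad64 = 23737, 56447   # padding waste at 32- vs 64-bit boundaries

# Current 32-bit image vs hypothetical 64-bit image:
now  = objects * 8  + pointers     + bit_objects + headers
wide = objects * 16 + pointers * 2 + bit_objects + pad64 - pad32 + headers * 2
print(now, wide, round(wide / now, 2))          # 2092160 3515002 1.68

# Savings from 8-bit-tagged immediates (Points, Floats, short symbols,
# short strings, short bytecode arrays):
saved = (3529 * (16 + 16 + 8)                   # Points: OT entry, x&y, hdr
         + 291 * (16 + 8 + 8)                   # Floats
         + (1067 + 770 + 1753) * (16 + 8 + 8))  # short byte objects
print(saved, round((wide - saved) / now, 2))    # 265352 1.55
```

Both of his quoted ratios, 1.68 and 1.55, fall out exactly.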
pcg@cs.aber.ac.uk (Piercarlo Grandi) (02/15/91)
On 14 Feb 91 21:02:19 GMT, eliot@cs.qmw.ac.uk (Eliot Miranda) said:
eliot> In article <1991Feb13.170045.16864@uicbert.eecs.uic.edu>
eliot> wilson@uicbert.eecs.uic.edu (Paul Wilson) writes:
wilson> I don't know if 64-bit addressing is a good idea in general, and
wilson> I suspect it's a *bad* idea for languages like Lisp and
wilson> Smalltalk, where the standard implementation technique is to
wilson> make everything the size of a pointer, so that you can store
wilson> either a pointer or a tagged "immediate" value anywhere. Going
wilson> to 64-bit addresses would *double* your memory requirements.
Not quite; the problem is rather with *some* Lisp implementations that use
pair-sized cells for everything. Many Lisps also use variable-size
allocations with very small overheads (those built with copying
collectors, for example, can do it easily), and all Smalltalk systems use
variable-sized segments, as explained below:
eliot> In a rather pedantic mode (appologies all) this is not quite so!
eliot> In typical Smalltalk-80 systems many objects are bit objects
eliot> (e.g. strings, symbols, bytecodes, bitmaps). When Smalltalk-80
eliot> implementations went from 16-bit pointers to 32-bit pointers the
eliot> standard V2.0 image file grew from 598016 bytes (seriously
eliot> folks!) to about 1 megabyte.
eliot> [ ... assume pointers, headers, etc double in size but not
eliot> variable sized segments, then get data and so calculations ... ]
eliot> an expansion of 1.68. (Which is depressingly close to 2).
eliot> [ ... but actually one could have as immediate values small sized
eliot> segments, so with more data and calculations ... ] now expansion
eliot> is from 2092160 to (3515002-265352) which is a factor of 1.55.
Actually these large expansion factors (not quite 2, but, as you remark,
fairly large) need not apply everywhere. Smalltalk regrettably is (like
virtually every other OO or 'AI' language) reference based. If it
*allowed* object composition by contiguity as well, as most less advanced
languages do, then many fewer pointers would be needed. Some statistics
show that sharing of subobjects is fairly rare in some major applications,
so they could be profitably inlined.
eliot> That was fun :-)
Not just fun, terribly useful. Both the data and the calculations.
Thanks for both.
--
Piercarlo Grandi | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk
sritacco@hpdmd48.boi.hp.com (Steve Ritacco) (02/16/91)
> I don't know if 64-bit addressing is a good idea in general, and I suspect
> it's a *bad* idea for languages like Lisp and Smalltalk, where the standard
> implementation technique is to make everything the size of a pointer,
> so that you can store either a pointer or a tagged "immediate" value
> anywhere. Going to 64-bit addresses would *double* your memory
> requirements. There are much better ways to spend transistors.

This doesn't seem very well thought out to me. 64 bits is a nice healthy
amount of room for a cons cell or an immediate value. It was pretty common
to steal the top few (or bottom) bits of a word for a tag, but with 64 bits
you can do that without sacrificing precision. What about applications of
clever cdr-coding schemes, which could give access to a huge address space
but efficiently describe common addresses with smaller offsets?

I think 64 bits opens up a whole new world of possibilities, and we must be
careful to re-think implementations, not just re-implement them.
wilson@uicbert.eecs.uic.edu (Paul Wilson) (02/17/91)
sritacco@hpdmd48.boi.hp.com (Steve Ritacco) writes:

>> I don't know if 64-bit addressing is a good idea in general, and I suspect
>> it's a *bad* idea for languages like Lisp and Smalltalk, where the standard
>> implementation technique is to make everything the size of a pointer,
>> so that you can store either a pointer or a tagged "immediate" value
>> anywhere. Going to 64-bit addresses would *double* your memory
>> requirements. There are much better ways to spend transistors.

[ NOTE (added later): I acknowledge Eliot Miranda's criticism of the above,
that it doesn't really *double* your memory needs, just increases it by
nearly 70%. That's bad enough for me... ]

>This doesn't seem very well thought out to me. 64 bits is a nice healthy
>amount of room for a cons cell or an immediate value.

I'm not sure I get this. It sounds like you're saying you can put an
immediate or a cons cell in 64 bits. But since a cons cell has two fields
(CAR and CDR) that have to be able to hold tagged pointers (or immediates),
a cons cell MUST be at least twice the size of a pointer. (Discounting
compression schemes like cdr-coding, or my own page-fault-time compression
scheme.)

I agree that 64 bits is a nice size for a cons cell, but that means it's
*not* a nice size for a pointer. You want 32-bit addresses so you can grab
two of them at once if you have a 32-bit bus. And you want pointer
swizzling to give you a large address space anyway. (And you want
page-fault-time compression to reduce the memory and disk requirements
even further.)

> It was pretty common
>to steal the top few (or bottom) bits of a word for a tag, but with 64 bits
>you can do that without sacrificing precision. What about applications of
>clever cdr-coding schemes which could give access to a huge address space
>but efficiently describe common addresses with smaller offsets.

Two-bit low tagging works great on byte-addressed machines.
Any other information can always be put in object headers, rather than in
pointer tags. I don't think it's a big win to have more bits available for
tag use. Cdr-coding is dead because it's slow, and requires a custom
architecture to support it efficiently, which Isn't Likely. (Look at
Symbolics... sigh.) Page-fault-time compression is better, but I still
don't think you want 64-bit architectures for Lisp or Smalltalk.

There's a lot of small (but nontrivial) wins from 32-bit machines instead
of 64-bit machines, like larger effective on-chip cache size. Lisp and
Smalltalk want *big* (and set-associative) caches.

>I think 64 bits opens up a whole new world of possibilities, and we must be
>careful to re-think implementations, not just re-implement them.

I agree. But we also need to rethink the whole idea of 64-bit
architectures. Pointer swizzling on 32-bit architectures is an idea we
should look at too.

[ I agree with Hugh LaMaster that you have restrictions on object sizes,
i.e., gigabyte arrays cause trouble. But it's not clear that there isn't
some cleverer way around that than simply doubling the word size. And it's
a lot nicer than simple segmenting, for most purposes, because programs and
programmers only have to see one kind of pointer, etc. And you only need a
32-bit hardware bus. ]

  -- Paul

Paul R. Wilson
Software Systems Laboratory               lab ph.: (312) 996-9216
U. of Illin. at C. EECS Dept. (M/C 154)   wilson@bert.eecs.uic.edu
Box 4348, Chicago, IL 60680
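The two-bit low tagging Wilson endorses can be sketched briefly. On a
byte-addressed machine with 4-byte-aligned objects, the low two bits of
every real pointer are 00 and are free to carry a tag (the particular tag
values below are invented for the example):

```python
# Sketch of two-bit low tagging on a byte-addressed machine.  Heap objects
# are 4-byte aligned, so a real pointer's low two bits are always 00 and
# can hold a tag instead.  Tag assignments here are illustrative only.
FIXNUM, POINTER = 0b00, 0b01

def tag_fixnum(n):
    return (n << 2) | FIXNUM        # integer payload shifted over the tag

def tag_pointer(addr):
    assert addr % 4 == 0            # alignment guarantees free low bits
    return addr | POINTER

def untag(word):
    """Decode a tagged word back to (kind, payload)."""
    if word & 0b11 == FIXNUM:
        return ("fixnum", word >> 2)
    return ("pointer", word & ~0b11)

assert untag(tag_fixnum(1000)) == ("fixnum", 1000)
assert untag(tag_pointer(0x12345678)) == ("pointer", 0x12345678)
```

As Wilson says, everything else about an object's type lives in its header;
the tag only has to distinguish the cases hit on every single operation.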
wilson@uicbert.eecs.uic.edu (Paul Wilson) (02/18/91)
pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:

>On 14 Feb 91 21:02:19 GMT, eliot@cs.qmw.ac.uk (Eliot Miranda) said:
>eliot> In article <1991Feb13.170045.16864@uicbert.eecs.uic.edu>
>eliot> wilson@uicbert.eecs.uic.edu (Paul Wilson) writes:
>wilson> I don't know if 64-bit addressing is a good idea in general, and
>wilson> I suspect it's a *bad* idea for languages like Lisp and
>wilson> Smalltalk, where the standard implementation technique is to
>wilson> make everything the size of a pointer, so that you can store
>wilson> either a pointer or a tagged "immediate" value anywhere. Going
>wilson> to 64-bit addresses would *double* your memory requirements.

>Not quite, the problem is rather with *some* Lisp implementations that
>do use pair-sized cells for everything. Many Lisps (for example those
>built with copying collectors can do it easily) also use variable sizes
>allocations, with very small overheads, and all Smalltalk systems use
>variable sized segments as explained below:

That's not what I was talking about. I *wasn't* talking about only being
able to allocate memory in pair-sized units. (My own garbage collector
certainly doesn't have this limitation -- see my 1989 OOPSLA paper if
you're interested.) I was just saying that pairs have to be able to hold
two pointers, so that (barring compression) they have to be bigger than
the pointer (address) size. Twice as big or more, in the general case. So
you can use 64-bit pairs, or 64-bit pointers, but not both.

Eliot is still right, however, that some objects (like string and bitmap
objects) contain fields that can't hold a pointer. This is a ubiquitous
efficiency hack. These objects therefore needn't expand, so we don't
really double memory requirements.

>Actually these large expansion factors (not quite 2, but as you remark
>fairly large) need not apply everywhere. Smalltalk regrettably is (like
>                                                   ^^^^^^^^^^^
>virtually every other OO or 'AI' language) reference based.
Them's fightin' words :-)

>*allowed* object composition by contiguity instead as well, as most less
>advanced languages do, then many fewer pointers would be needed. Some
>statistics show that sharing of subobjects is fairly rare in some major
>applications, so they could be profitably inlined.

Right, sharing is comparatively rare, but when it's used, it's *very*
*very* useful. The nice thing is that you don't have to rewrite your code
if you decide something should be shared -- you don't have to "de-inline"
it.

It's no coincidence that the first gc'd language (LISP) and the most
completely object-oriented languages (Smalltalk and Self) have
uniform-sized value cells for most purposes, into which you can shove
pretty much anything. Even strongly-typed polymorphic languages tend
toward this. It's handy to be able to change sharing patterns when you
need to, without redefining all of your object types to un-inline their
components. You can write much more general and reusable code. An object
is an object is an object (and other hype).

This is not to say I'm against contiguous compound objects; I'm just
saying they have a downside in terms of generality.

>eliot> That was fun :-)

>Not just fun, terribly useful. Both the data and the calculations.
>Thanks for both.

Yes, thanks.

  -- Paul

Paul R. Wilson
Software Systems Laboratory               lab ph.: (312) 996-9216
U. of Illin. at C. EECS Dept. (M/C 154)   wilson@bert.eecs.uic.edu
Box 4348, Chicago, IL 60680
sxc@se2.dsto.oz (Stephen Crawley) (02/21/91)
pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
[... about how going from 32 to 64 bit pointers hits Smalltalk etc]

>Actually these large expansion factors (not quite 2, but as you remark
>fairly large) need not apply everywhere. Smalltalk regrettably is (like
>virtually every other OO or 'AI' language) reference based. If it
>*allowed* object composition by contiguity instead as well, as most less
>advanced languages do, then many fewer pointers would be needed. Some
>statistics show that sharing of subobjects is fairly rare in some major
>applications, so they could be profitably inlined.

Adding explicit inlining of data structures to a typical reference-based
language (RBL) would destroy many of the properties that make it so
advanced! If you add constructs that allow the programmer to declare
static (non-reference) variables and fields:

1) You need an "address-of" operator:
   o if you don't have one, statics are not first-class
   o the address-of operator may be implicit, e.g. use call-by-reference
     when passing statics as function args
   o if you have address-of (implicit or explicit) you have to get local
     frames containing statics from the heap. If you don't do this, the
     language will have dangling pointer problems ... baaaad!

2) The programmer is encouraged to spend too much effort on storage
   efficiency when this is largely irrelevant in most cases. [What costs
   more; memory or programmer time?]

3) The computation model is more complex:
   o address-of gives an extra source of aliasing; another source of
     errors => less reliable programs
   o programs are harder to reason about.

4) The language syntax is larger and messier.

The alternative is to have the compiler/linker infer which variables and
fields in an RBL program can be inlined without changing the program's
semantics. Unfortunately this optimisation would need to be done globally
to get much benefit. Hence we can't use it in an incremental or persistent
environment. [That covers most "advanced" languages ...
sigh] At least this isn't unpleasant for the programmer.

In addition, inlining has hidden performance penalties:

1) You have to allocate local frames on the heap. This costs CPU on
   procedure entry/exit, extra GC overheads, extra memory. Clever
   optimisation can avoid some of this, at the cost of compilation time
   and compiler complexity. [Of course many RBLs start out with local
   frames on the heap!]

2) The garbage collector must understand pointers into the middle of
   objects. This costs CPU time and maybe memory too.

3) You can end up with uncollectable garbage. E.g. if you pass a reference
   to a static field of an object, the entire object must be retained by
   the GC. [OK, so the programmer / optimiser shouldn't have inlined that
   field. But that's just making his / her / its job harder!]

-- Steve
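Crawley's point 3 in miniature: with a simulated heap in which an interior
pointer is an (object id, field index) pair, any tracing collector has to
keep the whole containing object alive. The heap representation and `mark`
function below are a toy model invented for illustration:

```python
# Toy illustration of "uncollectable garbage" from inlined fields: a
# reference into the middle of an object forces the collector to retain
# the entire object.  Objects are lists; an interior pointer is a tuple
# (obj_id, field_index).
heap = {
    1: ["small"],
    2: ["big"] + [0] * 1000,      # a large object with inlined fields
}

def mark(roots):
    """Trace from roots; an interior pointer marks its whole object."""
    live = set()
    for ref in roots:
        obj_id = ref[0] if isinstance(ref, tuple) else ref
        live.add(obj_id)          # can't free just one field of it
    return live

# The only root is an interior pointer to one field of object 2 ...
roots = [(2, 500)]
# ... yet all 1001 fields of object 2 must survive collection.
assert mark(roots) == {2}
assert len(heap[2]) == 1001
```

A real collector that supports interior pointers must also map an interior
address back to its object's base, which is the extra CPU and memory cost
Crawley's point 2 refers to.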