horst@techfak.uni-bielefeld.de (04/30/91)
Does anyone know what TAGGED DATA instructions are useful for and how to
use them? Tagged data is assumed to be 30 bits wide followed by two bits
set to zero. The SPARC allows add and subtract instructions on tagged data.

HH

[Most likely it's for immediate integers in a Lisp-like system that uses
tagged pointers, but I hope someone who actually knows will tell us. -John]
--
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.
eb%watergate@lucid.com (Eric Benson) (04/30/91)
In article <9104291542.AA11213@flora.techfak.uni-bielefeld.de> horst@techfak.uni-bielefeld.de wrote:
> Does anyone know what TAGGED DATA instructions are useful for and how to
> use them? Tagged data is assumed to be 30 bits wide followed by two bits
> set to zero. The SPARC allows add and subtract instructions on tagged data.
>
> [Most likely it's for immediate integers in a Lisp-like system that uses
> tagged pointers, but I hope someone who actually knows will tell us. -John]

Yes, the tagged arithmetic instructions were put in the SPARC architecture
for Lucid Common Lisp. If the low-order two bits of a Lisp object reference
are zero, it is a 30-bit immediate fixnum. If some of those bits are
non-zero, it may be a pointer to a floating point number or a bignum
(arbitrary-precision integer). Generic arithmetic is generally optimized
for the fixnum case, since the overwhelming majority of arithmetic is
performed on small integers. On many machines + is compiled inline as:

  Test low order two bits of first operand.
  If nonzero, use general case. (Operand could be a float or bignum.)
  Test low order two bits of second operand.
  If nonzero, use general case. (Operand could be a float or bignum.)
  Add two operands.
  If overflow, use general case. (Result is a bignum.)

On the SPARC this is done as one instruction (TADDCC) followed by a
conditional branch rarely taken.

eb@lucid.com           Eric Benson
415/329-8400 x5523     Lucid, Inc.
Telex 3791739 LUCID    707 Laurel Street
Fax 415/329-8480       Menlo Park, CA 94025
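The inline sequence described above can be sketched in portable C. This is only a software emulation of the check that TADDCC plus a branch performs, assuming a 32-bit object word; the names `lispobj`, `make_fixnum`, `fixnum_value` and `fixnum_add_fast` are invented for illustration and do not come from Lucid's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical representation: one 32-bit word per Lisp object reference.
   Low two bits 00 mean "30-bit fixnum"; anything else is some other type. */
typedef uint32_t lispobj;

#define TAG_MASK 0x3u

/* Encode a small integer as a fixnum (value in the upper 30 bits). */
static lispobj make_fixnum(int32_t n)
{
    assert(n >= -(1 << 29) && n < (1 << 29));   /* must fit in 30 bits */
    return (lispobj)n << 2;
}

/* Decode a fixnum. Assumes the usual two's complement conversion from
   unsigned to signed; the division is exact because the tag bits are 00. */
static int32_t fixnum_value(lispobj x)
{
    return (int32_t)x / 4;
}

/* Returns true and stores the sum if both operands are fixnums and the add
   does not overflow; returns false if the general case is needed. */
static bool fixnum_add_fast(lispobj a, lispobj b, lispobj *sum)
{
    if ((a | b) & TAG_MASK)
        return false;               /* an operand is a float, bignum, ... */
    uint32_t r = a + b;             /* unsigned add wraps, well defined */
    /* Signed overflow: operands share a sign and the result's sign differs. */
    if (~(a ^ b) & (a ^ r) & 0x80000000u)
        return false;               /* result would have to be a bignum */
    *sum = r;
    return true;
}
```

A caller would try `fixnum_add_fast` first and fall back to a generic add routine when it returns false. The two tag tests are merged into one by OR-ing the operands, which is a common software shortcut and not something the post itself describes.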
moss@cs.umass.edu (Eliot Moss) (04/30/91)
Well, I cannot speak for SPARC and say what the instructions were DESIGNED
for, but as the moderator pointed out, they can be used to good effect in
implementing languages such as Smalltalk and LISP, which use tagging to
distinguish (small, i.e., 30-bit) integers from pointers. One uses a tag of
00 in the low bits for integers, and a tag of 01 (say) for pointers. All
offsets from pointers are scaled by -1 to compensate for the 01 in the low
bits. Note that integer add/sub on pointers will be trapped (if you use the
tagged add/sub instructions) and pointer access off an integer can also be
trapped. This means you don't have to insert gobs of tag checking code all
over the place.

Multiply and divide tend to require scaling and adjustment anyway, and
besides, they take long enough, and are rare enough (compared with add/sub)
that the additional penalty in handling the tags is judged "acceptable".
--
J. Eliot B. Moss, Assistant Professor
Department of Computer and Information Science
Lederle Graduate Research Center
University of Massachusetts
Amherst, MA 01003
(413) 545-4206, 545-1249 (fax); Moss@cs.umass.edu
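The pointer half of this scheme can be sketched in C, reading "scaled by -1" as "biased by -1": every field offset has 1 subtracted so that the 01 tag cancels out. The names `objref`, `tag_pointer`, `is_pointer` and `load_field` are hypothetical, and the sketch assumes 4-byte-aligned objects.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t objref;       /* hypothetical tagged object reference */

#define TAG_MASK    3u
#define POINTER_TAG 1u          /* the "01 (say)" tag from the post */

/* Tag a word-aligned address: its low bits are 00, so adding 1 gives 01. */
static objref tag_pointer(void *p)
{
    assert(((uintptr_t)p & TAG_MASK) == 0);     /* must be 4-byte aligned */
    return (uintptr_t)p + POINTER_TAG;
}

/* True if the reference carries the pointer tag. */
static int is_pointer(objref ref)
{
    return (ref & TAG_MASK) == POINTER_TAG;
}

/* Load the 32-bit field at byte offset `off` inside the object. The -1
   cancels the tag, so no separate masking instruction is needed. */
static int32_t load_field(objref ref, uintptr_t off)
{
    return *(const int32_t *)(ref + off - POINTER_TAG);
}
```

If `load_field` were handed a fixnum (tag 00) by mistake, the computed address would end in binary 11. On a machine that faults on misaligned word loads, as SPARC does, that access traps, which is the free check on "pointer access off an integer" mentioned above.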
weaver@Sun.COM (Michael Weaver) (04/30/91)
In article <9104291542.AA11213@flora.techfak.uni-bielefeld.de> horst@techfak.uni-bielefeld.de writes:
>Does anyone know what TAGGED DATA instructions are useful for and how to
>use them? ...

Tagged data instructions apparently were borrowed from the Berkeley SOAR
(Smalltalk on a RISC) project.

Studies have shown that much of the time used by Smalltalk programs is
taken up in adds and subtracts, even though for most adds and subtracts
both operands are integers, because in practice for every add (etc.) a
method lookup must be done. That is, the types of both operands must be
checked, and then a piece of code that will perform the operation on this
combination of operand types must be found and invoked.

Tagged instructions can be used (on SPARC) by generating instructions as
though all these adds and subtracts were on integers, but using the tagged
add (etc.) instruction rather than add. Integers are encoded by setting the
two low-order bits of a word to zero to indicate an integer, and using the
upper 30 to represent the value. Data types other than integers must have
at least one of the two lowest bits set, and the upper 30 bits can be
encoded arbitrarily, so that the 32 bits can be used (by software) to
determine the type and value of the operand.

If two such integers are added with a tagged add, the result is the sum,
similarly encoded. However, if either of the operands of a tagged add has
either of its low bits set, then a trap is taken. The trap handler then can
check both operands, dispatch to appropriate code to effect the add, and
then resume execution following the add.

The overall effect is that adds of integers happen quickly while adds of
other types are slowed down a bit. If most of the adds are actually on
integers, the overall run times are improved.

I imagine that Lisp can benefit from these instructions similarly.

See Bush et al., 'Compiling Smalltalk-80 to a RISC', in Proceedings Second
International Conference on Architectural Support for Programming Languages
and Operating Systems (ASPLOS II).

Michael Weaver.
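The trap-and-dispatch flow described in this post can be imitated in C with an ordinary function call standing in for the trap. Everything here is a toy under stated assumptions: the tag value 10 for boxed floats, the fixed-size `float_heap`, and the names `tagged_add`, `add_trap`, `box_float` and `as_double` are all invented for the sketch, and overflow (which would also need the slow path) is left out.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t word;

#define TAG_MASK  3u
#define TAG_FLOAT 2u                /* hypothetical tag for boxed floats */

static double   float_heap[16];     /* toy "heap" of boxed floats */
static unsigned float_count = 0;
static unsigned trap_count  = 0;    /* how often the slow path ran */

/* Store a double in the toy heap and return a tagged reference to it. */
static word box_float(double d)
{
    assert(float_count < 16);
    float_heap[float_count] = d;
    return ((word)float_count++ << 2) | TAG_FLOAT;
}

/* Convert either kind of operand to a double. */
static double as_double(word w)
{
    if ((w & TAG_MASK) == 0)
        return (double)((int32_t)w / 4);    /* integer in the upper 30 bits */
    assert((w & TAG_MASK) == TAG_FLOAT);
    return float_heap[w >> 2];
}

/* Plays the role of the trap handler: inspect both operands, perform the
   add for this combination of types, and hand back the result. */
static word add_trap(word a, word b)
{
    trap_count++;
    return box_float(as_double(a) + as_double(b));
}

/* Plays the role of the tagged add instruction (overflow check omitted). */
static word tagged_add(word a, word b)
{
    if ((a | b) & TAG_MASK)
        return add_trap(a, b);      /* rare path: some operand is not an integer */
    return a + b;                   /* common path: a single add */
}
```

Adding two integers never touches `add_trap`, so `trap_count` stays put; mixing an integer with a boxed float goes through the handler once. That is the cost model the post describes: integer adds stay fast while other types pay for the dispatch.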
kers@otter.hpl.hp.com (Chris Dollin) (05/07/91)
Eliot Moss says:

> Well, I cannot speak for SPARC and say what the instructions were DESIGNED
> for, but as the moderator pointed out, they can be used to good effect in
> implementing languages such as Smalltalk and LISP, which used tagging to
> distinguish (small, i.e., 30-bit) integers from pointers. One uses a tag of
> 00 in the low bits for integers, and a tag of 01 (say) for pointers. All
> offsets from pointers are scaled by -1 to compensate for the 01 in the low
> bits.

Doesn't this choice make inter-language working unnecessarily hard? It means
that structures containing pointers cannot be safely passed to (say) C
routines, because all the pointer values are wrong. (Structures that you
pass to foreign procedures need their numbers raw anyway.) Seems to me that
the fixnum tag should have been something other than 0.

Isn't it nice when hardware does *almost* what you want?
--
Regards, Kers.
pardo@june.cs.washington.edu (David Keppel) (05/09/91)
>>[SPARC tagged instructions: set low bits in pointers.]

kers@otter.hpl.hp.com (Chris Dollin) writes:
>[Inter-language hard; pointers cannot be safely passed.]
>Isn't it nice when hardware does *almost* what you want?

Or put another way: isn't it nice when your programming environment lacks
standardized representations for inter-language calls and your compiler and
linker lack hooks for taking advantage of them even if they did exist?

	;-D on  ( Tagged code )  Pardo
henry@zoo.toronto.edu (Henry Spencer) (05/10/91)
In article <KERS.91May7093547@cdollin.hpl.hp.com> kers@otter.hpl.hp.com (Chris Dollin) writes:
>Doesn't this choice make inter-language working unnecessarily hard? It means
>that structures containing pointers cannot be safely passed to (say) C ...

It may be impossible to pass structures to C anyway, because of other design
decisions made differently. Even calls between C and FORTRAN, which are
*much* closer in basic philosophy than C and Lisp-derived languages, have
many boobytraps and take careful attention on both ends.

Given that both ends know what is going on, actually, there is no disastrous
problem. The C code simply has to correct the values of incoming pointers
(in an inevitably machine-specific way -- all these conventions are quite
machine-specific!) before using them. This is, at worst, a fairly routine
problem of inter-language calls. It can be much worse.

>the fixnum tag should have been something other than 0.

Except that then you need a special adder which knows about it, because you
don't want the tag to change during (say) fixnum addition, and 0 is the only
one with that property. The low-bits-zero scheme potentially involves no
extra data-path hardware, because the same old adder will work and the
check-for-non-zero-bits hardware is already there for pointers.

>Isn't it nice when hardware does *almost* what you want?

Most Lispish-language users consider higher execution speed more important
than more convenient interlanguage calls. The hardware *is* doing what they
want.
--
Henry Spencer @ U of Toronto Zoology
henry@zoo.toronto.edu  utzoo!henry
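Both technical claims in this post are easy to check with a few lines of C. The sketch below assumes a two-bit tag in the low bits of a 32-bit word; `encode`, `tag_survives_add` and `untag_pointer` are hypothetical helpers, not part of any real Lisp runtime.

```c
#include <assert.h>
#include <stdint.h>

/* Encode n with an arbitrary two-bit tag in the low bits. */
static uint32_t encode(uint32_t n, uint32_t tag)
{
    assert(tag < 4);
    return (n << 2) | tag;
}

/* Returns 1 if adding two words that carry `tag` with the ordinary adder
   yields a word that still carries `tag`. The low bits of the sum are
   (tag + tag) mod 4, which equals tag only when tag is 0. */
static int tag_survives_add(uint32_t tag)
{
    uint32_t sum = encode(5, tag) + encode(7, tag);
    return (sum & 3u) == tag;
}

/* What the C side of a foreign call might do with an incoming tagged
   pointer: subtract the agreed tag before dereferencing. */
static void *untag_pointer(uintptr_t ref, uintptr_t tag)
{
    return (void *)(ref - tag);
}
```

`tag_survives_add` returns 1 only for tag 0; with any other tag the adder would have to treat the low bits specially, which is the "special adder" objection. `untag_pointer` shows that the pointer correction itself is a single subtraction once both sides agree on the tag value.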
pardo@june.cs.washington.edu (David Keppel) (05/11/91)
henry@zoo.toronto.edu (Henry Spencer) writes:
>[It may be difficult to pass structures to C anyway because of
> other design decisions made differently.]

Indeed, it may be difficult to pass structures between C and C if the
fragments were compiled with different compilers (e.g., libraries).
Problems include:

  * Structure passing (most notably return) conventions
  * Alignment
  * Padding

So, for example, a compiler for a machine that allows unaligned fetches
(e.g., the VAX) might implement

	struct { char c; int i; }

as:

  * one byte followed by 4 bytes, non-padded
  * one byte followed by 4 bytes, padded to 8 bytes
  * 1 byte, padded to 4 bytes, followed by 4 bytes

I think these choices are all legal, but I wouldn't swear to it. The point
is that a compiler may legitimately derive different struct layouts from
one hardware specification. The second one is pretty silly, but the third
may actually improve performance by reducing the number of unaligned
fetches.

So even if all the world programmed in C, we still wouldn't have solved the
interoperability problem :-)

	;-D on  ( Inter-city commuting?  Inter-face computing )  Pardo
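Which of these layouts a given compiler picked can be inspected with `offsetof` and `sizeof`. The sketch below gives the struct a tag name, `struct s`, purely so it can be referred to; `print_layout` is likewise a made-up helper. The numbers it prints depend on the compiler and target, so none are shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct s { char c; int i; };

/* Print the layout the current compiler chose for struct s. */
static void print_layout(void)
{
    /* The first member always sits at offset 0; the rest is up to the
       implementation, subject to declaration order being preserved. */
    assert(offsetof(struct s, c) == 0);
    printf("offsetof(i) = %zu, sizeof(struct s) = %zu\n",
           offsetof(struct s, i), sizeof(struct s));
}
```

In terms of the three candidate layouts, the first would report an offset of 1 and a size of 5, the second an offset of 1 and a size of 8, and the third an offset of 4 and a size of 8, assuming a 4-byte `int`. Two object files can only exchange `struct s` values safely if their compilers agree on both numbers.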