[comp.arch] Unaligned Data Trapping -- Highly Useful for Lisp

lnz@lucid.com (Leonard N. Zubkoff) (08/14/90)

Alignment trapping by the hardware and operating system is in fact a highly
useful technique for getting some free type checking in Lisp code on
general-purpose architectures.  First, a bit of background on the layout of
Lisp data types in Lucid Common Lisp.

On most architectures, Lucid Common Lisp uses a 3-bit tag in the low-order 3
bits of each 32-bit word.  Two such tags, 0 and 4, are reserved for even and
odd fixnums, which allows for 30-bit signed integers as an immediate data type.
The remaining 6 tags are apportioned among immediate data types, where the
actual data is present in the word itself, and nonimmediate or pointer data
types, where the tag bits are smashed to a fixed value (usually 0) to form a
pointer to the object in the heap.  This requires all heap-allocated objects to
start on an 8-byte boundary and to be a multiple of 8 bytes in length, which
works well in practice: a CONS cell is exactly two words, and all other
nonimmediate objects have a header word in memory describing their secondary
type and length.
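
To make the encoding concrete, here is a minimal Python model of the layout
just described.  It is only a sketch: the function and constant names are
invented for illustration and are not Lucid's, and it assumes the fixed value
the tag bits are smashed to is 0.

```python
# Illustrative model of a 3-bit low tag in a 32-bit word.
# All names are hypothetical; none come from Lucid Common Lisp itself.
TAG_MASK = 0b111          # low-order 3 bits hold the primary tag
WORD_MASK = 0xFFFFFFFF    # 32-bit words
FIXNUM_SHIFT = 2          # tags 0 and 4 both end in binary 00

def primary_tag(word):
    """Return the 3-bit primary tag of a tagged word."""
    return word & TAG_MASK

def make_fixnum(n):
    """Encode a 30-bit signed integer; its low bit becomes bit 2 of the tag."""
    if not -(1 << 29) <= n < (1 << 29):
        raise OverflowError("does not fit in a 30-bit signed fixnum")
    return (n << FIXNUM_SHIFT) & WORD_MASK

def fixnum_value(word):
    """Decode a fixnum word back to a signed Python integer."""
    if (word & 0b11) != 0:
        raise TypeError("not a fixnum")
    value = word >> FIXNUM_SHIFT
    if value >= (1 << 29):      # sign-extend from 30 bits
        value -= 1 << 30
    return value

def object_address(word):
    """Smash the tag bits to 0 to recover the 8-byte-aligned heap address."""
    return word & WORD_MASK & ~TAG_MASK
```

Because even and odd fixnums differ only in bit 2, a fixnum test needs to
examine just the low two bits of the word.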

Consider the following assignment of primary tags:

                 PRISM Tagging Scheme

Tag  Type              Contents
0    Even Fixnum       Even 30 Bit Integers
1    Cons              CONS Cells
2    Unbound           Various Internal Markers
3    Other Immediate   Immediate Data Types
4    Odd Fixnum        Odd 30 Bit Integers
5    NIL               The Distinguished Value NIL
6    Other Pointer     Pointer Data Types
7    Symbol            Symbols

References to the components of an object are generally a small integer
displacement from a register holding the object, and the CAR and CDR operations
normally compile into something of the form:

	CAR:	move [m1,3], m2
	CDR:	move [m1,-1], m2

Note that both displacements, 3 and -1, are congruent to 3 modulo 4, so the
effective address is word-aligned only when the low two bits of m1 are 01,
that is, when m1 carries tag 1 (Cons) or tag 5 (NIL).  If unaligned word
accesses trap, the hardware is therefore performing free type checking on all
CAR and CDR operations: executing either instruction when m1 contains anything
other than NIL or a CONS will trap, and NIL and CONS cells are the only two
types on which the CAR and CDR operations are allowed.
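
The alignment argument can be checked mechanically.  The following Python
sketch enumerates the eight primary tags and reports which ones give a
word-aligned effective address for each displacement; the helper names and the
base address are hypothetical, and a 4-byte word alignment requirement is
assumed.

```python
# Which primary tags survive an aligned-word-access check?
# Helper names and the base address are illustrative assumptions.
WORD_SIZE = 4     # bytes; word loads must be 4-byte aligned
CAR_DISP = 3      # from "move [m1,3], m2"
CDR_DISP = -1     # from "move [m1,-1], m2"

def traps(tag, disp, base=0x1000):
    """True if loading a word at (base | tag) + disp is unaligned."""
    address = (base | tag) + disp    # base is 8-byte aligned
    return address % WORD_SIZE != 0

def tags_that_pass(disp):
    """Primary tags whose effective address is word-aligned."""
    return [tag for tag in range(8) if not traps(tag, disp)]

print(tags_that_pass(CAR_DISP))   # [1, 5]: Cons and NIL
print(tags_that_pass(CDR_DISP))   # [1, 5]: Cons and NIL
```

Every other tag yields an address that is off by 1, 2, or 3 bytes from a word
boundary, so the load traps without any explicit tag test in the instruction
stream.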

The particular tagging scheme presented above is one I specifically designed
for Apollo's PRISM architecture, and hardware type checking normally occurs for
Other-Pointer and Symbol objects as well on this machine.

Since Lisp code and C code coexist in the same address space, it is best if the
operating system provides a mechanism to declare at run time the range of
addresses where it should reflect the alignment trap to the user instead of
silently fixing it up and continuing.  C code may or may not want to trap
unaligned references, depending on the constraints imposed by the C compiler;
Lisp code always wants them trapped.
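
As a rough illustration of the policy being asked for, here is a Python sketch
of the decision the operating system would make on each alignment trap.  It is
not any real system's interface: the class and method names are invented, and
it assumes the declared ranges are matched against the address of the faulting
instruction, which the description above does not spell out.

```python
# Hypothetical model of a per-address-range alignment trap policy.
# No real operating system API is implied by these names.
class AlignmentTrapPolicy:
    def __init__(self):
        self._reflect_ranges = []    # list of (start, end), end exclusive

    def reflect_range(self, start, end):
        """Declare at run time that traps in [start, end) go to the user."""
        self._reflect_ranges.append((start, end))

    def handle(self, fault_pc):
        """Decide what to do with an alignment trap raised at fault_pc."""
        for start, end in self._reflect_ranges:
            if start <= fault_pc < end:
                return "reflect"     # deliver the trap to user code (Lisp)
        return "fixup"               # silently emulate the access (C default)
```

A Lisp runtime would register the range holding its compiled code at startup,
leaving C code in the same address space with the default behavior.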

		Leonard N. Zubkoff
		Principal Scientist
		Lucid, Incorporated