andrew@frip.gwd.tek.com (Andrew Klossner) (07/21/88)
> The question (x>>32) has come up. K&R and ANSI-C say the result is undefined.
> The 68000 shift instruction will give you a 0.
> The 80286 shift instruction will give you a 0.
> The 80386 shift instruction will give you x.
> The Weitek XL shift instruction will give you x.
> I don't know what MIPS Co. 88000 or 29000 return but I suspect the C compiler
> does what the instruction does.

The 88000 will give you ... x&1.  Here's why: the shift-right function
is implemented by a generalized "extract bit field" instruction.  The
assembler notations are:

        ext     rd,rs,wid<off>  ; extract signed field of width w, offset o
                                ; from rs, store in rd (with sign extension)
        extu    rd,rs,wid<off>  ; like ext but field is unsigned
                                ; (extended with zeros)
        ext     rd,rs,rind      ; rind contains width and offset, see below
        extu    rd,rs,rind      ; ditto, unsigned

The "wid" (width) and "off" (offset) fields of the instructions are
each five bits wide.  A specified width of 0 is taken to mean 32, in
which case the instruction acts like a (signed or unsigned) right shift
of "offset" bits.

For the register-indirect version, rind is laid out as follows:

  bit:  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
        1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|w|w|w|w|w|o|o|o|o|o|
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        \_______________don't care________________/ \_width_/ \_offset/

(The low ten bits of this register match the low ten bits of the direct
form of the extract instruction.)

All these instructions operate in a single cycle, so the generality
does not detract from performance.  They are included for the RISCy
reason that C code which manipulates bit fields shows up a lot in Unix
execution traces.

Now, the naive code to implement z = x>>y (assuming all operands in
registers) is just

        extu    z,x,y

but this only does the right thing if y is in the range [0..31].  If y
is 32, this is equivalent to

        extu    z,x,1<0>

which extracts a one-bit-wide field, the least significant bit of x.

A compiler choosing to implement extended semantics for shift count
(outside [0..31]) should emit code to qualify the operand before using
it.

(The shift-left instruction is implemented by the "mak" instruction,
which positions a bit field at a specified offset within a register,
filling the other bits with zeros.)

  -=- Andrew Klossner   (decvax!tektronix!tekecs!andrew)       [UUCP]
                        (andrew%tekecs.tek.com@relay.cs.net)   [ARPA]
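
The behaviour described in the post above can be modelled in a few
lines of C.  This is only a sketch built from the description in the
post, not from the 88000 manual: the function names (extu, naive_shr,
guarded_shr) are hypothetical, the handling of a field that runs past
bit 31 is an assumption, and returning 0 for counts above 31 is just
one possible choice of "extended semantics".

```c
#include <stdint.h>

/* Model of "extu rd,rs,rind" as described in the post.
   Only the low ten bits of rind matter: bits 9..5 hold the width,
   bits 4..0 hold the offset.  A width of 0 is taken to mean 32. */
uint32_t extu(uint32_t rs, uint32_t rind)
{
    unsigned width  = (rind >> 5) & 0x1f;
    unsigned offset = rind & 0x1f;
    uint32_t field  = rs >> offset;   /* offset is 0..31, so this is defined */

    if (width == 0)
        return field;                 /* width 32: a plain right shift */

    /* width is 1..31 here; keep only the low "width" bits.
       Assumption: bits beyond bit 31 of rs read as zero. */
    return field & ((UINT32_C(1) << width) - 1);
}

/* The naive compilation of z = x >> y: hand y to extu unchanged. */
uint32_t naive_shr(uint32_t x, uint32_t y)
{
    return extu(x, y);
}

/* One way to "qualify the operand" first.  Returning 0 for large
   counts is an assumed policy, not something the post prescribes. */
uint32_t guarded_shr(uint32_t x, uint32_t y)
{
    return (y > 31) ? 0 : extu(x, y);
}
```

With this model, naive_shr(x, 32) decodes the count as width 1 and
offset 0 and so yields x & 1, while naive_shr(x, 4) is an ordinary
shift by four.  The guarded version costs an extra compare and branch
(or conditional move) on every variable shift, which is presumably why
a compiler would rather lean on the "undefined" wording in the
standard.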
aglew@urbsdc.Urbana.Gould.COM (07/21/88)
>..> Andrew Klossner talks about the 88K's bit field instructions:
>
>For the register-indirect version, rind is laid out as follows:
>
>  bit:  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
>        1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
>       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>       |x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|w|w|w|w|w|o|o|o|o|o|
>       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>        \_______________don't care________________/ \_width_/ \_offset/
>
>(The low ten bits of this register match the low ten bits of the direct
>form of the extract instruction.)
>
>  -=- Andrew Klossner   (decvax!tektronix!tekecs!andrew)       [UUCP]
>                        (andrew%tekecs.tek.com@relay.cs.net)   [ARPA]

Perhaps someone can answer what I see as one of the biggest problems
with the 88K: why weren't 6 bit fields provided for the bit field
instructions? It seems short sighted to design an _architecture_ that
is limited to 32 bits (as opposed to an implementation).
andrew@frip.gwd.tek.com (Andrew Klossner) (07/24/88)
	"Perhaps someone can answer what I see as one of the biggest
	problems with the 88K: why weren't 6 bit fields provided for
	the bit field instructions? It seems short sighted to design an
	_architecture_ that is limited to 32 bits (as opposed to an
	implementation)."
The 88k shift (and all other ALU) instructions operate only on
registers.  Like other RISC architectures, its only instructions which
touch memory are load, store, and exchange.  Registers are fixed at 32
bits wide in the architecture.

To implement a double-register shift in a future chip, the "right"
approach would probably be to define a new instruction.  With the
current chip organization, such an instruction would necessarily take
multiple cycles because there aren't enough register ports to do it in
one cycle.  This would be the first multi-cycle "integer unit"
instruction, and I suspect it would bend the integer unit considerably
to fit it in.
  -=- Andrew Klossner   (decvax!tektronix!tekecs!andrew)       [UUCP]
                        (andrew%tekecs.tek.com@relay.cs.net)   [ARPA]
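
For readers wondering what a double-register shift would have to
compute, here is a minimal C sketch of a logical right shift of a
64-bit quantity held in two 32-bit registers.  The name shr64 and its
calling convention are hypothetical, and nothing here is claimed to
match how an 88k instruction would actually be defined.

```c
#include <stdint.h>

/* Logical right shift of the 64-bit value hi:lo by count bits,
   using only 32-bit operations.  count is reduced modulo 64,
   which is an assumed convention for this sketch. */
void shr64(uint32_t hi, uint32_t lo, unsigned count,
           uint32_t *out_hi, uint32_t *out_lo)
{
    count &= 63;

    if (count == 0) {
        /* Handled separately so that no 32-bit shift below
           is ever performed with a count of 32. */
        *out_hi = hi;
        *out_lo = lo;
    } else if (count < 32) {
        /* Bits shifted out of hi enter the top of lo. */
        *out_lo = (lo >> count) | (hi << (32 - count));
        *out_hi = hi >> count;
    } else {
        /* The whole of lo is shifted out; hi slides into lo. */
        *out_lo = hi >> (count - 32);
        *out_hi = 0;
    }
}
```

The function reads three values (hi, lo, count) and writes two, which
illustrates the register-port pressure mentioned in the post: a
single-cycle version would need more read and write ports than an
ordinary two-source, one-destination instruction.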