[comp.arch] MIPS R[23]000 and Kane's book

chris@mimsy.umd.edu (Chris Torek) (12/21/89)

I finally picked up a copy of Kane's book yesterday.  After reading it,
I find I have a few things to note here:

 - The typesetting is terrible.  Looks like someone did it on a Mac one
   page at a time (inconsistent spacing, etc).  Not too offensive,
   although there are some real errors:
	p. 7--5, `C.cond.fmt' instruction listed as a `C.fmt.cond'
		 instruction
	p. D--6, `bal' instruction is not marked with a pointing hand
		 dingbat (bal label => bgezal $0,label)
   The whole thing needs more proofreading.  Note that this is in the
   second printing: these things should have been fixed by now!

 - The explanation of exception servicing leaves a great deal to be
   desired with respect to the BD bit and branch delay slots.  In
   particular, the `servicing' text for all but one case where branches
   matter merely says `if the instruction is in a branch delay slot,
   the branch instruction must be interpreted [to find the exception
   return address].'  Under `Reserved Instruction Exception', however,
   there is much more, including:  `If the undefined instruction is in
   the branch delay slot, the routine that implements the instruction
   is responsible for simulating the branch instruction after the
   undefined instruction has been ``executed''.  Simulation of the
   branch instruction includes determining if the conditions of the
   branch were met and transferring control to the branch target
   address (if required) or to the instruction following the delay slot
   if the branch is not taken.'

   Now, the former text can be taken to imply that the `BD' bit is set
   only when a *taken* branch is interrupted during execution of the
   instruction in its branch delay slot.  (After all, if the branch is
   not branching, the instruction after the branch is not *in* a delay
   slot at all: it is merely another instruction.)  Until I read the
   paragraph under `Reserved Instruction Exception', this was how I
   figured things probably worked---it would certainly make the exception
   handler simpler, since one could assume that, if the BD bit were set,
   the return PC for the rfe instruction should be
	epc + 4 + 4 * *(short *)epc /* if little-endian */
   (for all branches except j, jal, jalr, and jr instructions), while
   if BD were not set, the return PC would be epc+4.

   Fortunately, the latter paragraph appears, if only once.  Alas, it
   is also misleading: it says `simulating the branch instruction after
   the undefined instruction has been ``executed'' '---but the branch
   must be simulated *before* the undefined instruction is handled,
   since the undefined instruction could affect the conditions being
   tested by the branch.  (A better way to put it is that the branch
   must be done based on the machine state before the undefined
   instruction is interpreted.)
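
   The arithmetic in the return-PC expression above can be sketched in
   C without depending on byte order, by working from the whole 32-bit
   instruction word instead of reading a halfword through a cast
   pointer.  This is only an illustration: the name branch_target is
   made up here, and it covers the branches that carry a 16-bit
   offset, not j, jal, jalr, or jr.

```c
#include <stdint.h>

/*
 * Taken-branch target for a MIPS I branch located at address epc.
 * insn is the 32-bit instruction word fetched from epc.
 * (Hypothetical helper, not taken from Kane or from any real kernel.)
 */
uint32_t branch_target(uint32_t epc, uint32_t insn)
{
    /* Sign-extend the low 16 bits without relying on narrowing casts. */
    int32_t offset = (int32_t)((insn & 0xffffu) ^ 0x8000u) - 0x8000;

    /* The offset counts words and is relative to the delay slot (epc + 4). */
    return epc + 4u + (uint32_t)offset * 4u;
}
```

   For example, a branch at 0x80001000 with an offset field of 3 gives
   0x80001010, and an offset field of 0xffff (that is, -1) gives the
   address of the branch itself.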

 - No explanation is given as to what happens when the instruction in
   a branch delay slot is a branch.  Testing shows that the result is
   a `visit':

		.set	noreorder	# not described in Kane!

		add	$2,$0,$0	# r2 <- 0
		j	L1		# branch to label 1
		j	L2		# (but this is done anyway)
	L1:	addi	$2,$2,1		# r2++
		j	$31		# return (not executed)
		nop			# fill branch slot (just in case)
	L2:	addi	$2,$2,4		# r2 += 4
		j	$31		# return
		nop			# fill branch slot

   (The result is the same if the first `nop' is removed---the instruction
   after the one at L1 is never noticed.)
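
   One way to picture the `visit' is a toy model in which every jump
   replaces the address of the instruction *after* the next one.  The
   C sketch below runs the test program above under that assumption.
   It is not a description of the real pipeline, only a model that
   happens to reproduce the observed result; the opcode names and
   run_toy_model are invented for the illustration.

```c
/* Toy model of delayed jumps: a jump changes the PC after the next one. */
enum op { OP_CLEAR, OP_ADDI, OP_J, OP_JR_RA, OP_NOP };

struct insn {
    enum op op;
    int arg;            /* immediate for OP_ADDI, target index for OP_J */
};

#define RETURN_INDEX (-1)   /* stands for "returned to the caller" */

static const struct insn prog[] = {
    { OP_CLEAR, 0 },    /* 0:     add  $2,$0,$0                  */
    { OP_J,     3 },    /* 1:     j    L1                        */
    { OP_J,     6 },    /* 2:     j    L2   (in the delay slot)  */
    { OP_ADDI,  1 },    /* 3: L1: addi $2,$2,1                   */
    { OP_JR_RA, 0 },    /* 4:     j    $31                       */
    { OP_NOP,   0 },    /* 5:     nop                            */
    { OP_ADDI,  4 },    /* 6: L2: addi $2,$2,4                   */
    { OP_JR_RA, 0 },    /* 7:     j    $31                       */
    { OP_NOP,   0 },    /* 8:     nop                            */
};

/* Runs the program and returns the final value of r2. */
int run_toy_model(void)
{
    int r2 = -1;
    int pc = 0, next_pc = 1;

    while (pc != RETURN_INDEX) {
        const struct insn *i = &prog[pc];
        int after_next = next_pc + 1;   /* default: fall through */

        switch (i->op) {
        case OP_CLEAR: r2 = 0;                     break;
        case OP_ADDI:  r2 += i->arg;               break;
        case OP_J:     after_next = i->arg;        break;
        case OP_JR_RA: after_next = RETURN_INDEX;  break;
        case OP_NOP:                               break;
        }
        pc = next_pc;           /* the delay slot always runs */
        next_pc = after_next;
    }
    return r2;
}
```

   In this model the instruction at L1 runs once as the delay slot of
   the second jump, control then lands on L2, and r2 ends up as 5; the
   `j $31' following L1 is never reached.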

 - On the R2000 itself: dynamically relocatable code is possible, but
   hard.  The only way to find out where you are now is to use a
   `branch on greater than or equal to zero and link' on register 0
   (always 0).  This stores the return address (the address of the
   instruction after the delay slot, i.e. the branch's own PC + 8)
   into register 31.  The PC is otherwise inaccessible.  Dynamic
   linking of shared libraries
   will be difficult.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

jmk@alice.UUCP (Jim McKie) (12/22/89)

In response to <21378@mimsy.umd.edu>:

Kane's book is truly awful; it is useless as a programming
aid for the R[23]000. There are errors, contradictions and
omissions. Particularly amusing is the description of the
MTHI and MTLO instructions - read them both then try to write
exception handling code.
Also, the instruction opcode bit encoding tables
are possibly the most obscure way the information could be
presented.
Fortunately, there is a new edition in the works, which we
are assured will have many of the problems fixed.

However, to be fair, it does mention that placing a branch
instruction in a branch-delay slot is undefined (page A-7),
and 'noreorder' is (indirectly) defined on pages C-5 and C-6.

Jim McKie	research!jmk	-or-	jmk@research.att.com
Bell Labs

chris@mimsy.umd.edu (Chris Torek) (12/23/89)

In article <10279@alice.UUCP> jmk@alice.UUCP (Jim McKie) writes:
>Also, the instruction opcode bit encoding tables
>are possibly the most obscure way the information could be
>presented.

(hi jim)

(p. A-87)  The `opcode' and `special' tables are fine; it is
only the `bcond' and the two `cop' tables that are bogus.

>However, to be fair, it does mention that placing a branch
>instruction in a branch-delay slot is undefined (page A-7),
>and 'noreorder' is (indirectly) defined on pages C-5 and C-6.

There is more on reorder/noreorder under `.set' on p. D-16.  None of
the above can be found in the index (I looked before posting).  The
only excuse I have for missing them the first time is brain fatigue (I
read the book all at once).

After some email conversation, I have found out how to describe
returning from exceptions:

	To return from an exception, load the return address into
	one of the two reserved kernel registers (k0 or k1) and do
	a `j k0; rfe' (or j k1; rfe).  Typically EPC is the return
	address, but see below.  In this description, `EPC' refers
	to the numeric value in EPC, and `return' means to return to
	the address given by this value.

	If the `BD' bit is not set:

		The instruction that trapped (and thus did not
		complete, and did not modify the machine state) is at
		EPC.

		If the instruction is to be retried, return.  If the
		instruction is to be skipped, add 4 to EPC and return.
		If the instruction is to be simulated, simulate it, add
		4 to EPC (to skip it), and return.

	If the `BD' bit is set:

		The instruction that trapped is at EPC+4.

		If the instruction is to be retried, return.  (This
		re-executes the branch, but this is safe since branches
		modify no machine state save the PC, which will be
		overwritten as part of the return sequence.)  If the
		instruction is to be skipped or simulated, the branch
		instruction at EPC must be simulated first (using
		machine state as of the trap) to compute the return
		address.  The trapped instruction may then be simulated
		or simply ignored.  Finally, return to the address
		calculated by the branch simulation.

I believe this description is correct.
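
The rules above can be sketched in C as a function that picks the
return address.  This is a minimal sketch under stated assumptions, not
code from any real kernel: the enum, the callback types, and the two
demo callbacks are all hypothetical, and a real handler would decode
the instructions and read BD from the Cause register itself.

```c
#include <stdbool.h>
#include <stdint.h>

enum action { RETRY, SKIP, SIMULATE };

/* Hypothetical callbacks supplied by the handler. */
typedef uint32_t (*branch_sim_fn)(uint32_t branch_addr, void *state);
typedef void (*insn_sim_fn)(uint32_t insn_addr, void *state);

/* Returns the address to load into k0/k1 before `j k0; rfe'. */
uint32_t exception_return_pc(uint32_t epc, bool bd, enum action act,
                             branch_sim_fn simulate_branch,
                             insn_sim_fn simulate_insn, void *state)
{
    uint32_t ret;

    if (act == RETRY)
        return epc;     /* with BD set this re-executes the branch */

    if (!bd) {
        /* The trapped instruction is at EPC. */
        if (act == SIMULATE)
            simulate_insn(epc, state);
        return epc + 4;
    }

    /* BD set: branch at EPC, trapped instruction at EPC+4.
     * Simulate the branch first, using the state as of the trap. */
    ret = simulate_branch(epc, state);
    if (act == SIMULATE)
        simulate_insn(epc + 4, state);
    return ret;
}

/* Demo stand-ins: a branch that is never taken, and an instruction
 * "simulator" that only counts calls through an int pointer. */
uint32_t demo_not_taken_branch(uint32_t branch_addr, void *state)
{
    (void)state;
    return branch_addr + 8;     /* instruction after the delay slot */
}

void demo_count_insn(uint32_t insn_addr, void *state)
{
    (void)insn_addr;
    ++*(int *)state;
}
```

With the demo callbacks, a simulated instruction in the delay slot of a
branch at 0x1000 returns to 0x1008, while a skipped instruction at
0x1000 with BD clear returns to 0x1004.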

Now all you need is a list of which instructions are to be retried,
which are to be simulated, and which are to be skipped.  This depends
on the cause of the exception too.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris