[comp.arch] MIPS R[236]000 interrupts

chris@mimsy.umd.edu (Chris Torek) (08/25/90)

>In article <41066@mips.mips.COM> cprice@mips.COM (Charlie Price) writes:
>>If an exception occurs during execution of the instruction in a branch
>>delay slot or "between" a branch and the instruction in the branch-
>>delay slot, the Cause register has the Branch Delay (BD) bit set and the
>>EPC register contains the address of the branch instruction.

[and the interrupt handling routine has to emulate the branch]
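
A minimal sketch of the first thing such a handler has to work out, assuming
the R2000/R3000 layout in which BD is bit 31 of the Cause register. The
function name is made up for illustration; a real handler would be in
assembly and would also decode and emulate the branch before resuming.

```python
CAUSE_BD = 1 << 31  # Branch Delay bit in the Cause register (R2000/R3000)


def faulting_address(cause, epc):
    """Return the address of the instruction that took the exception.

    With BD clear, EPC already points at the offending instruction.
    With BD set, EPC points at the branch, so the offending instruction
    is the delay slot one word (4 bytes) later.
    """
    if cause & CAUSE_BD:
        return (epc + 4) & 0xFFFFFFFF
    return epc & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(faulting_address(CAUSE_BD, 0x80001000)))
    print(hex(faulting_address(0, 0x80001000)))
```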

In article <1990Aug25.014235.6894@mozart.amd.com> tim@proton.amd.com
(Tim Olson) writes:
>Just curious -- what happens in the perverse case that someone tries a
>conditional branch-and-link instruction using the link register as a
>conditional source ...

There are a number of things you *can* do which are effectively labelled
`unpredictable'; this is one of them.  If you try it your code misbehaves.
(The machine continues to function normally, but your program mysteriously
bombs sometimes.)

(Surely the 29000 has some places where the architecture book says `if
you do this, you lose'?)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

cprice@mips.COM (Charlie Price) (08/27/90)

In article <26200@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
>>In article <41066@mips.mips.COM> cprice@mips.COM (Charlie Price) writes:
>>>If an exception occurs during execution of the instruction in a branch
>>>delay slot or "between" a branch and the instruction in the branch-
>>>delay slot, the Cause register has the Branch Delay (BD) bit set and the
>>>EPC register contains the address of the branch instruction.
>
>[and the interrupt handling routine has to emulate the branch]
>
>In article <1990Aug25.014235.6894@mozart.amd.com> tim@proton.amd.com
>(Tim Olson) writes:
>>Just curious -- what happens in the perverse case that someone tries a
>>conditional branch-and-link instruction using the link register as a
>>conditional source ...
>
>There are a number of things you *can* do which are effectively labelled
>`unpredictable'; this is one of them.  If you try it your code misbehaves.
>(The machine continues to function normally, but your program mysteriously
>bombs sometimes.)
>
>(Surely the 29000 has some places where the architecture book says `if
>you do this, you lose'?)

From the "Kane book" description of BGEZAL (one such instruction):

	General register "rs" may not be general register 31,
	because such an instruction is not restartable.
	An attempt to execute this instruction is *not* (italics)
	trapped, however.

This is one of the things that you can trip over with
"simple" pipelined machines,
and I suppose this one isn't all that obvious.
Clearly the assembler should warn you about this dubious usage
similar to the warnings it issues when you write in .noreorder
mode and use a target register in a branch delay slot.
The question is whether it does.
The answer is no, it doesn't.  I've entered a bug report.  Thanks.
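
A rough sketch of the kind of check an assembler could make, assuming the
MIPS I REGIMM encoding (major opcode 1, with the rt field selecting BLTZAL
or BGEZAL). The function names are hypothetical and this is not the MIPS
assembler's code. The likely reason for the restriction: the link value is
written to r31 even if the delay slot then faults, so re-executing the
branch from EPC would test the overwritten register.

```python
REGIMM = 0x01    # major opcode shared by BLTZ/BGEZ/BLTZAL/BGEZAL
BLTZAL = 0x10    # rt field value selecting BLTZAL
BGEZAL = 0x11    # rt field value selecting BGEZAL
LINK_REG = 31    # register written by the "and link" forms


def encode_regimm(rs, rt, offset):
    """Assemble a REGIMM-format instruction word (illustrative helper)."""
    return (REGIMM << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


def is_unrestartable_branch_and_link(word):
    """True if word is BLTZAL/BGEZAL whose source register is r31."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    return opcode == REGIMM and rt in (BLTZAL, BGEZAL) and rs == LINK_REG


if __name__ == "__main__":
    for rs in (4, 31):
        word = encode_regimm(rs, BGEZAL, 8)
        if is_unrestartable_branch_and_link(word):
            print("warning: bgezal $%d uses the link register as source" % rs)
```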
-- 
Charlie Price    cprice@mips.mips.com        (408) 720-1700
MIPS Computer Systems / 928 Arques Ave. / Sunnyvale, CA   94086-23650

colin@array.UUCP (Colin Plumb) (08/28/90)

In article <26200@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
> (Surely the 29000 has some places where the architecture book says `if
> you do this, you lose'?)

In the 29000's stack cache, there are four pointer registers: the base
stack pointer (which all local register references are relative to)
and three indirect registers, which are most often used to pass
register addresses to trap handlers.  For example, all the defined-but-
unimplemented opcodes do local register mapping and protection checking
on the register operand fields and, if all is well, bung the results
into the three indirect pointer registers for use by the emulator.

If you set any of these registers, and then try to use it implicitly
(by accessing a local register or register 0, which means "wherever the
indirect pointer indicates") in the next instruction, you may get the old
value (if nothing intervened) or the new one (stall or interrupt).
After one cycle, it's safe.

This could be interlocked (you just have to stall the instruction in
decode/register fetch), but I think the designers felt it wasn't
worth the bother.  If you promised a big enough order, though...

There are also a couple of things that are supervisor-only and pretty
low level that have some latency.  But that's the only user-visible
one.

Has anything changed on the 29050, guys?
-- 
	-Colin

r_carlso@hpfcdj.HP.COM (Richard_Carlson) (08/30/90)

> The point is that ignoring a parity error is a pretty safe thing to do; there's
> very little chance of getting a misleading answer. Much better than crashing
> the computer, which is guaranteed to lose you whatever you had in memory.
> 
> Russell Wallace, Trinity College, Dublin
> rwallace@vax1.tcd.ie

I used to feel this way until I had an interesting experience with some 
Apple ][s.  I was developing and testing 6502 assembly code on one
machine with an emulator; then programming EPROMs on another, remotely-
and inconveniently-located, machine.

I got the code working on the emulator, burned some EPROMs, and then the
program would crash and die.  I burned new sets of EPROMs and they had
the same problem.  The EPROMs verified when programmed; and the programmed
EPROMs verified against the data on my disk.  Considering some of the
hardware differences between emulation and actually running from ROM, such
as being able to map different ROM pages into the same memory addresses
while the processor was executing out of those addresses, I spent a lot
of time looking for software problems in my code.

It turns out that the programmer Apple had some stuck-at faults in its
RAM.  I never suspected that when I was verifying my EPROMs, the data in
*system RAM* was corrupt.  Although I'm not convinced that crashing on a
parity error is the right thing to do, simply ignoring them (or not
having parity at all) really can lead to a lot of headaches and hassles.
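
For illustration only, a small sketch of how a stuck-at fault shows up under
a pattern/complement test. The FaultyRam model and find_stuck_bits are
invented for this example; nothing here is what was actually run on the
Apple.

```python
class FaultyRam:
    """Byte-wide RAM model in which chosen bits are stuck at 0 or 1."""

    def __init__(self, size, stuck_at_0=None, stuck_at_1=None):
        self.cells = [0] * size
        self.stuck0 = dict(stuck_at_0 or {})  # address -> mask of bits stuck at 0
        self.stuck1 = dict(stuck_at_1 or {})  # address -> mask of bits stuck at 1

    def write(self, addr, value):
        value &= 0xFF
        value &= ~self.stuck0.get(addr, 0)    # stuck-at-0 bits never go high
        value |= self.stuck1.get(addr, 0)     # stuck-at-1 bits never go low
        self.cells[addr] = value & 0xFF

    def read(self, addr):
        return self.cells[addr]


def find_stuck_bits(ram, size):
    """Write 0x55 then 0xAA to every cell; return {address: bad-bit mask}."""
    faults = {}
    for addr in range(size):
        bad = 0
        for pattern in (0x55, 0xAA):          # each bit sees both a 0 and a 1
            ram.write(addr, pattern)
            bad |= ram.read(addr) ^ pattern
        if bad:
            faults[addr] = bad
    return faults


if __name__ == "__main__":
    ram = FaultyRam(16, stuck_at_0={3: 0x04}, stuck_at_1={9: 0x80})
    print(find_stuck_bits(ram, 16))
```

A single pattern would miss any bit whose stuck value happens to match it,
which is why the complement is written as well.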

On a related note (that probably doesn't belong on comp.arch, oh well):
why in the world does UNIX (Sun's 4.3, in particular) sync the disks
after a parity error panic?  If you're halting all processing because
you can't assume RAM is OK, it seems foolish to write out some of this
questionable data to your disk.  Otherwise (if you know only one
particular RAM location got trashed), why not just send a signal to
the process(es) that care about that location and not panic at all?

--Richard
   ...!hplabs!hpfcmb!carlson

davidb@brac.inmos.co.uk (David Boreham) (09/20/90)

In article <2372@cirrusl.UUCP> douglas%cirrusl@oliveb.ATC.olivetti.com (Douglas Lee) writes:
>In <1990Sep7.003451.13193@portia.Stanford.EDU> dhinds@portia.Stanford.EDU (David Hinds) writes:
>
>>   But don't you really only need one parity bit per word, if you only
>>want to be able to detect single bit errors?  Using one parity bit
>
>But using byte parity allows you to do things like byte writes. If you
>use word parity, you must do a read modify write for every byte in
>order to update the parity of the word. This is very inefficient.
>


Interesting question:
    Would a micro lose performance if it *NEVER* did
    byte writes? It would certainly be an advantage to
    be able to turn them off for applications where
    EDC was used---the CPU would do the RMW cycles for
    you.
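
A toy model of the read-modify-write cost mentioned above, assuming a 32-bit
word and plain even parity. The class names and the read counter are made up
for illustration; real EDC uses a multi-bit check code over the whole word,
but the need to read the old word before a byte write is the same.

```python
def parity(value):
    """Even-parity bit: XOR of all bits of a non-negative integer."""
    return bin(value).count("1") & 1


class ByteParityWord:
    """One 32-bit word protected by a parity bit per byte."""

    def __init__(self):
        self.data = [0, 0, 0, 0]
        self.par = [0, 0, 0, 0]
        self.reads = 0                     # memory reads caused by writes

    def write_byte(self, lane, value):
        # The new byte and its parity bit go out together; no read needed.
        self.data[lane] = value & 0xFF
        self.par[lane] = parity(value & 0xFF)


class WordParityWord:
    """One 32-bit word protected by a single parity bit."""

    def __init__(self):
        self.data = [0, 0, 0, 0]
        self.par = 0
        self.reads = 0

    def write_byte(self, lane, value):
        word = list(self.data)             # read: fetch the other three bytes
        self.reads += 1
        word[lane] = value & 0xFF          # modify
        self.data = word                   # write the word and its new parity
        self.par = parity(sum(b << (8 * i) for i, b in enumerate(word)))


if __name__ == "__main__":
    bp, wp = ByteParityWord(), WordParityWord()
    bp.write_byte(1, 0x07)
    wp.write_byte(1, 0x07)
    print("byte parity reads:", bp.reads, "word parity reads:", wp.reads)
```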
    
Also, what about EDC versions of the micros which have
64-bit paths to memory? I don't recall seeing any
standard 64-bit EDC chips available yet. On the
processors which allow bus cycle abort late in the cycle,
the error detect logic shouldn't need to be that fast,
as it ought to be possible to catch an error and flush
the necessary pipelines.

Given a CPU chip which transparently implemented EDC
I'm sure that many systems would use the feature and
perhaps we could get away from all these arguments about
whether parity is worth the hassle---you would get EDC
merely by adding an extra memory chip or two.
David Boreham, INMOS Limited | mail(uk): davidb@inmos.co.uk or ukc!inmos!davidb
Bristol,  England            |     (us): uunet!inmos.com!davidb
+44 454 616616 ex 547        | Internet: davidb@inmos.com

davidb@brac.inmos.co.uk (David Boreham) (09/20/90)

In article <DAVEG.90Sep7233206@near.cs.caltech.edu> daveg@near.cs.caltech.edu (Dave Gillespie) writes:
>I wonder, I can see single-bit errors occurring in isolation, but
>how likely is it to have an exactly two-bit error?  Most catastrophes
>I can think of will nuke one bit or many.  And if the only danger is
>two statistically independent errors occurring at once in the same
>word, I think a more pressing danger is that your machine might be
 ^^^^
 AND at the same *TIME* if you're implementing scrubbing.



David Boreham, INMOS Limited | mail(uk): davidb@inmos.co.uk or ukc!inmos!davidb
Bristol,  England            |     (us): uunet!inmos.com!davidb
+44 454 616616 ex 547        | Internet: davidb@inmos.com