hutch@fps.com (Jim Hutchison) (01/17/91)
I've been hearing a bit about new processors which do "speculative
execution".  That is, execution of branches based on executing both paths,
or on guessing at which way the branch will go.  Without addressing the
viability of speculative execution, I am curious about how controller
registers are addressed.  If a speculatively executed instruction writes a
value to a control register on a controller, something may happen.  If it
reads a control register, something may happen.  What happens with these?

I could make guesses, but I might guess at something covered in a
"non-disclosure" that someone else at FPS already knows, so I won't.
--
-  Jim Hutchison    {dcdwest,ucbvax}!ucsd!fps!hutch
   Disclaimer:  I am not an official spokesman for FPS computing
djohnson@beowulf.ucsd.edu (Darin Johnson) (01/17/91)
In article <14829@celit.fps.com> hutch@fps.com (Jim Hutchison) writes:
>I've been hearing a bit about new processors which do "speculative execution".
>...
>If it writes a value to a control register on a controller, something may
>happen.  If it reads a control register, something may happen.

IO has a tendency to intrude on nice clean theories.  The major problem is
that the semantics of memory change completely when using mapped IO.  For
example, two reads of the same location, with no write to that location in
between, may return different values.  The same sort of problem occurs when
designing caches (especially with programmable IO controllers such as
channels).  The problem isn't as great with RISC-style CPUs that can place
an instruction in the delay slot after a branch, since the compiler can
choose a 'safe' instruction.

There are several solutions or techniques that limit the problem; I won't
list them all (even if I knew them all).  First, use explicit IO
instructions, and stall speculative branch execution when one of these is
reached.  Similarly, IO can be mapped into an address space that the CPU
knows about (high-order bit set), and the CPU can stall the same way.
Second, many of these machines are designed to be attached processors, with
IO only to the frontend or IO ports (context switching can really bog
things down...), so this isn't a major problem.  Third, devices could be
controlled via an IO processor, which handles all the gory details.  Other
solutions are possible...

>Jim Hutchison    {dcdwest,ucbvax}!ucsd!fps!hutch
>Disclaimer:  I am not an official spokesman for FPS computing
--
Darin Johnson
djohnson@ucsd.edu
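The "two reads may differ" problem can be sketched with a toy model (not real hardware): a read-to-clear device status register, plus the high-order-address-bit convention for marking IO space. All names and the bit convention here are invented for illustration.

```python
IO_BIT = 1 << 31  # assumed convention: high-order address bit marks IO space

class Memory:
    """Toy address space: ordinary RAM plus one memory-mapped device register."""
    def __init__(self):
        self.ram = {}
        self.status = 0x1  # device status: interrupt pending

    def read(self, addr):
        if addr & IO_BIT:
            # memory-mapped IO: reading the status register clears it,
            # so the read itself is a side effect
            value, self.status = self.status, 0
            return value
        return self.ram.get(addr, 0)  # ordinary RAM: reads are repeatable

def safe_to_speculate(addr):
    # a CPU could stall speculative loads whenever the IO bit is set
    return not (addr & IO_BIT)

mem = Memory()
dev_reg = IO_BIT | 0x40
first = mem.read(dev_reg)
second = mem.read(dev_reg)
print(first, second)              # 1 0 -- same address, no intervening write
print(safe_to_speculate(0x1000))  # True  -- plain memory, safe
print(safe_to_speculate(dev_reg)) # False -- IO space, must not speculate
```

The point is only that a speculative (or cached, or prefetched) load from the IO region would consume the side effect on a path that may never commit, which is why the address-range check has to happen before the load issues.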
prener@arnor.uucp (01/18/91)
Typically, such speculative execution stops when it reaches a point where
visible side-effects might occur.

                                  Dan Prener (prener @ ibm.com)
mhjohn@aspen.IAG.HP.COM (Mark Johnson) (01/19/91)
In comp.arch, lindsay@gandalf.cs.cmu.edu (Donald Lindsay) writes:
As to what the CPU does about it .. the main requirement is that
uncached speculative reads not be done. (There's no such thing as a
speculative write, because in general you don't know the prior
contents of the memory location, and hence, there is no reasonable
way to undo the write.)
There certainly can be such a thing as speculative writes. Any design
that includes a write buffer can also speculatively execute a write.
The buffer would not dump the information to memory until the
speculative operation was committed. If the speculative operation is
not needed, the write is never done.
Speculative execution would not have to stall on writes, which occur
with about a 1 in 7 frequency on the architecture I last looked at.
Write buffers are a common way of coupling a very fast execution
unit to a slower main memory.  The ones I am familiar with were
designed to be completely software-transparent.  They are a handy
place to accommodate unaligned writes, gather sequential writes into
a wider write, merge direct I/O with programmed writes, etc.
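The buffer-until-commit idea above can be sketched as a toy model: stores issued past an unresolved branch are tagged with that branch, drain to memory only if the speculation commits, and are simply dropped otherwise. Loads snoop the buffer so a younger read sees the pending write. Class and method names are invented for illustration.

```python
class WriteBuffer:
    """Toy write buffer permitting speculative stores."""
    def __init__(self, memory):
        self.memory = memory   # backing store: dict of addr -> value
        self.pending = []      # FIFO of (addr, value, branch_tag)

    def store(self, addr, value, branch_tag=None):
        # the write sits in the buffer; memory is not touched yet
        self.pending.append((addr, value, branch_tag))

    def load(self, addr):
        # forward the youngest matching pending store, else read memory
        for a, v, _ in reversed(self.pending):
            if a == addr:
                return v
        return self.memory.get(addr, 0)

    def resolve(self, branch_tag, predicted_correctly):
        if predicted_correctly:
            # commit: dependent stores may now drain to memory, in order
            for a, v, t in self.pending:
                if t == branch_tag:
                    self.memory[a] = v
        # correct or not, the speculative entries leave the buffer
        self.pending = [e for e in self.pending if e[2] != branch_tag]

mem = {0x10: 7}
wb = WriteBuffer(mem)
wb.store(0x10, 99, branch_tag="b1")      # speculative store past branch b1
print(wb.load(0x10))                     # 99 -- forwarded from the buffer
wb.resolve("b1", predicted_correctly=False)
print(mem[0x10])                         # 7  -- the write was never done
```

A real buffer would also enforce ordering between tagged and untagged stores and bound the buffer depth, but the squash-on-misprediction mechanism is the part that makes the store speculative.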
wright@Stardent.COM (David Wright) (01/21/91)
In article <14829@celit.fps.com> hutch@fps.com (Jim Hutchison) writes:
>I've been hearing a bit about new processors which do "speculative execution".
>That is execution of branches based on executing both paths or guessing at
>which way the branch will go.  Without addressing the viability of speculative
>execution...

Is this really a new idea?  I was under the impression that the IBM 360/91
did this.  Or is there some new wrinkle in the recent stuff?

  -- David Wright, not officially representing Stardent Computer Inc
     wright@stardent.com  or  uunet!stardent!wright
cet1@cl.cam.ac.uk (C.E. Thompson) (01/21/91)
In article <11625@pt.cs.cmu.edu> lindsay@gandalf.cs.cmu.edu (Donald Lindsay) writes:
>Yes, I was simplifying.  For example, see "RISC System/6000 Hardware
>Overview":
>
>"...a five entry pending store queue and a four-entry store data
>queue in the FPU enable the FXU [integer unit] to execute floating-
>point store operations before the FPU produces the data.  This allows
>the FXU to generate the address, initiate TLB or cache reload
>sequences, and check for data protection for a floating-point store
>instruction, and then continue executing the subsequent instructions
>without being held back by the FPU."
>
>Note that this machine isn't even advertised as a speculative-
>execution design - merely a parallel one.  One wonders about the
>sequencing of FPU and MMU interrupts, and about how much more fun the
>design could be if some of those stores were conditional.  And then
>there's the classical problem of matching the addresses of reads
>against the addresses of pending writes.

But this isn't speculative execution of writes at all!  It is simply early
detection of exceptions: the FXU address-calculation cycle happens first,
and all possible (storage access) exceptions happen then.  Thereafter the
store operation sits in the PSQ until the FPU gets around to delivering
the data, but the store is absolutely guaranteed to complete.  (There are
some details I haven't seen any documentation on, admittedly, such as how
the FXU makes sure that the required line is still in the cache later on,
and hasn't been flushed by intermediate FXU storage accesses.)

In fact, the RS/6000 isn't advertised as a speculative-execution design
because it isn't one.  Unless you count "conditional dispatching" as
speculative execution, which I certainly wouldn't.

Chris Thompson
JANET:    cet1@uk.ac.cam.phx
Internet: cet1%phx.cam.ac.uk@nsfnet-relay.ac.uk
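The split described in the quoted passage can be sketched as a toy model: the FXU half checks the address and raises any storage exception at dispatch time, then queues the store; the FPU half delivers the data later, and the queued store completes unconditionally. Queue sizes, TLB and cache interaction are deliberately omitted, and all names are invented for illustration.

```python
VALID_PAGES = {0x1000}   # assumed: the pages this process may write

class StorageException(Exception):
    """Raised at dispatch time, never at data-delivery time."""
    pass

class PendingStoreQueue:
    def __init__(self, memory):
        self.memory = memory
        self.queue = []      # addresses awaiting FPU data, in program order

    def dispatch_store(self, addr):
        # FXU half: translate / protection-check now; exceptions happen here
        if (addr & ~0xFFF) not in VALID_PAGES:
            raise StorageException(hex(addr))
        self.queue.append(addr)   # store will complete; only data is pending

    def deliver_data(self, value):
        # FPU half: data arrives later, pairs with the oldest queued address
        addr = self.queue.pop(0)
        self.memory[addr] = value

mem = {}
psq = PendingStoreQueue(mem)
psq.dispatch_store(0x1008)        # address checked; no exception possible later
# ... FXU continues executing subsequent instructions here ...
psq.deliver_data(3.14)            # FPU finally produces the data
print(mem[0x1008])                # 3.14
try:
    psq.dispatch_store(0xDEAD000) # protection fault detected early
except StorageException:
    print("exception at dispatch")
```

This matches Thompson's point: once `dispatch_store` returns without an exception, nothing can stop the store, so nothing speculative is in flight.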
lindsay@gandalf.cs.cmu.edu (Donald Lindsay) (01/23/91)
In article <1991Jan21.142422.17655@cl.cam.ac.uk> cet1@cl.cam.ac.uk (C.E. Thompson) writes:
>>"...a five entry pending store queue and a four-entry store data
>>queue in the FPU enable the FXU [integer unit] to execute floating-
>>point store operations before the FPU produces the data.  This allows
>>the FXU to generate the address, initiate TLB or cache reload
>>sequences, and check for data protection for a floating-point store
>>instruction, and then continue executing the subsequent instructions
>>without being held back by the FPU."
>>
>>Note that this machine isn't even advertised as a speculative-
>>execution design - merely a parallel one.  One wonders about the
>>sequencing of FPU and MMU interrupts, and about how much more fun the
>>design could be if some of those stores were conditional.  And then
>>there's the classical problem of matching the addresses of reads
>>against the addresses of pending writes.
>
>But this isn't speculative execution of writes at all!

Yes, that's what I said.  The design issues raised by this machine would
be that much more difficult if some of the stores were initiated
speculatively, and could not be committed until a third execution unit
(the branch unit) signalled permission.  For one thing, one normally
tries to do writes in order (hence, the RIOS uses queues).  But if some
writes were stalled on conditions, I can quite imagine IBM adding
queue-jumping logic.

>It is simply early detection of exceptions:

There's still the ordering issue.  The FXU and FPU execute from queues of
issued instructions, and either may be ahead.  If a particular instruction
is capable of causing two exceptions, which one is raised?

>(There are some details I haven't seen any documentation on, admittedly,
>such as how the FXU makes sure that the required line is still in the
>cache later on, and hasn't been flushed by intermediate FXU storage accesses.)

Interesting issue.
--
Don		D.C.Lindsay .. temporarily at Carnegie Mellon Robotics