[comp.sys.m88k] Data memory unit question?

johnnyw@hubcap.clemson.edu (johnny lee wood) (04/30/91)

In Motorola's 88000 Family Architecture the data memory unit is said to
have 3 cycles with data transfer to or from cache taking place in the
third cycle.

The user's manual says a store uses the writeback phase to fetch
source data from the register file on the D bus.  These two statements
do not seem to agree.  Again I would appreciate any helpful
explanations.

The last reply concerning Integer Multiply was very helpful.

Also, how is a branch instruction able to know whether it is taken or not
before the execute stage?

John Wood
johnnyw@hubcap.clemson.edu

marvin@oakhill.sps.mot.com (Marvin Denman) (04/30/91)

In article <1991Apr30.002550.9550@hubcap.clemson.edu> johnnyw@hubcap.clemson.edu (johnny lee wood) writes:
>
>In Motorola's 88000 Family Architecture the data memory unit is said to
>have 3 cycles with data transfer to or from cache taking place in the
>third cycle.
>
>The user's manual says a store uses the writeback phase to fetch
>source data from the register file on the D bus.  These two statements
>do not seem to agree.  Again I would appreciate any helpful
>explanations.

This is more or less explained in section 1.3.3 of the User's Manual.
The easiest way I can think of to explain it is to draw a picture.

In the execute stage of the pipeline, address calculation is done for all
memory operations.  During the next half clock, in what would have been a
writeback slot if this were a one-cycle instruction, we fetch the store
operand using the read/write port into the register file.  During this same
clock we send the address of the memory operation to the cache.  During the
next clock we send the data to the cache for stores or receive data from the
cache for loads.  Cache misses stall this pipeline while it waits for the
data bus to become valid.

Also, 64-bit loads and stores and xmem operations are split into two bus
transactions, or pipeline stages, between the address calculation and address
bus stages.  This is implemented in a way that will only stall if another
data unit instruction is issued while we are trying to split the operation.
The WB, or writeback, stage depends on arbitrating for the writeback bus when
a load or the load half of an xmem is completing.  Feed forwarding allows
the load result to be used in CLOCK 3.

|  CLOCK 0  |  CLOCK 1  |  CLOCK 2  |  CLOCK 3  |
  _________ 
 /         \
/    ADDR   \
\    CALC   /   
 \_________/
             ___
            /   \
           /  ST \
           \FETCH/
            \___/
             _________
            /         \
           /  ADDRESS  \
           \    BUS    /
            \_________/
                         __________ 
                        /          \
                       /    DATA    \
                       \    BUS     /
                        \__________/
                                     _____
                                    /     \
                                   /  WB   \
                                   \       /
                                    \_____/
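
The same picture can be written down as a small table-building function.
This is only a rough sketch under stated assumptions: a single-word access,
a cache hit, and writeback-bus arbitration won on the first try.  The name
data_unit_timeline and the idea that only loads occupy the WB slot are
illustrative assumptions, not something taken from the User's Manual.

```python
# Hypothetical model of the diagram above. Assumes a single-word access,
# a cache hit, and no lost arbitration for the writeback bus.

def data_unit_timeline(op):
    """Return (stage, clock) pairs for a 'load' or a 'store'."""
    if op not in ("load", "store"):
        raise ValueError("op must be 'load' or 'store'")

    stages = [("ADDR CALC", 0)]          # execute stage: address calculation
    if op == "store":
        # The store operand is read from the register file in the first
        # half of the clock that would otherwise be a writeback slot.
        stages.append(("ST FETCH", 1))
    stages.append(("ADDRESS BUS", 1))    # address goes to the cache
    stages.append(("DATA BUS", 2))       # data to or from the cache
    if op == "load":
        # Assumption: only a load needs the writeback bus at the end.
        stages.append(("WB", 3))
    return stages


if __name__ == "__main__":
    for op in ("load", "store"):
        print(op, data_unit_timeline(op))
```

Note that ST FETCH and ADDRESS BUS share CLOCK 1, which is why fetching the
store operand in the writeback slot does not delay the data transfer in
CLOCK 2.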

>
>Also, how is a branch instruction able to know whether it is taken or not
>before the execute stage?
>

If you check the PBUS timing requirements for the address being valid, you
will notice that we have 30-40% of a cycle in the execute phase to determine
which instruction to fetch next.  Several potential instruction fetch
addresses are computed in parallel, and in the first few nanoseconds of the
execute cycle we decide whether to take the branch or not.  Since branch
conditions on the 88000 are fairly simple, this can be done very quickly.
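
As a software analogy only, the "compute the candidates first, pick one
late" idea might look like the sketch below.  The function name, the table
of conditions (simple compare-against-zero tests in the style of bcnd), and
the two-candidate selection are illustrative assumptions, not a description
of the actual hardware logic.

```python
# Software analogy, not the real hardware: both candidate fetch addresses
# exist before the branch condition has been evaluated, so the late step
# is only a cheap select between them.

# Assumed set of simple compare-against-zero conditions.
CONDITIONS = {
    "eq0": lambda r: r == 0,
    "ne0": lambda r: r != 0,
    "gt0": lambda r: r > 0,
    "lt0": lambda r: r < 0,
    "ge0": lambda r: r >= 0,
    "le0": lambda r: r <= 0,
}

INSTRUCTION_BYTES = 4  # 88000 instructions are a fixed 32 bits wide


def next_fetch_address(pc, target, cond, reg_value):
    """Pick the next fetch address for a conditional branch at pc."""
    # Step 1: candidates are prepared up front (in hardware, in parallel).
    candidates = {
        False: pc + INSTRUCTION_BYTES,  # fall through to the next instruction
        True: target,                   # branch target, already computed
    }
    # Step 2: the simple condition is evaluated and selects one candidate.
    taken = CONDITIONS[cond](reg_value)
    return candidates[taken]


if __name__ == "__main__":
    print(hex(next_fetch_address(0x1000, 0x2000, "eq0", 0)))
    print(hex(next_fetch_address(0x1000, 0x2000, "eq0", 7)))
```

Because the condition only steers a select between addresses that are
already available, the decision itself needs very little time.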

-- 
Marvin Denman
Motorola 88000 Design
cs.utexas.edu!oakhill!marvin