[comp.dsp] MC56000 brain dead??

sdw@hpsad.HP.COM (Steve Warwick) (12/12/89)

I have been looking into using the Motorola DSP56000 for use as more than
just a DSP engine - taking on additional system control and i/o functions.
I have come across an apparently insurmountable problem in the stack handling
of this chip, and am wondering if others have found an efficient workaround.

The internal hardware stack is used for subroutine calls, interrupts and
maintaining information on nested Do structures, but for all its uses, it
is only 15 levels deep. In systems that must handle a number of separate
tasks, or that are written in a modular style with much of the code
in 'C', this stack limitation seems absurdly small.

Motorola provided two methods of dealing with this; both seem poorly thought
out.

1)  "Overflow" and "underflow" indicators exist in the status register,
which tell you that you are on the 0th or 15th level. This is fine in cases
where each subroutine call and interrupt checks the stack to see if there
is still room, and block-moves the stack accordingly, but if the program is
primarily 'C', no such automatic checking exists.

2)  An overflow exception interrupt occurs when the stack pointer goes from
0 to 15 or 15 to 0. Here, the offending instruction's PC is not saved, and
no method of stack buffering can be implemented.

Has anyone actually run into such problems and found a method of dealing
with the stack limitation? In the ideal world, the stack interrupt would
occur before we run out of room on the stack itself, allowing
an exception routine to buffer the stack as needed. Failing that, the chip
could let the user specify an area of internal XY RAM as a stack base and
forget about the 15 double words.

I am concerned that the 96000 will also have this problem, which will be
a surprising limitation for a 32 bit processor!!

Thanks, 

Steve Warwick
sdw@hpsad
 

bart@videovax.tv.tek.com (Bart Massey) (12/17/89)

I'm sure others have responded to or will respond to this as well, but just
thought I'd throw in my 0.5 inverse cents worth (fractions only :-).

In article <9520009@hpsad.HP.COM> sdw@hpsad.HP.COM (Steve Warwick) writes:
> I have been looking into using the Motorola DSP56000 for use as more than
> just a DSP engine - taking on additional system control and i/o functions.
> I have come across an apparently insurmountable problem in the stack handling
> of this chip, and am wondering if others have found an efficient workaround.
...
> Has anyone actually run into such problems and found a method of dealing
> with the stack limitation? In the ideal world, the stack interrupt would
> occur before we run out of room on the stack itself, allowing
> an exception routine to buffer the stack as needed. Failing that, the chip
> could let the user specify an area of internal XY RAM as a stack base and
> forget about the 15 double words.

As you have surmised, the internal stack is kind of useless for C code.  The
Motorola 56K compiler uses the same solution my assembly code does --
allocate R7 as the stack pointer, and forget the internal stack.  Actually,
interrupt handling will still happen on the internal stack, limiting one to
16 (8?) levels of nested interrupt -- big deal.  Also, in my assembly code,
I'm using the internal stack for jsr and for do nesting, since I can
guarantee that they will be 3-4 levels deep at most -- the C compiler really
hasn't this option...

> I am concerned that the 96000 will also have this problem, which will be
> a surprising limitation for a 32 bit processor!!

The rumor mill has it that this limitation will in fact also exist in the
96K.

					Bart Massey
					
					Tektronix, Inc.
					TV Systems Engineering
					M.S. 58-639
					P.O. Box 500
					Beaverton, OR 97077
					(503) 627-5320

					..tektronix!videovax.tv.tek.com!bart

pete@oakhill.UUCP (Pete Percosan) (12/21/89)

> I have been looking into using the Motorola DSP56000 for use as more than
> just a DSP engine - taking on additional system control and i/o functions.
> I have come across an apparently insurmountable problem in the stack handling
> of this chip, and am wondering if others have found an efficient workaround.
>  
> The internal hardware stack is used for subroutine calls, interrupts and
> maintaining information on nested Do structures, but for all its uses, it
> is only 15 levels deep. In systems that must handle a number of separate
> tasks, or that are written in a modular style with much of the code
> in 'C', this stack limitation seems absurdly small.

There are two ways to use the system stack.  In some embedded applications,
stack usage is inherently bounded so 15 levels are enough.  Examples are FAX,
modems, sound and music, image processing, graphics...
In this case, the hardware stack is fast, deterministic and supports
the nested hardware Do loop fetch mechanism.  When switching tasks, the
used stack locations can be moved to/from memory.  Register r7 is often
used as a software stack pointer via postincrement/predecrement
operations.
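
To make the bounded case concrete, here is a minimal sketch in portable C of
a software stack plus the task-switch save/restore described above. It is a
toy model, not 56K code: the pointer `sp` merely stands in for r7, each
hardware level is modelled as one word (the real stack holds a two-word entry
per level), and every name here (`hw_stack_t`, `sw_stack_t`, `task_save`,
`task_restore`, `SW_STACK_WORDS`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HW_STACK_LEVELS 15   /* usable depth quoted in this thread */
#define SW_STACK_WORDS  256  /* assumed size of the memory stack */

/* Toy model of the on-chip stack: one word per level for brevity. */
typedef struct {
    uint32_t level[HW_STACK_LEVELS];
    size_t   depth;                 /* number of levels currently in use */
} hw_stack_t;

/* Software stack in ordinary RAM; sp plays the role of r7. */
typedef struct {
    uint32_t  mem[SW_STACK_WORDS];
    uint32_t *sp;                   /* points at the next free word */
} sw_stack_t;

static void sw_init(sw_stack_t *s) { s->sp = s->mem; }

static void sw_push(sw_stack_t *s, uint32_t v)
{
    assert(s->sp < s->mem + SW_STACK_WORDS);
    *s->sp++ = v;                   /* store, then post-increment */
}

static uint32_t sw_pop(sw_stack_t *s)
{
    assert(s->sp > s->mem);
    return *--s->sp;                /* pre-decrement, then load */
}

/* Task switch out: move only the used hardware levels to memory,
 * top level first, followed by a count of how many were saved. */
static void task_save(hw_stack_t *hw, sw_stack_t *sw)
{
    size_t used = hw->depth;
    while (hw->depth > 0)
        sw_push(sw, hw->level[--hw->depth]);
    sw_push(sw, (uint32_t)used);
}

/* Task switch in: read the count back, then refill the hardware
 * levels from the bottom up so the original order is restored. */
static void task_restore(hw_stack_t *hw, sw_stack_t *sw)
{
    size_t used = sw_pop(sw);
    assert(used <= HW_STACK_LEVELS);
    hw->depth = 0;
    while (hw->depth < used)
        hw->level[hw->depth++] = sw_pop(sw);
}
```

The cost of a switch in this model is proportional to the number of levels
actually in use, which is why a small, bounded hardware stack is cheap to
swap.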
 
The other case is unbounded stack usage, where 15 levels is not enough.
'C' code is a good example.  Here the stack is used by the calling
procedure to store the return address, which is moved to/from memory by
the called procedure if necessary.  This is identical to RISC compilers'
use of their "return address" register, except that the 56K can overlap
the memory moves in parallel with other useful work.  This only uses a
few stack locations - the rest are used for nested interrupts.  Since
the 56K interrupts are bounded to 4 hardware levels, the 15 levels are
enough to leave the interrupt return addresses on the hardware stack.
This provides fast, low latency i/o service for 'long' interrupts -
'fast' interrupt routines do not use any hardware stack levels.
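
The unbounded case can be checked with a small simulation. The sketch below
is an illustrative model in portable C, not compiler output: each call pushes
a return address on a 15-level hardware stack, the called procedure
immediately moves it to a software stack in memory, and moves it back just
before returning. The names (`machine_t`, `nested_call`, `hw_peak`, `moves`)
and the fake return addresses are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HW_STACK_LEVELS 15    /* usable depth quoted in this thread */
#define SW_STACK_WORDS  1024  /* assumed size of the memory stack */

typedef struct {
    uint32_t hw[HW_STACK_LEVELS];
    size_t   hw_depth;
    size_t   hw_peak;              /* deepest hardware usage seen so far */
    uint32_t sw[SW_STACK_WORDS];
    size_t   sw_depth;
    size_t   moves;                /* words moved between hw stack and memory */
} machine_t;

static void hw_push(machine_t *m, uint32_t v)
{
    assert(m->hw_depth < HW_STACK_LEVELS);  /* a stack error on real hardware */
    m->hw[m->hw_depth++] = v;
    if (m->hw_depth > m->hw_peak)
        m->hw_peak = m->hw_depth;
}

static uint32_t hw_pop(machine_t *m)
{
    assert(m->hw_depth > 0);
    return m->hw[--m->hw_depth];
}

/* A non-leaf procedure that calls itself `remaining` more times. */
static void nested_call(machine_t *m, uint32_t return_addr, unsigned remaining)
{
    hw_push(m, return_addr);            /* effect of the call instruction */

    /* Prologue: move the return address to the software stack. */
    assert(m->sw_depth < SW_STACK_WORDS);
    m->sw[m->sw_depth++] = hw_pop(m);
    m->moves++;

    if (remaining > 0)
        nested_call(m, return_addr + 1, remaining - 1);

    /* Epilogue: put the return address back for the return instruction. */
    hw_push(m, m->sw[--m->sw_depth]);
    m->moves++;

    uint32_t target = hw_pop(m);        /* effect of the return instruction */
    assert(target == return_addr);
    (void)target;
}
```

Starting from a zeroed `machine_t` and calling `nested_call(&m, 0x1000, 99)`
nests 100 calls deep, yet `hw_peak` stays at 1 and `moves` ends at 200, i.e.
two moves per call. The remaining hardware levels stay free for interrupts.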
 
> Motorola provided two methods of dealing with this; both seem poorly thought
> out.
> 
> 1)  "Overflow" and "underflow" indicators exist in the status register,
> which tell you that you are on the 0th or 15th level. This is fine in cases
> where each subroutine call and interrupt checks the stack to see if there
> is still room, and block-moves the stack accordingly, but if the program is
> primarily 'C', no such automatic checking exists.


Programs written in a high level language (such as 'C') for the DSP56000/1
use, at most, one location on the hardware stack: the return address (SSH)
of a "leaf routine". In all other cases the hardware stack is bypassed, and
the hardware stack information is kept on the software stack maintained by
the compiler.

 
> 2)  An overflow exception interrupt occurs when the stack pointer goes from
> 0 to 15 or 15 to 0. Here, the offending instruction's PC is not saved, and
> no method of stack buffering can be implemented.


The overflow, underflow and stack error exceptions are intended as a diagnostic
trap for an error condition - unfortunately, they are not useful for managing
the stack.  The problem is that it is fairly easy to construct a program which
results in excessive stack thrashing to/from memory, no matter where any
automatic load/unload thresholds are set.  So it may remain more efficient to
manage the stack in software.
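
The thrashing argument can be illustrated with a toy cost model. The sketch
below is hypothetical throughout: it assumes an automatic scheme that spills
`block` levels to memory when a call arrives at depth `spill_at`, and refills
`block` levels when a return arrives at depth `fill_at`, then drives it with
a call pattern that reverses direction right after every transfer. None of
the names (`policy_t`, `cost_t`, `run_adversary`) or thresholds come from
Motorola documentation.

```c
#include <assert.h>

typedef struct {
    unsigned capacity;   /* hardware levels available (15 in this thread) */
    unsigned spill_at;   /* a call at this depth first spills `block` levels */
    unsigned fill_at;    /* a return at this depth first refills `block` levels */
    unsigned block;      /* levels moved per transfer */
} policy_t;

typedef struct {
    unsigned long ops;          /* calls plus returns executed */
    unsigned long words_moved;  /* levels copied to or from memory */
} cost_t;

/* Run calls until a spill happens, then returns until a refill happens,
 * and repeat until `transfers_wanted` block transfers have occurred. */
static cost_t run_adversary(policy_t p, unsigned long transfers_wanted)
{
    unsigned depth = 0;          /* levels currently on chip */
    unsigned long spilled = 0;   /* levels currently in memory */
    unsigned long transfers = 0;
    int going_up = 1;
    cost_t c = {0, 0};

    assert(p.block > 0 && p.block <= p.spill_at && p.spill_at <= p.capacity);
    assert(p.fill_at + p.block <= p.capacity);

    while (transfers < transfers_wanted) {
        if (going_up) {
            if (depth >= p.spill_at) {           /* no room: spill a block */
                depth -= p.block;
                spilled += p.block;
                c.words_moved += p.block;
                transfers++;
                going_up = 0;                    /* now start returning */
            }
            depth++;                             /* the call itself */
        } else {
            if (depth <= p.fill_at && spilled > 0) {   /* refill a block */
                unsigned n = spilled < p.block ? (unsigned)spilled : p.block;
                depth += n;
                spilled -= n;
                c.words_moved += n;
                transfers++;
                going_up = 1;                    /* now start calling again */
            }
            assert(depth > 0);
            depth--;                             /* the return itself */
        }
        c.ops++;
    }
    return c;
}
```

With `{15, 15, 8, 7}` as the policy, this pattern forces a 7-level block move
on every second call or return once the stack first fills, so far more words
are moved than the one word per call or return that a plain software stack
costs. Widening the gap between the two thresholds lengthens the offending
pattern but does not remove it.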

 
> Has anyone actually run into such problems and found a method of dealing
> with the stack limitation? In the ideal world, the stack interrupt would
> occur before we run out of room on the stack itself, allowing
> an exception routine to buffer the stack as needed. Failing that, the chip
> could let the user specify an area of internal XY RAM as a stack base and
> forget about the 15 double words.
>
> I am concerned that the 96000 will also have this problem, which will be
> a surprising limitation for a 32 bit processor!!
>  
> Thanks,
>  
> Steve Warwick
> sdw@hpsad
 
 
Peter Percosan
Motorola DSP Group

brianw@microsoft.UUCP (Brian WILLOUGHBY) (12/22/89)

Remember that the 56000 is designed as a Digital Signal Processor.  I hardly
think that Motorola intended it as a general purpose 32 bit processor.  Thus
I would disagree that its hardware stack limitations are a "flaw".

For speed purposes, I think their internal stack is much faster than an
external stack.  As an internal stack, it is therefore more limited in size
than a normal external stack.  This speed improvement does more good for DSP
programming than any consideration for high level language support.

Brian Willoughby
UUCP:           ...!{tikal, sun, uunet, elwood}!microsoft!brianw
InterNet:       microsoft!brianw@uunet.UU.NET
  or:           microsoft!brianw@Sun.COM
Bitnet          brianw@microsoft.UUCP

cmcmanis@stpeter.Sun.COM (Chuck McManis) (12/28/89)

Minor flame ahead ... Hmmm, is it just me or is this a bit absurd ?

In article <9520009@hpsad.HP.COM> sdw@hpsad.HP.COM (Steve Warwick) writes:
>I have been looking into using the Motorola DSP56000 for use as more than
>just a DSP engine - taking on additional system control and i/o functions.

Why? A DSP is a DSP, it isn't a general purpose CPU. Why do you want to 
make it into one? You can add any number of cheap micros on the board 
next to it if you want a CPU.

[Dealing with the small stack ...]

>1)  "Overflow" and "underflow" indicators exist in the status register,
>which tell you that you are on the 0th or 15th level. This is fine in cases
>where each subroutine call and interrupt checks the stack to see if there
>is still room, and block-moves the stack accordingly, but if the program is
>primarily 'C', no such automatic checking exists.

Well, if I were writing a C compiler for the 56000 and I knew of the stack
limitation (which I surely would), I would have the compiler generate all
of this stack checking code at the entrance to, and exit from each call. Or,
I wouldn't even use the "real" stack, and instead use a pointer off into some
random RAM and "pretend" it was a stack. The compiler writer can do all this
magic easily. 

>2)  An overflow exception interrupt occurs when the stack pointer goes from
>0 to 15 or 15 to 0. Here, the offending instruction's PC is not saved, and
>no method of stack buffering can be implemented.

So stay away from using the real stack except for interrupts or exception
processing.

>Has anyone actually run into such problems and found a method of dealing
>with the stack limitation? In the ideal world, the stack interrupt would
>occur before we run out of room on the stack itself, allowing
>an exception routine to buffer the stack as needed. Failing that, the chip
>could let the user specify an area of internal XY RAM as a stack base and
>forget about the 15 double words.

Why not buy a different processing unit? It seems the 56000 is ill suited
to your needs. I don't know if you would consider yourself primarily a 
software person or a hardware person, but it seems that considering 
different hardware is in order here. Just a sanity check mind you, you
probably have compelling reasons why you want to use the DSP as a CPU
but normally when such fundamental problems occur they are a signal that
the peg is square and the hole is round.


--Chuck McManis
uucp: {anywhere}!sun!cmcmanis   BIX: cmcmanis  ARPAnet: cmcmanis@Eng.Sun.COM
These opinions are my own and no one elses, but you knew that didn't you.
"If it didn't have bones in it, it wouldn't be crunchy now would it?!"

sdw@hpsad.HP.COM (Steve Warwick) (01/03/90)

As indicated by Pete Percosan from Motorola, the C compiler DOES
correctly deal with the hardware stack limitation by transferring
the return address to the software stack upon subroutine entry, and
restoring it upon exit, effectively bypassing the hardware stack.
This has also been verified by simulation. Overhead is two moves
per subroutine call.

As he correctly asserts, since the processor has a limited number of
external interrupts, the 15-level stack is sufficient as long as the `do'
construct is not overused. Again, a software stack can be used to buffer
nested do loops written in assembler if necessary. I am not sure whether
the Motorola C compiler even uses the do loop construct.

Thanks for your replies, 

Steven Warwick
sdw@hpsad