sdw@hpsad.HP.COM (Steve Warwick) (12/12/89)
I have been looking into using the Motorola DSP56000 as more than just a DSP engine, taking on additional system control and I/O functions. I have come across an apparently insurmountable problem in the stack handling of this chip, and am wondering if others have found an efficient workaround.

The internal hardware stack is used for subroutine calls, interrupts and maintaining information on nested Do structures, but for all its uses, it is only 15 levels deep. In systems which must handle a number of separate tasks and/or are written in a modular style, with much of the code written in 'C', this stack limitation seems absurdly small.

Motorola provided two methods of dealing with this; both seem poorly thought out.

1) "Overflow" and "underflow" indicators exist in the status register which tell you that you are on the 0th or 15th level. This is fine in cases where each subroutine call and interrupt will check the stack to see if there is still room, and block move the stack accordingly, but if the program is primarily 'C', no such automatic checking exists.

2) An overflow exception interrupt occurs when the stack pointer goes from 0 to 15 or 15 to 0. Here, the offending instruction's PC is not saved, and no method of stack buffering can be implemented.

Has anyone actually run into such problems and found a method of dealing with the stack limitation? In the ideal world, the stack interrupt would occur before we run out of room on the stack itself, allowing an exception routine to buffer the stack as needed. Failing that, the chip would allow the user to specify an area of internal XY RAM as a stack base and forget about the 15 double words.

I am concerned that the 96000 will also have this problem, which will be a surprising limitation for a 32 bit processor!!

Thanks,

Steve Warwick
sdw@hpsad
bart@videovax.tv.tek.com (Bart Massey) (12/17/89)
I'm sure others have responded to or will respond to this as well, but just thought I'd throw in my 0.5 inverse cents worth (fractions only :-).

In article <9520009@hpsad.HP.COM> sdw@hpsad.HP.COM (Steve Warwick) writes:
> I have been looking into using the Motorola DSP56000 for use as more than
> just a DSP engine - taking on additional system control and i/o functions.
> I have come across an apparently insurmountable problem in the stack handling
> of this chip, and am wondering if others have found an efficient workaround.
...
> has anyone actually run into such problems and have a method of dealing
> with the stack limitation? In the ideal world, the stack interrupt would
> have occurred before we run out of room on the stack itself, allowing
> an exception routine to buffer the stack as needed. Else, allowed the user
> to specify an area of internal XY ram as a stack base and forgot about the
> 15 double words.

As you have surmised, the internal stack is kind of useless for C code. The Motorola 56K compiler uses the same solution my assembly code does -- allocate R7 as the stack pointer, and forget the internal stack. Actually, interrupt handling will still happen on the internal stack, limiting one to 16 (8?) levels of nested interrupt -- big deal. Also, in my assembly code, I'm using the internal stack for jsr and for do nesting, since I can guarantee that they will be 3-4 levels deep at most -- the C compiler really doesn't have this option...

> I am concerned that the 96000 will also have this problem, which will be
> a surprising limitation for a 32 bit processor!!

The rumor mill has it that this limitation will in fact also exist in the 96K.

Bart Massey
Tektronix, Inc.
TV Systems Engineering
M.S. 58-639
P.O. Box 500
Beaverton, OR 97077
(503) 627-5320
..tektronix!videovax.tv.tek.com!bart
pete@oakhill.UUCP (Pete Percosan) (12/21/89)
> I have been looking into using the Motorola DSP56000 for use as more than
> just a DSP engine - taking on additional system control and i/o functions.
> I have come across an apparently insurmountable problem in the stack handling
> of this chip, and am wondering if others have found an efficient workaround.
>
> The internal hardware stack is used for subroutine calls, interrupts and
> maintaining information on nested Do structures, but for all its uses, it
> is only 15 levels deep. In systems which must handle a number of separate
> tasks and/or written in a modular style, along with much of the code written
> in 'C' this stack limitation seems absurdly small.

There are two ways to use the system stack.

In some embedded applications, stack usage is inherently bounded, so 15 levels are enough. Examples are FAX, modems, sound and music, image processing, graphics... In this case, the hardware stack is fast and deterministic, and it supports the nested hardware Do loop fetch mechanism. When switching tasks, the used stack locations can be moved to/from memory. Register r7 is often used as a software stack pointer via postincrement/predecrement operations.

The other case is unbounded stack usage, where 15 levels are not enough. 'C' code is a good example. Here the hardware stack is used by the calling procedure to store the return address, which is moved to/from memory by the called procedure if necessary. This is identical to RISC compilers' use of their "return address" register, except that the 56K can overlap the memory moves in parallel with other useful work. This only uses a few stack locations - the rest are used for nested interrupts. Since the 56K interrupts are bounded to 4 hardware levels, the 15 levels are enough to leave the interrupt return addresses on the hardware stack. This provides fast, low latency i/o service for 'long' interrupts - 'fast' interrupt routines do not use any hardware stack levels.

> Motorola provided two methods of dealing with this, both seem poorly thought
> out.
>
> 1) "overflow" and "underflow" indicators in the status register exist
> which tell you that you are on the 0 or 15'th level. This is fine in cases
> where each subroutine call and interrupt will check the stack to see if there
> is still room, and block move the stack accordingly, but if the program is
> primarily 'C', no such automatic checking exists.

Programs written in a high level language (such as 'C') for the DSP56000/1 use, at most, one location on the hardware stack: the return address (SSH) of a "leaf routine". In all other cases the hardware stack is bypassed, and the information it would have held is kept on the software stack maintained by the compiler.

> 2) overflow exception interrupt occurs when the stack pointer goes from
> 0 to 15 or 15 to 0. Here, the offending instruction's PC is not saved, and
> no method of stack buffering can be implemented.

The overflow, underflow and stack error exceptions are intended as a diagnostic trap for an error condition - unfortunately, they are not useful for managing the stack. The problem is that it is fairly easy to construct a program which results in excessive stack thrashing to/from memory, no matter where any automatic load/unload thresholds are set. So it may remain more efficient to manage the stack in software.

> has anyone actually run into such problems and have a method of dealing
> with the stack limitation? In the ideal world, the stack interrupt would
> have occurred before we run out of room on the stack itself, allowing
> an exception routine to buffer the stack as needed. Else, allowed the user
> to specify an area of internal XY ram as a stack base and forgot about the
> 15 double words.
>
> I am concerned that the 96000 will also have this problem, which will be
> a surprising limitation for a 32 bit processor!!
>
> Thanks,
>
> Steve Warwick
> sdw@hpsad

Peter Percosan
Motorola DSP Group
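The calling convention described in this post (the called procedure moves its return address from the hardware stack to a software stack before making further calls) can be modelled in plain C. What follows is only a sketch under stated assumptions, not Motorola's compiler output: the `Machine` structure, the function names, and the 1024-word software stack size are all hypothetical, and the model ignores interrupts and Do loops entirely. It simply compares the hardware stack depth of a call chain with and without the spill.

```c
#include <assert.h>

#define HW_STACK_DEPTH 15    /* hardware stack levels, as stated in the thread */
#define SW_STACK_WORDS 1024  /* hypothetical size of the software stack */

/* Hypothetical model of the two stacks; not a real DSP56000 register map. */
typedef struct {
    unsigned hw[HW_STACK_DEPTH]; /* on-chip return-address stack */
    int hw_sp;                   /* hardware entries currently in use */
    int hw_peak;                 /* highest hw_sp seen so far */
    unsigned sw[SW_STACK_WORDS]; /* memory addressed through a pointer such as r7 */
    int sw_sp;                   /* software entries currently in use */
} Machine;

void machine_init(Machine *m) { m->hw_sp = 0; m->hw_peak = 0; m->sw_sp = 0; }

/* Models the push done by a subroutine call; -1 means the stack is full. */
int hw_push(Machine *m, unsigned ret) {
    if (m->hw_sp >= HW_STACK_DEPTH) return -1;
    m->hw[m->hw_sp++] = ret;
    if (m->hw_sp > m->hw_peak) m->hw_peak = m->hw_sp;
    return 0;
}

unsigned hw_pop(Machine *m) {
    assert(m->hw_sp > 0);
    return m->hw[--m->hw_sp];
}

/* Entry of a non-leaf routine: first move (hardware stack -> memory). */
void prologue_spill(Machine *m) {
    assert(m->sw_sp < SW_STACK_WORDS);
    m->sw[m->sw_sp++] = hw_pop(m);
}

/* Exit of a non-leaf routine: second move (memory -> hardware stack). */
int epilogue_restore(Machine *m) {
    assert(m->sw_sp > 0);
    return hw_push(m, m->sw[--m->sw_sp]);
}

/* Simulates `depth` nested calls. Returns 0 on success, -1 if the hardware
 * stack would overflow (the model is then left mid-call; re-init before reuse). */
int call_chain(Machine *m, int depth, int spill) {
    if (hw_push(m, (unsigned)depth) != 0) return -1;  /* call */
    int rc = 0;
    if (depth > 1) {                                  /* non-leaf routine */
        if (spill) prologue_spill(m);
        rc = call_chain(m, depth - 1, spill);
        if (rc == 0 && spill) rc = epilogue_restore(m);
    }
    if (rc == 0) {
        unsigned ret = hw_pop(m);                     /* return */
        assert(ret == (unsigned)depth);               /* came back to the right caller */
        (void)ret;
    }
    return rc;
}
```

In this model, `call_chain(&m, 16, 0)` fails because the sixteenth call finds the hardware stack full, while `call_chain(&m, 100, 1)` succeeds with `m.hw_peak` equal to 1, at a cost of two moves per non-leaf call.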
brianw@microsoft.UUCP (Brian WILLOUGHBY) (12/22/89)
Remember that the 56000 is designed as a Digital Signal Processor. I hardly think that Motorola intended it as a general purpose 32 bit processor. Thus I would disagree that its hardware stack limitations are a "flaw".

For speed purposes, I think their internal stack is much faster than an external stack. As an internal stack, it is therefore more limited in size than a normal external stack. This speed improvement does more good for DSP programming than any consideration for high level language support.

Brian Willoughby
UUCP: ...!{tikal, sun, uunet, elwood}!microsoft!brianw
InterNet: microsoft!brianw@uunet.UU.NET
      or: microsoft!brianw@Sun.COM
Bitnet brianw@microsoft.UUCP
cmcmanis@stpeter.Sun.COM (Chuck McManis) (12/28/89)
Minor flame ahead ...

Hmmm, is it just me or is this a bit absurd?

In article <9520009@hpsad.HP.COM> sdw@hpsad.HP.COM (Steve Warwick) writes:
>I have been looking into using the Motorola DSP56000 for use as more than
>just a DSP engine - taking on additional system control and i/o functions.

Why? A DSP is a DSP, it isn't a general purpose CPU. Why do you want to make it into one? You can add any number of cheap micros on the board next to it if you want a CPU.

[Dealing with the small stack ...]
>1) "overflow" and "underflow" indicators in the status register exist
>which tell you that you are on the 0 or 15'th level. This is fine in cases
>where each subroutine call and interrupt will check the stack to see if there
>is still room, and block move the stack accordingly, but if the program is
>primarily 'C', no such automatic checking exists.

Well, if I were writing a C compiler for the 56000 and I knew of the stack limitation (which I surely would), I would have the compiler generate all of this stack checking code at the entrance to, and exit from, each call. Or, I wouldn't even use the "real" stack, and would instead use a pointer off into some random RAM and "pretend" it was a stack. The compiler writer can do all this magic easily.

>2) overflow exception interrupt occurs when the stack pointer goes from
>0 to 15 or 15 to 0. Here, the offending instruction's PC is not saved, and
>no method of stack buffering can be implemented.

So stay away from using the real stack except for interrupts or exception processing.

>has anyone actually run into such problems and have a method of dealing
>with the stack limitation? In the ideal world, the stack interrupt would
>have occurred before we run out of room on the stack itself, allowing
>an exception routine to buffer the stack as needed. Else, allowed the user
>to specify an area of internal XY ram as a stack base and forgot about the
>15 double words.

Why not buy a different processing unit? It seems the 56000 is ill suited to your needs. I don't know if you would consider yourself primarily a software person or a hardware person, but it seems that considering different hardware is in order here.

Just a sanity check mind you, you probably have compelling reasons why you want to use the DSP as a CPU, but normally when such fundamental problems occur they are a signal that the peg is square and the hole is round.

--Chuck McManis
uucp: {anywhere}!sun!cmcmanis   BIX: cmcmanis   ARPAnet: cmcmanis@Eng.Sun.COM
These opinions are my own and no one else's, but you knew that didn't you.
"If it didn't have bones in it, it wouldn't be crunchy now would it?!"
sdw@hpsad.HP.COM (Steve Warwick) (01/03/90)
As indicated by Pete Percosan from Motorola, the C compiler DOES correctly deal with the hardware stack limitation by transferring the return address to the software stack upon subroutine entry, and restoring it upon exit, effectively bypassing the hardware stack. This has also been verified by simulation. Overhead is two moves per subroutine call.

As he correctly asserts, since the processor has a limited number of external interrupts, the 15 level stack is sufficient as long as the `do' construct is not overused. Again, a software stack can be used to buffer nested do loops written in assembler if necessary. I am not sure whether the Motorola C compiler even uses the do loop construction.

Thanks for your replies,

Steven Warwick
sdw@hpsad
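To put a rough number on "as long as the `do' construct is not overused", the budget can be worked out with a little arithmetic. The C sketch below is only an illustration: the `StackBudget` structure and both function names are hypothetical, it assumes one hardware stack entry per nested interrupt and per return address left on the hardware stack, and the number of entries consumed by each nested do loop is left as a parameter that should be checked against the DSP56000 manual rather than taken from here.

```c
#include <assert.h>

enum { HW_STACK_LEVELS = 15 };  /* hardware stack depth, as stated in the thread */

/* Hypothetical description of a worst-case nesting scenario. */
typedef struct {
    int interrupt_levels; /* nested interrupts left on the hardware stack (assumed 1 entry each) */
    int call_levels;      /* return addresses left on the hardware stack (assumed 1 entry each) */
    int do_nesting;       /* depth of nested do loops */
    int entries_per_do;   /* ASSUMPTION: entries used per do loop; check the manual */
} StackBudget;

/* Total hardware stack entries the scenario needs at its deepest point. */
int entries_needed(const StackBudget *b) {
    return b->interrupt_levels + b->call_levels
         + b->do_nesting * b->entries_per_do;
}

/* Number of outer do-loop levels that must be buffered on a software stack
 * for the rest to fit in hardware. Returns 0 if everything already fits and
 * -1 if the scenario cannot fit even with every do loop buffered. */
int do_levels_to_buffer(const StackBudget *b) {
    if (b->entries_per_do <= 0) return -1;
    int excess = entries_needed(b) - HW_STACK_LEVELS;
    if (excess <= 0) return 0;
    /* round up: a do level is either kept whole in hardware or moved whole */
    int levels = (excess + b->entries_per_do - 1) / b->entries_per_do;
    return (levels <= b->do_nesting) ? levels : -1;
}
```

For example, with 4 interrupt levels, 1 return address, and an assumed 2 entries per do loop, 5 nested loops need exactly 15 entries and fit, while 7 nested loops need 19, so the 2 outermost levels would have to be buffered in software.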