tatge@m2.csc.ti.com (Reid Tatge) (09/07/90)
Concerning the article by tommyp@isy.liu.se (Tommy Pedersen) in which the moderator wrote:

> People I know in the DSP bix tell me that although there are many C
> compilers for DSP chips, nobody uses them because they're all too slow.

I've spent the last few years writing compilers for our (TI's) DSP chips, so I thought I should respond to this.

There are really two classes of DSP chips on the market today: fixed-point and floating-point. The TMS320C25 is fixed-point, so I'll talk about that first.

In general, the DSP fixed-point processors across the industry are "compiler-hostile", or in other words, very difficult to map general purpose HLL's onto. The 'C25 is an accumulator machine with no offset addressing and very little support for arbitrary address arithmetic. Consequently, compilers tend to generate sequences of instructions which would translate to single instructions on any more conventional CPU.

Why such an apparently contorted ISA? The reason is simple: DSP fixed-point processors are designed to optimize price/performance for DSP algorithms, often at the expense of general purpose performance. They are targeted at real-time applications where the time-critical kernels are very small and arithmetically intensive. Most DSP folks are more than happy to code these in assembly. However, this is changing - algorithms are becoming increasingly complex, and the need for high quality compilers cannot be discounted.

Concerning comments by mhorne@ka7axd.wv.tek.com (Mike Horne)

> Generally speaking, few people use compilers to generate code for DSP chips
> for *time critical* code sections. Note that this includes just about all
> signal processing algorithms. However, you can use a high-level language
> (such as C) to build the *structure* of the program and use in-line assembly
> for time critical sections.....

I agree. This is generally an excellent approach.
However, by applying more sophisticated optimization strategies, we plan to narrow the gap between hand-coded assembly and C performance. This is particularly true in the floating-point CPU arena (TMS3203x). The new generation of floating-point DSPs is moving towards more conventional, compiler-friendly architectures. As the compilers become more sophisticated, they enable people to code their entire program in C, with very little, if any, performance penalty (with all the obvious benefits). Over time, the improved technology will also be applied to the fixed-point processors.

Reid Tatge
--
Send compilers articles to compilers@esegue.segue.boston.ma.us
{ima | spdcc | world}!esegue. Meta-mail to compilers-request@esegue.
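As an illustration of the "very small and arithmetically intensive" kernels discussed in the post above, here is a minimal sketch of a FIR filter tap loop in C, written two ways. The function names and the 16-bit sample format are assumptions made for the example, not anything taken from TI's compilers. The first form indexes the arrays, which asks the compiler for base-plus-offset addressing; the second walks pointers, which may map more naturally onto a machine without offset addressing, assuming the target offers some form of indirect addressing with a pointer step (the thread does not say so).

```c
#include <stddef.h>
#include <stdint.h>

/* Indexed form: every x[n - k] is a base + offset access.
   Requires taps >= 1 and n >= taps - 1. */
int32_t fir_indexed(const int16_t *x, const int16_t *h, size_t taps, size_t n)
{
    int32_t acc = 0;
    for (size_t k = 0; k < taps; k++)
        acc += (int32_t)h[k] * (int32_t)x[n - k];
    return acc;
}

/* Pointer-walking form: the same sum of products, but each access is a
   plain indirect load followed by a pointer step.
   Requires taps >= 1 and n >= taps - 1. */
int32_t fir_walking(const int16_t *x, const int16_t *h, size_t taps, size_t n)
{
    const int16_t *xp = x + (n - (taps - 1)); /* oldest sample used */
    const int16_t *hp = h + taps;             /* one past the last coefficient */
    int32_t acc = 0;
    for (size_t k = 0; k < taps; k++)
        acc += (int32_t)(*--hp) * (int32_t)(*xp++);
    return acc;
}
```

Both functions compute the same value; which one a given DSP compiler handles better depends on the target and the optimizer, so this is only a picture of the trade-off, not a recommendation.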
kuusama@news.funet.fi.tut.fi (Kuusama Juha) (09/11/90)
While it is true that currently used HLL's produce poor code for currently used DSP processors, things are changing. I will not talk about code generators that produce optimised code from filter specifications, flow graphs etc., although I've seen several. But I would like to point out:

At ICASSP-90 (International Conference on Acoustics, Speech and Signal Processing) K. Leary (from Analog Devices, Inc.) gave an excellent talk on DSP/C:

"DSP/C is a structured procedural programming language that solves the problems of using C for DSP, while retaining the benefits of C."

My personal view is that the claim may well prove to be true. DSP/C can, as far as I can see, be compiled to _optimum_ code for the DSP processor, given a smart enough compiler. Have a look; the article is in proceedings book 2.

(I can't resist: if the language does indeed become popular, why not call it 'D'?)
--
Juha Kuusama, kuusama@korppi.tut.fi
avi@taux01.nsc.com (Avi Bloch) (09/24/90)
I realize that I'm a little late on this topic but I just saw the tail end of this discussion and I thought I'd add what National Semiconductor has to offer in this field.

National Semiconductor recently announced three microprocessors: the ns32fx16, ns32cg160 and the ns32gx320. All of these processors have a general purpose processor core with additions for DSP and fax applications. These additions are accessed using either special instructions or memory-mapped i/o.

In order to allow the user to access these special instructions from an HLL (in our case, C) we invented a mechanism which we call Application Specific Instruction Set (ASIS) Support. This consists of a group of functions and procedures whose prototypes are supplied in an 'include' file and are recognized by the compiler. These functions are then inlined by the compiler. The compiler (including the optimizer) has intimate knowledge of how these instructions work, e.g., which parameters are changed by the instruction or in which register each parameter must reside, and it uses this knowledge to allocate registers and generate code in the most efficient manner.

I'm not saying that the result will be as good as if it were written in assembly, but in most cases it's good enough.

I'm willing to add more details for anyone interested.

BTW, if anyone knows of any other compiler that does something similar, I'd be interested to hear about it.
--
Avi Bloch
National Semiconductor (Israel)
6 Maskit st. P.O.B. 3007, Herzlia 46104, Israel
Tel: (972) 52-522263
avi@taux01.nsc.com

[GCC lets you in-line assembler, frequently hidden inside macros, that is often used to get to features like sin and cos instructions. -John]
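To make the shape of such a mechanism concrete, here is a minimal sketch of what an intrinsic-style function supplied in an include file might look like from the user's side. Everything here is hypothetical: the name `asis_mac16`, its signature, and the choice of a multiply-accumulate operation are assumptions for illustration and are not taken from National Semiconductor's ASIS headers. The body is a plain C fallback; a compiler with ASIS-style support would presumably recognize the name and emit the special instruction instead.

```c
#include <stdint.h>

/* Hypothetical intrinsic: multiply two 16-bit values and add the product
   to a 32-bit accumulator. Written here as ordinary C so the sketch
   compiles anywhere; a compiler that recognizes the name could replace
   the call with a single special instruction. */
static inline int32_t asis_mac16(int32_t acc, int16_t a, int16_t b)
{
    return acc + (int32_t)a * (int32_t)b;
}

/* User code calls the intrinsic like any other function, so register
   allocation and optimization of the surrounding loop stay with the
   compiler. */
static inline int32_t dot16(const int16_t *a, const int16_t *b, int n)
{
    int32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc = asis_mac16(acc, a[i], b[i]);
    return acc;
}
```

The point of the design, as the post describes it, is that the call site stays in C while the compiler knows the operand and register rules of the underlying instruction.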
pardo@cs.washington.edu (David Keppel) (09/27/90)
In article <4751@taux01.nsc.com> avi@taux01.nsc.com (Avi Bloch) writes:
>[Compiler that optimizes for special instructions.]

The moderator writes:
>[GCC lets you in-line assembler, frequently hidden inside macros, that is
>often used to get to features like sin and cos instructions. -John]

In particular, you can tell GCC that certain hard registers are clobbered, so GCC can perform register allocation around those instructions. If the machine description knows about those instructions, then I think that it is also possible to define optimizations over those instructions, even if the compiler itself doesn't ``know'' how to emit them.

	;-D on  ( A compile of things to do )  Pardo
--
pardo@cs.washington.edu
    {rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo
seanf@sco.COM (Sean Fagan) (09/30/90)
In article <13148@june.cs.washington.edu> pardo@cs.washington.edu (David Keppel) writes:
>In particular, you can tell GCC that certain hard registers are
>clobbered, so GCC can perform register allocation around those
>instructions.
>If the machine description knows about those
>instructions, then I think that it is also possible to define
>optimizations over those instructions, even if the compiler itself
>doesn't ``know'' how to emit them.

For example:

    static inline void *
    _inline_memcpy (void *dst, void *src, unsigned int len)
    {
      void *t1, *t2;
      unsigned int t3;

      /* rep movsb copies ECX bytes from [ESI] to [EDI] and
         modifies all three registers as it goes. */
      __asm volatile ("rep; movsb"
                      : "=D" (t1), "=S" (t2), "=c" (t3)
                      : "0" (dst), "1" (src), "2" (len)
                      : "memory");
      return (dst);
    }

Although gcc does not know what the 'rep; movsb' string means, from the information I've given it, it knows that it needs to set up three registers (edi, esi, and ecx), and that they will be clobbered. I have also told it that the values for those registers will initially be in the parameters dst, src, and len, and that after the instruction completes it should move them into t1, t2, and t3, respectively.

Where the optimization comes into effect is that gcc can (and will) emit the code such that register movement is as minimal as possible (i.e., it will try to make sure that the values I want to memcpy are already in the respective registers, if it can). Also, without the 'volatile,' gcc is likely to get rid of the instruction completely if the modified values are never used (I just tried it, and my trivial case ended up losing the movsb, since I never used the values!).

Where this can come into play is something like:

    static __inline const double
    __inline_sin (double x)
    {
      double temp;

      __asm ("fsin" : "=f" (temp) : "0" (x));
      return (temp);
    }

where gcc will get rid of the entire inline function if it can determine that none of the values are used.

Incidentally, I have yet to see a commercial compiler where I can do this.
It's really a pity, too, since, although the inline assembly syntax is a bit bizarre, it's far more powerful than the "normal" method of doing it.
--
Sean Eric Fagan
seanf@sco.COM
uunet!sco!seanf
(408) 458-1422
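For readers who want to try the idea with a current GCC or Clang, the following is a minimal sketch of the same memcpy wrapper in today's extended-asm syntax. It is not the poster's code: the name `inline_memcpy`, the use of "+" read-write constraints in place of the matching "0"/"1"/"2" inputs, and the portable byte-loop fallback for non-x86 targets are all choices made for this example.

```c
#include <stddef.h>

/* Copy len bytes from src to dst and return dst, like memcpy. */
static inline void *inline_memcpy(void *dst, const void *src, size_t len)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    void *d = dst;
    const void *s = src;
    size_t n = len;

    /* "+D", "+S", "+c" say: load d, s, n into (r/e)di, (r/e)si, (r/e)cx
       before the instruction, and treat those registers as modified
       afterwards. The "memory" clobber tells the compiler that memory
       is read and written behind its back. */
    __asm__ volatile ("rep movsb"
                      : "+D" (d), "+S" (s), "+c" (n)
                      :
                      : "memory");
#else
    /* Fallback so the sketch still builds on other targets. */
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (len--)
        *d++ = *s++;
#endif
    return dst;
}
```

The compiler still does not know what `rep movsb` does; it only knows which registers go in and which come out changed, which is enough for it to allocate registers around the instruction.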
pardo@cs.washington.edu (David Keppel) (10/02/90)
>pardo@cs.washington.edu (David Keppel) writes:
>>[Tell gcc hard regs clobbered -> optimize around them.]
>>[If md knows about magic instructions can optimize over them,
>> even if the compiler doesn't know how to emit them.]

In article <7949@scolex.sco.COM> Sean Fagan <seanf@sco.COM> writes:
>[For example, `_inline_memcpy', `__inline_sin'.]

Gcc has the capability to do two things:

* Register allocation and code motion of `asm'-ed stuff. That's what Sean described.

* Optimization of instructions that the compiler doesn't know how to emit, provided the instructions are in the machine description.

I'm much fuzzier on the latter, but I think it works something like this:

* The machine description contains information about how to emit the "div" and "mod" instructions.

* The machine description contains a description of a peephole optimization that says something like ``if there's a "div" instruction next to a "rem" instruction, and they operate on the same operands, then trash the "div" instruction and get the results from the "rem", which computes "div" as a side-effect''.

* The compiler has no way of producing a "rem" instruction.

* The user defines an "asm" that emits a "rem" instruction.

* If the peephole matches, the optimization occurs, even tho' the compiler never emitted the "rem".

I don't think this feature is used often on most targets because C and the gcc IR are pretty well matched. However, I could imagine the optimizations being useful on, e.g., DSP machines where there are some machine primitives that match poorly with C semantics but for which various optimizations could be done with neighboring instructions.

	;-D on  ( Looking for a few good digital signals )  Pardo
--
pardo@cs.washington.edu
    {rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo
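The arithmetic fact behind the div/rem peephole described above can be shown in plain C. This sketch does not touch gcc's machine description at all; it only illustrates, with the standard library's `div()`, that a quotient and a remainder on the same operands can come from one combined operation instead of two. The `qr_pair` type and both function names are made up for the example.

```c
#include <stdlib.h>

typedef struct {
    int quot;
    int rem;
} qr_pair;

/* Two separate operations on the same operands, as straightforward
   C code would write them. Requires b != 0. */
static qr_pair qr_separate(int a, int b)
{
    qr_pair r = { a / b, a % b };
    return r;
}

/* One combined operation: div() returns the quotient and remainder
   together, which is the result a div/rem peephole is after.
   Requires b != 0. */
static qr_pair qr_combined(int a, int b)
{
    div_t d = div(a, b);
    qr_pair r = { d.quot, d.rem };
    return r;
}
```

Both functions return the same pair for any valid operands; a peephole of the kind described would, in effect, turn the first form into the second at the instruction level.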