[comp.arch] RS/6000 XL compilers

zs01+@andrew.cmu.edu (Zalman Stern) (07/28/90)

[Comments about RS/6000 compilers (XLC, XLF) generated code vs. hand
written assembly language.]

I've played with some small examples in this domain. For simple double
precision floating point code, I have yet to beat the compiler. (These are
small enough pieces of code that I'm pretty sure that the assembly language
is optimal.)  For single precision, I can do better by changing the
semantics slightly since the C compiler tends to convert things to double
precision when it is unnecessary to do so. (There are switches to get around
this but they don't seem to do everything they should.) Note that single
precision is no faster in instruction cycles but more 32 bit items fit in a
cache line than 64 bit items do. One place where I could really speed up
the code is by doing cache prefetching. However, there is no way to do that
on the RS/6000. (The difference would be roughly 15 million multiply and
adds/sec instead of 10 million ... on a 25 MHz machine.)
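
To make the single vs. double precision point concrete, here is a rough
sketch of the kind of multiply-add loop in question. It is not the code
that was measured; the function names are made up for illustration, and
whether a given compiler actually keeps the first loop in single precision
depends on the compiler and its switches.

```c
#include <stddef.h>

/* Multiply-add loop with a float accumulator and float operands.
   A compiler is permitted, but not required, to do this arithmetic
   in single precision. (Hypothetical name.) */
float dot_float(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    size_t i;

    for (i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Same loop, but every product is explicitly widened to double.
   This is roughly what happens implicitly when a compiler promotes
   float expressions to double. (Hypothetical name.) */
double dot_double_acc(const float *a, const float *b, size_t n)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += (double)a[i] * (double)b[i];
    return sum;
}

/* How many items of a given size fit in one cache line.
   The line size is a parameter because it is an assumption here,
   not a stated property of any particular machine. */
size_t items_per_line(size_t line_bytes, size_t item_bytes)
{
    return line_bytes / item_bytes;
}
```

For example, with an assumed 64-byte line, items_per_line(64, 4) is 16 and
items_per_line(64, 8) is 8, which is the whole cache argument for single
precision: twice as many operands arrive per line fetched.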

I have seen other examples where the compiler doesn't do as well. In
particular, there are certain loop optimizations which result in cross
jumping (i.e. branches to other branch instructions). There are also
specific cases where one can play tricks to get better performance (e.g. IP
checksumming). Also, one occasionally needs to write assembly to express
things which are outside the scope of high level languages. Over any
significant body of code though, the XLC compiler is going to beat any
hand-coding effort.
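
For reference, the IP checksum mentioned above is the 16-bit one's
complement sum described in RFC 1071. The sketch below is a plain,
portable C version, not the hand-tuned trick; the function name is made up
for illustration. It is the sort of baseline one would start from before
trying to beat the compiler.

```c
#include <stddef.h>

/* Internet checksum in the style of RFC 1071 (hypothetical name).
   Bytes are combined as big-endian 16-bit words. The unsigned long
   accumulator is assumed to be at least 32 bits, which is plenty for
   packet-sized buffers but would overflow on inputs beyond roughly
   128 KB. */
unsigned short ip_checksum(const unsigned char *data, size_t len)
{
    unsigned long sum = 0;
    size_t i;

    /* Add up the 16-bit words. */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((unsigned long)data[i] << 8) | data[i + 1];

    /* An odd trailing byte is padded with a zero byte on the right. */
    if (len & 1)
        sum += (unsigned long)data[len - 1] << 8;

    /* Fold the carries back into the low 16 bits (end-around carry). */
    while (sum >> 16)
        sum = (sum & 0xFFFFUL) + (sum >> 16);

    /* The checksum is the one's complement of the folded sum. */
    return (unsigned short)(~sum & 0xFFFFUL);
}
```

The end-around carry is the part that is awkward to express in C: the
language gives no direct access to the carry bit, so the portable version
accumulates in a wider type and folds afterwards, whereas assembly can use
add-with-carry instructions directly.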

Sincerely,
Zalman Stern | Internet: zs01+@andrew.cmu.edu | Usenet: I'm soooo confused...
Information Technology Center, Carnegie Mellon, Pittsburgh, PA 15213-3890
*** Friends don't let friends program in C++ ***