glew@pdx007.intel.com (Andy Glew) (03/28/91)
>The moral of the story is that the delayed branching should be designed around
>the best that the compiler can do, not the idiosyncrasies of a particular
>implementation. The compiler should be able to generate code that uses all
>the available pipelining, without worrying about precisely how much pipelining
>that is.

Ummm... It sometimes annoys me that more people are not aware of the work done at the University of Illinois on the "forward semantic" style of delayed branches. This is a compiler technique that can effectively use up to 10 delay slots.

The technique is simple: they replicate code from the target in the delay slots. They permit branches in the delay slots, in a nice manner that requires only one PC. They use trace scheduling to select which code to place after branches in the delay slots. Effective trace scheduling also restricts executable expansion to a fairly small percentage.

Enuff said: I hope that someone from the IMPACT Group can be prompted to provide more details on the forward semantic.

--
Andy Glew, glew@ichips.intel.com
Intel Corp., M/S JF1-19, 5200 NE Elam Young Parkway,
Hillsboro, Oregon 97124-6497

This is a private posting; it does not indicate opinions or positions of Intel Corp.
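
As a rough illustration of the code-replication idea described in the post above, here is a minimal sketch in Python. It is not the IMPACT Group's implementation, and everything in it is an assumption made for illustration: the toy instruction set ("add", "bnz", "halt"), the function names fill_forward_slots and run, and the rule that a not-taken branch skips its slots while a taken branch executes them and then jumps past the copied instructions. For simplicity it handles a single backward branch, and the slots may hold only "add" instructions, so it does not show the branches-in-slots case mentioned above.

```python
# Toy instruction set (hypothetical, for illustration only):
#   ("add", reg, imm)              reg += imm
#   ("bnz", reg, target, nslots)   branch to target if reg != 0;
#                                  the next nslots instructions are forward slots
#   ("halt",)                      stop


def fill_forward_slots(program, branch_index, k):
    """Copy the first k target instructions into slots after the branch.

    The branch is retargeted to target + k, since the first k target
    instructions now execute from the slots when the branch is taken.
    Assumes a backward branch whose copied range ends at or before the branch.
    """
    op, reg, target, nslots = program[branch_index]
    if op != "bnz" or nslots != 0:
        raise ValueError("expected a bnz with no slots filled yet")
    if target + k > branch_index:
        raise ValueError("copied range must not include or pass the branch")
    copies = program[target:target + k]
    if any(ins[0] != "add" for ins in copies):
        raise ValueError("this sketch only copies add instructions")
    new_branch = ("bnz", reg, target + k, k)
    return program[:branch_index] + [new_branch] + copies + program[branch_index + 1:]


def run(program, regs):
    """Interpret the toy program and return the final register values."""
    regs = dict(regs)
    pc = 0
    while True:
        ins = program[pc]
        if ins[0] == "halt":
            return regs
        if ins[0] == "add":
            regs[ins[1]] += ins[2]
            pc += 1
        elif ins[0] == "bnz":
            _, reg, target, nslots = ins
            if regs[reg] != 0:
                # Taken: the slots hold copies of the target's first instructions.
                for _, slot_reg, imm in program[pc + 1:pc + 1 + nslots]:
                    regs[slot_reg] += imm
                pc = target
            else:
                # Not taken: skip over the slots.
                pc = pc + 1 + nslots


# A three-iteration loop: acc += 5 while n counts down to zero.
loop = [
    ("add", "acc", 5),
    ("add", "n", -1),
    ("bnz", "n", 0, 0),
    ("halt",),
]
start = {"acc": 0, "n": 3}
baseline = run(loop, start)
one_slot = run(fill_forward_slots(loop, 2, 1), start)
two_slots = run(fill_forward_slots(loop, 2, 2), start)
```

In this sketch the transformed programs leave the registers in the same final state as the original loop; the cost is the k extra copied instructions per branch, which is the code expansion the post says trace scheduling keeps small.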