newton%cit-vlsi@cit-vax.ARPA (Mike Newton) (11/13/85)
If you are running UTS on an IBM 4341 or an IBM 370 (or any other IBM-like machine where instructions take longer if the index register is non-zero), you can speed up execution by up to 10% by using the enclosed sed script. Note that the sed script usually gives a better speedup than the '-O' flag to the optimizer.

Due to the problems with up-arrows and tabs on IBM machines (if you are running UTS under VM), I have enclosed the sed script in two parts -- the first is the script with tabs and up-arrows replaced by X's and Y's respectively, and the second is a set of commands to adjust these characters and produce an executable file.

To use it, just pipe the output of 'CC -O -S' (or f77 or pascal or...) through the sed filter. Bugs, complaints ... to me.

Timing figures (sorry, but I can't mail to the person who is doing the Dhrystone benchmarks):

UTS 5.0 running under VM on a 4341-model 12 (model # is important!):
Register declarations make no difference!

    Without my optimizer:  3685
    With my optimizer:     3910

/* Performance on floating point benchmarks is not as good, because a smaller percentage of the time is spent on index register dereferencing. */

Note: we have a lot of graphics people who run 10 hour jobs -- saving 10% is significant!!

NOT IN SHAR FORMAT!!!!!!
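As a rough illustration of what the filter does: as far as can be inferred from the patterns in the sed script, each rule rewrites an operand of the form D(r) as D(0,r), so that the register lands in the base field and the index field becomes zero. The sketch below applies just one of the rules (the plain-displacement "st" rule) with a real tab and a real up-arrow already substituted. The function name index_to_base and the sample assembler lines are made up for the example and are not taken from actual UTS compiler output.

```shell
#!/bin/sh
# Sketch only: ONE rule from the posted filter, applied to hypothetical
# assembler lines. The real filter has many more rules.

TAB=$(printf '\t')   # the posted script encodes a tab as X; use a real one here

# index_to_base (hypothetical name): rewrite "st R,D(r)" as "st R,D(0,r)".
# Lines that do not match the rule pass through unchanged.
index_to_base() {
    sed -e 's/^\(st['"$TAB"']*[0-9]*,[0-9][a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/'
}

# First line gets rewritten; the second already has a zero index field
# and is left alone.
printf 'st\t2,8(13)\nst\t2,8(0,13)\n' | index_to_base
```

Running this on your own 'd.s' and diffing the result, as in the commands at the end of this post, is the safest way to see which lines a given rule actually touches.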
----------
The sed script:

sed \
  -e 's/Y\(st[X]*[0-9]*,-[0-9]*[+-]$Proc[a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(st[X]*[0-9]*,[0-9][a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(la[X]*[0-9]*,-[0-9]*[+-]$Proc[a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(la[X]*[0-9]*,[0-9][a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(st[chde][X]*[0-9]*,-[0-9]*[+-]$Proc[a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(st[chde][X]*[0-9]*,[0-9][a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(ic[X]*[0-9]*,-[0-9]*[+-]$Proc[a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(ic[X]*[0-9]*,[0-9][a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(ex[X]*[0-9]*,-[0-9]*[+-]$Proc[a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(ex[X]*[0-9]*,[0-9][a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\([lcasmd][de][X]*[0-9]*,-[0-9]*[+-]$Proc[a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\([lcasmd][de][X]*[0-9]*,[0-9][a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\([noxlcasmd][X]*[0-9]*,-[0-9]*[+-]$Proc[a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\([noxlcasmd][X]*[0-9]*,[0-9][a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\([lcasm]h[X]*[0-9]*,-[0-9]*[+-]$Proc[a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\([lcasm]h[X]*[0-9]*,[0-9][a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\([as][lwu][X]*[0-9]*,-[0-9]*[+-]$Proc[a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\([as][lwu][X]*[0-9]*,[0-9][a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(cv[bd][X]*[0-9]*,-[0-9]*[+-]$Proc[a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(cv[bd][X]*[0-9]*,[0-9][a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(bal[X]*[0-9]*,-[0-9]*[+-]$Proc[a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(bal[X]*[0-9]*,[0-9][a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(bct[X]*[0-9]*,-[0-9]*[+-]$Proc[a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(bct[X]*[0-9]*,[0-9][a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(cl[X]*[0-9]*,-[0-9]*[+-]$Proc[a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(cl[X]*[0-9]*,[0-9][a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(mxd[X]*[0-9]*,-[0-9]*[+-]$Proc[a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/' \
  -e 's/Y\(mxd[X]*[0-9]*,[0-9][a-zA-Z0-9$+_-]*\)(\([0-9]*)\)/\1(0,\2/'
------------------------------------cut here -------------------------------
tr "\131" "\136" < sedc > /tmp/sedopt
tr "\130" "\011" < /tmp/sedopt > ./sedopt
chmod 755 sedopt
sedopt < d.s > dd.s
diff d.s dd.s > diff.s
cc -o dd dd.s
-------------------------------end of ----------------------------------

Mike Newton
Caltech 256-80
Pasadena CA 91125
818 356 6771 (noon-midnight)
newton@cit-vax.ARPA
ucbvax!cithep!cit-vax!newton

If all the world is a stage, what am i doing in the balcony?