brett@umd5.umd.edu (Brett Bourbin) (04/29/89)
After putting up with Manx's assembler for over a year now, I decided
to try out the PD A68k assembler. Apart from having to add END
statements to all my modules, everything assembled without problems.

There was one difference, though: the object code size for the A68k
code was 30% _smaller_ than that of Manx's AS. Is there something that
AS includes in its .o files that A68k does not? This, coupled with the
fact that I can use BLink instead of LN, has made me switch my assembly
environment.

Just wanted to pass on some information I found out. 8^)
a218@mindlink.UUCP (Charlie Gibbs) (05/01/89)
In article <4784@umd5.umd.edu> brett@umd5.umd.edu (Brett Bourbin) writes:
<...>
> There was one difference though, the object code size for the
>A68k code was 30% _Smaller_ than that of Manx's AS. Is there something
>that AS includes in its .o files that A68k is not? This coupled with
>the fact I can use BLink instead of LN has made me switch my assembly
>environment.

I've put the following optimization routines into A68k:

 - Bcc is converted to Bcc.S if possible (for backward
   references only - forward references are too hard :-).

 - Backward references within the current CODE section
   are converted to PC-relative if possible (again,
   forward references get ugly).

 - ADD, SUB, and MOVE instructions are converted to
   ADDQ, SUBQ, and MOVEQ if possible.

 - Any operand of the form 0(An) is converted to (An)
   (except for MOVEP, which must have a displacement).

Some people have complained that this mucks up jump tables, so my next
version will likely have a switch allowing optimization to be disabled
(or I'll add support for operands such as (xxxx).L, which A68k
currently can't handle).

I'm impressed that it shrank your object code by 30%. Thanks for the
testimonial.

Charlie_Gibbs@mindlink.UUCP
"Programmers who write small modules have short attention spans."
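The size tests behind the optimizations listed above can be sketched roughly as follows. This is only an illustration of the 68000 encoding limits, not A68k's actual code; the function names and parameters (`fits_short_branch`, `quick_form`, `strip_zero_displacement`) are made up for the example, and the operand checks are simplified.

```python
def fits_short_branch(branch_addr, target_addr):
    """True if a Bcc/BRA/BSR at branch_addr can use the 2-byte .S form."""
    # The displacement is measured from the branch address plus 2.
    disp = target_addr - (branch_addr + 2)
    # On the 68000 an 8-bit displacement of 0 means "a 16-bit displacement
    # follows", so a short branch cannot target the very next instruction.
    return -128 <= disp <= 127 and disp != 0


def quick_form(mnemonic, size, imm, dest_is_data_reg):
    """Return the quick mnemonic for 'OP.size #imm,dest', or None."""
    if mnemonic in ("ADD", "SUB") and 1 <= imm <= 8:
        return mnemonic + "Q"      # ADDQ/SUBQ encode immediates 1..8
    if (mnemonic == "MOVE" and size == "L" and dest_is_data_reg
            and -128 <= imm <= 127):
        return "MOVEQ"             # MOVEQ is long-only and Dn-only
    return None


def strip_zero_displacement(mnemonic, displacement):
    """True if 0(An) may be rewritten as (An); MOVEP needs its displacement."""
    return displacement == 0 and mnemonic != "MOVEP"
```

A backward branch is easy to test this way because the target address is already known when the branch is assembled; a forward branch is not, which is what the rest of the thread is about.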
jesup@cbmvax.UUCP (Randell Jesup) (05/04/89)
In article <221@mindlink.UUCP> a218@mindlink.UUCP (Charlie Gibbs) writes:
> - Bcc is converted to Bcc.S if possible (for backward
>   references only - forward references are too hard :-).
>
> - Backward references within the current CODE section
>   are converted to PC-relative if possible (again,
>   forward references get ugly).

Do you convert BSR to BSR.S? JSR to BSR(.S) might be useful when you
plan to link with SMALLCODE, also.

ADD.L #<16 bit number>,An should be converted to ADD.W. Ditto for SUB,
etc.

Forward optimization is REAL nice. I HATE to have to sprinkle .S's
into my code, then remove the ones that fail!

--
Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup
a218@mindlink.UUCP (Charlie Gibbs) (05/04/89)
In article <6755@cbmvax.UUCP> jesup@cbmvax.UUCP (Randell Jesup) writes:
> Do you convert BSR to BSR.S? JSR to BSR(.S) might be useful when
>you plan to link with SMALLCODE, also.

A68k converts suitable backward references for BSR to BSR.S, as well
as Bcc and BRA (it's too stupid to tell them apart :-). I agree about
the JSR->BSR conversion. I haven't done anything about SMALLCODE yet
because I CAN'T FIND ANY DOCUMENTATION! If anyone knows anything about
SMALLCODE, SMALLDATA, hunk_drelocXX, etc. could you PLEASE either give
me the information or tell me where to find it!

> ADD.L #<16 bit number>,An should be converted to ADD.W. Ditto
>for SUB, etc.

Yup. I'll file that one under the "wild optimizations" heading in my
"to do" file. Maybe I could go all the way to ADDQ, etc.

> Forward optimization is REAL nice....

It's also REAL ugly. But Bruce Dawson has given me suggestions as to
how I might do it for Bcc/BRA/BSR without completely rewriting
everything or taking another pass through the source code, so I may
have something Real Soon Now. Other forward optimizations will have to
wait until I either get another viable suggestion or an inspiration
from out of the blue.

>Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup

Charlie_Gibbs@mindlink.UUCP
"The terms 'undefined' and 'unspecified' have been given definitions."
-- revision to the IEEE 1003.1 POSIX Standard
doug@xdos.UUCP (Doug Merritt) (05/05/89)
In article <221@mindlink.UUCP> a218@mindlink.UUCP (Charlie Gibbs) writes:
> - Bcc is converted to Bcc.S if possible (for backward
>   references only - forward references are too hard :-).

There were some assemblers on the PDP 11 (long ago, V6 Unix) that did
the forward references, too. Problematic due to an apparent necessity
for an indefinite number of passes or the equivalent, but when your
text memory space is only 64K I guess it may seem worth it.

Doug
--
Doug Merritt                 Member, Crusaders for a Better Tomorrow
{pyramid,apple}!xdos!doug    Professional Wildeyed Visionary

"Of course, I'm no rocket scientist" -- Randell Jesup, Capt. Boinger Corps
rokicki@polya.Stanford.EDU (Tomas G. Rokicki) (05/06/89)
>> - Bcc is converted to Bcc.S if possible (for backward
>> references only - forward references are too hard :-).

>There were some assemblers on the PDP 11 (long ago, V6 Unix) that did
>the forward references, too. Problematic due to an apparent necessity for
>an indefinite number of passes or the equivalent, but when your text memory
>space is only 64K I guess it may seem worth it.

If you have memory available, it's not *that* difficult. Simply
maintain a list of `candidate' branches during the first pass. A branch
is a candidate if --> it's not already short and there isn't too much
unmoveable code between it and its target. Now, simply run multiple
passes over your branch list in *memory*, changing any appropriate ones
to short, until it can no longer be done. Now go back and update your
symbol table.

tom
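The candidate-list idea above might look something like the following sketch. It is an interpretation, not anyone's actual assembler code: the program is modelled as a hypothetical list of `("code", nbytes)`, `("label", name)` and `("branch", target)` items, every branch starts out 4 bytes long, and alignment directives are assumed absent.

```python
LONG, SHORT = 4, 2   # Bcc with a 16-bit displacement vs. Bcc.S


def layout(items, sizes):
    """Compute label addresses and branch addresses for the given sizes."""
    addr, labels, where = 0, {}, {}
    for i, (kind, arg) in enumerate(items):
        if kind == "label":
            labels[arg] = addr
        elif kind == "code":
            addr += arg
        else:                       # a branch whose size may change
            where[i] = addr
            addr += sizes[i]
    return labels, where


def fits_short(branch_addr, target_addr):
    disp = target_addr - (branch_addr + 2)
    return -128 <= disp <= 127 and disp != 0


def shrink_branches(items):
    """Start with every branch long; shorten until a pass changes nothing."""
    sizes = {i: LONG for i, (kind, _) in enumerate(items) if kind == "branch"}
    changed = True
    while changed:
        changed = False
        labels, where = layout(items, sizes)
        for i in sizes:
            # Addresses go stale within a pass, but shrinking only moves
            # things closer, so a branch that fits now still fits later.
            if sizes[i] == LONG and fits_short(where[i], labels[items[i][1]]):
                sizes[i] = SHORT
                changed = True
    return sizes


# Two consecutive forward branches to the same label, 122 bytes away.
example = [("branch", "end"), ("branch", "end"), ("code", 122), ("label", "end")]
result = shrink_branches(example)
```

In this example the first branch is 128 bytes from its target on the first pass and cannot be shortened; once the second branch shrinks, the first comes into range on the next pass, so `result` ends up with both branches short.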
a218@mindlink.UUCP (Charlie Gibbs) (05/08/89)
In article <8979@polya.Stanford.EDU> rokicki@polya.Stanford.EDU (Tomas G. Rokicki) writes:
>If you have memory available, it's not *that* difficult. Simply maintain
>a list of `candidate' branches during the first pass. A branch is a
>candidate if --> it's not already short and there isn't too much unmoveable
>code between it and its target. Now, simply run multiple passes over
>your branch list in *memory*, changing any appropriate ones to short, until
>it can no longer be done. Now go back and update your symbol table.

This sounds similar to a suggestion given to me by Bruce Dawson of
CygnusEd Pro fame. Actually, it wouldn't even need that much memory;
since short branches can only span 128 bytes, I could kick a candidate
off the end of the table once I've gone more than that far without
determining that I can convert it to short. The largest I'd have to
make the table (assuming a whole bunch of consecutive forward Bcc
instructions) is 32 entries, since candidates would each take 4 bytes.

I'd probably want another table of pointers to symbol table entries
that would have to be adjusted, though. That one could need up to 64
entries, assuming a bunch of consecutive labeled 2-byte instructions.
Still small potatoes, memory-wise.

Of course, if I hit a SECTION directive I'd have to throw the table
away, so cases like

        SECTION myprog,CODE
        BRA     foo
        SECTION myprog,DATA
        DC.B    'gratuitous data section'
        SECTION myprog,CODE
foo:

wouldn't get optimized, but I can't see this happening too often.

The one fly in the ointment is CNOP. Here's a case that would break
things:

        CNOP    0,4
        BRA     foo
        CNOP    0,4
foo:

In this case, changing the BRA to BRA.S would not cause foo to move
because of the second CNOP. So I can't just adjust symbol table
entries. What I could do, though, is to jump back in the source file
and reprocess things starting at the Bcc that has to be adjusted. But
then there's the problem of going into and out of INCLUDEs, macros,
etc.

Maybe I should just wipe the table if I encounter a CNOP, like I'll
have to do with SECTION.

Thanks for helping me think things through. Just writing this message
has helped put things straight in my mind.

>tom

Charlie_Gibbs@mindlink.UUCP
"The terms 'undefined' and 'unspecified' have been given definitions."
-- revision to the IEEE 1003.1 POSIX Standard
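The table sizes and the CNOP case in the message above can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope model: `cnop` and `foo_address` are hypothetical helper names, and the code assumes the first CNOP sits at address 0.

```python
SHORT_REACH = 128   # bytes a short branch can span, as stated in the post
LONG_BRANCH = 4     # size of a branch with a 16-bit displacement
MIN_INSTR = 2       # smallest 68000 instruction

# Worst-case table sizes quoted in the post.
max_candidates = SHORT_REACH // LONG_BRANCH   # back-to-back long branches
max_labels = SHORT_REACH // MIN_INSTR         # back-to-back labelled 2-byte instructions


def cnop(addr, offset, align):
    """Address after 'CNOP offset,align': the next address that is
    `offset` bytes past a multiple of `align`."""
    return addr + (offset - addr) % align


def foo_address(bra_size):
    """Address of foo in: CNOP 0,4 / BRA foo / CNOP 0,4 / foo:"""
    addr = cnop(0, 0, 4)        # first CNOP, already aligned
    addr += bra_size            # the BRA itself (4 bytes long, 2 short)
    return cnop(addr, 0, 4)     # second CNOP pads back up to a multiple of 4
```

Both `foo_address(4)` and `foo_address(2)` come out as 4: shortening the BRA saves two bytes, and the second CNOP immediately pads them back, so simply subtracting 2 from every later symbol would be wrong.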
fnf@estinc.UUCP (Fred Fish) (05/14/89)
In article <8979@polya.Stanford.EDU> rokicki@polya.Stanford.EDU (Tomas G. Rokicki) writes:
>
>>> - Bcc is converted to Bcc.S if possible (for backward
>>> references only - forward references are too hard :-).
>
>>There were some assemblers on the PDP 11 (long ago, V6 Unix) that did
>>the forward references, too. Problematic due to an apparent necessity for
>>an indefinite number of passes or the equivalent, but when your text memory
>>space is only 64K I guess it may seem worth it.
>
>If you have memory available, it's not *that* difficult. Simply maintain
>a list of `candidate' branches during the first pass. A branch is a
>candidate if --> it's not already short and there isn't too much unmoveable
>code between it and its target. Now, simply run multiple passes over
>your branch list in *memory*, changing any appropriate ones to short, until
>it can no longer be done. Now go back and update your symbol table.

Based on my experiences writing an experimental linker at Motorola for
the 88K, which was capable of doing some object code optimizations at
link time, involving inserting or deleting instructions, I would
recommend initially assuming that ALL branches are short branches, and
then changing the ones that don't meet the criteria for short branches.

The main reason is that if you are only expanding the code, once a
branch target moves out of range of a short branch, it will never move
back in range, so you can examine each branch independently of all the
others. If instead you are trying to shrink the code by converting long
branches to shorter branches, there may be situations where examining
each branch individually would appear to result in no more conversions
(terminating the loop), while simultaneously shortening several
branches would allow their targets to move into range.

In practice, two passes over the code are usually sufficient to find
all cases. In rare cases, it would take three passes to find all cases.
I can't recall ever seeing more than three passes.

-Fred
--
# Fred Fish, 1835 E. Belmont Drive, Tempe, AZ 85284, USA
# 1-602-491-0048 asuvax!{nud,mcdphx}!estinc!fnf
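A minimal sketch of the "assume short, lengthen as needed" approach described above, using 68000 branch sizes rather than the 88K. The item-list model (`("code", nbytes)`, `("label", name)`, `("branch", target)`) and all function names are assumptions made for the example; it is not the linker Fred describes.

```python
LONG, SHORT = 4, 2


def layout(items, sizes):
    """Compute label addresses and branch addresses for the given sizes."""
    addr, labels, where = 0, {}, {}
    for i, (kind, arg) in enumerate(items):
        if kind == "label":
            labels[arg] = addr
        elif kind == "code":
            addr += arg
        else:
            where[i] = addr
            addr += sizes[i]
    return labels, where


def fits_short(branch_addr, target_addr):
    disp = target_addr - (branch_addr + 2)
    return -128 <= disp <= 127 and disp != 0


def grow_branches(items):
    """Start with every branch short; lengthen the ones that cannot reach.
    Returns (sizes, number of passes over the branch list)."""
    sizes = {i: SHORT for i, (kind, _) in enumerate(items) if kind == "branch"}
    passes = 0
    while True:
        passes += 1
        labels, where = layout(items, sizes)
        too_far = [i for i in sizes
                   if sizes[i] == SHORT
                   and not fits_short(where[i], labels[items[i][1]])]
        if not too_far:
            return sizes, passes
        for i in too_far:
            sizes[i] = LONG     # a lengthened branch is never revisited


# The backward branch cannot reach "top"; lengthening it then pushes
# "end" out of range of the forward branch on the following pass.
example = [("label", "top"), ("code", 126), ("branch", "end"),
           ("branch", "top"), ("code", 124), ("label", "end")]
sizes, passes = grow_branches(example)
```

For this input the loop settles after three passes with both branches long, which is the kind of cascade that makes a second or third pass necessary.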
dbk@teroach.UUCP (Dave Kinzer) (05/14/89)
In article <87@estinc.UUCP> fnf@estinc.UUCP (Fred Fish) writes:
>In article <8979@polya.Stanford.EDU> rokicki@polya.Stanford.EDU (Tomas G. Rokicki) writes:
>>

The subject is optimizing 68000 branch instructions for forward
branches. Tom explains a possible technique.

Fred suggests assuming short branches then converting to long as
needed since once they are converted to long they will not need to be
checked again.

I'm afraid that this falls into the pick your poison category. If you
assume all branches to be long, once you start going back through the
code to shorten the branches, once they have been shortened they need
not be looked at again. They will never fall back into the domain of
long branches (i.e. we are only shrinking the code.) It's the same
thing, only backwards.

Assuming long first has at least one advantage: You need not run the
optimizer if you don't want to take the time (probably minimal, but at
least you have a choice.)

|    // GOATS - Gladly Offering All Their Support  Dave Kinzer (602)897-3085|
|   //  >> In Hell you need 4Mb to Multitask!  <<  uunet!mcdphx!teroach!dbk |
| \X/   #define policy_maker(name) (name->salary > 3 * dave.salary)         |
dbk@teroach.UUCP (Dave Kinzer) (05/22/89)
In article <10891@mcdphx.phx.mcd.mot.com> (I) incorrectly write:
> The subject is optimizing 68000 branch instructions for forward branches.
>
> Fred suggests assuming short branches then converting to long as needed
>since once they are converted to long they will not need to be checked
>again.

I go on to suggest that assuming long works out as well. Fred Fish
sent mail explaining that there may exist cases (multiple branches near
the end of short branch range) that cannot be shortened considered
individually, but can be shortened if considered as a group. Therefore,
it is easier to assume them all short, and lengthen only those that
need it.

I stand corrected. More evidence that no matter how well meaning,
Usenet advice may be worth exactly what you pay for it. ;-)

|    // GOATS - Gladly Offering All Their Support  Dave Kinzer (602)897-3085|
|   //  >> In Hell you need 4Mb to Multitask!  <<  uunet!mcdphx!teroach!dbk |
| \X/   #define policy_maker(name) (name->salary > 3 * dave.salary)         |
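One concrete instance of the group case described above, worked through as a sketch: a forward branch and a backward branch that cross each other, each just out of short range while the other is long. The item-list model and function names are hypothetical, chosen only to make the comparison runnable.

```python
LONG, SHORT = 4, 2


def layout(items, sizes):
    """Compute label addresses and branch addresses for the given sizes."""
    addr, labels, where = 0, {}, {}
    for i, (kind, arg) in enumerate(items):
        if kind == "label":
            labels[arg] = addr
        elif kind == "code":
            addr += arg
        else:
            where[i] = addr
            addr += sizes[i]
    return labels, where


def fits_short(branch_addr, target_addr):
    disp = target_addr - (branch_addr + 2)
    return -128 <= disp <= 127 and disp != 0


def settle(items, start, other, should_flip):
    """Flip branches from `start` size to `other` one at a time until stable."""
    sizes = {i: start for i, (kind, _) in enumerate(items) if kind == "branch"}
    changed = True
    while changed:
        changed = False
        for i in sizes:
            labels, where = layout(items, sizes)   # fresh addresses each time
            fits = fits_short(where[i], labels[items[i][1]])
            if sizes[i] == start and should_flip(fits):
                sizes[i] = other
                changed = True
    return sizes


def shrink_from_long(items):
    return settle(items, LONG, SHORT, lambda fits: fits)


def grow_from_short(items):
    return settle(items, SHORT, LONG, lambda fits: not fits)


# The branch to "end" jumps over the branch to "top", and vice versa.
crossing = [("label", "top"), ("code", 124), ("branch", "end"),
            ("branch", "top"), ("code", 122), ("label", "end")]
from_long = shrink_from_long(crossing)
from_short = grow_from_short(crossing)
```

Starting from long, the displacements are 128 and -130, so neither branch can be shortened on its own and `from_long` leaves both at 4 bytes. Starting from short, the displacements are 124 and -128, both legal, so `from_short` keeps both at 2 bytes.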