davet@oakhill.UUCP (David Trissel) (12/11/88)
When I put the add.w into the compiler I wondered just how many people would see it and think that it must be wrong. Craig Ruff had it right in that the add.w requires 2 less bytes (and is faster) than add.l for cases where the immediate will fit in a word and where addq cannot be substituted by the assembler. Specifically the compiler generates this for popping arguments and so there is never any danger of the immediate value being too large. Instructions depending on esoteric architectural features such as the above separate the men from the boys. Here's another one: There is only one place in the instruction set where a data register is implicitly sign-extended. Where? I'll post the answer in a few days (if nobody else does.) -- Dave Trissel Motorola Semiconductor, Austin Texas if anyone knows the answer to this without can answer this one without looking This type of architectural usage separate the men from the boys. Here's another question on a subtlety which I doubt any of you would know without
duggie@Jessica.stanford.edu (Doug Felt) (12/13/88)
In article <1737@oakhill.UUCP> davet@oakhill.UUCP (David Trissel) writes: > > [stuff deleted] > >There is only one place in the instruction set where a data register is >implicitly sign-extended. Where? > I believe this instruction is CMPA <ea>,An where a word-length value in the effective address field (possibly a data register) is sign-extended before being compared with the address register. > > -- Dave Trissel Motorola Semiconductor, Austin Texas Doug Felt Courseware Authoring Tools Project Sweet Hall 3rd Floor Stanford University duggie@jessica.stanford.edu
mikem@uhccux.uhcc.hawaii.edu (Mike Morton) (12/13/88)
Dave Trissel of Motorola recently wrote: >There is only one place in the instruction set where a data register is >implicitly sign-extended. Where? The phrasing of Dave's question implies that he doesn't count the extension of a word-sized index in xx(An, Dn.w). More likely he means a MOVEM.W to data registers. [No, Dave, I didn't have to hit the manual for this. It was burned into my brain a few years back when I spent an afternoon debugging. Got any explanation of WHY this is done?] Here's some puzzles: - what's the fastest way to bump two address regs each by one byte? - what's faster than an SCS instruction, but equivalent (except for CCR, perhaps) Forgive me if this has been discussed in this group, but has everyone heard about the "superoptimizer" discussed in a paper in ASPLOS II? It does really incredible tricks trying MANY instruction sequences to see which is the best. One of my favorites is a four-instruction one to turn Dn into signum(Dn). Can anyone solve this? Perhaps better, can you do Max(Dx,Dy) or Min in four instructions? -- Mike Morton // P.O. Box 11378, Honolulu, HI 96828, (808) 676-6966 HST Internet: msm@ceta.ics.hawaii.edu (anagrams): Mr. Machine Tool; Ethical Mormon; Chosen Immortal; etc.
johnz@lxviiik.uucp (John Zolnowsky) (12/13/88)
From article <1737@oakhill.UUCP>, by davet@oakhill.UUCP (David Trissel): > There is only one place in the instruction set where a data register is > implicitly sign-extended. Where? > > I'll post the answer in a few days (if nobody else does.) > -- Dave Trissel Motorola Semiconductor, Austin Texas Suggested clarification: What Dave means is that the data register is a destination operand rather than a source operand. In the latter case, there are ADDA/SUBA Dn,An and the use of a data register as a word-sized index. -John Zolnowsky sun!johnz
sbigham@dukeac.UUCP (Scott Bigham) (12/13/88)
In article <1737@oakhill.UUCP> davet@oakhill.UUCP (David Trissel) writes: >There is only one place in the instruction set where a data register is >implicitly sign-extended. Where? Perchance in the addressing mode OFF(A0,D3.W)? sbigham -- Scott Bigham "The opinions expressed above are Internet sbigham@dukeac.ac.duke.edu (c) 1988 Hacker Ltd. and cannot be USENET sbigham@dukeac.UUCP copied or distributed without a ...!mcnc!ecsgate!dukeac!sbigham Darn Good Reason."
jlyle@hpdml93.HP.COM (Jeff Lyle) (12/13/88)
> Instructions depending on esoteric architectural features such as the above > separate the men from the boys. Here's another one: Or working code from code with features??? > There is only one place in the instruction set where a data register is > implicitly sign-extended. Where? > > -- Dave Trissel Motorola Semiconductor, Austin Texas The MOVEM.W instruction has the rather NASTY feature of sign extending its data when the destination is a register. Yes it IS documented in the fine print. No, I did not read the fine print before discovering this! - Jeff Lyle (hpdmlac)
chrisj@cup.portal.com (Christopher T Jewell) (12/13/88)
In article <1737@oakhill.UUCP>, davet@oakhill.UUCP (David Trissel) writes: >There is only one place in the instruction set where a data register is >implicitly sign-extended. Where? Aw come on! Three cases at least: 1. ADDA.W Dn,Am (also SUBA.W, CMPA.W, MOVEA.W) 2. d8(xx,Dn.W) addressing (xx IN {PC, An}) 3. MOVEM.W memory to dataregs I suppose that you intended #3, since the values loaded INTO the DRegs are actually sign-extended during loading; in the other two cases, the value taken FROM a DReg is sign-extended, but the reg itself is unchanged. (The wording of the question in your pop-quiz was ambiguous, professor! :-) ) Christopher T. Jewell chrisj@cup.portal.com sun!cup.portal.com!chrisj "Just because I tell only the truth doesn't mean I tell the only truth there is." Father McAleer in Brad Ferguson's "To Tell the Troof" (Jan '89 F&SF)
paul@taniwha.UUCP (Paul Campbell) (12/13/88)
In article <1737@oakhill.UUCP> davet@oakhill.UUCP (David Trissel) writes: >There is only one place in the instruction set where a data register is >implicitly sign-extended. Where? That's easy ... no need to look at the manual - 'movem.w .., <d0>' on the other hand I've never ever come across a need for this instruction, I have always guessed it was something some big customer asked Mot. to put in the 68k early on when everyone thought of it as a 16 bit machine. Paul -- Paul Campbell ..!{unisoft|mtxinu}!taniwha!paul (415)420-8179 Taniwha Systems Design, Oakland CA "Read my lips .... no GNU taxes"
tmt@xpiinc.UU.NET (Thomas M. Talpey) (12/14/88)
>> Dave Trissel asks... >> There is only one place in the instruction set where a data register is >> implicitly sign-extended. Where? Movemw to data regs. Is this because of microcode streamlining so the processor doesn't have to figure out whether it's an address or data reg regarding extension? The practical reasons seem marginal. Also, why does movem overshoot by one memory cycle? Inquiring minds want to know. >> Mike Morton asks: Hey, you're cheating (I think!). >> - what's the fastest way to bump two address regs each by one byte? cmpmb ax@+,ay@+ | 12 cycles But it adds one memory cycle over two addql's (16 cycles). This may or may not make it faster. >> - what's faster than an SCS instruction, but equivalent (except >> for CCR, perhaps) subxb dx,dx | 4 cycles But it assumes that the X bit is set with CY, only sometimes true! >> - can you do Max(Dx,Dy) or Min in four instructions? movl Dx,result cmpl Dx,Dy bles .+4 movl Dy,result Have you got a better way? Tom Talpey tmt@xpiinc.uu.net
bruce@blue.blue.gwd.tek.com (Bruce Robertson) (12/14/88)
> There is only one place in the instruction set where a data register is > implicitly sign-extended. Where? I see two, if you are allowing the 68020 instruction set. The one you're probably looking for is "moveq". The other is "bfexts", which I suppose may be said to be explicit because of the 's' in the mnemonic. -- -- Bruce Robertson bruce@blue.gwd.tek.com
mikem@amc.UUCP (Mike McGinnis) (12/15/88)
In article <4350@Portia.Stanford.EDU>, duggie@Jessica.stanford.edu (Doug Felt) writes: > In article <1737@oakhill.UUCP> davet@oakhill.UUCP (David Trissel) writes: > > > > [stuff deleted] > > > >There is only one place in the instruction set where a data register is > >implicitly sign-extended. Where? > > > > I believe this instruction is CMPA <ea>,An where a word-length value > in the effective address field (possibly a data register) is > sign-extended before being compared with the address register. > > > > > -- Dave Trissel Motorola Semiconductor, Austin Texas > > Doug Felt > Courseware Authoring Tools Project The CMPA instruction implies that the destination operand is an address register. The only place a data register will be extended to a long is the MOVEQ instruction. The source operand must be no more than 3 bits (37.5 cents :), and the value gets placed in the destination operand (data register), then extended to a long. Do I win a cupie doll??? Michael E. McGinnis Applied Microsytems Corp. "You wake up lost in an empty town, wondering why no one else is around" "Look up to see a giant boy - you've just become his brand new toy..."
mikem@uhccux.uhcc.hawaii.edu (Mike Morton) (12/15/88)
Tom Talpey is right that the CMPM.B does more memory access. But it *is* the shortest way, and fewest cycles. I've never actually used this trick, so I don't know if it's empirically faster. [I use a Mac Plus, which has funny timing anyway...] I also bow to Tom's point that the Carry and eXtend bits must be the same for "SUBX.B Dn,Dn" to replace "SCS". Incidentally, I got this trick from an article Bill Gates wrote ca 1975 on 8080 (?) hacking, reprinted in "Programmers at Work". Counting the cycles in Tom's Max(Dx,Dy) as a function of word or long operands and whether the branch takes/doesn't. word long mov Dx,result 4 4 cmp Dx,Dy 4 6 bles .+4 10/8 (taken/not) mov Dy,result 4 4 18/20 20/22 The Superoptimizer's solution has no branches at all, so its time is constant at 16 cycles for word operands (small win) or 32 for longword (big loss). Anyone care to solve the 4-instruction signum function? -- Mike Morton // P.O. Box 11378, Honolulu, HI 96828, (808) 676-6966 HST Internet: msm@ceta.ics.hawaii.edu (anagrams): Mr. Machine Tool; Ethical Mormon; Chosen Immortal; etc.
gz@spt.entity.com (Gail Zacharias) (12/15/88)
In article <2833@uhccux.uhcc.hawaii.edu> mikem@uhccux.uhcc.hawaii.edu (Mike Morton) writes: >Anyone care to solve the 4-instruction signum function? add Dx,Dx subx Dx,Dx bls .+4 moveq #1,Dx -- gz@entity.com ...!mit-eddie!spt!gz Unix: when you can't afford the very best.
aglew@mcdurb.Urbana.Gould.COM (12/17/88)
Ref: Superoptimizer -- a look at the smallest program Henry Massalin ASPLOS II Provides a number of code sequences that avoid branches. Branch avoidance becomes important on high performance processors. signum(x) (x in d0) add.l d0,d0 subx.l d1,d1 negx.l d0 addx.l d1,d1 (signum x in d1) abs(x) (x in d0) move.l d0,d1 add.l d1,d1 subx.l d1,d1 eor.l d1,d0 sub.l d1,d0 (abs(x) in d0) Longer but no jumps. Unfortunately, dependencies between nearly every instruction, but a good code scheduler may be able to interleave other code here. We have a good code scheduler, don't we, Dave? :-) Max(X,Y) (d0=X,d1=Y) sub.l d1,d0 subx.l d2,d2 or.l d2,d0 addx.l d1,d0 (d0 = max(X,Y) Min is similar. Simultaneous max/min MaxMin(X,Y) (d0=X, d1=Y) sub.l d1,d0 subx.l d2,d2 and.l d0,d2 eor.l d2,d0 add.l d1,d0 add.l d2,d1 (d0 = max, d1 = min) Massalin goes on to present some BCD code fragments, multiplication by constants, etc. I don't think that I have done an injustice to Massalin's paper by presenting these fragments -- many of them were known before his paper, by people like me who like playing with branch avoiding code sequences. Some were already well known tricks. Massalin's contribution is in having written a program that finds the best such sequence, with reasonable certainty. Ie. he has automated the search process that hackers used to do by inspiration. Superoptimizer is probably not the sort of thing that you would put into a production compiler, usually - most code simply does not benefit that much from it, plus, it is extremely slow. But, something like superoptimizer should almost definitely be run by a compiler group that is generating code fragments, for code generation or libraries, every time the machine parameters change (like, when a new chip comes out). Unless your machine is simple enough that the best code sequence is obvious by inspection. Andy "Krazy" Glew aglew@urbana.mcd.mot.com uunet!uiucdcs!mcdurb!aglew Motorola Microcomputer Division, Champaign-Urbana Design Center 1101 E. University, Urbana, Illinois 61801, USA. My opinions are my own, and are not the opinions of my employer, or any other organisation. I indicate my company only so that the reader may account for any possible bias I may have towards our products.
mikem@uhccux.uhcc.hawaii.edu (Mike Morton) (12/17/88)
Gail Zacharias recently contributed this 4-instruction signum function, for which I've counted cycles. word long add Dx,Dx 4 8 ; shift sign bit -> carry ; set Z bit only for Dx = 0, 0x8000 subx Dx,Dx 4 8 ; set Dx = 0xffff if Dx was < 0 ; (and set C bit) ; set Dx = 0 if sign Dx was >= 0 ; (and clear C bit) ; (and leave Z bit alone) bls .+4 10/8 (taken/not) ; branch on C bit (Dx < 0) ; (Dx now holds -1) ; ... or on Z bit (Dx = 0, 0x8000) ; (Dx now holds 0) moveq #1,Dx 4 4 ; none of the above: return 1 ---- ---- 18/20 26/28 Very elegant! It uses only one register and is faster than the Superoptimizer's solution for the longword case (30 cycles). However, the SO's solution is faster for the word case (16 cycles), mostly because it avoids the conditional branch. Takers? -- Mike Morton // P.O. Box 11378, Honolulu, HI 96828, (808) 676-6966 HST Internet: msm@ceta.ics.hawaii.edu (anagrams): Mr. Machine Tool; Ethical Mormon; Chosen Immortal; etc.
john@frog.UUCP (John Woods) (12/20/88)
In article <1737@oakhill.UUCP>, davet@oakhill.UUCP (David Trissel) writes: > > Instructions depending on esoteric architectural features such as the above > separate the men from the boys. Here's another one: > When I first glanced at my screen, I read the above as "separate the men from the bugs." I was going to post a comment about such trivia JOINING the men and the bugs, but decided not to. > There is only one place in the instruction set where a data register is > implicitly sign-extended. Where? > And when I saw various people's complaints about how they discovered the non-intuitive solution to this pop quiz the hard way, I decided to go ahead and post it anyway. The answer to such silliness seems to be RISC. -- John Woods, Charles River Data Systems, Framingham MA, (508) 626-1101 ...!decvax!frog!john, john@frog.UUCP, ...!mit-eddie!jfw, jfw@eddie.mit.edu Go be a `traves wasswort. - Doug Gwyn
davet@oakhill.UUCP (David Trissel) (12/21/88)
In article <1324@X.UUCP> john@frog.UUCP (John Woods) writes: >The answer to such silliness seems to be RISC. Huh? The "leave it to the software" philosophy? No thanks. I happen to know people trying to "work around" some tough RISC hangups and I have to say that if RISC compilers are easier to build than CISC compilers it's not by much. Certainly by now the RISC compilers would all be "finished" wouldn't they? Since the initial implementation would be a breeze all that's left to do is work on optimizing. Looks to me like both RISC and CISC have a nearly equal amount of implementation headaches. -- Dave Trissel
jsan@well.UUCP (Jez San) (12/25/88)
Where can I get the SuperOptimiser from?? From the quoted examples, it sure looks like a neat toy, and I am keen to feed it something more useful than just Mins and Maxs! -- Jez. [no siggy yet... im new to this!]
mikem@uhccux.uhcc.hawaii.edu (Mike Morton) (12/30/88)
Jez San recently wrote: > Where can I get the SuperOptimiser from?? > From the quoted examples, it sure looks like a neat toy, and I am keen > to feed it something more useful than just Mins and Maxs! It's not distributed, to the best of my knowledge. The reference is in ASPLOS II: "Superoptimizer -- A Look at the Smallest Program", by Henry Massalin If you know how to order stuff from ACM, the magic info from the first page is: (c) 1987 ACM 0-89791-238-1/87/1000-0122 $00.75 I wrote to Massalin a couple of times and he was quite helpful with questions. I don't know a net address for him, but the address in the paper is: Department of Computer Science, Columbia University, New York, NY 10027 -- Mike Morton // P.O. Box 11378, Honolulu, HI 96828, (808) 676-6966 HST Internet: msm@ceta.ics.hawaii.edu (anagrams): Mr. Machine Tool; Ethical Mormon; Chosen Immortal; etc.
bga@raspail.UUCP (Bruce Albrecht) (12/31/88)
In article <2899@uhccux.uhcc.hawaii.edu>, mikem@uhccux.uhcc.hawaii.edu (Mike Morton) writes: > It's not distributed, to the best of my knowledge. The reference is in > ASPLOS II: > "Superoptimizer -- A Look at the Smallest Program", by Henry Massalin > > If you know how to order stuff from ACM, the magic info from the first page is: > (c) 1987 ACM 0-89791-238-1/87/1000-0122 $00.75 What's ASPLOS II? Some conference proceedings?
mikem@uhccux.uhcc.hawaii.edu (Mike Morton) (12/31/88)
Bruce Albrecht writes: > What's ASPLOS II? Some conference proceedings? Sorry. It's a conference (#2 was in 1988), short for Architectural Support for Programming Languages and Operating Systems. More hardware than software, if I recall correctly. It's sort of formal presentations of hackery, but people with advanced degrees aren't supposed to admit that they do weird stuff like this, let alone enjoy it. -- Mike Morton // P.O. Box 11378, Honolulu, HI 96828, (808) 676-6966 HST Internet: msm@ceta.ics.hawaii.edu (anagrams): Mr. Machine Tool; Ethical Mormon; Chosen Immortal; etc.