chris@mimsy.UUCP (Chris Torek) (04/30/89)
>In article <17133@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>>If your compiler does not understand `volatile', and has no way to
>>disable optimisation, you are out of luck. (You can resort to assembly
>>language subroutines.)

In article <10136@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>Back, back! (Making the sign of the cross.) No need to resort to
>assembly language for something so simple.
>
>What is the real problem here? It's that the compiler knows that
>we only need to inspect one byte in order to determine the state of
>the bit. So how do we outwit the compiler? ...

[various suggestions deleted]

As someone else has already pointed out, this approach leads to the
dreaded Compiler Upgrade Problem. The next release of the compiler
may require you to change all of your defeat mechanisms.

As it happens, though, you can usually get away with only a few small
assembly routines---often you need only one for each special
instruction. For instance, some Unibus devices respond differently to
a `bisw2' (r/m/w) instruction than they would to a `movw' (read) ...
`movw' (write) sequence. But you need not write an entire driver in
assembly. If the compiler will not cooperate, at worst you can write

    bisw(&reg, bits);

and have the routine

    _bisw:  .globl  _bisw
            .word   0
            bisw2   8(ap),*4(ap)
            ret

somewhere callable. Often you can insert this sort of thing directly
into the compiler's assembly output (most serious compilers are
capable of producing assemblable code, even if their default is to
produce object code directly) to avoid subroutine call overhead. Sun
provide a program called `inline' that uses this approach, and (I
presume) also tries to avoid unnecessary pushes and pops, changing
something like

    pea     a4@(12)
    jsr     _readlong
    movl    #10,d1
    btst    d1,d0           | btst cannot test bit 10 directly

plus

    _readlong:
            movl    sp@(4),a0
            movl    a0@,d0
            rts

into

    lea     a4@(12),a0
    movl    a0@,d0
    movl    #10,d1
    btst    d1,d0

or even (if smart enough) merging the lea+movl into one movl.
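
For a compiler that does understand `volatile', the C side of the above might be sketched roughly as follows. This is only an illustration: the names `bisw_c' and `bit_is_set' are made up here and appear nowhere in the posts, and the register is assumed to be 16 bits wide. Note that C makes no promise that `*reg |= bits' turns into a single read-modify-write instruction such as `bisw2'; where the device cares about that, the small assembly routine is still the way to be sure.

```c
/* Sketch only: bisw_c and bit_is_set are hypothetical names, and the
 * 16-bit register width is an assumption. */

/* Set bits in a register through a volatile pointer, so the compiler
 * may not optimise the access away. The compiler remains free to emit
 * separate read and write instructions instead of one r/m/w. */
void bisw_c(volatile unsigned short *reg, unsigned short bits)
{
    *reg |= bits;
}

/* Test one bit (0..15) of a register; because of the volatile
 * qualifier the register is read again on every call. */
int bit_is_set(volatile unsigned short *reg, unsigned bit)
{
    return (*reg >> bit) & 1;
}
```

To try it without hardware, an ordinary `unsigned short' variable can stand in for the device register: pass its address to both routines.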
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu    Path: uunet!mimsy!chris
bill@twwells.uucp (T. William Wells) (04/30/89)
In article <17195@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
: Sun provide
: a program called `inline' that uses this approach, and (I presume)
: also tries to avoid unnecessary pushes and pops, changing something
: like
:
: pea a4@(12)
: jsr _readlong
: movl #10,d1
: btst d1,d0 | btst cannot test bit 10 directly
:
: plus
:
: _readlong:
: movl sp@(4),a0
: movl a0@,d0
: rts
:
: into
:
: lea a4@(12),a0
: movl a0@,d0
: movl #10,d1
: btst d1,d0
:
: or even (if smart enough) merging the lea+movl into one movl.
If this is what I think it is, I was reading about this (it's in the
floating point manual, an obvious place, right?) some time ago. What
is described there are .il files, which you use by naming them on
your cc command.
The .il files contain assembly code which the compiler inserts in
line for you. The manual gives some instructions on how to write the
functions in such a way that the optimizer will remove the function
call overhead.
It's a neat trick if you need inline assembly. And avoids nonportables
like the asm keyword. (I think I got this right. It's been a while.)
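
Seen from the C side, the call that gets expanded in line is just an ordinary function call. A rough sketch, assuming the `readlong' routine from the quoted example takes a pointer and returns the long it points at. The C body below is only a portable stand-in for the assembly, `bit10_set' is a name invented here, and the `volatile' qualifier is an assumption based on the device-register context.

```c
/* Portable stand-in for the quoted assembly _readlong: fetch the long
 * stored at addr. The volatile qualifier is an assumption, added
 * because the thread is about device registers. */
long readlong(volatile long *addr)
{
    return *addr;
}

/* Hypothetical helper: report whether bit 10 of the long at addr is
 * set, which is what the movl #10,d1 / btst d1,d0 pair checks. */
int bit10_set(volatile long *addr)
{
    return (readlong(addr) & (1L << 10)) != 0;
}
```

The caller looks the same whether `readlong' is a real subroutine or is replaced by in-line code, which is the attraction of the approach.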
---
Bill { uunet | novavax } !twwells!bill
rsalz@bbn.com (Rich Salz) (05/01/89)
>The .il files contain assembly code ... ...
>It's a neat trick if you need inline assembly. And avoids nonportables
>like the asm keyword. ...

Hunh?
--
Please send comp.sources.unix-related mail to rsalz@uunet.uu.net.