[comp.compilers] Inlining and global references

meissner@osf.org (05/04/91)

| carter@cs.wisc.edu (Gregory Carter) writes:
| >[Optimize:
| > (1) local vars -> global vars
| > (2) replace subroutine calls with the subroutine code
| > (3) minimize stack frame usage]
| 
| On many machines (RISC machines in particular), accessing global memory is
| no faster than accessing local variables.

Actually, on RISC machines, accessing global memory may be slower than
accessing local variables.  This is because RISC machines typically cannot
reference all of memory with one instruction.  To reference a global x,
they need two instructions: one sets a register to the high part of the
address, and the other does the memory reference using that register as
the base and the low part of the address as the offset.  On the MIPS this
would be:

	lui	$at,%hi(x)	; load assembler temp with high 16 bits of x
	lw	$2,%lo(x)($at)	; load memory, low 16 bits of x as the offset

Normally you write only the 'lw', and the MIPS assembler expands that
macro into the two instructions above.  Also, the MIPS software
conventions dedicate a register to hold the address of a 64K pool in
which small globals are placed, and only one instruction is needed if
the assembler is told that the item is located there.
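
A minimal sketch (in Python, not MIPS assembly) of how an address is split
across the lui/lw pair.  It assumes the lw offset is a sign-extended 16-bit
value, as on MIPS, so the high half has to be rounded up whenever bit 15 of
the address is set.  The address chosen for x and the function names are
made up for illustration.

```python
def split_hi_lo(addr):
    """Split a 32-bit address into (hi, lo) for a lui + lw pair.

    lo is the signed 16-bit offset used by lw; hi is the 16-bit value
    loaded by lui.  hi is adjusted so that (hi << 16) + lo == addr.
    """
    lo = addr & 0xFFFF
    if lo >= 0x8000:
        lo -= 0x10000              # offset is sign-extended by the hardware
    hi = ((addr - lo) >> 16) & 0xFFFF
    return hi, lo


def effective_address(hi, lo):
    """Address the load actually uses: base register plus signed offset."""
    return ((hi << 16) + lo) & 0xFFFFFFFF


addr_of_x = 0x1000A234             # hypothetical address of the global x
hi, lo = split_hi_lo(addr_of_x)
rebuilt = effective_address(hi, lo)
```

Because the low half here has bit 15 set, hi comes out one larger than the
top 16 bits of the address, and the negative offset brings it back down.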

--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142
[Even on CISCs like a Vax or a 386 a stack reference is shorter than an
absolute reference, since you can typically use a single-byte offset rather
than a 4-byte address.  -John]
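
As a rough illustration of the size difference on a 386 in 32-bit mode, the
sketch below just writes out the machine-code bytes for two loads into EAX.
The absolute address 0x12345678 is an arbitrary placeholder, and loads into
other registers use a longer absolute form than the one shown.

```python
# mov eax, [ebp-4]      opcode 8B, ModRM 45 (ebp + disp8), disp8 = FC (-4)
stack_ref = bytes([0x8B, 0x45, 0xFC])

# mov eax, [0x12345678] opcode A1 followed by a 4-byte little-endian address
abs_ref = bytes([0xA1]) + (0x12345678).to_bytes(4, "little")

stack_len = len(stack_ref)
abs_len = len(abs_ref)
```

The frame-relative load needs one displacement byte; the absolute load
carries the full 4-byte address.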
-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.

pardo@june.cs.washington.edu (David Keppel) (05/06/91)

pardo@cs.washington.edu writes:
>>[Accessing globals may be no faster than accessing locals]

In article <9105031839.AA10328@curley.osf.org> meissner@osf.org writes:
>[Locals may be faster: 2-instruction sequence instead of one
> sp-relative instruction.]

John Levine writes:
>[On many CISCs, the stack reference is shorter than absolute.]

The MIPS compilers try to pack global references together so that
global data can be referenced with a single-instruction offset from a
`globals area' pointer that is assigned to a physical register.  I
don't have any data about how well the technique works, but the net
result if it does work is that global and local references take the
same space and time.  If the instruction set can encode an N-bit offset
in a single instruction, then the technique only works for at most 2^N
bytes of data.

Hence my original claim ``may be no faster'' :-)
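
A small sketch of the size limit described above, assuming the offset field
is signed (as it is on MIPS, where N = 16) and the globals pointer is aimed
at the middle of the area so that both negative and positive offsets are
useful.  The function names are made up for illustration.

```python
def offset_fits(offset, n_bits):
    """True if a signed byte offset fits in an n_bits-wide offset field."""
    lowest = -(1 << (n_bits - 1))
    highest = (1 << (n_bits - 1)) - 1
    return lowest <= offset <= highest


def addressable_bytes(n_bits):
    """Number of distinct byte offsets reachable from the globals pointer."""
    return 1 << n_bits


window = addressable_bytes(16)         # 16-bit offsets, as on MIPS
near = offset_fits(-32768, 16)         # lowest reachable offset
far = offset_fits(32768, 16)           # one byte past the highest offset
```

Anything the linker cannot place inside that window has to fall back to the
two-instruction sequence.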

	;-D on  ( Byteing the data that feeds you )  Pardo