[comp.sys.hp] HP PA assembly language question

jthomas@nmsu.edu (James Thomas) (03/06/90)

I'm an old CISC hacker trying to understand the PA and some of the choices
made in its design :-)

In the HP Precision Architecture Handbook (I have June 1987), on page 5-59,
there is a description of a normal intraspace routine call, to wit:

	BL	target,rp			LDIL	L%target,rp
	BLE	R%target(GRr,rp)
	OR	GR31,0,rp

.  Now, I understand the other version with the LDIL, but I am stymied by
the two branches version.

1)  I assume that "GRr" is supposed to be "SR4".  The first register in the
() is a space register name, not a general purpose register name, no?

2)  How can a branch be equivalent to a load?  As I understand it, the
given sequence of instructions ought to have the following effect:

	BL	target,rp
				a) target is used as a relative address to
				   provide the next instruction address
				   for after the delay slot (if target is
				   more than 256K away, this gets an
				   assembly or link error)
				b) .+8 is put in rp (assuming a non-branch
				   instruction preceded)
	BLE	R%target(SR4,rp)
				a) this is in the delay slot of the BL
				b) .+4+"the 11 low order bits of target"
				   provide the next instruction address
				   for after the delay slot
				c) target+4 is put in rp
	instruction at target is executed
				a) this is in the delay slot of the BLE
				b) ???
	instruction at some rather random address is executed
				a) whatever was branched to by the BLE

, and the OR is ignored.

What is wrong with the above picture?  Would some PA guru please fill in my
understanding delay slot?

Thank you.
Jim Thomas	9000/840 @ midas!jthomas@wsmr-emh82.army.mil

hull@hpsal2.HP.COM (James Hull) (03/07/90)

James Thomas writes:

> I'm an old CISC hacker trying to understand the PA and some of the choices
> made in its design :-)
> 
> In the HP Precision Architecture Handbook (I have June 1987), on page 5-59,
> there is a description of a normal intraspace routine call, to wit:
> 
> 	BL	target,rp			LDIL	L%target,rp
> 	BLE	R%target(GRr,rp)
> 	OR	GR31,0,rp
> 
> .  Now, I understand the other version with the LDIL, but I am stymied by
> the two branches version.

You've run across a printing error in the Second Edition (June 1987).
The same example from the Third Edition (April 1989) shows:

call:	BL    target,rp           or           LDIL  l%target,rp
					       BLE   r%target(SR4,rp)
                                               OR    GR31,0,rp

return: BV    0(rp)               or           BE    0(SR0,rp)

What this is trying to show is that one of two sequences is used to perform
an intraspace call/return depending on whether the target is within range
of a BL or not.  If it is, a simple BL branches to the target and saves the
return point in the general register "rp", and a BV through "rp" branches
back at the end of the procedure.
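
As a rough illustration of the short form, the Python sketch below models
whether a target is within reach of a BL and what gets saved in "rp".  It
is not taken from the handbook: the function names are made up, the
displacement is assumed to be relative to the address of the BL plus 8,
and the privilege-level bits in the low end of the link value are ignored.

```python
# Hypothetical model of a BL: 17-bit signed word displacement.
# Assumption: the displacement is relative to (address of the BL) + 8.

WORD = 4          # bytes per instruction word
DISP_BITS = 17    # width of the signed word displacement


def bl_displacement(bl_addr, target):
    """Return the word displacement a BL at bl_addr would need to reach
    target, or None if the target is out of range."""
    delta = target - (bl_addr + 8)
    if delta % WORD:
        raise ValueError("target must be word aligned")
    words = delta // WORD
    lowest = -(1 << (DISP_BITS - 1))
    highest = (1 << (DISP_BITS - 1)) - 1
    return words if lowest <= words <= highest else None


def bl_link(bl_addr):
    """Return point saved in rp: the instruction after the delay slot."""
    return bl_addr + 8


def bv_return(rp):
    """BV 0(rp): branch back to the offset held in rp (same space)."""
    return rp
```

When bl_displacement() returns None, the target is out of reach and the
longer sequence is the one that applies.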

If the target is too far to reach with a BL (which has a 17-bit signed word
displacement, giving +- 256K bytes), the 3-instruction sequence LDIL/BLE/OR
is used to do the call and a BE does the return.  This sequence works as
follows:  The LDIL loads the leftmost 21 bits of the target's absolute
address into the upper 21 bits of general register "rp".  The BLE branches
to the target by adding the right part of the target's address to register
"rp" to get the offset, and uses SR4 as the target space (SR4 is assumed to
be equal to the current instruction space).  The BLE instruction is
hardcoded to save the offset of the return point in GR31 and the space of
the return point in SR0.  The OR instruction in the delay slot of the BLE
copies the return point offset into "rp".  To return from the procedure,
the BE branches to the return point using SR0 (saved by the BLE) and "rp"
(saved by the OR).
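
The walk-through above can be traced in a small Python sketch.  This is
only a model under stated assumptions: 32-bit offsets, l% taken to mean
"low 11 bits cleared" and r% "low 11 bits", privilege bits ignored, and
all function and dictionary names invented for the illustration.

```python
# Hypothetical trace of the LDIL/BLE/OR call and the BE return.

MASK32 = 0xFFFFFFFF
LOW11 = 0x7FF


def l_part(addr):
    """l%addr: the address with its low 11 bits cleared (the LDIL part)."""
    return addr & MASK32 & ~LOW11


def r_part(addr):
    """r%addr: the low 11 bits of the address (the BLE displacement)."""
    return addr & LOW11


def long_call(ldil_addr, target, cur_space, sr4):
    """Model the three-instruction call starting at ldil_addr.

    Returns the (space, offset) branched to and the registers the
    callee sees on entry."""
    rp = l_part(target)                           # LDIL l%target,rp
    ble_addr = ldil_addr + 4
    dest = (sr4, (rp + r_part(target)) & MASK32)  # BLE r%target(SR4,rp)
    gr31 = (ble_addr + 8) & MASK32                # link: after the delay slot
    sr0 = cur_space                               # space of the return point
    rp = gr31 | 0                                 # OR GR31,0,rp (delay slot)
    return dest, {"rp": rp, "gr31": gr31, "sr0": sr0}


def be_return(state):
    """BE 0(SR0,rp): return to the space in SR0 at the offset in rp."""
    return (state["sr0"], state["rp"])
```

Note that "rp" is overwritten twice: first as scratch for the upper part
of the target, then with the real return offset in the delay slot.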

> [correct analysis of bogus sequence deleted]
> 
> What is wrong with the above picture?  Would some PA guru please fill in my
> understanding delay slot?

Well, what's wrong with the picture is that the picture is wrong. :-)
By the way, if you'd like to get the Third Edition, the part number is
09740-90014.

> Thank you.

Sure.

 -- Jim Hull

shankar@hpclscu.HP.COM (Shankar Unni) (03/07/90)

> 	BL	target,rp			LDIL	L%target,rp
> 	BLE	R%target(GRr,rp)
> 	OR	GR31,0,rp

This seems to be a typo in the book. The April '89 copy has the following:

call:   BL    target,rp               or      LDIL   l%target,rp
                                              BLE    r%target(sr4,rp)
                                              OR     r31,r0,rp


return: BV    r0(rp)                  or      BE     0(sr0,rp)


This should be a little clearer. In each case, the first alternative is
pretty obvious. In the call case, the second alternative is a little more
complicated. The BLE instruction has a hardwired link register (r31). Since
the HP calling convention uses r2 as the return register ("rp" is an alias
for "r2"), there is an instruction in the shadow of the BLE to copy r31
into rp before entering the called function.

P.S. The OR r1,r0,rp convention to copy a register into another one is
reflected by the assembler pseudo-op COPY:

     COPY r1,r2  ==  OR r1,r0,r2
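
The reason OR with r0 behaves as a copy is that r0 always reads as zero.
A minimal Python sketch of that idea, with a made-up register-file class
(nothing here is the assembler's actual implementation):

```python
# Hypothetical register file in which r0 is hardwired to zero.

class Regs:
    def __init__(self):
        self._r = [0] * 32

    def read(self, n):
        # r0 always reads as zero, whatever was written to it.
        return 0 if n == 0 else self._r[n]

    def write(self, n, value):
        if n != 0:
            self._r[n] = value & 0xFFFFFFFF


def op_or(regs, a, b, t):
    """OR ra,rb,rt: rt = ra | rb."""
    regs.write(t, regs.read(a) | regs.read(b))


def copy(regs, src, dst):
    """COPY src,dst modeled as OR src,r0,dst."""
    op_or(regs, src, 0, dst)
```

With r31 holding the link value from a BLE, copy(regs, 31, 2) leaves the
same value in r2, i.e. in "rp".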

-----
Shankar Unni                                   E-Mail: 
Hewlett-Packard California Language Lab.     Internet: shankar@hpda.hp.com
Phone : (408) 447-5797                           UUCP: ...!hplabs!hpda!shankar

DISCLAIMER:
This response does not represent the official position of, or statement by,
the Hewlett-Packard Company.  The above data is provided for informational
purposes only.  It is supplied without warranty of any kind.

tml@hemuli.tik.vtt.fi (Tor Lillqvist) (03/07/90)

In article <564@opus.NMSU.EDU> jthomas@nmsu.edu (James Thomas) writes:
)In the HP Precision Architecture Handbook (I have June 1987), on page 5-59,
)there is a description of a normal intraspace routine call, to wit:
)	BL	target,rp			LDIL	L%target,rp
)	BLE	R%target(GRr,rp)
)	OR	GR31,0,rp
).  Now, I understand the other version with the LDIL, but I am stymied by
)the two branches version.

I guess it's simply a typo.  The BLE and OR instructions should be
aligned under the LDIL (and GRr should be SR4 as you say).  Then it
makes sense, doesn't it?
-- 
Tor Lillqvist,
working, but not speaking, for the Technical Research Centre of Finland

jthomas@nmsu.edu (James Thomas) (03/09/90)

Thank you all for the explanation.  That possibility hadn't occurred to me :-(

In article <4750008@hpsal2.HP.COM> hull@hpsal2.HP.COM (James Hull) writes:

jh> call:	BL	target,rp	or	LDIL	l%target,rp
jh> 						BLE	r%target(SR4,rp)
jh> 						OR	GR31,0,rp

jh> return:	BV	0(rp)		or	BE	0(SR0,rp)

Please correct me if I'm wrong, but I think this should be clarified.  It
seems to me that there are even more variations possible here than on an
80x86 :-)  I don't think you want to mention the grosser combinations, but:

1)  Can't the BV be used with either form of call?
2)  Mustn't the BE be used only with the LDIL form of call?
3)  Since the BE works with either intra- or interspace calls, is the LDIL
    call "better"?

In other words, wouldn't the explanation be a bit better if there were not
two "or"'s between the columns?

Thank you.  I'll be quiet now ;-)

Jim Thomas   840 @ midas!jthomas@wsmr-emh82.army.mil

dhandly@hpcllz2.HP.COM (Dennis Handly) (03/17/90)

>1)  Can't the BV be used with either form of call?

BV can be used for both calls if the pc space is the same as SR4.  For
vanilla code this is true.  (The BLE branched through SR4, so the callee
is running in SR4's space.)

>2)  Mustn't the BE be used only with the LDIL form of call?

If SR0 is not set up to match the pc space, then the BE can't be used for
the BL form of the call.

>3)  Since the BE works with either intra- or interspace calls, is the LDIL
>    call "better"?

LDIL form is not better if you are only branching to a procedure close by.

For procedures that are far away, the linker retargets the BL to branch to
                   LDIL   L'xxx
                   BE     R'xxx(4,0)
This is known as a long branch stub.
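
The answers to questions 1 and 2 can be checked with a small Python model
that tracks only the space half of the return address.  It is a sketch
under assumptions: BL is taken to stay in the caller's space and leave SR0
alone, BLE to branch into SR4's space and save the caller's space in SR0.
The function and parameter names are invented for the illustration.

```python
# Hypothetical check: does a given call/return pairing get back to the
# caller's space?

def return_lands_correctly(call_form, return_insn,
                           caller_space, sr4, sr0_before):
    if call_form == "BL":
        callee_space = caller_space   # BL stays in the current space
        sr0 = sr0_before              # BL does not set SR0
    elif call_form == "BLE":
        callee_space = sr4            # BLE branches through SR4
        sr0 = caller_space            # BLE saves the return space in SR0
    else:
        raise ValueError("unknown call form")

    if return_insn == "BV":
        return_space = callee_space   # BV stays in the current space
    elif return_insn == "BE":
        return_space = sr0            # BE 0(SR0,rp) uses SR0
    else:
        raise ValueError("unknown return instruction")

    return return_space == caller_space
```

In this model BL with BV and BLE with BE always get back to the caller;
BLE with BV works only when SR4 matches the caller's space, and BL with BE
only when SR0 already happened to match it.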