[comp.unix.wizards] who called a C routine - get it from the stack frame

wbp@cuuxb.UUCP (Walt Pesch) (12/05/86)

In article <810@hropus.UUCP> jgy@hropus.UUCP writes:
>Can anyone help me with the following problem:  I'm looking for a few 
>lines of C or assembly code which can be used at the top of a function 
>to get the address of the function which called it.  I can then map
>this address to the calling function's name using "nm".

Oh, well, you asked!  Time to get down into the mud...  For System V,
the following dirty trick should work:

When defining the actual function, which is normally passed "n" 
variables, define the function to have "n+1" variables.  By the nature 
of the stack frame, the "n+1"'th variable will contain the program
address for returning.  Back up x words for the length of the jump
instruction, and call "nm"...  good luck.

This is an interesting way to get at the entire stack frame, and
needless to say, all sorts of fun!  A generic System V stack frame looks
like:

<n words of passing parameters>
program address (address after call)
saved ap (start of previous frame)
saved fp (start of previous frame's automatics and temporaries)
<n words for saving registers>
<automatic and temporary variables>
   |
   |   Stack Growth
   V

I don't know if the same trick will work with BSD or any of the other
<a-hem> variants.  And life is too short to RTFS the BSD internals, so
I'll leave it to someone else to comment on how to do "it" on other
forms of Unix.

   Walt Pesch
   {ihnp4,akgua,et al}!cuuxb!wbp
   cuuxb!wbp@lll-crg

gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/05/86)

In article <961@cuuxb.UUCP> wbp@cuuxb.UUCP (Walt Pesch) writes:
>... A generic System V stack frame looks like:

It is perhaps worth pointing out for people who aren't really into
language technology that there is no such thing as a "generic"
System V stack frame.  Such details are necessarily implementation-
specific.  The classic reference for this is Bell Labs CSTR No. 102,
"The C Language Calling Sequence" by S. C. Johnson and D. M. Ritchie.

chris@mimsy.UUCP (Chris Torek) (12/05/86)

In article <961@cuuxb.UUCP> wbp@cuuxb.UUCP (Walt Pesch) writes:
>Oh, well, you asked!  Time to get down into the mud...  For System V,
>the following dirty trick should work:

Getting down in the mud is right; this kind of thing is hardly
portable.  But `for System V' is not true:  The operation is
*machine* specific, not *operating system* specific (though there
are machines on which the operating system might affect the method).

Here is some 4.2/4.3BSD Vax Unix code to do the trick.

#include <stdio.h>
#include <sys/types.h>
#include <machine/frame.h>

main()
{

	f();
	g();
	exit(0);
}

f()
{
	g();
}

g()
{
	register struct frame *fr;	/* r11 */

	asm("	movl	fp,r11");	/* set fr=frame */
	printf("g: will return to address %x\n", fr->fr_savpc);
}
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
UUCP:	seismo!mimsy!chris	ARPA/CSNet:	chris@mimsy.umd.edu

greg@utcsri.UUCP (Gregory Smith) (12/06/86)

In article <961@cuuxb.UUCP> wbp@cuuxb.UUCP (Walt Pesch) writes:
>In article <810@hropus.UUCP> jgy@hropus.UUCP writes:
>>Can anyone help me with the following problem:  I'm looking for a few 
>>lines of C or assembly code which can be used at the top of a function 
>>to get the address of the function which called it.  I can then map
>>this address to the calling function's name using "nm".
>
>Oh, well, you asked!  Time to get down into the mud...  For System V,
>the following dirty trick should work:
>
>When defining the actual function, which is normally passed "n" 
>variables, define the function to have "n+1" variables.  By the nature 
>of the stack frame, the "n+1"'th variable will contain the program
>address for returning.

Not on anything I've seen. The n+1'th parameter will be part of the caller's
auto or temp space.

What you need is the 'zeroth' parameter:

foo(a,b)
any a,b;
{	char **klugepointer =( (char **)&a) -1;

This sets klugepointer to an address one 'char *' lower than the address of
the first parameter - this will point to the return address on many machines.
More directly:
	char *retadress = ( (char**) &a )[-1];
This return address will be somewhere within the calling routine.
You can use that to find the caller's name. If you can look at the
next stack frame down, you can find the address to return from the
caller. This will allow you to locate the subroutine call which activated
the caller. The caller's start address will be contained in this
instruction (or can be calculated from it). This is assuming that
the caller was normally called, of course. There may be >1 type of
call instruction. If these are of different lengths, you cannot determine
the beginning of the instruction given the address of the next instruction
(i.e. the return address).

You will need a 3B2 machine-language reference and some sample 
assembler output from cc to complete this fearsome, grisly mission
if you choose to accept it. Of course, don't expect it to port to
a different compiler, let alone a different machine.
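
Once a return address is in hand, mapping it to a name is the portable
part. The sketch below assumes the symbol addresses and names have already
been transcribed from "nm" output into a table sorted by ascending address;
the struct sym type and the sym_lookup() function are hypothetical names
invented for this illustration, not part of any system library.

```c
#include <assert.h>
#include <stddef.h>

/* One symbol-table entry, e.g. transcribed from sorted "nm" output. */
struct sym {
    unsigned long addr;   /* start address of the function */
    const char   *name;   /* symbol name as printed by nm */
};

/*
 * Return the name of the function that contains pc: the entry with the
 * greatest start address that is <= pc.  The table must be sorted by
 * ascending address.  Returns NULL if pc lies below the first symbol.
 */
const char *sym_lookup(const struct sym *tab, size_t n, unsigned long pc)
{
    size_t lo = 0, hi = n;            /* binary search over [lo, hi) */

    assert(tab != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tab[mid].addr <= pc)
            lo = mid + 1;             /* entry starts at or before pc */
        else
            hi = mid;                 /* entry starts after pc */
    }
    return lo == 0 ? NULL : tab[lo - 1].name;
}
```

Because the table records only start addresses, any address past the last
symbol is attributed to the last function, so treat the answer as a best
guess rather than proof of who called.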

-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...

ggs@ulysses.UUCP (Griff Smith) (12/06/86)

> In article <810@hropus.UUCP> jgy@hropus.UUCP writes:
> >Can anyone help me with the following problem:  I'm looking for a few 
> >lines of C or assembly code which can be used at the top of a function 
> >to get the address of the function which called it.
> 
> Oh, well, you asked!  Time to get down into the mud...  For System V,
> the following dirty trick should work:
> ...
> A generic System V stack frame looks like:
> 
> <n words of passing parameters>
> program address (address after call)
> saved ap (start of previous frame)
> saved fp (start of previous frame's automatics and temporaries)
> <n words for saving registers>
> <automatic and temporary variables>
>    |
>    |   Stack Growth
>    V
> 
> I don't know if the same trick will work with BSD or any of the other
> <a-hem> variants.  And life is too short to RTFS the BSD internals, so
> I'll leave it to someone else to comment on how to do "it" on other
> forms of Unix.
> 
>    Walt Pesch
>    {ihnp4,akgua,et al}!cuuxb!wbp
>    cuuxb!wbp@lll-crg

Er, ah, I understand the company policy that UNIX(R) = System V (don't
agree with it, but understand it), but saying UNIX = System V running
on a 3B2 is carrying company loyalty a bit too far.  

System V (or BSD, for that matter) on a VAX, or a CCI POWER 6/32 (plus
many others) grows the stack the other way, so you can't access the
return address as an extra arg.  You can't easily find the return
address when given only the argument list; you need the value of the
frame pointer.  If you really must have the return address, write a
short assembly language function that finds the caller of ITS caller by
following the frame pointer.  Make sure you use a different version of
the function for each machine you want to run it on.  Better yet, use a
debugging tool that knows what it's doing and quit fooling with machine-
specific hacks.
-- 

Griff Smith	AT&T (Bell Laboratories), Murray Hill
Phone:		(201) 582-7736
UUCP:		{allegra|ihnp4}!ulysses!ggs
Internet:	ggs@ulysses.uucp

tim@hoptoad.uucp (Tim Maroney) (12/06/86)

There is a portable and clean way to implement a routine finding out the
address of its caller in a few lines of code.  It involves no assembly
language or machine assumptions.  Simply pass the address of the calling
routine as an argument to the routine that needs the address.

foo()
{
	nmuser(foo);
}

nmuser(f)
int (*f)();
{
	/* whatever you are doing using nm */
}

This can be fooled, but an assembly-language caller can easily fool the
other scheme as well by putting a spurious return address on the stack.
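
With a compiler that supports C99, the same pass-it-as-an-argument idea can
hand over the caller's name directly, so no "nm" lookup is needed. This is
only a sketch: nmuser_from(), the nmuser() wrapper macro, and last_caller
are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Most recent caller name seen; kept only so the effect is observable. */
static const char *last_caller;

/* Hypothetical worker: the caller's name arrives as an ordinary argument. */
void nmuser_from(const char *caller)
{
    assert(caller != NULL);
    last_caller = caller;
    /* whatever you are doing with the caller's name */
}

/* C99: __func__ is the name of the function enclosing the call site. */
#define nmuser() nmuser_from(__func__)

void foo(void)
{
    nmuser();   /* equivalent to nmuser_from("foo") */
}
```

As with passing the function pointer by hand, a caller that bypasses the
macro and calls nmuser_from() with some other string can fool it.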
-- 
Tim Maroney, Electronic Village Idiot
{ihnp4,sun,well,ptsfa,lll-crg,frog}!hoptoad!tim (uucp)
hoptoad!tim@lll-crg (arpa)

mac@esl.UUCP (Mike McNamara) (12/06/86)

In article <5428@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>It is perhaps worth pointing out for people who aren't really into
>language technology that there is no such thing as a "generic"
>System V stack frame.  Such details are necessarily implementation-
>specific.  The classic reference for this is Bell Labs CSTR No. 102,
>"The C Language Calling Sequence" by S. C. Johnson and D. M. Ritchie.


   Actually, the _important_ reference is, as Chris Torek pointed out, 
   /usr/include/machine/frame.h, or equivalent. This tells you what your
   machine does.

-- 
 Michael Mc Namara                 
 ESL Incorporated                 
 ARPA: mac%esl@lll-lcc.ARPA

wbp@cuuxb.UUCP (Walt Pesch) (12/08/86)

This is what I get for looking at too many 3B crash dumps - it gets to
you after a while.  Y'all are right, the stack frame is not only
dependent on the OS but also very much machine specific.  For AT&T SV
releases on the 3B line, the Stack Frame does look like what I showed, 
and it does grow downwards.  For the AT&T System V on the Vax, it still
looks the same but grows upwards (97% certainty).  For anything else, 
my advice would be to get into crash and dump the stack for a couple of 
procs and figure it out yourself.  If you're lucky, the stack dump
will also contain a pretty picture showing you exactly what to look
for.


   Walt Pesch
   {ihnp4,akgua,et al}!cuuxb!wbp
   cuuxb!wbp@lll-crg

wbp@cuuxb.UUCP (Walt Pesch) (12/08/86)

This is what I get for looking at too many 3B crash dumps - it gets to
you after a while.  Y'all are right, the stack frame is not only
dependent on the OS but also very much machine specific.  For AT&T SV
releases on the 3B line, the Stack Frame does look like what I showed, 
and it does grow downwards in memory (i.e. the stack pointer is
incremented in a push operation).  For the AT&T System V on the Vax, 
it still looks the same but grows upwards in memory (i.e. the stack
pointer is decremented in a push operation).  For anything else, my 
advice would be to get into crash and dump the stack for a couple of 
procs and figure it out yourself.  If you're lucky, the stack dump
will also contain a pretty picture showing you exactly what to look
for.
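
Which way the stack moves on a given machine can also be probed from C
instead of recalled from memory. The sketch below compares the address of a
local in a caller with one in its callee. It assumes a conventional
contiguous stack and that the call is not inlined; the C language
guarantees neither, so treat the result as a hint. The function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if this (deeper) frame's local sits below the caller's local. */
static int deeper_is_lower(uintptr_t outer_addr)
{
    volatile char inner;

    assert(outer_addr != 0);
    return (uintptr_t)&inner < outer_addr;
}

/*
 * Return 1 if a push appears to move toward lower addresses (stack pointer
 * decremented), 0 if toward higher addresses (stack pointer incremented).
 */
int stack_grows_down(void)
{
    volatile char outer;
    /* Call through a volatile pointer to discourage inlining. */
    int (*volatile probe)(uintptr_t) = deeper_is_lower;

    return probe((uintptr_t)&outer);
}
```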


   Walt Pesch
   {ihnp4,akgua,et al}!cuuxb!wbp
   cuuxb!wbp@lll-crg

billc@blia.BLI.COM (Bill Coffin) (12/10/86)

People have been talking about the caller's-address problem as if it were
machine-dependent or operating-system dependent.  Not so.  It is
a compiler-dependent problem.  I guarantee you that a compiler generating
different frame-handling code will break any scheme that gets the
functions' return address.  Further, any compiler that can emulate
a stack architecture and do reasonable frame handling will enable 
solution of the problem; we have solved this problem on VM/CMS -- a
non-stack architecture.  Getting the caller's address is dependent 
entirely on the code generated by the compiler.
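
Since the compiler is the one party that does know the frame layout, some
compilers expose the answer themselves. The sketch below assumes GCC or
Clang, which provide __builtin_return_address() and the noinline
attribute; other compilers may offer nothing equivalent. The function name
who_called_me() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumption: GCC or Clang.  __builtin_return_address(0) yields the
 * address this function will return to, i.e. a location inside whichever
 * routine called it.  noinline keeps the function from being merged into
 * its caller, which would change whose return address is reported.
 */
__attribute__((noinline))
void *who_called_me(void)
{
    void *ret = __builtin_return_address(0);

    assert(ret != NULL);
    return ret;
}
```

The value returned is an address somewhere inside the calling routine, so
it still has to be matched against the symbol table to get a name.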

<vanilla disclaimer>

dave@lsuc.UUCP (12/16/86)

I just came across this code I wrote 5 years ago on a PDP-11/45
running v6:

/ returns the caller's return address
.globl	_pcret
_pcret:
	mov	2(r5),r0	/ pc of caller of caller
	rts	pc

Looking back at the application, I see now that doing it this way
was silly as well as non-portable. I wanted a function to do
different things depending on which function called it. Much better
just to pass that information as an argument. However, if anyone
out there has a PDP-11 running v6, this might work. Call it
pcret.s and invoke it as pcret(), of course.

David Sherman
Toronto
-- 
{ ihnp4!utzoo  seismo!mnetor  utai  watmath  decvax!utcsri  } !lsuc!dave