[comp.unix.wizards] sun-3 dbx, arguments, hacking, help...

dean@homxb.UUCP (D.JONES) (04/25/88)

	HELP!!!  I need to know how `dbx' knows how many arguments are on the
stack when you do the `where' command on a SUN 3.

	It seems that in a fit of `laziness',  our software developers became
very dependent on a very very non-portable, compiler dependent, function that
returns the number of arguments for the function that called it, ... got that ?

	It was used something like this: ... not actual code here   :-)

void do_something(a,b,c) int a, b, c;
{
	int args; 	/* add your favorite auto's, it still works */

	count(&args);

	switch(args){
		case 1:
			do_another_thing(a);
			break;
		case 2:
			do_so_and_so(a,b);
			break;
		case 3:
			yea_yea_yea(a+b,c);
			break;
		default:
			oops();
	}
}

function count() is pretty easy to do on a 3b, vax, or Amdahl Native compiler,
but on the sun 3's,  we got into some real trouble.....  We had to look at the
next instruction at the return point for do_something(), and hack out how much
it was adding to the stack pointer after a function call. ( addqw #0x4,sp )
and divide by sizeof(int) or whatever,  and voila', it worked...

	Yea,  I know this only worked for 4 byte arguments,  I am not defending
the code.

	So the real problem is this. On our new SunOS 4.0,  the optimizer kills
the `final' stack adjustment for a function. Eg:

go()
{
	printf("%d\n", 1);

	/* in asm          addqw  #0x8,sp */

	bla(34);

	/* in asm          addqw  #0x4,sp */

	do_anything(1, 'a', 2, "help");

	/* forget it,  it's the last function call */
}

	Now,  dbx KNOWS how many 4 byte arguments are there.  You do not have
to compile with -g,   and it does not look in a symbol table and see how many
are supposed to be there,   it knows even for functions with a variable number
of arguments.  Try the following program:

main()
{
	go(0);
	go(0,0);
	go(0,0,0);
	go(0,0,0,0);
}

go(a) int a;
{
}

	Compile and run under dbx,  no -g,  and stop at symbol go and do a
`where'.  Each time it gives the correct argument dump. So, how does it do
it ???  

	We have about 100 functions that use this `count()' routine to check
for optional arguments.   These should be easy enough to fix,  but they are
used from roughly 3000 places in our code,  each of which will also have to
be fixed or supplied with the extra argument(s). We have about 1 Million lines
of code,   and poking through it all to find them will take us way past our
release date.  I am looking into ways to discontinue the use of the count()
function, but I need an intermediate solution so we don't blow our schedule ...

	Any help would be appreciated ...

							Dean Jones
							AT&T Bell Labs.
							HO 1K-426

						   { AT&T Gateways }!homxb!dean
P.S.
	This will probably be impossible on a SUN-4

guy@gorodish.Sun.COM (Guy Harris) (04/25/88)

> 	HELP!!!  I need to know how `dbx' knows how many arguments are on the
> stack when you do the `where' command on a SUN 3.

By looking at the code following the call to see how many bytes it pops off the
stack.  (I checked the "dbx" source.)

> 	Now,  dbx KNOWS how many 4 byte arguments are there.  You do not have
> to compile with -g,   and it does not look in a symbol table and see how many
> are supposed to be there,   it knows even for functions with a variable number
> of arguments.  Try the following program:

I did, and discovered that it does *not* get the argument count right on the
last call, because, as you indicate, the optimizer gets rid of the
stack-cleaning code.  It thinks "go" was called with zero arguments in the last
call.

> P.S.
> 	This will probably be impossible on a SUN-4

Make that "certainly will be impossible on a Sun-4", or on any other machine
that 1) passes arguments in registers and 2) has a compiler that does
sufficient optimization.  Perhaps not impossible, but at least *extremely*
difficult.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (04/26/88)

In article <1641@homxb.UUCP> dean@homxb.UUCP (D.JONES) writes:
>We have about 1 Million lines of code, and poking through it all
>to find them will take us way past our release date.

I sure hope that part of your solution is to find those responsible
and see that they are fired.

P.S.  This function used to be called nargs() and it was removed
from PDP-11 UNIX when the PDP-11/70 came out, precisely because it
became clear that programs that relied on it would have problems
when ported to new machines.  It is hard to imagine anyone taking
the trouble to implement a count(&args) function without realizing
this!

bzs@bu-cs.BU.EDU (Barry Shein) (04/26/88)

This "magic function that returns the number of args" started a long
time ago with the unix function nargs() which did the same basic
thing. You will now, undoubtedly, get all sorts of sneers from people
who like playing armchair quarterback about how you shouldn't do that
(even tho you explained that you inherited it!)

One idea that comes to mind is to use /usr/5bin/m4 on the Sun (the
other BSD/Pre-SysV m4 won't do this) to massage the calls into a
format where the first arg will be the number of arguments, then you
can work that one function to use varargs.h :

define(do_something,Do_something(`$#',$*))

(the quotes are necessary) with a little cleverness you could
batch-run that once on all your files. I ain't sayin it's a sure fix,
but it might get you thinking about solving that once and for all.
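
A minimal sketch of what the rewritten function could look like, assuming the
m4 pass has already turned every do_something(a,b) into Do_something(2,a,b).
It uses <stdarg.h>, the ANSI successor to varargs.h, and the return values
merely stand in for the calls to do_another_thing() and friends; both choices
are assumptions made for illustration, not part of the original code.

```c
#include <stdarg.h>

/* Hypothetical replacement for do_something(): the first parameter is the
 * argument count that the m4 macro inserted from `$#'. */
int Do_something(int nargs, ...)
{
    va_list ap;
    int a = 0, b = 0, c = 0;
    int result;

    /* Fetch only the arguments the caller really passed. */
    va_start(ap, nargs);
    if (nargs >= 1) a = va_arg(ap, int);
    if (nargs >= 2) b = va_arg(ap, int);
    if (nargs >= 3) c = va_arg(ap, int);
    va_end(ap);

    switch (nargs) {
    case 1:  result = a;         break;  /* stands in for do_another_thing(a) */
    case 2:  result = a + b;     break;  /* stands in for do_so_and_so(a,b)   */
    case 3:  result = a + b + c; break;  /* stands in for yea_yea_yea(a+b,c)  */
    default: result = -1;        break;  /* stands in for oops()              */
    }
    return result;
}
```

The switch on the count stays exactly as it was; only the source of the count
changes, from stack archaeology to an ordinary parameter.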

	-Barry Shein, Boston University

davidsen@steinmetz.ge.com (William E. Davidsen Jr) (04/26/88)

  One of the things lost in the move from B to C was the function nargs,
which was a real part of the language. There had to be something in the
calling sequence to allow the called program to determine the number of
arguments. There were no types so we didn't have any problem there.
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

rwa@auvax.UUCP (Ross Alexander) (04/28/88)

Doug Gwyn mentions that nargs() was pulled out of un*x at about v6 or
so because people decided it would be hard to port.  I assume this
ha[ds] something to do with the decision to go to a split I-and-D
space addressing model at about that time (correct me as necessary).

On machines with flat addressing spaces (i.e., able to do "mov
@#codespace, r0" and get it to work right :-) the nargs() function is
trivial assuming the compilers adopt some kind of a convention re
linkage code.  Yes, it varies from machine to machine; but so does
bcopy().

I think the other thing that might foul things up is the idea of
variable sized objects (is eight bytes on the stack 4 shorts, 2 ints,
one double, or char x[ 8 ] ??).  This could be solved easily enough
via descriptor lists ( even Algol60 had `dope vectors' ).

Ross "nargs() is in the B library!" Alexander
Athabasca University, Alberta

limes@sun.uucp (Greg Limes) (04/30/88)

In article <613@auvax.UUCP> rwa@auvax.UUCP (Ross Alexander) writes:
>Doug Gwyn mentions that nargs() was pulled out of un*x at about v6 or
>so because people decided it would be hard to port. ...
>
>On machines with flat addressing spaces (i.e., able to do "mov
>@#codespace, r0" and get it to work right :-) the nargs() function is
>trivial assuming the compilers adopt some kind of a convention re
>linkage code.  Yes, it varies from machine to machine; but so does
>bcopy().
>
>I think the other thing that might foul things up is the idea of
>variable sized objects (is eight bytes on the stack 4 shorts, 2 ints,
>one double, or char x[ 8 ] ??).  This could be solved easily enough
>via descriptor lists ( even Algol60 had `dope vectors' ).

mmm. sometimes the idea of the number of bytes on the parameter stack
might be of use; on current sun compilers, you can look at the
instruction following the call to get this. takes a bit of work, you
have to check for both the "addqw #x,sp" and "lea sp@(x),sp" forms, but
not too hard. The following function returns the number of bytes passed,
or -1 if it is unable to find out ... oh, this only works on mc68020
based machines that use Sun stack conventions.

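		.globl  _bargs
	_bargs:
		movl    a6@(4),a0		| a0 = return address of the function that called us
		movw    a0@,d0			| d0 = opcode word found at that address

		cmpw    #0x4FEF,d0		| lea sp@(d16),sp ?
		bne     0f
		bfextu  a0@{#16:#16},d0		| yes: byte count is the 16-bit displacement
		bra     9f
	0:
		andw    #0xF1FF,d0		| mask out the 3-bit immediate field
		cmpw    #0x504F,d0		| addqw #n,sp ?
		bne     0f
		bfextu  a0@{#4:#3},d0		| yes: byte count is the immediate field
		bne     9f
		movl    #8,d0			| a zero field encodes 8
		bra     9f
	0:
		movl    #-1,d0			| unrecognized instruction
	9:
		rts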
-- 
   Greg Limes [limes@sun.com]				frames to /dev/fb

nigelh@uvicctr.UUCP (R. Nigel Horspool) (05/06/88)

Here is an answer to the question about determining how
many bytes of parameters have been passed to a C function
executed on a SUN-3 (i.e. a machine based on the Motorola 68000 family).
I don't know if this is close to how SunOS 4.0 dbx does it, but
in the absence of symbol table information, I can't see any other
way it could be done.

I have presented my answer in tutorial form so it's rather long.

First, let's modify the example code we were given so that function
`go' has some parameters and local variables.

>>	go( a, b, c )
>>	int a, b, c;
>>	{
>>	    int x[10];
>>
>>	    /*  most of the body of `go' is omitted  */
>>
>>	    do_anything(1, 'a', 2, "help");
>>	}

The prologue code at the entry to the `go' function has to create a new
stack frame to hold the local variables and link this stack frame to
the caller's stack frame.  Both actions are accomplished by the single
instruction called `link' on the 68000.

So, at the beginning of the code for `go', there will be an instruction:
		link  a6,#-40
where 40 is the number of bytes of local storage for `go'.  The minus
sign appears because the stack on which the memory is allocated grows
downward in memory.  (If the -O option hasn't been used, an equivalent
pair of instructions:  link a6,#0  addw #-40,sp  may appear instead.)

Immediately before calling `do_anything' (i.e. after pushing the 4
parameters in reverse order) and again immediately after returning from
that function, the stack will look like

	a6  ------>	|          |
			|----------|
			| 40 bytes |
			| for the  |
			| locals   |
			|----------|
			| arg  #4  |
			|----------|
			| arg  #3  |
			|----------|
			| arg  #2  |
			|----------|
	sp  ------->    | arg  #1  |
			|----------|

Therefore a computation to determine how many bytes of parameters are
on the stack at either of these points in the execution is:
	(a6) - 40 - (sp)
But this code would need to be executed by the caller of `do_anything'
and that isn't very useful.

If you want to execute code inside `do_anything' that determines how
many bytes of parameters were passed to it, the diagram is more
complicated (like a double version of the diagram above).  Let's assume
that `do_anything' has 20 bytes of local variables.  Then the picture
after `do_anything' has allocated its frame looks like:

			|          |  <----
			|----------|       |
			| 40 bytes |       |	(stack frame
			| for the  |       |	 for `go')
			| locals   |       |
			|----------|       |
			| arg  #4  |       |
			|----------|       |
			| arg  #3  |       |
			|----------|       |
			| arg  #2  |       |
			|----------|       |
			| arg  #1  |       |
			|----------|       |
			| ret addr |       |
			|----------|       |
	a6  ------>	| dyn link +------>
			|----------|
			| 20 bytes |		(stack frame
			| for the  |		 for `do_anything')
	sp  ------>	| locals   |
			|----------|

A calculation for the number of bytes of parameters from inside
`do_anything' is:
	((a6)) - (a6) - 48
where parentheses indicate dereferencing (i.e. following a pointer),
and the 48 is composed from 40 bytes for the caller's local variables
plus 4 bytes for the return address pushed on the stack by the function
call instruction (jsr) plus 4 bytes for the dynamic link pointer pushed
on the stack by the link instruction.
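
The subtraction can be written out as a small C function. This is only the
arithmetic: the two frame addresses and the size of the caller's locals are
passed in as plain numbers, and the function name is invented for the example.

```c
/* Bytes of parameters as seen from inside the callee.
 *   caller_fp     : the value stored at the callee's a6, i.e. ((a6))
 *   callee_fp     : the callee's own a6, i.e. (a6)
 *   caller_locals : bytes of local storage in the caller (40 for `go') */
long param_bytes(unsigned long caller_fp, unsigned long callee_fp,
                 long caller_locals)
{
    const long ret_addr = 4;    /* pushed by jsr */
    const long dyn_link = 4;    /* old a6 pushed by link */

    return (long)(caller_fp - callee_fp)
           - caller_locals - ret_addr - dyn_link;
}
```

With the frames in the diagram 64 bytes apart and 40 bytes of locals in `go',
the result is 64 - 48 = 16, the four 4-byte arguments.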


If we knew the number of bytes of local variables in the caller, it
would be easy to solve the problem in a couple of lines of C.
But finding the number of bytes of local variables in the caller ???
That's a bit harder.  The only technique I can think of is...

(1) Get the address of the caller's stack frame.
(2) Use the return address value in the caller's frame to find the
    `jsr' instruction that called the caller.
(3) By inspecting the operand of the `jsr' instruction, find the
    entry address of the caller.
(4) Finally, by looking at the immediate operand of the `link'
    instruction at the caller's entry point, determine the number
    of bytes of locals in the caller.
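
Steps (1) to (4) can be tried out on any machine against a simulated memory
image. The sketch below is an illustration only: the 256-byte image, its
addresses (0x10, 0x40, 0xC0) and all function names are invented, and the image
contains just the words the walk reads, namely a `jsr' with an absolute long
operand and a `link a6,#-40' at the caller's entry.

```c
#include <string.h>

/* Big-endian accessors for a simulated 68k memory image. */
static unsigned long rd16(const unsigned char *m, unsigned long a)
{
    return ((unsigned long)m[a] << 8) | m[a + 1];
}
static unsigned long rd32(const unsigned char *m, unsigned long a)
{
    return (rd16(m, a) << 16) | rd16(m, a + 2);
}
static void wr16(unsigned char *m, unsigned long a, unsigned long v)
{
    m[a] = (unsigned char)(v >> 8);
    m[a + 1] = (unsigned char)v;
}
static void wr32(unsigned char *m, unsigned long a, unsigned long v)
{
    wr16(m, a, v >> 16);
    wr16(m, a + 2, v & 0xFFFF);
}

/* Walk steps (1)-(4) starting from the callee's frame pointer. */
long count_param_bytes(const unsigned char *m, unsigned long callee_fp)
{
    unsigned long caller_fp = rd32(m, callee_fp);     /* (1) dynamic link   */
    unsigned long ret = rd32(m, caller_fp + 4);       /* (2) caller's ret   */
    unsigned long entry = rd32(m, ret - 4);           /* (3) jsr operand    */
    unsigned long raw = rd16(m, entry + 2);           /* (4) link operand   */
    long disp = raw >= 0x8000 ? (long)raw - 0x10000L : (long)raw;

    /* disp is negative, so adding it subtracts the caller's locals. */
    return (long)(caller_fp - callee_fp) - 8 + disp;
}

/* Build a 256-byte image in which `go' (40 bytes of locals) has pushed
 * nparams bytes of arguments and called a function whose frame pointer
 * is returned. Keep nparams at 64 or less so the regions do not collide. */
unsigned long build_demo_image(unsigned char *m, unsigned long nparams)
{
    const unsigned long caller_fp = 0xC0;
    unsigned long callee_fp = caller_fp - 40 - nparams - 4 - 4;

    memset(m, 0, 256);
    wr16(m, 0x10, 0x4EB9); wr32(m, 0x12, 0x40);    /* jsr go (returns to 0x16) */
    wr16(m, 0x40, 0x4E56); wr16(m, 0x42, 0xFFD8);  /* go: link a6,#-40 */
    wr32(m, caller_fp + 4, 0x16);                  /* return address in go's frame */
    wr32(m, callee_fp, caller_fp);                 /* dynamic link in callee's frame */
    return callee_fp;
}
```

Building the image with 16 bytes of parameters and handing the returned frame
pointer to count_param_bytes() yields 16. On a real SUN-3 the same reads go
through pointers into the live stack and text segment.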

It's all highly dependent on the style of code emitted by the SUN C
compiler, but below is an example C program which performs all the
necessary calculations.  Remember, the program will produce the right
outputs only on a SUN-3 and only when it is compiled with the -O flag.


		R. Nigel Horspool
		University of Victoria
		(that's in Canada)


----------------- sample program follows -------------------

    /* foo determines how many bytes of parameters it has
       been passed on a SUN-3 (a 68000-based machine) */
    void foo( a )
    int a;
    {
        int *cheat[1];	/* `cheat' must be the FIRST local variable */
        int  *p, nbytes;
        short *q;
        /* declarations for other local variables appear here */
        
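        p = cheat[1];		/* slot above `cheat' holds the saved a6: caller's frame */
        nbytes = ((int)p - (int)(cheat) - 12);	/* `cheat' is 4 below a6, so 12 not 8 */
        p = (int *)(p[1]);	/* return address saved in the caller's frame */
        q = (short *)(p[-1]);	/* jsr operand before it: the caller's entry address */
        nbytes += q[1];		/* add the (negative) operand of the caller's link */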
        
        printf( "Number of bytes of parameters == %d\n", nbytes );
    
        /* code for foo can continue here */
    }
    
    main() {
       int a[10];
       foo();			/* pass 0 bytes */
       foo( 1 );		/* pass 4 bytes */
       foo( 2, 4 );		/* pass 8 bytes */
       foo( 3, "abc", 4, 5 );	/* pass 16 bytes */
    }

------------------------ end of sample program --------------------