[comp.lang.c] Accessing argc & argv from a function

mikel@flmis06.ATT.COM (Mikel Manitius) (07/16/87)

I recently ran across an interesting question.

How does one get at argc and argv (and possibly envp) from a function?
Without declaring it in main first, and then passing a pointer (global
or not)! Assume you don't have control over what happens in main. Can
you still get at the argument vector?
-- 
				Mikel Manitius @ AT&T Network Operations
				mikel@codas.att.com.uucp | attmail!mikel

steele@unc.cs.unc.edu (Oliver Steele) (07/17/87)

In article <22@flmis06.ATT.COM> mikel@flmis06.ATT.COM (Mikel Manitius) writes:
>I recently ran across an interesting question.
>
>How does one get at argc and argv (and possibly envp) from a function?
>Without declaring it in main first, and then passing a pointer (global
>or not)! Assume you don't have control over what happens in main. Can
>you still get at the argument vector?

If your question is theoretic, this is probably cheating, but if it's
something that you really need to do here's how:

If you're under UN*X, include some code
    int     myArgc;
    char    **myArgv;
    char    **myEnvp;
    myMain(argc,argv,envp)
	int     argc;
	char    **argv;
	char    **envp;
    {
	myArgc = argc;
	myArgv = argv;
	myEnvp = envp;
	return main(argc,argv,envp);
    }
and use the -e option of ld.  my{Arg[cv],Envp} have global scope.

If you're on some system without ld, you'll need to find out how to
set the loader to use a different entry point from main().

Alternatively, find out the stack-frame format on your machine, and
create a function that looks down the stack for argc etc. and saves them
for future reference by your other functions.  If main() does not do
stuff like argc--, and if you can guarantee that your function will
always be called from the same depth or that it can find the correct
stack frame in some other way, this will work.  (By the way, this is
non-portable :-).

Note that setenv() and getenv() are preferred for accessing envp.

------------------------------------------------------------------------------
Oliver Steele				  ...!{decvax,ihnp4}!mcnc!unc!steele
							steele%unc@mcnc.org

	"They're directly beneath us, Moriarty.  Release the piano!"

dg@wrs.UUCP (David Goodenough) (07/17/87)

In article <22@flmis06.ATT.COM> mikel@flmis06.ATT.COM (Mikel Manitius) writes:
>I recently ran across an interesting question.
>
>How does one get at argc and argv (and possibly envp) from a function?
>Without declaring it in main first, and then passing a pointer (global
>or not)! Assume you don't have control over what happens in main. Can
>you still get at the argument vector?
>-- 
>				Mikel Manitius @ AT&T Network Operations
>				mikel@codas.att.com.uucp | attmail!mikel

In a sentence - you could, but it would be a real mess, and extremely
system dependent. To show *EXACTLY* what happens, here is the
stack (from a PDP-11, 68K, VAX, Z80, and probably 80*86) just after
entry to main:


		|		 |
		|      envp	 |	<------ these are the ONLY
		+----------------+	<------ references to argc etc. that
		|      argv	 |	<------ exist in the memory space
		+----------------+	<------ of the program, so you've
		|      argc	 |	<------ GOT to use them or do without
		+----------------+
		| return address |
		|    from main   |
		+----------------+
		| previous stack |
frame pointer ->|  environment   |
		+----------------+
		|   locals for   |
		|      main      |

Since argc, argv, and envp can only be accessed at this location, you have
two options: get at them as arguments to main (nice and clean), or wherever
it is you want them do something like:


getargs()
 {
    int i;

    static int *ip;		/* note the static */

    static struct frame
     {
	struct frame *previous;	/* saved frame pointer comes first */
	char *ret;		/* "return" is a keyword, so call it ret */
     } *fp;

Now, since i is the only automatic local (this is the reason for ip & fp
being static),

    ip = &i;

will set ip to point to one int below the current frame pointer. NOW

    fp = (struct frame *) &ip[1];

points fp at your current frame, and all you have to do is chase back up
the stack till you find whatever it is you're looking for. NOTE that this
is a kludge that I *DO NOT* recommend to anyone, because it will be so
unportable as to give most programmers a nervous breakdown. Also I don't
know how you'd set about detecting when you hit main. Once you do,
something along the lines of:

    ip = (int *) fp;

points ip back into the stack at main's frame, and indexing off ip (past the
saved frame pointer and return address) gets you argc etc. Not very pretty
you'll agree :-). As an
aside, I saw this used JUST ONCE in BCPL, which is a predecessor of C. It
did work in that environment because BCPL runs with two stacks: one growing
from high memory down contains constant sized frames, and another growing
from low memory up holds the locals. As a result of this, chasing up the
frame stack was not as traumatic as in C.
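
The "take the address of a local" trick above has a less fragile modern
counterpart for its first step. What follows is only a minimal sketch: it
assumes a GCC- or Clang-compatible compiler, since __builtin_frame_address
is a compiler extension and not standard C, and the names record_frames and
DEPTH are invented for illustration. It records the frame address of each
nested call and makes no attempt to locate main or its arguments.

```c
#include <assert.h>
#include <stddef.h>

#define DEPTH 4

/* Store the frame address of each nested call in frames[0..DEPTH-1].
 * __builtin_frame_address is a GCC/Clang extension, not standard C. */
int record_frames(void *frames[], int level)
{
    int deeper = 0;

    assert(level >= 0 && level < DEPTH);
    frames[level] = __builtin_frame_address(0);
    if (level + 1 < DEPTH)
        deeper = record_frames(frames, level + 1);
    return deeper + 1;              /* number of frames recorded from here */
}
```

Called as record_frames(frames, 0) with a void *frames[DEPTH] array, it
fills every slot. On a machine whose stack grows downward, and with
optimisation off, the recorded addresses typically decrease with depth; an
optimiser is free to merge or reshape frames, which is one more reason the
frame-chasing approach is unportable.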

P.S. this second suggestion is NOT meant to be taken too seriously, it
is more an explanation of why you should simply try to get the stuff
out of main: as you say either by passing parameters, or by using
globals. As an aside, if you have source for the UNIX library look
at getenv, because somehow or other it does about what you're after, and
I'm damned if I know how it works.
--
		dg@wrs.UUCP - David Goodenough

					+---+
					| +-+-+
					+-+-+ |
					  +---+

rml@hpfcdc.HP.COM (Bob Lenk) (07/18/87)

> If you're under UN*X, include some code
>    int     myArgc;
>    char    **myArgv;
>    char    **myEnvp;
>    myMain(argc,argv,envp)
>	int     argc;
>	char    **argv;
>	char    **envp;
>    {
>	myArgc = argc;
>	myArgv = argv;
>	myEnvp = envp;
>	return main(argc,argv,envp);
>    }
> and use the -e option of ld.  my{Arg[cv],Envp} have global scope.

This won't work in general.  On many (or most) implementations ld's
default entry point when invoked by cc is not main (or _main) but a
special machine-dependent piece of code that invokes main.  This
technique will bypass that code and cause problems.

		Bob Lenk
		{ihnp4, hplabs}!hpfcla!rml

gnu@hoptoad.uucp (John Gilmore) (08/03/87)

> Well, I think we may have lost track of the initial problem.
> 
> However I am still looking for a way to get at "argc" and "argv" from
> a function when I don't have access to "main".

There is no portable way to do this.

Even if there was such a way, most main() programs are written to 
completely parse their arguments and to reject any that they don't
understand.  If you were hoping to sneak in extra arguments that
your routine could parse later, you are probably out of luck.

The solution is to get the source for main() and modify it, or do something
that doesn't involve the arguments, e.g. pass the information you are
looking for in the environment, in a file, etc.  You never did get around
to explaining what it is you *really* want to do, or why you are modifying
subroutines' sources but don't have source to main().
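
Passing the information in the environment, as suggested above, might look
something like the following. This is a minimal sketch, not anyone's actual
code: the function name setting_from_env and the variable name in the usage
note are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Read an integer setting from the environment, or fall back to a default
 * when the variable is unset, empty, or not a plain decimal number. */
long setting_from_env(const char *name, long fallback)
{
    const char *text;
    char *end;
    long value;

    assert(name != NULL);
    text = getenv(name);
    if (text == NULL || *text == '\0')
        return fallback;
    value = strtol(text, &end, 10);
    return (*end == '\0') ? value : fallback;
}
```

A subroutine can then call setting_from_env("MYTOOL_VERBOSE", 0) without
ever seeing argv, and the user sets MYTOOL_VERBOSE (a hypothetical name) in
the shell before running the program.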
-- 
{dasys1,ncoast,well,sun,ihnp4}!hoptoad!gnu	     gnu@postgres.berkeley.edu
Alt.all: the alternative radio of the Usenet.

dg@wrs.UUCP (David Goodenough) (08/04/87)

In article <5773@ut-ngp.UUCP> ayac071@ngp.UUCP (Bill Douglass) writes:
>In article <382@root44.co.uk> njh@root44.UUCP (Nigel Horne) writes:
>>In article <39@flmis06.ATT.COM> mikel@flmis06.ATT.COM (Mikel Manitius) writes:
>>>
>>>However I am still looking for a way to get at "argc" and "argv" from
>>>a function when I don't have access to "main". getenv() and putenv()
>>>gets it from somewhere...

Technically getenv references environ - a global variable which is a char **
type of thing. However it just so happens that environ is also the third
argument to main(), which brings me to the following little program.
I wrote this over lunch the other day, and believe it or not it works.
I'm not suggesting that you all go and replace /bin/echo with this [among
other things I haven't implemented the -n switch :-) ], but it demonstrates
that what was requested is possible - if a little messy, and about as safe
as nitroglycerine.

*NOTE* it runs on a 68K and you *WILL*HAVE* to change struct frame to reflect
what your stack frames look like on whatever machine you're using. However
it does behave as expected; for all those of you with 68Ks I suggest you try
it. If you drop the printf in findargs, and the passing of ip all the way
down, it behaves just like echo - except there ain't no argc & argv declared
in main().

If Mikel Manitius is still after something and the following is close, but
doesn't work - email me & I'll help you as much as I can.
--
		dg@wrs.UUCP - David Goodenough

					+---+
					| +-+-+
					+-+-+ |
					  +---+

--- cut here --- cut here --- cut here --- cut here --- cut here ---
#include <stdio.h>		/* for printf */

extern char **environ;		/* need this to know when to stop */

main()				/* no args!! */
 {
    int i;			/* just used for address checking when we
				 * get to the bottom */
    getit(&i, 5);		/* go do the work */
 }

getit(ip, n)			/* this recursively calls itself to put
				 * some fake crud on the stack */
int *ip;
 {
    int ac;
    char **av;
    int i;

    if (n)
      getit(ip, n - 1);		/* not far enough down */
    else
     {
	findargs(&ac, &av, ip);	/* at the bottom - get argc & argv */
	for (i = 1; i < ac; i++)
	  printf("%s%c", av[i], (i == ac - 1) ? '\n' : ' ');
				/* and print args - just like echo!! */
     }
 }

struct frame			/* stack frame structure for a 68K */
 {
    struct frame *_dynamic;	/* dynamic link to previous frame */
    char *_return;		/* return address: I didn't know what type to
				 * make this, so I made it a char * */
    int _argc;			/* argc when we get there */
    char **_argv;		/* argv when we get there */
    char **_envp;		/* envp when we get there */
 } *work;

findargs(acp, avp, ip)		/* this does the work - ip is not needed,
				 * it's just printed for verification */
int *acp, *ip;
char ***avp;
 {
    int i;			/* only one local variable, used to get a
				 * reference to the current stack frame */

    work = (struct frame *) (&i + 1);
				/* ugh! - point work at the current frame */

    while (work->_envp != environ)
      work = work->_dynamic;	/* run up the stack till we hit a valid
				 * environment which we hope is main!! */
    *acp = work->_argc;		/* get argc */
    *avp = work->_argv;		/* and argv */

    printf("%x %x\n", ip, work);
				/* just to compare pointers - these two should
				 * be sizeof int apart */
 }

feg@clyde.UUCP (08/04/87)

In article <2616@hoptoad.uucp>, gnu@hoptoad.uucp (John Gilmore) writes:
> > Well, I think we may have lost track of the initial problem.
> > 
> > However I am still looking for a way to get at "argc" and "argv" from
> > a function when I don't have access to "main".
> 
> There is no portable way to do this.

   [deletions]

> The solution is to get the source for main() and modify it, or do something
> that doesn't involve the arguments, e.g. pass the information you are
> looking for in the environment, in a file, etc.  You never did get around
> to explaining what it is you *really* want to do, or why you are modifying
 
The original poster didn't say he wanted a portable way and he did say he
didn't have access to main().  Under MS-DOS, the command line information
is in the PSP (program segment prefix).  DOS 3.+ has the entire command
line, while DOS 2.+ does not have the name of the program being executed.
If you can't access it with C (and I believe that you can), you certainly
can do it in assembly.
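
The same idea, asking the operating system for its own copy of the command
line, can be sketched in C on systems other than DOS. The example below
assumes a Linux-style /proc file system, which is an assumption of this
sketch and has nothing to do with the PSP; the function name read_cmdline
is invented for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy the process's argument strings into buf (NUL-separated, as the
 * kernel stores them) and return how many there are, or -1 on failure.
 * Assumes a Linux-style /proc file system; this is not portable. */
int read_cmdline(char *buf, size_t size)
{
    FILE *fp;
    size_t used, i;
    int count = 0;

    assert(buf != NULL && size > 0);
    fp = fopen("/proc/self/cmdline", "rb");
    if (fp == NULL)
        return -1;
    used = fread(buf, 1, size - 1, fp);
    fclose(fp);
    buf[used] = '\0';               /* terminate a possibly truncated tail */
    for (i = 0; i < used; i++)
        if (buf[i] == '\0')
            count++;                /* each argument ends with a NUL */
    if (used > 0 && buf[used - 1] != '\0')
        count++;                    /* count a truncated final argument */
    return count;
}
```

The return value plays the part of argc, and the strings sit back to back
in buf, so a caller can step through them with strlen. Where /proc is not
available the function simply returns -1.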

Forrest Gehrke

hunt@spar.SPAR.SLB.COM (Neil Hunt) (08/05/87)

In article <289@wrs.UUCP> dg@wrs.UUCP (David Goodenough) writes:
>In article <5773@ut-ngp.UUCP> ayac071@ngp.UUCP (Bill Douglass) writes:
>>In article <382@root44.co.uk> njh@root44.UUCP (Nigel Horne) writes:
>>>In article <39@flmis06.ATT.COM> mikel@flmis06.ATT.COM (Mikel Manitius) writes:
>>>>
>>>>However I am still looking for a way to get at "argc" and "argv" from
>>>>a function when I don't have access to "main". getenv() and putenv()
>>>>gets it from somewhere...
>
>Technically getenv references environ - a global variable which is a char **
>type of thing. However it just so happens that environ is also the third
>argument to main(),

*At the time main is called*. Remember that C is call by value...

>                    which brings me to the following little program.
>I wrote this over lunch the other day, and believe it or not it works.
>I'm not suggesting that you all go and replace /bin/echo with this [among
>other things I haven't implemented the -n switch :-) ], but it demonstrates
>that what was requested is possible - if a little messy, and about as safe
>as nitroglycerine.
>

There is just one flaw in this code, and that is that you are relying on
environ == <third-arg-to-main> *after* main and various other functions
have run.  If it is not, then when you find the correct stack frame
for main, the test will fail, and you will continue to look for earlier
and earlier stack frames, until a segmentation violation occurs.

This assumption is violated under the following situations:

1 main changes envp.

2 main passes envp to another function as the third arg.

3 someone uses getenv or setenv to manipulate the environment
   which results in copying of the environment list as referred
   to by environ, without changing envp.
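
Situation 3 is easy to observe. The following is a minimal sketch assuming
a POSIX system that provides setenv() and environ; the function name and
the variable name GR_DEMO_NEW_VARIABLE are made up for illustration.
Whether the pointer actually moves is up to the C library, so the function
reports what happened and promises nothing.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L     /* ask for setenv() under strict modes */
#endif
#include <assert.h>
#include <stdlib.h>

extern char **environ;

/* Returns 1 if adding a new variable made environ point somewhere else,
 * 0 if the pointer stayed put, and -1 if setenv() itself failed. */
int environ_moved_after_setenv(void)
{
    char **before = environ;   /* stands in for the envp that main received */

    if (setenv("GR_DEMO_NEW_VARIABLE", "1", 1) != 0)
        return -1;
    assert(getenv("GR_DEMO_NEW_VARIABLE") != NULL);
    return environ != before;
}
```

Any result of 1 means that a test of the form work->_envp != environ can no
longer recognise main's frame, because main's copy of envp still holds the
old pointer.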

The code I posted a couple of days ago, while not as nicely written
using a struct frame to clean up the messiness, is slightly more
reliable, as instead of checking to see if environ is the same as
the third arg to main to determine whether it has found the main
stack frame, it checks the return address against the address of main
and the address of the following function, to see if a frame
was called from main, and if so, goes one frame further.

Of course, you do have to know what the name of the function following main
is; this can be obtained using nm with options to print the symbol table
in numerical order, if the source code is not available.

Try this modification (which I haven't tested):

>main()				/* no args!! */
> {
>    int i;			/* just used for address checking when we
>				 * get to the bottom */
>    getit(&i, 5);		/* go do the work */
> }
>
>getit(ip, n)			/* this recursively calls itself to put
>				 * some fake crud on the stack */
>int *ip;
> {
>    int ac;
>    char **av;
>    int i;
>
>    if (n)
>      getit(ip, n - 1);		/* not far enough down */
>    else
>     {
>	findargs(&ac, &av, ip);	/* at the bottom - get argc & argv */
>	for (i = 1; i < ac; i++)
>	  printf("%s%c", av[i], (i == ac - 1) ? '\n' : ' ');
>				/* and print args - just like echo!! */
>     }
> }
>
>struct frame			/* stack frame structure for a 68K */
> {
>    struct frame *_dynamic;	/* dynamic link to previous frame */
>    char *_return;		/* return address: I didn't know what type to
>				 * make this, so I made it a char * */
>    int _argc;			/* argc when we get there */
>    char **_argv;		/* argv when we get there */
>    char **_envp;		/* envp when we get there */
> } *work;
>
>findargs(acp, avp, ip)		/* this does the work - ip is not needed,
>				 * it's just printed for verification */
>int *acp, *ip;
>char ***avp;
> {
>    int i;			/* only one local variable, used to get a
>				 * reference to the current stack frame */
>
>    work = (struct frame *) (&i + 1);
>				/* ugh! - point work at the current frame */
>
     while(! (work->_return >= (char *)main &&	/* check for return to main */
      work->_return < (char *)getit))
	 work = work->_dynamic;	/* advance one frame up the stack */

     work = work->_dynamic;	/* advance one final frame up the stack,
				 * which we hope is main's!! */
>    *acp = work->_argc;		/* get argc */
>    *avp = work->_argv;		/* and argv */
>
>    printf("%x %x\n", ip, work);
>				/* just to compare pointers - these two should
>				 * be sizeof int apart */
> }

Neil/.

karl@haddock.ISC.COM (Karl Heuer) (09/21/87)

In article <222@hobbes.UUCP> root@hobbes.UUCP (John Plocher) writes:
>If one wanted to make the arguments available w/o going thru main() then I
>agree that you could just make crt0 put them into static locations, BUT would
>anyone use that feature?  Would you?  No, you (and I) know about argc/v and
>how to use them.  The beginning C pgmr would soon learn the "right" way also,
>and use *it*.

Yes, I would.  (And I'm not a beginner, by any stretch of the imagination.)  If
argv[0] were available in a static location, I would have subroutines that use
it in error messages.  For example, an ideal implementation of "perror"
should always print the program name, optionally print an additional string
(often the name of a file that couldn't be opened), and the strerror text.
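
A minimal sketch of such an error routine follows. It assumes something
still stores argv[0] once at startup, since no standard static cell for it
exists; the names progname, set_progname and format_error are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical static cell for the program name; not part of any standard. */
static const char *progname = "unknown";

/* Call once, early, with argv[0]; keeps only the part after the last '/'. */
void set_progname(const char *argv0)
{
    const char *slash;

    assert(argv0 != NULL);
    slash = strrchr(argv0, '/');
    progname = (slash != NULL) ? slash + 1 : argv0;
}

/* Format "prog: what: strerror-text" (or "prog: strerror-text") into buf. */
int format_error(char *buf, size_t size, const char *what, int err)
{
    if (what != NULL && *what != '\0')
        return snprintf(buf, size, "%s: %s: %s", progname, what, strerror(err));
    return snprintf(buf, size, "%s: %s", progname, strerror(err));
}
```

After set_progname(argv[0]), any subroutine can call
format_error(buf, sizeof buf, filename, errno) and get the program name in
its message without having argv passed down to it.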

Perhaps an existing analogy would help.  Most UNIX implementations pass a
third argument, envp, to main.  I never use it.  Most of the time that I need
this value, I'm in a subroutine other than main, so I use the static cell
(environ) that's also provided.  Even if I do want to access it from main, I
prefer to use environ rather than envp, just for consistency.  (For similar
reasons, I prefer "exit" to "return" in main.)
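
Using the static cell from a subroutine can be sketched as below. This
assumes a POSIX-style system that provides environ. The function name
env_lookup is made up, and the loop only mirrors what getenv is described
as doing in this thread; it is not any library's actual source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

extern char **environ;  /* the static cell; POSIX systems provide it */

/* Return the value of NAME by scanning environ, or NULL if it is not set.
 * This mirrors what getenv() is described as doing; it is not libc's code. */
const char *env_lookup(const char *name)
{
    size_t len;
    char **ep;

    assert(name != NULL);
    len = strlen(name);
    if (environ == NULL)
        return NULL;
    for (ep = environ; *ep != NULL; ep++)
        if (strncmp(*ep, name, len) == 0 && (*ep)[len] == '=')
            return *ep + len + 1;
    return NULL;
}
```

Nothing here touches main or envp: env_lookup("HOME") works from any depth
in the call chain because environ is an ordinary global.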

Lest you misunderstand -- I see no reason to make all of argv global.  I
believe argv[1] through argv[argc-1] are, as their name implies, arguments to
main.  I think putting the program name into the argv[] array was a mistake;
it should have been a global in the first place, and the real arguments should
have been numbered starting at zero.  But it's too late to change that now.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint