[comp.unix.questions] How does a program get its path name?

Leisner.Henr@xerox.com (marty) (02/21/88)

How does an exec program get the pathname it was execed from if it wants to find
out this information?

(I'm specifically asking how cc knows to look at ../lib for the compiler
passes).

marty
ARPA:	leisner.henr@xerox.com
GV:  leisner.henr
NS:  martin leisner:henr801c:xerox

ugfailau@sunybcs.uucp (Fai Lau) (02/21/88)

In article <11923@brl-adm.ARPA> Leisner.Henr@xerox.com (marty) writes:
>
>(I'm specifically asking how cc knows to looks at ../lib for the compiler
>passes).
>
	All the relevant path names are hard-coded in the source code.
To change the paths (I think you mean /lib and not ../lib), all that
has to be done is to edit the code, recompile it, and put the
executable into /bin (or whatever).

Fai Lau
SUNY at Buffalo (The Arctic Wonderland)
UU: ..{rutgers,ames}!sunybcs!ugfailau
BI: ugfailau@sunybcs INT: ugfailau@joey.cs.buffalo.EDU

gwyn@brl-smoke.ARPA (Doug Gwyn ) (02/21/88)

In article <11923@brl-adm.ARPA> Leisner.Henr@xerox.com (marty) writes:
>How does an exec program get the pathname it was execed from if it wants to find
>out this information?

That information is not generally available to the process.

>(I'm specifically asking how cc knows to looks at ../lib for the compiler
>passes).

I don't know of any "cc"s that work like that.  Usually the pathnames
of the slave programs are hard-wired into the "cc" code, although
they're sometimes configurable via the makefile for cc when it's built.

shipley@web6b.berkeley.edu (Peter Shipley) (02/21/88)

In article <7304@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <11923@brl-adm.ARPA> Leisner.Henr@xerox.com (marty) writes:
>>How does an exec program get the pathname it was execed from if it wants to find
>>out this information?
>
>That information is not generally available to the process.
>
>>(I'm specifically asking how cc knows to looks at ../lib for the compiler
>>passes).
>
>I don't know of any "cc"s that work like that.  Usually the pathnames
>of the slave programs are hard-wired into the "cc" code, although
>they're sometimes configurable via the makefile for cc when it's built.

I thought that the path came from the user's environment 
variable PATH.

Pete Shipley: 
email:   shipley@violet.berkeley.edu     Flames:  cc-29@cory.berkeley.edu 
         ucbvax!violet!shipley                    ucbvax!cory!cc-29
Spelling corections: /dev/null                    Quote: "Anger is an energy"

turtle@sdsu.UUCP (Andrew Scherpbier) (02/22/88)

In article <11923@brl-adm.ARPA> Leisner.Henr@xerox.com (marty) writes:
>How does an exec program get the pathname it was execed from if it wants to find
>out this information?
>
>(I'm specifically asking how cc knows to looks at ../lib for the compiler
>passes).
>
>marty

When a program executes, the full path to the executable file is kept in
the zero-th argument.  If you have a declaration of main which looks like this:
	main(argc,argv)
	int	argc;
	char	*argv[];

then argv[0] is a pointer to the full path.
Is this what you were looking for?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~T~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"I sometimes get the feeling that   |  Andrew Scherpbier
 things cannot possibly get worse...|  Computer Science Department
 and sure enough, they do."         |  San Diego State University
	-Don Perkins Jr.            |  ihnp4!jack!sdsu!turtle

ugfailau@sunybcs.uucp (Fai Lau) (02/22/88)

In article <7102@agate.BERKELEY.EDU> shipley@widow.berkeley.edu (Peter Shipley) writes:
>In article <7304@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>>
>>That information is not generally available to the process.
>>
>>I don't know of any "cc"s that work like that.  Usually the pathnames
>>of the slave programs are hard-wired into the "cc" code, although
>>they're sometimes configurable via the makefile for cc when it's built.
>
>I thought that the path came from the user's environment 
>variable PATH.
>
	The environment variable enables the SHELL to find a file; it is not
there for one file to find another. An executable can, however, be hard-coded
to use the PATH from the environment, or be hard-coded to find a file
through a specific path. It all depends on the program itself. The
discussion's focus is on how cc.c knows that all the slave programs are
in /lib. And the explanation is that the path plus the name of each of
the slave programs is hard-coded (defined as pointers) in cc.c.

Fai Lau
SUNY at Buffalo (The Arctic Wonderland)
UU: ..{rutgers,ames}!sunybcs!ugfailau
BI: ugfailau@sunybcs INT: ugfailau@joey.cs.buffalo.EDU

gwyn@brl-smoke.ARPA (Doug Gwyn ) (02/22/88)

In article <7102@agate.BERKELEY.EDU> shipley@widow.berkeley.edu (Peter Shipley) writes:
>In article <7304@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>>I don't know of any "cc"s that work like that.  Usually the pathnames
>>of the slave programs are hard-wired into the "cc" code, although
>>they're sometimes configurable via the makefile for cc when it's built.
>I thought that the path came from the user's environment 
>variable PATH.

Oh no -- that would obviously be a disaster.  PATH is used to locate
command executable files when no "/" is present in their names as
specified to the shell (or exec*p() function).  This determines, for
instance, whether you will execute /bin/cc or /usr/5bin/cc on a dual-
environment Berkeley-based system such as SunOS 3.2, when you specify
just "cc".  However, each of the "cc" executables knows where to find
its own subprocess executable files (in this example, both may use
/lib/c0 etc.).

shirono@grasp.cis.upenn.edu (Roberto Shironoshita) (02/22/88)

In article <2933@sdsu.UUCP> turtle@sdsu.UCSD.EDU (Andrew Scherpbier) writes:
> [Leisner.Henr@xerox.com's original article, <11923@brl-adm.ARPA>,]
> [asking how cc knows to look at /lib for the passes              ]
>
> When a program executes, the full path to the executable file is
> kept in the zero-th argument.

I'm sorry, but this is plain and simply an overstatement.  It may be
right if the command was the full path, but not necessarily otherwise.

> If you have a declaration of main which looks like this:
>	main(argc,argv)
>	int	argc;
>	char	*argv[];
>
> then argv[0] is a pointer to the full path.

argv[0] contains whatever was passed as argv[0] to the exec family of
system calls.  According to AT&T's manual for SVR2 (of 04/84), the
convention is that it be either the full pathname or its last
component.  The behavior I've seen on both /bin/csh and /bin/sh under
Ultrix 2.0 and HCX/UX 3.0 (with universe both bsd and att) is that it
is whatever came as the command.  If you specify "cc" as your command,
strcmp (argv[0], "cc") == 0.

Thus, you can't say that argv[0] IS a pointer to the full path.  You
can't even say that it isn't (sigh!).
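
A minimal sketch of the point above, assuming a POSIX system where /bin/sh
reports its own argv[0] as $0 when run with -c and no extra operands. The
helper name run_with_argv0 is hypothetical. The first argument to execl()
names the file to run; the second becomes argv[0], and nothing forces the
two to agree.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run /bin/sh with a caller-chosen argv[0] and capture what the child
 * reports as $0.  Returns 0 on success, -1 on failure.
 * (run_with_argv0 is a hypothetical helper, not part of any library.)
 */
int run_with_argv0(const char *argv0, char *out, size_t outlen)
{
    int fds[2];
    pid_t pid;
    ssize_t n;
    size_t used = 0;
    int status;

    if (outlen == 0 || pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: send stdout into the pipe, then exec the shell. */
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        /* "/bin/sh" is the file executed; argv0 is what it sees as argv[0]. */
        execl("/bin/sh", argv0, "-c", "printf '%s' \"$0\"", (char *)NULL);
        _exit(127);             /* only reached if execl() failed */
    }

    /* Parent: read whatever the child printed. */
    close(fds[1]);
    while (used < outlen - 1 &&
           (n = read(fds[0], out + used, outlen - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';
    close(fds[0]);

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling run_with_argv0("maiden-name", buf, sizeof buf) should leave the
invented name in buf even though the file executed was /bin/sh.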

                                   Roberto Shironoshita

-------------------------------------------------------------------------
@@@@@@@@@\   Disclaimers:
 @@     @@   1 -  The opinions expressed here are my own.  The University
 @@     @@        need not share them, or even be aware of them.
 @@@@@@@@/   2 -  Like most humans, I'm bound to err at times.  I believe
 @@               what I have said, but agree that I may be wrong.
 @@
@@@@         Internet: shirono@grasp.cis.upenn.edu

cjc@ulysses.homer.nj.att.com (Chris Calabrese[rs]) (02/23/88)

In article <2933@sdsu.UUCP>, turtle@sdsu.UUCP writes:
> 
> When a program executes, the full path to the executable file is kept in
> the zero-th argument.  If you have a declaration of main which looks like this:
> 	main(argc,argv)
> 	int	argc;
> 	char	*argv[];
> 
> then argv[0] is a pointer to the full path.
> Is this what you were looking for?

This is true only if the program was not found in the
current working directory; if it was, argv[0] will contain
only the name of the program, not the full path.  Of course,
you can test whether argv[0][0] != '/' and call getcwd()
to find the full path name.
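
A rough sketch of that test, with the hypothetical helper absolutize()
taking the working directory as a parameter so it can be exercised without
touching the process state. It assumes argv[0] really is the name the
program was invoked by, which is only a convention, and it gives up when
argv[0] has no '/' at all, since such a name was presumably found through a
PATH search and the working directory says nothing about it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build an absolute name from argv0 and the working directory cwd.
 * Returns 0 on success, -1 if argv0 contains no '/' or out is too small.
 * (absolutize is a hypothetical helper name.)
 */
int absolutize(const char *argv0, const char *cwd, char *out, size_t outlen)
{
    int n;

    if (argv0[0] == '/')
        n = snprintf(out, outlen, "%s", argv0);          /* already absolute */
    else if (strchr(argv0, '/') != NULL)
        n = snprintf(out, outlen, "%s/%s", cwd, argv0);  /* relative to cwd */
    else
        return -1;                                       /* bare name: unknown */

    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

In a real program cwd would come from getcwd(buf, sizeof buf), called at the
top of main() before any chdir().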

	Christopher Calabrese
	AT&T Bell Laboratories
	ulysses!cjc

jws@hpcllf.HP.COM (John Stafford) (02/23/88)

The name a program was invoked under is in argv[0]; that may (or more
likely may not) contain the "full" path name.

kent@tifsie.UUCP (Russell Kent) (02/24/88)

in article <10106@ulysses.homer.nj.att.com>, cjc@ulysses.homer.nj.att.com (Chris Calabrese[rs]) says:
> 
> In article <2933@sdsu.UUCP>, turtle@sdsu.UUCP writes:
>> 
>> When a program executes, the full path to the executable file is kept in
>> the zero-th argument.  If you have a declaration of main which looks like this:
>> 	main(argc,argv)
>> 	int	argc;
>> 	char	*argv[];
>> 
>> then argv[0] is a pointer to the full path.
>> Is this what you were looking for?
> 
> This is true only if the program was not found in the
> current working directory, in which case argv[0] will contain
> only the name of the program, not the full path.  Of course,
> you can test to see if argv[0][0] != '/' and getcwd()
> to find the full path name.
> 
> 	Christopher Calabrese

Unfortunately, neither of these statements is entirely correct.  That
argv[0] contains any portion of the pathname of the process's executable
file is _merely_ convention on the part of _most_ shells (sh, rsh, csh, ksh,
and tcsh).  It is perfectly ok to put your mother's maiden name in argv[0]
of a process you are about to exec, although this is obviously not very
useful.

The convention that _most_ shells follow is that argv[0] contains the
same text as the first blank-separated field of the command line after
any aliasing takes place.  This means that:

    Command			Argv[0]
    cc				cc
    /bin/cc			/bin/cc
    alias rm /bin/rm -i		<internal command>
    rm jack			/bin/rm
    \rm jill			rm

The astute reader may counter: "But if I do a /bin/cp with no parameters,
the computer comes back with 'Usage: cp f1 f2', not 'Usage: /bin/cp f1 f2'."
This is because some programs intentionally use only the last portion of
the path given in argv[0] (using "argv0 = strrchr (argv[0], '/');").
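
As a small sketch of that idiom (base_name is a hypothetical helper name):
the standard strrchr() returns a pointer to the last '/' itself, or NULL
when there is none, so a program wanting just "cp" has to step past the
slash and handle the no-slash case.

```c
#include <string.h>

/*
 * Return the last component of argv0: the text after the final '/',
 * or argv0 itself when it contains no '/'.
 */
const char *base_name(const char *argv0)
{
    const char *slash = strrchr(argv0, '/');

    return slash != NULL ? slash + 1 : argv0;
}
```

A usage message would then be printed with base_name(argv[0]) rather than
argv[0].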

This convention is actually implemented in the shell through the use of
exec*p();  if you use execl() you can bypass it.
-- 
Russell Kent                    Phone: +1 214 995 3501
Texas Instruments               UUCP address:
P.O. Box 655012   MS 3635       ...!convex!smu!tifsie!kent
Dallas, TX 75265                ...!ut-sally!im4u!ti-csl!tifsie!kent

fnf@fishpond.UUCP (Fred Fish) (02/24/88)

In article <7304@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>I don't know of any "cc"s that work like that.  Usually the pathnames
>of the slave programs are hard-wired into the "cc" code, although
>they're sometimes configurable via the makefile for cc when it's built.

One technique that I have found very useful, and it only takes about a half
day to implement, is to define a "SGS tree" that looks something like the
following, with indentation showing the tree structure:

		ROOTDIR
			bin
				cc
				as
				ld
			lib
				cpp
				c0
				c1
				crt0.o
				libc.a
			usr
				include
					stdio.h
					...
					sys

Let ROOTDIR be any arbitrary directory, which if a particular executable
is invoked with a pathname beginning with '/', can be discovered simply
by picking it out of argv[0].  If for example, cc finds it was invoked
as "/tools/bin/cc", then it sets ROOTDIR to "/tools", and then prepends
this ROOTDIR to the names of the executables it expects to exec,
"bin/ld", "bin/as", etc.  The default ROOTDIR is "/", so everything
works as expected if the executables live in their "normal" homes.
Also, cpp must be similarly modified to let it find ROOTDIR/usr/include,
and ld must be modified to let it find ROOTDIR/lib/libc.a for "-lc".

This scheme has the advantage that the binaries dynamically adjust to
their location in the filesystem tree, and can be moved at will.  The only
serious disadvantage is that if they don't live in their normal places,
then they must always be invoked with their full pathnames, but a shell
alias usually takes care of that problem.  In addition, they MUST NOT
be found via $PATH searches unless they live in their normal places
(/bin/cc for example) because they will then pick up the wrong files 
implied by ROOTDIR, which is "/" by default.  All in all, I have found
the ease of maintaining multiple SGS's on a system, and renaming or
moving them at will, to outweigh these minor gotchas.
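
The ROOTDIR rule described above might be sketched as follows. This is an
illustration under stated assumptions, not the code from any actual cc:
derive_rootdir and root_path are hypothetical names, and the sketch assumes
the executable is invoked as ROOTDIR/bin/<name>, falling back to "/" in
every other case.

```c
#include <stdio.h>
#include <string.h>

/*
 * Derive ROOTDIR from argv0.  "/tools/bin/cc" yields "/tools"; anything
 * that is not an absolute name of the form ROOTDIR/bin/<name> yields "/".
 * Returns 0 on success, -1 if out is too small.
 */
int derive_rootdir(const char *argv0, char *out, size_t outlen)
{
    const char *last = strrchr(argv0, '/');
    size_t len;

    if (outlen < 2)
        return -1;
    strcpy(out, "/");                       /* the default ROOTDIR */

    if (argv0[0] != '/' || last == NULL)
        return 0;                           /* not an absolute name */

    len = (size_t)(last - argv0);           /* directory part, e.g. "/tools/bin" */
    if (len < 4 || strncmp(last - 4, "/bin", 4) != 0)
        return 0;                           /* not invoked from a bin directory */

    len -= 4;                               /* drop the trailing "/bin" */
    if (len == 0)
        return 0;                           /* "/bin/cc": ROOTDIR stays "/" */
    if (len >= outlen)
        return -1;
    memcpy(out, argv0, len);
    out[len] = '\0';
    return 0;
}

/* Prepend rootdir to a relative name such as "lib/c0" or "bin/ld". */
int root_path(const char *rootdir, const char *rel, char *out, size_t outlen)
{
    const char *sep = (strcmp(rootdir, "/") == 0) ? "" : "/";
    int n = snprintf(out, outlen, "%s%s%s", rootdir, sep, rel);

    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

With argv[0] equal to "/tools/bin/cc" this gives "/tools" and then
"/tools/bin/ld"; with a bare "cc" it gives "/" and "/bin/ld", which is the
gotcha about $PATH searches mentioned above.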

Incidentally, this feature seems to have survived in /bin/cc long enough to
make it into A/UX, but somewhere along the way cpp and ld lost it.

-Fred




		   
-- 
# Fred Fish    hao!noao!mcdsun!fishpond!fnf     (602) 921-1113
# Ye Olde Fishpond, 1346 West 10th Place, Tempe, AZ 85281  USA

palmer@hsi.UUCP (Mike Palmer) (03/07/88)

In article <4210002@hpcllf.HP.COM>, jws@hpcllf.HP.COM (John Stafford) writes:
> The name a program was invoked under is in argv[0]; that may (or more
> likely may not) contain the "full" path name.


this can be combined with the info obtained by calling the function
	getwd(pathname)
	char *pathname;

-- 
	Mike Palmer			{uunet,ihnp4,yale}!hsi!palmer
	Health Systems International
	New Haven, CT  06511

cwf@cbterra.ATT.COM (Cary W. Fitzgerald) (03/09/88)

In article <873@hsi.UUCP> palmer@hsi.UUCP (Mike Palmer) writes:
>In article <4210002@hpcllf.HP.COM>, jws@hpcllf.HP.COM (John Stafford) writes:
>> The name a program was invoked under is in argv[0]; that may (or more
>> likely may not) contain the "full" path name.
>
>this can be combined with the info obtained by calling the function
>	getwd(pathname)
>	char *pathname;

What if the program is on your PATH?  You'll have to simulate the shell's
behavior to get the path name.  The PATH variable must, of course, be exported,
so you'll be assured of being able to look at it.

If the command was invoked through an alias, command substitution will already
have been done, so you don't have to consider that case.
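
Simulating the shell's search could look roughly like the sketch below. The
helper names are hypothetical and the PATH string is passed in as a
parameter; a real program would use getenv("PATH"). It still assumes argv[0]
is the name the user typed and that PATH has not changed since the shell did
its own lookup, neither of which is guaranteed.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* True if p names a regular file the caller may execute. */
static int is_executable_file(const char *p)
{
    struct stat st;

    return stat(p, &st) == 0 && S_ISREG(st.st_mode) && access(p, X_OK) == 0;
}

/*
 * Look for name in the colon-separated list path, the way a shell would.
 * On success returns 0 and leaves the matching name in out; otherwise
 * returns -1 (out may then hold the last candidate tried).
 */
int find_in_path(const char *name, const char *path, char *out, size_t outlen)
{
    const char *p = path;
    int n;

    if (strchr(name, '/') != NULL) {
        /* Names containing '/' are used as given; PATH is not consulted. */
        n = snprintf(out, outlen, "%s", name);
        return (n >= 0 && (size_t)n < outlen && is_executable_file(out)) ? 0 : -1;
    }
    if (path == NULL)
        return -1;

    for (;;) {
        const char *end = strchr(p, ':');
        size_t dlen = end != NULL ? (size_t)(end - p) : strlen(p);

        if (dlen == 0)      /* an empty entry means the current directory */
            n = snprintf(out, outlen, "./%s", name);
        else
            n = snprintf(out, outlen, "%.*s/%s", (int)dlen, p, name);

        if (n >= 0 && (size_t)n < outlen && is_executable_file(out))
            return 0;
        if (end == NULL)
            break;
        p = end + 1;
    }
    return -1;
}
```

The result is only as absolute as the PATH entry that matched; a relative
entry still has to be combined with the working directory.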


Cary.

mike@turing.UNM.EDU (Michael I. Bushnell) (03/09/88)

Now, I can see a better, more complete solution than all the previous ones.

Just 

   1: Take a look in the core map and find out the inode of
      the file you are executing.
   2: Search the hierarchy looking for that inode. 

Of course, (2) can be made much shorter by first checking out
a bunch of likely locations, and then resorting to searching
the whole thing.

Further, you could look at files with the same name as argv[0] only, 
but I don't think that would give you any improvement over just
stat-ing them all.
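
The comparison at the heart of step (2) can be sketched with stat(). How the
inode of the running executable is obtained in step (1) is system-specific
and not shown; the sketch simply assumes the device and inode numbers are
already in hand, and matches_inode is a hypothetical name. Inode numbers are
only unique within one filesystem, so the device number has to match too.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Return 1 if candidate names the file identified by (dev, ino),
 * 0 if it does not or cannot be stat-ed.
 */
int matches_inode(const char *candidate, dev_t dev, ino_t ino)
{
    struct stat st;

    if (stat(candidate, &st) == -1)
        return 0;
    return st.st_dev == dev && st.st_ino == ino;
}
```

A file with several hard links will match under more than one name, so even
an exhaustive search can only return a pathname, not necessarily the one
that was used.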

Just remember
    "If you have a hard UNIX problem, a kernel dive is the answer."

				Michael I. Bushnell
				mike@turing.unm.edu
				{ucbvax,gatech}!unmvax!turing!mike

				HASA -- "A" division

mouse@mcgill-vision.UUCP (der Mouse) (03/13/88)

In article <2933@sdsu.UUCP>, turtle@sdsu.UUCP (Andrew Scherpbier) writes:
> In article <11923@brl-adm.ARPA> Leisner.Henr@xerox.com (marty) writes:
>> How does an exec program get the pathname it was execed from if it
>> wants to find out this information?

In general, it can't.

> When a program executes, the full path to the executable file is kept
> in the zero-th argument.  If you have a declaration of main which
> looks like this:
>	main(argc,argv)	int argc; char *argv[];
> then argv[0] is a pointer to the full path.

Not at all.  argv[0] is merely whatever was passed to the exec-family
routine that executed the program.  The shell conventionally passes
whatever name the program was called by in this field, but this (a) is
not always a full pathname and (b) is not a universal convention in the
first place.  If a program wants, it can run your program and make
argv[0] be "/bin/sh", or "Jan 23 1988, 11:44:82.35 GMT", or whatever it
pleases.

					der Mouse

			uucp: mouse@mcgill-vision.uucp
			arpa: mouse@larry.mcrcim.mcgill.edu