[comp.lang.c] Contents of argv0

jls@attctc.Dallas.TX.US (Jerome Schneider) (08/15/89)

Both K&R and H&S document the argv[0] string as "the name of the program".
This definition doesn't seem to mandate (or exclude) the presence of a path
component in the string.  A few  *nix programs use rindex() or strrchr() to
scan for '/', implying (to me at least) that _some_ implementations consider
"name" to include such information.  The DOS Microsoft C (5.x) startup code
constructs the full path name for argv[0] (in upper case, no less!).  
 
Are there any (proposed) standards for argv[0] syntax?  If not, should a
_portable_ application always rindex() argv[0] for a path delimiter before
optionally (under DOS and OS half) converting the name to_lower()?
 
-- 
Jerome Schneider              UUCP: killer.DALLAS.TX.US!jls (guest account)
Aspen Technology Group        Ft. Collins, CO    Voice: (303) 484-8466

cpcahil@virtech.UUCP (Conor P. Cahill) (08/15/89)

In article <9002@attctc.Dallas.TX.US>, jls@attctc.Dallas.TX.US (Jerome Schneider) writes:
> Both K&R and H&S document the argv[0] string as "the name of the program".
> This definition doesn't seem to mandate (or exclude) the presence of a path
> component in the string.  A few  *nix programs use rindex() or strrchr() to
> scan for '/', implying (to me at least) that _some_ implementations consider
> "name" to include such information.

On most, if not all, unix systems argv[0] will contain the path used to 
execute the program (i.e. the first argument to an execl()).  This may be
a relative path, a full path and/or a simple name.  Your code should handle
all cases.  (this would also make it portable to other systems that do not
pass in the path).  On some older versions of Microsoft C, argv[0] was always
empty (i.e. char * 0) because that information was not available from DOS.

davidsen@sungod.crd.ge.com (ody) (08/15/89)

In article <9002@attctc.Dallas.TX.US> jls@attctc.Dallas.TX.US (Jerome Schneider) writes:

| Are there any (proposed) standards for argv[0] syntax?  If not, should a
| _portable_ application always rindex() argv[0] for a path delimiter before
| optionally (under DOS and OS half) converting the name to_lower()?

  If the program is portable you would have no way to know what the
path delimiter is... since the set includes "/" for UNIX, "\" for
MS-DOS, ":[]" for VMS, ":<>" for TOPS, etc. Since some systems allow
ugly things like "$#_%" in filenames, you are better off not trying to
identify all the possibilities.

  You could put in some conditional code to do things for common
operating systems if you wanted to display the last level of the name,
and if you were being portable you would use strrchr (I don't see rindex
in the ANSI standard, at least in the index or 4.11).
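
A minimal sketch of such conditional code, using strrchr() only. The
macro tests, the PATH_DELIMS sets and the name last_component() are
illustrative assumptions, not taken from any standard or compiler:

```c
#include <string.h>

/* Assumed delimiter sets; real systems may need more than this. */
#if defined(MSDOS) || defined(_WIN32)
#define PATH_DELIMS "\\/:"
#else
#define PATH_DELIMS "/"
#endif

/* Return a pointer to the text after the last occurrence of any
 * character in `delims`, or `path` itself if none occurs. */
const char *last_component(const char *path, const char *delims)
{
    const char *best = path;
    const char *d;

    for (d = delims; *d != '\0'; d++) {
        const char *p = strrchr(path, *d);
        if (p != NULL && p + 1 > best)
            best = p + 1;
    }
    return best;
}
```

A caller would write last_component(argv[0], PATH_DELIMS) after checking
that argv[0] is usable at all.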
	bill davidsen		(davidsen@crdos1.crd.GE.COM)
  {uunet | philabs}!crdgw1!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

henry@utzoo.uucp (Henry Spencer) (08/15/89)

In article <9002@attctc.Dallas.TX.US> jls@attctc.Dallas.TX.US (Jerome Schneider) writes:
>Both K&R and H&S document the argv[0] string as "the name of the program".
>This definition doesn't seem to mandate (or exclude) the presence of a path
>component in the string.  A few  *nix programs use rindex() or strrchr() to
>scan for '/', implying (to me at least) that _some_ implementations consider
>"name" to include such information...

Unix will generally give you the name the user referred to the program by,
which usually does not contain a path component but sometimes does.

>Are there any (proposed) standards for argv[0] syntax? ...

X3J11 (Oct 88 draft, but the content hasn't changed significantly) says:

1. Argc might be 0, in which case there need not be an argv[0].

2. For argc>0, argv[0] must be a modifiable pointer to a modifiable string.

3. For argc>0, argv[0] points to a string that "represents the program name",
	unless the program name is not available, in which case it's the
	empty string.  If dual-case strings are not available, the string
	shall be in lowercase.

Period.  All this applies only to a "hosted" environment; in a "freestanding"
environment, e.g. a program running a microwave oven, all bets are off.

POSIX 1003.1 constrains argv[0] to exist [I think] and point to "a filename".
Beyond that, POSIX just more or less says that argv[0] gets whatever
exec* gave for the 0th argument.  On that happy day when we actually
see a 1003.2 standard, it may say something more specific about what
argv[0] is for a program run by a standard shell.
-- 
V7 /bin/mail source: 554 lines.|     Henry Spencer at U of Toronto Zoology
1989 X.400 specs: 2200+ pages. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

bob@wyse.wyse.com (Bob McGowen Wyse Technology Training) (08/16/89)

In article <1681@crdgw1.crd.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes:
>In article <9002@attctc.Dallas.TX.US> jls@attctc.Dallas.TX.US (Jerome Schneider) writes:
>
>| Are there any (proposed) standards for argv[0] syntax?  If not, should a
>| _portable_ application always rindex() argv[0] for a path delimiter before
>| optionally (under DOS and OS half) converting the name to_lower()?
>
>  If the program is portable you would have no way to know what the
>path delimiter is... since the set includes "/" for UNIX, "\" for
>MS-DOS, ":[]" for VMS, ":<>" for TOPS, etc. Since some systems allow
---some deleted---

How about proposing a new function to be called basename(), obviously
coded for the environment which the compiler was running under?  It
would return a pointer to a string which would be the name of the
program.

For example, in shell programming where root (with no current dir in
the PATH) must run the script as {absolute|relative path}name, $0 will
include the path used to invoke the script.  I have started using the
basename command to guarantee that the name I use in usage/error
messages is just the name of the script.

Bob McGowan  (standard disclaimer, these are my own ...)
Customer Education, Wyse Technology, San Jose, CA
..!uunet!wyse!bob
bob@wyse.com

gandalf@csli.Stanford.EDU (Juergen Wagner) (08/16/89)

In article <1017@virtech.UUCP> cpcahil@virtech.UUCP (Conor P. Cahill) writes:
>
>On most, if not all, unix systems argv[0] will contain the path used to 
>execute the program (i.e. the first argument to an execl()).  This may be
>a relative path, a full path and/or a simple name.  Your code should handle
>all cases.  (this would also make it portable to other systems that do not
>pass in the path).  On some older versions of Microsoft C, argv[0] was always
>empty (i.e. char * 0) because that information was not available from DOS.

The first argument to an execl() is the path name of the program to be
executed. The arguments starting with the second are what the program will
get as an argv[]. This effectively means that in general, argv[0] cannot
be treated as a reliable source for the path name! [RTFM]

You have to distinguish between features provided by UNIX (in general) and
some UNIX (in particular). The execve system call makes it quite clear that
there doesn't have to be any relation whatsoever between the pathname of the
program being invoked and the argument vector argv[]. Check the man page for
execve().

Juergen Wagner		   			gandalf@csli.stanford.edu
						 wagner@arisia.xerox.com

cpcahil@virtech.UUCP (Conor P. Cahill) (08/16/89)

In article <10094@csli.Stanford.EDU>, gandalf@csli.Stanford.EDU (Juergen Wagner) writes:
> The first argument to an execl() is the path name of the program to be
> executed. The arguments starting with the second are what the program will
> get as an argv[]. This effectively means that in general, argv[0] cannot
> be treated as a reliable source for the path name! [RTFM]
> 
I did RTFM many times for many flavors of unix for many of the past years.
What I meant to say was "argv[0] will NORMALLY contain the path".  The example
was not meant to say that argv[0] would have the first argument of the execl().
The example "the first argument to an execl()" was an example for the path 
used to execute the program.

> You have to distinguish between features provided by UNIX (in general) and
> some UNIX (in particular). The execve system call makes it quite clear that
> there doesn't have to be any relation whatsoever between the pathname of the
> program being invoked and the argument vector argv[]. Check the man page for
> execve().

My intent was to say that argv[0] will normally contain a path, but to be
portable the code should handle anything (hence the sample for MSC).

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/16/89)

In article <9002@attctc.Dallas.TX.US> jls@attctc.Dallas.TX.US (Jerome Schneider) writes:
>Are there any (proposed) standards for argv[0] syntax?

IEEE 1003.2 may have something to say about it, but the only portable-C
specification is that IF argc is greater than 0 AND IF argv[0][0] is not
the null character THEN argv[0] represents the "program name".  As you
see, a program name need not even be supplied by the implementation.

>If not, should a _portable_ application always rindex() argv[0] for a
>path delimiter before optionally (under DOS and OS half) converting the
>name to_lower()?

I don't think a "_portable_" application should use rindex() at all,
nor should it assume that tolower() (not to_lower()) is a meaningful
thing to do on environment-supplied text.  If you're going to display
argv[0] (which is about the only meaningful thing you can do with it),
why not just print the whole thing.  If you really need to keep the
name short and happen to KNOW how to parse pathnames into components,
then of course you could do that and print the result.  The function
used to do that is traditionally named simple().
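
The rule above (argc greater than 0 and argv[0][0] not the null
character) might be sketched as follows. The name program_name() and
its fallback parameter are hypothetical, and the NULL checks are extra
caution beyond what the rule itself states:

```c
#include <stddef.h>

/* Return argv[0] if it carries a program name, otherwise `fallback`.
 * argv[0] is trusted only when argc > 0 and its first character is
 * not '\0'; the NULL tests are defensive additions. */
const char *program_name(int argc, char **argv, const char *fallback)
{
    if (argc > 0 && argv != NULL && argv[0] != NULL && argv[0][0] != '\0')
        return argv[0];
    return fallback;
}
```

An error message would then print program_name(argc, argv, "prog")
whole, with no attempt to parse it.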

net@tub.UUCP (Oliver Laumann) (08/16/89)

In article <1017@virtech.UUCP> cpcahil@virtech.UUCP (Conor P. Cahill) writes:
> On most, if not all, unix systems argv[0] will contain the path used to 
> execute the program (i.e. the first argument to an execl()).  This may be
> a relative path, a full path and/or a simple name.  Your code should handle
> all cases.

You should also anticipate the case that argv[0] is not there (i.e. argc
is zero).   It is perfectly valid to execute a program like this:

    execl (your_program, (char *)0);

The manual only says that ``by convention'' at least one argument must
be passed to the program.  However, this is not enforced.  For a good
laugh compile the following program:

    #include <stdio.h>
    #include <unistd.h>
    int main (int ac, char **av) {
	execl (av[1], (char *)0);	/* empty argument list, not even argv[0] */
	perror ("exec");
	return 1;
    }

and then try, for instance,

    % a.out /bin/ls
    % a.out /usr/ucb/mail
    % a.out /bin/csh
    % a.out /usr/ucb/vi
    % a.out /bin/size

Regards,
--
Oliver Laumann              net@TUB.BITNET              net@tub.UUCP

chris@mimsy.UUCP (Chris Torek) (08/16/89)

In article <10743@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>... If you're going to display argv[0] (which is about the only meaningful
>thing you can do with it), why not just print the whole thing.

I went around with this several times myself when installing a general
error-message facility in our local C library, and came up with the
same answer: print the whole thing.  The only bad effect is that
occasionally someone will see more detail than needed.  Eliding all
but the `program name' part (on Unix, all but the last path component)
has the bad effect that occasionally someone will miss detail that
was needed (e.g., whether the error came from the new experimental
version of a program or the old known-buggy version or. . .).
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

peter@ficc.uu.net (Peter da Silva) (08/16/89)

File names can be most grievously complex. I've run into the following:

	/path/file
	volume:file.type
	//machine/path/file
	machine#qualifier*file.element/type
	volume:path\file.type
	machine::volume:[path]file.type
	volume:path/file
	:volume:path/file.type
	volume:[gid,uid]file.type
	machine:/path/file
	/volume/path/file
-- 
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.
Business: peter@ficc.uu.net, +1 713 274 5180. | "The sentence I am now
Personal: peter@sugar.hackercorp.com.   `-_-' |  writing is the sentence
Quote: Have you hugged your wolf today?  'U`  |  you are now reading"

davidsen@sungod.crd.ge.com (ody) (08/16/89)

In article <2364@wyse.wyse.com> bob@wyse.UUCP (Bob McGowen Wyse Technology Training) writes:

| How about proposing a new function to be called basename(), obviously
| coded for the environment which the compiler was running under?  It
| would return a pointer to a string which would be the name of the
| program.

  This sounds good to me. It is a low effort procedure to write, making
it unlikely to be opposed by any vendor on the grounds of excessive
cost. It provides a nice standard way to do something which is commonly
useful, and at the calling level it is easily specified in a portable
way. When do we form the C95 committee?

	const char *basename(path)
	const char *path;
	
	basename returns a pointer to a static buffer which holds
	the filename portion of the path pointed to by the path
	argument. This is a pointer to an internal buffer and the data
	must be copied if it is to be preserved beyond the next call to
	basename. 

Please note: I have formalized bob's excellent suggestion this way
because some vendors place the filename in the middle of the path, thus
defeating any effort to point to the filename in the original string.
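
For Unix-style paths only, the specification above could be sketched as
below. The name path_basename() (chosen so it cannot collide with any
library basename()), the buffer size, and the silent truncation of long
names are all assumptions of this sketch, not part of the proposal:

```c
#include <string.h>

#define NAME_BUF_SIZE 256   /* assumed size, not from the proposal */

/* Unix-only sketch: copy the text after the last '/' into a static
 * buffer and return it. The buffer is overwritten by the next call.
 * Names that do not fit are truncated. */
const char *path_basename(const char *path)
{
    static char buf[NAME_BUF_SIZE];
    const char *slash = strrchr(path, '/');
    const char *name = (slash != NULL) ? slash + 1 : path;

    strncpy(buf, name, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    return buf;
}
```

A version for another system would keep the same interface and replace
only the search for the start of the name.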

Example, a typical VMS filename:
	CAOVAX::Dra0:[mimsey.bin.special]zoo.exe;4
                                         ^^^^^^^
                                That's the filename!
	bill davidsen		(davidsen@crdos1.crd.GE.COM)
  {uunet | philabs}!crdgw1!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

peter@ficc.uu.net (Peter da Silva) (08/17/89)

You don't want basename(). You want something like:

	parse_file(name, buffer)
	char *name;
	struct filename *buffer;

		Parses the elements of name into the buffer. name will
		be modified as necessary to null-terminate the elements
		of buffer. Returns the actual number of elements found
		in the name... missing or meaningless elements will contain
		null pointers.

	int build_file(buffer, filename);
	char *buffer;
	struct filename *filename;

		Converts the filename into a character string acceptable
		to the host operating system. Missing elements will be
		defaulted or ignored. Meaningless elements will be
		ignored. Returns the length of the resulting name.

With:

struct filename {
	char *machine;
	char *volume;
	char *project;
	char *user;
	char *path[MAXPATH];
	char *filename;
	char *extension;
	char *version;
};

On UNIX this would extract path and filename. On MS-DOS this would extract
the volume, path, filename, and extension. On RSX this would extract the
volume, project, user, filename, and extension. And so on...

There should also be an additional function:

	struct filename *file_defaults();

Which just returns a filename containing the default values for the current
host, current directory, and so on, with nulls in meaningless elements.

Any elements I'm missing?
-- 
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.
Business: peter@ficc.uu.net, +1 713 274 5180. | "The sentence I am now
Personal: peter@sugar.hackercorp.com.   `-_-' |  writing is the sentence
Quote: Have you hugged your wolf today?  'U`  |  you are now reading"

mcdaniel@astroatc.UUCP (Tim McDaniel) (08/17/89)

In article <5712@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>File names can be most grievously complex. I've run into the following:
>	machine#qualifier*file.element/type

I don't have an EXEC 8 (UNIVAC 1100-series) manual, so this is from
memory.  I recall the maximal syntax as
	qualifier*file(version)/rdpswd/wrpswd.element(version)/type

Peter's basic point, however, stands.
-- 

\    Tim McDaniel		Astronautics		Madison, WI
 \   Internet: mcdaniel%astroatc.UUCP@cs.wisc.edu
/ \  USENET:   ...!ucbvax!uwvax!astroatc!mcdaniel

guido@piring.cwi.nl (Guido van Rossum) (08/17/89)

chris@mimsy.UUCP (Chris Torek) writes:

>In article <10743@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>same answer: print the whole thing.  The only bad effect is that
>occasionally someone will see more detail than needed.

Well, there's esthetics... I don't really like usage messages like this:

usage: /tmp_mnt/ober/ufs1/amoeba/guido/bin/dpv [-d] [-f funnytab] [+page] ditroff-output-file

Which is why I strip initial path components if I find them.  (The C code
I use happens to use a default if there is no argv[0], it is a null
pointer or an empty string, or ends in a slash, which is probably not a
big loss.  My shell scripts use `basename $0`.)

I also seem to remember that some shells (csh?) prefix argv[0] with the
directory in $PATH where the command was found, and some don't.

--
Guido van Rossum, Centre for Mathematics and Computer Science (CWI), Amsterdam
guido@cwi.nl or mcvax!guido or guido%cwi.nl@uunet.uu.net
"Repo man has all night, every night."

hascall@atanasoff.cs.iastate.edu (John Hascall) (08/17/89)

In article <1705> davidsen@crdos1.UUCP (bill davidsen) writes:
}In article <2364> bob@wyse.UUCP (Bob McGowen Wyse Technology Training) writes:
 
}	const char *basename(const char *path)
 
	   I like it!

}Example, a typical VMS filename:
}	CAOVAX::Dra0:[mimsey.bin.special]zoo.exe;4
}                                         ^^^^^^^
}                                That's the filename!

      Uh, sorry.  "zoo" is the filename, "exe" is the filetype.

John Hascall

usenet@cps3xx.UUCP (Usenet file owner) (08/17/89)

in article <19112@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) says:
> Keywords: start-up code, argv specifications
> In article <10743@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>>... If you're going to display argv[0] (which is about the only meaningful
>>thing you can do with it), why not just print the whole thing.
    ^
    +--- Counterpoint:

ex, vi, and view are all the same binary. They look at argv[0] to decide
which one to do.


Also, compress and uncompress do the same thing.
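
A sketch of how a program might dispatch on argv[0] in this way. This is
not the actual ex/vi source; the enum, the function name
mode_from_argv0() and the Unix-only '/' handling are assumptions made
for illustration:

```c
#include <string.h>

enum mode { MODE_EX, MODE_VI, MODE_VIEW, MODE_UNKNOWN };

/* Pick a mode from the last '/'-separated component of argv0.
 * Unix-style paths are assumed; a NULL argv0 yields MODE_UNKNOWN. */
enum mode mode_from_argv0(const char *argv0)
{
    const char *slash, *name;

    if (argv0 == NULL)
        return MODE_UNKNOWN;
    slash = strrchr(argv0, '/');
    name = (slash != NULL) ? slash + 1 : argv0;

    if (strcmp(name, "ex") == 0)
        return MODE_EX;
    if (strcmp(name, "vi") == 0)
        return MODE_VI;
    if (strcmp(name, "view") == 0)
        return MODE_VIEW;
    return MODE_UNKNOWN;
}
```

The caller still needs a sensible default for MODE_UNKNOWN, since
argv[0] may be missing or may be anything the invoker chose to pass.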

John H. Lawitzke           UUCP: Work: ...uunet!frith!dale1!jhl
Dale Computer Corp., R&D         Home: ...uunet!frith!dale1!ipecac!jhl
2367 Science Parkway       Internet:   jhl@frith.egr.msu.edu
Okemos, MI, 48864                             [35.8.8.108]

maart@cs.vu.nl (Maarten Litmaath) (08/18/89)

usenet@cps3xx.UUCP (Usenet file owner) writes:
\...
\Also, compress and uncompress do the same thing.

Really? :-)
(OK, quoted out of context.)
-- 
  kilogram, n.: the amount of cocaine    |Maarten Litmaath @ VU Amsterdam:
         you can buy for $100K.          |maart@cs.vu.nl, mcvax!botter!maart

flaps@dgp.toronto.edu (Alan J Rosenthal) (08/19/89)

peter@ficc.uu.net (Peter da Silva) writes:
>	parse_file(name, buffer)
>	char *name;
>	struct filename *buffer;
>
>		Parses the elements of name into the buffer. name will
>		be modified as necessary to null-terminate the elements
>		of buffer. Returns the actual number of elements found
>		in the name... missing or meaningless elements will contain
>		null pointers.

Problem is, there aren't always characters available in the right places in
name to be overwritten with zeroes to terminate the various strings.  In unix,
consider the file name "/file".  The directory is "/", the file name is "file",
but there's nowhere to put the zero to terminate the string "/".  A directory
name of "" is not acceptable; it will work under some circumstances (namely,
when a slash and a file name is appended) but not in all.
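
The problem can be seen in a naive in-place splitter. split_in_place()
is a hypothetical stand-in for the Unix part of parse_file(), not
Peter's actual design:

```c
#include <string.h>

/* Naive in-place split: overwrite the last '/' with '\0'.
 * *dir is set to NULL when the name contains no slash at all. */
void split_in_place(char *name, char **dir, char **file)
{
    char *slash = strrchr(name, '/');

    if (slash == NULL) {
        *dir = NULL;
        *file = name;
    } else {
        *slash = '\0';      /* terminates the directory part */
        *dir = name;
        *file = slash + 1;
    }
}
```

Given "/usr/file" this yields "/usr" and "file", but given "/file" the
directory comes back as "" rather than "/", because the only slash is
the one that had to be overwritten.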

ajr

davidsen@sungod.crd.ge.com (ody) (08/19/89)

In article <5722@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:

| struct filename {
| 	char *machine;
| 	char *volume;
| 	char *project;
| 	char *user;
| 	char *path[MAXPATH];
| 	char *filename;
| 	char *extension;
| 	char *version;
| };

	[ ... ]

| Any elements I'm missing?

Well, at least device... the C:foo in MS-DOS and the DRC0:file in VMS
all specify physical devices (which might be mapped in DOS, yes I know).
No one can come up with a struct containing everything which might be in
a path someday. All that was needed by the original poster was a way to
find out the "filename" as understood by the local filesystem.
	bill davidsen		(davidsen@crdos1.crd.GE.COM)
  {uunet | philabs}!crdgw1!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

darcy@bbm.UUCP (D'Arcy Cain) (08/19/89)

In article <5712@ficc.uu.net>, peter@ficc.uu.net (Peter da Silva) writes:
> File names can be most grievously complex. I've run into the following:
> 
> [examples]

That should be no problem for a basename routine.  The whole point of
the function is to tailor it to the OS/hardware it is running on.  It
is so trivial really that writing this one function on each system you
want to port to can simplify your programming.

D'Arcy J.M. Cain
(darcy\@bbm, cain!darcy@telly.on.ca)

peter@ficc.uu.net (Peter da Silva) (08/19/89)

In article <1989Aug18.130710.13954@jarvis.csri.toronto.edu>, flaps@dgp.toronto.edu (Alan J Rosenthal) writes:
> peter@ficc.uu.net (Peter da Silva) writes:
> >	parse_file(name, buffer)
> >		...name will be modified ... to null-terminate the elements
> >		of buffer. ...missing or meaningless elements will contain
> >		null pointers.

> Problem is, there aren't always characters available in the right places in
> name to be overwritten with zeroes to terminate the various strings.  In unix,
> consider the file name "/file".  The directory is "/", the file name is "file",
> but there's nowhere to put the zero to terminate the string "/".  A directory
> name of "" is not acceptable; it will work under some circumstances (namely,
> when a slash and a file name is appended) but not in all.

But since you can't portably do anything with the name directly... you will
have to call build_file() to create a new file name... then build_file() can
be written to properly distinguish root (the directory ""), and the current
directory (either "." or no directory in the file name, so the directory is
NULL).

To put it another way, a slash and a file name would always be appended.

The question of what the name of the root in UNIX is is another topic, hashed
over at great length in comp.sys.amiga some time last year.
-- 
Peter da Silva, *NIX support guy @ Ferranti International Controls Corporation.
Biz: peter@ficc.uu.net, +1 713 274 5180. Fun: peter@sugar.hackercorp.com. `-_-'
"Optimization is not some mystical state of grace, it is an intricate act   U
   of human labor which carries real costs and real risks." -- Tom Neff

peter@ficc.uu.net (Peter da Silva) (08/19/89)

In article <1767@crdgw1.crd.ge.com>, davidsen@sungod.crd.ge.com (ody) writes:
> In article <5722@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:

> | struct filename {
	...
> | 	char *volume;		<----- It's in there
	...
> | };

> | Any elements I'm missing?

> Well, at least device... the C:foo in MS-DOS and the DRC0:file in VMS

It's in there.
-- 
Peter da Silva, *NIX support guy @ Ferranti International Controls Corporation.
Biz: peter@ficc.uu.net, +1 713 274 5180. Fun: peter@sugar.hackercorp.com. `-_-'
"Optimization is not some mystical state of grace, it is an intricate act   U
   of human labor which carries real costs and real risks." -- Tom Neff

gisle@ifi.uio.no (Gisle Hannemyr) (08/19/89)

gandalf@csli.Stanford.EDU (Juergen Wagner) wrote:
> This effectively means that in general, argv[0] cannot
> be treated as a reliable source for the path name! [RTFM]

Is there anything that can be relied upon for this?
I want to know which directory the executable file resided in.

- gisle hannemyr  (Norwegian Computing Center)
  X.400: gisle@nr.uninett
  RFC:   gisle@ifi.uio.no
  UUCP:  ..!mcvax!ifi!gisle
------------------------------------------------

poser@csli.Stanford.EDU (Bill Poser) (08/20/89)

I don't know of a reliable way of finding out what directory an
executable resides in from within a C program, but there is a
reasonably simple way around this, which is to call the C program
from a shell script that first records the directory in a file.

I have a program that consists of three executables that do a sequence
of overlays. (Don't ask why.) Rather than compile in the path name,
I use the shell script trick. Here is the shell script:

	which $0 > .L3_loc
	$0_top $0 $*

The "which" gets the full path name of the shell script and
writes it into a temp file. The second line executes the top
level C program and passes the arguments to it. The top level
C program then reads the path name from the file. You could also
pass the result of "which" as a command line argument.

flaps@dgp.toronto.edu (Alan J Rosenthal) (08/20/89)

gisle@ifi.uio.no (Gisle Hannemyr) writes:
>Is there anything that can be relied upon for [indicating where the executable
>file resides]?

poser@csli.Stanford.EDU (Bill Poser) writes, explaining how a shellscript
wrapper can accomplish this:
>	which $0 > .L3_loc

This will not work in all cases!  There is no way to tell where the executable
resides.

Using "which $0" will FAIL if the program is invoked with an argv[0] which the
user could not have typed to access that program.  For example:
	execl("/my/strange/directory/prog", "prog", (char *)NULL);

An example which happens to you every day is the initial execution of your
login shell, which is run with an argv[0] beginning with a minus sign to let it
know that it's a login shell.

ajr

henry@utzoo.uucp (Henry Spencer) (08/20/89)

In article <1935@ifi.uio.no> gisle@ifi.uio.no (Gisle Hannemyr) writes:
>> This effectively means that in general, argv[0] cannot
>> be treated as a reliable source for the path name! [RTFM]
>
>Is there anything that can be relied upon for this?

No.

>I want to know which directory the executable file resided in.

In current systems you simply don't know.
-- 
V7 /bin/mail source: 554 lines.|     Henry Spencer at U of Toronto Zoology
1989 X.400 specs: 2200+ pages. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

gandalf@csli.Stanford.EDU (Juergen Wagner) (08/21/89)

In some article I wrote a long time ago:
> This effectively means that in general, argv[0] cannot
> be treated as a reliable source for the path name! [RTFM]

Referring to this, gisle@ifi.uio.no (Gisle Hannemyr) wrote:
>Is there anything that can be relied upon for this?

In a response, henry@utzoo.uucp (Henry Spencer) writes:
>No.

There are partial solutions to the problem, which are too complex for most
cases. In most cases, the simplest thing to do is to rely on the fact that
certain conventions are observed, and depending on the environment, the
path name can be obtained in one of several ways:

[1] Directly from argv[0] (which could be the correct path name).

[2] From argv[0] and getenv("PATH") by simulating the shell's interpretation
    of the search path.

[3] In an operating system dependent way. E.g. by examining the inode
    associated with the text segment of the current process, ... Maybe
    it would be a good idea to associate with each process a description
    of the file the executable image came from... This shouldn't be too
    difficult, in fact, in most cases, one could do something like that
    by modifying the exec*() routines in the C library such that they
    record their first argument in an environment variable "EXEC_PATH"
    which is passed to the new process.

Usually, [1] and [2] cover the cases one is interested in. Other pathological
cases might be very interesting but not necessarily important.

Happy hacking,
Juergen Wagner		   			gandalf@csli.stanford.edu
						 wagner@arisia.xerox.com

PS: Also note that the most famous example of argv[0] not being the program
    name is the way 'login' calls login shells as "-sh" or "-csh"...

PPS: It is important to hang on to the idea of conventions (same theme can
     be found in that ongoing discussion about main() or not main()). They
     can make life much easier (and reduce our problem to cases [1] and [2]).

jem@cs.hut.fi (Johan Myreen) (08/21/89)

In article <10154@csli.Stanford.EDU> gandalf@csli.Stanford.EDU (Juergen Wagner) writes:

>path name can be obtained in one of several ways:

>[3] In an operating system dependent way. E.g. by examining the inode
>    associated with the text segment of the current process, ... Maybe

But how do you get the path name from the inode?

--
* Johan Myreen
* jem@spica.hut.fi

tneff@bfmny0.UUCP (Tom Neff) (08/21/89)

As far as the C language itself goes, this discussion is completely
meaningless.  It's ludicrously parochial to insist that argv[0] hold a
"PATHNAME," whatever in the world that may be.  In the universe of
possible C implementations, you don't even necessarily associate one
program with one "FILE," or even have such a thing as "FILE."  All that
argv[0] gives you is a "handle" by which you can refer to the program.
For error messages and such, that's all you need.  Attempting to be
precious by "extracting" the "name" portion only for use in error
messages is a non portable operation.

To the extent that UNIX-oid type people have a legitimate interest in
being able to parse argv[0] pathwise, the issue should be addressed by
POSIX and discussed in comp.sys.whatever.
-- 
"We walked on the moon --	((	Tom Neff
	you be polite"		 )) 	tneff@bfmny0.UU.NET

net@tub.UUCP (Oliver Laumann) (08/22/89)

In article <1935@ifi.uio.no> gisle@ifi.uio.no (Gisle Hannemyr) writes:
> gandalf@csli.Stanford.EDU (Juergen Wagner) wrote:
> > This effectively means that in general, argv[0] cannot
> > be treated as a reliable source for the path name! [RTFM]
> 
> Is there anything that can be relied upon for this?
> I want to know which directory the executable file resided in.

To obtain the full path name of the executable you can do the following:

If argv[0][0] is a '/', do a stat() on the string and, if a file with
this name exists, test whether it is a regular file and whether it is
executable.  If argv[0][0] is not a '/', append a '/' and argv[0] to
each component of $PATH (use getenv() to obtain it) and perform the
above test until it succeeds.

However, there is no guarantee that this procedure yields the correct
path name (or anything useful at all).  As I said in an earlier
article, argv[0] may not even be there, i.e. it is not guaranteed that
argc is non-zero.
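
The procedure above might be sketched for a POSIX system as follows.
The function names, the caller-supplied buffer, and passing the search
path as a parameter (normally getenv("PATH")) are assumptions of this
sketch. It also treats any name containing a '/' like an absolute one,
since shells do not search $PATH for such names. The result is only a
guess, for the reasons just given:

```c
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return 1 if `path` names a regular file the caller may execute. */
static int is_executable_file(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0 && S_ISREG(st.st_mode)
        && access(path, X_OK) == 0;
}

/* Best-effort guess at the executable's path. On success the guess is
 * written to `out` and 1 is returned; on failure 0 is returned and
 * the contents of `out` are unspecified. */
int guess_exec_path(const char *argv0, const char *path_var,
                    char *out, size_t outsize)
{
    const char *p;

    if (argv0 == NULL || argv0[0] == '\0')
        return 0;

    if (strchr(argv0, '/') != NULL) {   /* no $PATH search applies */
        if (strlen(argv0) >= outsize || !is_executable_file(argv0))
            return 0;
        strcpy(out, argv0);
        return 1;
    }

    if (path_var == NULL)
        return 0;

    p = path_var;
    for (;;) {
        const char *end = strchr(p, ':');
        size_t dirlen = (end != NULL) ? (size_t)(end - p) : strlen(p);
        int n;

        if (dirlen == 0)                /* empty component means "." */
            n = snprintf(out, outsize, "./%s", argv0);
        else
            n = snprintf(out, outsize, "%.*s/%s", (int)dirlen, p, argv0);

        if (n > 0 && (size_t)n < outsize && is_executable_file(out))
            return 1;
        if (end == NULL)
            break;
        p = end + 1;
    }
    return 0;
}
```

A typical call is guess_exec_path(argv[0], getenv("PATH"), buf, sizeof buf),
made only after checking that argc is non-zero.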

--
Oliver Laumann              net@TUB.BITNET              net@tub.UUCP