tgl@zog.cs.cmu.edu (Tom Lane) (07/25/90)
Is there any way for a program to discover the path name of the file
from which it was executed? I would like to be able to access auxiliary
files stored in the same directory as the main executable file.

However, after reading the man page for exec(2), it doesn't seem like the
program gets enough information to reliably determine what file/directory
it was loaded from. (For one thing, you don't know whether PATH was used,
and for another, you don't know whether argv[0] is the same as the supplied
path or is just the last component.)

If the info is in fact squirrelled away somewhere, please tell me where!

--
				tom lane
Internet: tgl@cs.cmu.edu
UUCP: <your favorite internet/arpanet gateway>!cs.cmu.edu!tgl
BITNET: tgl%cs.cmu.edu@cmuccvma
CompuServe: >internet:tgl@cs.cmu.edu
jik@athena.mit.edu (Jonathan I. Kamens) (07/25/90)
In article <9995@pt.cs.cmu.edu>, tgl@zog.cs.cmu.edu (Tom Lane) writes:
|> Is there any way for a program to discover the path name of the file
|> from which it was executed?

  Appended to this message is a message posted to comp.unix.wizards by Greg
Limes the nth time this was asked (your message was the mth time, where m is
about three or four more than n if I recall correctly, and n is
nonnegligible :-).  It says just about all there is to say about this
question.  I haven't had occasion to use his source code yet, so I don't know
whether or not it has bugs, but I doubt it...

Jonathan Kamens                         USnail:
MIT Project Athena                      11 Ashford Terrace
jik@Athena.MIT.EDU                      Allston, MA  02134
Office: 617-253-8495                    Home: 617-782-0710

Article 19066 of comp.unix.wizards:
Path: bloom-beacon!usc!apple!sun-barr!newstop!sun!limes
From: limes@sun.com (Greg Limes)
Newsgroups: comp.unix.wizards
Subject: Re: Reading the symbol table of the currently running executable
Message-ID: <LIMES.89Sep7153103@ouroborous.wseng.sun.com>
Date: 7 Sep 89 22:31:03 GMT
References: <9104@june.cs.washington.edu> <6131@lynx.UUCP>
Sender: news@sun.Eng.Sun.COM
Organization: Sun Microsystems, Inc.
Lines: 243
In-reply-to: mitch@lynx.uucp's message of 5 Sep 89 17:17:36 GMT

In article <6131@lynx.UUCP> mitch@lynx.uucp (Mitch Bunnell) writes:
> In article <9104@june.cs.washington.edu> bcn@cs.washington.edu (Clifford Neuman) writes:
> > 2) Obtaining the full path name of the presently running executable.
> 2 - Not possible.

Back before I knew this was impossible, I wrote the following piece of
support code. It has been doing the impossible for me for quite some
time (geez, has it been that long?) with limitations as stated.

/*
 * findx package 25may88 limes@sun.com
 *
 * Over the last few days (weeks?) there has been some traffic about how to
 * tell where a running program came from. Well, there is a way to find
 * out without changing the shell, the kernel, C language startup
 * conventions, or whatever.
 *
 * Anyway, here is the basic idea, presented as a package that should compile
 * and run without too many problems.
 *
 * WHAT IT DOES
 *   First, it locates the path to the executable that was used by
 *   the exec() that started this process. If the command name starts with a
 *   "/", it must be taken literally; if it contains a "/", then it is
 *   always relative to the current working directory at the start of the
 *   program; otherwise, we have to chase across the PATH value in the
 *   environment. If there is no PATH, or the PATH is empty, check the
 *   current working directory.
 *
 *   On systems with symbolic links, we are not through yet. The purpose is to
 *   locate the directory it is in, so we can get at any related data files.
 *   So, we chase symbolic links until we have the real path name of the
 *   final resolution file.
 *
 * SECURITY
 *   This can be spoofed easily by making a hard link to, or a copy
 *   of, the executable. If you want your program to be sure that it has
 *   found the one true installation location, you will have to verify that
 *   for yourself. findx() just locates the most likely candidate.
 *
 * PORTABILITY
 *   This was developed on a Sun3 running SunOS 4.0, but I think I
 *   at least made the algorithm portable. You may need to mess with include
 *   files and such. Symbolic link searching is turned on if your errno.h
 *   supplies ELOOP, and off otherwise; I assume that all systems with
 *   symlinks have a readlink() call.
 *
 * HOW TO USE IT
 *   Here is the definition of the various parameters. Further
 *   down you will find an example main, so fear not ...
 *
 *   findx (cmd, cwd, dir, pgm, run, path)
 *
 *   cmd   pass the command name (argv[0]) here. findx() knows how to handle
 *         just about anything. If it starts with /, then we use the absolute
 *         name, and ignore the path. If it contains a /, then use the relative
 *         name and ignore the path. Otherwise, look for the file in each
 *         directory named in the path for the file; if there is no path,
 *         pretend it's "." like the execvp does.
 *
 *   cwd   pass a big buffer here. if this begins with a slash, I will assume
 *         it is filled in with the current working directory; otherwise, I
 *         will fill it in using getcwd(). Should be at least MAXPATHLEN
 *         bytes, if you do not fill it in yourself.
 *
 *   dir   pass a big buffer here. this gets the full path name of the
 *         directory that the executable was read from. Should be at least
 *         MAXPATHLEN bytes.
 *
 *   pgm   pass THE ADDRESS of a pointer variable here. findx() will fill the
 *         pointer variable with a pointer to the final component of the
 *         string passed as cmd above. Send a (char **)0 if you don't care
 *         about this.
 *
 *   run   pass THE ADDRESS of a pointer variable here. findx() will fill the
 *         pointer variable with a pointer to the final component of the name
 *         of the running program. Send a (char **)0 if you don't care about
 *         this.
 *
 *   path  pass the user's PATH variable here. I made it a parameter so you
 *         can fiddle with the path first. If you do not want to fiddle, pass
 *         getenv("PATH").
 *
 * RETURN VALUES:
 *   Normally, findx() will return zero if all is well. If
 *   something goes wrong, it will return -1 with the global variable
 *   "errno" set to a corresponding error number.
 */

#include <stdio.h>		/* printf, sprintf */
#include <string.h>		/* strlen, strcpy */
#include <strings.h>
#include <errno.h>
#include <sys/param.h>

#define X_OK 1

#ifndef MAXPATHLEN
#define MAXPATHLEN 1024
#endif

#ifndef ENAMETOOLONG
#define ENAMETOOLONG EINVAL
#endif

int findx ();			/* get location of directory */
int resolve ();			/* get link resolution name */

#ifdef TESTMAIN
extern char *getenv ();		/* read value from environment */

char *pn = (char *) 0;		/* program name */
char *rn = (char *) 0;		/* run name */
char rd[MAXPATHLEN];		/* run directory */
char wd[MAXPATHLEN] = ".";	/* working directory */

int
main (argc, argv)
    int argc;
    char **argv;
{
    findx (*argv, wd, rd, &pn, &rn, getenv ("PATH"));
    printf ("%s: %s running in %s from %s\n", pn, rn, wd, rd);
    return 0;
}
#endif

/*-
 * findx - find executable file in PATH
 * PARAMETERS:
 *	cmd	filename as typed by user
 *	cwd	where to return working directory
 *	dir	where to return program's directory
 *	pgm	where to return what user called it
 *	run	where to return final resolution name
 *	path	user's path from environment
 * RETURNS: returns zero for success, -1 for error (with errno set properly).
 */
int
findx (cmd, cwd, dir, pgm, run, path)
    char *cmd;
    char *cwd;
    char *dir;
    char **pgm;
    char **run;
    char *path;
{
    int rv = 0;
    char *f, *s;

    if (!cmd || !*cmd || !cwd || !dir) {
	errno = EINVAL;		/* stupid arguments! */
	return -1;
    }
    if (!path || !*path)	/* missing or null path */
	path = ".";		/* assume sanity */

    if (*cwd != '/')
	if (!(getcwd (cwd, MAXPATHLEN)))
	    return -1;		/* can't get working directory */

    f = rindex (cmd, '/');
    if (pgm)			/* user wants program name */
	*pgm = f ? f + 1 : cmd;

    if (dir) {			/* user wants program directory */
	rv = -1;
	if (*cmd == '/')	/* absname given */
	    rv = resolve ("", cmd + 1, dir, run);
	else if (f)		/* relname given */
	    rv = resolve (cwd, cmd, dir, run);
	else if (f = path) {	/* from searchpath */
	    rv = -1;
	    errno = ENOENT;	/* errno gets this if path empty */
	    while (*f && (rv < 0)) {
		s = f;
		while (*f && (*f != ':'))
		    ++f;
		if (*f)
		    *f++ = 0;	/* note: overwrites ':' in the caller's path string */
		if (*s == '/')
		    rv = resolve (s, cmd, dir, run);
		else {
		    char abuf[MAXPATHLEN];

		    sprintf (abuf, "%s/%s", cwd, s);
		    rv = resolve (abuf, cmd, dir, run);
		}
	    }
	}
    }
    return rv;
}

/*
 * resolve - check for specified file in specified directory
 *	sets up dir, following symlinks.
 *	returns zero for success, or -1 for error (with errno set properly)
 */
int
resolve (indir, cmd, dir, run)
    char *indir;		/* search directory */
    char *cmd;			/* search for name */
    char *dir;			/* directory buffer */
    char **run;			/* resolution name ptr ptr */
{
    char *p;
    int rv = -1;

#ifdef ELOOP
    int lcc = 0;
    int sll;
    char symlink[MAXPATHLEN + 1];
#endif

    do {
	errno = ENAMETOOLONG;
	if (strlen (indir) + strlen (cmd) + 2 > MAXPATHLEN)
	    break;

	sprintf (dir, "%s/%s", indir, cmd);
	if (access (dir, X_OK) < 0)
	    break;		/* not an executable program */

#ifdef ELOOP
	/* readlink() fails with EINVAL once dir is no longer a symlink */
	while ((sll = readlink (dir, symlink, MAXPATHLEN)) >= 0) {
	    symlink[sll] = 0;
	    if (*symlink == '/')
		strcpy (dir, symlink);
	    else
		sprintf (rindex (dir, '/'), "/%s", symlink);
	}
	if (errno != EINVAL)
	    break;
#endif

	p = rindex (dir, '/');
	*p++ = 0;		/* split dir into directory and final component */
	if (run)		/* user wants resolution name */
	    *run = p;
	rv = 0;			/* complete, with success! */
    } while (0);

    return rv;
}

--
-- Greg Limes	limes@sun.com	...!sun!limes	73327,2473	[choose one]
tgl@zog.cs.cmu.edu (Tom Lane) (07/25/90)
In article <1990Jul25.064956.22757@mintaka.lcs.mit.edu>, jik@athena.mit.edu (Jonathan I. Kamens) writes:
> In article <9995@pt.cs.cmu.edu>, tgl@zog.cs.cmu.edu (Tom Lane) writes:
> |> Is there any way for a program to discover the path name of the file
> |> from which it was executed?
>
> [Jonathan provides a chunk of code written by Greg Limes (limes@sun.com),
> which uses argv[0] and the PATH environment string to try to determine
> where the current executable file came from.]

Thanks for posting this code; I had been planning to write the same thing,
and this saves me from reinventing the wheel. The business about following
a symbolic link to the real executable is a nice refinement that I hadn't
thought of.

HOWEVER, this doesn't really answer my question. There are a couple of
assumptions implicit in this method, which Greg didn't document:

1. It has to assume that argv[0] is identical to the path parameter given
   to exec. The manuals I've checked say "by convention, argv[0] must be
   supplied and must point to a string identical to path *or path's last
   component*" (emphasis added). If the invoking program follows that last
   clause, then we'll fail when the user does something like

	$ ../otherdir/progname parameters

   The shells I've tried around here seem to make argv[0] be the whole
   string, but who knows whether they all do?

2. It has to assume that the exec call was execlp() or execvp(), and not
   one of the other forms of exec. With the other forms, a simple name
   will always be found in the current directory. With execlp/execvp,
   this is true only if PATH contains "." as its first element.

In practice these problems probably don't materialize often, so Greg's
code probably gets the right answer 99% of the time. Still, I would
like to know if it is possible to avoid these assumptions.

--
				tom lane
Internet: tgl@cs.cmu.edu
UUCP: <your favorite internet/arpanet gateway>!cs.cmu.edu!tgl
BITNET: tgl%cs.cmu.edu@cmuccvma
CompuServe: >internet:tgl@cs.cmu.edu
jik@athena.mit.edu (Jonathan I. Kamens) (07/25/90)
In article <10004@pt.cs.cmu.edu>, tgl@zog.cs.cmu.edu (Tom Lane) writes:
|> [lists the assumptions that the code I posted makes]
|>
|> In practice these problems probably don't materialize often, so Greg's
|> code probably gets the right answer 99% of the time. Still, I would
|> like to know if it is possible to avoid these assumptions.

  As I said in my original message, the code I posted "says just about all
there is to say about this question."  In other words, no, there is no
portable way to avoid these assumptions, and indeed most systems don't even
have a non-portable way to avoid them, unless you consider making your
executable setuid root or setgid kmem and grovelling through kernel memory to
figure out how the process was started to be a reasonable way to do things
:-).

Jonathan Kamens                         USnail:
MIT Project Athena                      11 Ashford Terrace
jik@Athena.MIT.EDU                      Allston, MA  02134
Office: 617-253-8495                    Home: 617-782-0710