bethge@wums.wustl.edu (02/08/90)
I am a 10+ year veteran of VMS programming who is trying to learn to
like Unix. (Really!) It would help if I could find out how to port
some of my favorite VMS tricks.

I like to write programs that users can use without having to know
details of their inner workings. Suppose a program needs some
standard data which the user doesn't need to be concerned with, and
which for various reasons needs to be read from a file rather than
compiled in. The question is, how does the program find the file?

My VMS solution is to keep the file in the same directory as the
program executable, and use the system service which returns the
full pathname of the currently running executable, and get the disk,
directory, etc. from that. But I don't know of a comparable system
routine in Unix.

I have looked at Unix programs which deal with this problem, and
found that the pathname for the data file is hard-coded into the
program. This of course means that the program has to be edited and
recompiled if it becomes necessary to move the file. Environment
variables are a better solution, but they require the user to define
the environment variable before running the program. I could define
the program as a shell script which defines the environment variable
and then fires up the executable, but that's one more file to maintain.

Is there a better (more transparent) way?
____________________________________________________________________
Paul H. Bethge                               bethge@wums.wustl.edu
Washington University, St. Louis             bethge@wums.bitnet
jik@athena.mit.edu (Jonathan I. Kamens) (02/08/90)
The subject of finding the full path name of the currently running
process was discussed in comp.unix.wizards fairly recently (in September
of last year, to be precise), and pretty much beaten until dead.

The best posting on the subject was the one I've tacked onto the end of
this article; it is a routine to do what you want whenever it's
possible, with a whole bunch of comments explaining when (and why) it's
not.

I hope it helps. BTW, I didn't write it, I'm just reposting it.

Jonathan Kamens                      USnail:
MIT Project Athena                     11 Ashford Terrace
jik@Athena.MIT.EDU                     Allston, MA  02134
Office: 617-253-8495                   Home: 617-782-0710

------------------------------

Article 19066 of comp.unix.wizards:
Path: bloom-beacon!usc!apple!sun-barr!newstop!sun!limes
From: limes@sun.com (Greg Limes)
Newsgroups: comp.unix.wizards
Subject: Re: Reading the symbol table of the currently running executable
Date: 7 Sep 89 22:31:03 GMT
References: <9104@june.cs.washington.edu> <6131@lynx.UUCP>
Sender: news@sun.Eng.Sun.COM
Organization: Sun Microsystems, Inc.
Lines: 243
In-reply-to: mitch@lynx.uucp's message of 5 Sep 89 17:17:36 GMT

In article <6131@lynx.UUCP> mitch@lynx.uucp (Mitch Bunnell) writes:
> In article <9104@june.cs.washington.edu> bcn@cs.washington.edu (Clifford Neuman) writes:
> > 2) Obtaining the full path name of the presently running executable.
> 2 - Not possible.

Back before I knew this was impossible, I wrote the following piece of
support code. It has been doing the impossible for me for quite some
time (geez, has it been that long?) with limitations as stated.

/*
 * findx package 25may88 limes@sun.com
 *
 * Over the last few days (weeks?) there has been some traffic about how to
 * tell where a running program came from. Well, there is a way to find
 * out without changing the shell, the kernel, C language startup
 * conventions, or whatever.
 *
 * Anyway, here is the basic idea, presented as a package that should compile
 * and run without too many problems.
 *
 * WHAT IT DOES
 *
 * First, it locates the path to the executable that was used by
 * the exec() that started this process. If the command name starts with a
 * "/", it must be taken literally; if it contains a "/", then it is
 * always relative to the current working directory at the start of the
 * program; otherwise, we have to chase across the PATH value in the
 * environment. If there is no PATH, or the PATH is empty, check the
 * current working directory.
 *
 * On systems with symbolic links, we are not through yet. The purpose is to
 * locate the directory it is in, so we can get at any related data files.
 * So, we chase symbolic links until we have the real path name of the
 * final resolution file.
 *
 * SECURITY
 *
 * This can be spoofed easily by making a hard link to, or a copy
 * of, the executable. If you want your program to be sure that it has
 * found the one true installation location, you will have to verify that
 * for yourself. findx() just locates the most likely candidate.
 *
 * PORTABILITY
 *
 * This was developed on a Sun3 running SunOS 4.0, but I think I
 * at least made the algorithm portable. You may need to mess with include
 * files and such. Symbolic link searching is turned on if your errno.h
 * supplies ELOOP, and off otherwise; I assume that all systems with
 * symlinks have a readlink() call.
 *
 * HOW TO USE IT
 *
 * Here is the definition of the various parameters. Further
 * down you will find an example main, so fear not ...
 *
 * findx (cmd, cwd, dir, pgm, run, path)
 *
 * cmd   pass the command name (argv[0]) here. findx() knows how to handle
 *       just about anything. If it starts with /, then we use the absolute
 *       name, and ignore the path. If it contains a /, then use the relative
 *       name and ignore the path. Otherwise, look for the file in each
 *       directory named in the path for the file; if there is no path,
 *       pretend it's "." like the execvp does.
 *
 * cwd   pass a big buffer here. if this begins with a slash, I will assume
 *       it is filled in with the current working directory; otherwise, I
 *       will fill it in using getcwd(). Should be at least MAXPATHLEN bytes,
 *       if you do not fill it in yourself.
 *
 * dir   pass a big buffer here. this gets the full path name of the
 *       directory that the executable was read from. Should be at least
 *       MAXPATHLEN bytes.
 *
 * pgm   pass THE ADDRESS of a pointer variable here. findx() will fill the
 *       pointer variable with a pointer to the final component of the string
 *       passed as cmd above. Send a (char **)0 if you don't care about this.
 *
 * run   pass THE ADDRESS of a pointer variable here. findx() will fill the
 *       pointer variable with a pointer to the final component of the name
 *       of the running program. Send a (char **)0 if you don't care about
 *       this.
 *
 * path  pass the user's PATH variable here. I made it a parameter so you
 *       can fiddle with the path first. If you do not want to fiddle, pass
 *       getenv("PATH").
 *
 * RETURN VALUES: Normally, findx() will return zero if all is well. If
 * something goes wrong, it will return -1 with the global variable
 * "errno" set to a corresponding error number.
 */

#include <stdio.h>
#include <strings.h>
#include <errno.h>
#include <sys/param.h>

#ifndef X_OK
#define X_OK 1
#endif

#ifndef MAXPATHLEN
#define MAXPATHLEN 1024
#endif

#ifndef ENAMETOOLONG
#define ENAMETOOLONG EINVAL
#endif

int             findx ();       /* get location of directory */
int             resolve ();     /* get link resolution name */

#ifdef TESTMAIN
extern char    *getenv ();      /* read value from environment */

char           *pn = (char *) 0;        /* program name */
char           *rn = (char *) 0;        /* run name */
char            rd[MAXPATHLEN];         /* run directory */
char            wd[MAXPATHLEN] = ".";   /* working directory */

int
main (argc, argv)
    int             argc;
    char          **argv;
{
    findx (*argv, wd, rd, &pn, &rn, getenv ("PATH"));
    printf ("%s: %s running in %s from %s\n", pn, rn, wd, rd);
    return 0;
}
#endif

/*-
 * findx - find executable file in PATH
 * PARAMETERS:
 *      cmd     filename as typed by user
 *      cwd     where to return working directory
 *      dir     where to return program's directory
 *      pgm     where to return what user called it
 *      run     where to return final resolution name
 *      path    user's path from environment
 * RETURNS: returns zero for success, -1 for error (with errno set properly).
 */
int
findx (cmd, cwd, dir, pgm, run, path)
    char           *cmd;
    char           *cwd;
    char           *dir;
    char          **pgm;
    char          **run;
    char           *path;
{
    int             rv = 0;
    char           *f, *s;

    if (!cmd || !*cmd || !cwd || !dir) {
        errno = EINVAL;         /* stupid arguments! */
        return -1;
    }
    if (!path || !*path)        /* missing or null path */
        path = ".";             /* assume sanity */

    if (*cwd != '/')
        if (!(getcwd (cwd, MAXPATHLEN)))
            return -1;          /* can't get working directory */

    f = rindex (cmd, '/');
    if (pgm)                    /* user wants program name */
        *pgm = f ? f + 1 : cmd;

    if (dir) {                  /* user wants program directory */
        rv = -1;
        if (*cmd == '/')        /* absname given */
            rv = resolve ("", cmd + 1, dir, run);
        else if (f)             /* relname given */
            rv = resolve (cwd, cmd, dir, run);
        else if ((f = path) != 0) {     /* from searchpath */
            rv = -1;
            errno = ENOENT;     /* errno gets this if path empty */
            /*
             * NOTE: the loop below splits the path string in place by
             * overwriting each ':' with a NUL, so pass a copy if the
             * caller's PATH must stay intact.
             */
            while (*f && (rv < 0)) {
                s = f;
                while (*f && (*f != ':'))
                    ++f;
                if (*f)
                    *f++ = 0;
                if (*s == '/')
                    rv = resolve (s, cmd, dir, run);
                else {
                    char            abuf[MAXPATHLEN];

                    sprintf (abuf, "%s/%s", cwd, s);
                    rv = resolve (abuf, cmd, dir, run);
                }
            }
        }
    }
    return rv;
}

/*
 * resolve - check for specified file in specified directory
 *      sets up dir, following symlinks.
 *      returns zero for success, or -1 for error (with errno set properly)
 */
int
resolve (indir, cmd, dir, run)
    char           *indir;      /* search directory */
    char           *cmd;        /* search for name */
    char           *dir;        /* directory buffer */
    char          **run;        /* resolution name ptr ptr */
{
    char           *p;
    int             rv = -1;
#ifdef ELOOP
    int             lcc = 0;
    int             sll;
    char            symlink[MAXPATHLEN + 1];
#endif

    do {
        errno = ENAMETOOLONG;
        if (strlen (indir) + strlen (cmd) + 2 > MAXPATHLEN)
            break;

        sprintf (dir, "%s/%s", indir, cmd);
        if (access (dir, X_OK) < 0)
            break;              /* not an executable program */

#ifdef ELOOP
        while ((sll = readlink (dir, symlink, MAXPATHLEN)) >= 0) {
            symlink[sll] = 0;
            if (*symlink == '/')
                strcpy (dir, symlink);
            else
                sprintf (rindex (dir, '/'), "/%s", symlink);
        }
        if (errno != EINVAL)
            break;
#endif

        p = rindex (dir, '/');
        *p++ = 0;
        if (run)                /* user wants resolution name */
            *run = p;
        rv = 0;                 /* complete, with success! */
    } while (0);
    return rv;
}

--
-- Greg Limes   limes@sun.com   ...!sun!limes   73327,2473   [choose one]
merlyn@iwarp.intel.com (Randal Schwartz) (02/08/90)
In article <1610.25d028a3@wums.wustl.edu>, bethge@wums writes:
[wants to know how to find the name of the current executable]
| Is there a better (more transparent) way?

No. This was hashed out about a year ago in either c.u.q or c.u.w.
Basically, it boils down to the fact that you have no idea where you
came from, and the closest you could come is to count on the shells to
*mostly* give you the right answer *most* of the time in argv[0].
However, programs that do execv() are free to provide *whatever* they
want. So, you're out of luck, and subject to spoofing.

Just another UNIX hacker,
--
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Welcome to Portland, Oregon, home of the California Raisins!"=/
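The point that execv() callers may put anything in argv[0] can be shown with a minimal sketch. It assumes a POSIX system with /bin/sh, and relies on the common shell behaviour that $0 reflects argv[0] when "sh -c" gets no extra arguments. The function name run_sh_as and the fake names are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run /bin/sh with a caller-chosen argv[0] and a "-c" script.
 * Returns the child's exit status, or -1 if it could not be run.
 * (run_sh_as is a hypothetical helper, not a library call.)
 */
int run_sh_as(const char *fake_argv0, const char *script)
{
    pid_t pid;
    int status;

    assert(fake_argv0 != NULL && script != NULL);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char *args[4];

        args[0] = (char *) fake_argv0;  /* the lie: any string will do */
        args[1] = "-c";
        args[2] = (char *) script;
        args[3] = NULL;
        execv("/bin/sh", args);         /* real path and argv[0] differ */
        _exit(127);                     /* only reached if execv failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

#ifdef DEMO_MAIN
int main(void)
{
    /* The child shell reports whatever name the parent invented for it. */
    return run_sh_as("/no/such/dir/prog", "echo \"I think I am $0\"");
}
#endif
```

Compiled with -DDEMO_MAIN, the child reports a path that never existed, which is why any scheme built on argv[0] can only ever be a best guess.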
les@chinet.chi.il.us (Leslie Mikesell) (02/09/90)
In article <1610.25d028a3@wums.wustl.edu> bethge@wums.wustl.edu writes:
>I like to write programs that users can use without having to know
>details of their inner workings. Suppose a program needs some
>standard data which the user doesn't need to be concerned with, and
>which for various reasons needs to be read from a file rather than
>compiled in. The question is, how does the program find the file?

The normal unix choices would be:

1) connect the file to one of the stdio streams before execution. This
   has the advantage of allowing pipes to work and can be hidden from
   the user by a shell wrapper.

2) have a "start-up" configuration file in a standard place that sets
   all the other options. You might also look for a second set-up file
   in the user's HOME directory.

3) (my favorite) Put all the options on the command line with reasonable
   defaults compiled in. Then if the desired options become clumsy to
   type in, just add a shell wrapper for the common variations. As long
   as you are calling getopt() you might as well anticipate every option
   anyone might want.

Les Mikesell
  les@chinet.chi.il.us
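Choice 3, combined with the environment variable from the original question, might look roughly like the sketch below. The option letter -f, the variable name PROG_DATA, the default path, and the function name choose_data_file are all assumptions made for illustration; only getopt(), getenv() and friends are real library calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical compiled-in default and environment variable name. */
#define DEFAULT_DATA_FILE "/usr/local/lib/prog/standard.dat"
#define DATA_FILE_ENV     "PROG_DATA"

/*
 * Pick the data file: a -f option wins, then the environment
 * variable, then the compiled-in default.
 */
const char *choose_data_file(int argc, char **argv)
{
    const char *file = NULL;
    const char *env;
    int c;

    assert(argc >= 1 && argv != NULL);

    optind = 1;                 /* restart scanning if getopt() ran before */
    while ((c = getopt(argc, argv, "f:")) != -1) {
        if (c == 'f')
            file = optarg;      /* explicit choice on the command line */
    }
    if (file)
        return file;

    env = getenv(DATA_FILE_ENV);
    if (env && *env)
        return env;             /* site or user override */

    return DEFAULT_DATA_FILE;   /* reasonable default compiled in */
}

#ifdef DEMO_MAIN
int main(int argc, char **argv)
{
    printf("data file: %s\n", choose_data_file(argc, argv));
    return 0;
}
#endif
```

With this ordering the ordinary user types nothing extra, an administrator who moves the file sets one variable or writes a one-line wrapper, and nobody has to recompile.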
bph@buengc.BU.EDU (Blair P. Houghton) (02/13/90)
In article <1990Feb7.211538.3894@iwarp.intel.com> merlyn@iwarp.intel.com (Randal Schwartz) writes:
>In article <1610.25d028a3@wums.wustl.edu>, bethge@wums writes:
>[wants to know how to find the name of the current executable]
>| Is there a better (more transparent) way?
>
>No. This was hashed out about a year ago in either c.u.q or c.u.w.
>Basically, it boils down to the fact that you have no idea where you
>came from, and the closest you could come is to count on the shells to
>*mostly* give you the right answer *most* of the time in argv[0].
>However, programs that do execv() are free to provide *whatever* they
>want. So, you're out of luck, and subject to spoofing.
>
>Just another UNIX hacker,

Sum hecker.

Take argv[0], if it doesn't have the path, or the full path,
cut it up to get the command name, say "prog", then strcat(3)
it onto "/usr/ucb/which" and call system(3):

	foo = "/usr/ucb/which prog"
	system(foo);

As long as you're still in the directory from which the
program was run, and as long as your path was the same
as the one set in your .cshrc (someone please tell me
why which(1) reads the .cshrc...) then you'll come up
with /usr/foo/bar/bletch/prog, barring surreptition.

As we saw last week, you can use any of a number of rather
machine-specific exec*() commands to get which(1) to run,
but only system(3) shows up in ANSI C.

Getting the output of which(1) back to the program can
take a number of routes, by a temporary file
("/usr/ucb/which prog > file"), or a socket, or dup'ping
streams, or...

				--Blair
				  "...but that's another question..."
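One further route for getting a command's output back is popen(3), sketched below. This swaps in a POSIX call rather than anything ANSI C provides, so it is no more portable than the exec*() routes mentioned above. The helper name capture_first_line is made up, and the location of which(1) is left to the caller since it varies between systems.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Run cmd through the shell and copy the first line of its output
 * into buf (without the trailing newline).  Returns 0 on success,
 * -1 if the command produced no output or exited with a failure.
 * (capture_first_line is a hypothetical helper name.)
 */
int capture_first_line(const char *cmd, char *buf, size_t size)
{
    FILE *fp;
    char scratch[256];
    int got;
    size_t len;

    assert(cmd != NULL && buf != NULL && size > 1);

    fp = popen(cmd, "r");               /* POSIX, not ANSI C */
    if (fp == NULL)
        return -1;

    got = fgets(buf, (int) size, fp) != NULL;

    /* Drain the rest so the child is not killed while still writing. */
    while (fgets(scratch, (int) sizeof scratch, fp) != NULL)
        ;

    if (pclose(fp) != 0 || !got)
        return -1;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[len - 1] = '\0';
    return 0;
}

#ifdef DEMO_MAIN
int main(void)
{
    char where[1024];

    /* The command string goes to a shell: never build it from
     * unchecked input such as a raw argv[0]. */
    if (capture_first_line("which sh", where, sizeof where) == 0)
        printf("%s\n", where);
    return 0;
}
#endif
```

The answer is only as trustworthy as the command name that was looked up, so this inherits every caveat about argv[0] raised earlier in the thread.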
kohli@gemed (Jim Kohli) (02/13/90)
In article <5378@buengc.BU.EDU>, bph@buengc.bu.edu (Blair P. Houghton) writes:
<In article <1990Feb7.211538.3894@iwarp.intel.com> merlyn@iwarp.intel.com (Randal Schwartz) writes:
<>In article <1610.25d028a3@wums.wustl.edu>, bethge@wums writes:
<>[wants to know how to find the name of the current executable]
<>| Is there a better (more transparent) way?
<>
<>No. This was hashed out about a year ago in either c.u.q or c.u.w.
<>Basically, it boils down to the fact that you have no idea where you
<>came from, and the closest you could come is to count on the shells to
<>*mostly* give you the right answer *most* of the time in argv[0].
<>However, programs that do execv() are free to provide *whatever* they
<>want. So, you're out of luck, and subject to spoofing.
<>
<>Just another UNIX hacker,
<
<Sum hecker.
<
<Take argv[0], if it doesn't have the path, or the full path,
<cut it up to get the command name, say "prog", then strcat(3)
<it onto "/usr/ucb/which" and call system(3):
<
<	foo = "/usr/ucb/which prog"
<	system(foo);
<
<As long as you're still in the directory from which the
<program was run, and as long as your path was the same
<as the one set in your .cshrc (someone please tell me
                                ^^^^^^^^^^^^^^^^^^^^^^
<why which(1) reads the .cshrc...) then you'll come up
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
<with /usr/foo/bar/bletch/prog, barring surreptition.
<
<[...the rest of Blair's fine response excised...]
<

reading .cshrc may be somewhat helpful in resolving aliases, eh?
although this doesn't help, as you said, for cases where the
path gets munged at the same time.

jim kohli
ge medical systems
scott@csusac.csus.edu (L. Scott Emmons) (02/13/90)
In article <5378@buengc.BU.EDU> bph@buengc.bu.edu (Blair P. Houghton) writes:
>(someone please tell me why which(1) reads the .cshrc...)

Because 'which' is a C-shell script itself...and when a C-shell script is
called a new csh is forked, and .cshrc is read whenever a csh is started up.

--
L. Scott Emmons
---------------
...[!ucbvax]!ucdavis!csusac!scott
ucdavis!csusac!scott@ucbvax.berkeley.edu
wisner@hayes.fai.alaska.edu (Bill Wisner) (02/13/90)
In article <5378@buengc.BU.EDU>, bph@buengc (Blair P. Houghton) writes:
> (someone please tell me
>why which(1) reads the .cshrc...)

Because which(1) is a csh script.

Bill Wisner <wisner@hayes.fai.alaska.edu> Gryphon Gang Fairbanks AK 99775
merlyn@iwarp.intel.com (Randal Schwartz) (02/14/90)
In article <5378@buengc.BU.EDU>, bph@buengc (Blair P. Houghton) writes:
| Sum hecker.
|
| Take argv[0], if it doesn't have the path, or the full path,
| cut it up to get the command name, say "prog", then strcat(3)
| it onto "/usr/ucb/which" and call system(3):
|
| 	foo = "/usr/ucb/which prog"
| 	system(foo);
|
| As long as you're still in the directory from which the
| program was run, and as long as your path was the same
| as the one set in your .cshrc (someone please tell me
| why which(1) reads the .cshrc...) then you'll come up
| with /usr/foo/bar/bletch/prog, barring surreptition.

But, this is exactly what I said was subject to spoofing and failure!
There is no general solution that works in all cases, although you can
get a useful answer under *many* typical circumstances.

In case this isn't *very* obvious... remember: argv[0] is ARBITRARY!
Just because the shells typically pass the name of the command (with
or without a leading path, depending on the shell) in argv[0]
*doesn't* mean you can depend on it!

Try reading a little closer next time, please.

Just another UNIX hacker,
--
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Welcome to Portland, Oregon, home of the California Raisins!"=/
dce@smsc.sony.com (David Elliott) (02/14/90)
In article <1990Feb13.095913.29040@hayes.fai.alaska.edu> wisner@hayes.fai.alaska.edu (Bill Wisner) writes:
>In article <5378@buengc.BU.EDU>, bph@buengc (Blair P. Houghton) writes:
>> (someone please tell me
>>why which(1) reads the .cshrc...)
>
>Because which(1) is a csh script.

Chuckle. Pretty good.

I don't think that's what Blair was asking. Obviously he knows it's
a csh script, since it would be hard for him to know that it reads
.cshrc otherwise.

The question is: Why does which run without the -f option, which would
cause it *not* to read .cshrc?

I think that the answer is that it wants to handle aliases. The
problem is that this can also cause the path to be changed (people who
use remote shells know to define the path in .cshrc).

One possibility to "fix" which would be to have it run with -f,
check the path for all possibilities, source the .cshrc, and then
check for aliases. Of course, this has problems, too.

Personally, I prefer builtin commands for doing this job, like
type in sh and Tony Birnseth's builtin which for csh.

--
David Elliott
dce@smsc.sony.com | ...!{uunet,mips}!sonyusa!dce
(408)944-4073
richard@aiai.ed.ac.uk (Richard Tobin) (02/14/90)
In article <5378@buengc.BU.EDU> bph@buengc.bu.edu (Blair P. Houghton) writes:
> someone please tell me why which(1) reads the .cshrc...

If you're asking "why does which(1) assume I use csh", then:

Different shells potentially interpret commands in completely different
ways. A command like which *has* to depend on your shell. It seems clear
to me that which should be built-in to csh and sh - that way it would
always be right.

-- Richard
--
Richard Tobin,                       JANET: R.Tobin@uk.ac.ed
AI Applications Institute,           ARPA:  R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.                UUCP:  ...!ukc!ed.ac.uk!R.Tobin
barnett@crdgw1.crd.ge.com (Bruce Barnett) (02/14/90)
>>> (someone please tell me
>>>why which(1) reads the .cshrc...)
>>
>>Because which(1) is a csh script.

>The question is: Why does which run without the -f option, which would
>cause it *not* to read .cshrc?

The question is why people post without spending a couple of minutes
reading the script file.

It DOES use the -f option on start-up.
It DOES explicitly source the .cshrc file.

Use the source, Luke!
--
Bruce G. Barnett	<barnett@crd.ge.com>   uunet!crdgw1!barnett
gwyn@smoke.BRL.MIL (Doug Gwyn) (02/14/90)
In article <1747@skye.ed.ac.uk> richard@aiai.UUCP (Richard Tobin) writes:
>Different shells potentially interpret commands in completely different
>ways. A command like which *has* to depend on your shell. It seems clear
>to me that which should be built-in to csh and sh - that way it would
>always be right.

Well, without a precise definition of what it is that we expect "which"
to do, the issue cannot be settled.

My own view is that "which" and "every" should report ONLY on $PATH-based
commands (assuming standard UNIX, i.e., not multiple mounts on /bin),
and that "whatis" should be a builtin that produces a definition suitable
for feeding back to the shell (unlike System V's "type" builtin).
Here are some typical examples:

	$ echo $PATH
	/usr/lbin:/usr/5bin:/bin:/usr/bin:~/bin:/usr/ucb:.
	$ whatis which
	/usr/lbin/which
	$ whatis every
	/usr/lbin/every
	$ whatis whatis		# 8th or 9th Edition UNIX or BRL Bourne shell
	builtin whatis
	$ whatis cd
	builtin cd
	$ whatis builtin	# 8th or 9th Edition UNIX or BRL Bourne shell
	builtin builtin
	$ whatis l
	l () {
		( set +u ; exec ls -bCF $* )
	}
	$ whatis sh
	/usr/lbin/sh
	$ whatis xyzzy
	# xyzzy not found
	$ which which
	/usr/lbin/which
	$ which every
	/usr/lbin/every
	$ which whatis
	/usr/ucb/whatis
	$ which cd
	which: cd: not found
	$ which builtin
	which: builtin: not found
	$ which l
	which: l: not found
	$ which sh
	/usr/lbin/sh
	$ which xyzzy
	which: xyzzy: not found
	$ every which
	/usr/lbin/which
	/usr/ucb/which
	$ every every
	/usr/lbin/every
	$ every whatis
	/usr/ucb/whatis
	$ every cd
	every: cd: not found
	$ every builtin
	every: builtin: not found
	$ every l
	every: l: not found
	$ every xyzzy
	every: xyzzy: not found
	$ every sh
	/usr/lbin/sh
	/usr/5bin/sh
	/bin/sh

(Actually, for interactive use I normally redefine "cd" and "which"
using shell functions, but the example is clearer if I show the
default behavior.)
jc@minya.UUCP (John Chambers) (02/15/90)
> As long as you're still in the directory from which the
> program was run, and as long as your path was the same
> as the one set in your .cshrc (someone please tell me
> why which(1) reads the .cshrc...) then you'll come up
> with /usr/foo/bar/bletch/prog, barring surreptition.

I've been mystified about this on some Ultrix machines at work,
especially since this causes it to give the wrong result most
of the time.

When I got ahold of this machine (an ESIX system), I was further
surprised to find that which didn't even exist. And here I'd
thought it was a universal csh builtin. Just shows how naive
I was.

So I decided to try my hand at implementing it. Half an hour
later, I had it working. It works in the obvious way, using
the PATH from its environment, and gives the right result.

Something even more surprising: You know how the csh builtin
has this several-second delay before it answers? Well, my
little program answers with no discernible delay. How could
they have all gotten it all so wrong?

I feel like posting my program, but I'd feel a bit silly to
do so, because it's such a piece of trivia. I mean, talk about
a Programming 101 assignment. At least, I think I'll take it
to work, so I can find things on the Ultrix systems.

(Random sounds of disgust and exasperation.)

--
John Chambers ...!{harvard,ima,mit-eddie}!minya!jc
[Sorry, no clever saying today.]
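The program described here was never posted, so the following is only a guess at the "obvious way": walk the colon-separated PATH and report the first entry that names an executable regular file. The function name find_in_path and its interface are invented for this sketch; it knows nothing about aliases, shell functions or builtins.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Nonzero if file is a regular file we are allowed to execute. */
static int is_executable(const char *file)
{
    struct stat st;

    return stat(file, &st) == 0 && S_ISREG(st.st_mode)
        && access(file, X_OK) == 0;
}

/*
 * Look for name in each directory of the colon-separated path.
 * On success, copies the first match into out and returns 0;
 * otherwise returns -1.  The path string is never modified.
 * (find_in_path is a hypothetical name for this sketch.)
 */
int find_in_path(const char *name, const char *path, char *out, size_t outsize)
{
    const char *start, *end;

    assert(name != NULL && out != NULL && outsize > 0);
    out[0] = '\0';
    if (*name == '\0')
        return -1;

    if (strchr(name, '/') != NULL) {    /* explicit path: no search */
        if (strlen(name) < outsize && is_executable(name)) {
            strcpy(out, name);
            return 0;
        }
        return -1;
    }

    if (path == NULL || *path == '\0')
        path = ".";                     /* same fallback as execvp */

    for (start = path; ; start = end + 1) {
        const char *dir = start;
        size_t dlen;

        end = strchr(start, ':');
        dlen = end ? (size_t) (end - start) : strlen(start);
        if (dlen == 0) {                /* empty entry means "." */
            dir = ".";
            dlen = 1;
        }
        if (dlen + 1 + strlen(name) + 1 <= outsize) {
            snprintf(out, outsize, "%.*s/%s", (int) dlen, dir, name);
            if (is_executable(out))
                return 0;
        }
        if (end == NULL)
            break;
    }
    out[0] = '\0';
    return -1;
}

#ifdef DEMO_MAIN
int main(int argc, char **argv)
{
    char where[1024];
    int i, rc = 0;

    for (i = 1; i < argc; i++) {
        if (find_in_path(argv[i], getenv("PATH"), where, sizeof where) == 0)
            printf("%s\n", where);
        else {
            fprintf(stderr, "%s: not found\n", argv[i]);
            rc = 1;
        }
    }
    return rc;
}
#endif
```

Because it reads PATH from its own environment and starts no shell, there is no .cshrc to source, which would account for both the correct answers and the lack of delay reported above.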