gnu@hoptoad.uucp (John Gilmore) (05/11/88)
tif@cpe.UUCP wrote: > ...it sounds like perl should have a special variable > that is like $0 only contains a full path. I have often wanted exactly this for C utilities. Wouldn't it be nice if you didn't have to build in the names of your control files -- if the executable could derive the names of its config files from its own name? The problem is that Unix doesn't provide a way to tell your own name. What is passed in argv[0] need not bear any relation to the name of the program (and often doesn't, if the shell has searched PATH to find the executable). On the other hand, the first argument to exec() is always the correct path of the executable (either an absolute or relative path). But it's not available to the executed program. If exec() would pass this value to the executed program, say as argv[-1], then a program could reliably know its own name, and apply a simple transformation to it to find its data files (e.g. for program "XXXXXX/foo", its data files are found in "XXXXXX/lib/foo/whatever"). This works for all values of XXXXXX, whether absolute or relative. For a subsystem like uucp, you would turn e.g. XXXXXX/uucico into XXXXXX/lib/uucp/whatever (replace program name with subsystem name). This would make lots of application programs easier to install; you just copy it into somewhere on your PATH and it will run. For all those "shrink wrap applications" that ABI is likely to provide, this would be a major win. It would also reduce the volume of arcane knowledge required to run a Unix system (e.g. where are the netnews control files kept? How about crontab? How about sendmail configs? How about inet daemon config?) If anyone implements this, I recommend providing a #define AV_EXECNAME -1 and documenting that argv[AV_EXECNAME] is the pathname given to exec(). No sense embedding another magic number (-1) into programs... -- John Gilmore {sun,pacbell,uunet,pyramid,ihnp4}!hoptoad!gnu gnu@toad.com "Use the Source, Luke...."
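The "simple transformation" described above can be sketched in a few lines of C. This is only an illustration: the function name `data_path` and its parameters are hypothetical, and it assumes the caller already has the pathname that was given to exec(), which is exactly what the post says Unix does not provide.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the post): given the path used to exec
 * the program, e.g. "XXXXXX/foo", build "XXXXXX/lib/<subsys>/<file>".
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int data_path(const char *execpath, const char *subsys,
              const char *file, char *out, size_t outlen)
{
    const char *slash = strrchr(execpath, '/');
    int n;

    if (slash == NULL) {
        /* No directory part: the result is relative to the current directory. */
        n = snprintf(out, outlen, "lib/%s/%s", subsys, file);
    } else {
        /* Keep everything before the last '/', then append lib/<subsys>/<file>. */
        n = snprintf(out, outlen, "%.*s/lib/%s/%s",
                     (int)(slash - execpath), execpath, subsys, file);
    }
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Passing the subsystem name separately covers the uucico case: `data_path("XXXXXX/uucico", "uucp", "whatever", buf, sizeof buf)` replaces the program name with the subsystem name.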
limes@sun.uucp (Greg Limes) (05/12/88)
In article <4527@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes: >If exec() would pass this value to the executed program, say as >argv[-1], then a program could reliably know its own name, and apply a >simple transformation to it to find its data files (e.g. for program >"XXXXXX/foo", its data files are found in "XXXXXX/lib/foo/whatever"). >This works for all values of XXXXXX, whether absolute or relative. >For a subsystem like uucp, you would turn e.g. XXXXXX/uucico into >XXXXXX/lib/uucp/whatever (replace program name with subsystem name). In Turbo-C, Borland passes the complete path name of the program executed as argv[0]. This may be specific to Turbo-C, or may be general across MS-DOS. Are there any programs that this would break? >If anyone implements this, I recommend providing a #define AV_EXECNAME -1 >and documenting that argv[AV_EXECNAME] is the pathname given to exec(). >No sense embedding another magic number (-1) into programs... ... and defining AV_EXECNAME as (0) would make this work for Turbo-C and any other environments that do as I described above. As things stand now, if argv[0][0] is '/' then that string is *usually* the name of the executing program. -- Greg Limes #include <std-disclaimer.h>
root@cca.ucsf.edu (Computer Center) (05/12/88)
In article <4527@hoptoad.uucp>, gnu@hoptoad.uucp (John Gilmore) writes: > If exec() would pass this value to the executed program, say as > argv[-1], then a program could reliably know its own name, and apply a > simple transformation to it to find its data files (e.g. for program > "XXXXXX/foo", its data files are found in "XXXXXX/lib/foo/whatever"). > This works for all values of XXXXXX, whether absolute or relative. > For a subsystem like uucp, you would turn e.g. XXXXXX/uucico into > XXXXXX/lib/uucp/whatever (replace program name with subsystem name). Noooooooooo! If the program is in XXXXXX/bin/foo its support should be reachable ^^^ via XXXXXX/lib/foo. Thos Thos Sumner (thos@cca.ucsf.edu) BITNET: thos@ucsfcca (The I.G.) (...ucbvax!ucsfcgl!cca.ucsf!thos) OS|2 -- an Operating System for puppets. #include <disclaimer.std>
andrew@frip.gwd.tek.com (Andrew Klossner) (05/12/88)
[] "This would make lots of application programs easier to install; you just copy it into somewhere on your PATH and it will run." If an application uses this scheme to find its associated files, some useful Unix idioms cease to work. For example, say that "rn" lives in /usr/news, but I don't want /usr/news in my PATH (too many nasty commands are also there). At present I can put a link to /usr/news/rn in a directory that is in my path (e.g., my local bin). With the proposed scheme, that would cause rn to look in my_local_bin/lib/* for its data files instead of in /usr/news/lib/*. -=- Andrew Klossner (decvax!tektronix!tekecs!andrew) [UUCP] (andrew%tekecs.tek.com@relay.cs.net) [ARPA]
jv@mhres.mh.nl (Johan Vromans) (05/13/88)
From article <4527@hoptoad.uucp>, by gnu@hoptoad.uucp (John Gilmore): > If anyone implements this, I recommend providing a #define AV_EXECNAME -1 > and documenting that argv[AV_EXECNAME] is the pathname given to exec(). I'm already using the convention that library/data files belonging to a program are located in a path relative to the name of the program. So I strongly second this suggestion. Until this is adopted by the next C standard, we'll need to have a library routine which does the job, based on argv[0] and the PATH variable (despite the possible problems, there's no better way). -- Johan Vromans | jv@mh.nl via European backbone Multihouse N.V., Gouda, the Netherlands | uucp: ..{uunet!}mcvax!mh.nl!jv "It is better to light a candle than to curse the darkness"
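A library routine of the kind suggested here might look roughly like the sketch below. The name `find_in_path` and its signature are hypothetical. It assumes argv[0] really is the name that was searched for, and it only approximates what a shell does, so it inherits all the "possible problems" mentioned above.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* True if p names a regular file that the caller may execute. */
static int runnable(const char *p)
{
    struct stat st;
    return stat(p, &st) == 0 && S_ISREG(st.st_mode) && access(p, X_OK) == 0;
}

/*
 * Hypothetical routine: find the file a shell would probably have run
 * for `name`. A name containing '/' is used as-is; otherwise each
 * directory of `path` (colon-separated, as in $PATH) is tried in order.
 * Returns 0 and fills out on success, -1 if nothing suitable is found.
 */
int find_in_path(const char *name, const char *path, char *out, size_t outlen)
{
    if (strchr(name, '/') != NULL) {
        if (strlen(name) >= outlen || !runnable(name))
            return -1;
        strcpy(out, name);
        return 0;
    }
    if (path == NULL)
        return -1;
    for (;;) {
        const char *colon = strchr(path, ':');
        size_t len = colon ? (size_t)(colon - path) : strlen(path);
        int n;

        if (len == 0)   /* an empty PATH entry means the current directory */
            n = snprintf(out, outlen, "./%s", name);
        else
            n = snprintf(out, outlen, "%.*s/%s", (int)len, path, name);
        if (n > 0 && (size_t)n < outlen && runnable(out))
            return 0;
        if (colon == NULL)
            break;
        path = colon + 1;
    }
    return -1;
}
```

A caller would typically pass `argv[0]` and `getenv("PATH")`. The result is only as trustworthy as those two inputs.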
wesommer@athena.mit.edu (William Sommerfeld) (05/13/88)
In article <9987@tekecs.TEK.COM> andrew@frip.gwd.tek.com (Andrew Klossner) writes: >[] > > "This would make lots of application programs easier to > install; you just copy it into somewhere on your PATH and it > will run." > >If an application uses this scheme to find its associated files, some >useful Unix idioms cease to work. For example, say that "rn" lives in >/usr/news, but I don't want /usr/news in my PATH (too many nasty >commands are also there). At present I can put a link to /usr/news/rn >in a directory that is in my path (e.g., my local bin). With the >proposed scheme, that would cause rn to look in my_local_bin/lib/* for >its data files instead of in /usr/news/lib/*. I remarked (in private mail) to John Gilmore that what he described was very similar to the Multics referencing_dir mechanism. If it's done right, the application gets passed its _real_ absolute pathname, after all the symlinks have been chased. While I'm here, I might as well lobby for support for a library function/system call which canonicalizes a pathname, chasing all the links and turning it into an absolute pathname. abs_path(".", buf) should be equivalent to getwd(buf). It was useful on Multics. It would be very useful in some cases on UNIX. While normally I am opposed to creating new system calls, a reasonable implementation of abs_path in user code (assuming it didn't "cheat" and use chdir()) would most likely be O(n**2) in terms of directory lookups, whereas a version inside the kernel would be O(n). (Of course, you could always do what was done in both Amber and AEGIS: move namei() or its equivalent out of the kernel and into a shared library..) Bill Sommerfeld wesommer@athena.mit.edu
dmcanzi@watdcsu.waterloo.edu (David Canzi) (05/13/88)
How about if binary software was routinely distributed as (1) a library containing most of the compiled code, (2) a short C source program in which configurable information is compiled as external variables, and (3) a makefile which can be edited to define the configurable options by compiling the C source file with suitable "-D" options. (Or perhaps it would be simpler to edit the C source file directly.) This way, if the program as distributed searches for config and data files under /usr/lib/thingumbob, and you would rather install these files under /usr/local/lib/thingumbob, you'd have that option. And there will be no need to add another feature to the kernel. -- David Canzi
dg@lakart.UUCP (David Goodenough) (05/13/88)
In article <4527@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>If exec() would pass this value to the executed program, say as
>argv[-1], then a program could reliably know its own name, and apply a
>simple transformation to it to find its data files (e.g. for program
>"XXXXXX/foo", its data files are found in "XXXXXX/lib/foo/whatever").
>This works for all values of XXXXXX, whether absolute or relative.
>For a subsystem like uucp, you would turn e.g. XXXXXX/uucico into
>XXXXXX/lib/uucp/whatever (replace program name with subsystem name).

Wait just a minute. If the information is REALLY important, argv[0] is the FULL PATH NAME that the program was invoked with:

Script started on Fri May 13 10:32:21 1988
lakart!dg(~)[1]-> cat eco.c
main(argc, argv)
char **argv;
{
    printf("%s\n", argv[0]);
}
lakart!dg(~)[2]-> eco
eco
lakart!dg(~)[3]-> ./eco
./eco
lakart!dg(~)[4]-> cd ..
lakart!dg(/u2)[5]-> dg/eco
dg/eco
lakart!dg(/u2)[6]-> cd dg/src
lakart!dg(src)[7]-> ../eco
../eco
lakart!dg(src)[8]-> echo ~dg/eco
/u2/dg/eco
lakart!dg(src)[9]-> ~dg/eco
/u2/dg/eco
lakart!dg(src)[10]-> ^D
script done on Fri May 13 10:33:05 1988

Now, if argv[0][0] is a '/' everything is OK, else just do a popen("pwd", "r"); suck it all up, and prepend it to argv[0], with an intervening '/'. You may not have an optimal path, BUT IT WILL BE CORRECT, and ABSOLUTE. Now you can go to work.
--
        dg@lakart.UUCP - David Goodenough               +---+
                                                        | +-+-+
        ....... !harvard!adelie!cfisun!lakart!dg        +-+-+ |
                                                          +---+
daveb@geac.UUCP (David Collier-Brown) (05/16/88)
In article <5307@bloom-beacon.MIT.EDU> wesommer@athena.mit.edu (William Sommerfeld) writes: | I remarked (in private mail) to John Gilmore that what he described | was very similar to the Multics referencing_dir mechanism. If it's | done right, the application gets passed its _real_ absolute pathname, | after all the symlinks have been chased. | | While I'm here, I might as well lobby for support for a library | function/system call which canonicalizes a pathname, chasing all the | links and turning it into an absolute pathname. abs_path(".", buf) should | be equivalent to getwd(buf). It was useful on Multics. It would be | very useful in some cases on UNIX. |... | Bill Sommerfeld | wesommer@athena.mit.edu I have a copy of a program called "name" which appears to do just that (the O(n**2) variant), whose origin is unknown. Would the author care to (re)post it? Shall I? -- David Collier-Brown. {mnetor yunexus utgpu}!geac!daveb Geac Computers Ltd., | "His Majesty made you a major 350 Steelcase Road, | because he believed you would Markham, Ontario. | know when not to obey his orders"
swh@hpsmtc1.HP.COM (Steve Harrold) (05/18/88)
Re: the "name()" function Please post it --------------------- Steve Harrold ...hplabs!hpsmtc1!swh HPG200/13 (408) 447-5580 ---------------------
mer6g@uvaarpa.virginia.edu (Marc E. Rouleau) (05/18/88)
In article <107@lakart.UUCP> dg@lakart.UUCP (David Goodenough) writes: > [ some examples pointing out that (in some cases) what ends up in > argv[0] can be turned into a full pathname by prepending `pwd` to it ] This technique works only if the program being executed is invoked by specifying a relative or absolute path for it. If it is found by your shell as specified by your $PATH variable, all bets are off ... The whole point of John Gilmore's proposal was to address this problem of executables being found by path-search and therefore having no invocation-time knowledge of where they reside. -- Marc Rouleau
allbery@ncoast.UUCP (Brandon S. Allbery) (05/19/88)
As quoted from <4527@hoptoad.uucp> by gnu@hoptoad.uucp (John Gilmore): +--------------- | If exec() would pass this value to the executed program, say as | argv[-1], then a program could reliably know its own name, and apply a | simple transformation to it to find its data files (e.g. for program | "XXXXXX/foo", its data files are found in "XXXXXX/lib/foo/whatever"). | This works for all values of XXXXXX, whether absolute or relative. +--------------- ...until the program does a chdir(), at which point the program must have resolved a relative pathname into an absolute one or it won't be able to use the path any more. Actually, the biggest problem with this is that by the time the kernel has the executable, the pathname has been changed to a (dev, ino) pair. This is less than useful. And as far as I know, the kernel doesn't keep the pathname around any longer than necessary (that being namei()). And what happens if I "ln /usr/lib/uucp/uucico ~/etc/poll"? (Not that I advocate doing so, but....) -- Brandon S. Allbery, moderator of comp.sources.misc {well!hoptoad,uunet!marque,cbosgd,sun!mandrill}!ncoast!allbery Delphi: ALLBERY MCI Mail: BALLBERY
dgk@ulysses.homer.nj.att.com (David Korn[eww]) (05/21/88)
ksh passes the full pathname of the executable as the first environment variable and names it _. Thus, if the program is run by ksh, getenv("_"); returns a pathname for the executable. Now if everyone would follow this convention the problem would be solved. David Korn ulysses!dgk
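A program relying on this convention might read the variable as in the sketch below. The function name `exec_name_from_env` is hypothetical, and rejecting values that do not start with '/' is an added assumption, since nothing guarantees that the invoking shell follows the ksh convention.

```c
#include <stdlib.h>

/*
 * Returns the pathname a ksh-style shell stored in "_", or NULL when
 * the variable is unset or does not hold an absolute path. The value
 * is advisory only: any parent process can put anything it likes in
 * the environment.
 */
const char *exec_name_from_env(void)
{
    const char *p = getenv("_");

    return (p != NULL && p[0] == '/') ? p : NULL;
}
```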
gnu@hoptoad.uucp (John Gilmore) (05/23/88)
mer6g@uvaarpa.virginia.edu (Marc E. Rouleau) wrote: > The whole point of John Gilmore's proposal was to address this problem > of executables being found by path-search and therefore having no > invocation-time knowledge of where they reside. Actually, that's not the point (you could always write a subroutine that searched the path to find argv[0]). The point is that I want a mechanism that cannot be spoofed. Mystery variables in the environment, library routines that look at argv[0], etc, can all be spoofed by a 3-line program (that changes the environment then calls exec(), or that passes different things as the filename to execute versus argv[0] to exec()). If real applications are going to use this, it's critical that they are able to depend upon the pointer they find. Imagine if you could invoke uucp with your own private set of spool directories -- all the security built into it would be pretty useless. You could steal mail, forge things, etc. Several people pointed out that hard links to a program would foul up my proposed mechanism -- they are right. You can prevent people from copying the application to another spot and running it (by making the program unreadable by mortals, even though executable by them); the kernel could resolve all the symlinks before giving you the path; but you can hard-link to anything you can name, even if it has '0' permissions. Perhaps a solution is to only provide this facility to programs with a link count of 1? In other words, if you hard-link to a program, it would no longer be provided its name, and could exit with an error message. This is probably even worse -- by simply creating a hard link to /usr/lib/uucico, you could make it unable to find its directories, even when run from the right place. Perhaps a variation on one of these themes would work. 
If the kernel was to output pathnames to user programs (presumably so that the programs can trust that the pathnames are uncorrupted), it might be reasonable to put some kinds of access control on hard links. Maybe if/when Unix file protection ever gets revised, this can be considered. Someone else pointed out that Multics can reliably tell you the pathname of your running program. Maybe Multics didn't have hard links, so it was trustable? -- John Gilmore {sun,pacbell,uunet,pyramid,ihnp4}!hoptoad!gnu gnu@toad.com "Use the Source, Luke...."
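The link-count test floated above is easy to express with `stat()`, as this sketch shows. The name `has_single_link` is hypothetical, and the check carries the same weakness the post identifies: anyone who can name the file can raise its link count.

```c
#include <sys/stat.h>

/*
 * Returns 1 if path names a regular file with exactly one hard link,
 * 0 if it has more links or is not a regular file, -1 if stat() fails.
 */
int has_single_link(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    if (!S_ISREG(st.st_mode))
        return 0;
    return st.st_nlink == 1;
}
```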
dlm@cuuxb.ATT.COM (Dennis L. Mumaugh) (05/25/88)
In article <4626@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes: >Actually, that's not the point (you could always write a subroutine >that searched the path to find argv[0]). The point is that I want a >mechanism that cannot be spoofed. Mystery variables in the >environment, library routines that look at argv[0], etc, can all be >spoofed by a 3-line program (that changes the environment then calls >exec(), or that passes different things as the filename to execute >versus argv[0] to exec()). If real applications are going to use this, >it's critical that they are able to depend upon the pointer they find. There is such a facility that originated in Version 8 that will see light of day in NEW ATT releases of UNIX. This is part of the /proc file system. It is an ioctl that returns a file descriptor of the text of the process. PIOCOPENT -- provides a read-only file descriptor for the executable file associated with the "traced" process. This allows a debugger to find the symbol table without having to know any path names. Once you have the file descriptor, fstat it and get the device, inode pair, and then execute ncheck -i on the correct device to get a path name. Of course this is modulo links (hard or symbolic). What? You must be root to run ncheck? True, but why would you be so concerned about being lied to otherwise. Actually, the PIOCOPENT would have been very useful for the V6 Adventure game that Jim Gillogly wrote. He put the messages for the game at the end of the a.out following the "meaningful" part of the a.out. His program did all sorts of contortions to find the name of the file so it could be opened and read. [It had to be installation and user independent/proof.]

With the ioctl from /proc, it would be a three line code section:

    sprintf(procname,"/proc/%05d",getpid());
    procfd = open(procname,O_RDONLY);
    textfd = ioctl(procfd,PIOCOPENT);
    lseek(textfd, (long)offset,0);

So, if that is what you are really intending to do, use Version 8 or wait a year -- it is already available for System V Release 3.1. for the 3B4000.
--
=Dennis L. Mumaugh Lisle, IL ...!{ihnp4,cbosgd,lll-crg}!cuuxb!dlm
pjh@mccc.UUCP (Pete Holsberg) (05/25/88)
In article <10310@ulysses.homer.nj.att.com> dgk@ulysses.homer.nj.att.com (David Korn[eww]) writes:
...
...ksh passes the full pathname of the executable as the first environment
...variable and names it _. Thus, if the program is run by ksh,
...getenv("_"); returns a pathname for the executable. Now if everyone
...would follow this convention the problem would be solved.
Aspen Technology's implementation of ksh sets _ to just the name of the
executable. (At least, that is what is stored in _ in the environment space.)
aburt@isis.UUCP (Andrew Burt) (05/26/88)
In article <4626@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes: >mer6g@uvaarpa.virginia.edu (Marc E. Rouleau) wrote: >> The whole point of John Gilmore's proposal was to address this problem >> of executables being found by path-search and therefore having no >> invocation-time knowledge of where they reside. > >Actually, that's not the point (you could always write a subroutine >that searched the path to find argv[0]). The point is that I want a >mechanism that cannot be spoofed. Mystery variables in the >environment, library routines that look at argv[0], etc, can all be >spoofed by a 3-line program (that changes the environment then calls >exec(), or that passes different things as the filename to execute >versus argv[0] to exec()). If real applications are going to use this, >it's critical that they are able to depend upon the pointer they find. >Imagine if you could invoke uucp with your own private set of spool >directories -- all the security built into it would be pretty useless. >You could steal mail, forge things, etc. If all you need is a secure method of obtaining a single pathname (e.g., for the lib dir of an application) why not use [**kludge alert**] a dummy entry in /etc/passwd with home dir set to the path desired (actual login disabled, of course)? Program wanting to know its lib dir just does getpwnam(compiled_in_application_id) and off it goes. Now I hate to junk up /etc/passwd with this sort of thing (and have an alternative suggestion below) but this is easily done with current tools. I'd like to see a convention on usernames chosen for application "users", maybe prepending an underscore to a simple name for the application (_uucp, _news, _nethack, etc.). Admittedly this isn't needed but it makes it more obvious the entry isn't for a real user. What would even be more useful (and what this approximates in an ugly way) is a global environment (that isn't user changeable). A true global env. 
could be implemented by a lib func (getsysenv(var_name)) that looks for "var_name=..." in a file, /etc/environment say. (Granted, we could make this a system call and store the environment in core all the time, but it strikes me programs wouldn't look up definitions so often that much time would be saved.) Makes the system admin job easier too. I always feel a little uneasy editing passwd (I dial in on often noisy phone lines), I'd feel better editing a less crucial file. Implementation-wise there's not much to this -- anyone who wants a copy let me know -- but there is the problem of getting it universally adopted. The passwd approach has the advantage that the file exists and people know what it does. (On the other hand, the /etc/environment would be more adoptable for non-unix systems where there is no passwd...) Thoughts anyone? -- Andrew Burt isis!aburt Fight Denver's pollution: Don't Breathe and Drive.
jv@mhres.mh.nl (Johan Vromans) (05/28/88)
From article <2272@isis.UUCP>, by aburt@isis.UUCP (Andrew Burt): > What would even be more useful (and what this approximates in an ugly way) > is a global environment (that isn't user changeable). A true global env. > could be implemented by a lib func (getsysenv(var_name)) that looks for > "var_name=..." in a file, /etc/environment say. (Granted, we could make > this a system call and store the environment in core all the time, but it > strikes me programs wouldn't look up definitions so often that much time > would be saved.) Makes the system admin job easier too. I always feel > a little uneasy editing passwd (I dial in on often noisy phone lines), I'd > feel better editing a less crucial file. > > Thoughts anyone? I like the idea. It seems to me that it is also useful for tailorable system constants like hostname, domain-name, number-of-lines for the system printer and (default) timezone. The main thing is that, although everyone can use it, only the system administrator can change the settings. This allows for security and reliability. To get it accepted: make a solid, good-looking implementation and post it to comp.sources.unix. And then start posting programs which use it. -- Johan Vromans | jv@mh.nl via European backbone Multihouse N.V., Gouda, the Netherlands | uucp: ..{uunet!}mcvax!mh.nl!jv "It is better to light a candle than to curse the darkness"
boyd@basser.oz (Boyd Roberts) (05/31/88)
You people are sick. The best you're going to get is to change the command interpreter to make argv[0] == the pathname passed to exec. Then it's a convention. Security, ``..'' & symbolic links be buggered. I don't believe this discussion. This is why Roche made diazepam (Valium). With 5mg of diazepam comp.unix.wizards would become bearable (laughable). Disclaimer: what wizards? Boyd Roberts boyd@basser.cs.su.oz ``When the going gets weird, the weird turn pro...''
limes@ouroborous (Greg Limes) (06/10/88)
GENERAL COMMENTS First off, thanks in advance for not wiring the base directory into the program anywhere; your application will fit nicely into a networked workstation environment where the users may mount your installed directory tree anywhere. IGNORE THE ENVIRONMENT Fancy environment variables are fine, but these fail in unexpected ways; remember that the variable is blindly inherited across exec() calls. Thus, if your program was started by a "make" (or similar utility), you may get pointed to the wrong guy. Also, you may find that a large number of installations will not support this special new environment variable in any case. FORGET MODIFYING THE KERNEL Can you imagine trying to get all the Unix vendors together on this? Can you imagine trying to get all the customers to upgrade? I know of at least one major installation of Sun workstations that is still running SunOS 3.2 Beta! DUPLICATE exec()'s WORK The only thing we can really count on (and even this not always) is that, if we do the same kind of search that exec() does, we should come up with the same destination. So, it looks like we will need to scan the $PATH variable, looking for an executable called (argv[0]). REMEMBER SYMBOLIC LINKS Now, we probably want to find the directory, so toss in a readlink() and you are there. Add error checking to taste, season well with lint. FINGERPRINT THE DIRECTORY To make this secure, fingerprint your directory. Make a read-only file that is set-uid to a user id number that your EXECUTABLE knows about, and put some data in the file so you are sure this is the right fingerprint. If I were worried about making, say, GnuEmacs "absolutely sure" of its start point, I would set up a "message of the day", owned by (say) daemon, setuid, and read only. Make all your critical files owned by and writable only by the same user. 
Joe Hacker who duplicates the installation with the intention of changing things around will be unable to duplicate the key file, and the application will know that it has found an improper installation directory. You may want to fingerprint each directory in the tree, just in case someone gets fancy with mount points. Anybody see any big holes here? (yea, a stupid question, I know...) -- Greg Limes [limes@sun.com] frames to /dev/fb
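The fingerprint test on the key file could be sketched as below. The function name `fingerprint_ok` is hypothetical. The sketch checks only ownership and mode bits, and leaves out the comparison of the file's contents that the post also calls for.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical check of the "fingerprint" file: it must be a regular
 * file (not a symlink), owned by the uid the executable knows about,
 * with the set-uid bit on and no write permission for anyone.
 * Returns 1 if it matches, 0 if it does not, -1 if lstat() fails.
 */
int fingerprint_ok(const char *keyfile, uid_t expected_owner)
{
    struct stat st;

    if (lstat(keyfile, &st) != 0)
        return -1;
    if (!S_ISREG(st.st_mode))
        return 0;
    if (st.st_uid != expected_owner)
        return 0;
    if (!(st.st_mode & S_ISUID))
        return 0;
    if (st.st_mode & (S_IWUSR | S_IWGRP | S_IWOTH))
        return 0;
    return 1;
}
```

The idea, as described in the post, is that an ordinary user copying the tree cannot create a file owned by the expected user, so a copied installation fails the check.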