ramsdell@linus.UUCP (11/30/87)
In writing programs that dynamically load code, one usually needs to
know from which file the executing image was exec'ed. I conclude from
my own review of Draft 12 of the POSIX standard (P1003.1), that there
is no way of knowing this information. What do you think about adding
the following two system calls, execlfd and execvfd?

Synopsis
   execlfd like execl
   execvfd like execv

Description
These exec's replace the current process with a new image. Before
replacing the image, but after determining the identity of the file to
be exec'ed, they open the binary image and place the file descriptor in
an external int called "boot_fd". The exec fails if the open fails.
Thus when a C program is executed as a result of one of these calls, it
is entered as a C procedure call as follows:

   extern int boot_fd;
   extern char **environ;

   int main(argc, argv)
   int argc;
   char **argv;

John
chris@mimsy.UUCP (Chris Torek) (11/30/87)
In article <18491@linus.UUCP> ramsdell@linus.UUCP (John D. Ramsdell) writes:
>In writing programs that dynamically load code, one usually needs to
>know from which file the executing image was exec'ed. I conclude from
>my own review of Draft 12 of the POSIX standard (P1003.1), that there
>is no way of knowing this information.

True enough.

>What do you think about adding the following two system calls, execlfd
>and execvfd? ... These exec's [would] replace the current process with
>a new image. Before replacing the image, but after determining the
>identity of the file to be exec'ed, they open the binary image and
>place the file descriptor in an external int called "boot_fd" [in
>the new image!].

Points:

0) hard to implement (kernel has to look for symbol table!)
1) does not help programs that are run by the old exec calls
2) as a side issue, there is no execl system call; only execve is
   in fact a syscall (the rest can be done with library routines).

A much simpler method for getting a handle on the current process's
file would be to have a /dev entry that can be opened for (at least)
reading. It would mean that the kernel would have to `open' and
`close' files being executed, so that while they run, they cannot be
removed, but this is already true for all but OMAGIC files.
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
Anagrams: Ate rock, cork tea, a rocket: racket, O!
ok@quintus.UUCP (Richard A. O'Keefe) (12/01/87)
In article <18491@linus.UUCP>, ramsdell@linus.UUCP (John D. Ramsdell)
suggests two new variants of the already confusing exec() family,
which leave the executable file open with its fd in "extern int boot_fd"
of the new image.
But what happens if the file being executed is a shell script?
There isn't the slightest need for this, anyway. You can already
obtain pretty much the effect he described by writing your own
execlfd or execvfd which
- determines the identity of the file
- opens it, yielding FD
- adds "BOOTFD="FD to the new environment
- calls the appropriate exec() with the new environment
Then the called program can just getenv("BOOTFD").
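
The steps above can be sketched in C roughly as follows. This is only an
illustration under stated assumptions: the names env_with_bootfd() and
execvfd() are hypothetical, the path is assumed to name the file directly
(no PATH search is done), and modern POSIX calls are used rather than
anything from Draft 12.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: copy envp, drop any stale BOOTFD entry, and
 * append "BOOTFD=<fd>". Returns a malloc'ed, NULL-terminated vector
 * whose last entry is also malloc'ed, or NULL on allocation failure. */
char **env_with_bootfd(char *const envp[], int fd)
{
    size_t n = 0, out = 0;
    char **copy;
    char *entry;

    assert(envp != NULL);
    while (envp[n] != NULL)
        n++;

    copy = malloc((n + 2) * sizeof *copy);
    entry = malloc(32);
    if (copy == NULL || entry == NULL) {
        free(copy);
        free(entry);
        return NULL;
    }
    snprintf(entry, 32, "BOOTFD=%d", fd);

    for (size_t i = 0; i < n; i++)
        if (strncmp(envp[i], "BOOTFD=", 7) != 0)
            copy[out++] = envp[i];
    copy[out++] = entry;
    copy[out] = NULL;
    return copy;
}

/* Hypothetical user-level execvfd(): opens the file, records the
 * descriptor in the environment, then calls execve().
 * Returns -1 on failure, like execve(). */
int execvfd(const char *path, char *const argv[], char *const envp[])
{
    int saved;
    char **newenv;
    int fd = open(path, O_RDONLY);  /* no close-on-exec: must survive exec */

    if (fd < 0)
        return -1;                  /* the exec fails if the open fails */

    newenv = env_with_bootfd(envp, fd);
    if (newenv != NULL) {
        size_t last = 0;

        execve(path, argv, newenv); /* returns only on failure */
        saved = errno;
        while (newenv[last + 1] != NULL)
            last++;
        free(newenv[last]);         /* the BOOTFD entry we allocated */
        free(newenv);
        errno = saved;
    }
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}
```

The called program would then read getenv("BOOTFD") and convert the
string to an int before using it as a descriptor. Note that the open()
here needs read permission, so an execute-only file cannot be run
this way.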
IEEE 1003.1 s 3.1.2.2 explicitly says "The argument arg0 should point
to a filename that is associated with the process being started by
one of the exec functions." If the calling program is careful to
obey this rule, particularly if it is careful to pass the absolute
file name of the called program, the called program can open its
own file if it wants to without the need for any new exec variant.
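
On the receiving side, a minimal sketch of that idea might look like the
following. The helper name open_self() is hypothetical, and the code
assumes the caller really did pass an absolute file name as arg0; anything
else is refused rather than guessed at.

```c
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical helper: open the program's own file, trusting argv[0]
 * only when it is an absolute path. Returns a descriptor, or -1. */
int open_self(const char *arg0)
{
    /* A bare or relative name would need a PATH search or the original
     * working directory, neither of which is reliable here. */
    if (arg0 == NULL || arg0[0] != '/')
        return -1;
    return open(arg0, O_RDONLY);
}
```

A program would call open_self(argv[0]) early in main() and fall back to
some other scheme when it returns -1.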
karl@tut.cis.ohio-state.edu (Karl Kleinpaste) (12/01/87)
ramsdell@linus.UUCP writes:
   Synopsis
   execlfd like execl, execvfd like execv
   Description
   These exec's replace the current process with a new image. Before
   replacing the image, but after determining the identity of the file to
   be exec'ed, they open the binary image and place the file descriptor in
   an external int called "boot_fd". The exec fails if the open fails.

This sounds ill-advised to me. On the one hand, it forces the
consumption of yet another file descriptor. But more importantly, it
confuses the meaning of the permissions r-xr-xr-x and --x--x--x. What
do you give the program in boot_fd, if permissions are the latter,
deliberately preventing the reading of the program?
--
Karl
mbr@aoa.UUCP (Mark Rosenthal) (12/01/87)
In article <18491@linus.UUCP> ramsdell@linus.UUCP (John D. Ramsdell) writes:
>In writing programs that dynamically load code, one usually needs to
>know from which file the executing image was exec'ed. I conclude from
>my own review of Draft 12 of the POSIX standard (P1003.1), that there
>is no way of knowing this information. What do you think about adding
>the following two system calls, execlfd and execvfd?

He then goes on to describe system calls which would provide the exec'ed
program with a file descriptor which could be used to read the file.

You've identified a real problem, but your proposed solution puts the
responsibility in the wrong place. If you implement execlfd and execvfd
as described, the exec'er (i.e. the program calling exec??()) must
determine whether or not the exec'ee (i.e. the program called by
exec??()) will want to dynamically load code, in order to know whether
to call execl() or execlfd(). This allows the possibility of a situation
in which the exec'er uses execl() when it should have used execlfd().
Thus boot_fd would not get initialized, and when the exec'ee tries to
read the file from which it was exec'ed, the read fails, with
potentially disastrous consequences for the exec'ee.

A better design is to prevent the situation from ever arising. The
system call should be the same regardless of whether or not the exec'ed
file dynamically loads code. The desired result could be achieved by
adding a new special file (and device driver) which any program could
open to read the file from which it was exec'ed.
--
Mark of the Valley of Roses ...!{harvard,ima}!bbn!aoa!mbr
weiser.pa@Xerox.COM (12/01/87)
As "Richard A. O'Keefe" <ok@quintus.uucp> (and others) point out, any
method that requires a file to be executed by a special call can be
simulated by using environment variables or other means to first
remember the file being executed, then do the exec.

As Chris T. points out, this is really not adequate, since you simply
may not know ahead of time that the program you are running will want
to figure out where it is running from. A /dev scheme will work ok, or
a new system call "givemeanfdfromwhereIamrunning".

As a workaround, I do the following in the dynamic loading/linking
system I wrote for the Portable Cedar project at PARC: any program
which wants to inquire about its load status is run from a shell script
which invokes it via its full path name. Then by looking at argv[0] it
can find the file from which it is running. In other words, program
'foo' in /usr/local/bin is really the script

   #!/bin/csh -f
   exec /usr/lib/foo.real

This way folks with /usr/local/bin on their path can just say 'foo' to
run it, but the real foo gets run with a full path name.

Note that the execer doesn't have to do anything special to run foo,
but foo has to be careful to install itself in the world in a special
way.

-mark
karl@mumble.cis.ohio-state.edu (Karl Kleinpaste) (12/02/87)
weiser.pa@Xerox.COM writes:
   you simply may not know ahead of time that the program you are
   running will want to figure out where it is running from. A /dev
   scheme will work ok, or a new system call
   "givemeanfdfromwhereIamrunning".

One of the research versions of UNIX at Murray Hill supports this.
It's implemented as an ioctl(2) (or was it an fcntl(2)?) against a
file descriptor. This can be used in conjunction with /proc; you open
the /proc/pid special file for the process you want to see (formatting
getpid() into the name with sprintf() should do the trick to look at yourself), then issue
this ioctl() against that file descriptor; you get back a file
descriptor pointing to the original binary which was exec'd.
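
As a rough sketch of the first step only: getpid() returns an integer, so
the /proc name has to be formatted from it rather than parsed. The helper
name proc_path() is hypothetical, and the plain decimal naming
"/proc/<pid>" is an assumption about how that system names its entries.
The ioctl request itself is not named above, so it is left out here.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write the /proc entry name for pid into buf.
 * Returns 0 on success, -1 if buf is too small. */
int proc_path(char *buf, size_t len, pid_t pid)
{
    int n = snprintf(buf, len, "/proc/%ld", (long)pid);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A process looking at itself would call proc_path(buf, sizeof buf,
getpid()), open the resulting name, and then issue the ioctl described
above against that descriptor.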
This is used in a debugger, where you give it the pid of the currently-
existing process to debug; the debugger opens /proc/that-pid, then
gets a hold of the symbol table for that program by using this new
ioctl. The debugger never needed to know the name of the program file
being debugged, and the debugged program doesn't need to be a child of
the debugger, either.
I believe this was discussed in one of the papers in the Oct 1984 Bell
Labs Tech. Journal.
-=-
Karl
neilb@elecvax.eecs.unsw.oz (Neil F. Brown) (12/03/87)
Mark of the Valley of Roses writes:
>In article <18491@linus.UUCP> ramsdell@linus.UUCP (John D. Ramsdell) writes:
>>In writing programs that dynamically load code, one usually needs to
>>know from which file the executing image was exec'ed. I conclude from
>>my own review of Draft 12 of the POSIX standard (P1003.1), that there
>>is no way of knowing this information. What do you think about adding
>>the following two system calls, execlfd and execvfd?
>
>He then goes on to describe system calls which would provide the exec'ed program
>with a file descriptor which could be used to read the file.
>
>You've identified a real problem, but ...
>
>..... The desired result could be achieved by adding a
>new special file (and device driver) which any program could open to read
>the file from which it was exec'ed.
>--

I have two small comments to make.

The first is about this idea of a new special file and device driver.
What is suggested is writing code to fiddle with file descriptors for
you, and making it look like a character special device. It hardly
seems the right place to put the code to me. After all, it's not a
character device that we want opened (or that we get).
This all started with the /dev/fd drivers. Very useful things, but not
really devices. The problem is that there isn't a clean place in Unix
to put this functionality, so one should be made.
Probably a new file type,
   (mode & IFMT) == IFFILEDESC
or something like that.

The second is that we should look at this problem of accessing a
process's object file a bit more deeply.
The problem as I see it is that there are 20+4 (or 64+4 or whatever)
file descriptors available to a process, of which only 20 (or 64..) are
first class citizens.
The other four are
   current directory     - available by opening "" or "."
   root directory        - available as "/"
   controlling terminal  - "/dev/tty" (probably)
   process's object file - not currently available.
Admittedly the controlling terminal is not stored as a file descriptor,
but it easily could be.
Ideally, each of these should be available on an equal standing with
other descriptors (though the semantics of closing them would need
careful thought). Maybe they could be file descriptors -1, -2, -3, -4;
or maybe not, there could be problems.

Anyway, it's food for thought.

NeilBrown
allbery@ncoast.UUCP (Brandon Allbery) (12/03/87)
As quoted from <2470@tut.cis.ohio-state.edu> by karl@tut.cis.ohio-state.edu (Karl Kleinpaste):
+---------------
| ramsdell@linus.UUCP writes:
| replacing the image, but after determining the identity of the file to
| be exec'ed, they open the binary image and place the file descriptor in
| an external int called "boot_fd". The exec fails if the open fails.
|
| consumption of yet another file descriptor. But more importantly, it
| confuses the meaning of the permissions r-xr-xr-x and --x--x--x. What
+---------------

Since the expressed intent is to be able to get at the namelist of the
program, maybe a special system call should be created to return the
process's namelist. I admit, it's ugly and non-Unix, but maybe this is
itself an indication that, with dynamic loading being the latest
development, the whole idea of the namelist should be re-evaluated.

A possible solution is to distinguish between compiler symbols and
"exported" dynamic-linking symbols; the latter would be stored as part
of the process's "environment" (ublock). Another possibility is
produced by System V shared libraries: system functions are always at a
fixed address, "linkage" variables could also be placed at a fixed
address, and a global table of linked-in functions would be used to
call such routines. Of course, this makes dynamic functions not look
like statically-linked ones, but I doubt that we could deal with that
problem anyway (unless, of course, we all program in Lisp ;-). It also
produces a difference in variables, which is an even more severe
limitation.
--
Brandon S. Allbery necntc!ncoast!allbery@harvard.harvard.edu
{hoptoad,harvard!necntc,cbosgd,sun!mandrill!hal,uunet!hnsurg3}!ncoast!allbery
Moderator of comp.sources.misc
allbery@ncoast.UUCP (Brandon Allbery) (12/08/87)
As quoted from <3851@elecvax.eecs.unsw.oz> by neilb@elecvax.eecs.unsw.oz (Neil F. Brown):
+---------------
| This all started with the /dev/fd drivers. Very useful things, but not really
| devices. The problem is that there isn't a clean place in Unix to put this
| functionality, so one should be made.
| Probably a new file type,
| (mode & IFMT) == IFFILEDESC
| or something like that.
+---------------

May I suggest that an existing feature in BSD UNIX could be generalized
to provide this? Why not generalize "bind"? (i.e. "flink()")

As for the file descriptor of an executable -- sounds like the process
file system of V8 is the best way to do this. Either another special
file system or an ioctl() or fcntl() on a "file" in /proc could do
this. In fact, it seems to me that the two facilities might be
generalized together, thus also producing shared-memory "files" and
other exotic (exotic? Tell that to Xenix!) critters.

+---------------
| The problem as I see it is that there are 20+4 (or 64+4 or whatever) file
| descriptors available to a process, of which only 20 (or 64..) are first
| class citizens.
+---------------

80 (+4) [3B1. Golly, it even beats 4.3BSD! ;-)]

+---------------
| The other four are
| current directory - available by opening "" or "."
| root directory - available as "/"
| controlling terminal - "/dev/tty" (probably)
| process's object file - not currently available.
| Admittedly the controlling terminal is not stored as a file descriptor,
| but it easily could be.
+---------------

I thought V8 put /dev/tty on fd 127?

+---------------
| Ideally, each of these should be available on an equal standing with other
| descriptors (though the semantics of closing them would need careful thought).
+---------------

With the exception of the process's object file, I'd disallow close()
on those fd's; close() should probably return EINVAL.
One could argue for the tty fd being closeable, however; the result
should be to both close the tty and detach the process from the
terminal (set pgrp to 0 or to pid).
--
Brandon S. Allbery necntc!ncoast!allbery@harvard.harvard.edu
{hoptoad,harvard!necntc,cbosgd,sun!mandrill!hal,uunet!hnsurg3}!ncoast!allbery
Moderator of comp.sources.misc