marcp@beryl.berkeley.edu.UUCP (04/14/87)
Hello! I've a couple of questions in UNIX 4.2.

1. How do programs like "more" distinguish between text files and
   executable files? Hopefully, there's something surer than
   just taking a sample of a file and testing it. (This question
   came up when a bunch of people started accidentally sending
   executables to a line printer, and I was trying to figure
   out a way to filter out the execs from the texts).

2. Is it possible to, while in a C program, call another program and
   put it into the background? Actually, I know it's possible,
   'cause I can do it with a line like:

        system("cat textfile &");

   This won't work, however, if I try to "more" the file instead.
   What determines what can be put in the background and what can't?
   Is there some way to run a program from within a program, and have
   it return upon completion to the original program besides "system"?
   (The execl series, of course, doesn't return.)

Thanks for your time,

        Marc M. Pack
avolio@gildor.dec.com (Frederick M. Avolio) (04/14/87)
In article <3164@jade.BERKELEY.EDU> marcp@beryl.berkeley.edu (Marc M. Pack) writes:
>Hello! I've a couple of questions in UNIX 4.2.
>
>1. How do programs like "more" distinguish between text files and
>   executable files?

More 1) checks the file type to make sure it is not a directory, and
2) looks for a magic number at the head of the file, for example
0407 (exec.), 0410 (pure exec.), 0413 (demand paged), 0177545 (old
archive)... If it finds one, it knows that it should not more the
file and gives an appropriate message. It does not know about data
files, as far as I can tell, and more will go ahead and show them to
you if you want. Not too good for a printer either...

The file program samples data. If it finds characters greater than
127 (that is, bytes with the high bit set) it decides it is a "data"
file. A filter such as this is maybe your best bet if you have a
"simple" lineprinter...

>2. Is it possible to, while in a C program, call another program and
>   put it into the background? Actually, I know it's possible,
>   'cause I can do it with a line like:
>        system("cat textfile &");
>
>   This won't work, however, if I try to "more" the file instead.
>   What determines what can be put in the background and what can't?
>   Is there some way to run a program from within a program, and have
>   it return upon completion to the original program besides "system"?
>   (The execl series, of course, doesn't return.)

You're on the right track... execl et al. don't return; they overlay
the calling process, so what you do is a fork and an execl. The parent
can wait for the child to finish or not. Basically this is what system
is. It ...

        FORK
        EXEC SHELL WITH COMMAND LINE
        PARENT WAITS

As in:

        /* FORK RETURNS THE PID OF THE CHILD TO THE PARENT AND 0 TO THE CHILD */
        if ((pid = fork()) == 0)        /* IF CHILD */
        {
                execl("/bin/sh", "sh", "-c", arg, 0);
                printf("HELP! SOMEONE STOLE THE SHELL!\n");
                exit(1);
        }
        wait(0);        /* WAIT SATISFIED WHEN CHILD FINISHES */

So this kind of thing will do what you want. See manual page on fork
(section 2) for more details.

(If on a 4.*BSD or Ultrix system use cfork instead... Why? I don't
know, the same reason you type 'sync' three times before halting :-).)

BTW, "more" doesn't work too well in the background as you did it
because it is not associated at that point with the tty anymore. Cat
doesn't care what it cats to.

Fred
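The magic-number test described in this reply can be sketched roughly as follows. This is only an illustration, not the source of more: the function name has_known_magic is made up, and the code assumes the magic number is the first 16-bit word of the file stored low byte first (as on a PDP-11 or VAX). Real a.out headers differ between systems.

```c
#include <stddef.h>

/* Magic numbers quoted in the post above (octal). */
static const unsigned int known_magic[] = { 0407, 0410, 0413, 0177545 };

/*
 * Hypothetical helper: returns 1 if the first two bytes of buf, read as
 * a little-endian 16-bit word (an assumption), match one of the magic
 * numbers listed above, and 0 otherwise.
 */
int has_known_magic(const unsigned char *buf, size_t len)
{
    unsigned int word;
    size_t i;

    if (len < 2)
        return 0;               /* too short to hold a magic number */

    word = (unsigned int)buf[0] | ((unsigned int)buf[1] << 8);

    for (i = 0; i < sizeof known_magic / sizeof known_magic[0]; i++)
        if (word == known_magic[i])
            return 1;
    return 0;
}
```

A print filter could read the first few bytes of its input, call this helper, and refuse the job on a match. As the post notes, that still lets arbitrary data files through.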
josh@hi.UUCP (04/14/87)
In article <3164@jade.BERKELEY.EDU> marcp@beryl.berkeley.edu (Marc M. Pack) writes:
>Hello! I've a couple of questions in UNIX 4.2.
>
>1. How do programs like "more" distinguish between text files and
>   executable files? Hopefully, there's something surer than
>   just taking a sample of a file and testing it.

I think this is the way it does it. First it stat's the file to see if
it is a directory, then it reads in some of it to see if it is ASCII.

>
>2. Is it possible to, while in a C program, call another program and
>   put it into the background? Actually, I know it's possible,
>   'cause I can do it with a line like:
>
>        system("cat textfile &");
>
>   This won't work, however, if I try to "more" the file instead.
>   What determines what can be put in the background and what can't?
>   Is there some way to run a program from within a program, and have
>   it return upon completion to the original program besides "system"?
>   (The execl series, of course, doesn't return.)

This is a loaded question and can become far more complicated than
you know...

The command more(1) checks to see if it is run on a terminal (a tty).
This is why "more" will not run in a system() call. You CAN put almost
anything in the background, it is just harder if it requires to be run
on a terminal (talk(1), more(1), etc).

A short source for system might look like (Hmm...):

        system(string)
        char string[];
        {
                if (vfork()) {
                        return (wait(0));
                } else {
                        ... parse string ...
                        execl(prog, args);
                }
        }
        /* This is not a complete system() but I am writing it from memory. */

If you wish to run a more(1) or a talk(1) or rogue(6), it requires the
program to open a pty. This makes the program (more) think it is on a
terminal while it really is not. This is how rlogin works. Try the
following:

        1) Set up a .rhosts file in your home directory
        2) rsh to that account using the following command:
                % rsh {machine} /bin/csh -i
        3) Try doing a "more" of a file.

Gets sick eh? This is one reason why pty's were invented.
I have a whole library of routines for networking, ptys, etc that I can post if there are any requests for it... > >Thanks for your time, > > Marc M. Pack Hope I was helpful --Josh Siegel -- Josh Siegel (siegel@hc.dspo.gov) (505) 277-2497 (Home) I'm a mathematician, not a programmer!
dce@mips.UUCP (04/15/87)
In article <3164@jade.BERKELEY.EDU> marcp@beryl.berkeley.edu (Marc M. Pack) writes: >Hello! I've a couple of questions in UNIX 4.2. > >1. How do programs like "more" distinguish between text files and > executable files? Hopefully, there's something surer than > just taking a sample of a file and testing it. (This question > came up when a bunch of people started accidentally sending > executables to a line printer, and I was trying to figure > out a way to filter out the execs from the texts). In standard 4.2/4.3 systems, more, vi, and file all have "magic numbers" (numbers found at the beginning of special files like object files) coded in them. In System V-based systems (and some BSD-based commercial systems), the file command uses a file (/etc/magic) that contains magic number information. Tektronix's UTek even supplies a libc subroutine that interfaces to /etc/magic. In any event, you probably want to make your filter check for nulls and high bits ((x & 0200) != 0), since not all "garbage files" are known to vi and more. >2. Is it possible to, while in a C program, call another program and > put it into the background? Actually, I know it's possible, > 'cause I can do it with a line like: > > system("cat textfile &"); > > This won't work, however, if I try to "more" the file instead. > What determines what can be put in the background and what can't? > Is there some way to run a program from within a program, and have > it return upon completion to the original program besides "system"? > (The execl series, of course, doesn't return.) The more command works funny if you try to put it in the background because it works very closely with the tty. As for there being a way to execute subprograms other than with system(), there is a way (remember: the shell is just another command as far as Unix is concerned). 
The idea is something like:

        pid = fork();
        if (pid < 0) {
                perror("fork");
                return;
        }
        if (pid == 0) {         /* child */
                execlp(command, command, arg1, arg2, ..., 0);
                perror(command);
                _exit(127);
        }
        while ((wpid = wait(0)) != pid) {
                if (wpid < 0) {
                        break;
                }
        }

Note that the last loop is important (this is due to the way pipes
work, and the lack of this loop used to cause a bug to show up in the
crypt command).

The above code is effectively what system() does, but if you don't
need any of the special shell features (redirection, pipes, variables,
etc.) or want to do things like redirection yourself (see fpopen(3s)),
this saves you from forking a shell.

--
David Elliott {decvax,ucbvax,ihnp4}!decwrl!mips!dce
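A compilable variant of the fragment above, for readers who want to try it. This is a sketch under assumptions: the name run_and_wait and the argv-style interface (execvp rather than execlp) are choices made for the example, not part of the original post.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] with the given NULL-terminated
 * argument vector, without going through a shell, and wait for it.
 * Returns the child's exit status, or -1 if the fork or wait failed
 * or the child did not exit normally.
 */
int run_and_wait(char *const argv[])
{
    pid_t pid, wpid;
    int status = 0;

    pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {                     /* child */
        execvp(argv[0], argv);
        perror(argv[0]);                /* reached only if exec failed */
        _exit(127);
    }
    /* Keep waiting until *our* child is reported, as in the post. */
    while ((wpid = wait(&status)) != pid) {
        if (wpid < 0)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, an argument vector of { "sh", "-c", "exit 3", NULL } should make run_and_wait return 3, and a command that cannot be found should yield 127.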
jon@eps2.UUCP (04/15/87)
In article <1296@decuac.DEC.COM>, avolio@gildor.dec.com (Frederick M. Avolio) writes: > if ((pid = fork()) == 0) /* IF CHILD */ > { > execl("/bin/sh", "sh", "-c", arg, 0); > printf("HELP! SOMEONE STOLE THE SHELL!\n"); > exit(1); Use _exit() in this instance instead of exit() because there is the possibility of flushing the stdio buffers twice, once in the child and once in the parent, which wouldn't really be right. You might see some output twice. I would probably use an fputs to stderr here, to avoid buffering. This was probably just an oversight on Frederick's followup. To really confuse people running ps, replace "sh" with a clever saying in double quotes. Jonathan Hue DuPont Design Technologies/Via Visuals leadsv!eps2!jon
gwyn@brl-smoke.UUCP (04/16/87)
In article <80@eps2.UUCP> jon@eps2.UUCP (Jonathan Hue) writes: -In article <1296@decuac.DEC.COM>, avolio@gildor.dec.com (Frederick M. Avolio) writes: -> if ((pid = fork()) == 0) /* IF CHILD */ -> { -> execl("/bin/sh", "sh", "-c", arg, 0); -> printf("HELP! SOMEONE STOLE THE SHELL!\n"); -> exit(1); - -Use _exit() in this instance instead of exit() because there is the -possibility of flushing the stdio buffers twice, once in the child and -once in the parent, which wouldn't really be right. You might see some -output twice. I would probably use an fputs to stderr here, to avoid -buffering. This was probably just an oversight on Frederick's followup. -To really confuse people running ps, replace "sh" with a clever saying in -double quotes. The above advice, which is good as far as it goes, is insufficient. Using _exit() will mean that the printf may not be seen (although with default line-buffering to the terminal it probably would be), since stdio buffer flushing would be skipped. Before the fork(), there should be an fflush(stdout) to clear out any buffered parent data. Error messages should probably be written to stderr rather than stdout (stdout is often directed down a pipe). Finally, it is much better style to return an error indication to a higher level of program control and let the higher level determine strategy (such as whether to print an error message). The fact that it is hard to get this stuff exactly right is why one should use the library routines such as system() instead, whenever possible. (If the library routine is broken, get it fixed!)
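The points made in this reply (flush before forking, leave the child with _exit, keep diagnostics off stdout, and report failure to the caller instead of printing) can be combined in a sketch like the one below. The function name shell_command is hypothetical, and the decision to hand back the raw wait status is an assumption made for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: run cmd through /bin/sh and return the raw wait
 * status (examine it with WIFEXITED and WEXITSTATUS), or -1 on failure.
 * The parent prints nothing; the caller decides what to report.
 */
int shell_command(const char *cmd)
{
    static const char msg[] = "shell_command: cannot exec /bin/sh\n";
    pid_t pid;
    int status;
    ssize_t n;

    /* Flush every output stream so the child inherits no pending data. */
    fflush(NULL);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", cmd, (char *)0);
        /* Unbuffered write to descriptor 2; no stdio in the child. */
        n = write(2, msg, sizeof msg - 1);
        (void)n;
        _exit(127);                     /* skips stdio cleanup */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}
```

As the reply says, the library system() remains the better choice whenever it is adequate; the sketch only shows where each precaution fits.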
baccala@USNA.MIL (Brent W Baccala) (04/17/87)
"Frederick M. Avolio" <avolio@gildor.dec.COM> writes:
>...(If on a 4.*BSD or Ultrix system use
>cfork instead... Why? I don't know, the same reason you type 'sync'
>three times before halting :-).)

I've never heard of cfork, and I can't find a manual page for it on
our 4.3 BRL system. I don't know much (anything) about Ultrix, but do
you mean vfork?

vfork does a "virtual" fork - most of the parent's memory space is not
copied. Instead, the parent is suspended while the child uses some of
its memory. If you're going to do an exec of some flavor, you don't
need to change the parent's memory anyway, so this is very
memory-efficient.

There are (of course) strings attached to what you can and can't do in
a vfork - read the man page. In particular, you can't return from the
function that called vfork because that would screw up the stack
frame. You also can't use exit on an error (use _exit) because exit
will close stdio structures in the parent.

It's even wrong (as some people have suggested) to use exit from a
fork, because even though you have a separate set of file descriptors,
data buffered before the fork will get flushed twice.

P.S. "_exit" is a fast exit - it terminates the process without doing
any of the housekeeping that "exit" does (by calling "_cleanup").

- BRENT W. BACCALA -
Computer Aided Design/Interactive Graphics
U.S. Naval Academy
Annapolis, MD
<decvax!brl-smoke!usna!baccala>
<seismo!usna!baccala>
<baccala@usna.arpa>
robertd@ncoast.UUCP (Rob DeMarco) (04/18/87)
In article <3164@jade.BERKELEY.EDU> marcp@beryl.berkeley.edu (Marc M. Pack) writes:
>Hello! I've a couple of questions in UNIX 4.2.
>
>1. How do programs like "more" distinguish between text files and
>   executable files? Hopefully, there's something surer than
>   just taking a sample of a file and testing it. (This question
>   came up when a bunch of people started accidentally sending
>   executables to a line printer, and I was trying to figure
>   out a way to filter out the execs from the texts).

        I would believe that a pretty sure
method would be to test the file
permisions, if an "x" accours in the
file permisions, then it is executable,
other wise its text.

>2. Is it possible to, while in a C program, call another program and
>   put it into the background? Actually, I know it's possible,
>   'cause I can do it with a line like:
>
>        system("cat textfile &");
>
>   This won't work, however, if I try to "more" the file instead.
>   What determines what can be put in the background and what can't?

        My guess is that more accepts input,
since you have to press <SPACE> to go
on. Since it is in background, it
doesn't wait for it to complete before
going on, therefor , getting input is
imposible, because it doesn't check for
input.

>Thanks for your time,

        Your welcome! :-)

>        Marc M. Pack

--
[=====================================]
[             Rob DeMarco             ]
[ UUCP:decvax!cwruecmp!ncoast!robertd ]
[                                     ]
[  "bus error - passengers dumped"    ]
[=====================================]
dsg@mitre-bedford.arpa (Dave Goldberg) (04/21/87)
> I would believe that a pretty sure >method would be to test the file >permisions, if an "x" accours in the >file permisions, then it is executable, >other wise its text. If only it were that easy. However, I can make any text file have permission of 755 (rwxrwxrwx) and still be a pure ascii file. This is even useful in the case of shell scripts. Dave Goldberg dsg@mitre-bedford.arpa Disclaimer: for this you want a disclaimer!?!
dsg@mitre-bedford.arpa (Dave Goldberg) (04/21/87)
>of 755 (rwxrwxrwx) and still be a pure ascii file. This is even useful in the
Before anyone gets a chance to flame me, 755 was a typo, I meant 777.
Dave Goldberg
dsg@mitre-bedford.arpa
Disclaimer: for this you want a disclaimer!?!
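The permission-bit test under discussion is easy to write, which also makes it easy to see why it misfires. The sketch below (has_exec_bit is a made-up name) reports whether any execute bit is set. A shell script with mode 777 passes this test although it is plain text, and an unreadable binary without execute bits fails it.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if any execute bit (user, group or
 * other) is set on path, 0 if none is set, -1 if stat() fails.
 * Note that this says nothing about what the file contains.
 */
int has_exec_bit(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return -1;
    return (st.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH)) != 0;
}
```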
chuckles@aoa.UUCP (Charles Stern) (04/22/87)
In article <2382@ncoast.UUCP> robertd@ncoast.UUCP (Rob DeMarco) writes: >>Hello! I've a couple of questions in UNIX 4.2. >> >>1. How do programs like "more" distinguish between text files and >> executable files? Hopefully, there's something surer than >> just taking a sample of a file and testing it. (This question >> came up when a bunch of people started accidentally sending >> executables to a line printer, and I was trying to figure >> out a way to filter out the execs from the texts). > > I would believe that a pretty sure >method would be to test the file >permisions, if an "x" accours in the >file permisions, then it is executable, >other wise its text. This is not truly accurate. Consider the case of an executable shell script... more SURELY works on that! (thank G-d ;-)) >>Thanks for your time, > Your welcome! :-) >> Marc M. Pack > > >-- >[=====================================] >[ Rob DeMarco ] >[ UUCP:decvax!cwruecmp!ncoast!robertd ] >[ ] >[ "bus error - passengers dumped" ] >[=====================================] -- Charles Stern ...!{decvax,linus,ima,ihnp4}!bbncca!aoa!chuckles ...!{wjh12,mit-vax}!biomed!aoa!chuckles "What's black and dangerous and sits in a tree?" "A crow with a machine gun." -- "Star Smashers of the Galaxy Rangers" Harry Harrison
goudreau@dg_rtp.UUCP (Bob Goudreau) (04/23/87)
In article <2382@ncoast.UUCP> robertd@ncoast.UUCP (Rob DeMarco) writes:
>In article <3164@jade.BERKELEY.EDU> marcp@beryl.berkeley.edu (Marc M. Pack) writes:
>>Hello! I've a couple of questions in UNIX 4.2.
>>
>>1. How do programs like "more" distinguish between text files and
>>   executable files? Hopefully, there's something surer than
>>   just taking a sample of a file and testing it. (This question
>>   came up when a bunch of people started accidentally sending
>>   executables to a line printer, and I was trying to figure
>>   out a way to filter out the execs from the texts).
>
>        I would believe that a pretty sure
>method would be to test the file
>permisions, if an "x" accours in the
>file permisions, then it is executable,
>other wise its text.
>

This isn't such a good idea, for three reasons:

1) Shell scripts are executable files, but they are also printable
   ASCII files. It might make some of your users a little mad to find
   some print jobs refused.

2) On the other side of the coin, there exist files which are not
   executable and which are also unprintable; a common example is any
   ".o" file.

3) Finally, pr can accept redirected input. How is it supposed to do
   a stat() on stdin?

A better filter would be a program that looks for indications that the
file is an object or a.out (program) file. This is in fact what more
does; it checks for a "magic number" at the beginning of a file
indicating that the file is a program or object file.

>>2. Is it possible to, while in a C program, call another program and
>>   put it into the background? Actually, I know it's possible,
>>   'cause I can do it with a line like:
>>
>>        system("cat textfile &");
>>
>>   This won't work, however, if I try to "more" the file instead.
>>   What determines what can be put in the background and what can't?
>
>        My guess is that more accepts input,
>since you have to press <SPACE> to go
>on. Since it is in background, it
>doesn't wait for it to complete before
>going on, therefor , getting input is
>imposible, because it doesn't check for
>input.

This is along the right lines, but not correct.
Consider the following program:

        main()
        {
                system ("more /etc/termcap &");
                for (;;) ;
        }

This will work, (try it) because only one process (the more) is trying
to read its stdin. (The system() does a fork() and the child process
inherits identical copies of its parent's file descriptors, including
stdin).

The place where you will run into trouble is when your main program is
also trying to read stdin at the same time; the two processes will
fight over the input. The same is true of stdout and stderr -- you
will get jumbled-together output.

The moral is, Be careful of what child processes do with file
descriptors inherited from the parent. If the child is going to open
its own files, go right ahead and use system(foo &) to put it in the
background. You may want to become familiar with the fork() system
call, since it allows you to bypass some of the overhead of the
system() lib function.

--
Bob Goudreau
Data General Corp.
62 Alexander Drive
Research Triangle Park, NC 27709
(919) 248-6231
...!mcnc!rti-sel!dg_rtp!goudreau
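The inheritance of descriptors across fork() that this reply relies on can be shown with a pipe opened before the fork. The sketch is illustrative only; the function name child_writes_to_inherited_fd is invented, and error handling is kept to a minimum.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical demonstration: open a pipe, fork, let the child write
 * msg into the descriptor it inherited, and read it back in the
 * parent. Returns the number of bytes read, or -1 on error.
 */
ssize_t child_writes_to_inherited_fd(const char *msg, char *buf, size_t buflen)
{
    int fds[2];
    pid_t pid;
    ssize_t n;

    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: fds[1] names the same open pipe as in the parent. */
        ssize_t w = write(fds[1], msg, strlen(msg));
        _exit(w < 0 ? 1 : 0);
    }

    close(fds[1]);                      /* parent keeps the read end only */
    n = read(fds[0], buf, buflen);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

The same sharing applies to descriptors 0, 1 and 2, which is why two processes reading one terminal end up competing for input.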
jfh@killer.UUCP (John Haugh) (04/29/87)
I love people who don't know what they are talking about, which is why
I always say - 'Read the documentation whether you know it or not'.

Disinformation correction in progress ... MUNCH ... MUNCH ... MUNCH

In article <1752@dg_rtp.UUCP>, goudreau@dg_rtp.UUCP (Bob Goudreau) writes:
> In article <2382@ncoast.UUCP> robertd@ncoast.UUCP (Rob DeMarco) writes:
> >In article <3164@jade.BERKELEY.EDU> marcp@beryl.berkeley.edu (Marc M. Pack) writes:
> >>Hello! I've a couple of questions in UNIX 4.2.
> >>
> >>1. How do programs like "more" distinguish between text files and
> >>   executable files? Hopefully, there's something surer than
> >>   just taking a sample of a file and testing it.

[ More stuff ]

> >
> >        I would believe that a pretty sure
> >method would be to test the file
> >permisions, if an "x" accours in the
> >file permisions, then it is executable,
> >other wise its text.
> >
>
> This isn't such a good idea, for three reasons:

[ Dumb comment about shell scripts being executable and printable ... ]
[ Dumb comment about .o's not being executable. Of course, the question
  sent this guy in that direction. ]
[ Misinformation Alert ]

> 3) Finally, pr can accept redirected input. How is it supposed to do
>    a stat() on stdin?

Try the originally suggested idea. It does work, and see what file(1)
says:

  " ... If an argument appears to be ASCII, _file_ examines the first
  512 bytes and tries to guess its language. ... "
  - Quoted from "Plexus Sys5 UNIX User's Reference Manual".

Read 512 bytes (or 1024 if you want to be surer) and check to see if
all of the characters are printable. How about using something like
ctype(3). The two macros isspace() and isprint() should do the trick.
Then, to make things real robust (remember that word from Comp-Sci
101 :-) print any character that fails both isspace() and isprint()
with some special convention.

Now for a little disinformation correction. (I knew the manual would
get a big workout today.) From stat(2), I read

  "Similarly, _fstat_ obtains information about an open file known by
  the file descriptor _filedes_, ..."
  - Quoted from "Plexus Sys5 UNIX Programmer's Reference Manual".

Any file descriptor can be fstat'd, including 0, 1, and 2 which were
opened long, long ago. If you wanted to, you could even find out the
name of the file that was connected to the descriptor. (It is _not_
easy :-( )

>
> A better filter would be a program that looks for indications that the
> file is an object or a.out (program) file. This is in fact what more
> does; it checks for a "magic number" at the beginning of a file
> indicating that the file is a program or object file.
>

No, this is a stupid idea. The problem with printouts screwing up
printers is not because they are _object_ files, it is because they
contain characters that screw up the printer. Look for those
characters. What happens if your users decide to print core dumps,
directories, /etc/wtmp and the like? A well thought out approach, or
even the one I suggested (it took me about 12 seconds to come up with
it) will find out if the file can be printed.

>
> >>2. Is it possible to, while in a C program, call another program and
> >>   put it into the background? Actually, I know it's possible,

[ And he tells us why (system ("command &");) ]

> >
> >        My guess is that more accepts input,
> >since you have to press <SPACE> to go
> >on. Since it is in background, it
> >doesn't wait for it to complete before
> >going on, therefor , getting input is
> >imposible, because it doesn't check for
> >input.

[ I can't even follow what this poster wants to say ... ]

>
> This is along the right lines, but not correct.
> Consider the following program:
>
>        main()
>        {
>                system ("more /etc/termcap &");
>                for (;;) ;
>        }
>
> This will work, (try it) because only one process (the more) is trying to read
> its stdin. (The system() does a fork() and the child process inherits identical
> copies of its parent's file descriptors, including stdin).
>

No, once again you are wrong, wrong, wrong. The system may do a
fork(2), but the child does an exec(2) of the sh(1), which says

  "If a command is followed by & the default standard input for the
  command is the empty file _/dev/null_. ..."
  - Quoted from "Plexus Sys5 UNIX User's Reference Manual".

The shell closes file descriptor 0 and open(2)'s /dev/null, with the
consequence that the file descriptor that is returned is 0. So the
command doesn't get the same file descriptor. And besides, _stdin_ is
NOT a file descriptor. Try using it in a place where one is needed.

> The place where you will run into trouble is when your main program is
> also trying to read stdin at the same time; the two processes will
> fight over the input. The same is true of stdout and stderr -- you will get
> jumbled-together output.
>

Different problem with standard error and standard output. If you look
at what the book's got to say, it tells you that with two processes
reading from a terminal, the system flips a coin to decide which
process gets it. It actually lets the two processes beat each other in
the head for it. (I have seen the code for coinflip() and clobber() in
the kernel with my own two eyes :-) :-) :-).

> The moral is, Be careful of what child processes do with file descriptors
> inherited from the parent. If the child is going to open its own files,
> go right ahead and use system(foo &) to put it in the background.
> You may want to become familiar with the fork() system call, since it
> allows you to bypass some of the overhead of the system() lib function.
>

You could always close them and not worry about it if that is such a
big deal. More(1) will still grab you because it doesn't use stdin to
get the commands from the keyboard!!! Try 'cat /etc/passwd | more'.
This works pretty much the same as 'more /etc/passwd'. If it didn't,
the man(1) command wouldn't work for those of us that have, or have
added, a more(1) in the output pipeline. Since the 'cat /etc/passwd |'
is the standard input, it can't be reading commands from there. I
can't find strings(1) on this machine so I can't tell you that more(1)
is opening /dev/tty, BUT - the last more(1) clone I wrote did just
that, unless the output was a file or pipe or something other than a
tty (remember isatty(3)?).

You of course, might want to become familiar with the manuals. Of
course, you can always get had and say stupid things about system
calls they just added in the newest release. But that is a different
brand of stupidity.

The moral of the story is - don't just say 'This is a nice article, I
think I'll reply' unless you want to contribute some real information.
Also, not everyone has the time to research or the knowledge to
contribute a worthwhile reply, so don't feel bad if you can't. (But
you still ought to read the manuals anyway.)

- John. (jfh@killer.UUCP)

Disclaimer: No disclaimer. Whatcha gonna do, sue me?
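The sampling approach suggested in this post (look at the first 512 bytes and accept them only if every byte satisfies isprint() or isspace()) might be sketched as follows. The function name sample_is_printable and the SAMPLE_SIZE constant are assumptions made for the example, and the result depends on the current locale; the default "C" locale is assumed here.

```c
#include <ctype.h>
#include <stddef.h>

#define SAMPLE_SIZE 512         /* sample length mentioned in the post */

/*
 * Hypothetical helper: returns 1 if each of the first SAMPLE_SIZE bytes
 * of buf (or all len bytes, if fewer) is printable or white space
 * according to <ctype.h>, and 0 as soon as one is not.
 */
int sample_is_printable(const unsigned char *buf, size_t len)
{
    size_t i;

    if (len > SAMPLE_SIZE)
        len = SAMPLE_SIZE;
    for (i = 0; i < len; i++)
        if (!isprint(buf[i]) && !isspace(buf[i]))
            return 0;
    return 1;
}
```

A filter reading from standard input can fill the buffer with read(0, buf, SAMPLE_SIZE), so no file name is needed, which matches the point made above about descriptors 0, 1 and 2.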
gwyn@brl-smoke.ARPA (Doug Gwyn ) (05/06/87)
In article <816@killer.UUCP> jfh@killer.UUCP (John Haugh) writes: >If you wanted to, you could even find out the name >of the file that was connected to the descriptor. (It is _not_ easy :-( ) I would think it's actually impossible. The kernel doesn't remember the name of the path you used to open an inode, and some descriptors (e.g. pipes) have no associated names.
cdash@boulder.UUCP (05/07/87)
In article <5835@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes: >In article <816@killer.UUCP> jfh@killer.UUCP (John Haugh) writes: >>If you wanted to, you could even find out the name >>of the file that was connected to the descriptor. (It is _not_ easy :-( ) > >I would think it's actually impossible. The kernel doesn't remember >the name of the path you used to open an inode, and some descriptors >(e.g. pipes) have no associated names. actually, it is possible. you know the inode associated with the descriptor start at "/" and just keep looking for that i# keeping track of where you are. Like the man said, it ain't easy, but it CAN be done. -- cdash aka cdash@boulder.colorado.edu aka ...hao!boulder!cdash aka ...nbires!boulder!cdash
gwyn@brl-smoke.ARPA (Doug Gwyn ) (05/07/87)
In article <634@boulder.Colorado.EDU> cdash@boulder.Colorado.EDU (Charles Shub) writes: >actually, it is possible. you know the inode associated with the descriptor >start at "/" and just keep looking for that i# keeping track of where you are. This can find A name for the inode (assuming that there IS one and that you avoid the many pitfalls that are possible), but not THE name that was used to open the file. (Even in the absence of more than one link, a variety of names could have been used.) You wouldn't want to wait for this anyway on some of the large filesystems we have around here.
chris@mimsy.UUCP (05/07/87)
>>In article <816@killer.UUCP> jfh@killer.UUCP (John Haugh) writes: >>>If you wanted to, you could even find out the name >>>of the file that was connected to the descriptor. >In article <5835@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn) writes: >>I would think it's actually impossible. In article <634@boulder.Colorado.EDU> cdash@boulder.Colorado.EDU (Charles Shub) writes: >actually, it is possible. you know the inode associated with the descriptor >start at "/" and just keep looking for that i# keeping track of where you are. >Like the man said, it ain't easy, but it CAN be done. `It' can be done: What can be done? The problem is ill-defined in the first place. `Find *the* name of the file.' Who says there is only one? % ln file ../other/file `Find *a* name of the file.' That can be done iff the file has at least one name. `Find the internal name of the file.' Easy: this is just the <dev, ino> pair. One problem is that few people wish to deal with this form of the name. Incidentally, `find / -inum <n>' takes a *long* time on a big system. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690) Domain: chris@mimsy.umd.edu Path: seismo!mimsy!chris
guy@gorodish.UUCP (05/07/87)
> actually, it is possible. you know the inode associated with the descriptor > start at "/" and just keep looking for that i# keeping track of where you > are. No, Doug is correct. In the general case, it is *not* possible - there may not *be* a directory entry that refers to the inode in question. The inode may be an unnamed pipe, or may be a file that was unlinked. Even if there is a directory entry that refers to the inode in question, it is not possible if you lack read permission on any of the directories leading up to that file. Furthermore, given the procedure you suggest, it may be technically possible in many cases but it is not practical in many, probably most, of them. Doing a top-down search for a given inode, starting at "/", can take a *very* long time unless you're very near the root.
shz@desoto.UUCP (05/07/87)
In article <634@boulder.Colorado.EDU> cdash@boulder.Colorado.EDU (Charles Shub) writes: >In article <5835@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes: >>In article <816@killer.UUCP> jfh@killer.UUCP (John Haugh) writes: >>>If you wanted to, you could even find out the name >>>of the file that was connected to the descriptor. (It is _not_ easy :-( ) >> >>I would think it's actually impossible. The kernel doesn't remember >>the name of the path you used to open an inode, and some descriptors >>(e.g. pipes) have no associated names. > >actually, it is possible. you know the inode associated with the descriptor >start at "/" and just keep looking for that i# keeping track of where you are. >Like the man said, it ain't easy, but it CAN be done. >-- Actually, it is even more difficult. I-numbers are only unique within a filesystem. You would first have to determine which filesystem your FD pointed to, and then start the search from the root of that filesystem. Even then it may not be possible to find the name of the file that was opened. Assuming the FD did reference a file (as opposed to an unnamed pipe or other 'magic' device), the last remaining name could have been unlinked. Which brings us to the last point: there may be multiple links to the file. The best you could hope for is to find the first name or all names, not necessarily the correct name. Check out the [usually undocumented] '-inum' option of FIND(1). Seth desoto!shz
schwartz@swatsun (Scott Schwartz) (05/07/87)
In article <6582@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes: > Incidentally, `find / -inum <n>' takes a *long* time on a big system. But if you had to do it, wouldn't you use ncheck(8)? (Assuming you have the required permissions) -- # Scott Schwartz # UUCP: ...{{seismo,ihnp4}!bpa, cbmvax!vu-vlsi, sun!liberty}!swatsun!schwartz # AT&T: (215)-328-8610 /* lab phone */
rbj@icst-cmr.arpa (Root Boy Jim) (05/07/87)
In article <634@boulder.Colorado.EDU> cdash@boulder.Colorado.EDU (Charles Shub) writes: >actually, it is possible. you know the inode associated with the descriptor >start at "/" and keep looking for that i# keeping track of where you are. Doug & Chris have already pointed out the possibility that a file may have more than one name. I wish to point out that it may have none at all. In addition to the obvious case of a pipe (or FIFO, socket, or any other abstraction that uses an inode abstraction internally), consider typing: rm -f foo; yes > foo & rm foo Don't try this at home, kids! (Root Boy) Jim "Just Say Yes" Cottrell <rbj@icst-cmr.arpa>
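The nameless-file case in this reply can be observed from inside a program: after the last link is removed, the descriptor still works, but fstat(2) normally reports a link count of zero. A small sketch, with fd_link_count as an invented name; the exact count reported may differ on some network filesystems.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: return the number of directory entries (hard
 * links) that still refer to the file behind fd, or -1 if fstat()
 * fails. A result of 0 means the open file has no name left.
 */
long fd_link_count(int fd)
{
    struct stat st;

    if (fstat(fd, &st) < 0)
        return -1;
    return (long)st.st_nlink;
}
```

A program that opens a file, unlinks it, and then calls fd_link_count() on the descriptor can still read and write the data, yet no search of the directory tree will turn up a name for it.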
jgy@hropus.UUCP (05/08/87)
>>>In article <816@killer.UUCP> jfh@killer.UUCP (John Haugh) writes:
>>>>If you wanted to, you could even find out the name
>>>>of the file that was connected to the descriptor.
>
>>In article <5835@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn) writes:
>>>I would think it's actually impossible.
>
>In article <634@boulder.Colorado.EDU> cdash@boulder.Colorado.EDU
>(Charles Shub) writes:
>>actually, it is possible. you know the inode associated with the descriptor
>>start at "/" and just keep looking for that i# keeping track of where you are.
>>Like the man said, it ain't easy, but it CAN be done.
>
>`It' can be done: What can be done? The problem is ill-defined
>in the first place. `Find *the* name of the file.' Who says there
>is only one?
>
>    % ln file ../other/file
>
>`Find *a* name of the file.' That can be done iff the file has at
>least one name.
>
>`Find the internal name of the file.' Easy: this is just the <dev,
>ino> pair. One problem is that few people wish to deal with this
>form of the name.
>
>Incidentally, `find / -inum <n>' takes a *long* time on a big system.
>--

Try finding the name if it was unlinked (rm'd) after opening!
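The <dev, ino> pair in the quoted text is what fstat(2) and stat(2) report as `st_dev` and `st_ino`. A small sketch of using it to test whether a candidate path is *a* name of an open file follows; the function names are made up for the example. Note that stat(2) follows symbolic links, so on systems that have them a link pointing at the file also counts as a match here.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns 1 if candidate names the file open on fd (same device and
 * same i-number), 0 if it names a different file, -1 on error.
 */
int is_name_of_fd(int fd, const char *candidate)
{
    struct stat open_st, name_st;

    if (fstat(fd, &open_st) < 0 || stat(candidate, &name_st) < 0)
        return -1;
    return open_st.st_dev == name_st.st_dev &&
           open_st.st_ino == name_st.st_ino;
}

/* Convenience wrapper: open one path, then test another against it. */
int is_name_of_path(const char *opened, const char *candidate)
{
    int fd = open(opened, O_RDONLY);
    int same;

    if (fd < 0)
        return -1;
    same = is_name_of_fd(fd, candidate);
    close(fd);
    return same;
}
```

Both fields have to match: the i-number alone means nothing without the device it belongs to.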
peter@citcom.UUCP (Peter Klosky) (05/08/87)
In article <6582@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> Incidentally, `find / -inum <n>' takes a *long* time on a big system.

It's true that scanning the whole file system to find a given inum would
take a long time. This approach is like scanning a whole document for a
given word by examining each word. A better approach is to have a sorted
list of words with pointers to occurrences. Then the words can be scanned
using binary search. The same approach can be used with inode numbers by
preparing a sorted list of inode numbers and file names.

Given a list of (file system id, inode number, file name) records, it is
possible to locate possible names for a file open by a process. In many
cases, this will let the enhanced "ofiles" recently posted to net.sources
reveal the names of the files open by a given process. It will have trouble
in the case where the list is out of date, as the system does not update
the inum list. For this reason the program can be fine-tuned to scan
directories where changes occur often, such as /tmp or other directories
often used by the application.

If the file table of the process has a tcp/ip deal going, "ofiles" knows
about that, too, and will report if the process is waiting to receive
datagrams concerning "rwho" or whatever. "ofiles" will also cat an
unreferenced file, so even with

    yes >foo& rm foo

it is possible to see all the exciting data.

n.b. This program is a security hole, so only use it on systems where the
users are trusted.
--
Peter Klosky, Citcom Systems (materiel de telecommunications)
seismo!vrdxhq!baskin!citcom!peter (703) 689-2800 x 235
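The sorted-list idea above can be sketched with qsort(3) and bsearch(3). This is not the "ofiles" code; the record layout `struct name_rec` and the function names are assumptions made for the example, and building the list in the first place (one full scan of the filesystem) is left out.

```c
#include <stdlib.h>
#include <sys/types.h>

/* One record of the prepared list: filesystem id, i-number, a name. */
struct name_rec {
    dev_t dev;
    ino_t ino;
    const char *name;
};

/* Order records by device first, then by i-number. */
static int cmp_rec(const void *a, const void *b)
{
    const struct name_rec *x = a, *y = b;

    if (x->dev != y->dev)
        return x->dev < y->dev ? -1 : 1;
    if (x->ino != y->ino)
        return x->ino < y->ino ? -1 : 1;
    return 0;
}

/* Sort the list once, after it has been collected. */
void sort_index(struct name_rec *recs, size_t n)
{
    qsort(recs, n, sizeof *recs, cmp_rec);
}

/*
 * Binary search for <dev, ino>. Returns one recorded name, or NULL if
 * the pair is not in the list. When a file has several links, which
 * of its names comes back is unspecified.
 */
const char *lookup_name(const struct name_rec *recs, size_t n,
                        dev_t dev, ino_t ino)
{
    struct name_rec key = { dev, ino, NULL };
    const struct name_rec *hit =
        bsearch(&key, recs, n, sizeof *recs, cmp_rec);

    return hit ? hit->name : NULL;
}
```

Each lookup costs O(log n) comparisons instead of a walk over every directory, at the price of the staleness problem described above: a NULL result may only mean the list is out of date.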
chris@mimsy.UUCP (Chris Torek) (05/09/87)
>In article <6582@mimsy.UUCP> I wrote:
>>`find / -inum <n>' takes a *long* time on a big system.

In article <1116@pompeii.UUCP> schwartz@swatsun (Scott Schwartz) writes:
>But if you had to do it, wouldn't you use ncheck(8)? (Assuming you have
>the required permissions)

A dangerous assumption:

    % df /usr
    Filesystem    kbytes    used   avail capacity  Mounted on
    /dev/hp4a     182123  138994   24916    85%    /usr
    % ls -lg /dev/*hp4a
    brw-r-----  1 root     operator   0,  32 Oct 17  1986 /dev/hp4a
    crw-r-----  1 root     operator   4,  32 Apr 27 08:52 /dev/rhp4a
    % groups
    staff wheel daemon sys kmem operator uucp internet emacs zmob tex bridge mcmob info speech um-software
    %

Anyone can run `find /usr', but only root and group operator can read
the drive directly. Had *I* to do it, I would indeed use ncheck; but
the average program seeking to tie a name to, e.g., stdin could not
assume that ncheck would succeed.
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain: chris@mimsy.umd.edu Path: seismo!mimsy!chris
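A program that wants to try the ncheck route could first ask whether it is allowed to read the raw device at all, and fall back to walking the directory tree if not. A minimal sketch using access(2); the function name is made up for the example, and the device path is whatever the caller supplies (for the listing above it would be /dev/rhp4a).

```c
#include <unistd.h>

/*
 * Returns 1 if the calling user may read the given device node, else 0.
 * access(2) checks against the real uid and gid, not the effective
 * ones, so a set-uid program gets the answer for its invoker.
 */
int can_read_raw_device(const char *raw_device)
{
    return access(raw_device, R_OK) == 0;
}
```

A 0 result covers both "no permission" and "no such device"; either way the caller has to use the slower directory search.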
brandon@tdi2.UUCP (05/12/87)
In article <5835@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
+---------------
| In article <816@killer.UUCP> jfh@killer.UUCP (John Haugh) writes:
| >If you wanted to, you could even find out the name
| >of the file that was connected to the descriptor. (It is _not_ easy :-( )
|
| I would think it's actually impossible. The kernel doesn't remember
| the name of the path you used to open an inode, and some descriptors
| (e.g. pipes) have no associated names.
+---------------

Well, it's possible for non-pipes; unfortunately, you have to essentially
recode ncheck(1m) to do it.

++Brando
--
Brandon S. Allbery          UUCP: cbatt!cwruecmp!ncoast!tdi2!brandon
Tridelta Industries, Inc.   CSNET: ncoast!allbery@Case
7350 Corporate Blvd.        INTERNET: ncoast!allbery%Case.CSNET@relay.CS.NET
Mentor, Ohio 44060          PHONE: +1 216 255 1080 (home +1 216 974 9210)
jfh@killer.UUCP (John Haugh) (05/13/87)
In article <314@desoto.UUCP>, shz@desoto.UUCP (S. Zirin) writes:
> In article <634@boulder.Colorado.EDU> cdash@boulder.Colorado.EDU (Charles Shub) writes:
> >In article <5835@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
> >>In article <816@killer.UUCP> jfh@killer.UUCP (John Haugh) writes:
> >>>If you wanted to, you could even find out the name
> >>>of the file that was connected to the descriptor. (It is _not_ easy :-( )
> >>
> >>I would think it's actually impossible. The kernel doesn't remember
> >>the name of the path you used to open an inode, and some descriptors
> >>(e.g. pipes) have no associated names.
> >
> >actually, it is possible. you know the inode associated with the descriptor
> >start at "/" and just keep looking for that i# keeping track of where you are.
> >Like the man said, it ain't easy, but it CAN be done.
> >--
>

I wanted to remove Doug's remarks at this point and would like to have
included Guy Harris's as he had a much better disagreement with me than
Doug did. This next guy does not understand the meaning of the phrase

    It is _not_ easy :-(    <-- frown face ...

> Actually, it is even more difficult.

Like I said,

    X X
     |
    ___
    / \

> ... I-numbers are only unique within
> a filesystem. You would first have to determine which filesystem your FD
> pointed to, and then start the search from the root of that filesystem.

A file is described by a device/i-number pair to be more exact. Pipes have
the pipe device for their device number or so I recall, unless they are
named pipes -

> ... Even
> then it may not be possible to find the name of the file that was opened.

Did I say I want THE name or just a name? I forget, and I don't want to
scroll up in the editor - So I say now what I meant. A NAME. Multiple links
to a file are equivalent in USG Unix - only BSD Unix has 'symbolic links'
(last time I checked). These might be different, in which case, all bets
are off.

> Assuming the FD did reference a file (as opposed to an unnamed pipe or other
> 'magic' device), the last remaining name could have been unlinked.

Then I guess the file doesn't have a name. In which case no one could find
out the name. Sorry. :-(

> ... Which
> brings us to the last point: there may be multiple links to the file. The
> best you could hope for is to find the first name or all names, not
> necessarily the correct name.
>

I think I handled this one in the paragraph a ways back. If all paths lead
to the same inode, why should I care which one I want. And if I do, I can
go look for the rest of the names. Once again, it ain't easy.

> Check out the [usually undocumented] '-inum' option of FIND(1).

You work for AT&T right? Well what is the deal with the '-depth' option of
find(1)?

>
> Seth
> desoto!shz

- John.

Disclaimer -
No disclaimer. Whatcha gonna do, sue me?
jfh@killer.UUCP (John Haugh) (05/15/87)
Everyone writes:
yes - no - yes - no - yes - no.
Everyone reads:
confusion - confusion - confusion.
Some files have no name[s]. If the file descriptor is for one of those,
you ain't gonna get a name. Files in this category that come to mind are
unnamed pipes, and files that have been removed. You will never find a
name for these guys.
Some files have names that you can't access. If you aren't root you can't
look everywheres for the directory names, unless your system permits it.
You may never find a name for these guys unless you are root.
Some files have more than one name. You really only need one unless you
want to do something special, (like unlink it) since all links are equal
(except in BSD land). You CAN find at least one name for these files.
Did I cover everything? Let's put this one to rest. Respond by E-Mail
rather than posting. Also, somewheres I have a PD copy of ftw(3) - I
think someone posted one also a while ago.
- John. (jfh@killer.UUCP)
Disclaimer -
If my boss knew what USENET was, he'd want one of his own.
Favorite Borrowed Saying -
"It's never too late for a happy childhood."