robertd@ncoast.UUCP (05/20/87)
Can some one tell me what a "File Descriptor" is? Thank you.

[> Rd
--
[=====================================]
[ Rob DeMarco                         ]
[ UUCP:decvax!cwruecmp!ncoast!robertd ]
[                                     ]
[ "I hate 'Wheel of fortune'....and   ]
[ proud of it!!"                      ]
[=====================================]
gwyn@brl-smoke.UUCP (05/21/87)
In article <2532@ncoast.UUCP> robertd@ncoast.UUCP (Rob DeMarco) writes:
> Can some one tell me what a "File Descriptor" is? Thank you.

It's just an index into an open-file table maintained inside the
kernel; the open-file table is used to keep track of the state of
the open file (such as, where is the actual data, and how far into
the data is the file position pointer associated with this F.D.).
Think of it as a "handle" on the file that the kernel gives you when
you open it.

For further information read Ken Thompson's "UNIX Implementation"
(used to be in Vol. 2 of the UNIX manual set).
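To make the "handle" idea concrete, here is a minimal sketch, assuming a POSIX system and Python's standard os module. The scratch file and the function name descriptor_demo are made up for illustration. It shows only that the value returned by open is a small integer, and that the kernel remembers the file position on the caller's behalf.

```python
import os
import tempfile


def descriptor_demo():
    """Open a scratch file and use the integer handle the kernel returns."""
    # Hypothetical scratch file, created only for this illustration.
    path = os.path.join(tempfile.mkdtemp(), "demo.txt")
    with open(path, "wb") as f:
        f.write(b"hello, descriptor\n")

    fd = os.open(path, os.O_RDONLY)          # a small non-negative integer
    first = os.read(fd, 5)                   # reading advances the position
    position = os.lseek(fd, 0, os.SEEK_CUR)  # ask the kernel where we are
    rest = os.read(fd, 100)                  # continues from that position
    os.close(fd)
    return fd, first, position, rest


if __name__ == "__main__":
    print(descriptor_demo())
```

The program never stores the position itself; it only passes the integer back to the kernel on each call.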
terryl@tekcrl.TEK.COM (05/22/87)
In article <5875@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <2532@ncoast.UUCP> robertd@ncoast.UUCP (Rob DeMarco) writes:
>> Can some one tell me what a "File Descriptor" is? Thank you.
>
>It's just an index into an open-file table maintained inside the
>kernel; the open-file table is used to keep track of the state of
>the open file (such as, where is the actual data, and how far into
>the data is the file position pointer associated with this F.D.).
>Think of it as a "handle" on the file that the kernel gives you when
>you open it. For further information read Ken Thompson's "UNIX
>Implementation" (used to be in Vol. 2 of the UNIX manual set).

I can't believe this. Doug, you really blew it on this one. Close,
but no cigar.

Since we haven't established an official definition of a file
descriptor, let me propose one: a file descriptor is the value you get
back from an "open" or "creat" system call. (If that's not Doug's
definition, then he could be right.) By that definition, a file
descriptor is just an index into a data structure that is part of the
PER-PROCESS information, not part of the system-wide information that
Doug is alluding to. This per-process data structure does have
pointers into the system-wide data structures (what Doug refers to as
the "open-file table"), and everything else is as Doug described it
(well, there are a couple of other things, but they really don't have
anything to do with the original question).

All of this is true for Berkeley 4.2/4.3 (I just looked at the code
to make sure 4.3 didn't change what file descriptors mean). I have
absolutely no idea if this is the way System V does things.

Terry Laskodi
of
Tektronix
guy%gorodish@Sun.COM (Guy Harris) (05/22/87)
> All of this is true for Berkeley 4.2/4.3 (I just looked at the code just
> to make sure 4.3 didn't change what file descriptors mean). I have absolutely
> no idea if this is the way System V does things.

Well, this happens to be one of the things that has remained pretty
much constant in UNIX implementations since at least V6 (and probably
back further than that). 4.3 didn't change anything of consequence
(which isn't really surprising - there's really little that needs
changing), and neither 4.2BSD, nor System III, nor System V,
introduced any major changes. (The only change 4.2BSD made was to
have objects other than inodes attached to a file table entry; Bill
Shannon noted that, given a system with multiple "file system types",
it might have been possible to use that mechanism instead.)

The distinction between a file descriptor (i.e., the small number you
get back from "open", "dup", etc.) and a file table entry (the entry
in the system-wide table that indicates things like the current seek
pointer) is not significant in most cases, so Doug's description is,
at worst, a slight over-simplification. The only per-file-descriptor
state in the system is the "close on 'exec'" flag. Most operations
treat all file descriptors that refer to the same file table entry as
equivalent. To quote from the S5R3 manual page DUP(2):

    "dup" returns a new file descriptor having the following in
    common with the original:

        Same open file (or pipe).

        Same file pointer (i.e., both file descriptors share one
        file pointer).

        Same access mode (read, write or read/write).

which describes most of the state associated with a file descriptor.

    Guy Harris
    {ihnp4, decvax, seismo, decwrl, ...}!sun!guy
    guy@sun.com
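Both points above (descriptors made by "dup" share one file pointer, while the close-on-exec flag belongs to each descriptor separately) can be checked with a short sketch. This assumes a POSIX system and Python 3.4 or later, where the os module exposes close-on-exec inverted as an "inheritable" flag and creates new descriptors non-inheritable by default. The file name and the function name dup_demo are hypothetical.

```python
import os
import tempfile


def dup_demo():
    """Show what two dup'ed descriptors share and what they do not."""
    # Hypothetical scratch file, created only for this illustration.
    path = os.path.join(tempfile.mkdtemp(), "dup_demo.txt")
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    copy = os.dup(fd)        # second descriptor, same file table entry

    os.write(fd, b"abc")     # moves the single shared file pointer
    offset_seen_by_copy = os.lseek(copy, 0, os.SEEK_CUR)

    # Close-on-exec is per descriptor: changing it on fd leaves copy alone.
    # Python calls the inverse of close-on-exec "inheritable".
    os.set_inheritable(fd, True)
    flags = (os.get_inheritable(fd), os.get_inheritable(copy))

    os.close(fd)
    os.close(copy)
    return offset_seen_by_copy, flags


if __name__ == "__main__":
    print(dup_demo())
```

Writing through one descriptor moves the position reported through the other, but flipping the flag on one leaves the other untouched.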
gwyn@brl-smoke.ARPA (Doug Gwyn ) (05/23/87)
In article <1673@tekcrl.TEK.COM> terryl@tekcrl.tek.com writes:
>Doug, you really blew it on this one.

Ahem, I certainly am aware that the file-descriptor is an index into
a per-process data structure, which I chose to call an "open-file
table" both for simplicity (after all, if the inquirer knew all this
stuff he wouldn't be asking the question) and because that's what Ken
Thompson called it in the cited article. Actually, he explained the
distinction between the "per-user open file table", which is what I
was describing, and the more global "open file table".

The only reason there are two such tables rather than one is to permit
the sharing of the file position pointer across a fork. I don't see
the necessity for this particular characteristic (I can't recall
ever having made use of it), so as far as I'm concerned there might
as well be only one open-file table. File descriptors could then be
unique indices into the system table. (The other use for per-process
indices is that one can then guarantee the use of small values such
as 0, 1, and 2.)

I didn't feel it was worth trying to explain this two-level aspect of
the open file tables when the inquirer would undoubtedly be happy to
get any definite grasp of the concept.
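The two-level arrangement under discussion can be pictured with a toy model. This is a sketch in Python for illustration only, not kernel code: the class names FileTableEntry and Process and their methods are invented here. Each process holds its own small table of slots, and each slot refers to an entry that may be shared with other processes after a fork.

```python
class FileTableEntry:
    """Toy stand-in for one entry of the system-wide open-file table."""

    def __init__(self, name):
        self.name = name     # which file is open
        self.offset = 0      # the file position pointer


class Process:
    """Toy process whose descriptor table refers to file table entries."""

    def __init__(self, fds=None):
        self.fds = list(fds or [])   # index = file descriptor

    def open(self, name):
        """Place a fresh entry in the lowest free slot and return its index."""
        entry = FileTableEntry(name)
        for fd, slot in enumerate(self.fds):
            if slot is None:
                self.fds[fd] = entry
                return fd
        self.fds.append(entry)
        return len(self.fds) - 1

    def write(self, fd, data):
        """Advance the position held in the entry the descriptor refers to."""
        self.fds[fd].offset += len(data)

    def offset(self, fd):
        return self.fds[fd].offset

    def fork(self):
        """The child copies the table of slots; the entries stay shared."""
        return Process(self.fds)


if __name__ == "__main__":
    parent = Process()
    out = parent.open("outputfile")
    child = parent.fork()
    child.write(out, "Hello\n")
    print(out, parent.offset(out))
```

In this model an unrelated process that opens the same file gets its own entry with its own offset, even though its descriptor number may also be 0.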
aegl@root44.UUCP (05/27/87)
In article <5881@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn) writes:
>The only reason there are two such tables rather than one is to permit
>the sharing of the file position pointer across a fork. I don't see
>the necessity for this particular characteristic (I can't recall
>ever having made use of it), so as far as I'm concerned there might
>as well be only one open-file table.

You *must* have used this ... consider what happens when you have a
shell script like this:

    $ cat hello.sh
    /bin/echo "Hello"
    /bin/echo "world"

and you run it with output redirected to a file.

    $ ./hello.sh >outputfile

Your shell opens/creates "outputfile", truncates it, and does tricks
with dup() to make sure it is file descriptor 1. The file pointer for
stdout is now 0. The shell forks and execs the first "echo", which
outputs "Hello", and the file pointer for stdout is set to 6 (5 chars
in "Hello" + newline). Then echo exits, and the shell wakes up and
forks and execs the next echo. If the file pointer hadn't been shared
across the fork/exec then the shell would still have it set at 0, so
the "world" would get written on top of the "Hello". Luckily (for
every shell script that ever ran more than one program that produced
any output) the pointer was shared, so the "world" starts at byte
offset 6 in "outputfile".

    $ cat outputfile
    Hello
    world

Tony Luck - Technical Manager, Root Computers Ltd. <aegl@root.co.uk>
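The same sequence can be reproduced without a shell. The following is a minimal sketch, assuming a POSIX system with fork available. The function name run_two_writers and the scratch directory are made up for illustration, and each child process merely stands in for /bin/echo rather than exec'ing it.

```python
import os
import tempfile


def run_two_writers():
    """Fork two children in turn; each writes through the inherited fd."""
    # Hypothetical scratch location for the redirected output.
    path = os.path.join(tempfile.mkdtemp(), "outputfile")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)

    for text in (b"Hello\n", b"world\n"):
        pid = os.fork()
        if pid == 0:
            # Child: stands in for echo writing to its inherited stdout.
            os.write(fd, text)
            os._exit(0)
        os.waitpid(pid, 0)   # parent sleeps until the child exits

    # The parent never wrote anything, yet its file pointer has moved.
    offset_in_parent = os.lseek(fd, 0, os.SEEK_CUR)
    os.close(fd)
    with open(path, "rb") as f:
        return offset_in_parent, f.read()


if __name__ == "__main__":
    print(run_two_writers())
```

If the children had each opened "outputfile" themselves instead of inheriting the descriptor, each would have started at offset 0, and the second write would have landed on top of the first.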
roy@phri.UUCP (05/27/87)
In <5881@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
> The only reason there are [distinct per-system and per-process open file]
> tables rather than one is to permit the sharing of the file position
> pointer across a fork. I don't see the necessity for this particular
> characteristic (I can't recall ever having made use of it)

The last program I can think of that took advantage of this sharing
was the v6 shell (boy, I seem to be on a nostalgia trip these past few
days; maybe that old-timers BOF isn't such a bad idea). Exit and goto
(and if) were not built in to the shell, but were fork/exec'ed just
like any other command. Since they shared stdin with the shell (and
the shell didn't do buffered reads) they could do seeks on stdin and
alter what line in your shell script file the parent shell would read
next. Goto would rewind stdin and search for the label, leaving the
file pointer right after it; exit would simply seek to EOF on stdin;
when it exited, the shell would see EOF and exit just as if you typed
control-D. I never thought to try this before, but I wonder what
would have happened if you did "(sleep 10; goto foo)&" inside a shell
script. Yuck!

Modern shells have goto and exit, as well as if/then/else, for,
while, and the kitchen sink built in. This makes shell scripts run
faster. It also makes them not fit into a 64k address space.

BTW, I agree with Doug; when answering a question, it is better to
leave out some details if that makes the gist of the answer clearer.
The questioner can always come back for more later.
--
Roy Smith, {allegra,cmcl2,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016
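The trick described for the external "exit" command can be imitated in a few lines. This is a sketch only, assuming a POSIX system; it is not the v6 shell's code. The script contents and the function name child_seeks_to_eof are invented for illustration, and an ordinary descriptor stands in for the shell's stdin.

```python
import os
import tempfile


def child_seeks_to_eof():
    """A child moves the shared file pointer; the parent then reads EOF."""
    # Hypothetical script file that the "shell" (the parent) is reading.
    path = os.path.join(tempfile.mkdtemp(), "script.sh")
    with open(path, "wb") as f:
        f.write(b"echo one\nexit\necho two\n")

    fd = os.open(path, os.O_RDONLY)
    before = os.read(fd, 9)              # parent reads "echo one\n" unbuffered

    pid = os.fork()
    if pid == 0:
        os.lseek(fd, 0, os.SEEK_END)     # child plays the external "exit"
        os._exit(0)
    os.waitpid(pid, 0)

    after = os.read(fd, 100)             # parent now finds itself at EOF
    os.close(fd)
    return before, after


if __name__ == "__main__":
    print(child_seeks_to_eof())
```

This only works because the parent reads without buffering. A parent that had already read the whole script into its own buffer would not notice the child's seek, which is the caveat noted above about the shell not doing buffered reads.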
guy%gorodish@Sun.COM (Guy Harris) (05/29/87)
> Modern shells have goto and exit, as well as if/then/else, for,
> while, and the kitchen sink built in. This makes shell scripts run faster.
> It also makes them not fit into a 64k address space.

If you leave the kitchen sink out, you can fit it into a 64k address
space. I've seen the Bourne shell and the PWB/UNIX 1.0 or Mashey
shell both run on a non-split-I&D-space PDP-11. (The Bourne shell,
BTW, doesn't have "goto" built in; it's a "goto"less language.)

    Guy Harris
    {ihnp4, decvax, seismo, decwrl, ...}!sun!guy
    guy@sun.com