rside@uvicctr.UUCP (Robert Side) (09/10/88)
Here, finally, is the summary on how to get a file name from a file
descriptor.

First, I would like to thank everyone who responded to my problem on
checkpointing processes, as well as on how to get a file name from a
file descriptor.  I tried to respond to everyone who sent me mail, and I
think I was more successful this time, but if a reply did not reach you,
let me now say *thank you* for your reply.

Second, I would like to thank two people, Dave Curry and der Mouse, for
sending me source to their solutions to my problem.  As an aside, I will
*NOT* send their source to anyone, since I do not have permission from
them to do so.  If you feel you need the source, I suggest you mail
these two people directly.

Finally, in short, I believe the problem of checkpointing a process with
open files has been solved, at least to my satisfaction.  The specific
question of an easy way of finding the file name from a file descriptor
is not solved.  There may not even be a solution to this problem, as is
discussed below.

-----------------
From: Amos Shapir <taux01!taux02.taux01.UUCP!amos@nsc>

>Summary: it's impossible.  All a process has is a file descriptor, which
>may be connected to a pipe (and in modern systems, to a socket whose
>other end is in Timbuktu).  Even if it is a regular file, it may have
>been inherited from a great-grandparent, so changing fopen to keep track
>of file names is not sufficient.
>--
>    Amos Shapir                         amos@nsc.com
>National Semiconductor (Israel)
>6 Maskit st. P.O.B. 3007, Herzlia 46104, Israel  Tel. +972 52 522261
>34 48 E / 32 10 N                       (My other cpu is a NS32532)

-----------------
From: uunet!dalsqnt!vector!chip (Chip Rosenthal)

>Not easily.
>
>You could do it by calling fstat() with the filedes, which will give you
>the inode of the file and device it resides on.  Then you have to search
>through that device for all directory entries which reference this inode.
>This is what the SysV ncheck(1) does -- or at least it's what the XENIX
>V ncheck(C) does.
>In both cases you need superuser privileges.  Also,
>this is not real clean -- possible problems: the filedes is a pipe, the
>file contains multiple links, the file has been rm'ed by another process,
>etc.
>---
>Chip Rosenthal     chip@vector.UUCP | I've been a wizard since my childhood.
>Dallas Semiconductor   214-450-0486 | And I've earned some respect for my art.

-----------------
[ I lost the first message received from Dave Curry (shame on me);
  however, I will try to state approximately what he said ]

From: davy@relay.ubc.ca (Dave Curry) (Message 1)

> [I (Dave Curry) have written a set of library routines that will]
> [checkpoint and recover processes.  They were written on a VAX for]
> [BSD 4.2.  I do not remember if they handle sockets, but they do]
> [handle open files and pipes.  If you like I can mail you a copy.]
> [The only request that I make is that if you use my code that you]
> [send me the diffs]
>

[ I (Rob speaking now) sent Dave mail asking him if he could dig up the
  source and send it to me, and his next response (along with a
  transcript of my message) follows ]

From: davy@relay.ubc.ca (Dave Curry) (Message 2)

>    [ This is Rob speaking in the indented stuff ]
>
>    [ Some stuff deleted ]
>
>    I would like to take a look at the code; from what you have said
>    it is pretty close to meeting my specs.  There are a few things
>    I am worried about.  There will be open sockets.  I guess I never
>    said this in the article, but when a rollback occurs it must
>    overwrite the current memory image to keep the same process ID.
>
>    [ Some stuff deleted ]
>
>Keeping the same pid is easy enough, I guess.  The library writes the
>executable to the file "chkpt.dat" (user-settable), so assuming you
>have a process with the correct pid running, all you need to do is
>execl() "chkpt.dat", and you're all set.
>
>I'm still not sure how you'd go about creating sockets.
>It's easy enough
>to "reopen" them I guess, and you could probably even save all the connect
>info and reconnect them to their servers.  But unless your servers and
>clients are all stateless, you're going to have a hard time putting the
>whole mess back into the same state.
>
>    It sounds like your library can be modified to meet my needs.  I
>    have written routines to checkpoint and roll back processes that
>    do not have open files, so if I could see how you restore the
>    files this would be a great help.
>
>    It would be *much* appreciated if you could dig up the code and
>    mail it to me.  If I make any changes I CERTAINLY will mail you
>    the diffs or the complete source of the changes if it is deemed
>    necessary.
>
>I'll probably have to pull it off tape.  I'll see if I can get to it
>today or tomorrow, if at all possible.
>
>    [ My signature deleted ]
>
>--Dave
>

[ In Dave's last correspondence I received his code and, lo and behold,
  it also handles sockets (almost) ]

From: davy@relay.ubc.ca (Dave Curry) [ Message 3 ]

>Here it comes...  I looked through it, and it seems that it already does
>catch some of the socket system calls (the ones that allocate file
>descriptors), but there's also code that checks to see if the
>descriptor is a socket in chkpt.c and restore.c, so you'll need to fix
>that.  Also check the two #ifdef vax sections, which will require a
>few lines of assembler if you're not on a Vax.
>
>Finally, check the Makefile - it probably doesn't install things where
>you will be wanting them...
>
>--Dave
>
> [ The actual code is deleted ]

[ I believe Dave's code will work, and I was in the process of getting
  it compiled when our Suns went down.  They will be up this weekend, I
  hope, and early next week I should be able to test it ]

-----------------
From: uunet!hao.UCAR.EDU!pag (Peter Gross)

>One problem: file descriptors do not always refer to files.  Depending
>on which version of Unix you are running, they could be pipes, sockets,
>fifo's, etc.
>Thus your solution of redoing the stdio lib to trap
>file names would leave some holes.
>
>--peter gross

-----------------
From: alberta!edm!steve

>stat(2) gives both an inode and a device #.  I'm not exactly sure about the
>mapping from device # to device name/mount point but, as a worst case, you
>could always stat /<mountpoint>/. for each mounted device and then stop when
>you get a correct value.
>
>  One point: from an inode #, the best that I can figure out how to get is
>A file name.  If a file has multiple links, then you can sometimes find
>multiple names for the file but, in most cases, this should not be a problem
>for you.
>
>btw: the way ncheck (probably) gets file names from inode #s is to stat every
>file in the appropriate mounted filesystem.  To speed things up, it might be
>worthwhile to assume that most of the files are in (or below) the current
>directory, and start by spanning that tree before you go thru the rest of the
>file system.
> Sorry for being so verbose.
>-------------
>Stephen Samuel (userzxcv@ualtamts.bitnet or alberta!edm!steve)
>MS-DOS : CPM impersonating UNIX ** OS/2 : IBM impersonating APPLE
>

-----------------
From: uunet!gatech.gatech.edu!emory!vss (V.S.Sunderam)

> I just read your recent postings regarding checkpointing & wanted
> to let you know of our attempts in this regard.  Our main
> interest is process migration, but checkpoint restarts are a
> special case & we do have some software that does this for Suns.
> However, we do not (yet) handle processes that use sockets; the
> only other limitation is that the process use only NFS files.
>
> The Winter 88 Usenix proceedings (pp 357) has our paper that
> describes the mechanisms & the software.  If you are interested
> I would be happy to give you more info and/or source code.
>
> V.S.Sunderam
> Dept.of Math & CS
> Emory University
> Atlanta, GA 30322
> vss@mathcs.emory.edu
> ...!gatech!emory!vss

-----------------
From: der Mouse <mcgill-vision!uunet!Larry.McRCIM.McGill.EDU!mouse> [Message 1]

>I implemented something similar once.  What I did was to checkpoint a
>process into a file for later resumption, but the constraints were
>somewhat different.  In particular, the whole point was to be able to
>restore a simulator run after a crash, which makes restoring open
>files and so on effectively impossible.  This is the difficult part of
>this: open files.  My "solution" was to force the program to close all
>files before checkpointing; this was feasible in our case.
>
>Have you considered forking and letting one process run on, with the
>"resumption" consisting of switching to the other process?  Depending
>on what you want, this might be good enough.
>
>Doing this would involve just adding two syscalls, one to dump a
>process and one to restore it.  Yes, it's possible.  I wouldn't attempt
>it without kernel source, but then I get very dogmatic about having
>source.  I'd be glad to send you the code I have for dumping and
>restoring later, in another process, though it won't be directly useful.
>
>                                       der Mouse
>
>                       old: mcgill-vision!mouse
>                       new: mouse@larry.mcrcim.mcgill.edu

[ I wrote the >> parts ]

From: der Mouse <mcgill-vision!uunet!Larry.McRCIM.McGill.EDU!mouse> [Message 2]

>> 1) If it is not too much trouble, could you please send the code.  I
>>    have implemented two routines to save and restore a process, and
>>    they do seem to work on small test programs, provided these
>>    programs do not have open files.  I am currently working on the
>>    problem with open files.
>
>> 3) One of the limitations thrust upon me is NO KERNEL CHANGES
>
>I will be astonished if you get it to work with no kernel changes,
>unless you always use OMAGIC executables, and even then I would expect
>it to be quite a can of worms.
>
>My code consists of two syscalls, one to dump a process and the other
>to restore it.  The kernel code is in the following shar as snapshot.c;
>the only other tricky part is that the user-level code surrounding the
>snapshot syscall is special.  Everything but the stack pointer is saved
>on the stack to make life easier for the kernel.  This code follows
>after the shar.
>
>The kernel code here is for a mtXinu 4.3+NFS system; for real 4.3 all
>that needs changing is to scrap the silly vnode code and put back the
>real inode stuff.
>
 [ Actual code deleted ]
>
>Since you are forbidden kernel changes, this probably won't be much use
>to you.  If you'd like to talk about this some more, feel free to send
>me mail.
>
>der Mouse
>
>old: mcgill-vision!mouse
>new: mouse@larry.mcrcim.mcgill.edu

--
Robert Side <rside@uvunix.uvic.cdn>
UUCP: ...!{ubc-vision,uw-beaver,ssc-vax}!uvicctr!rside
BITNET: rside@uvunix.bitnet
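
-----------------
For anyone who wants to try the fstat()-and-search approach that Chip
Rosenthal and Stephen Samuel describe, here is a minimal sketch in C.
It is not taken from anyone's code mentioned in this summary.  The
function names fd_kind(), search_tree() and fd_to_name() are made up
for illustration, and the sketch assumes a POSIX system with
opendir()/readdir()/lstat().  It has the limits the replies point out:
a pipe or socket has no name, an unlinked file has no name, a file
with several links yields only the first name found, and directories
the caller cannot read are silently skipped.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations such as lstat() */
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: say what kind of object a descriptor refers to. */
const char *fd_kind(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)  return "invalid";
    if (S_ISREG(st.st_mode))  return "regular file";
    if (S_ISDIR(st.st_mode))  return "directory";
    if (S_ISFIFO(st.st_mode)) return "pipe or fifo";
    if (S_ISSOCK(st.st_mode)) return "socket";
    return "other";
}

/* Hypothetical helper: depth-first search below dir for a directory
 * entry whose (device, inode) pair matches.  Returns 1 and copies the
 * first matching path into out, or returns 0 if nothing matched. */
static int search_tree(const char *dir, dev_t dev, ino_t ino,
                       char *out, size_t outlen)
{
    DIR *d = opendir(dir);
    struct dirent *e;
    int found = 0;

    if (d == NULL)
        return 0;                       /* unreadable directory: skip it */
    while (!found && (e = readdir(d)) != NULL) {
        char path[4096];
        struct stat st;
        int n;

        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        n = snprintf(path, sizeof path, "%s/%s", dir, e->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;                   /* path too long for the buffer */
        if (lstat(path, &st) != 0)      /* lstat: do not follow symlinks */
            continue;
        if (st.st_dev == dev && st.st_ino == ino) {
            if ((size_t)n < outlen) {
                memcpy(out, path, (size_t)n + 1);
                found = 1;
            }
        } else if (S_ISDIR(st.st_mode)) {
            found = search_tree(path, dev, ino, out, outlen);
        }
    }
    closedir(d);
    return found;
}

/* Hypothetical entry point: find one name for fd below root.
 * Returns 0 on success, -1 if fd is not a regular file, has been
 * unlinked, or no name was found below root. */
int fd_to_name(int fd, const char *root, char *out, size_t outlen)
{
    struct stat st;

    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode) || st.st_nlink == 0)
        return -1;
    return search_tree(root, st.st_dev, st.st_ino, out, outlen) ? 0 : -1;
}
```

A caller would pass a starting directory it expects the file to live
under, for example fd_to_name(fileno(fp), ".", buf, sizeof buf), in
line with the suggestion above to try the current directory tree before
the rest of the file system.  Searching from a mount point covers a
whole filesystem but can be very slow.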