anthony@alberta.UUCP (Anthony Mutiso) (11/02/88)
I have a feeling that someone out there has tried to implement a _remote fork_ system call. This is necessary for process migration etc.

MY PROBLEM: How does one copy an active process execution image and restart it elsewhere, jumping to the same location the parent process is at?

REQUIREMENTS: Open file descriptors, and the offsets that were in the parent process, are available in the so-called child (parent-child relationship slightly altered). All variables hold the same values as they did in the parent just prior to the _remote fork_ call. The child continues its existence from the point the parent forked at.
== all the above are the results we have all come to love == in the fork(2) system call.

I have looked at all sorts of things with very poor results.

Copying the parent's file descriptor table. But where does that leave me? At best I will end up with inode numbers that are rather difficult to map back to file pathnames.

Generating a core of the running process (stopped, of course) and finding a way to transform the core(5) to a.out(5) format with the program entry point somewhere else. (How does one do that?)

I need ideas, clues, insight, and general all-round help. If anyone has looked at this issue, please fill me in (mail).

Thanks for any hints,
Anthony Mutiso
anthony@alberta.uucp or {watmath, ubc-vision}!alberta!anthony
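For reference, the fork(2) behaviour the post wants to reproduce remotely can be shown locally. The following is a minimal sketch, not taken from the thread: the function name `offset_after_child_read` is hypothetical, and it assumes a POSIX system. It illustrates that the child inherits the parent's descriptor and that the two share one file offset, so a read in the child moves the offset the parent sees.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only).
 * Open `path`, seek to `start`, fork, and let the child read up to
 * `nbytes` bytes (capped at 256). Returns the parent's file offset
 * after the child has exited, or -1 on error.
 */
off_t offset_after_child_read(const char *path, off_t start, size_t nbytes)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (lseek(fd, start, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: same descriptor number, same offset as the parent. */
        char buf[256];
        size_t want = nbytes < sizeof buf ? nbytes : sizeof buf;
        ssize_t got = read(fd, buf, want);
        _exit(got < 0 ? 1 : 0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }

    /* Parent: the offset has moved by whatever the child read. */
    off_t now = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return now;
}
```

Called on a file of at least 30 bytes as `offset_after_child_read(path, 10, 20)`, the parent should observe offset 30 even though it never read anything itself. A remote fork cannot get this sharing for free, because the two processes no longer sit under the same kernel's open file table.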
dlm@cuuxb.ATT.COM (Dennis L. Mumaugh) (11/03/88)
In article <1777@pembina.UUCP> anthony@alberta.UUCP (Anthony Mutiso) writes:
> I have a feeling that someone out there has tried to
> implement a _remote fork_ system call. This is necessary for
> process migration etc.
> MY PROBLEM: How does one copy an active process execution
> image and restart it elsewhere, jumping to the same location the
> parent process is at?
> REQUIREMENTS: Open file descriptors, and the offsets that
> were in the parent process, are available in the so-called
> child (parent-child relationship slightly altered). All
> variables hold the same values as they did in the parent just
> prior to the _remote fork_ call. The child continues its
> existence from the point the parent forked at. == all the
> above are the results we have all come to love == in the
> fork(2) system call.
> I have looked at all sorts of things with very poor results.
> Copying the parent's file descriptor table. But where does
> that leave me? At best I will end up with inode numbers that
> are rather difficult to map back to file pathnames.
> Generating a core of the running process (stopped, of course)
> and finding a way to transform the core(5) to a.out(5) format
> with the program entry point somewhere else. (How does one do
> that?)
> I need ideas, clues, insight, and general all-round help.
> If anyone has looked at this issue, please fill me in
> (mail).

Ordinarily I would respond by email BUT people haven't heard of the following, so I will append the necessary pointers. By the way, it includes the scheme for mapping from file descriptor to file name, etc.

[ This is refer format ].

%A D. H. Lawrie
%A J. M. Randal
%A R. R. Barton
%T Experiments with Automatic File Migration
%J COMP
%I University of Illinois
%D 1982
%P 45-55

%A David Maier
%R UIUCDCS-R-86-1240
%I Department of Computer Science, University of Illinois
%C Urbana, Illinois 61801

%A R. P. Cagel
%T Process Suspension and Resumption in The UNIX System V Operating System
%D January 1986
%K process migration
%R M.S. Thesis
%X Process suspension and resumption features were added to the UNIX kernel. This will allow a user to reboot the operating system without having to kill long running processes. The process images are extracted from the kernel and saved in disk storage before the system halts. Each process may be restarted from the point where it left off or even moved to another machine to be resumed. This thesis describes the kernel changes to accomplish this.

%T An unix 4.2 BSD implementation of Process suspension and resumption
%A A.Y. Chen
%D June 86
%R UIUCDCS-R-86-1286
%I Department of Computer Science, University of Illinois
%C Urbana, Illinois 61801

--
=Dennis L. Mumaugh
 Lisle, IL       ...!{att,lll-crg}!cuuxb!dlm  OR cuuxb!dlm@arpa.att.com
ag@elgar.UUCP (Keith Gabryelski) (11/03/88)
In article <1777@pembina.UUCP> anthony@alberta.UUCP (Anthony Mutiso) writes:
>MY PROBLEM: How does one copy an active process
>execution image and restart it elsewhere,
>jumping to the same location the parent process is at?

And on a related topic...

How does one stop a process in a way that it can be restarted after a cold boot?

It would seem to me that restarting a core image would be the best way. I remember some discussion on restarting core dumps a few months back. Does anyone have a copy of the thread?

Pax, Keith
--
ag@elgar.CTS.COM Keith Gabryelski ...!{ucsd, jack}!elgar!ag
anthony@alberta.UUCP (Anthony Mutiso) (11/06/88)
In article <2163@cuuxb.ATT.COM>, dlm@cuuxb.ATT.COM (Dennis L. Mumaugh) writes:
> In article <1777@pembina.UUCP> anthony@alberta.UUCP (Anthony Mutiso) writes:
> >
> > MY PROBLEM: How does one copy an active process execution
> > image and restart it elsewhere, jumping to the same location the
> > parent process is at?
>
> [ This is refer format ].
>
> %A D. H. Lawrie
> %A J. M. Randal
> %A R. R. Barton
> %T Experiments with Automatic File Migration
> %J COMP
> %I University of Illinois
> %D 1982
> %P 45-55

(1) How would one go about converting a core image to an a.out object?

(2) All the data in the new a.out should be initialized to the values present in the core image at the time it was made.

(3) The program entry point should be somewhere in the program other than the main function, so that "the process continues as if it always existed". Of course, some type of init function will have to reopen all the former process's files and wind them up to the correct locations.

Hints, ideas, anything.

Anthony Mutiso
anthony@alberta.UUCP
jbn@glacier.STANFORD.EDU (John B. Nagle) (11/07/88)
It's been done. See "The LOCUS Distributed System Architecture", by Popek and Walker, MIT Press, 1985. ISBN 0-262-16102-8

LOCUS is a distributed UNIX kernel developed at UCLA. It's 4.2BSD compatible, yet allows full distribution over a network of heterogeneous machines. Processes can be migrated from one similar machine to another while running, using the migrate(II) system call. Open files, pipes, signals, and sockets survive migration. Even shared file position works; the mechanism for doing this efficiently is very clever. A user can migrate his own tasks, or a background scheduler may force task migration.

Across heterogeneous CPUs, one can perform "exec"; an "exec" of an object program that needs a different kind of machine results in execution on a suitable machine elsewhere in the network.

Very impressive. Not clear why it never caught on.

John Nagle
ekrell@hector.UUCP (Eduardo Krell) (11/07/88)
In article <17819@glacier.STANFORD.EDU> jbn@glacier.UUCP (John B. Nagle) writes:
(about LOCUS)
>Not clear why it never caught on.

But it did. It is licensed by IBM as part of AIX. They call it "Transparent Computing Facility", I think.

Eduardo Krell
AT&T Bell Laboratories, Murray Hill, NJ
UUCP: {att,decvax,ucbvax}!ulysses!ekrell
Internet: ekrell@ulysses.att.com
gwyn@smoke.BRL.MIL (Doug Gwyn ) (11/09/88)
In article <16@elgar.UUCP> ag@elgar.UUCP (Keith Gabryelski) writes:
>How does one stop a process in a way that it can be restarted after a
>cold boot?

You obviously can't, in general.
geoff@eagle_snax.UUCP ( R.H. coast near the top) (11/09/88)
In article <17819@glacier.STANFORD.EDU> jbn@glacier.UUCP (John B. Nagle) writes:
>
> It's been done. See "The LOCUS Distributed System Architecture",
>by Popek and Walker, MIT Press, 1985. ISBN 0-262-16102-8
>LOCUS is a distributed UNIX kernel developed at UCLA. It's 4.2BSD
>compatible, yet allows full distribution over a network of heterogeneous
>machines. [...] Very impressive. Not clear why it never caught on.

Well, the early versions were pretty s-l-o-o-o-w at doing the niftier things, but I think a lot of that got fixed.

However, I understand that quite a number of companies entered into licensing negotiations with Locus, including at least one which bet the - software - future of the company on being able to get hold of Locus and use it to compete with Apollo and Sun. Unbeknownst to these hopefuls, IBM had funded the Locus startup, and eventually decided to exercise their option to an exclusive license, thus causing Locus to pull the plug on all of the other suitors.

A number of the elements of Locus are now beginning to trickle out in the form of AIX features.

--
Geoff Arnold, Sun Microsystems Inc.   +------------------------------------+
PC Distributed Systems(home of PC-NFS)|Someone, somewhere, wants an RFC822 |
UUCP: {hplabs,decwrl...}!sun!garnold  |message from YOU.                   |
ARPA: garnold@sun.com                 +------------------------------------+