mike@sachiko.acc.stolaf.edu (Mike Haertel) (04/15/91)
In article <10422@pitt.UUCP> jonathan@cs.pitt.edu (Jonathan Eunice) writes:
>Why isn't process migration common?

Two reasons I can think of right off the top of my head are:

1. It is probably difficult to retrofit to existing operating systems. My guess (as someone who is attempting to design an OS for fun) is that some early design decisions (such as whether to have lots of implicit per-process state, like Unix) can make or break the feasibility of migration.

2. The benefits may not be all that great. For example, is it worthwhile to migrate a process that is paged to a local disc to another machine from which it will have to be paged over the network? Similar questions apply regarding other kinds of I/O-bound jobs.

One OS I know of that could easily support migration is Amoeba; it's interesting to note that the current version of Amoeba supports load balancing to control processor allocation at process creation time, but does not have any more general migration facilities. I suspect this is a case of getting 90% of the benefits at 10% of the cost.

Personally, I tend to think of process migration as being a good test of the flexibility of an OS, but not something you really want to use in practice.
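To make "load balancing at process creation time" concrete, here is a minimal sketch of the idea: the placement decision is made once, when the process is created, and the process then stays put. This is not Amoeba's algorithm; the node names, the load figures, and the helpers place_new_process and spawn are all invented for illustration.

```python
# Hypothetical sketch: choose a processor once, at process creation time.
# Node names and load figures are invented; this is not Amoeba's algorithm.

def place_new_process(loads):
    """Return the least-loaded node; ties go to the alphabetically first name."""
    return min(sorted(loads), key=lambda node: loads[node])

def spawn(loads, cost=1):
    """Place a new process and charge its assumed cost to the chosen node."""
    node = place_new_process(loads)
    loads[node] += cost
    return node

loads = {"a": 2, "b": 0, "c": 1}
placements = [spawn(loads) for _ in range(4)]
print(placements)
print(loads)
```

Nothing here ever moves a process after it has started, which is exactly the limitation the post describes: if the load picture changes later, creation-time placement alone cannot correct it.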
gdtltr@chopin.udel.edu (root@research.bdi.com (Systems Research Supervisor)) (04/17/91)
In article <10422@pitt.UUCP> jonathan@cs.pitt.edu (Jonathan Eunice) writes:
=>
=>Why isn't process migration common?
=>

Two reasons:

1) It is hard to do in general. There is more to a process's state than most people realize, including the kernel state and relationships with other processes. There is also the potential overhead of copying the entire code and data across a network. There are a couple of papers on the subject that confirm this, but I don't have my bibliographical stuff with me.

2) Depending on the system model, you may get comparable performance simply by using load balancing when new processes are created and leaving them where they go. The only case where I would want process migration is when more than one long-running CPU-bound process is assigned (erroneously) to a single node.

The most prominent use of process migration I know of is in Sprite, which primarily uses it as a policy tool: foreign processes running on a user's idle workstation are evicted to their "home" node when the user returns.

                                        Gary Duzan
                                        Time Lord
                                        Third Regeneration

--
                          gdtltr@brahms.udel.edu
   _o_                    ----------------------                        _o_
 [|o o|]  Two CPU's are better than one; N CPU's would be real nice.  [|o o|]
  |_o_|        Disclaimer:  I AM Brain Dead Innovations, Inc.          |_o_|
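The eviction policy attributed to Sprite above can be sketched in a few lines. This is only an illustration of the policy as described in the post, not Sprite's code: the Process record, its fields, and evict_foreign are hypothetical names, and the actual state transfer is reduced to changing one field.

```python
from dataclasses import dataclass

@dataclass
class Process:
    pid: int
    home: str   # node where the process was started (assumed field)
    node: str   # node where the process currently runs (assumed field)

def evict_foreign(processes, workstation):
    """When the owner returns, send every foreign process back to its home node."""
    evicted = []
    for p in processes:
        if p.node == workstation and p.home != workstation:
            p.node = p.home          # stands in for the real migration step
            evicted.append(p.pid)
    return evicted

procs = [
    Process(1, home="ws1", node="ws1"),   # the owner's own process stays
    Process(2, home="ws2", node="ws1"),   # foreign, will be evicted
    Process(3, home="ws3", node="ws1"),   # foreign, will be evicted
    Process(4, home="ws2", node="ws2"),   # not on this workstation at all
]
evicted = evict_foreign(procs, "ws1")
print(evicted)
```

The point of the sketch is that migration here serves workstation ownership rather than load balancing: the trigger is the user coming back, not a load measurement.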
gdtltr@chopin.udel.edu (root@research.bdi.com (Systems Research Supervisor)) (04/17/91)
In article <1991Apr14.215738.8745@news.stolaf.edu> mike@sachiko.acc.stolaf.edu (Mike Haertel) writes:
=>
=>One OS I know of that could easily support migration is Amoeba; it's
=>interesting to note that the current version of Amoeba supports load
=>balancing to control processor allocation at process creation time,
=>but does not have any more general migration facilities. I suspect
=>this is a case of getting 90% of the benefits at 10% of the cost.
=>

I believe that Amoeba supports a process checkpointing function (stun?) for dumping process state to a file. Theoretically, the dumped process could be restarted on another compatible processor. Amoeba avoids the problem of redirecting I/O by making capabilities for files location-independent. Note, however, that if a related process or file is killed or deleted while the process is being migrated, the restarted process would most likely fail at some point.

                                        Gary Duzan
douglis@cs.vu.nl (Fred Douglis) (04/17/91)
mike@sachiko.acc.stolaf.edu (Mike Haertel) writes:
>One OS I know of that could easily support migration is Amoeba; it's
>interesting to note that the current version of Amoeba supports load
>balancing to control processor allocation at process creation time,
>but does not have any more general migration facilities. I suspect
>this is a case of getting 90% of the benefits at 10% of the cost.

As the designer of process migration in Sprite, and the person who plans to implement full process migration in Amoeba, I might as well throw in my two cents:

- Process migration is not that hard, once you have transparent remote execution. As Amoeba already supports the latter, adding migration shouldn't be the other 90% of the cost. Rather, I'd guess (and I'm hoping) that 90% of the cost has already been paid, and I can put in full migration for the other 10%.

- You're absolutely right about process creation time being the important point. I think the general consensus about migration is that it's more useful for other things, like machine autonomy (as in Sprite, with personal workstations) or to move from a machine that's being shut down (TCF/AIX has been used for this purpose).

--
=============================================================================
Fred Douglis, Vrije Universiteit, douglis@cs.vu.nl          +31 20 548-5777
=============================================================================
timk@cs.qmw.ac.uk (Tim Kindberg) (04/17/91)
In <16932@chopin.udel.edu> gdtltr@chopin.udel.edu (root@research.bdi.com (Systems Research Supervisor)) writes:
>In article <10422@pitt.UUCP> jonathan@cs.pitt.edu (Jonathan Eunice) writes:
>=>
>=>Why isn't process migration common?
>=>
> Two reasons: 1) It is hard to do in general. There is more to a
>process's state than most people realize, including the kernel state and
>relationships with other processes. There is also the potential overhead
>of copying the entire code and data across a network.

To minimise copying, standard VM techniques can be employed. Sprite, for example, writes dirty data/stack pages to disc and page-faults its address space components back in again as necessary at the new site.

Incidentally, Sprite exhibits another migration issue related to your first point. A migrated process in Sprite still depends on its original site for some facilities, which makes it vulnerable to that site's failure and also means that it continues to impose a certain amount of load there.

Migration is a better prospect on a distributed memory multiprocessor with a high interconnection bandwidth (my own kernel, Equus, shows this: 60 milliseconds to migrate all of a 100K process over a VME bus-based network; 420 milliseconds for a 1M process).

>There are a couple
>papers on the subject that confirm this, but I don't have my bibliographical
>stuff with me.

See Y. Artsy, R. Finkel, 'Designing a process migration facility - the Charlotte experience', IEEE Computer, vol 22, no 9, Sep 89, pp 47-56, for info and references on a number of designs.

>2) Depending on the system model, you may get comparable performance simply
>by using load balancing when new processes are created and leaving them
>where they go. The only case where I would want process migration is when
>more than one long-running CPU-bound processes are assigned (erroneously)

Erroneously? Who/what knew they were going to be CPU bound? Or, suppose the system put them there because it made sense given the load on the other machines at the time? Having a process migration facility means that, when the cross-machine load profile becomes unbalanced due to processes dying or entering new phases with different load-related behaviours, you can do something about it.

Other reasons for having process migration (apart from withdrawing when someone logs on to a previously idle workstation):

1) If two processes start a lengthy interaction involving only synchronous communication, migrate one of them to the other's site to save network overhead.

2) I don't have virtual memory in Equus; if a process attempts to increase its data size and fails for lack of memory, it can migrate to another site where there is sufficient memory.

I'm interested to hear about other implementations. In particular, I'm not familiar with the AIX one: can anyone give me a reference for that?

--
Tim Kindberg
UUCP:  timk@qmw-cs.uucp                      | Computer Science Dept
ARPA:  timk%cs.qmw.ac.uk@nsfnet-relay.ac.uk  | QMW, University of London
JANET: timk@uk.ac.qmw.cs                     | Mile End Road
Voice: +44 71 975 5236 (Direct Dial)         | London E1 4NS
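The two Equus figures quoted in this post (60 ms for a 100K process, 420 ms for a 1M process) are enough for a rough back-of-the-envelope model. The sketch below fits a straight line through the two points, splitting the cost into a fixed part and a per-kilobyte part. Two things are assumptions rather than anything the post states: that "1M" means 1024K, and that migration time is linear in process size. Two data points cannot confirm linearity, so treat the result as illustrative only.

```python
# The two measurements quoted for Equus: (size in KB, time in ms).
# Assumptions: "1M" is read as 1024K, and cost is linear in process size.
small_kb, small_ms = 100, 60
large_kb, large_ms = 1024, 420

# Slope of the line through the two points: copy cost per kilobyte.
per_kb_ms = (large_ms - small_ms) / (large_kb - small_kb)
# Intercept: cost that does not depend on process size.
fixed_ms = small_ms - per_kb_ms * small_kb

def estimate_ms(size_kb):
    """Estimated migration time under the assumed linear model."""
    return fixed_ms + per_kb_ms * size_kb

print(f"per KB: {per_kb_ms:.2f} ms, fixed: {fixed_ms:.1f} ms")
```

Under these assumptions the per-kilobyte cost comes out at roughly 0.39 ms and the fixed cost at roughly 21 ms, so for anything much larger than 100K the copy dominates and interconnect bandwidth is what matters, which is consistent with the point made about high-bandwidth multiprocessors.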
gd@geovision.gvc.com (Gord Deinstadt) (04/18/91)
In article <10422@pitt.UUCP> jonathan@cs.pitt.edu (Jonathan Eunice) writes:
>Why isn't process migration common?

Several replies suggested that it may not be all that useful. I can think of a use: fault-tolerant systems. Though you need more than just process migration, it would be an interesting way to start.

--
Gord Deinstadt gdeinstadt@geovision.UUCP
gdtltr@brahms.udel.edu (root@research.bdi.com (Systems Research Supervisor)) (04/19/91)
In article <3057@redstar.cs.qmw.ac.uk> timk@cs.qmw.ac.uk (Tim Kindberg) writes:
=>In <16932@chopin.udel.edu> gdtltr@chopin.udel.edu (root@research.bdi.com
=>(Systems Research Supervisor)) writes:
=>
=>>where they go. The only case where I would want process migration is when
=>>more than one long-running CPU-bound processes are assigned (erroneously)
=>Erroneously? Who/what knew they were going to be CPU bound?

I suppose my not-terribly-well-thought-out statement above reflects a subconscious desire for more intelligent prediction of process behavior. In any case, I am most familiar with the Amoeba processor pool model, in which the normal case would have the processes running on different processors anyway.

Of course, if Dr. Douglis is correct in his estimate of the complexity of adding process migration to Amoeba, I have no objections. It can certainly be used to support a number of scheduling policies. If process migration can be done without a significant reduction in system performance or a major increase in kernel complexity, there is little reason not to put it in.

                                        Gary Duzan
gdtltr@brahms.udel.edu (root@research.bdi.com (Systems Research Supervisor)) (04/20/91)
In article <1505@geovision.gvc.com> gd@geovision.gvc.com (Gord Deinstadt) writes:
=>In article <10422@pitt.UUCP> jonathan@cs.pitt.edu (Jonathan Eunice) writes:
=>>Why isn't process migration common?
=>
=>Several replies suggested that it may not be all that useful. I can
=>think of a use: fault-tolerant systems. Though you need more than
=>just process migration, it would be an interesting way to start.

How do you propose to migrate a process off a node that has failed? I suppose if you had a fairly recent checkpoint for the process you could restart it, but you would have to deal with any I/O the process performed between the checkpoint and the failure. In any case, I don't think this is process migration, per se, but it is related.

                                        Gary Duzan
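The problem raised here, I/O performed between the checkpoint and the failure, can be shown with a toy example. The sketch below uses one common remedy, which is a swapped-in technique and not something proposed in the thread: every output carries a sequence number, and the output sink drops anything it has already accepted, so replaying from the checkpoint does not duplicate output. It assumes the computation is deterministic, and the Sink class, the run function, and the state layout are all hypothetical.

```python
import pickle

class Sink:
    """Output device that remembers the highest sequence number it accepted."""
    def __init__(self):
        self.lines = []
        self.last_seq = -1

    def write(self, seq, text):
        if seq <= self.last_seq:   # already emitted before the crash: drop the replay
            return False
        self.lines.append(text)
        self.last_seq = seq
        return True

def run(state, sink, steps):
    """Advance a deterministic computation, emitting one output per step."""
    for _ in range(steps):
        state["total"] += state["step"]
        sink.write(state["step"], f"step {state['step']}: total={state['total']}")
        state["step"] += 1
    return state

sink = Sink()
state = run({"step": 0, "total": 0}, sink, 2)
checkpoint = pickle.dumps(state)      # checkpoint taken after steps 0 and 1
run(state, sink, 2)                   # steps 2 and 3 reach the sink, then the node "fails"
restored = pickle.loads(checkpoint)   # restart elsewhere from the checkpoint
run(restored, sink, 3)                # replays steps 2 and 3 (dropped), then runs step 4
print(sink.lines)
```

Without the sequence check the sink would see steps 2 and 3 twice. Input that was consumed after the checkpoint is a harder problem and is not addressed by this sketch.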