aglew@crhc.uiuc.edu (Andy Glew) (10/26/90)
Does anyone have a shareable, networked, swap device?

Anyone = commercial or academic / public domain, any flavour of UNIX.

Shareable, networked, swap device: NOT mkfile/swapon! What I seek is a swap device that multiple workstations share dynamically, not a disk that can support several statically sized swap partitions (well, not quite static, but more swap space isn't allocated automagically).

In any typical network of n machines, each machine is allocated, say, 32M of swap space, for a total of 32*n megabytes. But, in our network, typically half of the machines are not being used at any time, so they typically have around 30M of swap space free. At the same time, a much smaller fraction of our machines are running very large jobs, and really need swap space around 64-128M - but these aren't always the same machines (not designated compute servers).

It really would be nice if the unused swap memory of some machines could be used to temporarily, transparently, expand the swap memory of others, on demand. I.e., it would be nice if swap space was a centralized resource pool, rather than fragmented.

As I've said above, management via mkfile/swapon is possible, but klugey. You want the swap space to be transparently added, without human intervention, and without causing processes to die with the message "too big". Things aren't helped by SUN and friends not supporting the swapon -l and swapon -d (list and delete) commands found on System V.

How hard would this be to do? Modest. The typical /dev/drum interface, where the kernel assumes that it has exclusive control over a large space, could be modified. All that is really needed is an information call to indicate when a swap page has been freed, so that it can be physically removed from under one machine's /dev/drum and given to another, and a trap so that an attempt to access a /dev/drum page that has been removed can be handled by requesting it over the network.
Safety properties, of course, are a bit harder, and it really would be nicer to have a "give me NNN pages of swap space" call made to the shareable, networked, swap device.

Has anyone seen something like this?

This is posted to comp.unix.internals, because any such device is probably a driver, a server, an interface to /dev/drum, or all three; to comp.unix.large, because large systems with lots of workstations are likely to be playing the swap allocation game; and to comp.unix.admin because such a device would make system administration on large systems easier.

Followups to comp.unix.internals.

--
Andy Glew, a-glew@uiuc.edu [get ph nameserver from uxc.cso.uiuc.edu:net/qi]
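The two calls proposed in the post above (a "give me NNN pages" request and a notification that swap pages have been freed) can be sketched as a toy accounting model. This is only an illustration under stated assumptions, not a driver or a real interface: the class `SwapPool`, its methods `request` and `release`, the all-or-nothing grant policy, the 4K page size, and the four-machine network are all hypothetical. Only the 32M-per-machine figure comes from the post.

```python
class SwapPool:
    """Toy central pool of swap pages shared by several hosts (hypothetical)."""

    def __init__(self, total_pages):
        self.free_pages = total_pages
        self.held = {}  # host name -> pages currently lent to that host

    def request(self, host, npages):
        """The "give me NNN pages" call. Assumed policy: all or nothing."""
        if npages > self.free_pages:
            return False  # nothing is reserved; the caller must cope
        self.free_pages -= npages
        self.held[host] = self.held.get(host, 0) + npages
        return True

    def release(self, host, npages):
        """The "these swap pages are free again" notification."""
        npages = min(npages, self.held.get(host, 0))  # never return more than held
        self.held[host] = self.held.get(host, 0) - npages
        self.free_pages += npages
        return npages


PAGES_PER_MB = 256                      # assumes 4K pages
pool = SwapPool(total_pages=4 * 32 * PAGES_PER_MB)   # assumes 4 machines, 32M each
granted = pool.request("hostA", 96 * PAGES_PER_MB)   # one large job on hostA
```

With a pooled 128M, one host can hold 96M while the idle machines need almost nothing, which a fixed 32M-per-machine split cannot offer. The sketch says nothing about where the pages physically live or how a removed page is fetched over the network.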
lm@slovax.Sun.COM (Larry McVoy) (10/27/90)
In article <AGLEW.90Oct25235828@cobra.crhc.uiuc.edu> aglew@crhc.uiuc.edu (Andy Glew) writes:
>Does anyone have a shareable, networked, swap device?
>
> In any typical network of n machines, each machine is allocated,
>say, 32M of swap space, for a total of 32*n megabytes. But, in our
>network, typically half of the machines are not being used at any time,
>so they typically have around 30M of swap-space free. At the same
>time, a much smaller fraction of our machines are running very large
>jobs, and really need swapspace around 64-128M - but these aren't
>always the same machines (not designated compute servers).
> It really would be nice if the unused swap memory of some machines
>could be used to temporarily, transparently, expand the swap memory of
>others, on demand. Ie. it would be nice if swap space was a
>centralized resource pool, rather than fragmented.

Do you want any fairness? Should hostA be able to use up all the swapspace to the exclusion of b, c, d, and e? Should the OS provide hooks to allow you to tune this? How would you tune it? What hooks do you want?

---
Larry McVoy, Sun Microsystems (415) 336-7627 ...!sun!lm or lm@sun.com
bzs@world.std.com (Barry Shein) (10/28/90)
I have to admit it is an interesting idea.

Policy needs to be designed, obviously, but in the end what you really want is a bibop (big bag o' pages, name stolen from the lisp culture) shareable area; instead of typed pages, the "types" are host IDs (perhaps IP addresses).

Couldn't NFS *almost* do this right now. You'd create this big file and the clients would keep their own seek pointers (allocated by the server, but otherwise stateless, since each request to page in would include the seek pointer and perhaps the page size; or maybe the size is fixed, and the seek pointer becomes a funny kind of file handle).

Sounds like most of the work is on the client side (not unusual for these types of things).

So basically the operations are "store this page somewhere in file X", which would return a magic cookie used later to get that page back. Maybe a better name would be a "cloakroom swap discipline": you check in your page and get your ticket to retrieve it later.

There is still the whole pre-allocation problem (then again this might be a nice opportunity to splice in some long-overdue subterfuges...)

--
        -Barry Shein

Software Tool & Die    | {xylogics,uunet}!world!bzs | bzs@world.std.com
Purveyors to the Trade | Voice: 617-739-0202        | Login: 617-739-WRLD
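The "cloakroom swap discipline" described above (check in a page, get a ticket, hand the ticket back later to retrieve the page) might look roughly like the following. This is a minimal sketch under assumptions, not NFS code: an in-memory `io.BytesIO` stands in for the big shared file X, the page size is assumed fixed at 4096 bytes, the cookie is simply the byte offset of the slot, and the names `CloakroomSwap`, `check_in`, and `check_out` are hypothetical.

```python
import io

PAGE_SIZE = 4096  # assumed fixed page size


class CloakroomSwap:
    """Toy cloakroom: store a page, get a cookie; present the cookie to get it back."""

    def __init__(self):
        self._file = io.BytesIO()  # stands in for the big shared file X
        self._free = []            # offsets of slots that were handed back
        self._in_use = set()       # cookies currently outstanding
        self._end = 0              # offset of the first never-used slot

    def check_in(self, page):
        """Store one page somewhere in the file; return the cookie (its offset)."""
        if len(page) != PAGE_SIZE:
            raise ValueError("page must be exactly PAGE_SIZE bytes")
        if self._free:
            offset = self._free.pop()      # reuse a slot someone gave back
        else:
            offset = self._end             # otherwise grow the file
            self._end += PAGE_SIZE
        self._file.seek(offset)
        self._file.write(page)
        self._in_use.add(offset)
        return offset

    def check_out(self, cookie):
        """Return the page stored under this cookie and free its slot."""
        if cookie not in self._in_use:
            raise KeyError("unknown or already redeemed cookie")
        self._in_use.remove(cookie)
        self._file.seek(cookie)
        page = self._file.read(PAGE_SIZE)
        self._free.append(cookie)
        return page
```

Redeeming a ticket here also frees the slot, which sidesteps the question of how a client would otherwise tell the server that a page is no longer needed. The sketch does not address the pre-allocation problem mentioned in the post.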
buck@siswat.UUCP (A. Lester Buck) (10/28/90)
In article <AGLEW.90Oct25235828@cobra.crhc.uiuc.edu> aglew@crhc.uiuc.edu (Andy Glew) writes:
>Does anyone have a shareable, networked, swap device?

Even better, how about remote virtual memory? Check out the Usenix paper from summer 1990 by Comer and one of his students about their implementation of a networked virtual memory server. They show how their implementation is quite competitive with other forms of virtual memory.

--
A. Lester Buck buck@siswat.lonestar.org ...!uhnix1!lobster!siswat!buck
richard@aiai.ed.ac.uk (Richard Tobin) (10/29/90)
In article <BZS.90Oct27161143@world.std.com> bzs@world.std.com (Barry Shein) writes:
>Couldn't NFS *almost* do this right now.

I considered this. The way I looked at it was that the problem is that there's no way for the client to tell the server when a page is freed. Apart from this, it could work - the server wouldn't even have to give the client a pointer into the file, it could just map (client, client's-offset) to (server's-offset).

-- Richard
--
Richard Tobin,                 JANET: R.Tobin@uk.ac.ed
AI Applications Institute,     ARPA:  R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.          UUCP:  ...!ukc!ed.ac.uk!R.Tobin
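The server-side mapping from (client, client's-offset) to (server's-offset) suggested above can be sketched in a few lines. The class `OffsetMapper`, its method names, and the 4096-byte page size are assumptions made for illustration. The sketch deliberately has no way to free a page, which is exactly the gap the post points out: the table only ever grows.

```python
class OffsetMapper:
    """Toy server table: (client, client_offset) -> offset in the shared file."""

    PAGE_SIZE = 4096  # assumed fixed page size

    def __init__(self):
        self._map = {}          # (client, client_offset) -> server_offset
        self._next_offset = 0   # next unused offset in the server's file

    def server_offset(self, client, client_offset):
        """Look up the server offset, allocating a new slot on first use."""
        key = (client, client_offset)
        if key not in self._map:
            self._map[key] = self._next_offset
            self._next_offset += self.PAGE_SIZE
        return self._map[key]

    def pages_in_use(self):
        """Number of slots ever allocated; nothing in this sketch shrinks it."""
        return len(self._map)
```

Two clients can use the same client-side offset without colliding, because the client identity is part of the key.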
bzs@world.std.com (Barry Shein) (10/30/90)
From: richard@aiai.ed.ac.uk (Richard Tobin) [responding to me]
>>Couldn't NFS *almost* do this right now.
>
>I considered this. The way I looked at it was that the problem is
>that there's no way for the client to tell the server when a page is
>freed. Apart from this, it could work - the server wouldn't even have
>to give the client a pointer into the file, it could just map (client,
>client's-offset) to (server's-offset).

Hmm, basically a distributed scatter-gather MMU device. The client believes it has X MB of swap and the server just manages the address mappings thru typical associative memory maps.

I suppose the easiest way to free pages would be by use of a tag (the process id on the client would be a good candidate; the server doesn't much care so long as it's client-unique). Then all you need is a free_tag() operation (I assume that once a page is allocated to a process it's not freed until the process is finished; something more flexible can be left as an exercise for the reader).

So you have a three-tuple to identify any page:

        f(host_address,page_offset,tag) -> server_page_location

for each page in the page server (again, assuming pages are fixed in size, otherwise throw in size, bother).

Interestingly, the tag allows each process to have its own virtual page-address space w/o the client needing to manage that at all.

It could work. It's even stateless enough to survive server crashes, and doesn't much interfere with the current model in a Unix client of how swap is allocated.

--
        -Barry Shein

Software Tool & Die    | {xylogics,uunet}!world!bzs | bzs@world.std.com
Purveyors to the Trade | Voice: 617-739-0202        | Login: 617-739-WRLD
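The three-tuple lookup and the free_tag() operation described above can be sketched as follows. Only the shape f(host_address, page_offset, tag) -> server_page_location and the name free_tag come from the post. The class `TaggedPageServer`, the method `locate`, the slot numbering, and the example addresses and process ids are hypothetical, and pages are assumed to be of fixed size.

```python
class TaggedPageServer:
    """Toy page server keyed by the three-tuple (host, page_offset, tag)."""

    def __init__(self):
        self._where = {}   # (host, page_offset, tag) -> server slot number
        self._free = []    # slot numbers released by free_tag()
        self._next = 0     # next never-used slot number

    def locate(self, host, page_offset, tag):
        """f(host_address, page_offset, tag) -> server_page_location."""
        key = (host, page_offset, tag)
        if key not in self._where:
            if self._free:
                slot = self._free.pop()    # reuse a slot a finished process held
            else:
                slot = self._next
                self._next += 1
            self._where[key] = slot
        return self._where[key]

    def free_tag(self, host, tag):
        """Release every page held under (host, tag); return how many."""
        doomed = [k for k in self._where if k[0] == host and k[2] == tag]
        for key in doomed:
            self._free.append(self._where.pop(key))
        return len(doomed)


server = TaggedPageServer()
a = server.locate("10.0.0.1", 0, tag=101)   # process 101 on the client, its page 0
b = server.locate("10.0.0.1", 0, tag=102)   # process 102, its own page 0
freed = server.free_tag("10.0.0.1", tag=101)  # process 101 has exited
```

Because the tag is part of the key, two processes on the same host can each use page offset 0 without the client keeping them apart. The sketch keeps its table in memory, so it does not model the crash-survival property mentioned in the post.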