ast@cs.vu.nl (Andy Tanenbaum) (03/25/90)
In article <6548@becker.UUCP> bdb@becker.UUCP (Bruce Becker) writes:
> More or less. All resources can be allocated
> to a job, including cpu, memory, files, disk
> drives, modems, etc. The difference is that
> you can tell (until the new version this
> year perhaps) where the resource lives, but
> you treat it locally (except for diskette
> drives - inserting a diskette halfway across
> a continent is a bit inconvenient 8^). One can
> get an executable from one machine, execute
> it on another, and have the screen/kybd i/o
> sent to your own machine - all simple to do,
> built in to the system. The name space isn't
> as global as the way you describe, in the current
> version - I imagine better name space management
> might come with the POSIX version due out this
> spring...

The key issue in a distributed system is transparency. Are the users
aware of where things are? I propose a kind of Turing test for
distributedness: you take two teams of users and give each person a
terminal. The people on Team A all have terminals on a standard
timesharing system. The people on Team B all have terminals attached
to a distributed system. The object of the game is for each team to
figure out which system it has. The object of the game for the system
designers is to make them fail. If nobody can tell, you have a
distributed system.

In a funny way, the old CDC Cybers are very distributed. They have one
or more CPUs and a whole bunch of PPUs that do lots of work. The users
do not rlogin to PPUs or tell them to do anything. The average user
doesn't even know they exist. It looks like one system. However, the
Cybers are not really distributed, because an additional requirement
really should be that the processors need not all be in the same
room, which they must be with the Cybers.

It is legitimate for person 1 to start up a huge CPU-bound computation
to see if person 2 notices any performance degradation. On a
timesharing system, person 2 would see this (Butler Lampson once
called this a covert channel). If the system being tested has the
property that your jobs always run on your "home" machine, so that the
other team members who happen to be logged in elsewhere don't notice
anything, it is not distributed. There must be a common pool of
resources shared equally among all users, just like in a timesharing
system. Remember, the goal of the system designers is to make a system
where the team can't tell. This implies, as a bare minimum, that the
*system* does process scheduling, e.g. assigning processes to
processors at random, using round-robin, using lowest load, or
something like that. The user should have nothing to do with it.

I don't think Amoeba would fully pass at the moment, but I don't think
any other existing system would either. Nevertheless, that is clearly
the goal we are aiming at, and I think we are not doing all that much
worse than everyone else. Could QNX pass the test? How about Mach?
Chorus? Sprite? Others?

Andy Tanenbaum (ast@cs.vu.nl)
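To make the scheduling requirement concrete, here is a minimal sketch
in C of the three placement policies just mentioned -- random,
round-robin, and lowest load. The host table and load field are
invented for the example; this is not Amoeba code.

/* Sketch of system-level process placement: the *system*, not the
 * user, picks a processor.  Hypothetical data structures. */
#include <stdio.h>
#include <stdlib.h>

#define NHOSTS 8

struct host {
    int id;
    int load;               /* number of runnable processes */
};

static struct host pool[NHOSTS];
static int rr_next = 0;     /* cursor for round-robin */

int place_random(void)      /* policy 1: random placement */
{
    return pool[rand() % NHOSTS].id;
}

int place_round_robin(void) /* policy 2: round-robin placement */
{
    int id = pool[rr_next].id;
    rr_next = (rr_next + 1) % NHOSTS;
    return id;
}

int place_lowest_load(void) /* policy 3: lowest-load placement */
{
    int best = 0, i;
    for (i = 1; i < NHOSTS; i++)
        if (pool[i].load < pool[best].load)
            best = i;
    return pool[best].id;
}

int main(void)
{
    int i;
    for (i = 0; i < NHOSTS; i++) {
        pool[i].id = i;
        pool[i].load = rand() % 5;      /* fake load figures */
    }
    printf("random:      host %d\n", place_random());
    printf("round-robin: host %d\n", place_round_robin());
    printf("lowest-load: host %d\n", place_lowest_load());
    return 0;
}

The point is only that the choice happens inside the system; whichever
policy is used, the user never names a processor.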
gl8f@astsun9.astro.Virginia.EDU (Greg Lindahl) (03/26/90)
In article <6117@star.cs.vu.nl> ast@cs.vu.nl (Andy Tanenbaum) writes:
>The key issue in a distributed system is transparency. Are the users
>aware of where things are? I propose a kind of Turing test for
>distributedness:
[...]
>It is legitimate for person 1 to start up a huge CPU-bound computation to
>see if person 2 notices any performance degradation. On a timesharing
>system, person 2 would see this (Butler Lampson once called this a covert
>channel).

Unfortunately, the terms of this test can easily be abused. If the
distributed system has more processors than processes, and the two
users start up truly CPU-bound programs (e.g. no paging, no I/O except
to print "I'm done after consuming 57 hours of CPU time", etc.), then
the distributed system should show no slowdown while the timesharing
system will see a large slowdown.

This abuse of the test depends on the observation that, under sane
operating systems (I exclude VM/CMS CPU partitions), a single
CPU-bound program running on an idle single-processor timesharing
system can consume essentially all of the CPU time. And that's all
there is, so multiple programs can't consume twice as much.

You might think that a "normal" program, which depends on several
resources, would provide a better test. In that case, you might expect
that several normal programs would be able to consume more total
resources than one normal program on both the timesharing and the
distributed system.

Of course, since this is ast's test, my CPU-bound program would solve
a problem which could be solved using integers between 0 and 255, such
as finding minimal Golomb rulers. None of this floating point stuff. :-)

Greg Lindahl
gl8f@virginia.edu
I gave my lunch for space-sickness research.
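Here, for concreteness, is roughly what such a probe could look like;
a sketch only, with the busy-work loop and the coarse time()-based
clock chosen arbitrarily.

/* Spin for roughly `seconds` of wall-clock time and report how many
 * iterations of busy work fit.  Pure CPU work: no I/O in the loop,
 * no paging. */
#include <stdio.h>
#include <time.h>

int main(void)
{
    const int seconds = 10;
    time_t end = time(NULL) + seconds;
    unsigned long count = 0;
    volatile unsigned char x = 0;   /* integers 0..255 only :-) */

    while (time(NULL) < end) {
        x = (unsigned char)(x + 1);
        count++;
    }
    printf("%lu iterations in %d seconds\n", count, seconds);
    return 0;
}

Run one copy, then several copies at once: on a single-CPU timesharing
system each copy reports proportionally fewer iterations, while a
distributed system with idle processors shows no drop at all.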
feustel@well.sf.ca.us (David Alan Feustel) (03/26/90)
Regarding the Amoeba system: I don't remember whether you said you
would make the source code available. I will be interested if you do.
--
Phone: (home) 219-482-9631
E-mail: feustel@well.sf.ca.us
{ucbvax,apple,hplabs,pacbell}!well!feustel
USMAIL: Dave Feustel, 1930 Curdes Ave, Fort Wayne, IN 46805
douglis@ALLSPICE.BERKELEY.EDU (Fred Douglis) (03/26/90)
Sprite would not pass Andy's test for distributedness, though it, too,
would come close. Originally we had the notion that workstations
wouldn't even have hostnames -- a workstation would just be a display
onto a distributed time-sharing system. The problem is that making
everything anonymous, including the machines on people's desks, would
make it harder to distinguish between machines -- and the rest of the
world believes in separate hosts. For example, what if someone from
Sprite wanted to rlogin to a Unix machine? If rlogin just claimed to
be from a single Internet address, "sprite.Berkeley.EDU", then all
packets would have to be routed via that machine, even though my own
workstation is capable of running IP/TCP and communicating directly.
But "finger" on Sprite lists all users on all Sprite workstations,
including which workstation each user is most recently active on, and
mail goes out as "sprite". There's a single shared file system. The
real distinction between Sprite and Andy's distributed system is that
we name our workstations and can rlogin between them.

With respect to load sharing, we provide transparent remote execution
(users on one host can run "pmake" and other programs to use multiple
hosts in parallel), but we also provide performance guarantees to the
owner of a workstation. Andy suggests that in a true distributed
system, all processors are available to all users, and someone on one
machine would notice a performance degradation when someone else on
another machine started a large program. This is true, and to us it
suggests that we don't necessarily want a "true" distributed system,
only something close. In Sprite, anyone can use idle hosts, but if a
user comes back to a host that's running someone else's processes, the
foreign processes get migrated back to the originator's machine.
People who are now using Sprite seem uniformly to appreciate the
ability to control their own machine.

I'd be interested in hearing how other people address this issue. Do
you think workstation ownership is a reasonable notion, or is it
outdated?

============    ===========================    ==============
Fred Douglis    douglis@sprite.Berkeley.EDU    ucbvax!douglis
============    ===========================    ==============
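For readers who want the flavor of the eviction policy, here is a toy
version in C. The structures and names are invented for illustration;
this is not Sprite source code.

/* Toy sketch of Sprite-style eviction: foreign processes may use an
 * idle workstation, but are migrated home when the owner returns. */
#include <stdio.h>

struct proc {
    int pid;
    int home_host;          /* where the process originated */
    int current_host;       /* where it is running now */
};

/* The owner of `host` is active again: evict every foreign process
 * by migrating it back to its home machine. */
void owner_returned(int host, struct proc procs[], int nprocs)
{
    int i;
    for (i = 0; i < nprocs; i++) {
        if (procs[i].current_host == host &&
            procs[i].home_host != host) {
            printf("migrating pid %d from host %d back to host %d\n",
                   procs[i].pid, host, procs[i].home_host);
            procs[i].current_host = procs[i].home_host;
        }
    }
}

int main(void)
{
    struct proc procs[] = {
        { 100, 1, 2 },      /* pmake child placed on idle host 2 */
        { 101, 2, 2 },      /* the owner's own process stays put */
    };
    owner_returned(2, procs, 2);
    return 0;
}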
blarson@dianne.usc.edu (bob larson) (03/26/90)
In article <9003260052.AA262747@sprite.Berkeley.EDU>
douglis@ALLSPICE.BERKELEY.EDU (Fred Douglis) writes:
>Sprite would not pass Andy's test for distributedness, though it, too,
>would come close.
[...]
>For example, what if someone from Sprite wanted to rlogin to a Unix
>machine? If rlogin just claimed to be from a single Internet address,
>"sprite.Berkeley.EDU", then all packets would have to be routed via
>that machine, even though my own workstation is capable of running
>IP/TCP and communicating directly.

Couldn't the Sprite network be configured to look like a single
machine with multiple network interfaces on the same network? Although
unusual, I think this would be legal. Would redirects be sufficient to
allow packets to be sent to the host that (currently) wants them?

>With respect to load sharing, we provide transparent remote execution
>(users on one host can run "pmake" and other programs to use multiple
>hosts in parallel), but we also provide performance guarantees to the
>owner of a workstation.

TOPS-20 also had a way to guarantee performance to users, so a minimum
guarantee of performance really isn't a way to distinguish a
distributed system from a single-processor one. (I think this was
normally used for groups, but a group could consist of a single user.)
What's the difference between a guarantee of 10% of a 10-MIPS machine
and that of 100% of a 1-MIPS machine?

Bob Larson (blars)  blarson@usc.edu  usc!blarson
	I do enjoy receiving money. -- Richard Stallman
--** To join Prime computer mailing list send mail to **---
info-prime-request@ais1.usc.edu or usc!ais1!info-prime-request
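Making the closing comparison concrete (this is just arithmetic, not a
claim about TOPS-20's actual policy):

    0.10 x 10 MIPS  =  1 MIPS  =  1.00 x 1 MIPS

As a guaranteed floor on raw compute the two are identical; they
differ only in how much headroom sits above the floor when the rest of
the machine happens to be idle.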
lws@comm.wang.com (Lyle Seaman) (04/03/90)
ast@cs.vu.nl (Andy Tanenbaum) writes:
>There must be a common pool of resources shared equally among all users,
>just like in a timesharing system. Remember, the goal of the system designers

I'm not sure this is an ideal goal for a distributed system. The
concept of priorities has been around for a long time in centralized
systems, and I think a reasonable way to carry it over would be
something like what Sprite does. The distributed priority would then
be the composition of the priorities on the individual machines. This
would allow me to have, at a minimum, the resources available on my
own machine. If someone in my environment is wasteful of resources,
there will be a limit to that person's ability to impair my access to
resources.

This is one of the central arguments in the ongoing "single-user vs.
shared" debate. One of the nice things about using many small systems
instead of one large system is that I don't have to share my machine
with any of a small group of people who never delete old files... And
I don't have to share with some of the other people, who (instead of
keeping track of what is where) grep the whole directory hierarchy
several times a day. Or with those who recompile every piece of source
code, instead of using even a dumb make...

But I see your point about the test.
--
Lyle  Wang  lws@comm.wang.com  508 967 2322
Lowell, MA, USA  uunet!comm.wang.com!lws
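A toy version of the "local floor" idea, in C: an invented scheme for
illustration, not Sprite's actual policy. The owner's demand always
preempts guests, and guests split only the leftover.

/* Local-floor sharing on one machine: if the owner wants the CPU,
 * the owner gets all of it; otherwise guests divide it evenly.
 * Invented policy, for illustration only. */
#include <stdio.h>

void shares(int owner_wants_cpu, int nguests)
{
    if (owner_wants_cpu)
        printf("owner: 100%%, each of %d guests: 0%%\n", nguests);
    else if (nguests > 0)
        printf("owner idle, each of %d guests: %d%%\n",
               nguests, 100 / nguests);
    else
        printf("machine idle\n");
}

int main(void)
{
    shares(1, 3);   /* wasteful neighbors can't impair my access */
    shares(0, 3);   /* but my idle cycles aren't wasted either */
    return 0;
}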