randy@ms.uky.edu (Randy Appleton) (11/14/90)
I have been wondering how hard it would be to set up several of the new
fast workstations as one big Mainframe.  For instance, imagine some
SPARCstations/DECstations set up in a row, and called compute servers.
Each one could handle several users editing/compiling/debugging on glass
TTY's, or maybe one user running X.

But how does each user, who is about to log in, know which machine to
log into?  He ought to log into the one with the lowest load average, yet
without logging on he cannot determine which one that is.

What would be nice is to have some software running at each machine,
maybe inside of rlogind or maybe not, that would take a login request and
check whether the current machine is the least loaded machine.  If so,
accept the login; else re-route it to that machine.

It would not be good to have each packet come in over the company
Ethernet and then get sent back over the Ethernet to the correct machine.
That would slow down the machine doing the re-sending, cause unneeded
delays in the turnaround, and clog the Ethernet.

Also, all of these machines should mount the same files (over NFS or some
such thing), so as to preserve the illusion that this is one big computer
and not just many small ones.  But still, it would be best to keep such
packets off the company net.

One solution that directs logins to the least loaded machine, and keeps
the network traffic shown to the outside world down, is this one:

                     Company Ethernet
--------------------------+------------------------------
                          |
                   ---------------
               ---| Login Server |---
              |    ---------------   |       ----------
              |                      |------| Server1  |
           -------                   |       ----------
          | Disk |                   |       ----------
           -------                   |------| Server2  |
                                     |       ----------
                                     .           .
                                     .           .
                                     |       ----------
                                     |------| Server N |
                                              ----------

The idea is that as each person logs into the Login Server, their login
shell is actually a process that looks for the least loaded server and
rlogins them to there.  This should distribute the load evenly (well,
semi-evenly) on all the servers.  Also, the login server could have all
the disks, and the others would mount them, so that no matter what node
you got (which ought to be invisible) you saw the same files.

The advantage is that this setup should be able to deliver a fair number
of MIPS to a large number of users at very low cost.  Ten SPARC servers
result in a 100 MIPS machine (give or take) and, at University pricing,
are only about $30,000 (plus disks).  Compare that to the price of a
comparably sized IBM or DEC!

So my question is: do you think this would work?  How well do you think
this would work?  Do you think the network delays would be excessive?

-Thanks
-Randy

P.S.  I've sent this to several groups, so feel free to edit the
'Newsgroups' line.  But please keep one of comp.arch or comp.unix.large,
'cause I read those.
--
=============================================================================
My feelings on George Bush's promises:
   "You have just exceeded the gullibility threshold!"
============================================Randy@ms.uky.edu==================
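To make the idea concrete, a minimal sketch of such a dispatcher "login
shell" might look like the C program below.  This is only an illustration
of the approach, not an existing product: the server names, the use of
"rsh <host> uptime" to sample load averages, and the final exec of rlogin
are all assumptions.

/*
 * least_loaded.c -- hypothetical sketch of the dispatcher login shell
 * described above.  It asks each compute server for its one-minute load
 * average (here by running "rsh host uptime", purely for illustration),
 * picks the smallest, and replaces itself with an rlogin to that machine.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Assumed server names; a real installation would list its own hosts. */
static const char *servers[] = { "server1", "server2", "server3" };
#define NSERVERS (sizeof(servers) / sizeof(servers[0]))

/* Return the 1-minute load average reported by "uptime" on a host,
 * or a huge value if the host cannot be reached. */
static double load_of(const char *host)
{
    char cmd[256], line[512];
    double load = 1e9;
    FILE *p;

    snprintf(cmd, sizeof(cmd), "rsh %s uptime 2>/dev/null", host);
    p = popen(cmd, "r");
    if (p == NULL)
        return load;
    if (fgets(line, sizeof(line), p) != NULL) {
        /* uptime output ends with "... load average: 0.15, 0.10, 0.05" */
        char *s = strstr(line, "load average");
        if (s != NULL) {
            s = strchr(s, ':');
            if (s != NULL)
                load = atof(s + 1);
        }
    }
    pclose(p);
    return load;
}

int main(void)
{
    const char *best = servers[0];
    double best_load = 1e9;
    size_t i;

    for (i = 0; i < NSERVERS; i++) {
        double l = load_of(servers[i]);
        if (l < best_load) {
            best_load = l;
            best = servers[i];
        }
    }
    fprintf(stderr, "routing login to %s (load %.2f)\n", best, best_load);
    execlp("rlogin", "rlogin", best, (char *)0);
    perror("rlogin");           /* only reached if the exec fails */
    return 1;
}

Installed as a user's shell in /etc/passwd on the Login Server, something
along these lines could give the invisible redirection Randy describes,
at the cost of one rsh round trip per server at login time.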
zs01+@andrew.cmu.edu (Zalman Stern) (11/17/90)
randy@ms.uky.edu (Randy Appleton) writes:
> I have been wondering how hard it would be to set up several
> of the new fast workstations as one big Mainframe. For instance,
> imagine some SPARCstations/DECstations set up in a row, and called
> compute servers. Each one could handle several users editing/compiling/
> debugging on glass TTY's, or maybe one user running X.

This has been done with mostly "off the shelf" technology at Carnegie
Mellon.  The "UNIX server" project consists of 10 VAXstation 3100's (CVAX
processors) with a reasonable amount of disk and memory.  These are
provided as a shared resource to the Andrew community (three thousand or
more users depending on who you talk to).  The only other UNIX systems
available to Andrew users are single login workstations.  (That is, there
is nothing resembling a UNIX mainframe in the system.)

> But how does each user, who is about to log in, know which machine to
> log into? He ought to log into the one with the lowest load average, yet
> without logging on he cannot determine which one that is.

For the UNIX servers, there is code in the local version of named (the
domain name server) which returns the IP address of the least loaded
server when asked to resolve the hostname "unix.andrew.cmu.edu".  (The
servers are named unix1 through unix10.)  I believe a special protocol
was developed for named to collect load statistics, but I'm not sure.  As
I recall, the protocol sends information about the CPU load, number of
users, and virtual memory statistics.  Note that all asynch and dialup
lines go through terminal concentrators (Annex boxes) onto the CMU
internet.

> [Stuff deleted.]
> ... Also, the login server could have
> all the disks, and the others would mount them, so that no matter what node
> you got (which ought to be invisible) you saw the same files.

Andrew uses Transarc's AFS 3.0 distributed filesystem product to provide
location transparent access to files from any workstation or UNIX server
in the system.  There are other problems which are solved via AFS
components as well.  For example, traditional user name/password lookup
mechanisms fail badly when given a database of 10,000 registered users.
AFS provides a mechanism called the White Pages for dealing with this.
(Under BSD UNIX, one can use dbm based passwd files instead.)

If you want more info on the UNIX server project, send me mail and I'll
put you in touch with the appropriate people.  Detailed statistics are
kept on the usage of these machines.  Using these numbers, one could
probably do some interesting cost/performance analysis.

> The advantage is that this setup should be able to deliver a fair
> number of MIPS to a large number of users at very low cost. Ten SPARC
> servers result in a 100 MIPS machine (give or take) and, at University
> pricing, are only about $30,000 (plus disks). Compare that to the price
> of a comparably sized IBM or DEC!
>
> So my question is: do you think this would work? How well do you
> think this would work? Do you think the network delays would be
> excessive?

Yes, what you describe could easily be done with ten SPARCstations and a
small amount of software support.  It is not clear that it is useful to
compare the performance of such a system to that of a mainframe, though.
It depends a lot on what the workload is like.  Also, there are other
points in the problem space.
Three interesting ones are tightly coupled multi-processors (SGI, Encore,
Pyramid, Sequent, Solbourne), larger UNIX server boxes (MIPS 3260s or
6280s, IBM RS/6000s, faster SPARCs, HP, etc.), and 386/486 architecture
commodity technology (PC clones, COMPAQ SystemPRO).  Certainly, DEC VAXen
and IBM 370s do not provide cost-effective UNIX cycles, but that is not
the market for that type of machine.  Intuition tells me that the best
solution is very dependent on your workload and the specific prices for
different systems.

Zalman Stern, MIPS Computer Systems, 928 E. Arques 1-03, Sunnyvale, CA 94086
zalman@mips.com OR {ames,decwrl,prls,pyramid}!mips!zalman
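The load-collecting protocol Zalman mentions feeding named is not spelled
out above, so the toy daemon below is only a guess at the general shape of
such a thing: each server answers any UDP datagram on an assumed port with
its one-minute load average as plain text, which a modified name server
(or the dispatcher sketched earlier) could compare across machines.  The
port number and reply format are invented for the example.

/*
 * loadd.c -- hypothetical load-reporting daemon, one per compute server.
 * Not the CMU protocol; just an illustration of per-host load reporting.
 */
#include <stdio.h>
#include <stdlib.h>          /* getloadavg() is a BSD extension; some
                              * systems declare it in another header. */
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define LOAD_PORT 8765       /* assumed port, chosen for the example */

int main(void)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr, peer;
    socklen_t peerlen;
    char buf[64], reply[64];
    double loads[1];

    if (s < 0) { perror("socket"); return 1; }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(LOAD_PORT);
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        return 1;
    }

    for (;;) {
        peerlen = sizeof(peer);
        /* block until any query datagram arrives */
        if (recvfrom(s, buf, sizeof(buf), 0,
                     (struct sockaddr *)&peer, &peerlen) < 0)
            continue;
        if (getloadavg(loads, 1) != 1)
            loads[0] = 99.0;  /* report "very busy" if sampling fails */
        snprintf(reply, sizeof(reply), "load %.2f\n", loads[0]);
        sendto(s, reply, strlen(reply), 0,
               (struct sockaddr *)&peer, peerlen);
    }
}

A collector would simply send one datagram to each server, read the
replies, and hand back the address of the host reporting the smallest
number; the real Andrew protocol reportedly also folds in user counts and
virtual memory statistics.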
alanw@ashtate (Alan Weiss) (12/15/90)
In article <16364@s.ms.uky.edu> randy@ms.uky.edu (Randy Appleton) writes:
>I have been wondering how hard it would be to set up several
>of the new fast workstations as one big Mainframe. For instance,
>imagine some SPARCstations/DECstations set up in a row, and called
>compute servers. Each one could handle several users editing/compiling/
>debugging on glass TTY's, or maybe one user running X.
>
>But how does each user, who is about to log in, know which machine to
>log into? He ought to log into the one with the lowest load average, yet
>without logging on he cannot determine which one that is.
.......

You are referring to Process Transparency (which actually can be
implemented at the task, process, or thread level).  The leaders in this
kind of work are Locus Computing Corp. in Inglewood and Freedomnet in
North Carolina.

LCC's product, the Locus Operating System, formed the basis for IBM
Corp.'s Transparent Computing Facility (TCF), which allowed for a
distributed, heterogeneous, transparent filesystem AND process system.
It was first implemented back in the early 1980's on VAXen, Series 1's,
and 370's.  The first commercial products, AIX/370 and AIX/PS/2 (386),
offer TCF.  (I used to work for Locus, but have no commercial interest
currently.)

For more information, contact IBM or Locus Computing - 213-670-6500, and
mention TCF.

As an aside, watching processes execute with or without user knowledge as
to execution location, as well as watching processes migrate to other
systems while executing, is Neat.  Wanna run that geological survey
program?  Let /*fast*/ find the quickest site!

.__________________________________________________________.
|-- Alan R. Weiss --     | These thoughts are yours for the|
|alanw@ashton            | taking, being generated by a    |
|alanw@ashtate.A-T.COM   | failed Turing Test program.     |
|!uunet!ashton!alanw     |---------------------------------|
|213-538-7584            | Anarchy works.  Look at Usenet! |
|________________________|_________________________________|