pjw@usna.navy.mil, , jw@math30, (Peter J. Welcher (math FACULTY)) (02/07/91)
I have a question, especially for the academic readers of the group. (And it may just be I'm missing the obvious, or re-inventing a wheel.)

The Naval Academy Math Dept has 28 Suns, mostly in faculty offices. We'd like our students to be able to run Mathematica and Matlab on them by logging in via PCs running Procomm, connected via Ethernet. We do that already for a few students, no sweat. I'm worried about handling lots of students, all trying to do homework the night before it is due.

The question is, is there any easy way to perform load-sharing, other than by randomly assigning sections or students to hosts?

What I think I'd like to do is tell students to log into a certain host (say math3) and then have them be rlogin-ed to another machine before the program (Mathematica, Matlab) is run. Is there any reason this is a bad idea? My thought is that the rlogin load will be relatively low, so going thru a common machine won't overload it too badly. (And it's a SPARC server, 32M memory, with unlimited user license.)

Writing a script that does something with rwho is a possibility, but there's all the net overhead of rwhod (28 machines). I do want something that completes within, say, 5 to 10 seconds, so rusers, rup and the like are no good.

I've written a C program that forks (to get around timeout delays) and then does rstat calls. It is called "loaddist". It kills the child processes that don't finish within a short time, and then prints the name of the least loaded host (with some other fudge factors thrown into the calculation, like Sun 3 vs. SPARC). My idea was to have "rlogin `loaddist`" done to the students when they log into the specified host, math3. Is this a good/bad idea?

An alternative would be to set "loaddist" up as a daemon, to reduce the number of forks and the amount of net traffic. The daemon would, say, fork a query to one host per second, so that all information would be refreshed every 30 seconds or so. The student script would use signals to get "loaddist" to emit a hostname.

Any comments or suggestions would be appreciated.
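For readers who want to experiment with the "loaddist" idea, here is a minimal sketch in Python rather than the C of the original program. It is not the poster's code: the function name `least_loaded`, the `probe` callback, and the `weights` table are all assumptions made for illustration. It uses threads instead of forked processes, and the caller supplies `probe` (for example, a wrapper around an rstat query), so nothing here talks to the network by itself.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def least_loaded(hosts, probe, weights=None, timeout=5.0):
    """Return the host with the lowest weighted load, or None if none answered.

    probe(host) is a caller-supplied (hypothetical) function returning a load
    figure; weights maps a host to a penalty factor (e.g. > 1 for a slower
    Sun 3), defaulting to 1.0.
    """
    weights = weights or {}
    # One worker per host so a single slow host cannot delay the others.
    pool = ThreadPoolExecutor(max_workers=max(1, len(hosts)))
    futures = {pool.submit(probe, host): host for host in hosts}
    done, _not_done = wait(futures, timeout=timeout)
    # Slow probes are abandoned, not killed: their answers are simply ignored.
    pool.shutdown(wait=False)

    scores = {}
    for fut in done:
        if fut.exception() is None:  # skip hosts whose probe failed
            host = futures[fut]
            scores[host] = fut.result() * weights.get(host, 1.0)
    return min(scores, key=scores.get) if scores else None
```

The login script on the front-end host would then use the returned name the way the post uses the output of "loaddist". Hosts that fail or miss the deadline are left out of the comparison, which matches the stated goal of finishing within a fixed number of seconds.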
pww@bnr.ca (Peter Whittaker) (02/07/91)
In article <25860@adm.brl.mil> pjw@usna.navy.mil, , jw@math30, (Peter J. Welcher (math FACULTY)) writes:

(after many deletions...)

>I have a question, especially for the academic readers of the group.
>(And it may just be I'm missing the obvious, or re-inventing a wheel.)
>
>Writing a script that does something with rwho is a possiblity, but there's all
>
>I've written a C program that forks (to get around timeout delays) and then

If you are going to force people to log in to a single front-end host, then why not write a program that keeps track of who has been assigned to each machine, then assigns each new user to the least-busy machine (e.g. using rlogin, or what have you)? When a user logs out of the assigned machine, strike them from the "assigned machine" table.

As long as your users are doing roughly the same amount of work (which should be true if they are all working on the same assignment), your machines will be more or less equally loaded.

It's not terribly elegant, but it would work. If you wanted to double-check the load on a machine before assigning a user to it, query it via UDP (if you are on a LAN, UDP should be fairly reliable). If it says it is too busy, query the next machine in your "machine assignment" table.

There are more elegant solutions, but this one should be quick to write and should work passably well.
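The bookkeeping described above can be sketched in a few lines of Python. The class and method names (`AssignmentTable`, `assign`, `release`) are invented for this illustration, and the sketch assumes one session per user and uses the count of assigned users as the only measure of "busy"; the optional UDP double-check is left out.

```python
class AssignmentTable:
    """Track which users the front-end host has sent to which machine."""

    def __init__(self, hosts):
        self.users_on = {host: set() for host in hosts}  # host -> assigned users
        self.host_of = {}                                # user -> host

    def assign(self, user):
        """Return the machine for this user, choosing the least-busy one."""
        if user in self.host_of:
            # Assumption: a user already in the table keeps the same machine.
            return self.host_of[user]
        # Ties go to the machine listed first.
        host = min(self.users_on, key=lambda h: len(self.users_on[h]))
        self.users_on[host].add(user)
        self.host_of[user] = host
        return host

    def release(self, user):
        """Strike the user from the table when they log out."""
        host = self.host_of.pop(user, None)
        if host is not None:
            self.users_on[host].discard(user)
```

In practice the table would have to live somewhere every login session on the front-end host can reach (a daemon or a locked file), which this in-memory sketch does not address.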
shelley@infonode.ingr.com (Shelley Wilmoth) (02/07/91)
In article <1991Feb6.173736.11922@bwdls61.bnr.ca> pww@bnr.ca (Peter Whittaker) writes:
>In article <25860@adm.brl.mil> pjw@usna.navy.mil, , jw@math30, (Peter J. Welcher (math FACULTY)) writes:
>
>(after many deletions...)
>
>>
>>I have a question, especially for the academic readers of the group.
>>(And it may just be I'm missing the obvious, or re-inventing a wheel.)
>>
>>Writing a script that does something with rwho is a possiblity, but there's all
>>
>>I've written a C program that forks (to get around timeout delays) and then
>
>If you are going to force people to login to a single front-end host, then
>why not write an program that keeps track of who has been assigned to each
>machine, then assigns each new user to the least-busy machine? (i.e. using
>rlogin, or what have you). When a user logs out of the assigned machine,
>strike them from the "assigned machine" table.
>
>As long as your users are doing roughly the same amount of work (which should
>be true if they are all working on the same assignment) your machines will
>be more or less equally loaded.
>
>It's not terribly elegant, but it would work. If you wanted to double check
>the load on a machine before assigning a user to it, query it via UDP (if you
>are on a LAN, UDP should be fairly reliable). If it says it is too busy,
>query the next machine in your "machine assignment" table.
>
>There are more elegant solutions, but this one should be quick to write and
>should work passably well.

If users will not always be working on the same system each time they log in, and if they will be saving their work in files, you will want to be sure they have access to those files no matter which machine they log on to.

For example, user Jones logs on and is assigned to System A, does some work on the assignment, saves it to a file system local to System A, then logs off and goes to class. Later, he logs in again to complete his work, but is assigned to System B because it has the lesser load at the time. He will then need some way to access the files he stored on System A.
fwp1@CC.MsState.Edu (Frank Peters) (02/07/91)
On 6 Feb 91 16:22:07 GMT, pjw@usna.navy.mil, , jw@math30, (Peter J. Welcher (math FACULTY)) said:

pjw> The question is, is there any easy way to perform load-sharing, other than by
pjw> randomly assigning sections or students to hosts ?

I once toyed with an idea to do something like this using DNS but never implemented it.

Basically the idea was to define a new record type in my local DNS tables called PROG that would run the given program and return the result in an A record to the calling program.

For instance, suppose I had a bunch of Suns that were effectively identical as far as Mathematica is concerned. I might define the following in my DNS:

    $ORIGIN wherever
    MathSuns    PROG    /usr/local/adm/leastload
                MX      10 My.Mail.Hub.Here.Edu.

Any A record request for MathSuns would then run the program and return the resulting IP address. By passing the hostname to be resolved as an argument, I could use the same program to manage several pools.

I think this idea has the following advantages:

1. I'd be willing to bet that the necessary modifications to bind would be relatively trivial.

2. Since all that ever gets returned is an A record, no modifications are required to the worldwide DNS system or to individual resolver clients. And no front-end host beyond the nameserver would need to be involved...none of this 'telnet to machine A and let it decide where you should go' stuff.

3. The actual load program can be upgraded/replaced/modified with no changes to the bind code. I can make leastload return a random host as a first pass, then the host with the fewest users later, then the least loaded CPU, and so on for finer levels of balance. The two tasks (picking a destination and returning it to the user) are isolated. I always did like modularity.

Any comments on this idea? Any reason why it would be especially difficult/impractical? Anyone who has actually done this?? :-)

FWP
--
Frank Peters   Internet: fwp1@CC.MsState.Edu   Bitnet: FWP1@MsState
Phone: (601)325-2942   FAX: (601)325-8921
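To make the division of labour in that proposal concrete, here is a rough Python sketch of what a first-pass "leastload" helper might look like. The PROG record type is the poster's unimplemented idea, not something bind provides, so the calling convention assumed here (pool name as the only argument, one address printed on standard output) is a guess. The pool table and its addresses are made up; they come from the 192.0.2.0/24 documentation range.

```python
import random
import sys

# Hypothetical pool table: pool name -> member addresses (documentation range).
POOLS = {
    "MathSuns": ["192.0.2.11", "192.0.2.12", "192.0.2.13"],
}


def pick_address(pool_name, pools=POOLS, rng=random):
    """First-pass policy from the post: return a random member of the pool."""
    return rng.choice(pools[pool_name])


def main(argv):
    """Print one address for the pool named in argv[1]; return an exit status."""
    if len(argv) != 2 or argv[1] not in POOLS:
        print("usage: leastload POOLNAME", file=sys.stderr)
        return 1
    print(pick_address(argv[1]))
    return 0
```

A wrapper script would call `sys.exit(main(sys.argv))`. Swapping `pick_address` for a fewest-users or least-loaded-CPU policy later would not change what the caller sees, which is the modularity point made in item 3.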
rbj@uunet.UU.NET (Root Boy Jim) (02/20/91)
In article <25860@adm.brl.mil> pjw@usna.navy.mil, , jw@math30, (Peter J. Welcher (math FACULTY)) writes:
>
>The question is, is there any easy way to perform load-sharing, other than by
>randomly assigning sections or students to hosts ?

Someone (I believe it is Apollo) has introduced the concept of a "broker", to complement the concepts of "clients" and "servers". Brokers locate the latter for the former when location is immaterial.

>I've written a C program that forks (to get around timeout delays) and then
>does rstat calls. It is called "loaddist".

So far, so good.

>It kills processes that don't finish within a short time, and

Probably a bad idea, unless you have lots of runaway processes.

>then prints the name of the least loaded host (with some other fudge factors
>thrown into the calculation, like Sun 3 vs. SPARC).
>My idea was to have "rlogin `loaddist`" done to the students when they
>log into the specified host, math3. Is this a good/bad idea ?
>
>An alternative would be to set "loaddist" up as a daemon

Sounds like rwho, now doesn't it?

>Any comments or suggestions would be appreciated.

Do you have NFS? Devote a directory on a common filesystem to load status monitoring. Some people have fixed rwho so that it merely writes info to a file in the rwho directory. Thus, the broadcast rwho traffic turns into NFS traffic all destined for wherever the real directory resides.
--
[rbj@uunet 1] stty sane
unknown mode: sane
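The shared-directory suggestion can be sketched as follows, assuming each host periodically writes its own load figure into a file named after itself in a directory that every machine mounts. The function names, the one-number file format, and the 60-second staleness cutoff are assumptions for this example, not a description of the modified rwho mentioned above.

```python
import os
import time


def publish_load(status_dir, hostname, load):
    """Write this host's load figure to status_dir/hostname."""
    tmp = os.path.join(status_dir, "." + hostname + ".tmp")
    with open(tmp, "w") as f:
        f.write(f"{load}\n")
    # Rename into place so readers never see a half-written file.
    os.replace(tmp, os.path.join(status_dir, hostname))


def pick_from_status_dir(status_dir, max_age=60.0, now=None):
    """Return the least-loaded host with a fresh status file, or None."""
    now = time.time() if now is None else now
    best = None
    for name in os.listdir(status_dir):
        if name.startswith("."):  # skip temporary files
            continue
        path = os.path.join(status_dir, name)
        try:
            if now - os.path.getmtime(path) > max_age:
                continue  # host has stopped reporting; treat it as down
            with open(path) as f:
                load = float(f.read())
        except (OSError, ValueError):
            continue  # unreadable or malformed file
        if best is None or load < best[0]:
            best = (load, name)
    return best[1] if best else None
```

Each host would call `publish_load` from a periodic job (with a real load figure, e.g. from `os.getloadavg()`), and the login script on the front-end host would call `pick_from_status_dir`. The lookup is a handful of local-looking file reads, so it should fit comfortably inside the 5 to 10 second budget from the original question.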