[comp.unix.questions] Suggestions needed for maintaining a cluster of machines

jmn@power.berkeley.edu (J. Mark Noworolski) (11/16/90)

(For those of you who saw this under a different title, and as a follow-up:
I've cancelled those two articles. Sorry.)

Our research group will soon be getting about 4-6 DECstations. The plan is to
use redundancy as backup, since each machine will have a big disk but there
will be no tape drives.

I am the one who will be maintaining them. My question is, what is the best
way to set them up? Should I have one fileserver? 

In particular this is what I have been thinking:
1. Avoid using one machine as a fileserver: it will be unfairly loaded, and
if it breaks then our whole cluster will be toast. The advantage of a single
fileserver is that administration would be headache free (until the
inevitable disk crash).

2. Set up individual accounts on different machines. So let's say I have
four users (a,b,c,d) and four machines (1,2,3,4). Then I put user a's home
directory on machine 1, user b's on machine 2, etc. Now I NFS mount the user
directories on each machine, so that if user a logs in to machine 2, 3, or 4
he will see his home directory from machine 1 via NFS. Any comments on this
approach? In our configuration it will _usually_ be fairly easy to predict
who will be logging in to which machine, so I think this stands a chance of
working. What scares me about this approach is maintenance: I think it will
be a real pain to add users on each machine individually.
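
To make the layout in point 2 concrete, here is a minimal Python sketch that
derives, from one central table, which home directories a given machine has
to NFS mount. The hostnames (m1..m4), the /home/<user> path layout, and the
"server:path mountpoint" output format are all assumptions made up for
illustration; the real fstab syntax on your system may well differ.

```python
# Hypothetical table: which machine holds each user's home directory.
HOMES = {"a": "m1", "b": "m2", "c": "m3", "d": "m4"}


def mount_entries(homes, host):
    """Return 'server:/remote/path /local/mountpoint' strings for every
    home directory that does NOT live on `host` itself."""
    entries = []
    for user, server in sorted(homes.items()):
        if server == host:
            continue  # this home is on the local disk, nothing to mount
        path = "/home/" + user  # assumed path layout, same on all machines
        entries.append(f"{server}:{path} {path}")
    return entries


# What machine m2 would need to mount:
for entry in mount_entries(HOMES, "m2"):
    print(entry)
```

Keeping a single table like this and generating the per-machine mount lists
from it might take some of the sting out of adding users, since only one
file would be edited by hand.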

3. Backups. There are two problems here: one is to back up the operating
system and the other is to back up the user directories. We do not plan to
use a tape drive (since one was not purchased), but redundancy instead. Any
comments?

a) Operating system. Perhaps the best idea is to have all updates done on
one machine and have them propagate to the other machines overnight via
cron and timestamp comparisons. Has anybody done this? I can see a problem
trying to figure out which files to propagate and which ones not to (log
files, news spools, etc.).
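
The timestamp comparison in (a) could be sketched roughly as follows, in
Python. This is only an illustration under stated assumptions: the exclude
patterns are hypothetical placeholders for "log files, news spools, etc.",
and `since` is assumed to be the time of the previous propagation run in
seconds since the epoch. It only picks the files; copying them to the other
machines is left out.

```python
import fnmatch
import os

# Hypothetical exclude list; paths are relative to the tree being scanned.
EXCLUDE = ["*.log", "var/adm/*", "var/spool/*"]


def files_to_propagate(root, since, exclude=EXCLUDE):
    """Return sorted relative paths under `root` that were modified after
    `since` and match none of the exclude patterns."""
    picked = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            # fnmatch's '*' also matches '/', so 'var/spool/*' covers
            # everything below var/spool.
            if any(fnmatch.fnmatch(rel, pat) for pat in exclude):
                continue
            # lstat so a dangling symlink does not raise.
            if os.lstat(full).st_mtime > since:
                picked.append(rel)
    return sorted(picked)
```

A nightly cron job on the master machine could call
files_to_propagate("/", last_run_time) and hand the resulting list to
whatever copy mechanism is in use. Note that this catches new and changed
files only; files deleted on the master would need separate handling.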

b) User accounts. This one I don't think is a problem: just use cron to tar
each machine's home directories and copy the archive via NFS to another
machine in a circular fashion.
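
A rough Python sketch of the circular scheme in (b): each machine's backup
goes to the next machine in the list, and the last wraps around to the
first. The hostnames are hypothetical, and `dest_dir` is assumed to be a
directory on the neighbour's disk that is already NFS mounted locally.

```python
import os
import tarfile


def ring_targets(hosts):
    """Map each host to the next host in the ring, which stores its backup."""
    if len(hosts) < 2:
        raise ValueError("need at least two hosts for redundancy")
    return {h: hosts[(i + 1) % len(hosts)] for i, h in enumerate(hosts)}


def archive_home(home_dir, dest_dir):
    """Write an uncompressed tar of `home_dir` into `dest_dir` and return
    the archive's path."""
    base = os.path.basename(os.path.normpath(home_dir))
    dest = os.path.join(dest_dir, base + ".tar")
    with tarfile.open(dest, "w") as tar:
        # arcname keeps paths in the archive relative (base/...).
        tar.add(home_dir, arcname=base)
    return dest


# Hypothetical four-machine cluster: m1 -> m2 -> m3 -> m4 -> m1.
print(ring_targets(["m1", "m2", "m3", "m4"]))
```

With a ring like this every machine holds exactly one other machine's
backup, so losing any single disk still leaves a copy of its users' files,
and adding a machine only means inserting it into the list.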

4. Adding other machines to the cluster, if any are ever purchased, may be
a bit of a pain.


I would really appreciate any comments about these ideas. I would love to
hear from anybody who has done something similar. Gee, I would really like
to get some examples of administration software for this approach also.

Thanks in advance,
mark
--
"There's a really fine line between clever and stupid"
				Nigel- Lead Guitar (Spinal Tap)
jmn@united.berkeley.edu, or jmn@power.berkeley.edu