[comp.sys.dec] Academic workstations -- Followups to comp.unix.questions ONLY

cline@sun.soe.clarkson.edu (Marshall Cline) (06/10/89)

To avoid confusion, followups to this are requested to ONLY go to one
newsgroup.  I suggest comp.unix.questions.

In article <507@lclark.UUCP> cullum@lclark.UUCP (Mike Cullum) writes:

>We are in the process of considering the purchase of workstations for
>a small lab in our Computer Science Department.  Our proposed 
>configuration calls for 8 workstations (8Mb RAM, 200+Mb disk, large
>monochrome display) and a server.  
>...
>Any advice?

Clarkson University has quite a number of workstations, so I guess I
have enough experience to answer.  However, (almost) all of ours are
Suns, so I can't compare vendors.  What I can do is _STRONGLY_ recommend
one feature in particular:

We have a SINGLE disk server in our School of Engineering, all other
workstations being diskless (thin wire 10Mb/s Ethernet), being
connected via Sun's NFS.  There are probably 20 or more "clients"
running off this one server.  Although we're pushing the performance
of the disk server, the concept of a single disk server is the BEST
THING SINCE SLICED BREAD.

The problem can be illustrated with our micro-computers (5000 or so AT
class machines on the campus, well over 1000 with hard disks).
Consider a student "Joe".  Joe's files are on a particular machine.
If that machine is busy today, he has to copy his files onto whatever
machine he happens to get.  Thus he duplicates all his files on all the
machines he might be working on.  Then there's the "which is the
latest version?" question.  The end result is that our students have
to floppy-jockey every day.

Having a central location for files (the disk server) means that each
workstation that you log onto acts like it has your files.  No two
versions, etc.

Thus your plan for each workstation to have a 200+Mb disk is one which
you may want to reconsider.

There's a binary-compatibility problem with the NFS scheme, but we have
almost all Sun-3's (68020, 68881).  When we go to SparcStations (Sun-4's),
we'll have to address the multiple /bin directories, etc.
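
One way the "multiple /bin directories" problem could be handled is to keep
one binary directory per CPU architecture on the server and have each client
put the matching one on its search path.  The following is only a sketch of
that idea: the directory names and the architecture labels "sun3" and "sun4"
are hypothetical, not taken from any actual Clarkson setup.

```python
import os

# Hypothetical layout: one bin directory per architecture on the shared server.
ARCH_BIN_DIRS = {
    "sun3": "/usr/local/bin.sun3",   # 68020 binaries (assumed path)
    "sun4": "/usr/local/bin.sun4",   # SPARC binaries (assumed path)
}

def bin_dir_for(arch, table=ARCH_BIN_DIRS):
    """Return the bin directory holding binaries built for this architecture."""
    try:
        return table[arch]
    except KeyError:
        raise ValueError("no binaries built for architecture %r" % arch)

def path_with_arch_bin(arch, path):
    """Prepend the architecture-specific bin directory to a search path."""
    return bin_dir_for(arch) + os.pathsep + path

if __name__ == "__main__":
    print(path_with_arch_bin("sun3", "/bin"))
```

The home directories stay shared across all clients; only the executables
differ by architecture.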

Hope this helps.
Marshall
--
	________________________________________________________________
	Marshall P. Cline	ARPA:	cline@sun.soe.clarkson.edu
	ECE Department		UseNet:	uunet!sun.soe.clarkson.edu!cline
	Clarkson University	BitNet:	BH0W@CLUTX
	Potsdam, NY  13676	AT&T:	(315) 268-6591

rpd@apple.com (Rick Daley) (06/10/89)

In article <CLINE.89Jun9165618@sun.soe.clarkson.edu> 
cline@sun.soe.clarkson.edu (Marshall Cline) writes:
> We have a SINGLE disk server in our School of Engineering, all other
> workstations being diskless (thin wire 10Mb/s Ethernet), being
> connected via Sun's NFS.  There are probably 20 or more "clients"
> running off this one server.  Although we're pushing the performance
> of the disk server, the concept of a single disk server is the BEST
> THING SINCE SLICED BREAD.
> 
> The problem can be illustrated with our micro-computers (5000 or so AT
> class machines on the campus, well over 1000 with hard disks).
> Consider a student "Joe".  Joe's files are on a particular machine.
> If that machine is busy today, he has to copy his files onto whatever
> machine he happens to get.  Thus he duplicates all his files on all the
> machines he might be working on.  Then there's the "which is the
> latest version?" question.  The end result is that our students have
> to floppy-jockey every day.

The original question had to do with which UNIX workstation to buy for a 
student lab.  I'm obviously biased about that, but I do have a comment
about Marshall's push for a diskless environment.  His argument is that
this keeps students from having to use floppies to move
their files between machines in the lab.  Well, this is indeed important, 
but it has nothing to do with whether the machines should be diskless.  
All this means is that students' files should be stored on an NFS file 
server.  The machines could still have local disks which are used for the 
OS and for paging.  This should give you better performance because local 
disks should be faster than networks, but it also adds to the cost and 
administration effort.

                                                Rick Daley
                                                rpd@Apple.COM

bzs@bu-cs.BU.EDU (Barry Shein) (06/10/89)

>This should give you better performance because local 
>disks should be faster than networks, but it also adds to the cost and 
>administration effort.
>
>                                                Rick Daley
>                                                rpd@Apple.COM

Bad guess, go measure it.  Because servers almost always have faster
disks and controllers and bigger disk buffers, remote disks are usually
faster than local disks (assuming a reasonable network loading, which
doesn't have to be zero).

An ethernet can deliver data at almost 1MB per second, go look at the
specs on your standard 27msec SCSI cheapo, 20KB/sec is not unusual for
maximum disk transfer rate, about 1/40th the speed of an ethernet.
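
As a rough check on that arithmetic, here is a sketch that uses only the two
rates quoted above.  Reading "almost 1MB" as a full 1000KB/sec gives a ratio
nearer 50 than 40; the 800KB/sec figure is an assumed value, chosen only to
show what sustained rate the "1/40th" would correspond to.

```python
# Both rates come from the post above; they are not measurements.
ETHERNET_KB_PER_SEC = 1000.0   # "almost 1MB per second", taken as an upper bound
DISK_KB_PER_SEC = 20.0         # "20KB/sec" for the cheap SCSI disk

def speed_ratio(network_kb_s, disk_kb_s):
    """How many times faster the network delivers data than the disk."""
    return network_kb_s / disk_kb_s

if __name__ == "__main__":
    print(speed_ratio(ETHERNET_KB_PER_SEC, DISK_KB_PER_SEC))  # 50.0
    print(speed_ratio(800.0, DISK_KB_PER_SEC))                # 40.0 (assumed rate)
```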

In some cases remote disks are much faster, particularly where the
server CPU is much faster (and the disk system) and your process is
causing some amount of parallelism to occur (this doesn't have to be
purposeful, something simple like a find with a grep on each file can
end up exploiting both CPUs as one resolves the file system as the
other pumps away at the raw data.) Remember all those gripes about the
overhead of namei()? Where do you think namei() is running in an NFS
environment?

Many network load problems are due to badly configured or managed
networks with lots of junk traffic (eg. ARP or other broadcast
screamers going unchecked.)

However, I will agree that blaming it on the diskless workstations is
a wonderful alibi, the yokels believe you and rarely ask you to
actually do your job and find out what's really causing the problem.

It's the diskless workstations, it's the diskless workstations (we
know those diskless workstation users will never buy the local disks
you recommend so it's a safe bet to blame it on them.)

Another problem is political, the enforced memory shortage has
temporarily made the disk/memory balance unnatural. But don't confuse
economic realities with technical ones.

I agree none of this would apply to a Mac-II acting as an NFS server,
it doesn't support the disk architecture necessary to get any
performance advantage over a local disk.

I am not saying there aren't cases where a diskful workstation is far
better, I'm just saying most people don't know what they're talking
about or have motives other than understanding the technology.
-- 
	-Barry Shein

Software Tool & Die, Purveyors to the Trade
1330 Beacon Street, Brookline, MA 02146, (617) 739-0202

grunwald@flute.cs.uiuc.edu (Dirk Grunwald) (06/11/89)

>> Bad guess, go measure it, because servers almost always have faster
>> disks, controllers and bigger disk buffers remote disks are usually

This isn't really true, you know, unless you always buy the most up to
date controller. Smaller disks get better faster, and your incremental
cost is much much lower.

E.g., we have two CDC Sabre drives, 741 controller serving a 3/260. Total
cost at the time (university discounts, etc), about $27,000 for everything,
and we get about 650Mb of storage plus a fast central server if you need
compute-bound jobs (although extensive use of the server slows down everyone
else). The disk has about 1.5Mb/second transfer and 16ms seek.

You can buy SCSI CDC Wren-V's that have 1Mb/second transfer rates & 16ms seek
for about $2500. Pick up 10 of those for your 10 clients that you can
serve off the 3/260 & you get 6000Mb of storage, and you still have enough
left over to buy a 2Gb tape backup system.
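
The sums behind that comparison can be sketched as follows.  All dollar and
storage figures are the ones quoted above; treating the $27,000 as the budget
to be matched is an assumption, and the per-drive capacity is inferred from
6000Mb across 10 drives rather than stated.

```python
# Figures quoted in the post; nothing here is measured or priced independently.
CENTRAL_COST_USD = 27000     # "about $27,000 for everything"
CENTRAL_STORAGE_MB = 650     # "about 650Mb of storage"
WREN_V_COST_USD = 2500       # "about $2500" per SCSI Wren-V
CLIENTS = 10
LOCAL_TOTAL_MB = 6000        # "6000Mb of storage" across the 10 drives

def compare(budget, disk_cost, clients, local_total_mb, central_mb):
    """Summarise the local-disk plan against the central-server budget."""
    spent = disk_cost * clients
    return {
        "spent_usd": spent,
        "left_over_usd": budget - spent,
        "mb_per_drive": local_total_mb / clients,
        "storage_ratio": local_total_mb / central_mb,
    }

if __name__ == "__main__":
    result = compare(CENTRAL_COST_USD, WREN_V_COST_USD, CLIENTS,
                     LOCAL_TOTAL_MB, CENTRAL_STORAGE_MB)
    for name, value in result.items():
        print(name, value)
```

On these numbers the ten drives cost $25,000, leaving $2,000 of the budget,
and give a little over nine times the central server's storage.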

You get system redundancy (i.e. if a disk croaks, make that station be
a client of another), higher aggregate throughput and 10x the storage.
You don't have the central server, but hey, that can be an advantage
since you can now mix & match your configuration (i.e. you want 10 stations,
now you can have 5 Sun/i386 stations & 5 DEC-3100's).

--
Dirk Grunwald -- Univ. of Illinois 		  (grunwald@flute.cs.uiuc.edu)

bzs@bu-cs.BU.EDU (Barry Shein) (06/11/89)

Yes, Dirk, you're right, a PDP-1 with drum memory is probably not a
cost-effective NFS file server these days. Thanks for pointing out
this oversight in my message.
-- 
	-Barry Shein

Software Tool & Die, Purveyors to the Trade
1330 Beacon Street, Brookline, MA 02146, (617) 739-0202

grunwald@flute.cs.uiuc.edu (Dirk Grunwald) (06/11/89)

oooo..cheeky cheeky

The 3/260 is less than two years old.

At that time:
	+ Wren IV had same bandwidth as Wren V, less density.
	+ 741 was best controller available on the market, other
	  than Rimfire, which didn't work with 4.0

Hardly a PDP/1.

The point of the message was that disk aggregate bandwidth increases faster
for smaller disks; central file servers devote high $$ resources to
serving clients. The ability to incrementally upgrade your system decreases
(witness that we still have that 741) and your total performance is poor.

For example, a 3/60 with a local CDC Wren-V runs small latex jobs about
10% to 20% faster than a 3/60 serviced by an idle file-server on an
idle network. And it'll do it for less money.


--
Dirk Grunwald -- Univ. of Illinois 		  (grunwald@flute.cs.uiuc.edu)

heberlei@iris.ucdavis.edu (Todd) (06/12/89)

In article <32705@bu-cs.BU.EDU> bzs@bu-cs.BU.EDU (Barry Shein) writes:
[stuff deleted]
>However, I will agree that blaming it on the diskless workstations is
>a wonderful alibi, the yokels believe you and rarely ask you to
>actually do your job and find out what's really causing the problem.
>
>It's the diskless workstations, it's the diskless workstations (we
>know those diskless workstation users will never buy the local disks
>you recommend so it's a safe bet to blame it on them.)
>
[stuff deleted]
>-- 
>	-Barry Shein
>
>Software Tool & Die, Purveyors to the Trade
>1330 Beacon Street, Brookline, MA 02146, (617) 739-0202

I have been doing traffic analysis, and it is our diskless
workstations which are pulling down our network.  Our network
throughput is starting to drop because of the high collision rate,
BUT...

I work in a research environment (and the two other places that I
have checked were research environments), so we have lots and lots of
workstations with lots of users pushing the workstations pretty hard.
If you have only a small isolated lab (or one connected to a larger
network by a bridge (not simply a repeater)), diskless stations may be
a better buy.

(The poor poster who started this thread probably has given up on
getting his original question answered)

Todd Heberlein

markley@celece.ucsd.edu (Mike Markley) (06/12/89)

In article <507@lclark.UUCP> cullum@lclark.UUCP (Mike Cullum) writes:
>
>>We are in the process of considering the purchase of workstations for
>>a small lab in our Computer Science Department.  Our proposed 
>>configuration calls for 8 workstations (8Mb RAM, 200+Mb disk, large
>>monochrome display) and a server.  
>>...
>>Any advice?
>

It somewhat depends upon what your overall objective is but
here is my $0.02.  If you want a distributed file system where
you have access to all of your files from any workstation
without having to copy them to the local disk buy Apollo
workstations.  The Apollo file system is IMHO orders of
magnitude easier to administer than NFS.  The Apollo registry
is also easier to set up and administer than Yellow Pages from
SUN.  In SUN's favor is cost and amount of inexpensive software
available.  This is becoming less of a factor now that SR10
from Apollo is available.  Just configure your machine as a
BSD4.3 machine and the compatibility questions dim.  If you
need to develop graphics software the SUN is easier to program
unless you go with X-Windows and then the systems are the same.
The X11R3 software is much more stable on the SUNs than it is
on the Apollos.  If you are looking for pure integer
performance, then SUN is much more cost effective.  The
SparcStations are very fast in the integer environment.  For
floating point I would say consider buying a fast server and a
bunch of cheap diskless workstations.  You can put 2.8Gigabytes
on an Apollo DN10000 and then run all of your diskless stations
off of this.  It is several times faster than the SparcStations
for floating-point calculations.

My recommendation overall is to go with Apollo.  Their
workstations are IMHO much easier to take care of and their
network is much easier to grow.  A single command adds new
workstations, and the workstations' file system(s), to the network.

Mike Markley
University of California, San Diego
markley@celece.ucsd.edu
markley@kubrick.ucsd.edu

darcy@tci.UUCP (Jeff d'Arcy) (06/13/89)

In article <32705@bu-cs.BU.EDU> bzs@bu-cs.BU.EDU (Barry Shein) writes:
>
>>This should give you better performance because local 
>>disks should be faster than networks, but it also adds to the cost and 
>>administration effort.
>>                                                Rick Daley
>>                                                rpd@Apple.COM
>
>Bad guess, go measure it, because servers almost always have faster
>disks, controllers and bigger disk buffers remote disks are usually
>faster than local disks (assuming a reasonable network loading which
>doesn't have to be zero.)

Certainly, servers (at least those configured by sane administrators)
are likely to have faster, bigger disks etc. than would be feasible
for individual workstations.  However, latency is likely to suffer due
to the overhead associated with protocol encapsulation etc. especially
in heterogeneous environments.  If the application is doing something
simple such as a huge block transfer the performance hit won't be that
bad, but more complex operations involving random disk accesses will
hurt.

>An ethernet can deliver data at almost 1MB per second, go look at the
>specs on your standard 27msec SCSI cheapo, 20KB/sec is not unusual for
>maximum disk transfer rate, about 1/40th the speed of an ethernet.

At risk of repeating myself, this observation only applies to the simple
case where transfer speed is the limiting factor.  Unfortunately, latency
is frequently more important and is the first thing shot to h*ll in a
network environment.
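
To make the distinction concrete, here is a minimal sketch of the usual
"per-request latency plus size over bandwidth" model.  Every number in it is
illustrative only: the 16ms and 1,000,000 bytes/sec figures are loosely taken
from disk specs mentioned elsewhere in this thread, and the extra 5ms per
request for the network path is purely an assumption.

```python
def transfer_time_ms(n_requests, bytes_per_request, latency_ms, bandwidth_bytes_per_s):
    """Total time when each request pays a fixed latency plus its transfer time."""
    per_request_ms = latency_ms + 1000.0 * bytes_per_request / bandwidth_bytes_per_s
    return n_requests * per_request_ms

if __name__ == "__main__":
    BANDWIDTH = 1000000.0        # bytes/sec, assumed equal for both paths
    LOCAL_LATENCY_MS = 16.0      # assumed local disk latency
    REMOTE_LATENCY_MS = 21.0     # same disk plus an assumed 5ms of protocol overhead

    # One large block transfer: the extra latency is paid once and hardly shows.
    print(transfer_time_ms(1, 1000000, LOCAL_LATENCY_MS, BANDWIDTH))    # 1016.0
    print(transfer_time_ms(1, 1000000, REMOTE_LATENCY_MS, BANDWIDTH))   # 1021.0

    # The same data as 1000 small random accesses: latency is paid every time.
    print(transfer_time_ms(1000, 1000, LOCAL_LATENCY_MS, BANDWIDTH))    # 17000.0
    print(transfer_time_ms(1000, 1000, REMOTE_LATENCY_MS, BANDWIDTH))   # 22000.0
```

With these assumed values the bulk transfer is about half a percent slower
over the network, while the random-access case is nearly 30% slower, even
though the raw transfer rate is identical on both paths.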

>[very good points about parallelism and network administration
> deleted to save network bandwidth]

>However, I will agree that blaming it on the diskless workstations is
>a wonderful alibi, the yokels believe you and rarely ask you to
>actually do your job and find out what's really causing the problem.

>It's the diskless workstations, it's the diskless workstations (we
>know those diskless workstation users will never buy the local disks
>you recommend so it's a safe bet to blame it on them.)

>[attempted disclaimers deleted]

>I am not saying there aren't cases where a diskful workstation is far
>better, I'm just saying most people don't know what they're talking
>about or have motives other than understanding the technology.

"...don't know what they're talking about..."?  Disagreement is not
a sure sign of one party's ignorance, Barry.  I'm not saying that
local disks are the one and only way to go, but they are superior for
a wide range of applications.  I make my living in this field and I
am probably not alone in being offended by your implication that those
who disagree with you on this point are either foolish, lazy or
dishonest.