[comp.protocols.tcp-ip] Ethernet Suffering

jch@omnigate.clarkson.EDU.UUCP (04/30/87)

Tead Mean <mead@tut.cc.rochester.edu> writes:
>
>	There seemed to be a consensus that having diskless workstations and
>file servers on a network would cause havoc to an Ethernet.

I'd like to solicit comments on a configuration that our School of Engineering
has proposed:

They would like to purchase an Alliant super-mini-computer, a Sun 3/160 server
and 12 diskless 3/50s, 8 Opus Clipper systems (a PC/AT with a 32032 processor
board running a System V port (I believe)), and 83 IBM PC/AT clones running
Sun's PC/NFS.  The Opus systems are supposed to be disk servers for the
PC/NFS systems, where most of the computing is supposed to take place.  Most
of the PC/ATs will not have any hard disk; they will rely fully on the Opus
systems for disk storage.

All this equipment, in 4 buildings, will be linked with 3 fiber repeaters, 
making one large ethernet.

Our limited experience shows that one or two diskless 3/50s doing
disk-intensive work (compiling programs or copying disk files around)
significantly affect the performance both of other diskless 3/50s and of PCs
on the same net that do not use the file server at all (e.g. PCs running
DECnet-DOS to a VMS system).

(The Imperial ;-)) We in the computing center would like to see some
partitioning of the ethernet into departmental segments connected to a
School of Engineering backbone with at least level II bridges.  In our
minds this would localize traffic to some degree, isolate potential
physical problems (shorted or broken cable, accidental or malicious)
and provide some measure of security.  It would not, however, address
"Chernobyl"-effect problems (a runaway broadcast storm melting down the
whole net).

Does anyone have experience with a similar configuration of diskless
workstations and/or PCs that they can comment on?

Thanks

Jeff

root@TOPAZ.RUTGERS.EDU.UUCP (05/01/87)

We use diskless Suns extensively.  We have around 40 diskless machines
on one Ethernet.  There is evidence that this causes more load than
one would prefer.  On the other hand, it is also not a disaster.  I'm
sceptical of 2 machines causing serious problems.  We'd rather keep it
to about 25.  Because of the critical dependence of Suns on their
Ethernets, and the weird things that some TCP/IP implementations do to
the Ethernet, we keep diskless Suns on separate Ethernets dedicated to
just Suns.  We use a real IP gateway between the Ethernets.  Level 2
bridges would certainly help with the load, but would not necessarily
provide isolation from weird packets.  Whether this is a problem
depends upon how confident you are in your TCP/IP implementations.

mike@BRL.ARPA (Mike Muuss) (05/02/87)

Two Sun-3/50 processors blasting to each other with a TCP connection
can achieve ~2-3 Mbits/sec user-to-user throughput (tested with the TTCP
program), and seem to use about 25% of the ethernet bandwidth as
monitored on another Sun-3/50, which has unknown (to me) measurement
accuracy.  In our experience this has had no noticeable impact on other
users of the Ethernet.  Adding a second pair of Sun-3/50s running the
same test doubled the loading on the Ethernet, as you would expect.
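
For anyone who wants to reproduce a rough version of this measurement, here is
a minimal memory-to-memory TCP blaster along the same lines.  It is not the
real TTCP source; the buffer size, count, and arguments are just illustrative
choices.  It connects to whatever TCP sink you point it at, writes a pile of
buffers, and reports Mbits/sec as seen by the sender.

/*
 * Minimal memory-to-memory TCP throughput sketch (NOT the real TTCP;
 * just an illustration).  Connects to a TCP sink and writes NBUFS
 * buffers of BUFLEN bytes, then reports Mbits/sec as seen by the
 * sender.  Buffer size and count are arbitrary.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define BUFLEN 8192                     /* bytes per write()        */
#define NBUFS  2048                     /* number of writes (16 MB) */

int main(int argc, char **argv)
{
    char buf[BUFLEN];
    struct sockaddr_in sin;
    struct timeval t0, t1;
    double secs;
    long total = 0;
    int s, i, n;

    if (argc != 3) {
        fprintf(stderr, "usage: %s host-ip-address port\n", argv[0]);
        exit(1);
    }
    memset(buf, 'x', sizeof(buf));
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = inet_addr(argv[1]);
    sin.sin_port = htons((unsigned short)atoi(argv[2]));

    if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        perror("socket"); exit(1);
    }
    if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
        perror("connect"); exit(1);
    }

    gettimeofday(&t0, NULL);
    for (i = 0; i < NBUFS; i++) {
        if ((n = write(s, buf, sizeof(buf))) < 0) {
            perror("write"); exit(1);
        }
        total += n;                     /* count what was actually queued */
    }
    gettimeofday(&t1, NULL);
    close(s);

    secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    printf("%ld bytes in %.2f sec = %.2f Mbits/sec (sender's view)\n",
           total, secs, total * 8.0 / 1e6 / secs);
    return 0;
}

Run the receiving half of TTCP (or any TCP discard service) on the other
machine and compare the printed figure with what a monitor on a third machine
sees on the cable.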

Current wisdom suggests that there should be no more than one file
server and 8 diskless Sun-3s per Ethernet for good Ethernet performance
when all the Suns are busy.  At BRL, we presently have one Ethernet with
14 Sun-3/50s and 4 Sun-2/50s running off of one fileserver (a Gould
PN9050 giving both ND and NFS service), as well as a variety of other
machines (more Goulds and 2 Alliant FX/8s) that communicate with NFS
on a more occasional basis.  We find that head contention on the
file server is the performance limit now, not the Ethernet.  However,
once the fileserver is beefed up a bit, the Ethernet will be next, so
the Ethernet will be split into two, with a pair of level-3 IP gateways
between them.

Hope this information helps.
	Best,
	 -Mike

jqj@GVAX.CS.CORNELL.EDU (J Q Johnson) (05/02/87)

Charles Hedrick notes that 20 to 40 diskless SUNs is a reasonable load on
an Ethernet.  Although our experience at Cornell is consistent with this
estimate, one should be a bit careful:  small software and usage changes
can make for big changes in behavior.  For example, on our main Ethernet
(about 25 diskless SUNs plus 75 other machines, total less than 25% load)
we observe that at least 1/2 of the SUN load is ND traffic.  ND is not
efficient in its use of Ethernet bandwidth, and I would expect the total 
load offered by the SUNs to drop, perhaps precipitously, when SunOS4.0
arrives.  Similarly, slightly better caching strategies in clients can
make a big difference, as can adding a bit more memory (we do wish our 
3/50s had 6MB!).

Perhaps most important, don't attempt to generalize from diskless SUNs
to PC-ATs (or even to diskless VAXstations).  The PCs won't be paging
across the network, don't run a multitasking OS, typically have smaller
program sizes than Suns and longer process lifetimes, etc.

All the above points to being able to support lots of diskless workstations
on your network.  On the other hand, it would be foolish to design a
network that didn't make provisions for saturation.  If you don't put
in bridges or gateways initially, at least locate your servers near
their clients so you can get the benefit of installing bridges later if
you need to.  Leave your PCs with a couple of empty slots so you can add
more memory (for a RAM disk or whatever) later if need be.  And so on.
Don't assume that any load analysis you do today will still be valid in
1989.

dms@HERMES.AI.MIT.EDU (David M. Siegel) (05/04/87)

Here at the MIT AI Lab we have found that our diskless Sun workstations
put a much heavier load on an Ethernet than Hedrick and Johnson noted.
Measured with a network analyzer, the 18 diskless Suns we have on one ether
can run the cable at 50 percent of capacity for extended periods of
time. Peak 5-second usage often jumps to 70 percent. All our machines
have 8 Meg of RAM, though some of them run Sun Common Lisp. Much of
the traffic is ND packets. Based on this, we are planning on having no
more than 12-15 Suns on one ethernet. Each server will have its own
"client" subnet.

hedrick@TOPAZ.RUTGERS.EDU (Charles Hedrick) (05/05/87)

Note that the 2-3 Mbits/sec of Ethernet traffic you report is with a
test program designed to test the network only.  However, in actual
use, the majority of the high-speed Ethernet traffic is generated by
file serving.  In that case, it is limited by the speed of the disks and
the amount of lookahead done by the protocols.  I would be extremely
surprised to see the current generation of Sun file server deliver
more than 1 Mbit/sec of sustained throughput.  Much to my surprise, I
find that replacing Eagles with super-Eagles does not seem to increase
the throughput available in my tests noticeably.  Note that these tests
involved a mix of operations, including file creating, reading,
removing, and renaming, and that the files were small or moderate in
size.  I.e. we tried to duplicate the sorts of I/O that a typical
student mix would generate (a rough sketch of that kind of mixed test
appears below, after the results).  I have to believe that fast
sequential operations on large files would get more with a super-Eagle.
Some other results:
  - one Eagle with one controller seemed to use about 2/3 of the CPU
	in a 3/180.
  - a second Eagle on the same controller added very little in throughput
  - a second Eagle on a second controller added about 50% in capacity.
	It seems that this was limited by CPU capacity
  - a 280 with super-Eagle did not have noticeably more performance than
	a 180 with Eagle.  However, we assume that the 280 would
	be able to handle two disks and controllers without running
	out of steam.  (We were unable to test this because we didn't
	have the right hardware configuration.)  It's not clear whether
	this would be cost-effective, though, when compared against
	using one Sun 3/140S per disk [a configuration which, however,
	is not supported by Sun.  Indeed I'm not sure that the 140S
	is even on the price sheet.]
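
For concreteness, here is a sketch of the kind of mixed small-file workload
described above: create, write, read back, rename, and remove a batch of small
files, then report the rate.  It is not the actual test program, and the file
count and file size are arbitrary choices; run it in an NFS-mounted directory
and watch the server and the cable while it goes.

/*
 * Rough sketch of a mixed small-file NFS workload: create, write,
 * read back, rename, and remove a batch of small files, then report
 * elapsed time.  Only an illustration of the sort of test described
 * above, not the actual test program; NFILES and FILESIZE are
 * arbitrary.  Run it in an NFS-mounted directory.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/time.h>

#define NFILES   200            /* files per pass            */
#define FILESIZE 4096           /* bytes per small file      */

int main(void)
{
    char buf[FILESIZE], name[64], newname[64];
    struct timeval t0, t1;
    double secs;
    int i, fd;

    memset(buf, 'a', sizeof(buf));
    gettimeofday(&t0, NULL);

    for (i = 0; i < NFILES; i++) {
        sprintf(name, "mixtest.%d", i);

        /* create and write one small file */
        if ((fd = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0) {
            perror("create"); exit(1);
        }
        if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
            perror("write"); exit(1);
        }
        close(fd);

        /* read it back */
        if ((fd = open(name, O_RDONLY)) < 0) {
            perror("open"); exit(1);
        }
        if (read(fd, buf, sizeof(buf)) != sizeof(buf)) {
            perror("read"); exit(1);
        }
        close(fd);

        /* rename it, then remove it */
        sprintf(newname, "mixtest.%d.new", i);
        if (rename(name, newname) < 0) {
            perror("rename"); exit(1);
        }
        if (unlink(newname) < 0) {
            perror("unlink"); exit(1);
        }
    }

    gettimeofday(&t1, NULL);
    secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    printf("%d create/write/read/rename/remove cycles in %.2f sec (%.1f/sec)\n",
           NFILES, secs, NFILES / secs);
    return 0;
}

Comparing the rate from one client against several clients running it at once
gives a rough feel for where the server (or the Ethernet) tops out.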

mike@BRL.ARPA (Mike Muuss) (05/05/87)

I agree that the TTCP only measured memory-to-memory throughput. That
was the intent -- to see how much data could be shoveled. I did not
intend to suggest it was a generic benchmark. Note that TTCP was using
TCP, mind you, not NFS or ND.

In our environment, we do a lot of network-based 24-bit RGB graphics,
which means whacking .75Mbytes (lores) or 3 Mbytes (hires) for each
image.  Often they are computed and displayed without ever touching
a disk.  So the TTCP test was not uninteresting.

Our Gould 9000 fileserver, which serves the collection of Suns I
mentioned, can be seen at busy times handling 200 packets/second
in both the transmit and receive directions (peak).  Many of them
result in disk transactions, although the ratio can be deceptive.
Eg, 1 pkt arrives asking for 8kbytes of data, which is read with
one disk I/O, and returned in 8 packets.  1 disk I/O, 9 packets.

Hope you find these random statistics of interest.
	Best,
	 -M

jon@CS.UCL.AC.UK.UUCP (05/07/87)

The figures here are approx:

Manchester University run a net of 60-odd Suns.  They have 10
diskless 3/50s per 3/260 server with a 400 Mb Eagle.

Each server-client set has its own thin ethernet.  All the servers are
backboned on an ethernet.  With 4 Meg on each diskless client and 8 Meg
on each server, the servers and ethernet just about cope if no more than
5 Meg of virtual memory is used on each client (i.e. 1 Meg of swapping).
I don't know whether the bottleneck is the ethernet or the server
cpu/disk speeds.

Most of the ether traffic is ND/NFS, which is much less a respecter of
bandwidths and delays than TCP traffic, and which wreaks havoc with
bridges and gateways unless you wind the read/write transfer sizes down
by hand.  Hence the client/server ratio and the separate ethers.
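
To put rough numbers on that, here is a back-of-the-envelope sketch of what
one large NFS/ND read turns into on the wire and what a store-and-forward box
in the middle has to buffer.  The MTU, transfer size, and client count are
illustrative assumptions, and UDP/RPC header overhead is ignored.

/*
 * Why big NFS/ND transfer sizes are hard on bridges and gateways:
 * one large read over UDP becomes a back-to-back burst of IP
 * fragments, and dropping any one fragment for lack of buffer space
 * forces the whole RPC to be retried.  All constants below are
 * illustrative assumptions.
 */
#include <stdio.h>

int main(void)
{
    int mtu     = 1500;         /* Ethernet MTU, bytes                    */
    int iphdr   = 20;           /* IP header carried by each fragment     */
    int xfer    = 8192;         /* NFS/ND read or write size, bytes       */
    int clients = 10;           /* hosts answering at about the same time */

    int payload = mtu - iphdr;                  /* data per fragment      */
    int frags   = (xfer + payload - 1) / payload;
    int burst   = clients * (xfer + frags * iphdr);

    printf("one %d-byte transfer at %d MTU -> %d back-to-back frames\n",
           xfer, mtu, frags);
    printf("%d clients at once -> about %d bytes of buffering in the middle\n",
           clients, burst);
    return 0;
}

Winding the read/write size down to about 1 Kbyte keeps each RPC inside a
single frame, at the cost of more RPCs per file.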

Does anyone know of any affordable ethernet/ethernet IP
gateway/subnet router that can take 8 Kbytes worth of IP back to back
from several (~10) hosts at once?

Jon

mishkin@apollo.uucp (Nathaniel Mishkin) (05/08/87)

I found all this discussion about loaded ethernets pretty interesting.
Having used Apollos (both in and out of Apollo Computer Inc.) for the
last ~5 years, I've become pretty familiar with the vices and virtues
(much of the former) of token ring networks and often wondered why we
wouldn't just be better off with ethernet.  I think the recent discussion
in this group highlights some of the virtues of token ring networks.
I was fairly astonished to read that one basically can run no more than
(based on the various estimates) 8-15 diskless workstations (of some
manufacture) on a single ether.  I shudder to think of the cost (in money
and performance) of *requiring* routers/bridges and internetwork topology
for a relatively small "work group".  You just don't have these problems
in a token ring.  Token rings guarantee fair access to the medium and
as a result can run successfully with consistently higher average loads.

And forget diskless workstations for a minute.  How about doing file system
backups over the net?  There's a fine bit of load; and it's not bursty
like diskless workstations.  In our multi-hundreds of gigabyte environment,
backups (like love) are forever.

I also thought the comment about how improved caching would help matters
was interesting.  Of course, proper caching requires correct cache
validation to ensure that you're reading valid data.  Not all distributed
file systems implement such correctness guarantees.  For example, Apollo's
distributed file system does, but NFS doesn't.
-- 
                    -- Nat Mishkin
                       Apollo Computer Inc.
                       Chelmsford, MA
                       {wanginst,yale,mit-eddie}!apollo!mishkin

mark@mimsy.UUCP (Mark Weiser) (05/09/87)

In article <34bd5209.c366@apollo.uucp> mishkin@apollo.UUCP (Nathaniel Mishkin) writes:
>...I think the recent discussion
>in this group highlights some of the virtues of token ring networks.
>I was fairly astonished to read that one basically can run no more than
>(based on the various estimates) 8-15 diskless workstations (of some
>manufacture) on a single ether. 

I think this is a misinterpretation of the comments.  I have seen
Apollo networks exhibiting extremely poor performance when too many
diskless nodes were accessing a single server.  (Too many did not
seem to be all that many--I saw this at the Brown demonstration
classroom, when all the diskless clients were trying to start at
once.)  I think that the question is:  what does it mean to 'run
no more than...'.  Sure you can run more than 8-15, but the
performance will look worse.  If you are used to a local disk, then
you can 'feel'  the decrement with more than 8-15 diskless workstations
on the ethernet.  On the other hand, if you are willing to accept
low-performance transients (as the Brown folks evidently were on their
Apollos during startup), then you can do more.

Another angle: there are lots of reasons why performance could be different
between these two systems.  It is premature to point the finger at the
0/1 networking levels without more information.

-mark
-- 
Spoken: Mark Weiser 	ARPA:	mark@mimsy.umd.edu	Phone: +1-301-454-7817
After May 15, 1987: weiser@parcvax.xerox.com

connery@bnrmtv.UUCP (Glenn Connery) (05/11/87)

In article <34bd5209.c366@apollo.uucp>, mishkin@apollo.uucp (Nathaniel Mishkin) writes:
> I was fairly astonished to read that one basically can run no more than
> (based on the various estimates) 8-15 diskless workstations (of some
> manufacture) on a single ether...  You just don't have these problems
> in a token ring...

Since you are not comparing equivalent systems, this kind of interpretation
of the results seems rather unwarranted.  The discussion to date has
pointed out that the Suns are doing paging of the virtual memory over the
Ethernet.  Depending upon the way things are set up this could be a huge
load for the network to handle, regardless of the efficiency of the
access protocol.
-- 

Glenn Connery, Bell Northern Research, Mountain View, CA
{hplabs,amdahl,3comvax}!bnrmtv!connery

mishkin@apollo.uucp (Nathaniel Mishkin) (05/11/87)

In article <6603@mimsy.UUCP> mark@mimsy.UUCP (Mark Weiser) writes:
>In article <34bd5209.c366@apollo.uucp> mishkin@apollo.UUCP (Nathaniel Mishkin) writes:
>>...I think the recent discussion
>>in this group highlights some of the virtues of token ring networks.
>>I was fairly astonished to read that one basically can run no more than
>>(based on the various estimates) 8-15 diskless workstations (of some
>>manufacture) on a single ether. 
>
>I think this is a misinterpretation of the comments.  I have seen
>Apollo networks exhibiting extremely poor performance when too many
>diskless nodes were accessing a single server.

I think there's some confusion here:  *I* was not talking about the number
of diskless workstations that could be booted off a single server.  Maybe
other people were.  It seemed that people were talking about the number
of diskless workstations that could be on a single local network (e.g.
ether or ring).

Further, let me make it clear that when I gave the range "8-15"
I was merely quoting the numbers that had appeared in the earlier articles
to which I was following up.  (I.e. I should not be considered an authority
on the performance characteristics of other manufacturers' workstations :)
Unless I was misreading, these quotes were from articles that seemed
to be discussing the number of diskless workstations per ether, not per
disked server.  I'll leave it to the real authorities to clear things up.

>Another angle: there are lots of reasons why performance could be different
>between these two systems.  It is premature to point the finger at the
>0/1 networking levels without more information.

Fair enough.  I was just trying to provide some more information that I thought
was relevant.
-- 
                    -- Nat Mishkin
                       Apollo Computer Inc.
                       Chelmsford, MA
                       {wanginst,yale,mit-eddie}!apollo!mishkin

mishkin@apollo.uucp (Nathaniel Mishkin) (05/11/87)

In an earlier posting of mine, I unjustly sullied the capabilities of
the NFS protocol in the area of caching.  My cursory reading of the
NFS Protocol Spec (which doesn't explicitly discuss caching issues) failed
to catch the frequent "attributes" return parameters that one is, I take
it, to use in cache management if one is to have an efficient NFS
implementation.

Open mouth; extract foot.
-- 
                    -- Nat Mishkin
                       Apollo Computer Inc.
                       Chelmsford, MA
                       {wanginst,yale,mit-eddie}!apollo!mishkin

jas@MONK.PROTEON.COM (John A. Shriver) (05/11/87)

We are looking at several effects here.  One is server saturation
proper--how fast its disks and protocols can run.  The next is
saturation of the server interface.  The third is saturation of the
LAN itself.  All three are sensitive to the LAN technology.

Server protocol performance can be affected relatively easily by LAN
packet size.  If you've got big packets (4K instead of 1.5K), you'll
take fewer interrupts and context switches.
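
As a trivial illustration of the arithmetic (the 8 Kbyte transfer size and the
one-interrupt-per-packet assumption are only for the example):

/*
 * Packets (and hence interrupts / context switches) needed to move
 * one 8 Kbyte transfer with small versus large LAN packets.  Both
 * the transfer size and the one-interrupt-per-packet assumption are
 * illustrative only.
 */
#include <stdio.h>

int main(void)
{
    int xfer    = 8192;                 /* bytes per server operation */
    int sizes[] = { 1500, 4096 };       /* 1.5K vs. 4K packets        */
    int i, pkts;

    for (i = 0; i < 2; i++) {
        pkts = (xfer + sizes[i] - 1) / sizes[i];
        printf("%4d-byte packets: %d packets (and interrupts) per %d-byte transfer\n",
               sizes[i], pkts, xfer);
    }
    return 0;
}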

Saturation of the server interface is to a great degree a matter of
good design.  Having enough buffering, a clean programming interface,
and an ability to pipeline can definitely help receive/transmit more
data.

However, having any level of data link flow or congestion control can
really help.  Most CSMA networks have no way to know if a packet was
really received at the server, or was dropped for lack of a buffer.
Some CSMA networks (DEC's CI) do this, and it helps a lot.  (Ethernet
does not.)  All of the Token-Ring networks (IBM's, our ProNET, ANSI's
FDDI standard) have this, in the "frame copied" bit that comes back
around from the recipient.  This makes the possibility of lost packets
due to server congestion dramatically lower, which really speeds
things up.  The data link can implement flow control & retransmission
much faster than the transport code.

The LAN itself can have dramatically different total capacity, which
matters when you want 3 servers on one LAN, not just one.  On 10
megabit networks, you can get more total data through, with less
delay, on a Token-Ring than a CSMA/CD network.  While vendors will
disagree on where CSMA/CD congests terminally (somewhere between 4 and
7 megabits/second), it is true that Token-Ring can really deliver all
10 megabits/second.

Moreover, at speeds beyond 10 megabits/second, CSMA/CD does not scale,
and you almost have to go Token-Ring.  (You can go CSMA/CA, but it can
degenerate into a Token-Bus.)  The FDDI standard is a Token-Ring, as
is the ProNET-80 product.

mishkin@apollo.UUCP (05/11/87)

[[This is a reposting of my response to Mark Weiser's article.  This
  one is slightly different from my earlier one.  If I spend any more time
  trying to figure out how to (successfully) cancel a previously posted
  article, I think I'll go insane.  Sorry for the noise.  --mishkin]]

In article <6603@mimsy.UUCP> mark@mimsy.UUCP (Mark Weiser) writes:
>I think this is a misinterpretation of the comments.  I have seen
>Apollo networks exhibiting extremely poor performance when too many
>diskless nodes were accessing a single server.

I think there's some confusion here.  *I* was talking about the number
of diskless workstations per ethernet, not per server.  I thought that's
what other people were talking about too.

Further, I want to make a clarification:  When I referred to 8-15 as being
the maximum number of diskless workstations (per ethernet), I was *merely*
quoting the numbers that appeared in the articles to which I was following
up.  (I.e. I don't claim to be an expert on the performance characteristics
of other manufacturers' equipment.)  I'll let the real experts clear things
up.

>Another angle: there are lots of reasons why performance could be different
>between these two systems.  It is premature to point the finger at the
>0/1 networking levels without more information.

Climbing further out of the hole which I seem to have been digging myself
into:  I agree with you.  I was not trying to make a definitive comparison
between rings and ethers.  I was simply trying to add some more information
to the discussion.  A number of people here (at Apollo) have said to
me "Come on, this really can't be an ethernet saturation problem."  Others
have extolled ring networks in still other ways that I can barely
understand.

Finally, before I shrink away, and lest anyone get the wrong impression,
I feel obliged to point out that Apollo believes both ring and ether
networks are fine ideas.  These days, one can buy Apollo's DN3000s with
a ring controller, an ethernet controller, or both, and all your Apollo
workstations can communicate (and share files) over complex
ring/ether/whatever internetwork topologies.

-- 
                    -- Nat Mishkin
                       Apollo Computer Inc.
                       Chelmsford, MA
                       {wanginst,yale,mit-eddie}!apollo!mishkin