[comp.protocols.tcp-ip] TCP Ethernet Throughput

karl@hpscdc.scd.hp.com (Karl Watanabe) (02/03/90)

/ hpscdc:comp.protocols.tcp-ip / ncpmont@amdcad.AMD.COM (Mark Montgomery) /  8:12 pm  Jan 30, 1990 /
In article <582@berlioz.nsc.com> andrew@dtg.nsc.com (Lord Snooty @ The Giant Poisoned Electric Head       ) writes:
><2447@ncr-sd.SanDiego.NCR.COM>, ward@babel.SanDiego.NCR.COM (Ed Ward 2337) :
>> I'm looking for some information comparing the Ethernet throughput of the 
>> AMD lance, Seeq 8005, and Intel 82??? controller chips....
>
>I'd look at National's NIC too - they practically own the Enet market.
>-- 
>...........................................................................
>Andrew Palfreyman	andrew@dtg.nsc.com	Albania before April!

Whoa, boy!	I think that the statement "they practically own the Enet
market" is a bit rash.	Check with IBM, DEC, SUN or 3COM or look inside their
boxes.  I think you'll find that there are about twice as many AMD LANCE, SIA
and XCVR chips in them as all those other manufacturers have sockets combined.
"LOOK AGAIN" to quote the T.I. speak n' spell chip.		Mark
----------


I have opened a few HP LAN modules and I see AMD LANCE and SIA chips.

I don't see how NATL Semi could possibly "OWN" the ENET mkt.

Karl

pat@hprnd.HP.COM (Pat Thaler) (02/06/90)

Probably nobody "owns" the Ethernet IC market.  The National 
controller chip is used on a lot of newer cards for PC's (but 
certainly not all PC cards).  Cards for workstations often use the 
Intel and AMD chipsets.  The early SEEQ chips were also used
a lot on PC's; I'm not sure about the 8005.  People's perceptions of
who "owns" the market are probably colored by what machines they
normally work with.  (I expect that most of the IC vendors involved
don't release sales figures on their Ethernet chips.)

In my experience, how well a chip performs often depends on how it
interacts with the backplane it is interfaced to.  In other words,
it is possible that chip A will perform better than chip B in
a PC; but chip B will perform better than chip A in another backplane.
On the chips which support a buffer structure, how efficiently you
manage the buffers can also affect performance.  There may be a
trade-off of efficiency in throughput for efficiency in memory usage.

It is not clear to me that a benchmark test of the chips against each
other in a set environment will indicate relative performance in 
another environment.

Pat Thaler

alexr@xicom.uucp (Alex Laney) (02/08/90)

In article <29000@amdcad.AMD.COM> ncpmont@amdcad.UUCP (Mark Montgomery) writes:
>In article <582@berlioz.nsc.com> andrew@dtg.nsc.com (Lord Snooty @ The Giant Poisoned Electric Head       ) writes:
>><2447@ncr-sd.SanDiego.NCR.COM>, ward@babel.SanDiego.NCR.COM (Ed Ward 2337) :
>>> I'm looking for some information comparing the Ethernet throughput of the 
>>> AMD lance, Seeq 8005, and Intel 82??? controller chips....
>>
>>I'd look at National's NIC too - they practically own the Enet market.
>>-- 
>>...........................................................................
>>Andrew Palfreyman	andrew@dtg.nsc.com	Albania before April!
>
>Whoa, boy!	I think that the statement "they practically own the Enet
>market" is a bit rash.	Check with IBM, DEC, SUN or 3COM or look inside their
>boxes.  I think you'll find that there are about twice as many AMD LANCE, SIA
>and XCVR chips in them as all those other manufacturers have sockets combined.
>"LOOK AGAIN" to quote the T.I. speak n' spell chip.		Mark

The figure is: about 85% of the Ethernet chipset market is supplied by
Nat. Semi. Surprising but true ... I know for certain that Western Digital
and others use the Nat. Semi chipset. How many chips does DEC use compared
to the IBM PC and compatible market (Novell, etc.)?

It's an independent figure from Dataquest or some organization like that.

-- 
Alex Laney, Xicom Group, National Semiconductor, Ottawa, Canada (613) 728-9099
uunet!mitel!sce!xicom!alex (alex@xicom.uucp)     Fax: (613) 728-1134
"You save time, increase the amount of work done and it is easy."

rusti@milk0.itstd.sri.com (Rusti Baker) (02/10/90)

I would be interested in seeing any type of discussion of 
the behaviour of the chips mentioned in this thread.

E.g., Clark, Jacobson, et al. provided this insight into the 
behavior of the LANCE in their June 1989 IEEE Comm. article: 

"[the LANCE] locks up the memory bus during the transfer
thus stalling the processor"

ncpmont@amdcad.AMD.COM (Mark Montgomery) (02/10/90)

In article <29914@sparkyfs.istc.sri.com> rusti@milk0.itstd.sri.com.UUCP (Rusti Baker) writes:
>I would be interested in seeing any type of discussion of 
>the behaviour of the chips mentioned in this thread.
>E.G. Clark, Jacobson et al provided this insight into the 
>behavior of the LANCE in their June 1989 IEEE Comm.  article: 
>"[the LANCE] locks up the memory bus during the transfer
>thus stalling the processor"

Yes, and how many times have we seen people in this group asking
if anybody knew why their 3c5xx board was locking up their
system?
	Well, 3c5xx's don't have LANCE's on them, they have N__ chips.
	Also if you'll re-read the article you'll see that what they
	were saying was that the LANCE was "stalling" the cpu while
	it did dma of the packet directly to memory.  Can't do that
	with an XYZ chip.  Of course you could have the cpu do the
	transfers or you could build a cache if you'd rather.
				Mark

grr@cbmvax.commodore.com (George Robbins) (02/12/90)

In article <29914@sparkyfs.istc.sri.com> rusti@milk0.itstd.sri.com.UUCP (Rusti Baker) writes:
> 
> E.G. Clark, Jacobson et al provided this insight into the 
> behavior of the LANCE in their June 1989 IEEE Comm.  article: 
> 
> "[the LANCE] locks up the memory bus during the transfer
> thus stalling the processor"

Again, this is not necessarily an attribute of the Lance chip, but rather
how a particular interface/system implements the Lance DMA/memory interface.
A different interface might implement a (logically) dual ported buffer
memory or other scheme and thus avoid this particular constraint.

Clearly identifying the throughput constraints for each of the major ethernet
chipsets (let alone their common instantiations) would be a major task.  Chip
selection is usually done on the basis of cost, familiarity, ease of interface
and sometimes even avoidance of known problems...

-- 
George Robbins - now working for,     uucp:   {uunet|pyramid|rutgers}!cbmvax!grr
but no way officially representing:   domain: grr@cbmvax.commodore.com
Commodore, Engineering Department     phone:  215-431-9349 (only by moonlite)

phil@pepsi.amd.com (Phil Ngai) (02/13/90)

In article <29914@sparkyfs.istc.sri.com> rusti@milk0.itstd.sri.com.UUCP (Rusti Baker) writes:
|"[the LANCE] locks up the memory bus during the transfer
|thus stalling the processor"

My guess as to what was meant by this is that they are talking about the
LANCE requiring 600 ns to perform a memory cycle. That is, a zero wait
state memory cycle is 600 ns. (I don't know if you can do wait states.
This is based on my experience 6 years ago.) Although this might not
have been considered unusual when the LANCE came out in the early 80's
(10 MHz processors were only a dream at the time), it may seem slow in
an age of 25 MHz processors.

Any DMA device will lock up the memory bus, the question is how long?
The people who wrote your quote probably thought 600 ns was too long.
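Taking the 600 ns figure at face value, it is easy to estimate how much of the memory bus a 16-bit DMA device eats at Ethernet wire speed -- a back-of-the-envelope sketch using only the numbers quoted in this thread, not chip-specific data:

```python
# Rough bus-occupancy estimate for a 16-bit DMA device moving
# 10 Mb/s Ethernet traffic at 600 ns per memory cycle.
# Illustrative arithmetic only, based on the figures quoted above.

WIRE_RATE_BYTES = 10_000_000 // 8   # 10 Mb/s -> 1.25 Mbyte/s
WORD_BYTES = 2                      # LANCE does 16-bit transfers
CYCLE_NS = 600                      # quoted zero-wait-state cycle time

words_per_sec = WIRE_RATE_BYTES / WORD_BYTES      # 625,000 cycles/s
bus_fraction = words_per_sec * CYCLE_NS * 1e-9    # seconds of bus per second

print(f"bus occupancy at wire speed: {bus_fraction:.1%}")   # -> 37.5%
```

So even before counting handshake overhead, a receive at wire speed would claim over a third of a 600 ns memory bus -- which gives some sense of why the authors called it "stalling".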

This is the kind of thing that a new device would probably do better.

I am not an official or unofficial spokesman for the company. This
is only my opinion.

Copyright 1990 by Phil Ngai. You may only distribute with the above
disclaimer.

--
Phil Ngai, phil@amd.com		{uunet,decwrl,ucbvax}!amdcad!phil
When guns are outlawed, only governments will have guns.

rusti@milk0.itstd.sri.com (Rusti Baker) (02/13/90)

Thanks to Mark Montgomery and George Robbins for clarification of the
problem of characterizing the performance of a chip set without considering
the interface/system.

>	Also if you'll re-read the article you'll see that what they
>	were saying was that the LANCE was "stalling" the cpu while
>	it did dma of the packet directly to memory.  Can't do that
>	with an XYZ chip.  Of course you could have the cpu do the
>	transfers or you could build a cache if you'd rather.
>				Mark

I was intrigued by the use of the word "stalling", versus "blocking" etc.
I had interpreted the remark in the article to mean that there was something
else going on (like the DMA had some other side effect).  Since the
authors did not elaborate, I was curious.

BILLW@MATHOM.CISCO.COM (William "Chops" Westfield) (02/14/90)

What happens is that the ethernet chips (both the Lance and Intel chips),
in their efforts to do fancy buffer management, operate in a manner
similar to a processor (bus master), and have to share the bus with
the CPU chip.  To do this they implement a general purpose DMA scheme
that goes something like this:

	Ethernet: Can I have the bus?
	CPU:	OK (from this point on, the CPU can't access memory, and
		is "stalled").
	Ethernet: Let's see, here's some address.  Now the address is valid,
		and here is some data... (and so on, hopefully for several
		words worth of data transfer).
	Ethernet: OK, I'm done.
	CPU:	Ok, I can start using the bus again...

This has the following problem:

    o	The DMA handshake takes several cycles, during which no
	"useful" work is being done.
    o	The Lance and Intel chips both use multiplexed address/data
	pins, so that memory accesses by them take more cycles than
	they really ought to.
    o	The clock on the ethernet chip is typically 10MHz - much slower
	than most modern CPU chips.  This slows down both the memory
	accesses by the ethernet chip, and the DMA handshake.
    o	A quick look at my lance book shows that the lance will
	take about 600 nS to read one word of memory - an eternity
	in a world of 80 nS main memory and 25 nS caches... Another
	250 nS goes by in between the time the DMA handshake finishes
	and the first DMA access starts.

So that's how ethernet controllers can "stall" a processor.  As others
have pointed out - clever hardware designs can get around this by using
dual ported memory, or other features.
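Taking BillW's figures at face value (the 250 nS handshake-to-first-access gap and 600 nS per word are the numbers quoted above, not data-sheet values), the per-burst cost works out like this:

```python
# Back-of-the-envelope cost of one 8-word DMA burst, using the
# figures quoted above: ~250 ns between handshake and first access,
# then 600 ns per 16-bit word.

HANDSHAKE_NS = 250
CYCLE_NS = 600
WORDS_PER_BURST = 8
BYTES_PER_WORD = 2

burst_ns = HANDSHAKE_NS + WORDS_PER_BURST * CYCLE_NS   # 5050 ns
bytes_moved = WORDS_PER_BURST * BYTES_PER_WORD         # 16 bytes
rate_mb_s = bytes_moved / (burst_ns * 1e-9) / 1e6

print(f"{burst_ns} ns per burst, ~{rate_mb_s:.1f} Mbyte/s effective")
```

About 3.2 Mbyte/s of effective DMA bandwidth -- comfortably above Ethernet wire speed, but every nanosecond of it comes out of the CPU's memory bandwidth on a single-ported design.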

Intel, AMD, and NS all have second generation chips out now.  I don't
know anything about them.  I do know that we in "higher level"
industry view any new chips with deep suspicion - early versions of
the first generation each had their share of serious bugs.  The Lance
and competitors may not be perfect, but at least they have reached the
point where they are fairly well understood.

BillW
-------

nn@lanta.Sun.COM (Neal Nuckolls) (02/14/90)

In article <29138@amdcad.AMD.COM>, phil@pepsi.amd.com (Phil Ngai) writes:
> In article <29914@sparkyfs.istc.sri.com> rusti@milk0.itstd.sri.com.UUCP (Rusti Baker) writes:
> |"[the LANCE] locks up the memory bus during the transfer
> |thus stalling the processor"
> 
> My guess as to what was meant by this is that they are talking about the
> LANCE requiring 600 ns to perform a memory cycle. That is, a zero wait
> state memory cycle is 600 ns. (I don't know if you can do wait states.
> This is based on my experience 6 years ago.) Although this might not
> have been considered unusual when the LANCE came out in the early 80's,
> 10 MHz processors were only a dream at the time) it may seem slow in
> an age of 25 MHz processors.
> 

Actually, an 8-word LANCE burst requires a full 4.8 us from start
to finish (600 ns/word).  The key word with ethernet chips is
latency -- not bandwidth.  The bandwidth is easy.  Satisfying
the latency needs of, say, a LANCE in applications more complex
than your typical PC is difficult.
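Neal's latency point can be quantified roughly.  At 10 Mb/s a receive byte arrives every 800 nS, so a controller with an N-byte on-chip FIFO can only tolerate so much bus-grant latency before it overruns.  The FIFO depth below is a made-up number for illustration (consult the actual data sheet for real values); the burst time is the 4.8 us quoted above.

```python
# Why latency, not bandwidth, is the hard part: a receive byte
# arrives every 800 ns at 10 Mb/s.  FIFO_BYTES is hypothetical;
# the 4.8 us burst time is the figure quoted in this thread.

BYTE_TIME_NS = 800          # 10 Mb/s -> 0.8 us per byte on the wire
FIFO_BYTES = 16             # hypothetical on-chip FIFO depth
BURST_NS = 4800             # 8-word burst at 600 ns/word

# Worst-case bus-grant latency tolerable before the FIFO overruns:
budget_ns = FIFO_BYTES * BYTE_TIME_NS - BURST_NS
print(f"latency budget: {budget_ns} ns")   # 16*800 - 4800 = 8000 ns
```

A few microseconds of slack is nothing on a busy backplane with other bus masters, which is exactly why the "more complex than your typical PC" cases hurt.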

------
Re: TCP Ethernet Throughput

Memory-to-memory TCP throughput between two SPARCstations 1's in
single user mode over an empty ethernet running SunOS 4.1 using a
32k send/receive window is approximately 1030 Kbytes/s.

Interestingly, the bottleneck in squeezing out the final few
Kbytes/s is the medium itself -- the receiver cannot return window
update (acknowledgment) packets, and defers, while the transmitter
is sending full 1500 byte ethernet frames back-to-back; so the
transmitter runs out of "window" after 32k, then gets all the
acknowledgments in one flood, pauses briefly, and transmits the
next 32k.  This type of behavior is a manifestation of CSMA/CD.
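For comparison, a rough ceiling on that measurement: with back-to-back 1500-byte frames, the wire-level overheads (8-byte preamble, 14-byte MAC header, 4-byte FCS, 9.6 us inter-frame gap, and 40 bytes of IP+TCP headers per frame, assuming no options) bound the user-data rate as follows.  This is a sketch of standard Ethernet framing arithmetic, not a claim about Neal's specific setup:

```python
# Rough ceiling on memory-to-memory TCP throughput over 10 Mb/s
# Ethernet with back-to-back 1500-byte frames, to compare against
# the ~1030 Kbyte/s figure above.

BYTE_TIME_US = 0.8                         # 10 Mb/s -> 0.8 us/byte
ON_WIRE_BYTES = 8 + 14 + 1500 + 4 + 12     # preamble + MAC hdr + data
                                           # + FCS + 9.6 us gap (12 byte times)
TCP_PAYLOAD = 1500 - 40                    # 1460 bytes after IP+TCP headers

frame_time_us = ON_WIRE_BYTES * BYTE_TIME_US          # 1230.4 us per frame
ceiling_kb_s = TCP_PAYLOAD / (frame_time_us * 1e-6) / 1024

print(f"~{ceiling_kb_s:.0f} Kbytes/s ceiling")
```

That works out to roughly 1160 Kbytes/s, so the measured 1030 Kbytes/s is within about 10% of what the medium allows -- consistent with Neal's observation that the bottleneck is the ethernet itself.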


neal nuckolls
sun microsystems
nn@sun.com

hwajin@wrs.wrs.com (Hwa Jin Bae) (02/18/90)

In article <12566152530.17.BILLW@MATHOM.CISCO.COM> BILLW@MATHOM.CISCO.COM (William "Chops" Westfield) writes:
> [...]
>So that's how ethernet controllers can "stall" a processor.  As others
>have pointed out - clever hardware designs can get around this by using
>dual ported memory, or other features.

Oh yes... a voice of reason.  I know several designs that suffer from
this very problem.  Additionally, some of the VME CPU boards that have
on-board LANCE chips (I won't mention names here...) and do not dedicate
a small pool of separate memory for the LANCE ring buffers should be banned.
A little separate RAM goes a long way... FAST.
-- 
hwajin@wrs.com (uunet!wrs!hwajin)   "Omnibus ex nihil ducendis sufficit unum."
Hwa Jin Bae, Wind River Systems, 1351 Ocean Avenue, Emeryville, CA 94606, USA

Andy.Linton@comp.vuw.ac.nz (Andy Linton) (02/20/90)

There have been a number of articles on the problems with the LANCE.

Does anyone know whether the LANCE is used on the MICOM-Interlan boards,
the NI5010 and NP600A?  I'm trying to track down a problem with transfer
rates on machines using these boards.  There also seems to be some debate
about whether the 3-Com boards use the LANCE.

Thanks
andy
--
SENDER = Andy Linton
EMAIL  = Andy.Linton@comp.vuw.ac.nz	PHONE = +64 4 721 000 x8978