[comp.protocols.tcp-ip] IBM's FAL and poor performance

HANK@BARILVM.BITNET (Hank Nussbacher) (08/26/90)

>A month ago, I wrote a very short message to this group concerning
>NIS.NSF.NET, saying "It certainly has a *terrible* implementation
>of TCP/IP."  I quickly received the following private message:
...
>Well, I spent about 15 hours benchmarking NIS.NSF.NET against other
>machines on the internet, carefully documenting the packet traces,
>and passing my observations on to IBM.  I did this at my own expense,
>as a matter of curiosity, because I thought that the folks who were
>sending the messages were "real programmers" who were really interested
>in getting it right (a Jay Elinsky was also involved).

I think in all fairness, you first need to identify yourself.  I have
looked at the Bitearn Nodes database and was unable to find you listed
as a contact person at MSU.  Can you please describe your position at
the MSU computer center?

>Instead, my messages were sent to Merit, the operator of NSF.NET,
>causing a political uproar.  I personally do not appreciate finger pointing
>as a substitute for action, and in this case it was grossly unjustified.
>
>Let's keep the record straight: the folks at Merit *are* "real programmers".
>In the past, when I have uncovered a problem they have stayed up until
>5:30 in the morning tracking it down (and I suspect without overtime).
>The internet has been vastly improved during the tenure of their operation
>of the NSF net.

I cannot attest to how good the people at Merit are, but I can attest
to the quality of the people at IBM behind the Tcp/Ip package.  I have not
seen a CERT advisory yet for IBM's VM or MVS Tcp/Ip software but I
have seen quite a few for SunOS (rcp and sendmail to name just 2
recent ones).  To date I have not seen a report where IBM has violated
any IP "rules" or played fast and loose.  I have seen cases of
vendors called to task in this list and others for ignoring certain
RFC rules, and people have to scramble to make things work around
them.

The people who wrote IBM's TCP/IP code are not reformed Cobol
programmers, but rather true programmers at the bit level.  They
understand all the intricacies of IP and know how to optimize code.

>The NIS.NSF.NET hardware was contributed by IBM, and I understand
>that the software is the latest IBM release.  My benchmark was carefully
>conducted for a direct comparison of the TCP/IP implementations,
>under the worst possible circumstances to make implementation flaws
>stand out clearly.  The failures demonstrated are plainly of the
>host, and not of the environment.

I would like to ask you some questions.  Did you determine whether
Merit is using an IBM 8232 to connect their channel to the Ethernet
or a BTI ELC or ELC2?  The difference is quite astounding.  Did you
check the level of activity on their Ethernet at the time?  If they
had 40% activity, there would be collisions which would cause a
slower response time.  Did you check the VM environment at Merit?
Are their disks fast enough?  Are they properly balanced?  Is Tcp/Ip
getting its full share of the resources?  A simple example: if the
anonymous FTP disks you tried to reach are on the same disk pack as
the swap area, but located far enough away from it, response time
would be slower (high seek time and head travel time).

You may be right that Merit "overall" is slower than some other
hosts you have looked at, but that does not mean that the Tcp/Ip
software of IBM is to blame.  It could be the IBM hardware, the
IBM mainframe or IBM VM software mis-optimization.  I will go out
on a limb and state that it is probably not the Tcp/Ip software
which is to blame but perhaps some other aspect, which may explain
why IBM forwarded your note to Merit.

>
>As to the "very satisfied" customers, I will point out that MSU has
>not yet installed the VM TCP/IP product, even though it came "free"
>with other products.  And I occasionally read "IBMTCP-L" on bitnet.
>'Nuf said.

Can you identify some dissatisfied customers?  I am a satisfied
customer.

>
>NIS.NSF.NET [35.1.1.48] is an IBM 4381P02 running VM/HPO-5 FAL 1.2.1
>(according to Merit -- there is no host DNS record).
>I understand it to be a lightly loaded machine.  The traces show
>less than MSS sized packets, too many retransmissions, timeouts of
>20 seconds or more, and failure to combine ACKs with bidirectional data.
...
>CONCLUSION:
>IBM-FAL for VM runs at 1/2 the speed of many of its competitors.
>It fails to meet several host requirements, not just for TCP/IP
>but for FTP and SMTP as well.

Can you elaborate on which RFCs it fails to comply with, especially in
the FTP and SMTP categories?

>Bill Simpson
>    09998was@ibm.cl.msu.edu
>    09998was@msu.bitnet

Let's try to keep this a little bit balanced.

Hank Nussbacher

ROBERT@VM1.MCGILL.CA (Robert Craig) (08/26/90)

Mr. Simpson, if you did *that* much work, it would
seem to be in order to post a comparison of vendor
implementations, noting exactly where they fail in
compliance to the host requirements and in robustness,
performance, and other areas.  That would be constructive.

Your posting is inflammatory, filled
with unsubstantiated innuendo, and more typical of a
rumour-monger than of someone interested in learning
and in helping others.

If the tone of your original note to the supporters of TCP/IP
in IBM containing the results of your testing was in the
same vein, I don't wonder that you received no response from them
(as I intuit from the bitterness in your note).

For the record, we run FAL 1.2.1 on a 4381 with a BTI ELC.
We began with an 8232 and performance *was* abysmal.
It is much better now...
Note that IBM has replaced the
8232 with an improved product (3172?  I don't recall the
model number off-hand).

Robert Craig                          domain: robert@vm1.mcgill.ca
Senior Network Analyst                bitnet: robert@mcgill1
McGill University Computing Centre    Tel: (514) 398-3710
805 Sherbrooke St. W.                 FAX: (514) 398-6876
Montreal, Quebec H3A 2K6              CORISQ: (514) 398-RISQ

davecb@yunexus.YorkU.CA (David Collier-Brown) (08/27/90)

  More light, less heat, people!

  This was a criticism of observed performance: it should have carried a
disclaimer that that is exactly what it was.  Speculation about the hardware
configuration, the disk layout or the particular software product is
inadvisable.  
  Let's stick to facts: machine a is amazingly slow and doesn't do X.
machine b on the same net isn't...


--dave (if you haven't noticed I just attacked both sides, 
	you're not paying attention (:-)) c-b
ps: Criticism, even indirect, of any vendor is liable to cause
    unkindnesses in return mail.  Criticism of at least two will
    and does cause unkindnesses to your president. Facts are necessary,
    if insufficient, in such cases.
-- 
David Collier-Brown,  | davecb@Nexus.YorkU.CA, ...!yunexus!davecb or
72 Abitibi Ave.,      | {toronto area...}lethe!dave 
Willowdale, Ontario,  | "And the next 8 man-months came up like
CANADA. 416-223-8968  |   thunder across the bay" --david kipling

09998WAS%MSU@PUCC.PRINCETON.EDU ("Bill.Simpson") (08/27/90)

> Date: Sun, 26 Aug 90 07:41:20 EDT
> From: Robert Craig  <ROBERT@VM1.MCGILL.CA>
> Mr. Simpson, if you did *that* much work, it would
> seem to be in order to post a comparison of vendor
> implementations, noting exactly where they fail in
> compliance to the host requirements and in robustness,
> performance, and other areas.  That would be constructive.

Actually, I had tried to post a *concise* numerical presentation,
without detailed verbiage describing the specific failings I found.
I did send those details privately to IBM.  If there is enough interest,
I would be glad to post a synopsis here.  (it will be rather long.)

Moreover, the analysis would be limited to NIS.NSF.NET, as that is what
I looked at.  Analysis of many vendors' implementations would take a lot
more work than *that*.  Perhaps your organization is interested in
funding a series of such studies?

My effort was merely to confirm my prior subjective observation with
objective, concrete numbers.

> Your posting is inflammatory, filled
> with unsubstantiated innuendo, and more typical of a
> rumour-monger than of someone interested in learning
> and in helping others.

Actually, the posting was well substantiated with several megabytes of
packet traces.  I deleted most of them a week or so ago, but if you'd
like, I'd be glad to flood your personal mailbox with the 300,000 or so
remaining (I think that would be a bit much for the entire mailgroup).
Send me a private message if you're sure you want them.

> If the tone of your original note to the supporters of TCP/IP
> in IBM containing the results of your testing was in the
> same vein, I don't wonder that you received no response from them
> (as I intuit from the bitterness in your note).

Unfortunately, your intuition is wrong.  If you had read part 1 carefully,
you would have realized that many messages were exchanged.  In fact,
10 of the now 17 hours devoted to this little project have been message
handling.  The "bitterness" arises from the virtual storm of mail due
to the political finger-pointing that arose in private messages.

> For the record, we run FAL 1.2.1 on a 4381 with a BTI ELC.
> We began with an 8232 and performance *was* abysmal.
> It is much better now...

For the record, I neither know nor care about the specific *hardware*
which was being compared.  (How was I to know, with no host DNS
record, and no announcement in the FTP welcome message?)
The hardware is whatever IBM chose to demonstrate their best to the
internet community.  I merely noted the machines which were being
compared, in very little detail, so that others would understand
the basis for comparison (more hops, heavy load, etc).

I am glad that with a different machine you have had better performance
(your notation means nothing to me).  I have been informed FAL performs
reasonably well with certain hardware in a direct connection with a
high bandwidth.  That does not speak to the issue of the internet as
a whole, where links may be lossy, gateways congested, etc.

I also have a number of messages saying that they have given up
entirely on NIS.NSF.NET, including one with a much more direct
connection than I have.  Also, messages thanking me for the posting....

Bill Simpson
   09998was@ibm.cl.msu.edu
   09998was@msu.bitnet