ken@gvax.cs.cornell.edu (Ken Birman) (11/21/89)
Some time ago, I promised to post RPC timing figures for ISIS when run
using the new BYPASS facilities. The table that follows was generated
using the newest and fastest version of this code at Cornell and should
be roughly the same in ISIS V2.0. The test involved a process that times
RPC's to:

  - itself (a loop-back with no communication);
  - a local destination on the same machine, using UDP for IPC;
  - a remote destination (a different machine, still using UDP).

The machines are SUN 3/60's running SUN OS 4.0.2, and the C compiler was
run with optimization level 3 (-O3). Each test measures 100 "runs", and
each "run" measures the time to do 10 RPC calls. The message sent and the
reply have the same size, which is varied in these tests from 2 bytes to
131k bytes. Messages have an additional fixed overhead of about 256 bytes
in this test. A typical line gives the fastest observed time, the average,
and the slowest observed time in milliseconds. The costs associated with
UDP send, recvfrom and select are shown on the extreme right. We use
"sendmsg" rather than "sendto", and this seems to be an unexpected source
of overhead.

The bottom line is that ISIS itself spends about 2.8ms computing, total,
in these runs, and the remainder of the time is spent doing I/O. The 2.0
figure for the loop-back case is .3ms faster because we don't need to
read a message from the wire and reconstruct it, and another .5ms faster
because we skip the ISIS windowed transmission protocol in this case.

By far the dominant cost factor turns out to be the kernel cost of
select, recvfrom, sendto and sendmsg in these examples. Dynamic memory
allocation, packet (re)construction and lightweight tasking overheads
show up after this. For example, if you break down the 10ms for the local
RPC 2-byte case, you get about 7.2ms in the IO calls and select, .5ms in
the windowing code, .3ms to rebuild the packet, and 2.0ms doing task
create and building the message and reply. Note that one can get a UDP
ping-pong for 2-byte packets to run much faster -- I get a figure of
between 4.6 and 5.0ms -- but the cost of sending larger packets (due to
ISIS overhead), calling select on two open file descriptors, and using
sendmsg with 3 or 4 "iovector" entries rather than just 1 causes a
significant overhead amounting to at least 2.2ms; the cost varies with
the packet size.

Bandwidth runs at between 280kbytes and 360kbytes/second for the larger
packet sizes. For a 10Mbit ether, this is pretty respectable (TCP is
faster but has a kernel implementation).

                    [   MIN     AVG     MAX]
    self        2:  [   2.0,    2.0,    2.0]
    self     2048:  [   2.0,    2.0,    2.0]
    self     4096:  [   2.0,    2.0,    2.0]
    self     8192:  [   2.0,    2.0,    2.0]
    self    16384:  [   2.0,    2.0,    2.0]
    self    32768:  [   2.0,    2.0,    2.0]
    self    65536:  [   2.0,    2.0,    2.0]
    self   131072:  [   2.0,    2.0,    2.0]
    self   262144:  [   2.0,    2.0,    2.0]
    local       2:  [  10.0,   10.0,   10.0]
    local    2048:  [  20.0,   20.0,   20.0]
    local    4096:  [  22.0,   23.0,   24.0]
    local    8192:  [  44.0,   44.3,   46.0]
    local   16384:  [  82.0,   82.0,   84.0]
    local   32768:  [ 146.0,  147.4,  148.0]
    local   65536:  [ 284.0,  284.5,  286.0]
    local  131072:  [ 562.0,  563.1,  566.0]
    remote      2:  [  12.0,   12.0,   12.0]
    remote   2048:  [  22.0,   23.8,   24.0]
    remote   4096:  [  32.0,   32.0,   32.0]
    remote   8192:  [  58.0,   58.0,   60.0]
    remote  16384:  [ 108.0,  108.9,  110.0]
    remote  32768:  [ 196.0,  197.9,  200.0]
    remote  65536:  [ 374.0,  381.3,  392.0]
    remote 131072:  [ 772.0,  819.7,  888.0]

... we've run up to a megabyte of data at a time, but the bandwidth
actually starts to drop off above 131k because of flow control effects.
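For concreteness, here is a minimal sketch of the shape of such a timing
loop -- 100 runs, each timing 10 UDP round trips of a fixed-size packet
and reporting [min, avg, max] per-RPC times in milliseconds. This is not
the actual ISIS test harness; the peer address, port, and the existence
of an echo process on the far end are assumptions, and real ISIS messages
carry the extra headers and iovector entries discussed above.

    /* Minimal UDP ping-pong timer: reports [min, avg, max] ms per RPC. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/time.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    #define PEER_ADDR    "128.84.0.1" /* placeholder: echo peer's address */
    #define PEER_PORT    6000         /* placeholder port */
    #define RUNS         100
    #define RPCS_PER_RUN 10
    #define MSG_SIZE     2048         /* vary this: 2, 2048, 4096, ... */

    int main()
    {
        int s = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in peer;
        char buf[MSG_SIZE];
        struct timeval t0, t1;
        double ms, min = 1e9, max = 0.0, sum = 0.0;
        int run, i;

        memset(buf, 0, sizeof buf);
        memset(&peer, 0, sizeof peer);
        peer.sin_family = AF_INET;
        peer.sin_port = htons(PEER_PORT);
        peer.sin_addr.s_addr = inet_addr(PEER_ADDR);

        for (run = 0; run < RUNS; run++) {
            gettimeofday(&t0, 0);
            for (i = 0; i < RPCS_PER_RUN; i++) {
                /* one "RPC": send the request, block for a same-size reply */
                sendto(s, buf, sizeof buf, 0,
                       (struct sockaddr *)&peer, sizeof peer);
                recvfrom(s, buf, sizeof buf, 0, 0, 0);
            }
            gettimeofday(&t1, 0);
            ms = ((t1.tv_sec - t0.tv_sec) * 1e6 +
                  (t1.tv_usec - t0.tv_usec)) / 1000.0 / RPCS_PER_RUN;
            if (ms < min) min = ms;
            if (ms > max) max = ms;
            sum += ms;
        }
        printf("[%6.1f, %6.1f, %6.1f]\n", min, sum / RUNS, max);
        return 0;
    }

Running the echo peer on the same machine corresponds roughly to the
"local" rows above and on a second SUN 3/60 to the "remote" rows, minus
the ISIS message-building, windowing and select costs described earlier.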
I see little point in trying to trim more time off the ISIS part of this
under SUN UNIX, but we might look at these figures again under MACH,
where the local IPC is much faster. Multicast figures coming soon...

Ken
mak@hermod.cs.cornell.edu (Mesaac Makpangou) (11/22/89)
In article <34468@cornell.UUCP> Ken writes:
>For example, if you break down the 10ms for the local RPC 2-byte case,
>you get about 7.2ms in the IO calls and select, .5ms in the windowing
>code, .3ms to rebuild the packet, and 2.0ms doing task create and
>building the message and reply. Note that one can get a UDP ping-pong
>for 2-byte packets to run much faster -- I get a figure of between 4.6
>and 5.0ms -- but the cost of sending larger packets (due to ISIS
>overhead), calling select on two open file descriptors, and using
>sendmsg with 3 or 4 "iovector" entries rather than just 1 causes a
>significant overhead amounting to at least 2.2ms;

Is the number of "iovector" entries equal to the number of fields
(both user and system fields) in an ISIS message?

> the cost varies with the packet size.

The general argument you have often used to advocate the piggybacking
mechanism in previous versions of the system is that you believed there
was no significant overhead due to this mechanism. Does the fact that
your measurements show that "the cost varies with the packet size"
change your mind?

>Bandwidth runs at between 280kbytes and 360kbytes/second for the larger
>packet sizes. For a 10Mbit ether, this is pretty respectable (TCP
>is faster but has a kernel implementation).

A year ago, I made a few measurements of TCP on a SUN 3/60 at the user
level. I got a bandwidth of about 280kbytes/second with packets of size
1024 bytes, but I noticed that with packets of size 1400 bytes the
bandwidth was around 250kbytes/second. For certain reasons, I did not
try larger packets. Do you mean that for some larger TCP packet size I
could get better bandwidth than I got with 1024 bytes? Does anyone know
which TCP packet size provides the fastest user-level bandwidth?

A more general question:
-----------------------
Why 2, 2048, 4096, 8192, ...? Most people provide measurements for a
NULL RPC (i.e., no arguments and no processing). Why don't you provide
this figure too? Is there a particular reason, or did it just work out
that way? The RPC literature says that RPC messages tend to be small
(e.g., 4, 16, 32, 64, 256, 512, 1024 bytes). Is there any particular
reason to jump from 2 bytes to 2kbytes?

Mesaac
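For reference, a user-level TCP bandwidth test of the kind described
above might look like the sketch below. This is an assumption about the
shape of such a test, not the program actually used; the peer address
and port are placeholders, and a process on the far end that reads and
discards the data is assumed. Varying WRITE_SIZE is what compares the
1024-byte and 1400-byte cases.

    /* Minimal user-level TCP bandwidth test: write TOTAL_BYTES in
       WRITE_SIZE chunks and report the observed throughput. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/time.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    #define PEER_ADDR   "128.84.0.2"   /* placeholder: data sink's address */
    #define PEER_PORT   6001           /* placeholder port */
    #define TOTAL_BYTES (4L * 1024 * 1024)
    #define WRITE_SIZE  1024           /* try 1024, 1400, 2048, ... */

    int main()
    {
        char buf[WRITE_SIZE];
        struct sockaddr_in peer;
        struct timeval t0, t1;
        long sent = 0;
        double secs;
        int n, fd = socket(AF_INET, SOCK_STREAM, 0);

        memset(buf, 0, sizeof buf);
        memset(&peer, 0, sizeof peer);
        peer.sin_family = AF_INET;
        peer.sin_port = htons(PEER_PORT);
        peer.sin_addr.s_addr = inet_addr(PEER_ADDR);
        if (connect(fd, (struct sockaddr *)&peer, sizeof peer) < 0)
            return 1;

        gettimeofday(&t0, 0);
        while (sent < TOTAL_BYTES) {
            /* each write() is one "packet" from the user's point of view;
               the kernel is free to repacketize the stream on the wire */
            n = write(fd, buf, sizeof buf);
            if (n < 0)
                return 1;
            sent += n;
        }
        gettimeofday(&t1, 0);
        secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
        printf("%.0f kbytes/second user-level bandwidth\n",
               sent / secs / 1024.0);
        return 0;
    }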
ken@gvax.cs.cornell.edu (Ken Birman) (11/22/89)
In article <34505@cornell.UUCP> mak@cs.cornell.edu (Mesaac Makpangou) writes:
>
>Is the number of "iovector" entries equal to the number of fields
>(both user and system fields) in an ISIS message?

Our inter-site messages are packed, one or more per UDP transmission. If
there are <n> messages in a UDP packet we send <n+1> io vectors, plus one
additional one for each indirect data reference from a %*X format item
(new in ISIS V2.0).

>The general argument you have often used to advocate the piggybacking
>mechanism in previous versions of the system is that you believed there
>was no significant overhead due to this mechanism.
>
>Does the fact that your measurements show that "the cost varies with
>the packet size" change your mind?

The BYPASS code still uses piggybacking, but not for the types of
operations we measured (which involve direct IO between processes
without involving the protocols server). By the time you pay the 20ms or
so of overhead for going through protos, I still doubt that a 2 or 3ms
difference in the inter-site communication cost for sending an extra
chunk of data makes any significant difference. Basically, IO costs are
completely constant for a given number of INET packets, each of which
holds 1500 bytes.

I continue to feel that piggybacking has no major performance impact on
ISIS, and that it isn't happening very much in "real" applications.
Unfortunately, the issue is hard to quantify and nobody has wanted to
build a simulation to study it in more detail. Note also that with the
new scope features built into the ISIS cbcast algorithm, piggybacking is
limited in any case.

There are situations where piggybacking would kill you. A good example
is our "YP" server, which wouldn't want to respond to requests using
cbcast. For cases like this, I am adding a reply_l("f"...) option to
ISIS in V2.0, which will use a FIFO broadcast instead of cbcast for its
reply. But this only matters to people designing servers, which seems to
be a small (empty set?) subgroup of ISIS users.

> .... some questions about TCP

My comment was based on hearsay; I didn't actually measure TCP
performance.

>Most people provide measurements for a NULL RPC (i.e., no arguments and
>an empty reply). Why don't you provide this figure too?

Why not 16, 32, 64, 128...? This is a good point. Our test showed
essentially identical performance (to a fraction of a millisecond) for
all packet sizes from 0 to 1k. 2k forces UDP to send two INET packets,
and that is the first point at which one sees a performance change. I
didn't use 0 because my test program dumped core in that case. I need to
check into this, obviously, but the figure would have been identical for
0...1350 or so in any case, with the first change occurring as ISIS
switched to two INET packets instead of one.

Ken
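To make the <n+1> io vector arrangement concrete, here is a minimal
sketch of packing <n> message bodies behind a single packet header and
handing them to the kernel with one sendmsg() call. This is an
illustration under assumed header and message layouts, not the actual
ISIS transport code, and it omits the extra io vectors used for %*X
indirect data references.

    /* Gather-send <n> packed messages as one UDP packet via sendmsg(). */
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <netinet/in.h>

    #define MAX_MSGS 8

    struct pkt_hdr { int n_msgs; int total_len; };  /* hypothetical header */

    /* Send n_msgs message bodies to dest over UDP socket s in one packet. */
    int send_packed(int s, struct sockaddr_in *dest,
                    char *bodies[], int lens[], int n_msgs)
    {
        struct pkt_hdr hdr;
        struct iovec iov[MAX_MSGS + 1];
        struct msghdr mh;
        int i;

        hdr.n_msgs = n_msgs;
        hdr.total_len = 0;

        iov[0].iov_base = (char *)&hdr;     /* io vector 1: the packet header */
        iov[0].iov_len = sizeof hdr;
        for (i = 0; i < n_msgs; i++) {      /* io vectors 2..n+1: the messages */
            iov[i + 1].iov_base = bodies[i];
            iov[i + 1].iov_len = lens[i];
            hdr.total_len += lens[i];
        }

        memset(&mh, 0, sizeof mh);
        mh.msg_name = (char *)dest;         /* UDP destination address */
        mh.msg_namelen = sizeof *dest;
        mh.msg_iov = iov;
        mh.msg_iovlen = n_msgs + 1;

        /* one system call sends the header plus all messages without
           first copying them into a contiguous user-level buffer */
        return sendmsg(s, &mh, 0);
    }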