ken@gvax.cs.cornell.edu (Ken Birman) (11/21/89)
Some time ago, I promised to post RPC timing figures for ISIS when run
using the new BYPASS facilities. The table that follows was generated
using the newest and fastest version of this code at Cornell and should
be roughly the same in ISIS V2.0. The test involved a process that times
RPC's to:

  - itself (a loop-back with no communication);
  - a local destination on the same machine, using UDP for IPC;
  - a remote destination (a different machine, still using UDP).

The machines are SUN 3/60's running SUN OS 4.0.2, and the C compiler was
run with optimization level 3 (-O3). Each test measures 100 "runs", and
each "run" measures the time to do 10 RPC calls. The message sent and the
reply have the same size, which is varied in these tests from 2 bytes to
131k bytes. Messages have an additional fixed overhead of about 256 bytes
in this test. A typical line gives the fastest observed time, the average,
and the slowest observed time in milliseconds. The costs associated with
UDP send, recvfrom and select are shown on the extreme right. We use
"sendmsg" rather than "sendto", and this seems to be an unexpected source
of overhead.

The bottom line is that ISIS itself spends about 2.8ms computing, total,
in these runs, and the remainder of the time is spent doing I/O. The 2.0
figure for the loop-back case is .3ms faster because we don't need to
read a message from the wire and reconstruct it, and another .5ms faster
because we skip the ISIS windowed transmission protocol in this case.

By far the dominant cost factor turns out to be the kernel cost of
select, recvfrom, sendto and sendmsg in these examples. Dynamic memory
allocation, packet (re)construction and lightweight tasking overheads
show up after this. For example, if you break down the 10ms for the local
RPC 2-byte case, you get about 7.2ms in the IO calls and select, .5ms in
the windowing code, .3ms to rebuild the packet, and 2.0ms doing task
create and building the message and reply. Note that one can get a UDP
ping-pong for 2-byte packets to run much faster -- I get a figure of
between 4.6 and 5.0ms -- but the cost of sending larger packets (due to
ISIS overhead), calling select on two open file descriptors, and using
sendmsg with 3 or 4 "iovector" entries rather than just 1 causes a
significant overhead amounting to at least 2.2ms; the cost varies with
the packet size.

Bandwidth runs at between 280kbytes and 360kbytes/second for the larger
packet sizes. For a 10Mbit ether, this is pretty respectable (TCP is
faster but has a kernel implementation).

                    [   MIN     AVG     MAX]
    self        2:  [   2.0,    2.0,    2.0]
    self     2048:  [   2.0,    2.0,    2.0]
    self     4096:  [   2.0,    2.0,    2.0]
    self     8192:  [   2.0,    2.0,    2.0]
    self    16384:  [   2.0,    2.0,    2.0]
    self    32768:  [   2.0,    2.0,    2.0]
    self    65536:  [   2.0,    2.0,    2.0]
    self   131072:  [   2.0,    2.0,    2.0]
    self   262144:  [   2.0,    2.0,    2.0]
    local       2:  [  10.0,   10.0,   10.0]
    local    2048:  [  20.0,   20.0,   20.0]
    local    4096:  [  22.0,   23.0,   24.0]
    local    8192:  [  44.0,   44.3,   46.0]
    local   16384:  [  82.0,   82.0,   84.0]
    local   32768:  [ 146.0,  147.4,  148.0]
    local   65536:  [ 284.0,  284.5,  286.0]
    local  131072:  [ 562.0,  563.1,  566.0]
    remote      2:  [  12.0,   12.0,   12.0]
    remote   2048:  [  22.0,   23.8,   24.0]
    remote   4096:  [  32.0,   32.0,   32.0]
    remote   8192:  [  58.0,   58.0,   60.0]
    remote  16384:  [ 108.0,  108.9,  110.0]
    remote  32768:  [ 196.0,  197.9,  200.0]
    remote  65536:  [ 374.0,  381.3,  392.0]
    remote 131072:  [ 772.0,  819.7,  888.0]

... we've run up to a megabyte of data at a time, but the bandwidth
actually starts to drop off above 131k because of flow control effects.
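For concreteness, here is a minimal sketch of the shape of such a timing
loop -- 100 runs, each timing 10 UDP round trips of a fixed-size packet
and reporting [min, avg, max] per-RPC times in milliseconds. This is not
the actual ISIS test harness; the peer address, port, and the existence
of an echo process on the far end are assumptions, and real ISIS messages
carry the extra headers and iovector entries discussed above.

    /* Minimal UDP ping-pong timer: reports [min, avg, max] ms per RPC. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/time.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    #define PEER_ADDR    "128.84.0.1" /* placeholder: echo peer's address */
    #define PEER_PORT    6000         /* placeholder port */
    #define RUNS         100
    #define RPCS_PER_RUN 10
    #define MSG_SIZE     2048         /* vary this: 2, 2048, 4096, ... */

    int main()
    {
        int s = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in peer;
        char buf[MSG_SIZE];
        struct timeval t0, t1;
        double ms, min = 1e9, max = 0.0, sum = 0.0;
        int run, i;

        memset(buf, 0, sizeof buf);
        memset(&peer, 0, sizeof peer);
        peer.sin_family = AF_INET;
        peer.sin_port = htons(PEER_PORT);
        peer.sin_addr.s_addr = inet_addr(PEER_ADDR);

        for (run = 0; run < RUNS; run++) {
            gettimeofday(&t0, 0);
            for (i = 0; i < RPCS_PER_RUN; i++) {
                /* one "RPC": send the request, block for a same-size reply */
                sendto(s, buf, sizeof buf, 0,
                       (struct sockaddr *)&peer, sizeof peer);
                recvfrom(s, buf, sizeof buf, 0, 0, 0);
            }
            gettimeofday(&t1, 0);
            ms = ((t1.tv_sec - t0.tv_sec) * 1e6 +
                  (t1.tv_usec - t0.tv_usec)) / 1000.0 / RPCS_PER_RUN;
            if (ms < min) min = ms;
            if (ms > max) max = ms;
            sum += ms;
        }
        printf("[%6.1f, %6.1f, %6.1f]\n", min, sum / RUNS, max);
        return 0;
    }

Running the echo peer on the same machine corresponds roughly to the
"local" rows above and on a second SUN 3/60 to the "remote" rows, minus
the ISIS message-building, windowing and select costs described earlier.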
I see little point in trying to trim more time off the ISIS part of this
under SUN UNIX, but we might look at these figures again under MACH,
where the local IPC is much faster. Multicast figures coming soon...

Ken
mak@hermod.cs.cornell.edu (Mesaac Makpangou) (11/22/89)
In article <34468@cornell.UUCP> Ken writes:
>For example, if you break down the 10ms for the local RPC 2-byte case,
>you get about 7.2ms in the IO calls and select, .5ms in the windowing
>code, .3ms to rebuild the packet, and 2.0ms doing task create and
>building the message and reply. Note that one can get a UDP ping-pong
>for 2-byte packets to run much faster -- I get a figure of between 4.6
>and 5.0ms -- but the cost of sending larger packets (due to ISIS
>overhead), calling select on two open file descriptors, and using
>sendmsg with 3 or 4 "iovector" entries rather than just 1 causes a
>significant overhead amounting to at least 2.2ms;

Is the number of "iovector" entries equal to the number of fields
(both user and system fields) in an ISIS message?

> the cost varies with the packet size.

The general argument you have often used to advocate the piggybacking
mechanism in previous versions of the system is that you believed there
was no significant overhead due to this mechanism. Does the fact that
your measurements show that "the cost varies with the packet size"
change your mind?

>Bandwidth runs at between 280kbytes and 360kbytes/second for the larger
>packet sizes. For a 10Mbit ether, this is pretty respectable (TCP
>is faster but has a kernel implementation).

A year ago, I made a few measurements of TCP on a SUN 3/60 at the user
level. I got a bandwidth of about 280kbytes/second with packets of size
1024 bytes, but I noticed that with packets of size 1400 bytes the
bandwidth was around 250kbytes/second. For certain reasons, I did not
try larger packets. Do you mean that for some larger TCP packet size I
could get better bandwidth than I got with 1024 bytes? Does anyone know
which TCP packet size provides the fastest user-level bandwidth?

A more general question:
-----------------------
Why 2, 2048, 4096, 8192, ...? Most people provide measurements for a
NULL RPC (i.e., no arguments and no processing). Why don't you provide
this figure too? Is there a particular reason, or did it just work out
that way? The RPC literature says that RPC messages tend to be small
(e.g., 4, 16, 32, 64, 256, 512, 1024 bytes). Is there any particular
reason to jump from 2 bytes to 2kbytes?

Mesaac
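For reference, a user-level TCP bandwidth test of the kind described
above might look like the sketch below. This is an assumption about the
shape of such a test, not the program actually used; the peer address
and port are placeholders, and a process on the far end that reads and
discards the data is assumed. Varying WRITE_SIZE is what compares the
1024-byte and 1400-byte cases.

    /* Minimal user-level TCP bandwidth test: write TOTAL_BYTES in
       WRITE_SIZE chunks and report the observed throughput. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/time.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    #define PEER_ADDR   "128.84.0.2"   /* placeholder: data sink's address */
    #define PEER_PORT   6001           /* placeholder port */
    #define TOTAL_BYTES (4L * 1024 * 1024)
    #define WRITE_SIZE  1024           /* try 1024, 1400, 2048, ... */

    int main()
    {
        char buf[WRITE_SIZE];
        struct sockaddr_in peer;
        struct timeval t0, t1;
        long sent = 0;
        double secs;
        int n, fd = socket(AF_INET, SOCK_STREAM, 0);

        memset(buf, 0, sizeof buf);
        memset(&peer, 0, sizeof peer);
        peer.sin_family = AF_INET;
        peer.sin_port = htons(PEER_PORT);
        peer.sin_addr.s_addr = inet_addr(PEER_ADDR);
        if (connect(fd, (struct sockaddr *)&peer, sizeof peer) < 0)
            return 1;

        gettimeofday(&t0, 0);
        while (sent < TOTAL_BYTES) {
            /* each write() is one "packet" from the user's point of view;
               the kernel is free to repacketize the stream on the wire */
            n = write(fd, buf, sizeof buf);
            if (n < 0)
                return 1;
            sent += n;
        }
        gettimeofday(&t1, 0);
        secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
        printf("%.0f kbytes/second user-level bandwidth\n",
               sent / secs / 1024.0);
        return 0;
    }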
ken@gvax.cs.cornell.edu (Ken Birman) (11/22/89)
In article <34505@cornell.UUCP> mak@cs.cornell.edu (Mesaac Makpangou) writes:
>
>Is the number of "iovector" entries equal to the number of fields
>(both user and system fields) in an ISIS message?

Our inter-site messages are packed, one or more per UDP transmission. If
there are <n> messages in a UDP packet we send <n+1> io vectors, plus one
additional one for each indirect data reference from a %*X format item
(new in ISIS V2.0).

>The general argument you have often used to advocate the piggybacking
>mechanism in previous versions of the system is that you believed there
>was no significant overhead due to this mechanism.
>
>Does the fact that your measurements show that "the cost varies with
>the packet size" change your mind?

The BYPASS code still uses piggybacking, but not for the types of
operations we measured (which involve direct IO between processes
without involving the protocols server). By the time you pay the 20ms or
so of overhead for going through protos, I still doubt that a 2 or 3ms
difference in the inter-site communication cost for sending an extra
chunk of data makes any significant difference. Basically, IO costs are
completely constant for a given number of INET packets, each of which
holds 1500 bytes.

I continue to feel that piggybacking has no major performance impact on
ISIS, and that it isn't happening very much in "real" applications.
Unfortunately, the issue is hard to quantify and nobody has wanted to
build a simulation to study it in more detail. Note also that with the
new scope features built into the ISIS cbcast algorithm, piggybacking is
limited in any case.

There are situations where piggybacking would kill you. A good example
is our "YP" server, which wouldn't want to respond to requests using
cbcast. For cases like this, I am adding a reply_l("f"...) option to
ISIS in V2.0, which will use a FIFO broadcast instead of cbcast for its
reply. But this only matters to people designing servers, which seems to
be a small (empty set?) subgroup of ISIS users.

> .... some questions about TCP

My comment was based on hearsay; I didn't actually measure TCP
performance.

>Most people provide measurements for a NULL RPC (i.e., no arguments and
>an empty reply). Why don't you provide this figure too?

Why not 16, 32, 64, 128...? This is a good point. Our test showed
essentially identical performance (to a fraction of a millisecond) for
all packet sizes from 0 to 1k. 2k forces UDP to send two INET packets,
and that is the first point at which one sees a performance change. I
didn't use 0 because my test program dumped core in that case. I need to
check into this, obviously, but the figure would have been identical for
0...1350 or so in any case, with the first change occurring as ISIS
switched to two INET packets instead of one.

Ken
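To make the <n+1> io vector arrangement concrete, here is a minimal
sketch of packing <n> message bodies behind a single packet header and
handing them to the kernel with one sendmsg() call. This is an
illustration under assumed header and message layouts, not the actual
ISIS transport code, and it omits the extra io vectors used for %*X
indirect data references.

    /* Gather-send <n> packed messages as one UDP packet via sendmsg(). */
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <netinet/in.h>

    #define MAX_MSGS 8

    struct pkt_hdr { int n_msgs; int total_len; };  /* hypothetical header */

    /* Send n_msgs message bodies to dest over UDP socket s in one packet. */
    int send_packed(int s, struct sockaddr_in *dest,
                    char *bodies[], int lens[], int n_msgs)
    {
        struct pkt_hdr hdr;
        struct iovec iov[MAX_MSGS + 1];
        struct msghdr mh;
        int i;

        hdr.n_msgs = n_msgs;
        hdr.total_len = 0;

        iov[0].iov_base = (char *)&hdr;     /* io vector 1: the packet header */
        iov[0].iov_len = sizeof hdr;
        for (i = 0; i < n_msgs; i++) {      /* io vectors 2..n+1: the messages */
            iov[i + 1].iov_base = bodies[i];
            iov[i + 1].iov_len = lens[i];
            hdr.total_len += lens[i];
        }

        memset(&mh, 0, sizeof mh);
        mh.msg_name = (char *)dest;         /* UDP destination address */
        mh.msg_namelen = sizeof *dest;
        mh.msg_iov = iov;
        mh.msg_iovlen = n_msgs + 1;

        /* one system call sends the header plus all messages without
           first copying them into a contiguous user-level buffer */
        return sendmsg(s, &mh, 0);
    }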