[comp.sys.isis] Long haul service, bypass mode communication

ken@gvax.cs.cornell.edu (Ken Birman) (02/13/90)

>From: Jacob Levy <jlevy%ee.technion.ac.il@CORNELLC.cit.cornell.edu>

>Ken

>I am at present unclear how and when the bypass code gets used.
>Also, could you post (or send to me if I missed it) an
>explanation on how to use the long haul stuff. This is something
>I will probably use to set up links between the Technion and
>Xerox PARC once my language (called Janus) is operational.

>Thanks,

>        --Jacob

I'll post the relevant manual pages.  With regard to "bypass" mode,
the rules are fairly simple.  The basic broadcast interface in ISIS is:
	bcast(gaddr_p, entry, "out-format", out-args, nwanted, ...);
Since the extended format supports a list of addresses, say that the dests
of a given broadcast B are { D0, D1, ..., Dk } where each Di is a group
address or a process address.

We use bypass mode when all of the following hold:
	1) The destination list has exactly one element
	2) This element must be a group address
	3) The sender must belong to the group
	4) The broadcast type must be one of fbcast, cbcast and abcast.
There are, however, a few additional cases:
	5) bypass will also be used with the new "process list" mechanism
	   (which lets you send to subsets of a group's total membership)
	6) bypass mode will also be used on replies to a bypass broadcast;
	   i.e. you do reply(mp) and <mp> was received in bypass mode
	7) bypass mode will be used even when the bcast_l "x" option is used.
... anything else will be routed via the old ISIS protocols (transparently).
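
Rules 1-4 and 6 can be read as a single yes/no test.  What follows is
only a sketch of that test in C, not code from ISIS: the struct, the
enums and the function name are all made up for illustration, and rule 5
(process lists) is left out because nothing above says how such a send is
described.  Rule 7 needs no field, since the "x" option does not change
the outcome.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration only; none of this is an ISIS declaration. */
enum dest_kind  { DEST_GROUP, DEST_PROCESS };
enum bcast_kind { KIND_FBCAST, KIND_CBCAST, KIND_ABCAST, KIND_OTHER };

struct bcast_desc {
    size_t          n_dests;             /* length of the destination list  */
    enum dest_kind  first_dest;          /* kind of the first destination   */
    bool            sender_is_member;    /* sender belongs to that group    */
    enum bcast_kind kind;                /* which broadcast primitive       */
    bool            reply_to_bypass_msg; /* reply(mp), mp arrived via bypass */
};

/* Returns true when the rules in the post would select bypass mode. */
bool uses_bypass(const struct bcast_desc *b)
{
    assert(b != NULL);

    /* Rule 6: replies to a bypass broadcast also go out in bypass mode. */
    if (b->reply_to_bypass_msg)
        return true;

    /* Rules 1-4: one destination, a group, sender is a member,
       and the type is fbcast, cbcast or abcast. */
    return b->n_dests == 1
        && b->first_dest == DEST_GROUP
        && b->sender_is_member
        && (b->kind == KIND_FBCAST ||
            b->kind == KIND_CBCAST ||
            b->kind == KIND_ABCAST);
}
```

Anything for which this returns false would take the old ISIS protocols,
which per the above happens transparently.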

Bypass messages are sent via a direct "transport" facility.  The usual
one is built with UDP messaging and hence has cost linear in the number of
destinations.  On a SUN 3/60 these are some approximate performance
estimates (we will publish more carefully measured ones, so don't hold me
to these):

	1) Roughly 0.6ms to build the message and get it out of the ISIS clib
	   (less for fbcast, which needs less ordering information)
	2) <n dests> * 1.5ms to send the packets via UDP (assumes small msg)
	3) On each remote dest, about 2ms to receive the packet and get it
	   back into the ISIS clib, including memory allocation
	4) For cbcast and abcast, another 0.7ms to unpack the ordering
	   information and apply it locally.
	5) For abcast from non-token holders: add one more cbcast to set the
	   order; abcast from the abcast token holder has no such extra cost.
	   (The abcast token holder is determined by ISIS, not by you).

	Total for a typical "RPC": between 9.6ms and 11ms over CBCAST,
	depending on load on your ether, message size, etc.  This is close to
	the SUN figures.  (Over fbcast our times drop a bit, but you lose
	the CBCAST guarantees.  If you pre-allocate tasks and use isis_wait
	over a dedicated tk_connect pipe, the RPC time might drop as low as 5ms.
	We lose in several ways because of the UNIX select() and recvfrom()
	interfaces.)
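
The 9.6ms lower figure appears to be items 1-4 added up for one
destination (0.6 + 1.5 + 2 + 0.7 = 4.8ms) and doubled for the reply.  The
sketch below does that arithmetic in C, in microseconds to keep it in
integers.  It rests on assumptions: the function and enum names are made
up, the reply is taken to cost the same as the request, remote receives
are taken to overlap so the 2ms is counted once, fbcast is charged the
full 0.6ms build cost (an upper bound), and item 5 (abcast from a
non-token holder) is not modelled.

```c
#include <assert.h>

/* Hypothetical names; the constants are the rough SUN 3/60 figures quoted above. */
enum ordering { ORDER_FBCAST, ORDER_CBCAST }; /* abcast from the token holder
                                                 costs the same as cbcast */

enum {
    BUILD_US   = 600,   /* item 1: build message, leave the ISIS clib      */
    SEND_US    = 1500,  /* item 2: per-destination UDP send (small msg)    */
    RECEIVE_US = 2000,  /* item 3: receive and re-enter clib, per dest     */
    ORDER_US   = 700    /* item 4: unpack ordering info (cbcast/abcast)    */
};

/* Estimated one-way latency, in microseconds, of one bypass broadcast. */
long one_way_us(enum ordering kind, int n_dests)
{
    assert(n_dests >= 1);
    long us = BUILD_US + (long)n_dests * SEND_US + RECEIVE_US;
    if (kind != ORDER_FBCAST)
        us += ORDER_US;
    return us;
}

/* Estimated "RPC" time: a request to one destination plus its reply. */
long rpc_estimate_us(enum ordering kind)
{
    return 2 * one_way_us(kind, 1);
}
```

With these numbers rpc_estimate_us(ORDER_CBCAST) comes to 9600, i.e. the
9.6ms end of the range; load and message size account for the rest.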

Pat Stephenson is working on other transport protocols that use clever
schemes to beat the n*1.5ms + 2ms remote part of this down; he'll report
on his work some time in the future.  These transport protocols plug in
via an interface called isis_transport and are only used if you specify
a new bcast_l option.  Due to concurrency it is hard to give any single formula
for the performance to expect.  Pat is playing with an ethernet multicast
transport protocol now.

Hope this clears up things a little...   I'll post the current long-haul
manual page (actually the "spooler" manual page).