ken@gvax.cs.cornell.edu (Ken Birman) (10/24/90)
I've had a few questions about how to measure the performance of ISIS, most recently from a group at LANL. I know that many groups like to do these sorts of measurements locally, so I want to outline the issues and the approach we recommend.

First, keep in mind that ISIS has two sets of protocols: the "bypass" ones and the old non-bypass ones in "protos". To get the bypass protocols in version V2.1, you compile with -DBYPASS. In Version 3.0 and beyond, the bypass protocols will be the default and the role of the protos ones will be greatly diminished, because the pg_client interface will begin to support bypass communication too. This is important, because the old protocols had two performance problems:

1) They were slow structurally (an RPC to protos plus an IPC to
   deliver messages to clients -- typical costs were in the 20-25ms
   range).

2) They tended to "congest" easily because of the large amounts of
   memory needed in protos (mostly for lightweight tasks).

Both of these problems are largely eliminated in the bypass case, although for the time being the bypass code is only used under certain restrictions, documented in the man pages and in the manual itself.

Next, you need to decide what you plan to measure:

1) Throughput-a: the amount of data per second ISIS can pump over a
   link.

2) Throughput-b: the number of multicasts per second ISIS can do in a
   group or system-wide.

3) Latency-a: the delay associated with starting an asynchronous
   multicast.

4) Latency-b: the delay associated with an RPC-style communication
   (here you also need to think about who sent the reply: a local
   process or a remote one?).

5) Latency-c: the delay associated with an all-replies multicast.

Other issues to control are whether the sender also gets a copy of the message (i.e., whether or not to use option "x"), and the type of multicast to test (we recommend cbcast, since cbcast is the main protocol for most of ISIS).

Next, there is the question of structuring the test program itself.
You will want a controlled test -- one in which you know pretty much what is going on. ISIS is predictable only if you know how many tasks are active and what they are up to; if you do a test that forks off a lot of work in the background and trusts the scheduler to "even things out", you may instead see the system thrash, or at least see erratic performance. For example, it would be a bad idea simply to see how many bcasts you can fork off with option "f" at one time. The result is more or less a fluke of the point at which the system congests and won't measure anything reasonable. Moreover, when severely congested, performance tends to degrade further (paging becomes a factor because the programs get so big).

Examples of good test sequences:

1) RPC style: one thread does, say, 100 cbcasts waiting for replies,
   and times the total elapsed time.

2) Async IPC style: one thread does, say, 100 cbcasts with no
   replies, and times the total elapsed time. With synchronized
   clocks, it is easy to get bandwidth and latency-to-delivery
   numbers here too.

3) "Maximum presented load" style: fork off a bunch of threads
   (t_fork), each of which loops doing a test of type 1 or 2. Measure
   the performance of each bcast, but also of the system as a whole.
   You will see some very strange numbers when the bypass protocols
   are not in use and congestion starts to kick in. Problems will also
   show up if your systems start to page heavily due to large virtual
   address spaces, or if you aren't careful about the layout of the
   programs on your network and put, say, three or four on one machine
   (inviting UNIX to run its somewhat unpredictable scheduling
   algorithms).

To guard against this while testing your performance test, we suggest that you run "top", "vmstat", and/or "prstat". (The last is an ISIS utility; you'll be watching its "congestion" flags.)

I hope this helps. If you see figures that deviate substantially from what you expected, drop us a note...

Ken