[comp.os.mach] Context switches? Or something else

ken@gvax.cs.cornell.edu (Ken Birman) (10/13/89)

I was very interested in the two context switch postings for
the NeXT.  I notice that the NeXT benchmark uses native Mach
IPC.  Did the prior benchmark use something else (pipes, UDP, or
TCP)?

I ask because I have been interested in understanding the best
way (read best==fastest) to communicate in Mach and was unclear
on what I lose by running through UDP, rather than using Mach's
"non-standard" native scheme.  Could this 500us context switch
difference reflect a communication cost of some sort?

Has anyone done benchmarks comparing the cost of UDP, "UNIX domain"
TCP and Mach native communication for intra-machine and inter-machine
cases?  I would be very interested in seeing the figures... especially
if they covered scatter gather as well.

Ken Birman

Richard.Draves@CS.CMU.EDU (10/15/89)

> Excerpts from netnews.comp.os.mach: 13-Oct-89 Context switches? Or
> someth.. Ken Birman@gvax.cs.corne (766)

> Has anyone done benchmarks comparing the cost of UDP, "UNIX domain"
> TCP and Mach native communication for intra-machine and inter-machine
> cases?  I would be very interested in seeing the figures... especially
> if they covered scatter gather as well.


Mach IPC doesn't provide for scatter/gather in the sense of
readv/writev.  Mach messages can contain "out-of-line" segments of
memory, which are transferred with copy-on-write VM technology.  They
pop up somewhere in the receiver's address space (somewhere that isn't
being used, of course).  There is no way for the receiver to say where
they should go.  The receiver can use vm_write or vm_copy to move the
memory elsewhere, I suppose.  I've never seen it done.
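
For the curious, here is roughly what the out-of-line case looks
like with the 2.5 primitives.  This is a sketch from memory, not
compiled; the field names may be slightly off, and a region this
small would normally just go inline:

/*
 * Sketch (not compiled) of sending a region out-of-line with the
 * Mach 2.5 primitives.  Field names are from memory and may be
 * slightly off; large regions need the longform type descriptor,
 * elided here.
 */
#include <mach.h>

struct ool_msg {
        msg_header_t    h;
        msg_type_t      t;
        char            *data;          /* a pointer, not the bytes */
};

void
send_ool(port_t dest, char *buf, unsigned int nbytes)
{
        struct ool_msg  m;

        m.h.msg_simple = FALSE;         /* header says: pointers inside */
        m.h.msg_size = sizeof m;
        m.h.msg_type = MSG_TYPE_NORMAL;
        m.h.msg_local_port = PORT_NULL; /* no reply wanted */
        m.h.msg_remote_port = dest;
        m.h.msg_id = 0;

        m.t.msg_type_name = MSG_TYPE_CHAR;
        m.t.msg_type_size = 8;          /* bits per element */
        m.t.msg_type_number = nbytes;
        m.t.msg_type_inline = FALSE;    /* the out-of-line bit */
        m.t.msg_type_longform = FALSE;
        m.t.msg_type_deallocate = FALSE;

        m.data = buf;                   /* kernel maps these pages COW */

        (void) msg_send(&m.h, MSG_OPTION_NONE, 0);
}

On the receive side the same slot comes back holding a pointer to
wherever the kernel chose to map the pages.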

Here are some measurements of local (between two tasks on a single
machine) RPC performance.  I used a Sun-3/60.

For the Mach RPC, I used a Mig interface which sends an integer in the
request and returns an integer in the reply.  The reply message is
actually bigger than the request, because Mig also includes an error
code in the reply.  The important point is that these times include the
overhead of the Mig stubs packing/unpacking the messages.  The client
uses msg_rpc and the server uses msg_receive/msg_send.  The server does
the receive on a port set.  (Port sets can contain multiple receive
rights.  They're like select in that they let a single thread receive
from multiple communication channels.  Most Mach servers use them.)  In
three trials of 10000 iterations each, I got 1.272, 1.298, 1.298
msecs/RPC.
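
Schematically, the server's loop looks like the following.  This is
a sketch, not the benchmark source; error checking and the Mig
packing are omitted, and the reply-port juggling is from memory:

/*
 * Skeleton of the server: allocate a port set, add the service
 * port, then receive/send in a loop.  Mach 2.5 call names as I
 * remember them.
 */
#include <mach.h>

struct int_msg {
        msg_header_t    h;
        msg_type_t      t;
        int             value;
};

void
serve(port_t service_port)
{
        port_t          set;
        struct int_msg  m;

        (void) port_set_allocate(task_self(), &set);
        (void) port_set_add(task_self(), set, service_port);

        for (;;) {
                m.h.msg_local_port = set;       /* receive from the whole set */
                m.h.msg_size = sizeof m;
                (void) msg_receive(&m.h, MSG_OPTION_NONE, 0);

                /* ... unpack the request, do the work ... */

                /* after the receive, msg_remote_port holds the
                 * client's reply port; bounce the message back */
                m.h.msg_local_port = PORT_NULL;
                (void) msg_send(&m.h, MSG_OPTION_NONE, 0);
        }
}

The client's half of the exchange is a single msg_rpc on the same
message buffer, which is why the Mach case needs only three system
calls per RPC.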

For the Unix RPC, I used the same Mig-generated stubs to pack/unpack
messages, so that overhead remains the same and the same number of bytes
are moving through the kernel.  I used PF_UNIX/SOCK_STREAM sockets.  The
client used write/read; the server used select/read/write.  (So an RPC
took five system calls instead of three as in the Mach case.)  In three
trials of 10000 iterations each, I got 3.010, 3.048, 3.038 msecs/RPC. 
The select was only given one file descriptor; I don't know how that
affects it.  (Port sets scale properly.  The time to receive from a port
set with hundreds of ports is the same as the time to receive from a
single port.)  When I removed the select, I got 2.436, 2.436, 2.432
msecs/RPC.  Better, but in my experience Unix servers (like X) tend to
use select.
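
Schematically, the two socket loops were the following (a
reconstruction, not the exact code I ran; the function names are
mine):

/*
 * Ping-pong over a connected PF_UNIX/SOCK_STREAM socket.  The
 * client costs 2 syscalls per RPC, the server 3 with select
 * (2 without).  Sketch only; error checks omitted.
 */
#include <sys/types.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <unistd.h>

int
do_rpc(int s, int arg)
{
        int     reply;

        (void) write(s, (char *)&arg, sizeof arg);
        (void) read(s, (char *)&reply, sizeof reply);
        return (reply);
}

void
serve(int s)
{
        fd_set  fds;
        int     val;

        for (;;) {
                FD_ZERO(&fds);
                FD_SET(s, &fds);
                (void) select(s + 1, &fds, (fd_set *)0, (fd_set *)0,
                    (struct timeval *)0);
                if (read(s, (char *)&val, sizeof val) <= 0)
                        return;         /* peer closed */
                (void) write(s, (char *)&val, sizeof val);
        }
}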

Please let me know if there is some faster way to use sockets.

Rich

Rick.Rashid@CS.CMU.EDU (10/15/89)

For contrast, I did a couple of experiments on VAX-based systems, using
somewhat simpler programs and a version of the Mach IPC code that I
optimized a bit over the summer.  Generally speaking, the relative
performance of Mach calls on VAX-architecture machines is better than on
other Mach-based systems at CMU, because more work historically went
into the machine-dependent trap interface on the Vaxen.

Mach RPC (4 bytes of data both ways):  538 microseconds
Pipe RPC (4 bytes of data both ways): 2498 microseconds
Pipe select (ditto):                  3487 microseconds
Mach RPC (Mig interface version):      608 microseconds

All times were on a MicroVAX III.

Richard.Draves@CS.CMU.EDU (10/15/89)

The Sun-3/60 times I posted were measured on a CS6a.STD+WS kernel.  CS6a
is the CMU name for Mach 2.5.  STD+WS is the kernel configuration; it is
a standard release configuration.  (WS refers to the set of device
drivers included in the build.)

Rich

raveling@isi.edu (Paul Raveling) (10/17/89)

In article <33153@cornell.UUCP>, ken@gvax.cs.cornell.edu (Ken Birman) writes:
> I was very interested by the two context switch postings for
> the NeXT.   I notice that the NeXT benchmark uses native Mach
> IPC.  Did the prior benchmark use something else (pipes, or UDP or
> TCP?)

	It used pipes.  It undoubtedly doesn't show off the best
	of each Unix variant, but it's portable across all of them.
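
	The usual portable form of such a test bounces a one-byte
	token between two processes through a pair of pipes, paying
	two context switches per round trip.  My reconstruction of
	the shape of it (not the actual benchmark source):

/*
 * Portable pipe-based context switch test: bounce one byte
 * between parent and child, two switches per round trip.
 */
#include <stdio.h>
#include <unistd.h>

#define ROUNDS  10000

int
main(void)
{
        int     p1[2], p2[2], i;
        char    c = 'x';

        (void) pipe(p1);
        (void) pipe(p2);
        if (fork() == 0) {              /* child: echo the byte back */
                for (;;) {
                        if (read(p1[0], &c, 1) != 1)
                                _exit(0);
                        (void) write(p2[1], &c, 1);
                }
        }
        for (i = 0; i < ROUNDS; i++) {  /* parent: time this loop */
                (void) write(p1[1], &c, 1);
                (void) read(p2[0], &c, 1);
        }
        return (0);
}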


----------------
Paul Raveling
Raveling@isi.edu