[comp.os.mach] The value of microbenchmarking

af@cs.cmu.edu (Alessandro Forin) (03/18/91)

In article <9337@star.cs.vu.nl>, ast@cs.vu.nl (Andy Tanenbaum) writes:
> 
> What are the current figures for the 3.0 microkernel for sending a null
> message from user space on one machine over the Ethernet to another
> user process and then back, i.e. the null RPC time?  Also, what is the
> maximum user-to-user bandwidth in 3.0?  If possible, what are they
> on Sun 3/60s, to compare them with the numbers I published in the Dec. 1990
> CACM.
> 
> Andy Tanenbaum (ast@cs.vu.nl)

It is customary in our field to use microbenchmarks such as ast's
when talking about a system, for various (practical) reasons.  However,
that is only meaningful for comparison purposes when all other
things are equal.  In this case I strongly fear we are comparing apples
and oranges.
I was, for instance, surprised to hear from a friend who visited
Denmark last summer that Amoeba is quite far from being a usable system.
Certainly things have improved since, but I will only be willing to
compare Mach and Amoeba seriously when I will be able to:

1) login from testarossa.mach.cs.cmu.edu to a machine running Amoeba
2) compile a C program, maybe run some Lisp code
3) send/receive mail locally
4) locally produce a paper describing my impressions of the system
5) come back a couple weeks later and find all my files still there

I can certainly offer ast an account on testarossa, which
has been a Mach 3.0 system for quite some time, to prove that it
meets all the above criteria.
[A quick way to summarize the above tests is to answer the question
 "Does Andy Tanenbaum use Amoeba as his own operating system?".]

Please note that Sprite and the V kernel both pass, as far as I know,
the above tests.  In addition, I have a Sprite distribution tape that
I received at no charge and installed myself on my machine, but that's
a story for another day...

Respectfully yours,
sandro-
PS: Incidentally, does the "local RPC" optimization take into account
heterogeneity?  I mean, what are the times between a Sun 3/60 and
a VaxStation 3100?  Just curious.

tdw@uk.ac.cam.cl (Tim Wilson) (03/23/91)

In comp.os.mach article <1991Mar18.155406.1320@cs.cmu.edu>
af@cs.cmu.edu (Alessandro Forin) wrote:
[...]
> I will only be willing to compare Mach and Amoeba seriously when I
> will be able to:

> 1) login from testarossa.mach.cs.cmu.edu to a machine running Amoeba
> 2) compile a C program, maybe run some Lisp code
> 3) send/receive mail locally
> 4) locally produce a paper describing my impressions of the system
> 5) come back a couple weeks later and find all my files still there
[...]

Isn't a comparison of UNIX emulations just as much a microbenchmark as
null RPC times?  

Tim
--
Tim Wilson

Univ of Cambridge Computer Laboratory, Cambridge, UK	(Not representing them)
tdw@cl.cam.ac.uk     ...!uunet!mcsun!ukc!cam-cl!tdw     Tel: +44 223 334626

af@cs.cmu.edu (Alessandro Forin) (03/26/91)

> Isn't a comparison of UNIX emulations just as much a microbenchmark as
> null RPC times?  

I did not mention Unix in my post at all, and it was not assumed in any
way whatsoever.  As a matter of fact, we can pass most of those tests with
our MS-DOS emulator as well :-)) But you'll agree with me that an operating
system that cannot fulfill the needs of its users (by whatever means and
standards) is not an operating system but something else.  To qualify as a
"general purpose multiuser operating system" I believe the tests I mentioned
can be considered the bare minimum, especially in these days of heavy networking.

That having been said, the answer to your question is no.  What I have
described is a functionality test, not a performance test.  After a system
passes the functionality tests [i.e. we are sure it is a complete system,
there won't be surprises in it for users] we can go off to performance
testing as much as you wish and have a lot of fun in the process too.
Usually, microbenchmarks tend to be significant only for knowledgeable
people [e.g. those who have implemented a message passing kernel understand
the implications of certain numbers better than others], while
macrobenchmarks such as a large compilation or something are easily
understood and useful to many more people.  But each one is good in its own
way.

To those who have done it, comparing two full-blown Unix emulations is by no
means a microbenchmark.
As for its value, it is my experience that there is a lot to be
learned and understood by doing that as well.  We have tried to convey this and
other impressions in the USENIX paper.  At the same conference there was a
paper by the V kernel group on the same subject: by reading both papers I
think one can get a better, but still partial, feeling for the work involved
and for its value, merits and limits.

BTW, I did not mean the list of OSes in the previous post to be complete in
any way whatsoever, just a couple of examples of other "research operating
systems" with which meaningful comparisons vis-a-vis Mach have been made.
There are others, and they know who they are.

sandro-