[comp.os.mach] Operating system sizes

gordoni@chook.adelaide.edu.au (Gordon Irlam) (03/06/91)

Followups directed to comp.os.misc.

This fell into a black hole the first time I tried to post it.  The
original discussion which prompted this was an attempt to compare Mach
to various other flavours of unix by looking at the size of the
executable image.

One of the problems with comparing text sizes is the difference in
machine architectures and compilers.  Here is another (flawed) attempt
at comparing the sizes of a few operating systems.

Presented below are some rough measurements of the number of lines of
kernel source code for a number of different operating systems (the
counts include comments and blank lines).

                          Kernel code
                         (lines / 1000)

Synthesis (Sun 3)               5    experimental
Plan 9 (SG Power)              15    experimental
V (Sun 3)                     >15    experimental
Unix 32/V (VAX)                17    basic unix
Minix 1.5 (IBM PC)             30    basic unix
Ninth Edition Unix (Sun 3)     80    unix
BSD 4.3 (VAX)                  90    unix
BSD 4.3 Tahoe (VAX)           100    unix
System V R3.2 (3b2)           120    unix
SunOS 4.03 (Sun 3 + Sun 4)    440    unix
Umax 4.2 (Multimax)           280    multi unix
Mach 2.0 (VAX)                140    multi unix (minimal)
Mach 2.0 (VAX)                400    multi unix (full)
Mach 3.0 (80386)              100    multi distributed kernel
Chorus 3.2 (Compaq 386)        60    multi distributed kernel
Chorus 3.2 (Compaq 386)       200    multi distributed kernel and unix

All these figures are very rough.  Typically I ran du on the sources,
and applied the empirically determined constant of 38 lines per
kilobyte of source.  I then adjusted some of the figures as I saw fit.
This was to account for sources that contained a large number of small
files (where du counts each file as a whole block), or when the kernel
directories contained a significant amount of documentation or dead
code that should not be included; I was after the number of lines of
code that are actually compiled to build a real kernel.  Other factors
that I have not attempted to account for are differences in coding
density, number of comments, the presence of debugging code and so on.
Don't trust any figure to better than, say, 30%.  A few of the
values have been plucked from the net or various research papers.
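
For anyone wanting to repeat the estimate, the procedure described
above can be sketched in a few lines of Python.  This is only an
illustration, not the script actually used: the function names, the
1024-byte block size and the adjustment parameter are assumptions.
Only the 38 lines per kilobyte constant and the 30% error margin come
from the text.

```python
LINES_PER_KB = 38  # empirical constant quoted in the post


def du_style_kilobytes(file_sizes, block_bytes=1024):
    """Sum file sizes the way du does, charging each file whole blocks.

    block_bytes is an assumed block size; real filesystems differ.
    """
    # -(-a // b) is integer ceiling division: a partly used block counts in full.
    blocks = sum(-(-size // block_bytes) for size in file_sizes)
    return blocks * block_bytes / 1024


def estimate_kloc(du_kilobytes, adjustment=1.0):
    """Estimate thousands of source lines from a du total in kilobytes.

    adjustment is a hand-chosen fudge factor (hypothetical parameter), for
    example below 1.0 for a tree with many small files or with
    documentation and dead code that should not be counted.
    """
    return du_kilobytes * LINES_PER_KB * adjustment / 1000.0


def plausible_range(kloc, error=0.30):
    """Return (low, high) bounds for the stated error margin."""
    return (kloc * (1 - error), kloc * (1 + error))
```

As an example of why the adjustment is needed: ten files of 100 bytes
each hold under one kilobyte of text, but du_style_kilobytes([100] * 10)
charges them as ten kilobytes, so the unadjusted line estimate would be
roughly ten times too high.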

Notes follow (slightly inflammatory):

Synthesis, Columbia - 5k.  This is a very experimental system.  I
    guess this is about as small as you can get and still have an
    operating system.

Plan 9, Bell Labs - 15k.  This is supposedly a real distributed
    operating system.  The size is surprising.  Either a lot of
    functionality we have come to expect is not present, or most
    operating systems have accumulated a lot of dead wood over the
    years.  Probably both.  I think I can urge caution at the
    suggestion that Plan 9 is going to replace System V, at least in
    the short term.

V, British Columbia/Stanford - at least 15k.  I have only seen the
    size quoted in some early papers.  I suspect the final version was
    quite a bit larger.  Deceased.

Unix 32/V, Bell Labs - 17k.  The first version of unix to run on a
    VAX.

Minix 1.5, Tanenbaum - 30k.  A "toy" system designed to teach the
    principles of operating systems design.  Significantly larger than
    32/V!

Ninth Edition, Bell Labs - 80k.  A more recent version of unix from
    Bell Labs (1987).  Don't know enough about it to be able to make
    any nasty comments.

BSD 4.3, Berkeley - 90-100k.  Unix has grown by a factor of 5 in its
    lifetime on the VAX, starting from 32/V, with more to come.
    Admittedly an increasing portion of this has been to accommodate
    the ever increasing range of machine models and obscure
    peripherals that are being developed.

System V R3.2, AT&T - 120k.  Cleaner code than BSD, which is a bit
    hacky, but not a very nice system to use.

SunOS 4.03, Sun - 440k.  This includes both the Sun 3 and Sun 4
    versions.  For either one alone I would guess about 350k lines all
    up.  RPC, TMPFS, DLL, NFS, YP, POSIX, SVID, XPG, C2, it's all fun
    stuff, but not without its cost.  I guess one important thing to
    note is the size of a basic Unix system is quite small in
    comparison to the amount of extra stuff added to provide all the
    functionality many people expect.  But is it all really necessary,
    and does it have to be in the kernel?

Umax 4.2, Encore - 280k.  A reasonable attempt at porting BSD to a
    multiprocessor.  Perhaps a more difficult task than it sounds.
    Despite all the documentation in the code, Encore is too scared of
    the complexity to try to modify it so that it performs well.

Mach, Carnegie-Mellon - 100-400k.  Sizes should probably be reduced by
    about 20% to account for the RCS header logs that are included in
    the sources.  Mach version 2.0 was essentially a multiprocessor
    version of BSD along with a few other bits that were re-written.
    Note the very large size of the full system.  A large number of
    obscure device drivers are included, along with experimental
    communication facilities.  Ditching all this and the debugger
    drops the size from 400k to 140k.  A lot of barnacles have
    accumulated on BSD over the years.  Mach 3.0 is an attempt to get
    rid of the barnacles and split the system into a small kernel, and
    a Unix sub-system running on top of the kernel.  I will leave the
    word distributed, which I have used, to someone at CMU to justify
    - I can't.
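
The suggested 20% reduction for RCS header logs is simple arithmetic,
sketched below in Python.  The function name and the dictionary are
made up for illustration; the Mach line counts are the ones from the
kernel table and the 20% is the rough figure given above, so the
results are no more accurate than either.

```python
def without_rcs_logs(kloc, rcs_fraction=0.20):
    """Scale a line count down by the assumed share of RCS header logs."""
    return kloc * (1 - rcs_fraction)


# Kernel sizes in thousands of lines, taken from the table in the post.
mach_kloc = {
    "Mach 2.0 (minimal)": 140,
    "Mach 2.0 (full)": 400,
    "Mach 3.0": 100,
}

for name, kloc in mach_kloc.items():
    print(f"{name}: about {without_rcs_logs(kloc):.0f}k lines without RCS logs")
```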

Chorus 3.2, Chorus Systemes - 60-200k.  A distributed multiprocessor
    kernel developed from the ground up.  The current Unix sub-system
    is based on System V, alas.  Developed outside of the United
    States, and consequently largely ignored inside the United States.

For fun here is the size of the total system including all the
utilities and so on that are needed for a real system.  Includes all
the bin, lib, and sys directories, but not the man and doc
directories.  Varies a bit depending on whether the system comes with
a Fortran compiler and so on, but I attempted to ignore things like
X11.

                           Total code
                         (lines / 1000)

Minix 1.5 (IBM PC)            170
Unix 32/V (VAX)               180
BSD 4.3 (VAX)                 640
System V R3.2 (3b2)           960
Mach 2.0 (VAX)               1000
BSD 4.3 Tahoe (VAX)          1000
Umax 4.2 (Multimax)          1800
SunOS 4.03 (Sun 3 + Sun 4)   2400

If anybody has any figures for the amount of source code in the
Amoeba, OSF/1 and System V R4.0 kernels, could they please post them?
Thanks.


                                            Gordon Irlam
                                            gordoni@cs.adelaide.edu.au