gordoni@chook.adelaide.edu.au (Gordon Irlam) (03/06/91)
Followups directed to comp.os.misc.

This fell into a black hole the first time I tried to post it.

The original discussion which prompted this was an attempt to compare Mach to various other flavours of unix by looking at the size of the executable image. One of the problems with comparing text sizes is the differences in machine architectures and compilers.

Here is another (flawed) attempt at comparing the sizes of a few operating systems. Presented below are some rough measurements of the number of lines of kernel source code for a number of different operating systems (includes comments and blank lines).

                              Kernel code
                              (lines / 1000)

Synthesis (Sun 3)                    5    experimental
Plan 9 (SG Power)                   15    experimental
V (Sun 3)                          >15    experimental
Unix 32/V (VAX)                     17    basic unix
Minix 1.5 (IBM PC)                  30    basic unix
Ninth Edition Unix (Sun 3)          80    unix
BSD 4.3 (VAX)                       90    unix
BSD 4.3 Tahoe (VAX)                100    unix
System V R3.2 (3b2)                120    unix
SunOS 4.03 (Sun 3 + Sun 4)         440    unix
Umax 4.2 (Multimax)                280    multi unix
Mach 2.0 (VAX)                     140    multi unix (minimal)
Mach 2.0 (VAX)                     400    multi unix (full)
Mach 3.0 (80386)                   100    multi distributed kernel
Chorus 3.2 (Compaq 386)             60    multi distributed kernel
Chorus 3.2 (Compaq 386)            200    multi distributed kernel and unix

All these figures are very rough. Typically I ran du on the sources, and applied the empirically determined constant of 38 lines per kilobyte of source. I then adjusted some of the figures as I saw fit. This was to account for sources that contained a large number of small files (where du counts each file as a whole block), or kernel directories that contained a significant amount of documentation or dead code that should not be included; I was after the number of lines of code that are actually compiled to build a real kernel.

Other factors that I have not attempted to account for are differences in coding density, number of comments, the presence of debugging code and so on. Don't trust any of the figures to better than, say, 30%.
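The estimate described above (du size times 38 lines per kilobyte, then a manual adjustment, with roughly 30% uncertainty) can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for the table: the function names, the `excluded_fraction` parameter standing in for the hand adjustments, and the 2400 KB example size are all assumptions made up for the sketch.

```python
LINES_PER_KB = 38  # empirical constant quoted in the post


def estimate_kloc(source_kb, excluded_fraction=0.0):
    """Estimate thousands of source lines from a du-style size in kilobytes.

    excluded_fraction is a hypothetical knob for the manual adjustments the
    post mentions (documentation, dead code, block rounding on small files).
    """
    if not 0.0 <= excluded_fraction < 1.0:
        raise ValueError("excluded_fraction must be in [0, 1)")
    lines = source_kb * LINES_PER_KB * (1.0 - excluded_fraction)
    return lines / 1000.0


def error_band(kloc, tolerance=0.30):
    """Return (low, high) bounds for an estimate at the stated tolerance."""
    return kloc * (1.0 - tolerance), kloc * (1.0 + tolerance)


if __name__ == "__main__":
    # Hypothetical kernel tree occupying 2400 KB according to du.
    kloc = estimate_kloc(2400)
    low, high = error_band(kloc)
    print(f"estimate: {kloc:.0f}k lines (plausibly {low:.0f}k to {high:.0f}k)")
```

The same helper could model a fixed overhead such as header logs by passing, say, `excluded_fraction=0.2`, though any such fraction is a judgment call rather than a measurement.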
A few of the values have been plucked from the net or various research papers.

Notes follow (slightly inflammatory):

Synthesis, Columbia - 5k. This is a very experimental system. I guess this is about as small as you can get and still have an operating system.

Plan 9, Bell Labs - 15k. This is supposedly a real distributed operating system. The size is surprising. Either a lot of functionality we have come to expect is not present, or most operating systems have accumulated a lot of dead wood over the years. Probably both. I think I can urge caution at the suggestion that Plan 9 is going to replace System V, at least in the short term.

V, British Columbia/Stanford - at least 15k. I have only seen the size quoted in some early papers. I suspect the final version was quite a bit larger. Deceased.

Unix 32/V, Bell Labs - 17k. The first version of unix to run on a VAX.

Minix 1.5, Tanenbaum - 30k. A "toy" system designed to teach the principles of operating systems design. Significantly larger than 32/V!

Ninth Edition, Bell Labs - 80k. A more recent version of unix from Bell Labs (1987). Don't know enough about it to be able to make any nasty comments.

BSD 4.3, Berkeley - 90-100k. Unix has grown by a factor of 5 in its lifetime on the VAX starting from 32/V, with more to come. Admittedly an increasing portion of this has been to accommodate the ever increasing range of machine models and obscure peripherals that are being developed.

System V R3.2, AT&T - 120k. Cleaner code than BSD, which is a bit hacky, but not a very nice system to use.

SunOS 4.03, Sun - 440k. This includes both the Sun 3 and Sun 4 versions. For either one alone I would guess about 350k lines all up. RPC, TMPFS, DLL, NFS, YP, POSIX, SVID, XPG, C2, it's all fun stuff, but not without its cost. I guess one important thing to note is that the size of a basic Unix system is quite small in comparison to the amount of extra stuff added to provide all the functionality many people expect.
But is it all really necessary, and does it have to be in the kernel?

Umax 4.2, Encore - 280k. A reasonable attempt at porting BSD to a multiprocessor. Perhaps a more difficult task than it sounds. Despite all the documentation in the code, Encore is too scared of the complexity to try and modify it so that it performs well.

Mach, Carnegie-Mellon - 100-400k. Sizes should probably be reduced by about 20% to account for the RCS header logs that are included in the sources. Mach version 2.0 was essentially a multiprocessor version of BSD along with a few other bits that were re-written. Note the very large size of the full system. A large number of obscure device drivers are included, along with experimental communication facilities. Ditching all this and the debugger drops the size from 400k to 140k. A lot of barnacles have accumulated on BSD over the years. Mach 3.0 is an attempt to get rid of the barnacles and split the system into a small kernel, and a Unix sub-system running on top of the kernel. I will leave the word distributed, which I have used, to someone at CMU to justify - I can't.

Chorus 3.2, Chorus Systemes - 60-200k. A distributed multiprocessor kernel developed from the ground up. The current Unix sub-system is based on System V, alas. Developed outside of the United States, and consequently largely ignored inside the United States.

For fun, here is the size of the total system including all the utilities and so on that are needed for a real system. Includes all the bin, lib, and sys directories, but not the man and doc directories. Varies a bit depending on whether the system comes with a Fortran compiler and so on, but I attempted to ignore things like X11.
                              Total code
                              (lines / 1000)

Minix 1.5 (IBM PC)                 170
Unix 32/V (VAX)                    180
BSD 4.3 (VAX)                      640
System V 3.2 (3b2)                 960
Mach 2.0 (VAX)                    1000
BSD 4.3 Tahoe (VAX)               1000
Umax 4.2 (Multimax)               1800
SunOS 4.03 (Sun 3, Sun 4)         2400

If anybody has any figures for the amount of source code in the Amoeba, OSF/1 and System V R4.0 kernels could they please post them, thanks.

Gordon Irlam
gordoni@cs.adelaide.edu.au