ram@shukra.UUCP (08/30/87)
Hi, I am curious about kernel make times.  Last week I went to the local Unix user group (called SVnet - Silicon Valley ...) where Ron Parsons, Sequent's Marketing Manager, gave a presentation on the Balance 21XXX.  He claimed that a kernel make on the Balance is ~3hrs.  I have seen ads by Amdahl which claim 3 minutes.  I know that this issue was raised by John some time back.

I am on the hunt for kernel make numbers.  Anybody knowing or having those numbers, please be kind enough to mail/post them.  I know that the make time is subject to the number & types of drivers etc.  Include your numbers & whatever other details you want to.

---------------------
Renu Raman				ARPA:ram@sun.com
Sun Microsystems			UUCP:{ucbvax,seismo,hplabs}!sun!ram
M/S 5-40, 2500 Garcia Avenue,
Mt. View, CA 94043
malcolm@spar.UUCP (08/31/87)
In article <26853@sun.uucp> ram%shukra@Sun.COM (Renu Raman, Sun Microsystems) writes:
> I am curious about kernel make times.  Last week I had been to the
> local Unix user group (called SVnet - Silicon Valley ...) where
> Ron Parson - Sequent's Marketing Mgr gave a presentation on
> Balance 21XXX.  He claimed that a kernel make on the Balance
> is ~3hrs.

This number (3 hours) was to make the ENTIRE system.  This includes the kernel, man pages, user programs and everything else.  Amdahl (and previous discussions on the net) were talking about just the kernel.

							Malcolm
henry@utzoo.UUCP (Henry Spencer) (08/31/87)
> I know that the make time is subject to number & types of drivers
> etc.  Inlcude your number & whatever other details you want to.

It also depends on the flavor of kernel.  A V6 kernel compiled on, say, a Sun 4 (assuming you could get the V6 kernel past a modern C compiler!) would probably beat the Amdahl record, simply because the V6 kernel wasn't very big.
-- 
"There's a lot more to do in space     |  Henry Spencer @ U of Toronto Zoology
than sending people to Mars." --Bova   | {allegra,ihnp4,decvax,utai}!utzoo!henry
parsons@sequent.UUCP (Ron V. Parsons) (09/03/87)
I'm Ron Parsons, the Technical Marketing Manager for Sequent Computers.  Renu Raman of Sun Microsystems referred to my talk about kernel make numbers.  Here is some more data on the subject:

Building the DYNIX system represents a significant amount of work.  The DYNIX system includes all of 4.2bsd (with parallel enhancements) and much of AT&T System V Release 2.2.  There are almost 6000 files in the DYNIX binary distribution.  Of these, over 3000 must be compiled from C source, almost 300 must be interpreted by make, and 60 must be directly assembled.  There are almost 4000 compilations and assemblies and over 600 invocations of the nroff text formatter to build the on-line documentation.

Low-effort, large-grained parallelization of the make utility reduced the time required to build the DYNIX system on the Balance 8000 computer by a factor of 7.5 over the single-stream version of make.

Table 1 shows the DYNIX system build times on a VAX(tm)11/750 and on various hardware configurations of the Balance 8000 system.  The percentage of CPU usage indicates how well the build is utilizing the available resources.  As expected, highly parallel builds use a greater percentage of the available CPU time in both monoprocessor and multiprocessor configurations.
                                   Table 1
                     DYNIX 2.0 build times and CPU usage
 ______________________________________________________________________________
|              |  Single-stream    | Modest parallelism | Maximal parallelism  |
|              |      (-P1)        |       (-P2)        |        (-P4)         |
|    Config    |__________|________|__________|_________|__________|___________|
|              |   Time   | CPU use|   Time   | CPU use |   Time   |  CPU use  |
|              |  (hh:mm) |        |  (hh:mm) |         |  (hh:mm) |           |
|______________|__________|________|__________|_________|__________|___________|
| VAX11/750    |  30:25   |  85%   |  28:00   |   94%   |    -     |     -     |
|______________|__________|________|__________|_________|__________|___________|
| Balance 8000 |  22:30   |  90%   |    -     |    -    |    -     |     -     |
| with 1 proc  |          |        |          |         |          |           |
|______________|__________|________|__________|_________|__________|___________|
| Balance 8000 |    -     |   -    |  10:42   |  185%   |    -     |     -     |
| with 2 procs |          |        |          |         |          |           |
|______________|__________|________|__________|_________|__________|___________|
| Balance 8000 |    -     |   -    |   5:30   |  374%   |   4:14   |   505%    |
| with 6 procs |          |        |          |         |          |           |
|______________|__________|________|__________|_________|__________|___________|
| Balance 8000 |    -     |   -    |    -     |    -    |   3:03   |   711%    |
| with 12 procs|          |        |          |         |          |           |
|______________|__________|________|__________|_________|__________|___________|

This table presents the build times (in real time) for the VAX11/750 and for various configurations of the Balance 8000 system.  The percent CPU usage for each build is also given.  The VAX11/750 configuration includes 8 Mbytes of memory and Fujitsu Eagle disks.  The Balance 8000 configurations include 10 Mbytes of memory and Fujitsu Eagle disks.  The -P values indicate the relative amounts of parallelism for each build.  CPU usage for multiprocessor configurations reflects the percentage of a single CPU (i.e. 505% CPU use in a 6 processor system is equivalent to 100% usage of 5.05 CPUs).
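The speedup and CPU-usage figures in Table 1 can be checked with a little arithmetic: convert the hh:mm build times to minutes, divide the single-stream time by the parallel time for the overall speedup, and divide the percent CPU use by 100 for the number of effective CPUs.  A quick sketch, using numbers taken from the table above:

```python
# Recompute the figures implied by Table 1 of the DYNIX build post.

def minutes(hhmm):
    """Convert an "hh:mm" build time to a count of minutes."""
    h, m = hhmm.split(":")
    return int(h) * 60 + int(m)

single_stream = minutes("22:30")   # Balance 8000, 1 processor, -P1
twelve_procs  = minutes("3:03")    # Balance 8000, 12 processors, -P4

# Overall speedup of the most parallel build over the single-stream one;
# this comes out close to the "factor of 7.5" quoted in the post.
speedup = single_stream / twelve_procs
print(f"speedup over single-stream: {speedup:.1f}x")

# 711% CPU use on 12 processors means about 7.11 CPUs kept busy,
# i.e. the build is far from saturating all 12 processors.
effective_cpus = 711 / 100
print(f"effective CPUs at -P4 on 12 procs: {effective_cpus:.2f}")
```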
steve@nuchat.UUCP (Steve Nuchia) (09/04/87)
In article <8526@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) writes:
[things kernel make times depend on...]
> It also depends on the flavor of kernel.  A V6 kernel compiled on, say,
> a Sun 4 (assuming you could get the V6 kernel past a modern C compiler!)
> would probably beat the Amdahl record, simply because the V6 kernel wasn't
> very big.

Being a software developer, I pay a lot of attention to one benchmark: the compiler.  I once had the immense pleasure of developing serious code on a Sun, and made extensive use of the graphical system performance monitoring facilities in support of that project.  It is instructive to observe how very little of the elapsed time of a compile is spent at 100% CPU utilization and how much at maximum I/O bandwidth.

The "make <something large>" benchmark will usually tell you more about the I/O performance of the machine than about the processor.  That is a good thing, since I/O performance (including system call overhead, of course) is a pretty important system parameter.  Using make as a benchmark, we were able to quantify the braindamage of quite a few machines, some of which made claims of being high-performance.

The specific benchmark we used at my former employer was to compile the product, which weighed in at about 30,000 lines of C and made four or five large programs and a couple of dozen smaller ones.  We then ran a comprehensive set of automated tests on the resulting product.  The whole job took 8 - 12 hours on the typical "super-micro" of 2 - 3 years ago.

The interesting things about it, as with most benchmarks, are the standout exceptions.  We thought the Pyramid was fast when it turned in a time of around three hours.  Then we did an Amdahl, and the whole thing, including reading and writing the tapes, took 45 minutes on a heavily loaded system.

The testing part would normally display the faked user interactions as it went along, so serial I/O was included in the timing.
For most machines this didn't make much difference; they couldn't work fast enough to saturate a 9600 baud line.  Then we got our first 68020 box, the Convergent Mightyframe (this was well over a year ago).  It was so fast it could run the testing on two 9600 baud terminals in the same elapsed time as one: it was spending less than half the CPU and I/O keeping each 9600 baud line saturated.

Then there were the losers.  The old Radio Shack 16b's had a Z80 handling all the I/O, including the hard disk.  That thing took over 24 hours to run the whole cycle.  And the Arete' box, which is supposed to have a lot of I/O power, failed: the master processor couldn't handle the system call overhead needed to keep the slaves working.  The box we looked at had 2 68010s, and they claim the '020s work better, but we didn't buy the box.  The Plexus uniprocessor machine, with its intelligent I/O subsystems, outperformed it on the software developer's benchmark.  That evaluation cemented my conviction that multiprocessor architectures should avoid distinguishing among the processors.  Now I want a Sequent.

I should point out that the compiler/linker system provided with most Unix boxes is not well balanced in its resource usage.  If you look at what it's doing during a large make (say, 2.11 news), it spends a lot of time forking and execing the myriad phases of the compiler, each of which must then read the data that the previous phase just wrote.  Engineering improvements in the compiler software could eliminate its sensitivity to I/O bandwidth.  The benchmark is very sensitive to things like setting the sticky bit on the compiler phases, too.
-- 
Steve Nuchia			Of course I'm respectable!  I'm old!
{soma,academ}!uhnix1		Politicians, ugly buildings, and whores
!nuchat!steve			all get respectable if they last long enough.
(713) 334 6720			- John Huston, Chinatown
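The "saturate a 9600 baud line" yardstick is easy to quantify: with the usual 8-N-1 async framing, each character costs 10 bits on the wire, so a quick back-of-envelope sketch of the output rates involved looks like this:

```python
# Back-of-envelope: what "saturating a 9600 baud line" means in
# characters per second, assuming standard 8-N-1 serial framing.
BAUD = 9600
BITS_PER_CHAR = 10          # 1 start bit + 8 data bits + 1 stop bit

chars_per_sec = BAUD // BITS_PER_CHAR
print(chars_per_sec)        # characters per second per line

# The Mightyframe anecdote: running the tests on two such terminals
# at once means sustaining twice that output rate.
two_lines = 2 * chars_per_sec
print(two_lines)
```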
aegl@root44.UUCP (09/06/87)
We have a Sequent Balance 21000 with 8 processors, which we now use as a cross-development environment for UniPlus+ (Trademark of UniSoft Corporation).  During the day, when everyone is working on it, use of the "-P" flag to make to use more processors is mildly frowned upon - but in the evening or at weekends, if I'm the only person on, I like to speed things up a bit.

I used to just throw in "-P8" to use all 8 processors ... but then I watched the fancy light display on the front of the cabinet and noticed a fair amount of flicker in the processor activity lights ... Ah! I thought - the C compiler isn't completely CPU bound (with a 4.2 fast file system, Sequent's whizzo disk controller and loads of buffers it's close, but there's still a little bit of I/O).  So I tried plotting a graph of real time vs. -Pn - and found the lowest real time came out at about -P12.

So if the people at Sequent trying for the 1 minute kernel compile haven't tried it yet, they should try using '-P' values a little higher than the number of processors they have.

			-Tony Luck
--------------
Disclaimer: I don't work for Sequent (but if you rush out and buy one today, tell them I sent you - perhaps I'll get lucky and get some commission from them - I'll split it with you ... honest!)
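The -P12-beats--P8 observation has a simple queueing explanation: if each compile spends some fraction of its wall time blocked on I/O, a processor can interleave roughly 1/(1 - fraction) jobs, so the best job count exceeds the processor count.  A minimal sketch, where the one-third I/O-wait fraction is a guess for illustration, not a measured DYNIX figure:

```python
# Estimate the make -P value needed to keep all processors busy when
# each compile job spends a fraction `io_wait` of its time blocked on
# I/O rather than using the CPU.
def jobs_to_saturate(processors, io_wait):
    """Each CPU can interleave ~1/(1 - io_wait) jobs; scale by CPU count."""
    return round(processors / (1.0 - io_wait))

# A purely CPU-bound build wants exactly one job per processor.
print(jobs_to_saturate(8, 0.0))

# If compiles are ~1/3 I/O-bound (assumed figure), 8 processors want
# about 12 jobs - matching the -P12 sweet spot found empirically above.
print(jobs_to_saturate(8, 1.0 / 3.0))
```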