ehj@mordor.ARPA (Eric H Jensen) (09/09/86)
Maybe this subject has been discussed here before, but ...

Given the same clock speeds how much faster is the 32332 than the 32032? What are the significant contributors to the increased speed? I expect the answer to this last question to mention "better micro-code". I have heard that the micro-code for the 32032 is very poor, can someone comment on this?
--
eric h. jensen (S1 Project @ Lawrence Livermore National Laboratory)
Phone: (415) 423-0229
USMail: LLNL, P.O. Box 5503, L-276, Livermore, Ca., 94550
ARPA: ehj@angband
UUCP: ...!decvax!decwrl!mordor!angband!ehj
chongo@amdahl.UUCP (Landon Curt Noll) (09/11/86)
In article <15218@mordor.ARPA> ehj@mordor.UUCP (Eric H Jensen) writes:
>Given the same clock speeds how much faster is the 32332 than the
>32032? What are the significant contributors to the increased speed?

I can't understand why the folks at nsc have remained silent on this matter, unless their system is down or they didn't get this message, or ...

They forced me to sign one of them 'will not disclose' papers, so I can't give you details of what I have measured. I can note that the range of statements has been from >3x all the way down to 'slightly slower in some cases'.

Be careful what you read, even from the above. Sometimes the statements are just plain BOGUS, sometimes they are misleading and sometimes they are close to the truth.

The first case can come up when someone wants to prop sales up, or when a customer is upset and wants to damage the reputation of the chip. Sad to say, but I have seen this happen with all major chip firms.

The second case can be for a few reasons:

* the systems do not have the same number of wait states/memory speed
* one system has 'burst mode' memory and another does not
* the systems were running with different peripherals or kernels
* one system has local memory, another has memory over a slower bus
* one system used mmu X, another used mmu Y, another no mmu!
* the compiler generates better code
  [in the case of the 32016 vs. 32032, a better compiler will show a 32016 to be closer to a 32032. Why? The more you stay within CPU Reg's, the less the 16 bit bus detracts from the 32016. Even so, a better compiler will make BOTH run faster. It just helps the 32016 a bit more]
  [And remember that the 32332 can operate over a 32, 16, or 8 bit bus.]
* a benchmark uses a given instruction or addressing mode which is faster/slower on one chip

The third case is worth talking about: Say you do find a benchmark that is honest. What does it mean for you? Will your 32332 based system have a greater throughput?
Will you be able to solve a problem quicker or cheaper?

In general, I have seen a 32332 system running faster than a 32032 system in my post-nsc employment days. There were a number of factors why, one of which was the cpu.

The 32332 runs Unix. It has had MUCH, MUCH, MUCH fewer stated problems than the 32016 did way back when it first came out. The 32332 has a number of performance/architecture advantages over the 32032 and 32016. In short the 32332 is a *YUMMO* part.

chongo <for a 32332 vs. 68020 discussion, see net.flame> /\oo/\
[This is not an Amdahl or NSC official statement]
thomson@utai.UUCP (Brian Thomson) (09/11/86)
We don't have 32032s here, but we have compared 32016s vs. 32332s running 4.2bsd with a Fuji Eagle and have found the 10 MHz 332 to be about twice the speed of the 016, or roughly 1.2 VAX 780s for CPU-intensive non floating-point benchmarks.
--
Brian Thomson, CSRG Univ. of Toronto
{linus,ihnp4,uw-beaver,floyd,utzoo}!utcsrgv!thomson
henry@utzoo.UUCP (Henry Spencer) (09/11/86)
> ...In short the 32332 is a *YUMMO* part.
In fact, it's what the 32032 *should* have been. If National had been
shipping this at 32032 time, the 32000 series would have been a roaring
success. As it is, I fear the 32332 is too little too late.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,decvax,pyramid}!utzoo!henry
curry@nsc.UUCP (Ray Curry) (09/12/86)
Sorry that this response was slow but we have had problems with our news program and mail.

How much faster is the NS32332 than the NS32032? Like any answer concerning computers, it depends upon what environment and what tasks you're talking about.

First, where does the speed come from? National obtained roughly 3 times the performance for the NS32332 based upon better architecture, faster clock, and better compiler technology. For the purpose of this discussion, let's limit the discussion to the first. Better microcode, faster 32 bit address calculation, better interrupt latency, and better hardware interface (burst mode memory) are the factors.

As to the results, presently we have measured the following performance increase for the 32332 compared to the 32032 running the same code and same clock (10MHz).

    dhrystone   1.3x
    sieve       1.4x
    puzzle      1.6x
    EDN's       1.25x

Overall I am seeing an average of about 50% faster, with some benchmarks as low as 20% and some as much as 90% faster. The state of optimization impacts the performance difference, with less optimized code giving a greater difference, as might be expected.

National is working on compilers supporting faster code by optimizing the instructions used and by using registers more intelligently. Nicknamed CTP for compiler technologies program, the compilers are a part of the move to UNIX V. They are due to be released later this year and I have used both Fortran 77 and C. I am still working on full characterization with a very large number of benchmarks, but in general with the new CTP compilers and the higher clock rate (15MHz), the overall improvement is pretty dramatic.

The dhrystones have measured at 2800 dhry/sec on a DB332 board at 15MHz with code generated by the NSC CTP compiler on the Dhrystone Version 1.1. The board and the V.2 Unix are production released but the compiler is still in QA. This compares to 855 on the 032 with older compilers.
Some of our customers, such as Sequent, have done some optimizations and got slightly higher dhrystone numbers (up to around 1200-1300).

Floating point intensive programs show less improvement because the 32081 is still being used, but even so there is some improvement because of the non-floating point instruction mix and the higher clock rate (15MHz). The 32081 is slower because of the older 16 bit slave protocol used but still runs competitively in many floating point benchmarks like LINPACK. Whetstones run about 25% faster on the 332 at the same clock frequency with the 081.

Early next year, there will be a NS32381 FPU that uses the 32 bit slave protocol available on the 32332 and some instruction improvements. Whetstone performance will of course climb when the new FPU is available, with the expected increase to be about 2 to 1. As is NSC's policy on the NS32000 family, the 32381 will be code compatible with the 32081.
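
The dhrystone figures quoted in the post above can be roughly cross-checked against its "about 3 times" claim. The sketch below is only back-of-the-envelope arithmetic, not anything from National. It assumes the three contributors named in the post (architecture, clock, compiler) multiply independently, that the 855 dhry/sec figure for the 032 was taken at 10MHz (the post does not say), and that the 1.3x same-clock dhrystone ratio stands in for the architecture contribution. The function name `residual_factor` is made up for illustration.

```python
# Figures quoted in the post above.
DHRY_332_15MHZ_CTP = 2800   # dhry/sec, DB332 board, 15MHz, CTP compiler
DHRY_032_OLD = 855          # dhry/sec, 32032 with older compilers
SAME_CLOCK_SPEEDUP = 1.3    # 32332 vs 32032 dhrystone, same code, 10MHz
CLOCK_332_MHZ = 15
CLOCK_032_MHZ = 10          # ASSUMPTION: clock for the 855 figure is not stated


def residual_factor(total, *known_factors):
    """Divide the known factors out of a total speedup.

    ASSUMPTION: the factors combine multiplicatively and independently.
    """
    residual = total
    for factor in known_factors:
        residual /= factor
    return residual


total = DHRY_332_15MHZ_CTP / DHRY_032_OLD
clock = CLOCK_332_MHZ / CLOCK_032_MHZ
# Whatever is left after clock and architecture; this lumps the compiler
# together with any other difference between the two measurements.
leftover = residual_factor(total, clock, SAME_CLOCK_SPEEDUP)

print(f"total speedup        {total:.2f}x")
print(f"clock                {clock:.2f}x")
print(f"architecture         {SAME_CLOCK_SPEEDUP:.2f}x")
print(f"leftover (compiler?) {leftover:.2f}x")
```

Under those assumptions the total comes out a little over 3x, in line with the "roughly 3 times" statement, and the leftover factor is what would have to come from the CTP compiler and anything else that differed between the two runs.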
chongo@amdahl.UUCP (Landon Curt Noll) (09/18/86)
In article <7115@utzoo.UUCP> henry@utzoo.UUCP (Henry Spencer) writes:
>> ...In short the 32332 is a *YUMMO* part.
>
>In fact, it's what the 32032 *should* have been. If National had been
>shipping this at 32032 time, the 32000 series would have been a roaring
>success. As it is, I fear the 32332 is too little too late.

How True! (note my comment was without reference to both time and/or other more *YUMMO* parts)

So, will the 32532 suffer from the same problem?

chongo <I have my guess already> /\oo/\