markd@wolf.UUCP (07/15/86)
SpeedPac 286 versus INTEL 8088/8087 versus NEC V20/8087

Mark C. DiVecchio
2/20/86

To continue my series on performance measurement of the IBM PC, I recently purchased a SpeedPac 286 card from Victor Technologies. The cost was $595 plus tax and $5 shipping and handling. If you need an 80287, you must purchase it elsewhere. You should be able to find an 8 MHz chip for around $300.

It went into my PC with only one problem. I have a JRAM-2 card and I could not get the system to boot with the JRAM-2 installed. If I can't get the SpeedPac 286 card to work with the JRAM-2, I will consider that a major deficiency. I have 2MB on the JRAM, which lets me use 704K for DOS and the rest for RAM disks and print spoolers. Victor does claim that their board is compatible with the Lotus/INTEL Expanded Memory Specification (EMS).

I already consider the fact that you must buy an 80287 a major deficiency, at least from the cost standpoint, since the SpeedPac 286 will not work with the 8087. This, though, is not a problem with the SpeedPac 286 but rather with INTEL's design of the 80287 as a co-processor. With the 80287 running at 7.2 MHz, you should see some good floating point performance.

In their ads, Victor claims "The SpeedPac 286's microprocessor is an 8MHZ 80286, a faster version than the AT's 80286. That's right. Even faster than the AT. Which means that it can make your IBM PC, PC/XT and Victor VPC function over 600% faster." My tests, documented below, show this to be clear hyperbole.

The meager documentation which came with the board is no more realistic. It describes the cache memory scheme used on the board. It claims that programs run up to 7 1/2 times faster as long as they use the cache efficiently. Programs which do not use the cache efficiently may run only slightly faster. The documentation suggests that if your programs run "only slightly faster", you should disable the cache via an on-board jumper.
Then, according to the documentation, "the SpeedPac 286 should still at least double the speed of your application programs."

Talk of "efficiently" using a cache is worthless. Those of us with application packages like 123 or Wordstar can't do much about how efficiently the program uses a cache. Caches must be designed to speed up programs no matter how the programs are written. Programs should not have to be written to use a cache efficiently.

This lack of documentation is a real issue with me. A company that expects to sell a product like this, which is bound to be full of incompatibilities with existing hardware, must, I feel, supply you with almost every piece of technical data available. Victor supplies nothing. By comparison, Tall Tree Systems, which sells their JRAM-2 board for about $200, includes a 95-page manual describing practically every aspect of the operation of their board.

I called Victor, and the person I talked with not only didn't have the answers to my questions but, I think, didn't even understand them. I was trying to find out exactly how the cache worked so I could try to figure out why my JRAM-2 board wasn't working. All I got was the person looking up the JRAM in a "compatibility" list and telling me to disable the cache (more on the results of that later). What I really wanted to know was addressing schemes, block sizes and replacement strategies.

Disabling the cache let the JRAM-2 card operate and my system booted as usual. While I had the cache disabled, I ran the benchmark programs again and the results were terrible, really terrible. The performance was barely at the 8088 level and in one case was 23% slower than an 8088.

I kept on digging. The JRAM-2 documentation describes how it uses bank addresses E000 and F000 to enable its memory and how it uses bank D000 as the paging bank. I thought I would change these to see if there was some interference with the SpeedPac 286.
I changed the JRAM-2 enable address to F100:0000 and the paging bank to E000 (the JRAM-2 documentation is clear on how to move the jumpers to do this). At last, it worked. I was able to boot using the RAM disk with the SpeedPac 286 cache enabled. I was now back to my normal 704K DOS partition and my 1.2MB RAM disk. So I looked at the benchmarks again. The 3 Turbo Pascal benchmarks ran slower now for some reason.

Here is what my CONFIG.SYS file looks like now:

    BUFFERS=32
    FILES=20
    DEVICE=JBOOT.BIN 704K /-x=13x=12FQ ; /Use E000 as paging bank,FLUSH,QUIET
    DEVICE=JDRIVE.BIN
    BREAK=ON

I got hold of Peter Norton's "SYSINFO" utility, which produces a number alleged to be some kind of performance figure compared to a standard PC. With his program, I get a performance rating of 6.6x with the cache enabled and 2.1x with the cache disabled. I have no idea what this number means. It is certainly not real world and seems only to serve the exaggerated claims of hardware manufacturers. I feel that Mr. Norton has done a disservice to the PC community by producing such a meaningless program.

From my numbers below, I can see that a 2x or 3x speed-up claim by Victor would be more realistic with the cache on, and if you have to disable the cache, your performance can actually drop below that of a regular PC. Anyone who buys a SpeedPac 286 card and then has to disable the cache to get it to work has just thrown away $595 and then some.

What are the problems? Well, without more technical details from Victor, I can only guess. On the AT with its 80286, the bus to memory is 16 bits wide. This doubles the data bandwidth over the SpeedPac 286 in a PC, where the bus is only 8 bits wide. In addition, the DMA channels and fixed disk are significantly faster in the AT. Just putting an 80286 in a PC won't do much for I/O rates, as we see in one test below.

Another thing is the time required to keep the cache filled.
This overhead can be significant, as the processor must go out to main memory to fetch a block of data each time that it cannot find the byte it wants in the cache. The processor may bring in 8 or 16 bytes of data and end up using only 1 byte.

But what should we expect in the first place? Considering the hardware in the PC and the problems interfacing an 80286 to it, a 2x or 3x performance improvement is not bad. I would have felt better if Victor (and other hardware vendors as well) gave reasonable numbers in their ads. Unfortunately, as most marketeers know, the exaggerated claims make for good copy and once one vendor uses them, all follow suit.

Flight Simulator will not run, but I didn't really expect it to. You don't pay $600 to run Flight Simulator.

SpeedPac 286 came with a 60-day money-back guarantee so I sent it back.

Mark DiVecchio
San Diego, CA

P.S. There are a few short words below about the Classic 286 from Classic Technologies. I tried it, but it would not work with my expansion chassis. Therefore, I could not run many benchmarks.

-------------------------------------------------------------------------

Results of Real World Benchmark run 2/20/86

Numbers followed by "x" are performance multipliers relative to the 8088/8087. "2.00x" means that the program ran twice as fast, or in one-half the time. Numbers followed by "s" are running times in seconds.

                                                      ------SpeedPac 286------
                                      8088/   V20/    ----no JRAM-----  -JRAM-
                                      8087    8087    ------512K------  -704K-
Program                                               cache   no cache  cache
------------------------------------  ------  ------  ------  --------  ------
Compiling and linking a large         1.00x   1.05x   1.86x   1.02x     1.90x
assembly language program using       240s    228s    129s    235s      126s
the Microsoft Macro Assembler v4.00.
Generally CPU bound. Some I/O.

Same, but with all data on RAM disk.                                    2.31x
                                                                        104s

Compiling and linking a medium-sized  1.00x   1.05x   1.82x   1.02x     1.91x
assembly language program.            86s     82s     47s     84s       45s
Generally CPU bound. Some I/O.

Same, but with all data on RAM disk.                                    2.53x
                                                                        34s

MFRACT 2 3 1027                       1.00x   1.15x   ***     ***       ***
A public domain fractal generator     20s     17s
using the 80x87 floating point
processor. This program does a large
number of floating point arithmetic
operations. Some graphic screen
output is done after the
calculations are complete. No I/O
except for program load.

MFRACT1 2 3 1027                      1.00x   1.09x   3.00x   0.96x     3.00x
Same fractal program using emulated   75s     68s     25s     78s       25s
floating point operations. This
version of MFRACT emulates all of
the floating point instructions
using 8088/V20/80286 integer
instructions. No I/O.

Starting the FinalWord II word        1.00x   1.10x   1.83x   1.10x     1.83x
processor and loading a 60K file      22s     20s     12s     20s       12s
into its internal buffer. This job
consists of a lot of disk I/O and
very little CPU arithmetic.

Formatting a 60-page word processor   1.00x   1.08x   2.44x   1.03x     1.68x
file and placing the print file on    188s    173s    77s     182s      70s
a fixed disk. This job consists of
a lot of disk I/O and does a lot of
character manipulation.

Mandelbrot Set calculation using      1.00x   1.03x   ***     ***       ***
the 80x87 floating point chip. This   476s    464s
program, like MFRACT, is a heavy
user of floating point instructions
in the 80x87. It is CPU bound.
No I/O.

Copying an 84K file from a fixed      1.00x   1.00x   1.00x   1.00x     1.00x
disk to another fixed disk. A test    4s      4s      4s      4s        4s
that is very I/O bound.

"TYPE" a 20K file to the CRT. This    1.00x   1.08x   2.48x   0.77x     1.87x
utility does a lot of disk input and  92s     85s     37s     119s      49s
a lot of output to the CRT.

-----------------------------------------------------------------------------

Then I ran some "benchmarks" written in Turbo Pascal. I do not know what these do, so I feel that their usefulness is suspect except as some relative indication of performance. They are definitely more real world than Norton's "SYSINFO".
                                  ------SpeedPac 286------  Classic 286
                  8088/   V20/    ----no JRAM-----  -JRAM-  Speed Pack
                  8087    8087    ------512K------  -704K-
                                  cache   no cache  cache
                  ------  ------  ------  --------  ------  -----------
BENCH.COM
  Test 1          1.00x   0.99x   2.97x   0.93x     2.14x   4.01x
  Test 2          1.00x   1.01x   3.42x   1.03x     2.78x   4.02x
  Test 3          1.00x   1.03x   3.38x   0.90x     3.18x   3.06x
SIEVE.COM         1.00x   1.05x   3.45x   0.90x     3.20x   3.32x
FLOAT             N/A     2.991s  0.993s  3.019s    1.010s  0.993s
  (time in seconds for 1000 floating point calculations)
SYSINFO           1.00x           6.60x   2.10x     6.60x   7.00x

MULDIV  9.23s

-------------------------------------------------------------------------

Notes:

Hardware configuration:

    IBM PC with 512K of system memory:
        256K on the mother board
        256K on an AST Combo board

When the JRAM-2 was active this is what I had:

    IBM PC with 704K of system memory:
        256K on the mother board
        256K on an AST Combo board
        192K on the JRAM-2 board

Running DOS 2.10.

Original 8088, the one with the bug in the mov to SS instruction, though I don't think that this had any effect on my results. Markings: S4807 I2290004 (c)INTEL '78

V20 from NEC. Markings: 8523K9 070108D-5.

The 8087 was installed with the 8088 and V20.

*** The 8087 is not compatible with the SpeedPac 286 and was removed, so I could not run tests which required 80287 hardware. (The SpeedPac 286 supports a 7.2 MHz 80287, which I do not yet own.)

All disk I/O was to an IBM 10MB fixed disk in an IBM expansion chassis.

--
---------------------------------
Mark C. DiVecchio
9067 Hillery Drive
San Diego, CA 92126
K3FWT
Home of PC-VT and LPTx
sdcsvax!man!wolf!markd
No disclaimer : anyone who listens to me is a bigger fool than I.
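
Appendix: the "x" multipliers in the first results table appear to be the 8088/8087 running time divided by the running time in each configuration. The short Python sketch below redoes that arithmetic for the first row only (the large assembly job). The function name `speedup` and the dictionary labels are illustrative and are not part of any of the benchmark programs; the times are copied from the table.

```python
def speedup(baseline_seconds, seconds):
    """Return how many times faster a run is than the baseline run."""
    return baseline_seconds / seconds


# Running times in seconds for the large assembly job (first table row).
# The labels are shorthand for the table's column headings.
times = {
    "8088/8087": 240,
    "V20/8087": 228,
    "SpeedPac, cache": 129,
    "SpeedPac, no cache": 235,
    "SpeedPac + JRAM, cache": 126,
}

baseline = times["8088/8087"]

# Round to two decimals, the precision used in the table.
multipliers = {name: round(speedup(baseline, t), 2) for name, t in times.items()}

for name, m in multipliers.items():
    print(f"{name:24s} {m:.2f}x")
```

For this row the sketch gives 1.05x, 1.86x, 1.02x and 1.90x, matching the table. Other rows can differ from the printed figures in the last digit, presumably because the listed times are rounded to whole seconds.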