carlton@aldebaran.berkeley.edu (Mike Carlton) (11/30/90)
Greetings,

Well, as someone else posted recently, Next recently held a demo on the Berkeley campus with several NextStations and a Color NextStation. I was able to spend a while on one of the machines and got some simple benchmark results. Below are the numbers and several other comments I had.

Basically, the NextStation is quick. Here are some numbers, along with some other machines for comparison. I measured the following:

                   power/bc(1)   sqrt/dc(2)    nroff(3)     bamsim(4)
                   -----------  -----------  -----------  -----------
RS/6000(5)              3.4          0.3          2.7         46.1
MIPS(6)                 6.0          0.3          1.6         46.1
NextStation(7)          7.8          0.5          3.4         70.1
SparcStation(8)        12.1          0.7          3.9         51.9
030 Cube(9)            19.0          1.3         15.1        322.4
Sun 3/60(10)           30.3          2.5         16.9        376.1
Sun 3/50(11)           44.5          3.9         29.1        462.8
VAX 785(12)            44.3          3.5         36.3        681.2

Benchmarks:
( 1) echo 2^5000/2^5000 | /bin/time bc > /dev/null
( 2) echo 99k2vp8opq | /bin/time dc > /dev/null
( 3) cat /dev/null | awk 'END {for(i=0;i<100;i++){print ".PP";for(j=0;j<100; \
         j++) print j;}}' | /bin/time nroff -ms -i > /dev/null
( 4) bamsim chat.o: chat parser with writes simulated on the VLSI-BAM chip

Machines:
( 5) RS/6000 Model 320, 32MB? memory, 20MHz?, 8KB i-cache and 64KB d-cache
( 6) M/2000-8, 128MB memory, 25MHz R3000, 64KB i-cache and 64KB d-cache
( 7) NextStation, 8MB memory, 25MHz 68040, on-chip 4KB i-cache and 4KB d-cache
( 8) SparcStation 1+, 24MB memory, 25MHz? SPARC, 64KB? cache
( 9) 030 Cube, 12MB memory, 25MHz 68030, on-chip 256B i-cache and 256B d-cache
(10) Sun 3/60, 20MB memory, ?MHz 68020, ?KB cache, on-chip 256B i-cache
(11) Sun 3/50, 4MB memory, 16MHz 68020, on-chip 256B i-cache
(12) VAX 785, 32MB? memory, 8MHz?, ?KB cache

Disclaimer: these are simple benchmarks -- don't place too much emphasis on them. If you really need to know just how fast the machine is then run your specific programs on it. SPEC results would be better of course, but I don't have them.

The first 2 benchmarks are simple unix one-liners that were recently posted to comp.benchmarks.
These are nice because you can sit down and type them quickly. They should involve mostly integer and pointer operations. The third one is a simple test of nroff, and so should be mostly character operations. The fourth benchmark is a register level simulator our group developed and uses; it is almost all integer operations.

I've described the machines as well as I can; question marks indicate the quantities I don't know or have made a guess at. The NextStation I used was still a pre-release machine -- it was running a beta version of the OS and the 040 was supposed to be an early version.

For the one-liners, the NextStation ranges from 1.1 to 1.5 times the speed of a Sparcstation 1+. On the larger simulation the Next is only 75% of the speed of the Sparcstation, we believe this could be due to caching on the Sparc or due to slow bit operations, since the simulation performs quite a few, and the 040 should be slower on large shifts (the shift instruction can only specify shifts of 1-8 bits on the 030 and presumably the same is true of the 040).

Floppy drive: The floppy drive is very well integrated: I simply popped a DOS 1.44MB floppy in the drive and the Next automatically mounted it as a new volume and it appeared in the browser. Subjectively, the floppy seemed slow (i.e. copying seemed to be a few times slower than copying from a floppy to a hard drive on a Mac), but this is just an impression. Also, the Next was supporting a foreign format and so may be slower because of that.

Shipping news: one of the Next reps said that they had begun shipping and he thought that Berkeley would get its first 5 machines sometime next week, with 5 more to follow in a couple of weeks. Of course, he went to great pains to point out that there were no guarantees, nothing was definite, this was what he thought, etc. The first 10 machines will all be the 105MB disk, 8MB configuration.
A machine with a 200MB disk has been added to the available configurations, but won't ship until after the new year. He thought the price for a 200MB system would be about $700 more than the 105MB system. 030 Cube upgrades to 040's probably won't ship until the new year.

Software: I was able to play with Illustrator (and only crashed it once in about 5 minutes). It appeared to be just about identical to the Macintosh version. They also had a demo version of FrameMaker, a demo of WordPerfect and, of course, Lotus Improv. In general, the applications seemed pretty solid.

Overall, the machines are very impressive. I'm number 5 on the list here at Berkeley and so might actually get one in the next couple of weeks -- I'm looking forward to it. I don't think there is a better price/performance, basic Unix box out there (which is what I was looking for) and the user interface is a world above X or Suntools. Combine that with a bunch of very slick bundled software and I think it's a great deal. We've had an 030 Cube in our office since they first came out; it was a good machine, but it wasn't worth $6500 of my own money. The Nextstation is definitely worth the $3200.

That's all for now,

Mike Carlton
carlton@cs.berkeley.edu
alex@pluto.dss.com (Alex Smith) (11/30/90)
In article <9325@pasteur.Berkeley.EDU>, carlton@aldebaran.berkeley.edu (Mike Carlton) writes:
> Greetings,

[ Benchmark specifics ]

> For the one-liners, the NextStation ranges from 1.1 to 1.5 times the
> speed of a Sparcstation 1+. On the larger simulation the Next is only
> 75% of the speed of the Sparcstation, we believe this could be due
> to caching on the Sparc or due to slow bit operations, since the simulation
> performs quite a few, and the 040 should be slower on large shifts (the shift
> instruction can only specify shifts of 1-8 bits on the 030 and presumably
              ^^^ ^^^^ ^^^^^^^ ^^^^^^ ^^ ^^^ ^^^^
According to the MC68030 User's Manual (2nd Ed.):

    The shift count for the shifting of a [data] register is specified
    in two different ways:

    1. Immediate -- The shift count (1-8) is specified in the instruction.
    2. Register -- The shift count is the value in the data register
       specified in the instruction modulo 64.
                                    ^^^^^^ ^^
Perhaps the simulation is using immediate specification, or shifting memory
(which can only be done one bit/byte at a time).

> the same is true of the 040).

[ etc.]

Alexander Smith        "If that was an opinion, this is a disclaimer."
alex@pluto.dss.com
carlton@aldebaran (Mike Carlton) (12/01/90)
In article <4088@pluto.dss.com> alex@pluto.dss.com (Alex Smith) writes:
+In article <9325@pasteur.Berkeley.EDU>, carlton@aldebaran.berkeley.edu (Mike Carlton) writes:
+> Greetings,
+
+[ Benchmark specifics ]
+
+> instruction can only specify shifts of 1-8 bits on the 030 and presumably
+              ^^^ ^^^^ ^^^^^^^ ^^^^^^ ^^ ^^^ ^^^^
+According to the MC68030 User's Manual (2nd Ed.):
+
+    The shift count for the shifting of a [data] register is specified
+    in two different ways:
+
+    1. Immediate -- The shift count (1-8) is specified in the instruction.
+    2. Register -- The shift count is the value in the data register
+       specified in the instruction modulo 64.
+                                    ^^^^^^ ^^
+Perhaps the simulation is using immediate specification, or shifting memory
+(which can only be done one bit/byte at a time).
+
+> the same is true of the 040).
+
+[ etc.]
+
+Alexander Smith        "If that was an opinion, this is a disclaimer."
+alex@pluto.dss.com

I wasn't real clear on my original posting, mainly because I was still speculating about the possible problems due to an 040. Yes, the code uses immediate shifts extensively. At some point, it becomes faster for the compiler to load a temp register with a shift amount and do a register shift rather than multiple immediate shifts. I don't know what gcc does, I have not profiled the code or looked at the assembly generated because it hasn't been a priority.

Here are a few lines from one of the header files, showing some of the operations that are being performed:

#define sign_ext_11(data) (((int)((data)<<21))>>21)
#define sign_ext_12(data) (((int)((data)<<20))>>20)

and even:

#define tagged_imm_11(tage,data) \
    ((-ebit(tage))^((((-ebit(tage))^(tage))<<27)| \
    (((-ebit(tage))^(data))&0x7ff)))

When I get a slab, I'll take a look at what the compiler is generating.

--mike

Mike Carlton
carlton@cs.berkeley.edu