cgy@cs.brown.edu (Curtis Yarvin) (03/22/91)
In today's New York Times, there is an article about the new HP Snake line.
The story places the low-end Snake (720?) at 57 MIPS, 55 Specmarks for
$12,000.

This will obviously cramp the digestion of competing workstation makers.

Does anyone know how these numbers were achieved? Are they misleading in
any way?

Curtis

"I tried living in the real world
Instead of a shell
But I was bored before I even began."
- The Smiths
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (03/22/91)
In article <69465@brunix.UUCP> cgy@cs.brown.edu (Curtis Yarvin) writes:
| In today's New York Times, there is an article about the new HP Snake line.
| The story places the low-end Snake (720?) at 57 MIPS, 55 Specmarks for
| $12,000.

Assuming that this is what it sounds like, the next question is
software. Does it run UNIX, and have X, and have {name it} application
software?

The workstation market can be divided into people who have source for
everything they run and are buying raw MIPS, and people who run
applications like Maxima, Interleaf, troff, etc, who are not in the
market for hardware which doesn't support their application.

Depending on the software support this machine may not currently be a
player in the second market. This has happened to IBM somewhat, although
they have the money to pay someone to port an application.
--
bill davidsen (davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"Most of the VAX instructions are in microcode,
but halt and no-op are in hardware for efficiency"
spot@CS.CMU.EDU (Scott Draves) (03/22/91)
In article <3284@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) writes:

   In article <69465@brunix.UUCP> cgy@cs.brown.edu (Curtis Yarvin) writes:
   | In today's New York Times, there is an article about the new HP Snake line.
   | The story places the low-end Snake (720?) at 57 MIPS, 55 Specmarks for
   | $12,000.

   Assuming that this is what it sounds like, the next question is
   software. Does it run UNIX, and have X, and have {name it} application
   software?

I don't know, but it will probably run HP-UX like all its
predecessors. Now, whether or not you call that unix is another
question... :) But seriously, HP-UX is a rather dusty, but reliable
version of SysV. I'd live with it to get that much CPU.

My questions are:

When will they be available in volume, ie, when does it become real?
Until then, it's just a marketdroid scheme to hurt the competition and
grab publicity.

and

What technology/process are they using? One chip? Clock?
--
                   christianity is stupid
Scott Draves       communism is good
spot@cs.cmu.edu    give up
sysmgr@KING.ENG.UMD.EDU (Doug Mohney) (03/23/91)
In article <3284@crdos1.crd.ge.COM>, davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) writes:
>In article <69465@brunix.UUCP> cgy@cs.brown.edu (Curtis Yarvin) writes:
>| In today's New York Times, there is an article about the new HP Snake line.
>| The story places the low-end Snake (720?) at 57 MIPS, 55 Specmarks for
>| $12,000.
>
> Assuming that this is what it sounds like, the next question is
>software. Does it run UNIX, and have X, and have {name it} application
>software?

HP-UX <gag> initially. Berkeley 4.4 and OSF/1 to be available when released.
There is X. Not good X, but X. Since it's a member of the HP-PA RISC club,
there is (already) a large base of software developed for the machine.

> The workstation market can be divided into people who have source for
>everything they run and are buying raw MIPS, and people who run
>applications like Maxima, Interleaf, troff, etc, who are not in the
>market for hardware which doesn't support their application.
>
> Depending on the software support this machine may not currently be a
>player in the second market. This has happened to IBM somewhat, although
>they have the money to pay someone to port an application.

Already a developed base of software, due to the earlier members of the HP-PA
club, so I understand. But, even if there are software porting problems
(doubtful, HP tends to take pains on this), 57 MIPS for $12K will get a lot
of people working on porting real quick.

The floating point on the mid-level "snake" is supposed to be obscenely high.
I have a friend at NWSC who is going to purchase one.

..........................

The real question is: What will Digital do to save their bacon, while HP and
IBM explore performance, and Sun keeps chugging along in a commodity market?

Reform may be dying in the Soviet Union, but we have the right to
introduce it to the DECUS Board of Directors.
-- > SYSMGR@CADLAB.ENG.UMD.EDU < --
jlol@REMUS.EE.BYU.EDU (Jay Lawlor) (03/23/91)
>>>>> On 22 Mar 91 13:58:08 GMT, davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) said:

> In article <69465@brunix.UUCP> cgy@cs.brown.edu (Curtis Yarvin) writes:
> | In today's New York Times, there is an article about the new HP Snake line.
> | The story places the low-end Snake (720?) at 57 MIPS, 55 Specmarks for
> | $12,000.

Bill> Assuming that this is what it sounds like, the next question is
Bill> software. Does it run UNIX, and have X, and have {name it} application
Bill> software?

Bill> The workstation market can be divided into people who have source for
Bill> everything they run and are buying raw MIPS, and people who run
Bill> applications like Maxima, Interleaf, troff, etc, who are not in the
Bill> market for hardware which doesn't support their application.

Bill> Depending on the software support this machine may not currently be a
Bill> player in the second market. This has happened to IBM somewhat, although
Bill> they have the money to pay someone to port an application.
Bill> --
Bill> bill davidsen (davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)

Well... The one I tested (720, the low end desktop model) was running
HPUX, just like the 9000/800 series. It ran binaries from our 835
without recompilation, although floating point seemed faster after
recompiling. Times for the code I ran (floating point intensive) were
about the same as our RS/6000 540. Very fast. The X windows performance
(using Motif window manager) was the best I've seen on any machine,
although X isn't exactly efficient for lots of things.
linley@hpcuhe.cup.hp.com (Linley Gwennap) (03/26/91)
(Curtis Yarvin) asks:

   In today's New York Times, there is an article about the new HP Snake line.
   The story places the low-end Snake (720?) at 57 MIPS, 55 Specmarks for
   $12,000.

   This will obviously cramp the digestion of competing workstation makers.

   Does anyone know how these numbers were achieved? Are they misleading in
   any way?

----------

Yes, they are misleading. The performance on real applications (not
toy benchmarks) is actually significantly *higher* due to the much larger
caches (128KB I/256KB D) than competing systems.

The CPU in the Model 720 is a traditional RISC implementation (no "super"
stuff) that runs at 50 MHz using fairly standard 1.0 micron CMOS process.
I could go into great detail, but basically the high performance is due
to eliminating most pipeline interlocks and keeping the chip simple enough
to allow the high clock frequency. The compilers have also been closely
tuned to the hardware. The PA-RISC instruction set is rich enough to
offer some instruction-level parallelism (e.g. COMPARE-AND-BRANCH,
ADD-AND-BRANCH instructions) so that superscalar complexities are not
needed.

Yes, these systems run UNIX (HP-UX) and most common applications, over
2000 total. All Series 700 systems are also source-code compatible
with our popular Motorola-based workstations. By the way, if you need
more performance, a 66 MHz Model 730 is available for $20,000.

--Linley Gwennap
Hewlett-Packard
cs450a03@uc780.umd.edu (03/26/91)
Linley Gwennap writes:
[ enough details on the Snake (+HP sig) to make it look like he knows
what he's talking about ]
How's I/O on this thing? Would use as a fileserver be a shameful
waste?
Raul Rockwell
burdick@hpspdra.HP.COM (Matt Burdick) (03/27/91)
> There is X. Not good X, but X.
What's wrong with it? It's normal X11R4.
-matt
--
Matt Burdick | Hewlett-Packard
burdick@hpspd.spd.hp.com | Intelligent Networks Operation
lewine@cheshirecat.rtp.dg.com (Donald Lewine) (03/27/91)
In article <32580004@hpcuhe.cup.hp.com>, linley@hpcuhe.cup.hp.com (Linley Gwennap) writes:
|>
|> Yes, these systems run UNIX (HP-UX) and most common applications, over
|> 2000 total. All Series 700 systems are also source-code compatible
|> with our popular Motorola-based workstations. By the way, if you need
|> more performance, a 66 MHz Model 730 is available for $20,000.
|>
|> --Linley Gwennap
|> Hewlett-Packard
|>

Are the new PA machines binary compatible with the old ones? Reading
the press, it sounds like there are new instructions in the new machines
so that while the old programs will run, the full performance will not
be achieved unless you recompile. Is this true?

--------------------------------------------------------------------
Donald A. Lewine                (508) 870-9008 Voice
Data General Corporation        (508) 366-0750 FAX
4400 Computer Drive. MS D112A
Westboro, MA 01580 U.S.A.

uucp: uunet!dg!lewine   Internet: lewine@cheshirecat.webo.dg.com
linley@hpcuhe.cup.hp.com (Linley Gwennap) (03/27/91)
Due to popular demand, here is an article comparing the new Snakes CPU to
IBM's "America" chip (used in the RS/6000 series). I have deleted the
section on America. I would be happy to post more info if this is useful.

--Linley Gwennap
Hewlett-Packard

HP SNAKES CPU

HP's high-performance chip set consists of the "Snakes" CPU chip and a
floating point coprocessor ("FPC") jointly developed with Texas
Instruments[1]. These are the first chips to implement the PA-RISC 1.1
architecture. They use a traditional RISC approach to achieve
industry-leading performance of 72 SPECmarks with a 66 MHz clock.

PA-RISC 1.1, an extension to the original PA-RISC architecture, includes
several new instructions, many of which accelerate graphics operations[2].
A multiply-and-add instruction (as in IBM's POWER) is included. In
addition, the page size was doubled to 4 KB to reduce the TLB miss rate, and
eight "shadow" registers were added to provide quick context switching for
the TLB miss handler.

The CPU contains all integer instruction processing, cache control and
memory management functions. All cache memory is included in external
SRAMs connected directly to the CPU. Snakes has a 64-bit path to the
D-cache, just like the R4000. Both the I- and D-caches can be accessed
simultaneously, resulting in a total cache bandwidth of 792 MB per second
(peak).

The FPC implements all floating point instructions. It receives
instructions and data from the caches at the same time as the CPU, and
duplicates parts of the CPU's instruction pipeline, eliminating the penalties
often incurred by separate CPU and FPC chips. Snakes is designed to work
with a variety of memory and I/O interfaces.

The CPU uses a five-stage pipeline to reduce cycle time. The penalties in
this pipeline have been minimized. For example, conditional branches are
executed with no delay if their outcome is predicted correctly, and with
only a single cycle penalty otherwise. The branch prediction algorithm,
more advanced than America's, predicts forward branches to be untaken and
backward branches taken, thus optimizing for loops. The load penalty is a
maximum of one cycle and the store penalty a maximum of two; these
penalties can usually be avoided by the compiler. All other integer
instructions (except a few rare system control functions) are always
executed in a single cycle. This uncomplicated design is reflected by a
simple, efficient compiler.

Although Snakes is not superscalar, PA-RISC instructions such as ADD AND
BRANCH, MOVE AND BRANCH and COMPARE AND BRANCH allow a similar amount of
parallelism as America for integer-only applications; in fact, the ratio
of Integer SPECmarks to MHz for Snakes (65/66) actually exceeds America's
(35/42).

FPC is a full 64-bit implementation. It contains two parallel execution
units: the ALU (addition, conversion) and the MPY unit (multiply, divide,
square root). Each unit can start a new operation on every other cycle, so
FPC can accept one floating point instruction per cycle provided that ALU
and MPY instructions are alternated.

The external caches are direct mapped and are protected by parity, making
them slightly less robust than America's ECC cache. Cache coherency flags
are included to facilitate multiprocessor operation. A write-back protocol
is used to reduce writes to main memory. Although Snakes does not
implement America's complex "critical word first" algorithm on cache misses,
it will begin processing as soon as the critical word is obtained, reducing
the miss penalty by as much as seven cycles. Snakes supports a wide variety
of off-the-shelf SRAMs and can be configured with anywhere from 8 KB to
3 MB of external cache. At its maximum operating frequency of 66 MHz, it
requires 12 ns SRAMs.

The I- and D-TLBs are fully associative and contain 96 entries each. In
addition, each TLB implements four variable size "block" entries capable of
mapping up to 16 MB each, which can be used for large portions of the
operating system and/or graphics frame buffers. The memory system supports
48 bits (256 terabytes) of virtual address space and 32 bits (4 gigabytes)
of real address space. (This is a subset of the full 64-bit virtual space
allowed by PA-RISC). Two addressing modes support 1 GB or 4 GB data
segments, significantly larger than America's segments.

A separate bus provides access to memory, I/O and, if desired, graphics.
This bus is a synchronous, dedicated interface with a peak transfer rate
of 264 MB per second, about one-half the speed of America's memory system.
The bus bandwidth is limited by its width of 32 bits, but a wider bus would
have required a larger, more expensive package. Snakes's cache miss
penalty, measured in cycles, is much higher than America's, due to the
shorter clock cycle time. Snakes compensates for these penalties by
allowing for large external caches to reduce the miss rate; the performance
numbers for Snakes assume a 128 KB instruction cache and 256 KB data cache.

The CPU is fabricated in HP's CMOS-26 process (a 1.0 micron, three metal
layer process) and packaged in a 408-pin PGA. FPC is fabricated in TI's
0.8 micron CMOS process and placed in a 207-pin PGA. These PGAs were
custom-designed to allow high frequency operation with wide CMOS buses.
The CPU contains about 577,000 transistors, while FPC uses 640,000.

For lower-cost systems, the chip set is designed to run at frequencies
below 66 MHz, allowing lower-speed SRAMs to be used. FPC can also be
eliminated to further reduce costs.

REFERENCES AND NOTES

[1] "CMOS PA-RISC Processor for a New Family of Workstations" by M. Forsyth,
    S. Mangelsdorf, E. DeLano, C. Gleason and J. Yetter, COMPCON Spring 91
    Digest of Technical Papers, February 1991.

[2] "Architecture and Compiler Enhancements for PA-RISC Workstations" by
    D. Odnert, R. Hansen, M. Dadoo and M. Laventhal, COMPCON Spring 91
    Digest of Technical Papers, February 1991.
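
The FPC issue rule in the article (two units, each able to start a new
operation only every other cycle, so one instruction per cycle is reached
only when ALU and MPY operations alternate) can be illustrated with a toy
model. This is a rough sketch under stated assumptions, not a description
of the real hardware: it assumes strict in-order issue of at most one
instruction per cycle and ignores operation latency, data dependencies and
cache effects. The function name and the "ALU"/"MPY" labels are made up
for illustration.

```python
def issue_cycles(ops, issue_interval=2):
    """Return the number of cycles needed to issue every op in `ops`.

    Toy model (assumption): in-order issue, at most one op per cycle,
    and each unit can accept a new op only every `issue_interval` cycles.
    Latency and data dependencies are deliberately ignored.
    """
    next_free = {"ALU": 0, "MPY": 0}  # earliest cycle each unit can start an op
    cycle = 0                         # earliest cycle the next op may issue
    for unit in ops:
        start = max(cycle, next_free[unit])
        next_free[unit] = start + issue_interval
        cycle = start + 1
    return cycle


alternating = issue_cycles(["ALU", "MPY"] * 4)  # 8 ops, perfectly interleaved
all_adds = issue_cycles(["ALU"] * 8)            # 8 ops, all on one unit

print(alternating, all_adds)  # 8 15
```

In this model the interleaved stream issues one operation per cycle, while
the same number of operations aimed at a single unit takes nearly twice as
long, which is presumably why compiler scheduling matters for this chip set.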
darrylo@hpnmdla.hp.com (Darryl Okahata) (03/27/91)
In comp.arch, linley@hpcuhe.cup.hp.com (Linley Gwennap) writes:

> (Curtis Yarvin) asks:
> > Does anyone know how these numbers were achieved? Are they misleading in
> > any way?
>
> Yes, they are misleading. The performance on real applications (not
> toy benchmarks) is actually significantly *higher* due to the much larger
> caches (128KB I/256KB D) than competing systems.

I'd like to point out that the D-cache is 64-bits wide, to improve
floating-point performance. The I-cache is only 32-bits wide, and comes
in either 128K or 256K configurations.

-- Darryl Okahata
UUCP: {hplabs!, hpcea!, hpfcla!} hpnmd!darrylo
Internet: darrylo%hpnmd@relay.hp.com

DISCLAIMER: this message is the author's personal opinion and does not
constitute the support, opinion or policy of Hewlett-Packard or of the
little green men that have been following him all day.
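
These widths line up with the peak figures in the Snakes article posted
earlier in this thread. As a quick check, assuming one transfer per clock
on each path at 66 MHz and "MB" meaning 10^6 bytes (both assumptions, not
stated in the thread), the arithmetic works out as below. The helper name
is made up for illustration.

```python
def peak_mb_per_s(width_bits, clock_mhz):
    """Peak transfer rate in MB/s, assuming one transfer per clock cycle."""
    return (width_bits // 8) * clock_mhz


CLOCK_MHZ = 66  # maximum operating frequency quoted for Snakes

d_cache = peak_mb_per_s(64, CLOCK_MHZ)     # 64-bit D-cache path
i_cache = peak_mb_per_s(32, CLOCK_MHZ)     # 32-bit I-cache path
cache_total = d_cache + i_cache            # both caches accessed simultaneously
memory_bus = peak_mb_per_s(32, CLOCK_MHZ)  # 32-bit memory/I/O bus

print(d_cache, i_cache, cache_total, memory_bus)  # 528 264 792 264
```

The 792 MB/s total and the 264 MB/s bus rate both match the article, which
suggests the quoted widths are the processor-to-cache path widths rather
than cache line sizes.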
preston@ariel.rice.edu (Preston Briggs) (03/28/91)
linley@hpcuhe.cup.hp.com (Linley Gwennap) writes:

[good info about the new HP chips]

>eight "shadow" registers were added to provide quick context switching for
>the TLB miss handler.

I'm not sure I understand. Could you expand slightly?

>conditional branches are
>executed with no delay if their outcome is predicted correctly, and with
>only a single cycle penalty otherwise. The branch prediction algorithm,
>more advanced than America's, predicts forward branches to be untaken and
>backward branches taken, thus optimizing for loops.

The RS/6000 can rearrange loops so that there are no branch delays
(often with no branch cost at all). That's hard to beat.
What happens with a fall-through and a forward branch?

>in fact, the ratio of Integer SPECmarks to MHz for
>Snakes (65/66) actually exceeds America's (35/42).

Could you post results for individual SPEC programs (both int and float)?

>The external caches are direct mapped and are protected by parity, making
>them slightly less robust than America's ECC cache.

I would have liked some set-associativity too. (I'm very greedy)

>will begin processing as soon as the critical word is obtained, reducing
>the miss penalty by as much as seven cycles.

What're the best and worst-case D-cache miss times (say, without writeback)?
Line length? Will a cache-miss freeze the CPU or just lock the
target register?

>The I- and D-TLBs are fully associative

Hooray!

Thanks for the information. Thanks also for the references.

Preston Briggs
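
Taking the quoted rule at face value (forward branches predicted untaken,
backward branches predicted taken, no delay on a correct prediction and one
cycle on a wrong one), the cost of a branch can be sketched as below. This
is only a reading of the article's wording, not a statement about the real
pipeline; the function names and addresses are hypothetical, and treating a
branch to its own address as "backward" is an assumption.

```python
def predict_taken(branch_pc, target_pc):
    """Static prediction: backward branches taken, forward branches untaken.

    Assumption: a branch whose target equals its own address counts as backward.
    """
    return target_pc <= branch_pc


def branch_penalty(branch_pc, target_pc, taken):
    """Penalty in cycles: 0 if the static prediction was right, else 1."""
    return 0 if predict_taken(branch_pc, target_pc) == taken else 1


# A 10-iteration loop: the closing backward branch is taken 9 times and
# falls through once on exit (addresses are made up).
loop_outcomes = [True] * 9 + [False]
loop_penalty = sum(branch_penalty(0x120, 0x100, t) for t in loop_outcomes)

forward_taken = branch_penalty(0x100, 0x140, True)     # mispredicted
forward_untaken = branch_penalty(0x100, 0x140, False)  # predicted correctly

print(loop_penalty, forward_taken, forward_untaken)  # 1 1 0
```

Under this reading, a forward branch that falls through costs nothing and a
forward branch that is taken costs one cycle; whether the hardware behaves
exactly this way is for HP to confirm.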
jonathan@cs.pitt.edu (Jonathan Eunice) (03/28/91)
In article <69465@brunix.UUCP> cgy@cs.brown.edu (Curtis Yarvin) writes:
In today's New York Times, there is an article about the new HP Snake line.
The story places the low-end Snake (720?) at 57 MIPS, 55 Specmarks for
$12,000.
This will obviously cramp the digestion of competing workstation makers.
Yep.
HP's low end is apparently performance competitive with high end of
IBM RS/6000 line (little better integer perf, not as good floating
point, dramatically better X perf), and much better than even the
highest-end workstations from Sun, DEC etc. Look for serious
repricing, heavy discounting, and a lot of worrying from
competitors.
Does anyone know how these numbers were achieved? Are they misleading in
any way?
Not really; just good engineering, running real fast.
HP-PA is a pretty good RISC design, and they've tweaked it with some
handy cache-usage optimizations and better floating point in version
1.1. The main win seems to be high speed CMOS fabrication --
50-someodd MHz on the low-end, just over 65 MHz on the high end.
Another win is large caches, which should be very handy indeed for X,
GNU, and other poor-locality-of-reference software, not to mention
large data sets.
These machines are more-or-less workstations, with workstation-sized I/O
capacity. So, how they will compare to SMP machines and "real" minicomputers
on throughput-oriented jobs is still in question. But, their CPU
performance looks excellent.
samf@perform.dell.com (Sam Fuller) (03/29/91)
In article <7410003@hpnmdla.hp.com>, darrylo@hpnmdla.hp.com (Darryl Okahata) writes
|> I'd like to point out that the D-cache is 64-bits wide, to improve
|> floating-point performance. The I-cache is only 32-bits wide, and comes
|> in either 128K or 256K configurations.
|>

What does that mean? I assume a cache line is wider than 4 or 8 bytes.
Is this the width of the processor to cache bus for I and D respectively?

Sam Fuller
Dell Computer
Advanced Systems
samf@perform.dell.com

|> -- Darryl Okahata
|> UUCP: {hplabs!, hpcea!, hpfcla!} hpnmd!darrylo
|> Internet: darrylo%hpnmd@relay.hp.com
|>
|> DISCLAIMER: this message is the author's personal opinion and does not
|> constitute the support, opinion or policy of Hewlett-Packard or of the
|> little green men that have been following him all day.
jpk@ingres.com (Jon Krueger) (03/29/91)
How many have you shipped?

Who has replicated your SPECmarks?

-- Jon
--
Jon Krueger, jpk@ingres.com
jonathan@cs.pitt.edu (Jonathan Eunice) (03/30/91)
spot@CS.CMU.EDU (Scott Draves) writes:
I don't know, but it will probably run HP-UX like all its
predecessors. Now, whether or not you call that unix is another
question... :) But seriously, HP-UX is a rather dusty, but reliable
version of SysV. I'd live with it to get that much CPU.
Yep, HP-UX and toward the end of the year, OSF/1. HP-UX is a much-enhanced
BSD kernel with a System V environment above. Rather spartan in BSD
system calls and utilities, apparently.
When will they be available in volume, ie, when does it become real?
Until then, it's just a marketdroid scheme to hurt the competition and
grab publicity.
My, aren't we cynical? Not believing the vendors are we? What'll
it be next, the government? ;-)
According to HP, over 1,000 systems have apparently shipped, and
availability is either now or near-term (no later than May, I believe)
for all the goods. Not bad.