lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay) (03/02/90)
Philips/Signetics has now officially revealed its VLIW chip. So far, I've heard:

  2 integer ALUs
  32 bit path to memory
  6 ops/clock
    - 2 integer
    - 1 branch
    - 1 "constant generator" ??
    - 1 memory operation
    - ?that leaves one op unaccounted for?

Pretty scanty. Surely, someone out there has details?
-- 
Don D.C.Lindsay Carnegie Mellon Computer Science
lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay) (03/07/90)
I recently asked for information about Philips/Signetics' new VLIW chip. Things went well, so here it is:

The chip is intended as a prototype for a 32-bit ASIC family. The CREATE compiler accepts Pascal-ish programs: they are considering various other language frontends. The compiler reads a file describing the specific chip.

The prototype chip does 50 MHz and has a 200-bit instruction word, fetched over 100 pins by cycling them at 100 MHz. One obvious ASIC variation is to move the program to an on-chip memory.

The compiler supposedly can allow any number of functional units, and you just tell it what the various pipeline delays are. I assume that there are some lower bounds - at least one ALU, and so on. The prototype chip has 6 units: two ALUs, a branch unit, a memory interface unit, a register unit, and a constant generator. The memory interface unit has address/data wires to the outside. There is nothing keeping designers from adding custom units, or multiple memory interfaces. (However, I expect the first designer who tries this will trigger some compiler hacking.)

What holds everything together is the "multiport memory", which takes up a big fraction of the chip. Each unit has one or two 32-bit paths from the multiport memory, and one 32-bit path back to it. The prototype has something like 13 ports.

Now, they cheated. You would think from the word "multiported" that every result is written to an address, and every fetch is from an address. Close; they economized by having a "funnel file" attached to each read port. This is just a two-port (1R 1W) memory. When you write to multiport memory, a mongo mux takes the data to the funnel files that you specify, and to the addresses within them that you specify. When a unit's read port reads from "multiport memory", what actually happens is that an address is applied to his specific funnel file.

The funnel files seem like a reasonable trade between density and generality.
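The funnel-file scheme described above can be sketched roughly as follows. This is only an illustrative Python model, not the Philips design: the class names, the port names such as "alu0.a", the file sizes, and the per-cycle check that each funnel file receives at most one write (inferred from the 1R 1W description) are all assumptions made for the example.

```python
class FunnelFile:
    """Small 1-read/1-write memory attached to a single read port."""

    def __init__(self, size):
        self.cells = [0] * size

    def write(self, addr, value):
        self.cells[addr] = value

    def read(self, addr):
        return self.cells[addr]


class MultiportMemory:
    """Hypothetical model: one funnel file per read port, plus a write mux."""

    def __init__(self, sizes):
        # sizes maps a read-port name to the word count of its funnel file.
        self.files = {port: FunnelFile(n) for port, n in sizes.items()}

    def write_cycle(self, writes):
        """Apply one cycle of results.

        writes is a list of (value, destinations) pairs, where destinations
        is a list of (read_port, address) pairs chosen by the instruction.
        """
        # Assumed constraint: each funnel file has one write port, so it can
        # accept at most one write per cycle. Check before changing anything.
        targeted = set()
        for _value, dests in writes:
            for port, _addr in dests:
                if port in targeted:
                    raise ValueError(f"write contention on funnel file {port!r}")
                targeted.add(port)
        # The "mux": copy each value into every funnel file that will need it.
        for value, dests in writes:
            for port, addr in dests:
                self.files[port].write(addr, value)

    def read(self, port, addr):
        # A read port only ever addresses its own funnel file.
        return self.files[port].read(addr)


if __name__ == "__main__":
    mem = MultiportMemory({"alu0.a": 4, "alu0.b": 4, "mem.addr": 8})
    # One result is copied to two consumers; a second result goes to a third.
    mem.write_cycle([
        (42, [("alu0.a", 1), ("mem.addr", 3)]),
        (7, [("alu0.b", 0)]),
    ])
    print(mem.read("alu0.a", 1), mem.read("mem.addr", 3), mem.read("alu0.b", 0))
```

The point of the model is that a value needed by two units is stored twice, once in each consumer's funnel file, so a read never has to reach across to another file.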
The funnel files can all be different sizes, and they all easily have forwarding (done by referencing a special address). The compiler does have to be able to remove contention at compile time. Also, note that this scheme deals in values, not variables. (Data has to go to each funnel file that will need it, and multiple copies cost space rather than time.)

Each unit also has a 1-bit read port to multiport memory. It uses these to fetch boolean "guards" that disable writeback or interrupts. I'm not quite clear on how they compute guards, but it sounds like a good idea.

They are claiming 75 K Dhrystones, although that's fuzzy because they transliterated the 2.0 benchmark to their own language. They claim 50 to 100 VAX MIPS on suitable integer programs.

They give the impression that they can whip up instances in fairly short order. If Philips supports this hard enough for the tools to mature, it could become a very interesting ASIC option.
-- 
Don D.C.Lindsay Carnegie Mellon Computer Science