koopman@a.gp.cs.cmu.edu (Philip Koopman) (12/14/90)
> Marty Fraeman writes:
>
> >Rather than FPGA's exclusively, I would be inclined to use a mixture
> >of LSI type parts like register files, dual port memories, ALU's and
> >some FPGA logic for ``glue''.
> >
> >If one chose the correct parts, the design could be easily migrated to
> >a standard cell or gate array library.  (2900 series bit slice
> >components, for example, are available from at least one vendor.)
>
> Yes you could do this and Phil Koopman already did.  In fact Phil
> migrated his WISC 32 from TTL to a standard cell design while at Harris.
> Perhaps he could comment on performance of the discrete vs integrated
> implementation.

The WISC CPU/16 and CPU/32 were built using a bit-slice approach with discrete TTL components. I judged that AMD 2901/2903's were too expensive, finicky to work with, and just plain overkill. In particular, the on-chip register file just didn't meet my requirements. So I used 74181/182 ALU slices and 74374 register chips.

The TTL version with ALS technology ran at 6 MHz for the 32-bit system. Porting the design onto 2.5 micron standard cell CMOS at Harris resulted in about 8 MHz operation for the RTX 32P. It would have been faster but for two reasons:

1) the design still used tristate logic, which is good in discrete TTL and slow in VLSI CMOS (muxes are often faster, and increased package count isn't an issue)
2) the design was partitioned across two chips, with an inter-chip bus.

A major overhaul of the design to take into account good CMOS design practice resulted in the BINAR chip. This 2.0 micron standard cell CMOS chip ran at between 12 and 16 MHz depending on the wafer. It was single-chip, used muxes instead of tristate buses, and was somewhat tuned for speed (I'll bet we could have gotten to 20 MHz typical with further careful tuning).

My estimate based on cursory analysis is that an FPGA design is going to be 2x to 5x slower than a gate array/standard cell design.
One reason is that most FPGA's aren't architected for CPU design (they are better at glue logic consolidation). Another is that there is a tremendous amount of interconnect capacitance that slows things down.

FPGA's with sea-of-gates architectures in the 10,000-gate range are just making it to market. Those ought to be quite interesting, and Charlie Johnsen at MISC is banking on them to get his MISC CPU built. (He told me he expects to take a big speed hit for using FPGA's, but he is using the flexibility to reconfigure the CPU on the fly to get the speed back with application-specific instruction sets.)

Charlie's design shares program memory with stack memory in the same RAM chips. It might be possible to have separate stack memories if FPGA's with on-chip memory come out (a likely possibility in the coming years).

  Phil Koopman                koopman@greyhound.ece.cmu.edu   Arpanet
  2525A Wexford Run Rd.
  Wexford, PA  15090
*** this space for rent ***