bvle@mullauna.cs.mu.OZ.AU (Binh Van LE) (07/30/90)
I am interested in getting some information on data-flow machines, namely:

- references,
- current developments,
- personal experiences and opinions.

If there is enough interest, I will post a summary.

Thanks,
Binh.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
bvle@cs.mu.OZ.AU
Computer Science Department, Melbourne University, Australia.
sakai@etl.go.jp (Shuichi Sakai) (07/31/90)
In article <9904@hubcap.clemson.edu>, bvle@mullauna.cs.mu.OZ.AU (Binh Van LE) writes:
> I am interested in getting some info on data-flow machine, namely:
>
> - references,
> - current development
> - personal experiences and opinions
>

I am Dr. Sakai of the Electrotechnical Laboratory, Japan. Our research section has already constructed two dataflow machines, SIGMA-1 and EM-4. The former consists of 128 PEs, 128 SEs and a two-layered multistage network. It recorded a performance of 170 MFLOPS in the spring of 1987. The latter is a new machine with 80 PEs. It performs 996 MIPS on the summation of 65,536 numbers, and it calculates the first 4,000 digits of pi in 0.369 sec, which is about 100 times faster than a Sparc 330.

As for SIGMA-1, my colleague will report in this newsgroup. I am a designer of the EM-4, so let me briefly report on it.

1. Features of the EM-4 prototype
(see ISCA89, IFIP89, ICS90, InfoJapan90, etc. Dr. Hiraki at IBM Watson will report on the EM-4 at ICPP90.)

(1) Strongly Connected Arc Model

A naive implementation of dataflow is not realistic. Execution locality should be extracted so as to exploit a cache or a register file. The EM-4 adopts the strongly connected arc model, which generates a critical section in a dataflow graph. The execution order of this section is determined statically, and the section is mapped onto a register-based architecture, i.e. RISC.

(2) Pipeline Integration

Two kinds of pipelines are integrated in the EM-4. One is a packet-based cyclic pipeline and the other is a register-based advanced control pipeline. The former bypasses the matching stage when the order of execution is determined in advance. Remark: a cyclic pipeline cannot stand by itself.

(3) Multiple RISC Scheme with a Single Chip Processor EMC-R

We developed a single-chip CMOS processor, the EMC-R. It contains 45,788 gates and has 299 pins. This chip is now being fabricated by LSI Logic. It has been fully functional since November 1989. It performs 12.5 MIPS.
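As a quick sanity check on the figures quoted in this post (this is not something the author supplies), the short Python sketch below multiplies the PE count by the per-chip rate and compares the result with the measured summation rate. It assumes that peak performance is simply the number of PEs times the 12.5 MIPS quoted for one EMC-R, which the post does not state explicitly; all names are made up for illustration.

```python
# Figures taken from the post; variable and function names are hypothetical.
PE_GROUP_BOARDS = 16     # "16 PE groups boards"
PES_PER_BOARD = 5        # "each of which has 5 PEs"
MIPS_PER_PE = 12.5       # quoted rate of one EMC-R chip
MEASURED_MIPS = 996.0    # quoted rate on the summation of 65,536 numbers


def peak_mips(num_pes, mips_per_pe):
    """Assumed peak: every PE issues at its quoted rate simultaneously."""
    return num_pes * mips_per_pe


def fraction_of_peak(measured, peak):
    """Measured rate expressed as a fraction of the assumed peak."""
    return measured / peak


num_pes = PE_GROUP_BOARDS * PES_PER_BOARD
peak = peak_mips(num_pes, MIPS_PER_PE)

print(f"PEs: {num_pes}")
print(f"assumed peak: {peak} MIPS")
print(f"summation benchmark: {fraction_of_peak(MEASURED_MIPS, peak):.1%} of peak")
```

Under that assumption, 16 boards of 5 PEs give 80 PEs and a peak of 1000 MIPS, which is consistent with the "max. 1 GIPS" figure, and the 996 MIPS summation result corresponds to 99.6% of that peak.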
EMC-R is a Multiple RISC in the following sense:

- small instruction set
- few instruction formats
- few addressing modes
- no microprograms
- register file architecture and RISC pipeline
- 1 clock execution of each instruction
(the above are as in a conventional RISC)

- few packet formats
- few packet types
- light synchronization
- small and effective interconnection network
- single chip with synchronization and communication facilities, both of which operate independently of and in parallel with the execution part
(the above are as in a multiprocessor RISC)

(4) Direct Matching Scheme

Matching is realized in one clock without any associative mechanisms. In addition, inside the strongly connected section, there is no dynamic synchronization.

(5) Versatile Interconnection Network with Extra Facilities

I will report on this in another paper. It has deadlock prevention facilities and automatic load balancing facilities.

(6) Maintenance Architecture

There is an auxiliary system, separate from the computation system, dedicated to maintenance of the whole structure. It can dynamically monitor the system's actions and support hardware/software debugging, performance measurement, scheduling strategies, etc.

2. Implementation

Size: 60 cm * 92 cm * 140 cm
Performance: max. 1 GIPS
Network Performance: max. 14.63 GB/s
Power: 2.6 kW
Boards:
- 16 PE group boards, each of which has 5 PEs
- 2 mother boards, which realize the global interconnection
- Interface Switch (a packet interface between the host and the EM-4)

3. Software

DFC, DFC-II: a language compatible with C
Another language is now being designed.

4. Status

Hardware with macro assembler: fully operational since April 1990
Compiler of DFC-II: completed this year
Performance Report: will appear in conference papers

For further questions, please send email to the addresses below. As a matter of fact, after Dr. Kahaner reported on our machine in this newsgroup, many sharp questions and valuable comments were sent to us.
We have not yet replied to all of them, sorry, but we will surely do so.

sakai@etl.go.jp
kodama@etl.go.jp

--
ETL, Computer Science Division, Computer Architecture Section, Shuichi Sakai
sakai@etl.go.jp
tel. 0298-58-5876  fax. 0298-58-5882  telex 362570 AISTJ