mayne@sun16.scri.fsu.edu (William (Bill) Mayne) (04/03/91)
This is a simple problem concerning a description which is giving me trouble in my dissertation. I encountered it while trying to describe dataflow execution strategies, but it could apply in other contexts.

Suppose there are the following three processes, F, G, and H, communicating by a pipeline (message queues) or dataflow arcs:

       x          G(x)       H(G(x))
   F--------->G--------->H--------->...

F could be recording events in a physics experiment, G doing some preprocessing of the results, and H doing further processing of the output of G. Each event is treated separately.

Suppose that many processors are available, and that the events of interest happen in bursts. Each time an event happens, F sends its readings to G as a packet or token. The tokens are queued until G can handle them. G similarly passes results to H, maintaining the order, so the n'th output of F is transformed into the n'th output of G, which is the n'th input of H. The time between events may sometimes be less than the time it takes G to process them, so the queue between F and G may grow.

Throughput can be increased by executing G on more than one value at a time, provided the output order, as seen by H, is maintained; i.e., we can start processing the second event of a burst before we have finished processing the first. We may have processors available to do this because H can't start until it gets the results of the first event as processed by G. (This example may make the approach look silly, but that's because it simplifies the context so much.)

Note that this resembles traditional pipeline processing, except that the calculation of G is not divided into a number of separate, fixed stages. We could theoretically have as many values in process in G as we have unfinished events and processors available to handle them. Later, as some instances of G finish, we would want to start devoting some of our processors to H. G looks to the outside world like a pipelined process, differing only in its internal structure.
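A minimal sketch of the idea, in Python, assuming the stdlib concurrent.futures module. The names preprocess/postprocess and the toy arithmetic are illustrative stand-ins for G and H, not anything from a real system; the point is only that G may run on several tokens concurrently while H still sees results in order:

```python
# Sketch of the F -> G -> H pipeline with G applied in parallel.
# Executor.map() may run preprocess() on several events at once,
# but it yields results in submission order, so the n'th output
# of G reaches H as H's n'th input.
from concurrent.futures import ThreadPoolExecutor

def preprocess(event):        # plays the role of G (illustrative)
    return event * 2

def postprocess(g_result):    # plays the role of H (illustrative)
    return g_result + 1

events = [1, 2, 3, 4, 5]      # a burst of readings from F

with ThreadPoolExecutor(max_workers=4) as pool:
    h_inputs = pool.map(preprocess, events)   # ordered, possibly concurrent
    results = [postprocess(g) for g in h_inputs]

print(results)  # [3, 5, 7, 9, 11]
```
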
But just calling it pipelining would be confusing, because there are other situations which really are pipelines (the composition of G and H in the example is one instance).

There is also a hardware interpretation of this problem. The now-defunct ETA-10 could execute vector additions using two adders in parallel, splitting each operand stream into two streams, one containing the elements in even positions and the other those in odd positions. The split streams could be processed by two independent adders, and the results interleaved to give the vector sum. (The adders may each have been pipelines themselves, but that is not important to my problem, except, perhaps, to emphasize that what I am talking about isn't exactly pipelined execution.) This is just what I have been told; I never worked on the ETA-10 myself.

I've toyed with calling this "parallel execution on pipelined data", but besides being a mouthful, I fear this misses a better, already generally accepted name from some area I don't know. The concept is not new, but I haven't come across a good short name for it, and I don't want to make one up if one already exists.

Any suggestions / references?

Bill Mayne (mayne@cs.fsu.edu)
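P.S. The even/odd split-and-interleave scheme described for the ETA-10 can be sketched as follows. This is purely my own illustration in Python (the ETA-10 did this in hardware); the two list comprehensions stand in for the two independent adders:

```python
# Illustrative even/odd split vector addition: split the operand
# streams by position, sum each substream independently (standing
# in for two hardware adders), then interleave the partial results
# back into the full vector sum.
def vector_add_split(a, b):
    evens = [a[i] + b[i] for i in range(0, len(a), 2)]  # "adder 0"
    odds  = [a[i] + b[i] for i in range(1, len(a), 2)]  # "adder 1"
    out = [0] * len(a)
    out[0::2] = evens   # results from even positions
    out[1::2] = odds    # results from odd positions
    return out

print(vector_add_split([1, 2, 3, 4], [10, 20, 30, 40]))  # [11, 22, 33, 44]
```
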