[comp.arch] Chaining routines, sequencers

hutch@net1.ucsd.edu (Jim Hutchison) (05/27/88)

<>
I am interested in finding out about techniques for dynamically chaining
routines together in microcoded hardware.  Currently I am using an AMD2910
based VLIW style function processor (cheeky name) and am doing co-routines
via "jump vector" operations.  This works fine for pairs of routines, but
when I suddenly want more than just an input and an output operation...

The overhead for the linked pair is 1 instruction per "call", which can
usually be hidden behind a main memory reference.  Has anyone got a fancy
do-dad which can set up a stack to roll down?  At each call, call the
routine one farther down in the stack.  Like this:

	setup()
		push routine A
		push routine C
		push routine B
		push routine Reset Pointer
	A()
		do stuff
		call next	; co-call next filter
	B()
		do stuff
		call next	; co-call next filter

	C()
		do stuff
		call next	; co-call next filter

	Reset Pointer()
		vector stack = top
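
For readers who want to experiment with the idea in software first, the
listing above can be modelled roughly as a table of entry points plus a
pointer that moves down one slot per co-call and wraps to the top at the
bottom. This is only a minimal sketch, not the microcode itself: the names
VectorStack, call_next, run_chain and the three filter functions are all
hypothetical, and the filters are chained here in the order A, B, C for
readability.

```python
# Sketch of a "vector stack": a table of routine entry points walked one
# slot per co-call. Every name below is hypothetical, chosen for illustration.

class VectorStack:
    def __init__(self, routines):
        self.routines = list(routines)  # slot 0 is the "top" of the stack
        self.ptr = 0                    # next slot to dispatch

    def call_next(self, data):
        """Dispatch the routine at the pointer, then move one slot down."""
        routine = self.routines[self.ptr]
        # Wrapping to slot 0 plays the role of the "Reset Pointer" entry.
        self.ptr = (self.ptr + 1) % len(self.routines)
        return routine(data)


def filter_a(x):
    return x + 1   # stand-in for "do stuff"


def filter_b(x):
    return x * 2   # stand-in for "do stuff"


def filter_c(x):
    return x - 3   # stand-in for "do stuff"


def run_chain(stack, value, passes=1):
    """Push one value through the whole chain `passes` times."""
    for _ in range(passes * len(stack.routines)):
        value = stack.call_next(value)
    return value


chain = VectorStack([filter_a, filter_b, filter_c])
result = run_chain(chain, 5)   # ((5 + 1) * 2) - 3 == 9
```

Adding a new filter is then just one more entry in the table, which is the
malleability being asked for; the per-call cost is the single pointer update.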

An added convenience would be that it would reset to top of stack on bottom
out, but it is not necessary.  Why do it in software at all?  Things have
to be malleable, new filters as needed or dreamt up and sold. :-)

Suggestions?  Related hardware?  Any DSP chips that allow this?

    Jim Hutchison   		UUCP:	{dcdwest,ucbvax}!cs!net1!hutch
		    		ARPA:	Hutch@net1.ucsd.edu
Disclaimer:  The cat agreed that it would be o.k. to say these things.

guffens@kulesat.uucp (05/31/88)

In article <4981@sdcsvax.UCSD.EDU>, hutch@net1.ucsd.edu (Jim Hutchison) writes:
> <>
> I am interested in finding out about techniques for dynamically chaining
> routines together in microcoded hardware.  Currently I am using an AMD2910
> based VLIW style function processor (cheeky name) and am doing co-routines
> via "jump vector" operations.  This works fine for pairs of routines, but
> when I suddenly want more than just an input and an output operation...
>
>
           A lot Deleted ...
> 
> Suggestions?  Related hardware?  Any DSP chips that allow this?
> 
We've got such a beast up and running. It consists of a 29116 processor
coupled through a FIFO to main memory. Control is by a 2910 aided with
hardware for zero-overhead context switching. Three kinds of processes
were defined in the hardware:

	1. Return data from memory. This would return to the address which
	   was given when the memory was referenced. Each memory access would
	   cause an `interrupt'.
	2. Real-time interrupts. Each hardware interrupt would start a process.
	   The highest priority was guaranteed to get processed within 2
	   microseconds. (Test: put a 500 kHz frequency generator on it ...)
	3. Default processes. They did things which were not time critical. They
	   got started by a real-time process.

All processes would run for a very short time and then had to context switch.
Typically a process would run briefly until it needed data from memory; it
then started the access and supplied a continuation address. When the
data returned, it could continue. In between, another process was running.
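
The continuation-address scheme described above can be sketched in software
as follows. This is an illustration under assumptions, not the actual
design: the memory contents, the names start_read, proc_a, proc_b and run,
and the rule that ready processes run before returned data is delivered are
all hypothetical.

```python
from collections import deque

memory = {0x10: 7}   # hypothetical main memory contents
pending = deque()    # outstanding reads: (address, continuation)
log = []             # records the order in which things ran


def start_read(address, continuation):
    """Issue a memory read together with the routine to resume at."""
    pending.append((address, continuation))


def proc_a():
    log.append("A: issue read")
    start_read(0x10, proc_a_continue)   # then give up the processor


def proc_a_continue(value):
    log.append(f"A: got {value}")


def proc_b():
    log.append("B: runs while A waits")


def run(ready):
    """Run short processes; deliver returned data when nothing else is ready."""
    ready = deque(ready)
    while ready or pending:
        if ready:
            ready.popleft()()                # run up to its switch point
        else:
            address, continuation = pending.popleft()
            continuation(memory[address])    # data return acts as the interrupt


run([proc_a, proc_b])
# log is now:
#   ["A: issue read", "B: runs while A waits", "A: got 7"]
```

The point of the sketch is only the shape of the control flow: a process
never waits for memory, it names where to resume and lets something else run.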

The hardware consisted essentially of single-bit registers, each indicating
whether a process was active or not, feeding a priority encoder. The encoder
output was the address into a RAM holding the start addresses of the
routines. In the first half of the clock period the RAM address was
calculated and, if necessary, new addresses were loaded. In the second half
the RAM was read out. (You should use very, very fast logic to get the speed.)
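
A rough software model of the selection logic just described, for anyone
who wants to play with it: one active bit per process, a priority encoder
over those bits, and a RAM of start addresses indexed by the encoder
output. The process count, the choice of bit 0 as highest priority, the
clearing of the active bit on dispatch, and all names (Sequencer, activate,
next_address) are assumptions made for the sketch.

```python
NUM_PROCESSES = 8   # assumed size; the post does not give one


def priority_encode(active_bits):
    """Return the index of the highest-priority active bit, or None.

    Bit 0 is treated as the highest priority (an assumption).
    """
    for index in range(NUM_PROCESSES):
        if active_bits & (1 << index):
            return index
    return None


class Sequencer:
    def __init__(self):
        self.active = 0                         # the single-bit registers
        self.start_ram = [0] * NUM_PROCESSES    # start address per process

    def activate(self, index, address):
        """Load a start (or continuation) address and mark the process active."""
        self.start_ram[index] = address
        self.active |= 1 << index

    def next_address(self):
        """First half-cycle: encode. Second half-cycle: read the RAM."""
        index = priority_encode(self.active)
        if index is None:
            return None
        self.active &= ~(1 << index)            # assumed: dispatch clears the bit
        return self.start_ram[index]


seq = Sequencer()
seq.activate(5, 0x300)   # low-priority default process
seq.activate(2, 0x120)   # higher-priority real-time process
order = [seq.next_address(), seq.next_address(), seq.next_address()]
# order == [0x120, 0x300, None]
```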

The hardware (except the ram) would fit very nicely in a Xilinx. You could
even eliminate the 2910...

J. Guffens
guffens@kulesat.uucp for uucp
guffens%kulesat.uucp@blekul60 for bitnet

ir. J. Guffens
K.U.Leuven Esat-Mi2
De Croylaan 52B
3030 Leuven
Belgium Europe      for normal post.

Disclaimer : This is only my opinion...