ken@turtlevax.UUCP (Ken Turkowski) (05/30/84)
For a long time, I've wanted a file of auto-increment/auto-decrement registers for use in systems where one memory is used for multiple data types. The canonical operation is a matrix multiplication, where there are two source operands and a destination operand. These are all vectors, so an auto-increment operation of some type is performed after each memory access.

Normally, one has to maintain the values of these pointers in CPU registers and go through the cycle of writing the MAR (memory address register), fetching from memory, writing the MAR, fetching from memory, performing the operation, writing the MAR, writing into memory, and so on. If there were multiple MARs, each able to increment by an arbitrary amount after every access, and if one could switch between them on the fly, performance would nearly double.

In the above scenario, three pointers are used for a canonical dot-product-type operation. In the real world of computing devices, the processor needs to do other things as well, so it is useful to have a stack pointer, a heap pointer, a couple of FIFO pointers, and a local (trashable) pointer. So a file with at least 4, and preferably 8, pointers would be very useful.

Some time ago, I did a pinout calculation for a 4-deep by 16-wide MAR file with bidirectional I/O on the bus side, output only on the memory address driver side, and increment/decrement/load/hold modes, and came up with 40 pins, including power. Of course, this didn't have arbitrary increments, nor could the outputs be stacked to give more than 4 pointers. I doubt anyone could use a pointer file wider than 16 bits, so there's no need to make it expandable in width. That means you may be able to get 8 pointers, 8 increments, a bidirectional port, and a tristateable port into a 48-pin package. With that kind of functionality, I wouldn't mind the fat package.

Does anybody know of any chips suitable for implementing such an MAR file?
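To make the request concrete, here is a rough behavioral model in Python of the kind of pointer file described above. It is only a sketch, not a description of any real part: the names `MarFile`, `load`, `access`, and `dot`, the default sizes (8 pointers, 16 bits), and the memory layout in the example are all assumptions made for illustration. Each pointer carries its own step, and every access returns the current address and then steps the pointer, so the inner loop never writes an address register.

```python
class MarFile:
    """Hypothetical model of a file of auto-stepping memory address registers."""

    def __init__(self, count=8, width=16):
        self.mask = (1 << width) - 1   # addresses wrap at the register width
        self.ptr = [0] * count         # current address held by each pointer
        self.step = [0] * count        # signed step applied after each access

    def load(self, n, address, step):
        """Load pointer n with a start address and a per-access step."""
        self.ptr[n] = address & self.mask
        self.step[n] = step

    def access(self, n):
        """Return the address in pointer n, then step that pointer."""
        address = self.ptr[n]
        self.ptr[n] = (address + self.step[n]) & self.mask
        return address


def dot(memory, mar, length):
    """Dot product: pointers 0 and 1 are the sources, pointer 2 the destination."""
    total = 0
    for _ in range(length):
        total += memory[mar.access(0)] * memory[mar.access(1)]
    memory[mar.access(2)] = total


# Example layout (assumed): 2x2 matrices stored row-major in one shared memory.
memory = [0] * 32
memory[0:4] = [1, 2, 3, 4]    # A at address 0
memory[8:12] = [5, 6, 7, 8]   # B at address 8

mar = MarFile()
mar.load(0, 0, 1)    # walk row 0 of A, step 1
mar.load(1, 8, 2)    # walk column 0 of B, step 2 (the row length)
mar.load(2, 16, 1)   # destination vector at address 16
dot(memory, mar, 2)
```

The column walk is where an arbitrary increment matters: a plain increment/decrement pointer could follow the row of A but not the column of B.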
I'm not aware of any single chip that performs this function by itself, but there may be a minimal combination of parts (4-8?) that implements it.

--
Ken Turkowski @ CADLINC, Palo Alto, CA
UUCP: {amd70,decwrl,flairvax}!turtlevax!ken