[comp.arch] FORTH ENGINES/ APL 32 bit

shri@ncst.ernet.in (H.Shrikumar) (11/30/90)
 Stack CPU is alive and well in the SC32, despite Harris killing the RTX!

In article COMP.LANG.FORTH:<1990Nov26.155122.28988@aplcen.apl.jhu.edu>
   mef@aplcen (Marty Fraeman) reminds us all:

>> All the talk about the difficulties at Harris reminded me of a
>> more recent CPU - the SC32; any news on it?
>>
>> Elliott Chapin

>The SC32 is in stock and available from: ...
>   Silicon Composers ...
...
>For those who don't know, the SC32 is a 32 bit stack microprocessor
>that does a bang up job running Forth.  At APL we are using the chip in
...
>Many sequences of Forth primitives can map into a single instruction so
>overall performance is similar to the RTX family although 32 bit
>quantities are being manipulated and the address space is much larger.
>
>Several articles, primarily written by John Hayes although I've had a
>.. FORML conference.  .. an article in JFAR .. in Embedded Systems ...
>chip's history.  I can dig up more precise references if you'd like.

   The existence of this chip almost evaporated from my memory ...
till Marty Fraeman reminded us all about it.

  Marty, could you give any pointers to articles/papers about this chip
besides those in the FORTH niche publications you mention?  Surely
the survival of a radically different architecture like the SC32/RTX2000
is news to lots of other (non-Forth) people; comp.arch, for example,
will be happy to hear more about the SC32 and its well-being.

  Perhaps there are some ASPLOS, or ACM SIG?? or IEEE ??? articles?
Or can one get a flyer from Silicon Composers?

  There is one paper in the ACM/ASPLOS-II Conference (An Architecture
for direct execution of FORTH - John Hayes, Marty Fraeman et al). I
assume the SC32 is a mature descendant of the 4um, 1.5MHz MOSIS
prototype part described there.

  For those in comp.arch following the thread about registers/caches ...
the above paper analyses Hoshagawa's (?) cut-back-K algorithm for
stack caching. You cache the top N words of the stack; the ALU can
access TOS and TOS-1 directly. On an overflow you write out K words to
memory, and on an underflow you read K words back in. Optimal is when
K=N/2.
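
  To make the spill/fill behaviour concrete, here is a rough sketch of
a cut-back-K style stack cache in Python. It is only an illustration
of the idea as described above, not the algorithm from the paper: the
class name StackCache, the choice to trigger an underflow only when
the cache is completely empty, and the counters are all my own
assumptions.

```python
class StackCache:
    """Toy model of a top-of-stack cache holding at most n words.

    Assumption: an overflow spills the k oldest cached words, and an
    underflow (pop with an empty cache) refills up to k words.
    """

    def __init__(self, n=32, k=16):
        if not 0 < k <= n:
            raise ValueError("need 0 < k <= n")
        self.n, self.k = n, k
        self.cache = []    # on-chip words, top of stack at the end
        self.memory = []   # spilled words, most recently spilled at the end
        self.overflows = 0
        self.underflows = 0

    def push(self, word):
        if len(self.cache) == self.n:
            # Overflow interrupt: write the k oldest cached words to memory.
            self.memory.extend(self.cache[:self.k])
            del self.cache[:self.k]
            self.overflows += 1
        self.cache.append(word)

    def pop(self):
        if not self.cache:
            if not self.memory:
                raise IndexError("pop from empty stack")
            # Underflow interrupt: read up to k words back in as one burst.
            count = min(self.k, len(self.memory))
            self.cache = self.memory[-count:]
            del self.memory[-count:]
            self.underflows += 1
        return self.cache.pop()


if __name__ == "__main__":
    stack = StackCache(n=32, k=16)
    for word in range(40):
        stack.push(word)
    popped = [stack.pop() for _ in range(40)]
    print(stack.overflows, stack.underflows)
```

  With N=32 and K=16, pushing 40 words and popping them again costs one
overflow and one underflow in this model, each moving a 16-word burst.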

  Simulations quoted in the paper report only 1 stack interrupt from
the parameter stack and 300-odd interrupts from the Return Stack
(which keeps track of primitive (subroutine) calls) for 1,000,000 Forth
primitives called in the parameter stack, with N=32 words and K=16.
That's a low enough cache-miss rate, and the fills are K-sized bursts.

  For the return stack, each call would push a PC. So there were
300-odd misses in 1,000,000 pushes and an equal number of pops ...
that's roughly a 99.984% cache hit rate (not surprising for code). I am
not able
to estimate a similar figure for the data stack, since more information
about the stack growth behaviour is needed. These two hit-rates need to
be weighted with the frequency of the memory fetch and store, @ and !,
operators (which go out to the bus) to arrive at an overall hit-rate
figure. Then we might be able to compare the stack CPU with an 
average register(window) RISC.
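
  The arithmetic above can be sketched in a few lines of Python. The
figure of 320 misses is only an assumed stand-in for "300-odd", and
overall_hit_rate is one possible way of doing the weighting (it simply
counts every @ and ! as a reference that goes out to the bus); neither
comes from the paper.

```python
def hit_rate(misses, accesses):
    """Fraction of stack accesses served from the on-chip cache."""
    return 1.0 - misses / accesses


def overall_hit_rate(stack_misses, memory_ops, total_refs):
    """Fraction of references served on-chip.

    Assumption: every @ or ! (memory_ops) counts as a miss, and each
    stack interrupt counts as a single miss regardless of burst size.
    """
    return 1.0 - (stack_misses + memory_ops) / total_refs


if __name__ == "__main__":
    # 1,000,000 pushes plus 1,000,000 pops on the return stack;
    # 320 misses is an assumed value for "300-odd".
    print(f"{hit_rate(320, 2_000_000):.3%}")
```

  The @ and ! frequency is left as a parameter because the post does
not give it.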

-- shrikumar ( shri@ncst.in )