[net.unix-wizards] Automated Structured Analysis

bill@milford.UUCP (bill) (01/31/85)

One pet concern of mine has been whether the development process
could be automated to the extent that structured charts might
no longer be necessary, i.e.


UNIX as a virtual dataflow machine:
                                  

Using one of the most common examples, a spelling checker: 

                                                       ---------   
                                   ---------  words   /         \  
                                  /Have each\------->/ Sort the  \ 
                                 / word be on\       \ words, no / 
        ------------  stripped   \ a separate/        \duplics. /  
       /            \------------>\  line   /          ---------   
      /remove gramm. \document     ---------                |  
     | symbols from   |                                     |unique
     |  the text      |                                     |words  
      \              /                                      V       
    >  \            /                                    ----------   
   /    ------------             ______________         /          \  
  /                               dictionary    ------>/find words  \ 
 /document                       --------------        \  not in    / 
/                                                       \dictionary/  
                                                       / ----------   
                                                      / 
                                             <------- misspelt words


From such a DFD could be constructed the shell script:

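tr -cs 'A-Za-z' ' ' |      # replace runs of non-letters with one space
sed 's/  */\
/g' |                      # put each word on its own line
sort -u |                  # sort, dropping duplicates
comm -23 - dictionary      # keep words absent from the (sorted) dictionary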

(NOTE: I haven't tested this shell script; it's just an illustration :-)
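
To make the pipeline concrete, here is a small run on made-up data. This is only a sketch: the sample sentence, the four-word dictionary, the temporary file names, and the case-folding step are all my assumptions and are not part of the DFD. It also collapses the first two bubbles into one 'tr' that turns every run of non-letters straight into a line break. Note that 'comm -23' prints the lines found only in its first input, and that both inputs must be sorted the same way.

```shell
#!/bin/sh
# Made-up sample data; the file names are illustrative assumptions.
dir=$(mktemp -d) || exit 1
printf 'The quick brwon fox, the lazy dgo.\n' > "$dir/document"
# comm needs both inputs in the same sort order, so sort the dictionary too.
printf 'fox\nlazy\nquick\nthe\n' | LC_ALL=C sort > "$dir/dictionary"

misspelt=$(
  tr 'A-Z' 'a-z' < "$dir/document" |   # fold case (an addition, not in the DFD)
  tr -cs 'a-z' '\n' |                  # runs of non-letters become one line break
  LC_ALL=C sort -u |                   # unique words, sorted
  LC_ALL=C comm -23 - "$dir/dictionary"  # words only in the document
)
printf '%s\n' "$misspelt"
rm -r "$dir"
```

On this input the two transposed words, brwon and dgo, are the only ones reported.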

If the DFD is a network, it gets more complicated, but the "readslow"
routine supplied by Kernighan and Pike can provide the key:          
                                                           

                                      --------   
                          ---------> /        \  
               ------    /           \ do_b   /\            --------   
              /      \--/\            --------  \          /        \  
             /        \   \                      -------->/          \ 
------------>\ do_a   /    \                             |do_b_do_b_do|
              \      /      \          --------           \          / 
               ------        \        /        \---------->\        /  
                              ------>\ do_c     /           --------   
                                      \        /                       
                                       --------                        
                                                                       
could become:
             
  ... | do_a | tee temp1 | do_b ...&
readslow temp1 | do_c ...

As Kernighan and Pike describe it: "Structurally, readslow is identical
to cat except that it loops instead of quitting when it encounters the
current end of input."
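
For comparison, the same split-and-rejoin shape can be sketched with a named pipe instead of a temp file plus readslow. To be clear, this swaps in a different mechanism (mkfifo) from the one described above, and the bodies of do_a, do_b and do_c are hypothetical stand-ins that I made up so the sketch runs. The final 'cat' plays the part of the collector bubble in the crudest possible way, by concatenating the two branches after both have finished.

```shell
#!/bin/sh
# Hypothetical stand-ins for the bubbles in the diagram.
do_a() { tr 'a-z' 'A-Z'; }
do_b() { sed 's/^/b: /'; }
do_c() { sed 's/^/c: /'; }

dir=$(mktemp -d) || exit 1
mkfifo "$dir/pipe1"

# Lower branch: do_c reads its copy of the stream from the named pipe.
do_c < "$dir/pipe1" > "$dir/out_c" &

# Upper branch: tee is the splitter, feeding both the pipe and do_b.
printf 'x\ny\n' | do_a | tee "$dir/pipe1" | do_b > "$dir/out_b"
wait   # let the background branch finish

# Collector: here simply one branch after the other.
result=$(cat "$dir/out_b" "$dir/out_c")
printf '%s\n' "$result"
rm -r "$dir"
```

Unlike readslow, the reader of a named pipe blocks until data arrives and sees end of input when the writer closes, so nothing has to loop and retest the file.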

Related to this is the question of whether any data flow diagram
could be decomposed into a combination of 'tee', 'joinslow' (like
the do_b_do_b_do above), 'comm', and single-input-single-output
filters. That is:
  1) fan_in=2, fan_out=1: 'a collector'
  2) fan_in=1, fan_out=2: 'a splitter'
  3) fan_in=1, fan_out=1: 'a transformer'

These would run as slooooooow as molasses of course...
but is it conceivable that a "supercomputer" could be
set up as a virtual Unix dataflow machine?
Elements such as 'grep', 'sort', and 'tr' could be viewed
analogously to irreducible machine operations: 'lea', 'addb', etc.
The objects acted upon by 'grep' would be lines, by 'sort' would
be complete files, by 'tr' would be single characters; when a
completed object is available to such a dataflow element, then
{the kernel ?} would activate that process element, producing
hopefully another object needed by another process. I suspect that what
currently eats up time in the pipelines shown above is each
routine testing again and again for objects to act upon.

Well, enough of this 3/4 baked idea, any feedback?