[comp.sys.transputer] VCX Timing tests, etc.

stalker@CAPSRV.JHUAPL.EDU ("M STALKER , OFFSITE", SYSCON) (09/26/90)

Dear Netters:

     From time to time, I receive questions about VCX.  Some are very 
specific; others are more general.  I will try to answer some of these 
questions to the best of my ability and post the answers in this forum for 
the benefit of anyone who might be interested in this topic. 


==============================================================================

THE VIRTUAL CHANNEL EXECUTIVE (in a nutshell)

     The Virtual Channel Executive (VCX) is a low-overhead operating 
environment which provides transparent interprocess communications.  VCX is 
designed to execute concurrently with application programs in a network of 
transputers.  Basically, VCX provides message passing between application 
programs executing in different transputers, giving them an unlimited number 
of virtual channels for communications. 

      Figure 1 below gives an abstract view of a VCX application.  An 
application Process A (located in Node x) is connected to Process B (located 
in Node y) by means of a virtual channel Z.  This virtual channel originates 
in Process A; enters VCX through a designated VCX input port (at node x); 
exits VCX from a VCX output port (at node y); and ends at Process B.  It is 
possible, and quite likely, that the path of the virtual channel will pass 
through several nodes before reaching its destination.  However, this 
operation is completely transparent to the application programmer. 


         NODE x                                 NODE y
                          |\\\\\\\\\\\\\|            
     |---------|          |/////////////|          |---------|
     | Process |    Z     |\\\\\\\\\\\\\|    Z     | Process |
     |    A    |--------->|//// VCX ////|--------->|    B    |
     |         |          |\\\\\\\\\\\\\|          |         |
     |---------|          |/////////////|          |---------|
                          |\\\\\\\\\\\\\|           


                  Figure 1.  Abstraction of VCX       
                   

      The main purpose of VCX is to free application programmers from the 
tedious and error-prone task of designing point-to-point message-passing 
algorithms for interprocess communications.  VCX practically eliminates this 
chore.  If two application processes (located in different nodes) need to 
exchange data, only two id's are required: 1) the destination node number and 
2) the number of the destination VCX's output port feeding data to the 
receiving process.  These two id's are attached to the data to form a 
"datagram" and will be used by VCX to deliver the data to its recipient. 
                                                                    
      A fully connected group of transputers offers the possibility of many 
different virtual network configurations (e.g., pipeline, mesh, toroid, tree, 
hypercube).  With VCX, a network can be booted with one of these 
configurations and then dynamically reconfigured with a different one.  This 
"on-the-fly" virtual reconfiguration capability offers many possibilities: 
dynamic network tuning, fault tolerant communications, dynamic 
allocation/partition of computing power, etc.
                                                                     
     VCX is currently supported by other software tools, executing off-line, 
which provide additional freedom to the application programmer.
                                                                     
o     One of these tools is the Routing Table Generator (RTG).       
      Its main function is to read a network information file and    
      utilize the derived topology to build a Master Routing         
      Table.  Basically, this tool takes a "network information      
      file" as input and performs the following functions:           
                                                                     
      a)     Detects and reports syntax errors or network connection 
             errors found in the input file, providing explicit      
             error messages.                                         
                                                                     
      b)     Builds an optimal set of Datagram Routing Tables,       
             eliminating any cyclic paths; and provides valuable     
             statistics about the virtual configuration of the       
             network.                                                
                                                                     
      The RTG is a proven analytical tool which helps with the       
      task of designing an efficient and reliable hardwired network  
      topology.  The network statistics help to identify optimal     
      virtual configurations and provide clues for a more            
      efficient mapping of the application program.                  
                                                                     
o     Another tool is the Network Worm (NW), which explores a        
      passive network of transputers and derives information about   
      its topology.  This tool can be used to verify the integrity   
      of the network, or to automatically generate or assist in      
      the generation of network information files.                   
                                                                     
                                                                     
WHAT IS VCX MADE OF?
                                                                     
      The Virtual Channel Executive (VCX) consists mainly of a Datagram Router 
Process and four I/O Server Processes.  These processes execute in the 
background, concurrently with any number of application processes.  The 
executable code of VCX occupies about 3 Kbytes of memory plus any additional 
memory selected by the user for buffering the datagrams.  Each node in the 
network has an identical copy of VCX, responsible for transporting "datagrams" 
throughout the network.  The routing of "datagrams" is assisted by two look-up 
tables, the Process Connectivity Table and the Datagram Routing Table.  The 
former defines the interprocess connectivity while the latter indicates the 
route to follow through the physical communication links.  The Process 
Connectivity Table is supplied by the application programmer while a set of 
Datagram Routing Tables is generated off-line by a supporting software tool 
and stored into an ASCII file.  This off-line Routing Table Generator (RTG) 
tool will find all possible paths from each transputer to all the others in 
the network.  It will also rank these paths from shortest to longest and 
eliminate any cyclic paths encountered. 


PROCESS CONNECTIVITY TABLE:

     The Process Connectivity Table defines how the different processes 
(components of an application program) are connected by "virtual channels".  
It is important to differentiate between soft-channels and virtual channels.  
A soft-channel is normally used for communications by processes which are 
physically located in the same node (transputer).  This type of communication
does not require the intervention of VCX.  On the other hand, a virtual 
channel allows the flow of data between processes which are located in 
different nodes.  VCX implements "virtual channels" in a simple manner; the 
application programmer defines these virtual channels by entering the 
proper information in the Process Connectivity Tables.

     Perhaps the best way to illustrate how to build these tables is with 
a simple example.  We start with a network of ten (10) transputers, 
physically connected by their serial communication links.  Furthermore, we 
select a virtual network configuration in the shape of a "pipeline", as 
shown in Figure 2 below.


        ROOT TRANS.
          |----|     |----|     |----|     |----|     |----|       
          |MAST|---->|BUSY|---->|BUSY|---->|BUSY|---->|BUSY|------->|
          |  1 |<----|  2 |<----|  3 |<----|  4 |<----|  5 |<----|  |
          |----|     |----|     |----|     |----|     |----|     |  |
    _____________________________________________________________|  |
   |   _____________________________________________________________|
   |  |
   |  |   |----|     |----|     |----|     |----|     |----|       
   |  |-->|BUSY|---->|BUSY|---->|BUSY|---->|BUSY|---->|SLAV|
   |<-----|  6 |<----|  7 |<----|  8 |<----|  9 |<----| 10 |
          |----|     |----|     |----|     |----|     |----|     

              Figure 2.  Network Virtual Configuration


     The nodes of this network are arbitrarily numbered 1-10, with node 1 
as the root node (also connected to the host communication bus) and node 10 as
the last node in the pipeline.  We place an application process named "MASTER"
in the root node, a copy of application process "BUSY" in nodes 2-9, and an 
application process named "SLAVE" in node 10.  The MASTER process sends a 
query to the SLAVE process via the virtual channel Q.  The SLAVE process 
returns the reply to the MASTER process via the virtual channel R.  The BUSY 
processes are there for the sole purpose of consuming CPU time in nodes 2-9, 
thus simulating real-life conditions.  The connectivity between the MASTER and
SLAVE processes is shown graphically in Figure 3.

           
        NODE #1          VCX     | ... |    VCX       NODE #10
                     |-----------| ... |-----------|
                     |           | ... |           |   
                 --->| 0  input  | ... | input   0 |<---
  |---------|    --->| 1  ports  | ... | ports     |         |---------|
  |         |---Q--->| 2         | ... |         1 |<----R---|         |
  |         |    --->| 3         | ... |           |         |         |
  | MASTER  |        |           | ... |           |         | SLAVE   |
  |         |        |           | ... |           |         |         |
  | PROCESS |        |           | ... |           |         | PROCESS |
  |         |<---R---| 0         | ... |         0 |---Q---->|         |
  |         |    <---| 1  output | ... | output    |         |         |
  |---------|    <---| 2  ports  | ... | ports   1 |--->     |---------|
                 <---| 3         | ... |           |    
                     |           | ... |           |
                     |-----------| ... |-----------|                  
                                 | ... |

                    Figure 3.  Processes Connectivity


     To fully understand how the Process Connectivity Tables are built, we 
have to take a look at the source code of the programs for this example.  
Therefore, a brief listing of the top-level source code for node 1, nodes 2-9,
and node 10 is presented next.  Some details are omitted for clarity.
Examining the following source code with reference to Figures 2 and 3, the 
reader should get a clear understanding of the Process Connectivity Tables.

NODE 1 source code:

/**************************************************************************
*                VIRTUAL CHANNEL EXECUTIVE APPLICATION                    *
*                                                                         *
*                           A Simple Example                              *
*                                                                         *
*           >>> Copyright (c) 1990 by SYSCON CORPORATION <<<              *
*                                                                         *
*   File name:    host_nod.c              Developer: Mario D. Stalker     *
*   Last revision:  09/18/90                                              *
*                                                                         *
*   Description:  This portion of the application program (Example 1)     *
*   executes on the root transputer.                                      *
*                                                                         *
**************************************************************************/

...  User's defined constants
...  Include files 

main(int argc, char *argv[])
{  
/*
*  The line below is the Process Connectivity Table for this node.  Note 
*  that only one input port is needed for this node in this example.  
*  However, we have declared four VCX input ports to better illustrate our 
*  point.
*/

   static int Virtual_Channel[VCX_PORTS][2] = {{-1,-1},{-1,-1},{10,0},{-1,-1}};

   ...  Channels and Processes declaration   
   ...  Channel Allocation 
/*
*  Start Process Allocation
*/
   p_environment = ProcAlloc(hostvcx, WSPACE, 7, argv, to_vcx, from_vcx, 
                             env_ctrl, Virtual_Channel, VCX_BUFFER, MAX_NODES);
   application_1 = ProcAlloc(master,  WSPACE, 3, env_ctrl, from_vcx[0], 
                             to_vcx[2]);
/*
*  Start VCX and Application program
*/
   ProcPar(p_environment, application_1, 0);
} 

NODES 2-9 source code:

/**************************************************************************
*                VIRTUAL CHANNEL EXECUTIVE APPLICATION                    *
*                                                                         *
*                           A Simple Example                              *
*                                                                         *
*           >>> Copyright (c) 1990 by SYSCON CORPORATION <<<              *
*                                                                         *
*   File name:    busy_nod.c              Developer: Mario D. Stalker     *
*   Last revision:  09/18/90                                              *
*                                                                         *
*   Description:  This portion of the application program (Example 1)     *
*   executes on transputers 2-9.                                          *
*                                                                         *
**************************************************************************/

...  User's defined constants
...  Include files 

main()
{
/*
*  The line below is the Process Connectivity Table for nodes 2-9.  Note 
*  that no input/output port is needed for these nodes in this example.  
*  However, we need to declare at least one port to satisfy the formal 
*  parameter list of the VCX process.
*/

   static int Virtual_Channel[VCX_PORTS][2] = {{-1,-1}};

   ...  Channels and Processes declaration   
   ...  Channel Allocation 
/*
*  Start Process Allocation
*/
   p_environment = ProcAlloc(vcx,  WSPACE, 5, to_vcx, from_vcx,
                             Virtual_Channel, VCX_BUFFER, MAX_NODES);
   application_1 = ProcAlloc(busy, WSPACE, 0); 
/*
*  Start VCX and Application program (VCX at high priority)
*/
   ProcPriPar(p_environment, application_1);
} 

NODE 10 source code:

/**************************************************************************
*                VIRTUAL CHANNEL EXECUTIVE APPLICATION                    *
*                                                                         *
*                           A Simple Example                              *
*                                                                         *
*           >>> Copyright (c) 1990 by SYSCON CORPORATION <<<              *
*                                                                         *
*   File name:    slav_nod.c              Developer: Mario D. Stalker     *
*   Last revision:  09/18/90                                              *
*                                                                         *
*   Description:  This portion of the application program (Example 1)     *
*   executes on transputer 10.                                            *
*                                                                         *
**************************************************************************/

...  User's defined constants
...  Include files 
                                                       
main()
{
/*
*  The line below is the Process Connectivity Table for this node.  Note 
*  that only one input port is needed for this node in this example.  
*  However, we have declared two VCX input ports to better illustrate our 
*  point.
*/

   static int Virtual_Channel[VCX_PORTS][2] = {{1,0},{-1,-1}};

   ...  Channels and Processes declaration   
   ...  Channel Allocation 

/*
*  Start Process Allocation
*/
   p_environment = ProcAlloc(vcx,   WSPACE, 5, to_vcx, from_vcx,
                             Virtual_Channel, VCX_BUFFER, MAX_NODES);
   application_1 = ProcAlloc(slave, WSPACE, 2, from_vcx[0], to_vcx[0]); 
/*
*  Start VCX and Application program
*/
   ProcPar(p_environment, application_1, 0);
} 
==============================================================================

     Some other frequently asked questions deal with performance.  How much 
overhead does VCX impose on the system?  What is the performance ratio 
between a system using VCX and one without it?  I am not prepared at this 
point to answer these questions thoroughly.  However, the previous simple 
example can give an idea of the propagation speed of data through the 
virtual channels when VCX is used.  To illustrate this, we first need to take 
a look at the source code for the modules MASTER, SLAVE, and BUSY. 

MASTER source code:

master(Process *p, Channel *ctrl_1, Channel *in, Channel *out)
{ 
  union data_type
  { byte b[4];
    int  w;
  } number_of_characters;

  byte *pt_to_buffer;
  int tag, string_length, i;
  int start, finish, microsec;
  uint16 length;

  pt_to_buffer = malloc(STRING_SIZE * sizeof(byte));

  tag = ChanInInt(ctrl_1);  /* Get control signal from hostvcx */

  while(1)
       { CLR_SCRN;
         printf("Enter the length of string to be retrieved ( 1 - 41): ");
         scanf("%d", &string_length);
         number_of_characters.w = string_length;   /* Store the # of characters to be retrieved */  
                                                   /* from the SLAVE process in node #10.       */ 
         start = Time();

         put_mssg(out, number_of_characters.b, 4); /* Send the above value to the SLAVE process.*/

         get_mssg(in, pt_to_buffer, &length);      /* Get the string (w/specified length) from  */
                                                   /* the SLAVE process.                        */
         finish = Time();
         
         CLR_SCRN;
         printf("\n\nRetrieved string --> ");
         for(i = 0; i < length; i++)
             printf("%c", pt_to_buffer[i]);

         microsec = ((finish - start) * 64);
         printf("\n\nTotal message propagation time:        %7d microsec", microsec);
         microsec = microsec / 18;
         printf("\n\nNode-to-node message propagation time: %7d microsec", microsec);

         query();
       }
}

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

SLAVE source code:

#include <string.h>

slave(Process *p, Channel *in, Channel *out)
{ 
  union data_type
  { byte b[4];
    int  w;
  } number_of_characters;

  byte phrase[42];   /* the 41-character string below plus its terminating NUL */
  int tag, i;
  uint16 length, string_length;

  strcpy(phrase, "This is the time for all the good men ...");

  while(1)
       {  get_mssg(in, number_of_characters.b, &length); 

          string_length = number_of_characters.w;   
                                          
          put_mssg(out, phrase, string_length);
       }
}

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

BUSY source code:

busy(Process *p)
{ 
  int x, y, z;
  x = 1313;
  y = 919191919;     /* arbitrary dividend; must fit in a 32-bit int */
  while(1)
    { 
       z = y / x;
    }
}

     Briefly, the MASTER process sends an integer value to the SLAVE process. 
This value tells the SLAVE process the length of the string it should send 
back to the MASTER process.  The time interval is measured from just before 
the MASTER sends the integer value to just after the MASTER receives the 
string.  A series of tests was conducted under different conditions; the 
results are summarized below:

              TABLE 1.  Datagram Propagation Delay Chart
       

                   |  String Length   |    Total Time    |  Average 
                   |                  |                  |  Node-to-Node time
-------------------|------------------|------------------|--------------------
BUSY ON            |      1 byte      |    2048 microsec |   113 microsec
VCX High Priority  |     41 bytes     |    2240 microsec |   124 microsec
-------------------|------------------|------------------|--------------------
BUSY ON            |      1 byte      |  152384 microsec |  8465 microsec
VCX Same Priority  |     41 bytes     |  152128 microsec |  8451 microsec
-------------------|------------------|------------------|--------------------
BUSY OFF           |      1 byte      |    1984 microsec |   110 microsec
VCX Same Priority  |     41 bytes     |    2176 microsec |   120 microsec
------------------------------------------------------------------------------

Note: VCX was always executing at the same priority in nodes 1 and 10. 
      The total time includes the time it takes for the integer value
      to go from the MASTER process to the SLAVE process, the time for
      preparing the string to be sent, and the time for the string to
      go from the SLAVE process to the MASTER process.  The average 
      Node-to-Node time is (total time)/18, since the round trip 
      crosses the 9 links of the pipeline once in each direction.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+                                                                            +
+   Answers to last-minute questions:                                        +
+                                                                            +
+   1.  To those who are requesting additional information about VCX.        +
+                                                                            +
+       Please be patient; I will forward more information to you as         +
+       it becomes available.  I really appreciate your interest.            +
+                                                                            +
+   2.  We have tentative plans to port VCX to other "C" compilers           +
+       (other than LSC), and possibly adapt VCX to other high-level         +
+       languages supported by transputers.  However, this takes time        +
+       and money.  Does anybody out there have a few dollars to invest?     +
+                                                                            +
+   3.  We have not compared the VCX "low overhead" against other similar   +
+       environments.  However, we conducted some timing experiments to      +
+       measure the propagation delay of datagrams.  The data from one       +
+       of those tests is presented above in this message.  The size of      +
+       the VCX kernel to be loaded in each transputer is approximately      +
+       3 Kbytes, some of which is used by internal data buffers.            +
+                                                                            +
+   4.  Quite possibly, VCX may be sold in binary executable format          +
+       with the option to purchase the source code for an additional        +
+       price.  However, the packaging, price, support, etc. are still       +
+       under consideration.                                                 +
+                                                                            +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

For more information contact:   Mario D. Stalker
                                SYSCON Corporation
                                9841 Broken Land Pkwy, Suite 210
                                Columbia, Maryland  21046

   Telephone:     (301) 381-8319
   FAX:           (301) 381-8321

   E-Mail:  stalker@capsrv.jhuapl.edu
* ------------------------------------------------------------------------- *
|                                                                           |
|  SYSCON Corporation, a subsidiary of Harnischfeger Industries, Inc.       |
|  Corporate Headquarters                                                   |
|  1000 Thomas Jefferson Street, N.W.                                       |
|  Washington, D.C.  20007                                                  |
|                                                                           |
|  System Engineering, Computer Systems, Training and Simulation Systems,   |
|  Facilities Management, Technical Services, Hardware/Software Products.   |
|                                                                           |
|  Parallel Processing Architectures, Real Time Programming, Software Tools |
|  Development, X-Windows, Graphic Tools Developments.                      |
|                                                                           |
* ------------------------------------------------------------------------- *