[comp.os.vms] Parallel Programming

vtcf@NCSC.ARPA (Williams) (10/01/87)

Hello all,

I want to write a program that does the following:

1) spawn two or more subprocesses that act out some scenario (maybe a 
   dogfight or something).  These subprocesses may spawn other subprocesses
   that perform some function for their parents.

2) have a global data area that collects all pertinent data from these
   subprocesses, which the subprocesses would both read from and write to.

3) spawn another process that sends this data to a graphics workstation
   so it can "act out" the scenario graphically

I'm ordering the Parallel library routines from DECUS so I won't have to
get too deep into system services, but I have a few questions about it.
I have the book "Introduction to Parallel Programming", put out by DEC,
that explains the PLIB routines, but it's a little hazy on spawning
subprocesses that spawn subprocesses, especially if you don't know in advance
how many subprocesses you'll end up with.  Can this be done with these routines,
or do I have to just dive into the system services?  I'm new to this type
of programming, so any help (hints, horror stories, etc.) would be greatly
appreciated.

Thanks

Tom Williams
Code 4210
Naval Coastal Systems Center
(904) 234-4699

vtcf@ncsc.arpa

SCHOMAKE@HNYKUN53.BITNET.UUCP (10/02/87)

Number_of_lines: 124.
(Thanks to all who responded to my question on Fortran & C file compatibility
 and who appreciated the underlying trickiness: Fortran SEGMENTED records)

[]
Occam and Ada programmers better skip this one to avoid fits of laughter...
We also needed a system for running modules in a quasi-parallel fashion. After
some experimentation with VMS mailboxes, we decided to take another route.
Instead of pumping data through pipes, we just place short "pointers" on a
"blackboard". The following is an abstract of some discussions we had last
year, followed by a pragmatic solution.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Considerations:

1) We have a software system (signal processing and pattern recognition) that
   is complex and rapidly evolving, but we want to keep it manageable.
2) We want modularity at all levels.
3) Modules sometimes have to operate in a "parallel" fashion.
4) Building (compiling and linking) very large programs is either
   time-consuming or needlessly complex (shareable images).
5) We want the modules to be self-contained, but:
6) We want minimum system dependence at this stage. So: code dedicated to
   the "parallel" execution may NOT reside within the basic modules.
7) The modules must be usable in both interactive and batch fashion.
8) The modules are written in plain ANSI Fortran-77 or plain C. They may have
   to be used interchangeably, but we do not want to get addicted to the VMS
   feature of interchangeable object modules.

Here is a possible solution.

- The modules will be stand-alone executables.
- The arguments they get are the arguments and options on the command line.
  A portable library will parse these arguments. It should work under
  VMS and Unix (and, at some point, MS-DOS and OS/2).

  Example   $ MYPROG {IN.DAT }OUT.DAT /OPT /FLAG=10 /TEXT="abc"

- A command procedure (VMS), script (Unix) or batch file (MS-DOS)
  is in fact the .MAIN. program.
- In an OS that is multi-tasking, we can run modules concurrently.
- Messages (=argument lists) are passed via memory-resident tables.
- The global (bulk) data area is the existing file system. Only when the
  high-level design has been finished will we look into optimizing
  data exchange. Possible solutions: global memory sections, data demons, etc.
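
As an illustration of the portable argument parser mentioned in the list
above, here is a minimal sketch in plain C. It is not our actual library. It
assumes, going only by the MYPROG example, that "{" marks an input file, "}"
marks an output file, and "/NAME" or "/NAME=VALUE" is an option. The names
ModuleArgs, parse_module_args and module_opt are made up for this sketch.

```c
#include <assert.h>
#include <string.h>

#define MAX_OPTS 16
#define MAX_LEN  64

/* Hypothetical result structure; field names are illustrative only. */
typedef struct {
    char input[MAX_LEN];            /* token that started with '{' (assumed: input)  */
    char output[MAX_LEN];           /* token that started with '}' (assumed: output) */
    int  nopts;                     /* number of /NAME[=VALUE] options seen          */
    char name[MAX_OPTS][MAX_LEN];   /* option names, without the leading '/'         */
    char value[MAX_OPTS][MAX_LEN];  /* option values, "" for a bare flag             */
} ModuleArgs;

/* Copy the first n characters of src into a fixed-size field, truncating. */
static void copy_field(char *dst, const char *src, size_t n)
{
    if (n > MAX_LEN - 1)
        n = MAX_LEN - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
}

/* Copy an option value, stripping one pair of surrounding double quotes. */
static void copy_value(char *dst, const char *src)
{
    size_t n = strlen(src);
    if (n >= 2 && src[0] == '"' && src[n - 1] == '"') {
        src++;
        n -= 2;
    }
    copy_field(dst, src, n);
}

/* Parse a list of tokens (the command line without the program name).
   Returns 0 on success, -1 on an unrecognized token or too many options. */
int parse_module_args(int argc, char **argv, ModuleArgs *out)
{
    assert(out != NULL);
    memset(out, 0, sizeof *out);
    for (int i = 0; i < argc; i++) {
        const char *tok = argv[i];
        if (tok[0] == '{') {
            copy_field(out->input, tok + 1, strlen(tok + 1));
        } else if (tok[0] == '}') {
            copy_field(out->output, tok + 1, strlen(tok + 1));
        } else if (tok[0] == '/') {
            if (out->nopts == MAX_OPTS)
                return -1;
            const char *eq = strchr(tok, '=');
            size_t nlen = eq ? (size_t)(eq - tok - 1) : strlen(tok + 1);
            copy_field(out->name[out->nopts], tok + 1, nlen);
            copy_value(out->value[out->nopts], eq ? eq + 1 : "");
            out->nopts++;
        } else {
            return -1;
        }
    }
    return 0;
}

/* Look up an option by name; returns its value, or NULL if it was not given. */
const char *module_opt(const ModuleArgs *a, const char *name)
{
    for (int i = 0; i < a->nopts; i++)
        if (strcmp(a->name[i], name) == 0)
            return a->value[i];
    return NULL;
}
```

A module's main() would call parse_module_args(argc - 1, argv + 1, &args)
and then ask for options with module_opt(&args, "FLAG"). Since the module
sees only tokens, the same code should serve whether the tokens come from
DCL, a Unix shell or a batch file.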

Each module is embedded in a looping command procedure.
This command procedure is spawned once (or the equivalent operation) and
starts hibernating. As soon as the process is woken up, the
command procedure looks at a global OS variable (VMS /JOB logical,
Unix symbol, PC memory segment) and starts the module:

The VMS example:

(spawned subprocess MYPROG_DEMON)
         $loop:
         $ sleep                           !hibernate
         $ args = f$trnlnm("MYPROG_MESS")  !get arguments from blackboard
         $!                                !  i.e. the /JOB logical table
         $ MYPROG "''args'"                !run module
         $ goto loop                       !sleep again


Activating a module once it is spawned means:

     1) setting the blackboard variable that is to contain the arguments
     2) waking up the corresponding process.

These activation functions can be combined in a single procedure, called
MESSAGE.COM, e.g.:

    $ args = "''p2' ''p3' ''p4' ''p5' ''p6' ''p7' ''p8'"
    $ define/job 'p1'_MESS "''args'"   !post arguments on the blackboard
    $ wake 'p1'_DEMON                  !wake the corresponding demon
thus:
    $ @message myprog {a.in }a.out

To keep things simple, messages are not queued: if a message is sent to a
process that is already active, WAKE waits until that process hibernates again
(you cannot wake somebody who is working already).
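
The single-slot, no-queue behaviour described above can be mimicked in a
small portable C model. This is only a sketch of the semantics, run inside
one process: it makes no VMS calls, and every name in it (Blackboard,
demon_birth, demon_message, demon_sleep, demon_args) is hypothetical. Where
the real WAKE would wait for the demon to hibernate, the model simply
returns 1 so that the caller can decide to retry.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEMONS 8
#define NAME_LEN   32
#define MESS_LEN   128

typedef enum { DEMON_HIBERNATING, DEMON_ACTIVE } DemonState;

/* One blackboard entry per demon: a single message slot, no queue. */
typedef struct {
    char       name[NAME_LEN];
    char       mess[MESS_LEN];   /* stands in for the /JOB logical name */
    DemonState state;
} Demon;

typedef struct {
    Demon demons[MAX_DEMONS];
    int   count;
} Blackboard;

static Demon *find_demon(Blackboard *bb, const char *name)
{
    for (int i = 0; i < bb->count; i++)
        if (strcmp(bb->demons[i].name, name) == 0)
            return &bb->demons[i];
    return NULL;
}

/* Register a demon; it starts out hibernating. Returns 0, or -1 if the
   table is full or the name is already taken. */
int demon_birth(Blackboard *bb, const char *name)
{
    assert(bb != NULL && name != NULL);
    if (bb->count == MAX_DEMONS || find_demon(bb, name) != NULL)
        return -1;
    Demon *d = &bb->demons[bb->count++];
    snprintf(d->name, NAME_LEN, "%s", name);
    d->mess[0] = '\0';
    d->state = DEMON_HIBERNATING;
    return 0;
}

/* Post arguments and wake the demon. Returns 0 on success, -1 if the demon
   is unknown, 1 if it is still active (the real WAKE would wait here). */
int demon_message(Blackboard *bb, const char *name, const char *args)
{
    Demon *d = find_demon(bb, name);
    if (d == NULL)
        return -1;
    if (d->state == DEMON_ACTIVE)
        return 1;
    snprintf(d->mess, MESS_LEN, "%s", args);
    d->state = DEMON_ACTIVE;
    return 0;
}

/* Demon side: the work is done, go back to sleep. */
int demon_sleep(Blackboard *bb, const char *name)
{
    Demon *d = find_demon(bb, name);
    if (d == NULL)
        return -1;
    d->state = DEMON_HIBERNATING;
    return 0;
}

/* Demon side: read the arguments currently on the blackboard. */
const char *demon_args(Blackboard *bb, const char *name)
{
    Demon *d = find_demon(bb, name);
    return d ? d->mess : NULL;
}
```

In this model a second demon_message() to a busy demon leaves the first
message untouched, which is one way of reading "messages are not queued".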

An advantage is that module spawning occurs only once and that cumbersome
mailbox techniques are not necessary. For message passing, you can fairly
easily make something that is the equivalent of the VMS /JOB table logicals
when going to another operating system.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

All this boiled down to a system with a few small command procedures
and three small Fortran programs. In VMS it looks like this:

                                      Lines:       Essence:
1 bdemon.com  Birth of a demon           6    DCL SPAWN MYPROG_DEMON
2 demon.com   Demon cycle (sleep/work)  97    Running MYPROG.EXE
3 sleep.for   I (demon) go to sleep    191    SYS$HIBER
4 state.for   Is he sleeping?          324    SYS$GETJPI
5 wake.for    Wake him!                 26    SYS$WAKE
6 mess.com    Tell a demon what to do   39    DCL DEF/JOB MYPROG_MESS "args.."
7 kill.com    Kill a demon               9    DCL STOP MYPROG_DEMON

For us, it works fine, especially since the source of the demon modules
(Fortran, C) is not polluted with any heavily system dependent code.
The hibernate state of waiting modules ensures that inactive demons put
very little load on the CPU. Activating is done by the very fast SYS$WAKE.
Hierarchies are determined by the message passing paths, not by hierarchies
of subprocesses. There are also some disadvantages:

  -messages are not queued. Sometimes this is what you need; at other times
   queuing would be handy. A queue handler can be added easily, however.
  -the global data area is in fact the disk. In our case this is the obvious
   choice because of the massive amount of data. For experimenting this is no
   problem either. Later, things could be made more efficient by means of an
   active data demon that receives requests for data items and returns them.
  -this is no "state-of-the-art" message passing technique, but it is simple
   and robust. Low-level scheduling is left to the OS, and demon_X cannot
   interrupt the actions of demon_Y. It can only leave behind a message for
   the next cycle of demon_Y.
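
For the queue handler mentioned under the first disadvantage, something
along the following lines might do. It is a sketch only, not part of the
system described here: a small fixed-size FIFO of argument strings per
demon, written in portable C. The names MessQueue, queue_put and queue_get,
and the capacity of 4, are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define QUEUE_CAP 4      /* assumed capacity, chosen arbitrarily */
#define MESS_LEN  128

/* Ring buffer of pending messages (argument lists) for one demon. */
typedef struct {
    char slots[QUEUE_CAP][MESS_LEN];
    int  head;    /* index of the oldest message */
    int  count;   /* number of messages waiting  */
} MessQueue;

/* Append a message. Returns 0 on success, -1 if the queue is full. */
int queue_put(MessQueue *q, const char *args)
{
    assert(q != NULL && args != NULL);
    if (q->count == QUEUE_CAP)
        return -1;
    int tail = (q->head + q->count) % QUEUE_CAP;
    snprintf(q->slots[tail], MESS_LEN, "%s", args);
    q->count++;
    return 0;
}

/* Remove the oldest message into out. Returns 0, or -1 if the queue is empty. */
int queue_get(MessQueue *q, char *out, size_t outlen)
{
    assert(q != NULL && out != NULL && outlen > 0);
    if (q->count == 0)
        return -1;
    snprintf(out, outlen, "%s", q->slots[q->head]);
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return 0;
}
```

The sender would call queue_put() instead of overwriting the single
blackboard slot, and the demon would call queue_get() at the top of each
cycle, hibernating only when it returns -1. Where such a queue would live
(a file, a set of logicals, a global section) is left open here.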

                 *
               ^^^^^
      KKKKKUUUUNNNNN
      KKK  UUUU NNNN           Lambert Schomaker
      K    UUUU  NNN           SCHOMAKE@HNYKUN53.BITNET
      KKK  UUUU   NN           Nijmegen, The Netherlands.
      KKKKK UU     N