[comp.os.vms] Need faster VMS spawn

perl@rdin.UUCP (Robert Perlberg) (04/16/87)

I saw an article a while back in net.lang.lisp (that's what it was
called at the time) in which someone mentioned that they had written a
lisp interpreter for VMS which ran subprograms by spawning just one
child process and letting it hang around and using it to run all
subprograms since spawning a process takes so long in VMS.  We are now
in the situation of desperately needing to increase the speed with which
our system can start subprocesses.  We are running MicroVMS V4.3 on a
MicroVAX II.  If anyone can tell me how to use the abovementioned
technique or any way to start subprograms faster than with lib$spawn()
we would greatly appreciate it.  If we can't find any better way we are
going to be forced to link all of our object code (currently about 13
Megabytes of object code constituting 66 separate executables) into one
gigantic executable (gag!).

Thank you.

Robert Perlberg
Resource Dynamics Inc.
New York
{philabs|delftcc}!rdin!perl

perl@rdin.UUCP (Robert Perlberg) (04/17/87)

Oops!  I forgot to specify which language.  We are using VAX C and no
one here knows MACRO.

Robert Perlberg
Resource Dynamics Inc.
New York
{philabs|delftcc}!rdin!perl

paul@vixie.UUCP (Paul Vixie Esq) (04/21/87)

In article <602@rdin.UUCP> perl@rdin.UUCP (Robert Perlberg) writes:
>[...] for VMS which ran subprograms by spawning just one
>child process and letting it hang around and using it to run all
>subprograms since spawning a process takes so long in VMS. [...]

> If anyone can tell me how to use the abovementioned
>technique or any way to start subprograms faster than with lib$spawn()
>we would greatly appreciate it. [...]

At last, my VMS experience is good for something.  First: Eunice, the UNIX(tm)
emulator for VMS, uses this technique.  Second: as far as I know, you don't
need MACRO -- C will do.

In VMS, process creation is a little bit :-) different from fork/exec.
They have LIB$SPAWN, which is a higher-level interface to SYS$CREPRC, which
creates a 'subprocess' running the image (binary) of your choice.  It has no
relation to your original process other than in CPU and other accounting, and
in that the original process has some special privs in killing or changing
the priority of the subprocess.

You will have to look at the System Services manual to find out what it's
called, but I know that there is also an "exec"-like routine that overlays all
or part of your address space with a new program image.  The DCL command
interpreter uses this -- that's why RUN with no arguments is so quick -- the
system only has to load the new code into the existing address space, there's
comparatively little system table munging for that.

Anyway, here's what I remember about Eunice (from the manual, I've not seen
the code).  When you want to create a subprocess, check to see if any of the
previously created processes is in hang-around state.  If not, use SYS$CREPRC.
When a subprocess finishes, (i.e., write your own exit() to catch them on
their way out), have them open a mailbox (name it after the PID so the parent
process can open the same one later on).  Have them sit there in a SYS$QIOW
waiting for something to arrive in that mailbox.  When something arrives,
treat the arrival as the name of the program to "exec" (like I said, see
System Services manual for name, there's only one like it).  If it sits there
for more than, say, five minutes, make it exit -- the parent could be gone,
or out of the section of code that was creating lots of subprocesses.  The
parent, as you've deduced by now, must keep a list of what subprocesses
have been created and what state they are in.  In the parent, when
you want to create a subprocess and you know there's one hanging around 
reading from a mailbox -- well, of course! You just open the mailbox and
stuff the image name into it.
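
The parent-side bookkeeping described above can be sketched in portable C.
This is only a rough illustration, not Eunice's code: the table layout, the
function names, and the "WORKER_MBX_" mailbox prefix are all made up here,
and the actual VMS calls (SYS$CREPRC, mailbox creation, SYS$QIOW) are left
out.  It assumes the parent learns somehow that a worker has finished and
supplies its own clock in seconds.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_WORKERS        8
#define IDLE_LIMIT_SECONDS 300L   /* "say, five minutes" from the post */

typedef enum { SLOT_FREE, WORKER_BUSY, WORKER_IDLE } worker_state;

typedef struct {
    unsigned long pid;         /* process ID of the subprocess */
    worker_state  state;
    long          idle_since;  /* caller-supplied clock, in seconds */
} worker;

static worker pool[MAX_WORKERS];   /* zero-initialised: every slot is SLOT_FREE */

/* Derive the mailbox name from the PID so parent and child agree on it.
   The "WORKER_MBX_" prefix is made up for this sketch. */
void mailbox_name(unsigned long pid, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "WORKER_MBX_%08lX", pid);
}

/* Record a subprocess that was just created (e.g. by SYS$CREPRC) as busy.
   Returns its slot index, or -1 if the table is full. */
int record_new_worker(unsigned long pid)
{
    int i;
    for (i = 0; i < MAX_WORKERS; i++) {
        if (pool[i].state == SLOT_FREE) {
            pool[i].pid = pid;
            pool[i].state = WORKER_BUSY;
            return i;
        }
    }
    return -1;
}

/* Note that a worker finished its program and now waits on its mailbox. */
void worker_finished(unsigned long pid, long now)
{
    int i;
    for (i = 0; i < MAX_WORKERS; i++) {
        if (pool[i].state == WORKER_BUSY && pool[i].pid == pid) {
            pool[i].state = WORKER_IDLE;
            pool[i].idle_since = now;
            return;
        }
    }
}

/* Pick an idle worker to reuse and mark it busy.  Workers idle for the full
   limit are assumed to have exited on their own, so their slots are freed.
   Returns a slot index, or -1 when the caller must create a new process. */
int claim_idle_worker(long now)
{
    int i;
    for (i = 0; i < MAX_WORKERS; i++) {
        if (pool[i].state != WORKER_IDLE)
            continue;
        if (now - pool[i].idle_since >= IDLE_LIMIT_SECONDS) {
            pool[i].state = SLOT_FREE;
            continue;
        }
        pool[i].state = WORKER_BUSY;
        return i;
    }
    return -1;
}
```

When claim_idle_worker() returns a slot, the parent would open the mailbox
named by mailbox_name() and write the image name into it; when it returns -1,
the parent falls back to creating a fresh subprocess and recording it with
record_new_worker().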

Is there now vomit aplenty all over your keyboard?  Sorry, that's how it's
done.  VMS has some good points, and this isn't one of them.  Good luck...
-- 
Paul A. Vixie        {ptsfa, crash, winfree}!vixie!paul
329 Noe Street       dual!ptsfa!vixie!paul@ucbvax.Berkeley.EDU
San Francisco        
CA  94116            paul@vixie.UUCP     (415) 864-7013

jimp@cognos.uucp (Jim Patterson) (04/24/87)

We've been using a technique similar to what you've described for
the same reasons (speed).  While I can't post any source, we did
use C and I can describe the general technique.

I should first point out the difference between LIB$SPAWN and
SYS$CREPRC.  LIB$SPAWN creates effectively a copy of your DCL
session, including all current symbols and process logical names.
(It does NOT execute your LOGIN.COM file to do this, however,
and so is not quite as slow as an actual logon).  SYS$CREPRC
just runs a process; it does not have any symbols or process
logicals defined when it starts up.  Since for our application
it was quite important that the user's "context" of symbols
and logicals was maintained, we used LIB$SPAWN.

The actual process of getting LIB$SPAWN to stay around is quite
simple.  We simply set up a mailbox for the input device of
LIB$SPAWN and sent commands down to it through that mailbox.  We also set up a
"status" mailbox, whose purpose is to return status information and
also to synchronise the parent and child processes.  These mailboxes
were manipulated using $QIO calls, because they often must be asynchronous.

To execute a command in the child session, the command is written to
the input mailbox, followed by a command to write the $STATUS symbol
to the status mailbox, and then the parent reads from the status
mailbox (which blocks the parent until the child finishes).  In case
the child gets into trouble and aborts, we also use the termination
AST of LIB$SPAWN, which cancels the pending read on the status mailbox in
order to unblock the parent.  Note that this whole technique assumes that
what is being run are DCL commands.
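
The request/reply exchange above can be sketched in portable C, leaving out
the mailbox I/O itself.  This is an illustration under stated assumptions, not
the Cognos code: STATUS_MBX is a hypothetical logical name that the child DCL
session is assumed to have opened for writing beforehand, the function names
are invented, and the reply is assumed to arrive as the usual "%X..." text
form of $STATUS with no trailing newline.  The only VMS fact relied on is
that the low bit of a condition value means success and the low three bits
give the severity.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define REQ_LINE_LEN 256

/* Fill in the two records the parent would write to the input mailbox:
   the command itself, then a DCL line that reports $STATUS.
   Returns 0, or -1 if the command does not fit in one record. */
int format_request(const char *command, char lines[2][REQ_LINE_LEN])
{
    assert(command != NULL);
    if (strlen(command) >= REQ_LINE_LEN)
        return -1;
    strcpy(lines[0], command);
    strcpy(lines[1], "WRITE STATUS_MBX $STATUS");  /* STATUS_MBX: hypothetical */
    return 0;
}

/* Parse a reply such as "%X00000001" read from the status mailbox.
   Returns 0 and stores the value, or -1 if the text is not in that form. */
int parse_status(const char *text, unsigned long *value)
{
    char *end;
    unsigned long v;

    if (text == NULL || value == NULL)
        return -1;
    if (text[0] != '%' || text[1] != 'X')
        return -1;
    if (!isxdigit((unsigned char)text[2]))
        return -1;
    v = strtoul(text + 2, &end, 16);
    if (*end != '\0')
        return -1;
    *value = v;
    return 0;
}

/* Low bit set means the command succeeded. */
int status_is_success(unsigned long status)
{
    return (status & 1UL) != 0;
}

/* Severity field: 0 warning, 1 success, 2 error, 3 informational, 4 severe. */
unsigned status_severity(unsigned long status)
{
    return (unsigned)(status & 7UL);
}
```

The parent would write lines[0] and lines[1] to the input mailbox as two
separate records, then block reading the status mailbox and hand the text it
receives to parse_status().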

Here are some other points of interest:
- If not interactive, you also need to set up a mailbox for
standard output and echo it onto the parent's standard output.
This can be useful anyway if you want to control where the output
goes (e.g. if in a window environment).
- Prior to sending down commands, we found it useful to send down
a few other commands to redirect the TT and SYS$INPUT files back
to the parent's input (otherwise they point to the input mailbox).
- Terminal ASTs (control-Y, control-T) are a problem. There's been
some discussion of these problems recently on the net; I just know
that our code still has a few glitches due to errant control-Y's
in particular.

I hope this is of some help in your application.

jimp@cognos.uucp (Jim Patterson) (04/30/87)

I'm replying to your posting just to correct some apparent misunderstandings
about a few of the VMS services.

>In VMS, ...
>They have LIB$SPAWN, which is a higher-level interface to SYS$CREPRC, 

There are some significant differences between LIB$SPAWN and SYS$CREPRC
besides the interface itself.  Primarily, LIB$SPAWN is intended to create
a copy of your DCL session (or other command-interpreter).  This involves
copying over process logicals, DCL symbols and other tidbits of information.
SYS$CREPRC does none of this, but simply runs another program/process.
(That's why it runs so much faster).  However, because it doesn't copy
over this environment, it also doesn't provide the same user environment
that was available to the command interpreter user.
Before LIB$SPAWN was provided (in VMS 3.0), it was extremely difficult in
VMS to implement the equivalent of the UNIX system() call.


>You will have to look at the System Services manual to find out what it's
>called, but I know that there is also an "exec"-like routine that overlays all
>or part of your address space with a new program image.  

You won't find this in the system services manual.  The routine you're
referring to is the image activator, and it's used by DCL to merge a user
program or image into the process address space as part of a RUN command.
It's not intended to be called by user programs directly.
However, you're right that it is implemented as a system service. 

-- 
Jim Patterson
Cognos Inc.

tedcrane@batcomputer.tn.cornell.edu (Ted Crane) (05/05/87)

In article <633@cognos.UUCP> jimp@cognos.UUCP (Jim Patterson) writes:
>I'm replying to your posting just to correct some apparent misunderstandings
>about a few of the VMS services.
>
>>You will have to look at the System Services manual to find out what it's
>>called, but I know that there is also an "exec"-like routine that overlays all
>>or part of your address space with a new program image.  
>
>You won't find this in the system services manual.  The routine you're
>referring to is the image activator, and it's used by DCL to merge a user
>program or image into the process address space as part of a RUN command.

Jim's right, there is an undocumented system service which interfaces to the
image activator.  There is also a Run Time Library routine called

	LIB$FIND_IMAGE_SYMBOL

which presents a limited interface to the image activator.  It will locate
an image on disk, map it into your address space, and return to the caller
the value of a global symbol from that image.  Usually, that value is the
address of a routine entry mask, and the caller can then call the routine.
If you call LIB$FIND_IMAGE_SYMBOL again (with the same image name), it is
smart enough not to map the image again--it just returns the global symbol value
from the existing image.
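
The caller-visible behaviour just described (map the image on the first
request, answer later requests from the image already mapped, then call
through the returned address) can be modelled in portable C.  This is only a
toy model and does not call LIB$FIND_IMAGE_SYMBOL: the "images" are a
compiled-in table, and every name in it (MATHSHR, DOUBLE_IT, NEGATE,
find_image_symbol) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_IMAGES 4

typedef int (*routine)(int);

/* Stand-ins for routines that would live in a separately linked image. */
static int double_it(int x) { return 2 * x; }
static int negate(int x)    { return -x; }

typedef struct {
    const char *image;    /* image name */
    const char *symbol;   /* global symbol exported by that image */
    routine     value;    /* symbol value: here, a routine address */
} export_entry;

static const export_entry catalog[] = {
    { "MATHSHR", "DOUBLE_IT", double_it },
    { "MATHSHR", "NEGATE",    negate    },
};
#define N_EXPORTS (sizeof catalog / sizeof catalog[0])

static const char *mapped[MAX_IMAGES];  /* images already "mapped" */
static int n_mapped = 0;
static int map_count = 0;               /* how many mappings were performed */

/* Map the image unless it is already mapped.  Returns 1 on success and
   0 if the image is unknown or the table is full. */
static int ensure_mapped(const char *image)
{
    size_t k;
    int i;

    for (i = 0; i < n_mapped; i++)
        if (strcmp(mapped[i], image) == 0)
            return 1;                       /* reuse the existing mapping */
    if (n_mapped == MAX_IMAGES)
        return 0;
    for (k = 0; k < N_EXPORTS; k++) {
        if (strcmp(catalog[k].image, image) == 0) {
            mapped[n_mapped++] = catalog[k].image;
            map_count++;
            return 1;
        }
    }
    return 0;
}

/* Return the value of a symbol in an image, or NULL if either is missing. */
routine find_image_symbol(const char *image, const char *symbol)
{
    size_t k;

    assert(image != NULL && symbol != NULL);
    if (!ensure_mapped(image))
        return NULL;
    for (k = 0; k < N_EXPORTS; k++)
        if (strcmp(catalog[k].image, image) == 0 &&
            strcmp(catalog[k].symbol, symbol) == 0)
            return catalog[k].value;
    return NULL;
}
```

A caller would test the returned pointer before calling through it, since a
missing image and a missing symbol both come back as NULL in this model.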

This technique can be used to some advantage, like not including whole portions
of your program unless they are really going to be used (example: ANAL/ERRLOG
is divided into many such images, which are bound up at run time).  Unlike
shareable images, if you aren't going to call routines in the image, the image
need not even exist on disk!

One catch: the errors returned by LIB$FIND_IMAGE_SYMBOL are sometimes less
than helpful.  For example, something like "invalid key in library" (a
librarian message) when you look for a symbol that isn't in the image!