[comp.sys.atari.st] Pexec Cookbook

neil@cs.hw.ac.uk (Neil Forsyth) (02/12/88)

Does anyone have a copy of Allan Pratt's Pexec Cookbook they can mail me?

Thanks in advance

Neil

-------------------------------------------------------------------------------
"I think all right thinking people in this country are sick and tired of being
told that ordinary decent people are fed up in this country with being sick and
tired. I'm certainly not and I'm sick and tired of being told that I am!"
- Monty Python

 Neil Forsyth                           JANET:  neil@uk.ac.hw.cs
 Dept. of Computer Science              ARPA:   neil@cs.hw.ac.uk
 Heriot-Watt University                 UUCP:   ..!ukc!cs.hw.ac.uk!neil
 Edinburgh
 Scotland
-------------------------------------------------------------------------------

tw@cscosl.ncsu.edu (Thomas Wolf) (02/18/88)

In article <1691@brahma.cs.hw.ac.uk> neil@cs.hw.ac.uk (Neil Forsyth) writes:
>
>Does anyone have a copy of Allan Pratt's Pexec Cookbook they can mail me?
>

I would also be interested in a copy.  If there are others, perhaps someone
could post it?  If not, please mail me a copy.

Tom Wolf
ARPA (I think): tw@cscosl.ncsu.edu
           or wolf@csclea.ncsu.edu

robert@richp1.UUCP (Robert Miller) (02/20/88)

In article <1497@ncsuvx.ncsu.edu> tw@cscosl.UUCP (Thomas Wolf) writes:
>In article <1691@brahma.cs.hw.ac.uk> neil@cs.hw.ac.uk (Neil Forsyth) writes:
>>
>>Does anyone have a copy of Allan Pratt's Pexec Cookbook they can mail me?
>>
>
>I would also be interested in a copy.  If there are others, perhaps someone
>could post it?  If not, please mail me a copy.
>
>

I too would be interested in a copy.


-- 
.......................................
"To open, cut along dotted line."  ....:.......................................
                                  :     :  Robert Miller @ ihnp4!richp1!robert
                                   .....

uace0@uhnix2.UUCP (Michael B. Vederman) (02/21/88)

Here is the Pexec cookbook, plus some information posted by Allan Pratt some
time ago.

================ like cut here like ==========================


This is in response to a request from Christian Kaernbach which I got
from BITNET: I can't reply directly to BITNET, but I'm sure other people
will find this interesting, too: it's a preliminary version of the
long-awaited Pexec cookbook!

In broad terms, the things you have to know about Pexec are that it
starts up a process, lets it execute, then returns to the caller
when that process terminates.  The "caller" -- the process which used Pexec
in the first place -- has some responsibilities: it has to make memory
available to the OS for allocation to the child, and it has to build
up the argument string for the child.

All GEMDOS programs are started with the largest block of OS memory 
allocated to them.  Except in very rare circumstances, this block
is the one stretching from the end of the accessories and resident
utilities to the beginning of screen memory. The point is that your
program has probably been allocated ALL of free memory.  In order to
make memory available for a child process, you have to SHRINK the
block you own, returning the top part of it to GEMDOS.  The time to
do this is when you start up.

If you use Alcyon C (from the developer's kit), you know that you
always link with a file called GEMSTART.  If you've been paying 
attention, you should have gotten the *new* GEMSTART from Compuserve
(or from somebody else who got it): I wrote that GEMSTART.  In
GEMSTART.S, there is a lot of discussion about memory models, and then
a variable you set telling how much memory you want to keep or give back
to the OS.  Make your choice (when in doubt, use STACK=1), assemble
GEMSTART.S, call the result GEMSEXEC.O (or something), and link the
programs which Pexec with that file rather than the normal GEMSTART.

Now here's a discussion of what GEMSTART has to do with respect to
keeping or returning memory:

Your program is invoked with the address of its own basepage as
the argument to a function (that is, at 4(sp).l).  In this basepage
is the structure you can find in your documentation.  The interesting
fields are HITPA (the address of first byte NOT in your TPA),
BSSBASE (the first address of your bss) and BSSLEN (the length of
your BSS).
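
As a rough illustration of the structure mentioned above, here is a sketch
of the 256-byte basepage written in modern C for a host machine (not for
Alcyon).  The field names (p_hitpa, p_bbase, p_blen and so on) follow a
common naming convention and are an assumption; the text above only calls
them HITPA, BSSBASE and BSSLEN.  Pointer fields are modelled as 32-bit
integers so the offsets hold on any host, and stack_heap_base() is a
hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout sketch of the GEMDOS basepage.  Addresses are modelled as
 * 32-bit integers so the offsets are the same on a modern host. */
typedef struct basepage {
    uint32_t p_lowtpa;      /* 0x00: start of the TPA (the basepage itself) */
    uint32_t p_hitpa;       /* 0x04: HITPA, first byte NOT in the TPA       */
    uint32_t p_tbase;       /* 0x08: start of the text segment              */
    uint32_t p_tlen;        /* 0x0c: length of the text segment             */
    uint32_t p_dbase;       /* 0x10: start of the data segment              */
    uint32_t p_dlen;        /* 0x14: length of the data segment             */
    uint32_t p_bbase;       /* 0x18: BSSBASE, first address of the BSS      */
    uint32_t p_blen;        /* 0x1c: BSSLEN, length of the BSS              */
    uint32_t p_dta;         /* 0x20: disk transfer address                  */
    uint32_t p_parent;      /* 0x24: basepage of the parent process         */
    uint32_t p_reserved;    /* 0x28: reserved                               */
    uint32_t p_env;         /* 0x2c: address of the environment strings     */
    uint8_t  p_undef[80];   /* 0x30: unused                                 */
    uint8_t  p_cmdlin[128]; /* 0x80: command tail, length byte then text    */
} BASEPAGE;

static_assert(sizeof(BASEPAGE) == 256, "basepage must be 256 bytes");
static_assert(offsetof(BASEPAGE, p_cmdlin) == 0x80, "command tail at 0x80");

/* Hypothetical helper: first address above the BSS, i.e. the bottom of
 * the "stack+heap" space described in the text. */
uint32_t stack_heap_base(const BASEPAGE *bp)
{
    return bp->p_bbase + bp->p_blen;
}
```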

Your stack pointer starts at HITPA-8 (because 8 is the length of the
basepage argument and the dummy return PC on the stack).  The space from
BSSBASE+BSSLEN to your SP is the "stack+heap" space.  Library malloc()
calls use this space, moving a pointer called the "break" (in the
variable __break, or the C variable _break if you use Alcyon C) up as it
uses memory.  Your stack pointer moves down from the top as it uses
memory, and if the sp and _break ever meet, you're out of memory.  In
fact, if they ever come close (within a "chicken factor" of about 512
bytes or 1K), malloc() will fail because it doesn't want your stack to
overwrite good data. 
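
The break-versus-stack check can be sketched roughly as follows.  This is
a toy model for a host machine, not the Alcyon library's malloc(): the
struct arena type, the arena_alloc() name and the 1K value chosen for the
chicken factor are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHICKEN 1024u  /* assumed safety margin; the text says 512 bytes or 1K */

/* Hypothetical model of the "stack+heap" space: the break grows up
 * from the end of the BSS, the stack pointer grows down from the top. */
struct arena {
    uint32_t brk;  /* current break (next free heap address) */
    uint32_t sp;   /* current stack pointer                  */
};

/* Move the break up by nbytes and return the old break, or return 0
 * if that would bring the break within CHICKEN bytes of the stack. */
uint32_t arena_alloc(struct arena *a, uint32_t nbytes)
{
    assert(a != NULL && a->brk <= a->sp);

    uint32_t room = a->sp - a->brk;
    if (room < CHICKEN || nbytes > room - CHICKEN)
        return 0;  /* too close to the stack: refuse */

    uint32_t block = a->brk;
    a->brk += nbytes;
    return block;
}
```

Note that the check uses the stack pointer at the moment of the call, so
a deep recursion after a successful allocation can still collide with
the heap.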

When a process starts, it gets *all* of memory allocated to it: from the
end of any accessories or resident utilities up to the default screen
memory.  If you want to use Pexec, you have to give some memory back to
the OS.  You do this with the Mshrink call.  Its arguments are the
address of the memory block to shrink (your basepage address) and the
new size to shrink it to.  You should be sure to leave enough room above
your BSS for a reasonable stack (at least 2K) plus any malloc() calls
you expect to make.  Let's say you're writing "make" and you want to
leave about 32K for malloc() (for your dependency structures).  Also,
since make is recursive, you should leave lots of space for the stack -
maybe another 16K.  The new top of memory that your program needs is:

    newtop = your bss base address + your bss size + 16K stack + 32K heap

Since your stack pointer is at the top of your CURRENT TPA, and you're about
to shrink that, you'd better move your stack:

    move.l	newtop,sp

Now you want to compute your new TPA size and call Mshrink:

    move.l	newtop,d0
    sub.l	basepage,d0	; newtop-basepage is desired TPA size
    move.l	d0,-(sp)	; set up Mshrink(basepage,d0)
    move.l	basepage,-(sp)
    clr.w	-(sp)		; reserved word GEMDOS expects, must be zero
    move.w	#$4a,-(sp)	; fn code for Mshrink
    trap	#1
    add.l	#12,sp		; clean up args (4+4+2+2 bytes)

Now that you've shrunk your TPA, the OS can allocate this new memory to
your child.  It can also use this memory for Malloc(), which is used
occasionally by GEM VDI for blt buffers, etc.  Note that you only
have to do this once, when you start up: after that, you can do as much
Pexec'ing as you want.
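
The arithmetic behind newtop and the Mshrink size can be sketched in C as
below.  This is host-side code that only does the sums; the function
names are hypothetical, the 16K and 32K figures are the "make" example
from the text, and rounding newtop up to an even address is an addition
of mine (the 68000 needs an even stack pointer).

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES (16u * 1024u)  /* stack allowance from the "make" example */
#define HEAP_BYTES  (32u * 1024u)  /* malloc() allowance from the same example */

/* newtop = bss base + bss size + stack + heap, rounded up to an even
 * address (the rounding is an assumption added for this sketch). */
uint32_t compute_newtop(uint32_t bss_base, uint32_t bss_len)
{
    uint32_t top = bss_base + bss_len + STACK_BYTES + HEAP_BYTES;
    return (top + 1u) & ~1u;
}

/* Size to hand to Mshrink: newtop - basepage.  Returns 0 when newtop
 * lies above hitpa, the current top of the TPA, meaning the program
 * was not given enough memory to begin with. */
uint32_t compute_tpa_size(uint32_t basepage, uint32_t hitpa, uint32_t newtop)
{
    assert(basepage < newtop);

    if (newtop > hitpa)
        return 0;
    return newtop - basepage;
}
```

With a basepage at $10000, a BSS at $18000 of length $2000, and plenty of
memory above, newtop comes out at $26000 and the size passed to Mshrink
is $16000.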

When you want to exec a child, you build its complete filespec into one
string, and its arguments into another.  The argument string is a little
strange: the first character of the argument string is the length of the
rest of the string!

Here is a simple system call: pass it the name of the file to execute
and the argument string to use.

	long system(cmd,args)
	char *cmd, *args;
	{
	    char buf[128];

	    if (strlen(args) > 126) {
		printf("argument string too long\n");
		return -1;
	    }
	    strcpy(buf+1,args);			/* copy args to buffer+1 */
	    buf[0] = strlen(args);		/* set buffer[0] to len */
	    return Pexec(0,cmd,buf,0L);
	}

The first zero in the Pexec call is the Pexec function code: load and
go.  The cmd argument is the full filespec, with the path, file name,
and file type.  The third argument is the command-line argument string,
and the fourth argument is the environment pointer.  A null environment
pointer means "let the child inherit A COPY OF my environment."

This call will load the program, pass the arguments and environment to
it, and execute it.  When the program terminates, the call returns the
exit code from the program.  If the Pexec fails (not enough memory, file
not found, etc.) a negative code is returned, and you should deal with
it accordingly.  Note that error returns from Pexec are always negative
LONGS, while return codes from the child will have zeros in the upper 16 bits.
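
One way to act on that distinction is sketched below.  The type and
function names are hypothetical, and the code relies only on the rule
just stated: negative LONGs are Pexec errors, anything else is a 16-bit
exit code from the child.

```c
#include <assert.h>
#include <stdint.h>

enum pexec_outcome { PEXEC_FAILED, CHILD_EXITED };

struct pexec_result {
    enum pexec_outcome outcome;
    int32_t  gemdos_error;  /* negative error code when PEXEC_FAILED */
    uint16_t exit_code;     /* child's exit code when CHILD_EXITED   */
};

/* Split the LONG returned by a load-and-go Pexec into "Pexec itself
 * failed" and "the child ran and returned this exit code". */
struct pexec_result classify_pexec(int32_t rv)
{
    struct pexec_result r = { CHILD_EXITED, 0, 0 };

    if (rv < 0) {
        r.outcome = PEXEC_FAILED;
        r.gemdos_error = rv;
    } else {
        /* Per the text, the upper 16 bits are zero for child exit codes. */
        assert((rv & ~(int32_t)0xFFFF) == 0);
        r.exit_code = (uint16_t)rv;
    }
    return r;
}
```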

EXIT CODES:

GEMDOS, like MS-DOS before it, allows programs to return a 16-bit exit
code to their parents when they terminate.  This is done with the
Pterm(errcode) call.  The value in errcode is passed to the parent
as the return value of the Pexec system call.  The C library function
exit(errcode) usually uses this call.

Unfortunately, the people who wrote the startup file for the Alcyon C
compiler didn't use this.  The compiler calls exit() with an error code,
and exit() calls _exit(), but _exit always uses Pterm0(), which returns
zero as the exit code.  I fixed this by rewriting GEMSTART.S, the file
you link with first when using Alcyon.

Even though new programs return the right exit code, the compiler
itself still doesn't.  Well, I have patched the binaries of all the
passes of the compiler so they DO.  It isn't hard, and I will post
instructions at a later date for doing it.  IF YOU DO THIS, PLEASE
DON'T BOTHER OUR CUSTOMER SUPPORT PEOPLE IF IT DOESN'T WORK.  THEY
DON'T KNOW ANYTHING ABOUT IT.

I hope that this little cookbook makes Pexec less mysterious.  I haven't
covered such topics as the critical-error and terminate vectors, even though
they are intimately connected with the idea of exec'ing children.  A more
complete cookbook should be forthcoming.

If there are any errors or gross omissions in the above text, please
let me know BY MAIL so I can correct them coherently.  Landon isn't
here to check my semantics, so I may have missed something.  [Landon
is on vacation in France until early September.]

********************************************************************

C. Kaernbach's question was why his accessory, which basically did
a Pexec from a file selector, didn't always work.  The answer is that
it works when used within a program which has returned enough memory to
the OS for the child.  Why might it bomb?  Because if a program has
returned a *little* memory to the OS (only about 2K), a bug in Pexec 
shows up that breaks the memory manager.  Accessories are strange beasts
anyway, so for the most part combining two strange beasts (Accessories
and Pexec) is bad news.

/----------------------------------------------\
| Opinions expressed above do not necessarily  |  -- Allan Pratt, Atari Corp.
| reflect those of Atari Corp. or anyone else. |     ...lll-lcc!atari!apratt
\----------------------------------------------/	(APRATT on GEnie)

here is additional info

=================== like cut here again like =======================


Attention Mark Williams, Beckemeyer, and Gert Poltiek, and anybody
else interested:

There is a trick that some shells and compiler libraries use that lets
you pass argument strings to programs which are longer than the 127
bytes which fit in the command line area of the basepage.  Their trick
is to put the word ARGV= in the environment, and follow it with a
null-separated list of argument strings.  The list is terminated with
another null. 

This scheme works pretty well, but has two drawbacks, one major and one
minor. 

The minor drawback is that it defies the definition of what is in the
environment: the environment should consist of strings of the form
NAME=value<NUL> terminated by a final <NUL>.  This is minor because
shells using this convention usually put the ARGV information at the end
of the environment anyway. 

The major drawback is that you can't tell if the ARGV string in your
environment is really meant for you.  Imagine you have the Mark Williams
shell (msh), an editor compiled with Alcyon, and another utility like
"echo" compiled with MWC.  Imagine further that the editor has a
"shell-escape" command that lets you execute another program from within
the editor. 

Do this:

	From msh (the MWC shell): start up the editor with the
	command line arguments "this is a test."

	Tell the editor to execute the command "echo hello world."

	The "echo" command will echo "this is a test," not
	"hello world."

What happened is that msh put "this is a test" in the environment for
the editor (as well as in the command tail in the basepage).  The
editor, not knowing any better, didn't put "hello world" in the
environment before executing "echo." When "echo" started, it found
"ARGV=this is a test" in its environment and echoed that. 

What is needed is a way for a program to tell if the "ARGV=" string in
its environment is really intended for it, or is just left over from an
earlier program.  There is a way to do this that doesn't affect old
programs compiled without this fix. 

The new convention could be to place another string in the environment
with your own basepage address, before Pexec'ing your child.  The child
could start up, and check to see if its parent's basepage address (in
its basepage) matches the address in the environment.  If it does match,
the child will know that the ARGV= string is for it.  If it doesn't
match, the child will know it was started from a non-MWC program like
the editor above, and will look in its basepage for the command line. 

Note that if the parent's basepage isn't in the environment at all, but
the ARGV= string is, the child must assume that the ARGV string is
intended for it, just as it does now.  Therefore, old-style programs
could still Pexec new-style children, and vice-versa. 

This would all require a change in the startup code that calls main(),
and the exec() code which Pexec's the child. 

How about it, guys? If we could all agree on the name and format of this
new environment variable, we could get rid of a serious flaw in Mark
Williams' otherwise clever scheme.  Other shells could adopt this, too,
and ultimately everybody would be able to kiss the 127-character
command-line limit goodbye. 

For now, I propose that the environment variable in question be called
PBP, and that its value be the decimal string of digits making up the
parent's basepage.  The reason for this is that almost all libraries
have an atol() function, where not all have an atolx() function. 

A shell using this trick, with a basepage at 366494 (decimal), could
Pexec a child called "test.prg" with these strings in the environment:

...
PBP=366494<NUL>
ARGV=test.prg<NUL>first<NUL>second<NUL>third<NUL><NUL>
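
On the parent's side, laying out those two entries could look roughly
like this.  It is a host-side sketch in modern C: build_argv_env() is a
hypothetical name, PBP is only the name proposed above, and a real shell
would first copy its existing environment strings (the "..." in the
example) in front of these.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "PBP=<decimal>\0ARGV=arg0\0arg1\0...\0\0" into env.
 * Returns the number of bytes used, including the final NUL, or 0 if
 * the buffer of cap bytes is too small. */
size_t build_argv_env(char *env, size_t cap, unsigned long parent_bp,
                      int argc, const char *const argv[])
{
    assert(env != NULL && argv != NULL && argc > 0);

    /* PBP=<parent basepage as decimal digits>, NUL-terminated. */
    int n = snprintf(env, cap, "PBP=%lu", parent_bp);
    if (n < 0 || (size_t)n + 1 > cap)
        return 0;
    size_t used = (size_t)n + 1;  /* keep the NUL that snprintf wrote */

    /* ARGV= is followed directly by argv[0], as in the example. */
    if (used + 5 > cap)
        return 0;
    memcpy(env + used, "ARGV=", 5);
    used += 5;

    for (int i = 0; i < argc; i++) {
        size_t len = strlen(argv[i]) + 1;  /* argument plus its NUL */
        if (used + len > cap)
            return 0;
        memcpy(env + used, argv[i], len);
        used += len;
    }

    /* A second NUL ends both the argument list and the environment. */
    if (used + 1 > cap)
        return 0;
    env[used++] = '\0';
    return used;
}
```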

In the startup code of the child, you would do something like this:

If there's a PBP= in the environment
	If atol(PBP) == my parent's basepage
		get args from environment
	else
		get args from command line
	endif
else
	if there's an ARGV= in the environment
		get args from environment
	else
		get args from command line
	endif
endif
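
The same decision, sketched in C for a host machine.  The names
env_lookup() and choose_arg_source() are hypothetical, and two details
go beyond the pseudocode: the lookup stops scanning at ARGV= so that
argument text is never mistaken for a variable, and a matching PBP with
no ARGV= at all falls back to the command line.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Look up NAME in an environment of NAME=value<NUL> strings ended by
 * an extra <NUL>.  Returns a pointer to the value, or NULL.  Scanning
 * stops at ARGV=, since everything after it is argument text. */
const char *env_lookup(const char *env, const char *name)
{
    assert(env != NULL && name != NULL);

    size_t nlen = strlen(name);
    for (const char *p = env; *p != '\0'; p += strlen(p) + 1) {
        if (strncmp(p, name, nlen) == 0 && p[nlen] == '=')
            return p + nlen + 1;
        if (strncmp(p, "ARGV=", 5) == 0)
            break;
    }
    return NULL;
}

enum arg_source { ARGS_FROM_ENVIRONMENT, ARGS_FROM_COMMAND_LINE };

/* parent_bp is the parent's basepage address as found in the child's
 * own basepage. */
enum arg_source choose_arg_source(const char *env, long parent_bp)
{
    const char *pbp = env_lookup(env, "PBP");
    const char *args = env_lookup(env, "ARGV");

    if (pbp != NULL) {
        /* New-style parent: trust ARGV= only if PBP names our parent. */
        if (args != NULL && atol(pbp) == parent_bp)
            return ARGS_FROM_ENVIRONMENT;
        return ARGS_FROM_COMMAND_LINE;
    }
    /* No PBP at all: behave as old-style programs do now. */
    return args != NULL ? ARGS_FROM_ENVIRONMENT : ARGS_FROM_COMMAND_LINE;
}
```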


Does this sound reasonable? I would like to see this kind of thing
become a standard, but until a safeguard like this is in place, I can't
condone using ARGV= in the environment for finding your arguments.  It's
too chancy just to assume that you were started by a program savvy to
this scheme. 

/----------------------------------------------\
| Opinions expressed above do not necessarily  |  -- Allan Pratt, Atari Corp.
| reflect those of Atari Corp. or anyone else. |     ...lll-lcc!atari!apratt
\----------------------------------------------/


-- 
for (;;)                              : Use ATARINET, send an interactive
        do_it(c_programmers);         : message such as:
                                      : Tell UH-INFO at UHUPVM1 ATARINET HELP
University Atari Computer Enthusiasts : University of Houston UACE

david@bdt.UUCP (David Beckemeyer) (02/25/88)

In article <498@uhnix2.UUCP> uace0@uhnix2.UUCP (Michael B. Vederman) writes:
[ removed pexec cookbook ]
[ Allan's ARGV= proposed solution of passing the parent basepage in env. PBP ]
>In the startup code of the child, you would do something like this:
>
>If there's a PBP= in the environment
>	If atol(PBP) == my parent's basepage
>		get args from environment
>	else
>		get args from command line
>	endif
>else
>	if there's an ARGV= in the environment
>		get args from environment
>	else
>		get args from command line
>	endif
>endif
>
>
>Does this sound reasonable? I would like to see this kind of thing
>become a standard, ...

If this is what everyone agrees to, I will do it.   I never told MWC how
to do ARGV=; they just did it, and so I went ahead and changed my programs
to support it.   I don't know if it ever reached you personally, but I
and several other company reps. approached Atari for a "standard" way
of doing this way before MWC came along, and collectively we were told
(I quote from memory) "Atari won't make anything a standard; do it any
way you want, but we won't necessarily support it, and it may not be
compatible with future Atari software and systems."

At that I did it one way, and then MWC came along with another way, and
the rest is history.

Now at least MWC and Micro/MT C-Shell are all compatible.  If the above
approach is the new "official" standard, I will implement it; but not
before I know it will be supported.

Would it also work to compare argv[1] from the ARGV= and from the command
tail, and if they differ, use the args from the command tail?

(This is not a flame on Allan.  It's directed at Atari upper management BS).
-- 
David Beckemeyer			| "To understand ranch lingo all yuh
Beckemeyer Development Tools		| have to do is to know in advance what
478 Santa Clara Ave, Oakland, CA 94610	| the other feller means an' then pay
UUCP: ...!ihnp4!hoptoad!bdt!david 	| no attention to what he says"

juancho@dgp.toronto.edu (John Buchanan) (02/27/88)

In article <150@bdt.UUCP> david@bdt.UUCP (David Beckemeyer) writes:
>In article <498@uhnix2.UUCP> uace0@uhnix2.UUCP (Michael B. Vederman) writes:
>Would it also work to compare argv[1] from the ARGV= and from the command
	What can be done about accessing argv[0].  It would be nice to 
know what the name of the program running is.  I have not seen any discussion
about this.

-- 

John W. Buchanan                  Dynamic Graphics Project
               			  Computer Systems Research Institute
				  University of Toronto
(416) 978-6619			  Toronto, Ontario M5S 1A4

jpdres13@usl-pc.UUCP (John Joubert) (02/29/88)

---------

I too would like a copy of Allan Pratt's Pexec Cookbook.

----------------------------------------------------------------------------
John Joubert                         |     /\  |    /\    |     _ 
jpdres13@usl-pc.USL   or ...         |     \|<>|>|> \|<>|>|><`|`|
ut-sally!usl!usl-pc!jpdres13         |-----/|-------/|----------------------
GEnie: J.JOUBERT                     |     \/       \/
-----------------------------------------------------------------------------

apratt@atari.UUCP (Allan Pratt) (03/02/88)

From article <1988Feb27.141030.26647@jarvis.csri.toronto.edu>, 
by juancho@dgp.toronto.edu (John Buchanan):
> 	What can be done about accessing argv[0].  It would be nice to 
> know what the name of the program running is.  I have not seen any discussion
> about this.

Sorry, pal. The total discussion is, "It can't be done."

At least, not with the standard Pexec call.  If you do something like
Beckemeyer and MWC do with the environment, you can pass whatever
information you like to the child process. 

============================================
Opinions expressed above do not necessarily	-- Allan Pratt, Atari Corp.
reflect those of Atari Corp. or anyone else.	  ...ames!atari!apratt

david@bdt.UUCP (David Beckemeyer) (03/03/88)

In article <1988Feb27.141030.26647@jarvis.csri.toronto.edu> juancho@dgp.toronto.edu (John Buchanan) writes:
>	What can be done about accessing argv[0].  It would be nice to 
>know what the name of the program running is.  I have not seen any discussion
>about this.
The ARGV= environment string (MWC format) supplies argv[0] through argv[argc-1].
The first string after the ARGV= is argv[0], argv[1] is the second string,
and so on.  Each string is separated by one NUL (zero byte).  The list of
arguments is terminated with two NUL bytes (which marks the end of the env.)
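
Reading that list back into an argv[] array might look like the sketch
below.  This is not the MWC startup code or the Alcyon module mentioned
further down; split_argv_env() is a hypothetical name, and the caller is
assumed to have already located the text following "ARGV=".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* args points at the first byte after "ARGV=".  Each argument ends
 * with one NUL and the whole list ends with an extra NUL.  Fills
 * argv[] with pointers into args, adds a NULL terminator, and returns
 * argc.  At most max_args - 1 arguments are stored. */
int split_argv_env(const char *args, const char *argv[], int max_args)
{
    assert(args != NULL && argv != NULL && max_args > 0);

    int argc = 0;
    const char *p = args;
    while (*p != '\0' && argc < max_args - 1) {
        argv[argc++] = p;      /* argv[0] is the program name */
        p += strlen(p) + 1;    /* step over the argument and its NUL */
    }
    argv[argc] = NULL;
    return argc;
}
```

One consequence of the double-NUL terminator is that an empty argument
cannot be represented in this format: it would look like the end of the
list.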

The MWC msh shell and the Beckemeyer Development Micro C-Shell and MT C-Shell
command shells all support this "extended" argument passing convention.

The MWC startup code reads this env. and loads the args into the argv[]
array before calling main().  I don't think any of the other C compilers
do it yet.  So programs written in MWC, when used with msh or csh will
have a valid argv[0].    I have a module that will read the env. for Alcyon
C. If anybody wants it send me mail.  If I get a lot of requests, maybe
I'll post it.  But without that, if you write your programs in another
C, or if you don't use msh or csh, then argv[0] will not be valid in
your programs.