[comp.sys.atari.st] xArgs explained

dal@midgard.Midgard.MN.ORG (Dale Schumacher) (04/07/89)

  At the request of Ken Badertscher, I'm posting this explanation of the
current xArgs argument passing method and arguments in favor of adopting
it as the official Atari standard for extended argument passing.

----

  The xArgs standard was developed by Dale Schumacher, David (orc) Parsons
and John Stanley with helpful suggestions from Allan Pratt, David Beckemeyer
and others.  It went through 7 different proposal iterations before all of
the contributors were satisfied with it.  It was implemented in the dLibs
package of standard libraries for C (Alcyon and later Sozobon), published by
Julian Reschke in Germany and used by several shell developers, though
mostly abroad.  I still believe that it is the best solution to the varied
collection of problems associated with extended argument passing.

  When a program uses xArgs to pass its arguments, the parent creates a
structure which contains validation information, the argument count and
a pointer to the argument array in the parent's data space.  The address
of this structure is converted into an 8-digit hexadecimal string using
the characters "0123456789ABCDEF".  The string is placed in the child's
environment (or in the parent's, to be inherited by the child) as the
value of the "xArg" variable.  The first 125 characters of whitespace
separated command line arguments are also placed in the command line
image to be read by ignorant children.  The XARG structure is:

	typedef struct
		{
		char	xarg_magic[4];	/* verification value "xArg" */
		int	xargc;		/* argument count */
		char	**xargv;	/* argument array pointer */
		char	*xiovector;	/* should be NULL */
		char	*xparent;	/* parent validation pointer */
		}
		XARG;
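
  As a rough sketch of the encoding step described above (this is not the
dLibs code; the names xarg_env_entry and xarg_parse_value are made up for
illustration, and the address is assumed to fit in 32 bits and is passed
as an unsigned long):

```c
#include <string.h>

/* Hypothetical helper: build the "xArg=XXXXXXXX" environment entry
 * from a 32-bit address.  out must hold 14 bytes (5 + 8 + '\0'). */
void xarg_env_entry(unsigned long addr, char out[14])
{
    static const char digits[] = "0123456789ABCDEF";
    int i;

    memcpy(out, "xArg=", 5);
    for (i = 0; i < 8; i++) {
        /* most significant nibble first */
        out[5 + i] = digits[(addr >> (28 - 4 * i)) & 0xF];
    }
    out[13] = '\0';
}

/* Hypothetical inverse for the child: parse exactly 8 uppercase hex
 * digits.  Returns 0 on success, -1 if the value is malformed. */
int xarg_parse_value(const char *s, unsigned long *addr)
{
    unsigned long v = 0;
    int i;

    for (i = 0; i < 8; i++) {
        char c = s[i];
        int d;

        if (c >= '0' && c <= '9')
            d = c - '0';
        else if (c >= 'A' && c <= 'F')
            d = c - 'A' + 10;
        else
            return -1;          /* also catches a short string */
        v = (v << 4) | (unsigned long)d;
    }
    if (s[8] != '\0')
        return -1;              /* trailing junk */
    *addr = v;
    return 0;
}
```

  The entry is 13 characters plus the terminating '\0', which is where the
14 bytes of environment overhead come from.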

  Verification that the xArg variable actually does point to the arguments
for a specific program is accomplished by checking the xarg_magic field
for the value "xArg" (0x78417267, note no '\0' here) and making sure that
the xparent field is the same as the parent pointer in the child's own
basepage.  This prevents a child which was passed xArgs from ignorantly
passing them on to a child of its own.
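
  The two checks could look something like this in startup code.  This is
only a sketch: xarg_valid is a made-up name, and my_parent stands for the
parent pointer taken from the child's own basepage, however the startup
code gets hold of it.

```c
#include <string.h>

typedef struct
{
    char  xarg_magic[4];   /* verification value "xArg" */
    int   xargc;           /* argument count */
    char  **xargv;         /* argument array pointer */
    char  *xiovector;      /* should be NULL */
    char  *xparent;        /* parent validation pointer */
}
XARG;

/* Hypothetical check: returns 1 if xp looks like xArgs meant for
 * this process, 0 otherwise. */
int xarg_valid(const XARG *xp, const char *my_parent)
{
    if (xp == NULL)
        return 0;
    /* the magic is 4 raw bytes with no '\0', so use memcmp, not strcmp */
    if (memcmp(xp->xarg_magic, "xArg", 4) != 0)
        return 0;
    /* arguments left over from a grandparent fail this test */
    if (xp->xparent != my_parent)
        return 0;
    return 1;
}
```

  If either test fails, the child simply falls back to the command line
image.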

  The xargc and xargv values are essentially the argc and argv of the
child, EXCEPT that the child MUST make a copy of the argv array and each
of the parameter strings.
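
  A minimal sketch of that copy, assuming a malloc() is available this
early in startup (copy_args is a hypothetical name, not a dLibs function):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: deep-copy argc strings from argv into memory
 * owned by the caller's process.  Returns a NULL-terminated array,
 * or NULL on bad input or allocation failure. */
char **copy_args(int argc, char **argv)
{
    char **copy;
    int i;

    if (argc < 0 || argv == NULL)
        return NULL;

    copy = malloc((size_t)(argc + 1) * sizeof *copy);
    if (copy == NULL)
        return NULL;

    for (i = 0; i < argc; i++) {
        size_t n = strlen(argv[i]) + 1;

        copy[i] = malloc(n);
        if (copy[i] == NULL) {
            /* undo the partial copy */
            while (i-- > 0)
                free(copy[i]);
            free(copy);
            return NULL;
        }
        memcpy(copy[i], argv[i], n);
    }
    copy[argc] = NULL;
    return copy;
}
```

  After this, nothing in the child refers to the parent's data space any
more, so the program is free to modify its arguments.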

  The xiovector field was initially provided as a place for Mark Williams
code to carry along their iovector string.  It was hoped that by including
this field, Mark Williams would be more easily convinced to use xArgs.
If the iovector string was not available, as in dLibs, this field is set
to NULL.  In order to remain compatible with current code, this field
should remain, but always be NULL.

----

  Now for the rationale...  Ken has said that the current method used by
Mark Williams et al. WILL NOT be adopted.  It simply has too many
technical flaws.  The xArgs method is the only other major method (that
I know of) which is already in use, and has been for some time now.
All of the command line arguments, including argv[0], are passed and
may contain any characters except '\0'.  The environment string is not
corrupted, and has only 14 characters added to it for the xArg variable
and value.  The validity of the structure (via the magic #) and the
correct parent (via xparent) are easily tested.  Finally, ignorant
programs will not be broken, since they will receive up to the maximum
sized command line image with correct length values and a '\0' terminated
string.  Without changes, the Mark Williams style programs will assume
that they aren't getting extended arguments, and will only use the
command line image.
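
  For the ignorant-child case, the parent side might build the command
line image roughly as follows.  This is a sketch under assumptions: the
name build_cmdline_image is made up, the image is taken to be a length
byte followed by the text, the cut-off is made at the character level,
and argv[0] is assumed to be left out of the image.

```c
#include <string.h>

#define CMD_MAX 125     /* characters of arguments kept in the image */

/* Hypothetical helper: image must hold at least CMD_MAX + 2 bytes
 * (length byte, text, terminating '\0'). */
void build_cmdline_image(int argc, char **argv, unsigned char *image)
{
    size_t len = 0;
    int i;

    /* assumption: start at argv[1], the program name is not included */
    for (i = 1; i < argc && len < CMD_MAX; i++) {
        const char *s = argv[i];

        if (i > 1)
            image[1 + len++] = ' ';     /* whitespace separator */
        while (*s != '\0' && len < CMD_MAX)
            image[1 + len++] = (unsigned char)*s++;
    }
    image[0] = (unsigned char)len;      /* correct length value */
    image[1 + len] = '\0';              /* '\0' terminated string */
}
```

  A child that knows nothing about xArgs sees an ordinary, possibly
truncated, command line.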

  The largest potential drawback of using xArgs is the requirement that
the child read from the parent's data space.  This is considered "a bad
thing" in general, and I agree that it should be avoided.  However, the
only code which should have to deal with this is the startup code in
each program, which should be supplied by the compiler vendor.  Once
correctly implemented (and I can show correct examples :-), it will
always work, so there is no potential for "operator error".  In a multi-
tasking system the issue arises of a parent which terminates or
otherwise makes the xArgs data invalid before the child has copied it.
Under current systems, this may be theoretically possible, and is an
issue I discussed with David Beckemeyer the first time around.  Since
the child gets to run for a "slice" before the parent continues from
the fork(), we expect that the arguments can be grabbed quickly enough
to avoid this problem.  If Atari decides to implement a multitasker
as a gemdos upgrade, they can provide REAL argc/argv to the child,
since such an upgrade obviously implies heavy modifications to the
exec mechanism, and if xArgs was the standard, the multitasking exec
could force the xArg variable to point to the CHILD'S OWN argv.  This
would cause "unneeded" copying of the data, but would maintain simple
backward compatibility.  Therefore, even if multitasking becomes an
issue, xArgs copying from parent's space can be handled without error.

  By contrast, the most recent proposal by Ken is basically this.
The arguments are placed in the child's environment as the value of
the ARGS variable (xArg was mixed case to help avoid environment
variable name conflicts, btw) with each argument (starting with argv[0])
separated by '\377' (0xFF) characters, and the command line image length
value is forced to be 126 (the image is '\0' terminated instead).

  The problems I see with this are as follows.  The environment is
cluttered with a potentially large string.  The string must be parsed,
but not in place, to get the arguments out.  The '\377' character may
not be used in an argument.  Forcing the command line length value to
be 126 breaks the documented definition of the command line image, and
may confuse some ignorant programs.  If an ignorant program creates an
ARGS variable or passes one on from its parent, and it uses 126 bytes
of command line space, the smart child will assume that extended
arguments are present.  This basically says that the validation procedure
is not robust.
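
  To make the parsing cost concrete, here is roughly what a child would
have to do with an ARGS value.  This is a sketch based only on the summary
above, not on Ken's actual proposal text; split_args is a made-up name.
The value is copied first because cutting it up in place would put '\0'
characters into the middle of the environment.

```c
#include <stdlib.h>
#include <string.h>

#define ARGS_SEP '\377'     /* separator between arguments (0xFF) */

/* Hypothetical helper: split a copy of value at each separator.
 * Returns a NULL-terminated array and stores the count in *argc_out,
 * or returns NULL on allocation failure. */
char **split_args(const char *value, int *argc_out)
{
    size_t len = strlen(value);
    size_t k;
    int count = 1;
    int i = 1;
    char *buf;
    char **argv;

    for (k = 0; k < len; k++)
        if (value[k] == ARGS_SEP)
            count++;

    buf = malloc(len + 1);
    argv = malloc((size_t)(count + 1) * sizeof *argv);
    if (buf == NULL || argv == NULL) {
        free(buf);
        free(argv);
        return NULL;
    }
    memcpy(buf, value, len + 1);    /* work on the copy only */

    argv[0] = buf;
    for (k = 0; k < len; k++) {
        if (buf[k] == ARGS_SEP) {
            buf[k] = '\0';
            argv[i++] = buf + k + 1;
        }
    }
    argv[count] = NULL;
    *argc_out = count;
    return argv;
}
```

  Note that an argument containing '\377' would be split in two here,
which is the character restriction mentioned above.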

According to Ken, this issue is by no means decided, and xArgs is still
a viable option, so let's hear more discussion.

hakanson@mist.CS.ORST.EDU (Marion Hakanson) (04/08/89)

Thank you for a good description.  This approach (xArgs) is my current
favorite.  All the advantages listed in the article are true, and there
is one more:  It is similar to and compatible with what you run into
on a Unix system.  This is really nice for those of us who port C programs
from Unix to the ST -- these programs can now fiddle around with the
environment and with the arg list (after it's been copied to our address
space by the startup code).

I agree that any approach which stuffs the args into a single
environment variable is going to have problems -- the environment is
either going to be broken (with unexpected nulls in the middle of it),
or you are going to have to parse the thing, including some way of
escaping the separator characters (a waste, since the shell or some
other program already parsed the args anyway).  Since many Unix
programs munge their args in place, it's nice to have them in their
own address space (and not the parent's), too.

So, xArgs is wonderful.  If the Mark Williams Co. is listening, this
is one of your customers requesting that you adopt xArgs over your
current approach (I hate it when child programs get their parents'
args!).  If you don't, the next time I'm doing serious C work on the
ST, I'll just chuck all of your utility programs and replace them with
PD tools like the GNU stuff.

Don't get me wrong, the MWC compiler is my favorite.  But nowadays the
GNU compiler is passing it by, and the price is certainly right.

-- 
Marion Hakanson         Domain: hakanson@cs.orst.edu
                        UUCP  : {hp-pcd,tektronix}!orstcs!hakanson