[comp.sys.atari.st] Extended argument passing conventions in Pexec

dag@per2.UUCP (Daniel A. Glasser) (04/11/89)

[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
[ The following is posted for a friend at Mark Williams Company.  Please ]
[ direct replies to mwc.uucp!rec (mwc connects to cbmvax regularly, and  ]
[ is somewhere in the network maps for Illinois.)                        ]
[                                             Daniel A. Glasser          ]
[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]

There has been a lot of noise about argument passing conventions
over the past weeks.  Atari is proposing an ARGS convention.  A
group of users is counter-proposing an xArgs convention.  These
two groups only agree that the MWC ARGV convention must go.

My name is Roger Critchlow.  I work for Mark Williams.  I invented
the MWC ARGV argument passing convention in October 1985.  I don't
think either of the proposed replacements is an improvement.
-----------------------------------------------------------------------
Our convention is incomplete because you can't tell if the ARGV was
actually passed by your parent or simply passed on by some intervening
shell?

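  #include <basepage.h>
  #include <string.h>
  /* Return 1 if our environment matches our parent's, string for string. */
  int stupid_parent_p()
  {
    char *mine = BP->p_env;
    char *its = BP->p_parent->p_env;
    while (*mine != 0)
      if (strcmp(mine, its) == 0)
        /* step past each string and its terminating NUL */
        mine += strlen(mine) + 1, its += strlen(its) + 1;
      else
        return 0;	/* the environment contains a difference */
    return 1;		/* the environment is identical to the end */
  }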

This example is only the simplest of many procedural validations
of the ARGV convention.  You can chase the parent pointers all the
way to the desktop and determine who exactly is responsible for
each part of the environment in each of the processes active.

I don't think that every program which uses ARGV needs to be burdened
with this sort of validation, but it's clearly possible.
-----------------------------------------------------------------------
Our convention is messy because it involves copying arguments around
in the environment?

But all three proposed argument conventions require copying, and all
three pass linkage information in the environment, and the ARGV convention
is the most efficient of the three.

The ARGS convention requires:
  1) The copying of the environment by Pexec() which includes arguments,
  2) A scan of the environment to find the arguments,
  3) A size scan of the arguments to find out how much buffer is required,
  4) then a rescan and copy into a local buffer translating FF into 00,
  presumably building the argv vector at the same time.

The xArgs convention requires:
  1) The copying of the environment by Pexec() which excludes arguments,
  2) A scan of the environment to find the xArgs pointer,
  3) A scan of the xArgs arguments to count up the size of the arguments,
  4) then a rescan and copy into a local buffer.

The ARGV convention requires:
  1) The copying of the environment by Pexec() which includes arguments,
  2) A scan of the environment to find the arguments building the
  envp[] and argv[] vectors in parallel.
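
A minimal sketch of the single scan in step 2, assuming the environment
is a run of NUL-terminated strings closed by an empty string, and that
every string after the `ARGV=` entry is an argument.  The names
`split_env` and `MAXVEC` and the fixed-size vectors are made up for
illustration; this is not the MWC startup code.

```c
#include <string.h>

#define MAXVEC 64   /* hypothetical vector size, chosen for the sketch */

/*
 * Walk the environment block once.  Strings before "ARGV=" go to envp[],
 * strings after it go to argv[].  Returns argc.  No bytes are copied:
 * both vectors point into the block that Pexec() already copied.
 * Both arrays must have room for MAXVEC pointers.
 */
int split_env(char *env, char *envp[], char *argv[])
{
    int envc = 0, argc = 0, in_args = 0;
    char *s;

    for (s = env; *s != '\0'; s += strlen(s) + 1) {
        if (!in_args && strncmp(s, "ARGV=", 5) == 0)
            in_args = 1;                 /* everything after this is an argument */
        else if (in_args && argc < MAXVEC - 1)
            argv[argc++] = s;
        else if (!in_args && envc < MAXVEC - 1)
            envp[envc++] = s;
    }
    envp[envc] = 0;                      /* terminate both vectors */
    argv[argc] = 0;
    return argc;
}
```

The only per-process work is the pointer bookkeeping; the string data
stays where the loader put it.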

The ARGS and xArgs proponents are proposing to load more code into the
runtime startup to support their proposals.  This extra code is to count
up the size of some strings, allocate a buffer, and copy the strings
into it.  What does Pexec() do with the environment pointer?  Well it
counts up the size of the strings, allocates a buffer, and copies the
strings into it.  How many independent count-allocate-copies does it
take to start a Gemdos process?

ARGV rides for free on top of Pexec(), ARGS and xArgs saddle the user
process with additional work.
-----------------------------------------------------------------------
Programs which ignore the ARGV convention for arguments end up
mis-interpreting ARGV specified arguments as environmental variables
and passing screwed up environments on to their children.

This problem is addressed differently by the ARGS and xArgs proposals.

The ARGS convention will allow non-complying programs to pass their
arguments onto their children, but the arguments will no longer be
misinterpreted as environmental variables.  The environment is still
polluted with arguments, but the pollution isn't as cruddy as the
MWC kind. :-)

The xArgs convention passes a pointer, encoded in ASCII, to the parent's
copy of the xArgs block.  { :-)  I've always felt that if consenting
processes on the ST wanted to pass loaded guns among themselves that
was their business, but that the default interprocess communication
path shouldn't force processes to be so trusting. }  So the environment
is still polluted with argument information, but it is even less cruddy
than the ARGS crud.  :-)

In article <156@np1.hep.nl> Geert J v Oldenborgh suggested truncating
the environment after extracting the ARGV from it.  I think this is
an excellent suggestion:  it adds one instruction, 2 words, to the
runtime startup module and immediately cures one whole class of bugs,
the class created by MWC compiled programs doing naive Pexec()'s.
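
The truncation can be sketched in C as follows, assuming the same
environment layout (NUL-terminated strings closed by an empty string).
The real change would be a single store in the assembly startup module;
the function name `truncate_env_at_argv` is hypothetical.

```c
#include <string.h>

/*
 * Find the "ARGV=" entry and overwrite its first byte with NUL, so the
 * environment now ends where the arguments began.  Call this only after
 * argv[] has been built; the argument strings themselves are untouched.
 * Returns 1 if an ARGV entry was found, 0 otherwise.
 */
int truncate_env_at_argv(char *env)
{
    char *s;

    for (s = env; *s != '\0'; s += strlen(s) + 1) {
        if (strncmp(s, "ARGV=", 5) == 0) {
            *s = '\0';      /* the one store the startup code needs */
            return 1;
        }
    }
    return 0;
}
```

A child started with this environment pointer sees only the variables
that came before ARGV, so a naive Pexec() no longer hands on stale
arguments.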

I have two solutions for binary programs which screw up the
environment by ignoring the ARGV standard but cannot be rebuilt
with the improved runtime startup.  Both involve the same
transformation suggested above.

Solution 1 - call the program indirectly through a second program
which truncates the environment before passing it on.  This would
involve renaming the offending program:

  Frename(0,"numbnuts.prg","numbnuts.ugh");

and then compiling the following interface layer, which assumes
that its own runtime startup is stepping on ARGV as discussed above,
in its place:

  #include <basepage.h>
  #include <osbind.h>
  main()
  {
    Pterm((int)Pexec(0,"numbnuts.ugh",BP->p_cmdlin,BP->p_env));
  }

Solution 2 - rewrite the binary of the offending program to
incorporate the environment truncation into the offending
program's runtime startup.  As long as the offending program
does not address the process basepage with program-counter
relative addressing, this is a simple matter of shifting
the existing program text and data up by enough bytes to
make room for the environment truncation code and rewriting
the program relocation to reflect the new position.  I
have a program to do this.

How do you identify programs which need this treatment?  Well,
I assume that they usually reveal themselves when they attempt
to call a command compiled with the MWC package.  If the usage
message generated by make, ld, cc, or some other MWC compiled
utility seems inappropriate, then try calling a simple echo
program like this:

  #include <stdio.h>
  main(argc, argv, envp) int argc; char *argv[], *envp[];
  {
    int i;
    for (i = 0; i < argc; i += 1)
      printf("%s%c", argv[i], (i+1==argc) ? '\n' : ' ');
    for (i = 0; envp[i] != 0; i += 1)
      printf("%s\n", envp[i]);
    return(0);
  }

This will tell you what arguments the suspect shell is actually
passing.  If you can't get the suspect shell to call 'echo.prg'
you can trick it by temporarily renaming 'echo.prg' to the name
of the program which the suspect shell insists on miscalling.

If you're more energetic, you could write a protocol policeman
using the ARGV validation ideas suggested above.
-----------------------------------------------------------------------
In summary, the ARGV convention can be validated by comparing the
ARGV received in the environment to the ARGV received by the parent;
the ARGV convention rides for free on Pexec() while the new proposals
both involve additional overhead;  and it is trivial to fix programs
which pollute the environment by ignoring the ARGV convention.

If Atari wants MWC to discard the ARGV argument convention, I hope
they'll wait until they can find something better to replace it with.  :-)

   -- rec --

[%%% End of included text %%%]
-- 
 _____________________________________________________________________________
    Daniel A. Glasser                           One of those things that goes
    uwvax!per2!dag                              "BUMP!!!(ouch)" in the night. 
 ---Persoft, Inc.---------465 Science Drive-------Madison, WI 53711-----------

kbad@atari.UUCP (Ken Badertscher) (04/14/89)

In article <841@per2.UUCP> mwc.uucp!rec writes:
| My name is Roger Critchlow.  I work for Mark Williams.  I invented
| the MWC ARGV argument passing convention in October 1985.  I don't
| think either of the proposed replacements is an improvement.
| -----------------------------------------------------------------------
| Our convention is incomplete because you can't tell if the ARGV was
| actually passed by your parent or simply passed on by some intervening
| shell?
| 
[ environment comparison code omitted ]
| 
| This example is only the simplest of many procedural validations
| of the ARGV convention.  You can chase the parent pointers all the
| way to the desktop and determine who exactly is responsible for
| each part of the environment in each of the processes active.

It is simple, yes, and it simply won't work.  Using ARGV, a parent
is required to copy its environment before it execs a child.  What if
someone decides they'd like to sort the environment variables for the
child as they create the child's environment?  What if they want to
filter it?  The environment comparison fails, and validation goes out
the window.  The MWC tools *already* provide a flag ("unreasonable"
command line length byte) which can be used to validate the ARGV, and it
is unused by the MWC startup code!

| I don't think that every program which uses ARGV needs to be burdened
| with this sort of validation, but it's clearly possible.

One of the *main* reasons that xArgs came into existence in the first
place was that ARGV lacks validation that the arguments in the ARGV
pseudo-environment-variable really do come from one's parent.  You may
not find the validation necessary, Roger, but there are many people
who use tools which were not compiled using MWC libraries.  A typical ST
developer's toolkit contains sundry tools with a varied ancestry.  The
plain fact is that ARGV does not peacefully coexist with tools that
don't use ARGV.

| -----------------------------------------------------------------------
| Our convention is messy because it involves copying arguments around
| in the environment?
| 
| But all three proposed argument conventions require copying, and all
| three pass linkage information in the environment, and the ARGV convention
| is the most efficient of the three.
| 
[ enumeration of implementation requirements omitted ]
| 
| The ARGS and xArgs proponents are proposing to load more code into the
| runtime startup to support their proposals.  [...]
| How many independent count-allocate-copies does it
| take to start a Gemdos process?
 
Who cares???  How long does it take an 8MHz 68000 to copy 1k, or even
(gasp) 2k of memory, even with extra processing to parse it?  Somewhere
on the order of tens of milliseconds per program invocation, if that much.
God forbid that we should fritter away people's valuable milliseconds in
our wanton disregard of startup module code efficiency.
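
As a rough check of that estimate, here is a sketch assuming a plain
MOVE.L (A0)+,(A1)+ / DBRA copy loop at 20 and 10 clock cycles per
iteration (the published 68000 timings for those two instructions),
ignoring bus contention and any parsing work.  The function name
`copy_time_us` is made up for illustration.

```c
/*
 * Estimated microseconds to copy `bytes` bytes on a 68000 running at
 * `mhz` MHz, moving one longword per loop iteration.
 * ASSUMPTION: the loop is MOVE.L (A0)+,(A1)+ followed by DBRA.
 */
unsigned long copy_time_us(unsigned long bytes, unsigned long mhz)
{
    unsigned long cycles_per_long = 20 + 10;    /* MOVE.L + DBRA taken */
    unsigned long longs = (bytes + 3) / 4;      /* round up to whole longwords */

    return longs * cycles_per_long / mhz;       /* cycles / MHz = microseconds */
}
```

Under these assumptions copy_time_us(2048, 8) comes to 1920, or roughly
2 milliseconds for 2k, which sits comfortably inside the figure quoted
above.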

| -----------------------------------------------------------------------
| Programs which ignore the ARGV convention for arguments end up
| mis-interpreting ARGV specified arguments as environmental variables
| and passing screwed up environments on to their children.
| 
| This problem is addressed differently by the ARGS and xArgs proposals.
| 
[ means of addressing the problem omitted ]
| 
| In article <156@np1.hep.nl> Geert J v Oldenborgh suggested truncating
| the environment after extracting the ARGV from it.  I think this is
| an excellent suggestion:  it adds one instruction, 2 words, to the
| runtime startup module and immediately cures one whole class of bugs,
| the class created by MWC compiled programs doing naive Pexec()'s.

Hah, I caught you!  You admit that ARGV is imperfect! ;-)  You are even
willing to change the startup code to make it more perfect!

| I have two solutions for binary programs which screw up the
| environment by ignoring the ARGV standard but cannot be rebuilt
| with the improved runtime startup.  Both involve the same
| transformation suggested above.
| 
| Solution 1 - call the program indirectly through a second program [...]
| Solution 2 - rewrite the binary of the offending program to
| incorporate the environment truncation into the offending
| program's runtime startup.  [...]

That's very nice, but the other two methods of extended argument passing
don't require ANYBODY to be insulated from them.  In fact, all THREE
methods can peacefully coexist in one system, except that ARGV will
cause problems for programs which haven't been "inoculated" against it.

| -----------------------------------------------------------------------
| In summary, the ARGV convention can be validated by comparing the
| ARGV received in the environment to the ARGV received by the parent;

unless the parent has done anything to the environment...

| the ARGV convention rides for free on Pexec() while the new proposals
| both involve additional overhead;

negligible additional overhead...

| and it is trivial to fix programs
| which pollute the environment by ignoring the ARGV convention.

perhaps we disagree on the meaning of the word trivial; in any case,
the other two schemes don't cause any pollution, and thus don't require
any programs to be fixed...


| If Atari wants MWC to discard the ARGV argument convention, I hope
| they'll wait until they can find something better to replace it with.  :-)
| 
|    -- rec --

If MWC wants Atari to adopt the ARGV argument convention, I hope they'll
come up with some better defenses for it. >;-)

-- 
 Ken Badertscher                 | #include <disclaimer>
 Atari R&D                       | No pith, just a path:
 Software Engine                 |   {portal,ames,imagen}!atari!kbad

david@bdt.UUCP (David Beckemeyer) (04/18/89)

In article <1453@atari.UUCP> kbad@atari.UUCP (Ken Badertscher) writes:

[ about the MWC ARGV method ]

>It is simple, yes, and it simply won't work.  Using ARGV, a parent
>is required to copy its environment before it execs a child.  What if
>someone decides they'd like to sort the environment variables for the
>child as they create the child's environment?  What if they want to
>filter it?  The environment comparison fails, and validation goes out
>the window.  The MWC tools *already* provide a flag ("unreasonable"
>command line length byte) which can be used to validate the ARGV, and it
>is unused by the MWC startup code!
>

Ken, it sure looks like you want to attack MWC here.  If the method is
"standard", "good" programs won't mess with the environment in "bad" ways.

I think something that Allan Pratt (sp?) suggested a long time ago where
the parent places a variable like "PARENT=" or something which contains the
parent's basepage address is a reasonable additional method of validation
of the ARGV environment.  Also the MWC startup could be modified to use
the "unreasonable" command line length byte and Roger's suggestion of 
nulling out the env. at the ARGV is also good.

>| -----------------------------------------------------------------------
>| Programs which ignore the ARGV convention for arguments end up
>| mis-interpreting ARGV specified arguments as environmental variables
>| and passing screwed up environments on to their children.
>| 
>| This problem is addressed differently by the ARGS and xArgs proposals.
>| 
>[ means of addressing the problem omitted ]
>
>Hah, I caught you!  You admit that ARGV is imperfect! ;-)  You are even
>willing to change the startup code to make it more perfect!

I think that's *exactly* what Roger was saying.   What's wrong with that?
Are you suggesting the xArgs and/or your method are perfect?   They all
have problems.   I think any method using the environment for something
other than what it is intended for is never going to be perfect.  Sorry. GEMDOS
should just do it right.  That's the only "perfect" answer.  I think what
Roger is saying is that with some minor improvements the ARGV method is OK.

>
>| [ Roger discusses solutions for programs which mess up the env. ]
>That's very nice, but the other two methods of extended argument passing
>don't require ANYBODY to be insulated from them.  In fact, all THREE
>methods can peacefully coexist in one system, except that ARGV will
>cause problems for programs which haven't been "inoculated" against it.

Say what?  Are you saying that I won't have to upgrade any of my 500 or
so utilities?  None of my 25,000 customers will need upgrades with the new
Atari standard?    Not including the countless non-commercial home-grown
utilities that use MWC ARGV arguments?  You see all approaches are going
to require lots of binary patch programs and upgrades.   Whether it's
for "inoculation" or "fixes" I don't really see the difference and neither
will the users.

I think you're forgetting how many *commercial* programs use the ARGV
"standard".  This will play a big role in the standardization phase.

>
>| If Atari wants MWC to discard the ARGV argument convention, I hope
>| they'll wait until they can find something better to replace it with.  :-)
>| 
>|    -- rec --
>
>If MWC wants Atari to adopt the ARGV argument convention, I hope they'll
>come up with some better defenses for it. >;-)

This is the most important and strongest statement that Roger makes and you
simply discount it.   I don't think that either your method or xArgs are
really better than ARGV in all respects; they're just different.
Each method has advantages and disadvantages.   What I'm saying is

    WHY CHANGE IT JUST TO CHANGE IT?

Wait it out; let's get this thing fixed right instead of just introducing
a bunch of new problems for developers and end-users alike.   Do we have
a solution looking for a problem or what? :-)

>
>-- 
> Ken Badertscher                 | #include <disclaimer>
> Atari R&D                       | No pith, just a path:
> Software Engine                 |   {portal,ames,imagen}!atari!kbad


-- 
David Beckemeyer (david@bdt.UUCP)	| "Adios amigos.  And, as they say when 
Beckemeyer Development Tools		| the boys are scratching the bad ones,
478 Santa Clara Ave. Oakland, CA 94610	| 'Stay a long time, Cowboy!'"
UUCP: {uunet,ucbvax}!unisoft!bdt!david	|                  - Jo Mora

t68@np1.hep.nl (Jos Vermaseren) (04/18/89)

In article <546@bdt.UUCP>, david@bdt.UUCP (David Beckemeyer) writes:
> In article <1453@atari.UUCP> kbad@atari.UUCP (Ken Badertscher) writes:
> 
> [ about the MWC ARGV method ]
> 
> >It is simple, yes, and it simply won't work.  Using ARGV, a parent
> >is required to copy its environment before it execs a child.  What if
> >someone decides they'd like to sort the environment variables for the
> >child as they create the child's environment?  What if they want to
> >filter it?  The environment comparison fails, and validation goes out
> >the window.  The MWC tools *already* provide a flag ("unreasonable"
> >command line length byte) which can be used to validate the ARGV, and it
> >is unused by the MWC startup code!
> >
> 
> Ken, it sure looks like you want to attack MWC here.  If the method is
> "standard", "good" programs won't mess with the environment in "bad" ways.
> 
> I think something that Allan Pratt (sp?) suggested a long time ago where
> the parent places a variable like "PARENT=" or something which contains the
> parent's basepage address is a reasonable additional method of validation
> of the ARGV environment.  Also the MWC startup could be modified to use
> the "unreasonable" command line length byte and Roger's suggestion of 
> nulling out the env. at the ARGV is also good.

I think Ken misses something more: The parent doesn't have to copy
anything. When Pexec starts a program the loader copies the environment
string, so the user may hack it up as much as he wants without
affecting the environment of the parent.

The MWC method (including writing a \0 over the ARGV if you want to
incapacitate an ARGV) has worked well for ages by giving validation
via PBP=parentbasepageaddress in front of ARGV. You may also have the
startup code write the \0 over the PBP which is better.
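
A sketch of what such a check might look like.  The thread does not say
how the address is encoded, so this ASSUMES hexadecimal digits after
`PBP=`; the function name `argv_from_parent` is likewise hypothetical.
On the ST the second argument would be the address held in
BP->p_parent.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return 1 if the environment holds a "PBP=" entry whose value equals
 * `parent`, the address of our parent's basepage.
 * ASSUMPTION: the value is written as hexadecimal digits.
 */
int argv_from_parent(const char *env, unsigned long parent)
{
    const char *s;

    for (s = env; *s != '\0'; s += strlen(s) + 1)
        if (strncmp(s, "PBP=", 4) == 0)
            return strtoul(s + 4, (char **)0, 16) == parent;
    return 0;   /* no PBP entry: the arguments cannot be validated */
}
```

If the check fails, the ARGV that follows was inherited from some
earlier ancestor and the startup code should fall back to the basepage
command line.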

The major shells use something like it (or part of it), MWC uses it,
Aztec uses it (they use Gulam). Only Turbo C didn't use any but the
startup code is fixed quickly. Laser C is the one that opted to either
not understand the convention or go deliberately their own way. So what,
it isn't a good compiler anyway (the code is 30% slower than turbo and
the best debugger is the one of MW). 

First Atari doesn't give us decent documentation, and then when a
standard is emerging they are going to criticize it and talk about
fantastic schemes that 1: aren't better and b: NOBODY is using.
Keep it up guys!

Jos Vermaseren

Claimor: This opinion is mine and my mood ain't too good.

usenet@TSfR.UUCP (usenet) (04/19/89)

In article <546@bdt.UUCP> david@bdt.UUCP (David Beckemeyer) writes:
>In article <1453@atari.UUCP> kbad@atari.UUCP (Ken Badertscher) writes:
>>[ about the MWC ARGV method ]
>>It is simple, yes, and it simply won't work....
>
>Ken, it sure looks like you want to attack MWC here.  If the method is
>"standard", "good" programs won't mess with the environment in "bad" ways.

 Except old programs that don't follow that `standard' will still be stuck
 with large oddities in their environment.

>
>I think something that Allan Pratt (sp?) suggested a long time ago where
>the parent places a variable like "PARENT=" or something which contains the
>parent's basepage address is a reasonable additional method of validation
>of the ARGV environment...

 By cluttering the environment with yet another variable?  Great.  That's
 just what's needed - even more flot.

>
>>That's very nice, but the other two methods of extended argument passing
>>don't require ANYBODY to be insulated from them.  In fact, all THREE
>>methods can peacefully coexist in one system, except that ARGV will
>>cause problems for programs which haven't been "inoculated" against it.
>
>Say what?  Are you saying that I won't have to upgrade any of my 500 or
>so utilities?

 If they know about the commandline in the basepage, why should they need
 upgrades?

> ... None of my 25,000 customers will need upgrades with the new
>Atari standard?...

 It cuts both ways.  Those of us who use the other ways to pass arguments
 around may end up having to modify our code to work properly with the
 new standard, whatever it may be. 

> ... You see all approaches are going
>to require lots of binary patch programs and upgrades. ...

 Why?  I've got programs that don't use xArgs and they coexist just fine
 with xArgs.  I cannot imagine that the MWC `standard' works out any
 differently, except for the magical appearance of garbage in the
 environments of the (non-conformant) child processes.

>I think you're forgetting how many *commercial* programs use the ARGV
>"standard".  This will play a big role in the standardization phase.

 And you may be forgetting how many shareware programs use xArgs.
 Considering that dlibs and Sozobon C both use xArgs, the percentage
 of programs using that `standard' will grow.

 Your comment that real argc/argv should be an extension to Pexec() is
 correct, and I agree that it's what Atari should be doing.  But saying
 that the MWC style should be the second choice because it's `there' is
 like saying "we can't fix Malloc() because people take advantage of
 the broken code."

 At risk of sounding like a broken record, I'll note again that one of
 the reasons that Dalnefre', John Stanley and I made up the xArgs method
 is that we (well, *I* - I dunno about Dale and/or John) didn't think that
 the MWC method was worth its weight in fetid dingo's kidneys.  Cluttering
 up the environment with untraceable arguments and iovector `stuff' is
 about as unaesthetic as you can get.

   -david parsons
   -orc@pell.uucp