apratt@atari.UUCP (Allan Pratt) (05/29/87)
Attention Mark Williams, Beckmeyer, and Gert Poltiek, and anybody else interested:

There is a trick that some shells and compiler libraries use that lets you pass argument strings to programs which are longer than the 127 bytes which fit in the command line area of the basepage. Their trick is to put the word ARGV= in the environment, and follow it with a null-separated list of argument strings. The list is terminated with another null.

This scheme works pretty well, but has two drawbacks, one major and one minor. The minor drawback is that it defies the definition of what is in the environment: the environment should consist of strings of the form NAME=value<NUL> terminated by a final <NUL>. This is minor because shells using this convention usually put the ARGV information at the end of the environment anyway.

The major drawback is that you can't tell if the ARGV string in your environment is really meant for you. Imagine you have the Mark Williams shell (msh), an editor compiled with Alcyon, and another utility like "echo" compiled with MWC. Imagine further that the editor has a "shell-escape" command that lets you execute another program from within the editor. Do this:

    From msh (the MWC shell): start up the editor with the command line arguments "this is a test."

    Tell the editor to execute the command "echo hello world."

The "echo" command will echo "this is a test," not "hello world."

What happened is that msh put "this is a test" in the environment for the editor (as well as in the command tail in the basepage). The editor, not knowing any better, didn't put "hello world" in the environment before executing "echo." When "echo" started, it found "ARGV=this is a test" in its environment and echoed that.

What is needed is a way for a program to tell if the "ARGV=" string in its environment is really intended for it, or is just left over from an earlier program. There is a way to do this that doesn't affect old programs compiled without this fix.
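For readers who have not met the ARGV= layout described above, here is a rough C sketch of how a startup routine might walk it. It assumes the argument strings begin immediately after "ARGV=" and that an empty string ends the list; the function name argv_from_env and its parameters are made up for illustration and are not taken from any particular library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Collect pointers to the argument strings that follow "ARGV=" in an
 * environment block of the form NAME=value<NUL>...<NUL><NUL>.
 * Returns the number of arguments stored (at most max_args),
 * or -1 if the block contains no ARGV= entry.
 * (Hypothetical helper, for illustration only.)
 */
int argv_from_env(const char *env, const char **args, int max_args)
{
    int argc = 0;

    assert(env != NULL && args != NULL);

    /* Skip ordinary NAME=value strings until ARGV= or the final NUL. */
    while (*env != '\0' && strncmp(env, "ARGV=", 5) != 0)
        env += strlen(env) + 1;
    if (*env == '\0')
        return -1;

    /* Assumption: the first argument starts right after "ARGV=". */
    env += 5;
    while (*env != '\0' && argc < max_args) {
        args[argc++] = env;
        env += strlen(env) + 1;   /* step over this string and its NUL */
    }
    return argc;
}
```

Note that nothing in this walk can tell whether the list was written for the current program, which is exactly the problem raised above.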
The new convention could be to place another string in the environment with your own basepage address, before Pexec'ing your child. The child could start up, and check to see if its parent's basepage address (in its basepage) matches the address in the environment. If it does match, the child will know that the ARGV= string is for it. If it doesn't match, the child will know it was started from a non-MWC program like the editor above, and will look in its basepage for the command line.

Note that if the parent's basepage isn't in the environment at all, but the ARGV= string is, the child must assume that the ARGV string is intended for it, just as it does now. Therefore, old-style programs could still Pexec new-style children, and vice-versa.

This would all require a change in the startup code that calls main(), and the exec() code which Pexec's the child.

How about it, guys? If we could all agree on the name and format of this new environment variable, we could get rid of a serious flaw in Mark Williams' otherwise clever scheme. Other shells could adopt this, too, and ultimately everybody would be able to kiss the 127-character command-line limit goodbye.

For now, I propose that the environment variable in question be called PBP, and that its value be the decimal string of digits making up the parent's basepage. The reason for this is that almost all libraries have an atol() function, where not all have an atolx() function.

A shell using this trick, with a basepage at 366494 (decimal), could Pexec a child called "test.prg" with these strings in the environment:

    ...
    PBP=366494<NUL>
    ARGV=test.prg<NUL>first<NUL>second<NUL>third<NUL><NUL>

In the startup code of the child, you would do something like this:

    If there's a PBP= in the environment
        If atol(PBP) == my parent's basepage
            get args from environment
        else
            get args from command line
        endif
    else
        if there's an ARGV= in the environment
            get args from environment
        else
            get args from command line
        endif
    endif

Does this sound reasonable? I would like to see this kind of thing become a standard, but until a safeguard like this is in place, I can't condone using ARGV= in the environment for finding your arguments. It's too chancy just to assume that you were started by a program savvy to this scheme.

/----------------------------------------------\
| Opinions expressed above do not necessarily  |  -- Allan Pratt, Atari Corp.
| reflect those of Atari Corp. or anyone else. |     ...lll-lcc!atari!apratt
\----------------------------------------------/
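The startup check in the pseudocode above might look roughly like this in C. This is only a sketch under a few assumptions: the caller has already read the parent's basepage address out of its own basepage and passes it in as a long, PBP= appears before ARGV= in the environment, and the names arg_source, env_lookup and choose_arg_source are invented for illustration. It also adds one guard the pseudocode leaves implicit: a matching PBP with no ARGV= at all falls back to the command line.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Where the startup code should take its arguments from. */
enum arg_source { ARGS_FROM_ENVIRONMENT, ARGS_FROM_COMMAND_LINE };

/*
 * Look up NAME in an environment block laid out as
 * NAME=value<NUL>...<NUL><NUL>. Returns a pointer to the value, or NULL.
 * The scan stops at ARGV=, because what follows it is the argument
 * list rather than further variables. (Hypothetical helper.)
 */
static const char *env_lookup(const char *env, const char *name)
{
    size_t n = strlen(name);

    assert(env != NULL && name != NULL);
    while (*env != '\0') {
        if (strncmp(env, name, n) == 0 && env[n] == '=')
            return env + n + 1;
        if (strncmp(env, "ARGV=", 5) == 0)
            break;
        env += strlen(env) + 1;
    }
    return NULL;
}

/* Apply the proposed PBP check; parent_basepage comes from our basepage. */
enum arg_source choose_arg_source(const char *env, long parent_basepage)
{
    const char *pbp = env_lookup(env, "PBP");
    const char *argv = env_lookup(env, "ARGV");

    if (pbp != NULL) {
        /* New-style parent somewhere up the chain: trust ARGV= only
           if PBP names our own parent (and ARGV= actually exists). */
        if (argv != NULL && atol(pbp) == parent_basepage)
            return ARGS_FROM_ENVIRONMENT;
        return ARGS_FROM_COMMAND_LINE;
    }
    /* No PBP at all: behave as old-style programs do today. */
    return argv != NULL ? ARGS_FROM_ENVIRONMENT : ARGS_FROM_COMMAND_LINE;
}
```

With the example environment above, a child whose parent's basepage is 366494 would take its arguments from the environment, while a child started by some other program that merely inherited the same environment would fall back to the command tail in its basepage.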
manis@ubc-cs.UUCP (05/29/87)
In article <741@atari.UUCP> apratt@atari.UUCP (Allan Pratt) writes:
>For now, I propose that the environment variable in question be called
>PBP, and that its value be the decimal string of digits making up the
>parent's basepage. The reason for this is that almost all libraries
>have an atol() function, where not all have an atolx() function.

This is a really good suggestion, but it can be improved by noting that the major goal is to make the whole argument transmission scheme a bit more reliable. (Why DRI didn't do it right the first time is a matter I don't consider worthy of discussion.)

However, under the above proposal, the user who accidentally sets an environment variable "PBP" (meaning, perhaps, "Print Blank Pages", as far as his/her program is concerned) to 5 could get anomalous behaviour (of the sort discussed in Allan's article). It is worth noting that the base page pointer is only going to be passed from one program to another; therefore I suggest that the name "PBP " (space included) might result in slightly less error-prone behaviour.

-----
Vincent Manis                {seismo,uw-beaver}!ubc-vision!ubc-cs!manis
Dept. of Computer Science    manis@cs.ubc.cdn
Univ. of British Columbia    manis%ubc.csnet@csnet-relay.arpa
Vancouver, B.C. V6T 1W5      manis@ubc.csnet
(604) 228-6770 or 228-3061

"The difference between capitalism and communism is obvious: under capitalism, man exploits man, while under communism, it is exactly the opposite."