[comp.lang.perl] Proposed enhancements to MS-DOS perl

lbr@holos0.uucp (Len Reed) (09/20/90)

I'm planning to make several important enhancements to the MS-DOS
version of perl.  I'd appreciate knowing about any problems,
ideas, or applicable code anyone may have.

1)   Fix up the test scripts to work under MS-DOS.  Most do
     already.  Those that don't typically do one of the
     following:
       a) Attempt to access an illegal DOS file name.  For
          example, open(HANDLE, ">This.file.tmp");  There's no
          reason that both Unix and MS-DOS couldn't use the same,
          shorter name.
       b) Attempt something that MS-DOS can't do, like forking.
       c) Use a Unixism, like "/bin/rm".

2)   Minor bug corrections.  The program should pass as much of
     the test suite as possible under DOS.

3)   Better determine where optimization fails and selectively
     turn off specific optimizer switches as needed.

4)   Swapping while running a subprocess.  Since 640K isn't much
     when you're running perl, it would be nice to have perl swap
     itself out when doing subprocesses.  Fortunately, I have
     already modified one program to do this, so it'll just mean
     hooking perl to run that code.

5)   Complete MKS compatibility (see below for a plug for MKS). 
     I don't usually modify free software to have full MKS
     compatibility, but perl is an obvious case for such
     treatment.  I have parts of this lying around from other
     work.
       a) Accept 8 K-bytes of arguments as per MKS conventions,
          so that Korn shell commands like "perl script *.c" work
          even if there are a lot of c files.
       b) Pass extended argument list to sub processes.
       c) Run Korn shell instead of command.com to do
          subprocesses when a shell is needed, handling
          metacharacter set properly.
       d) Run MKS glob.exe to do globbing of things like <ab*.c>.
       e) Handle "switch" character properly.  (Currently
          perl.exe will run "command /c the_command" instead of
          "command -c the_command" when the switch character is a
          hyphen instead of a slash.)
       f) Continue to work as well as can be expected if MKS tool
          kit is not present.  I.e., the enhancements shouldn't
          mess anything up.

Mortice Kern Systems (MKS) sells a line of Unix clone tools for
MS-DOS and OS/2.  If you're a Unix aficionado stuck in these
environments, you should buy at least the basic tool kit.  It
includes vi, the Korn shell, most of the usual head/tail/ls type
tools, and awk.  (Well, perhaps Larry Wall has made that
obsolete.)  I have no relationship with MKS other than that of
satisfied customer and consultant to a satisfied customer.
-- 
Len Reed
Holos Software, Inc.
Voice: (404) 496-1358
UUCP: ...!gatech!holos0!lbr

feustel@netcom.UUCP (David Feustel) (09/22/90)

I haven't got Perl yet, but I've been using the MKS toolkit for
several years. I think the idea of MKS compatibility for PERL is
great!!!  I would be very happy to help you test it if you send me
a copy of the updated PERL.
-- 
David Feustel, 1930 Curdes Ave, Fort Wayne, IN 46805, (219) 482-9631

roy%cybrspc@cs.umn.edu (Roy M. Silvernail) (09/22/90)

lbr@holos0.uucp (Len Reed) writes:

> I'm planning to make several important enhancements to the MS-DOS
> version of perl.  I'd appreciate knowing about any problems,
> ideas, or applicable code anyone may have.
[...]
> 5)   Complete MKS compatibility (see below for a plug for MKS).
>        c) Run Korn shell instead of command.com to do
>           subprocesses when a shell is needed, handling
>           metacharacter set properly.

Please make this optional, as I usually run 4dos for a command
interpreter. Perhaps an environment variable to select the shell of
choice?

Also, please consider hacking the startup code to correctly parse args
such as

perl -e 'put a script here, complete with whitespace' foo.bar

(where the single quotes would be correctly handled)

Thanks for the consideration!
--
Roy M. Silvernail |+|  roy%cybrspc@cs.umn.edu  |+| #define opinions ALL_MINE;
main(){float x=1;x=x/50;printf("It's only $%.2f, but it's my $%.2f!\n",x,x);}
"This is cyberspace." -- Peter da Silva  :--:  "...and I like it here!" -- me

lbr@holos0.uucp (Len Reed) (09/23/90)

In article <cZVVP2w163w@cybrspc> roy%cybrspc@cs.umn.edu (Roy M. Silvernail) writes:
>Please make [running Korn shell for subprocesses] optional, as I usually
>run 4dos for a command interpreter.

Yes, I've thought of those who use command.com (yech), MKS Korn shell, and
even those who use other shells.  I've been planning an environment
variable scheme that should handle all possibilities.
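
One way such an environment variable scheme might look is sketched below.
This is only an illustration, not the scheme actually planned: the
variable name PERLSHELL and the function pick_shell() are hypothetical,
while COMSPEC is the usual DOS variable naming the command interpreter.

```c
#include <stddef.h>

/* Pick the command interpreter to use for subprocesses.
 * override: value of a hypothetical PERLSHELL variable (NULL or "" if unset)
 * comspec:  value of the standard DOS COMSPEC variable (NULL or "" if unset)
 * Falls back to plain "command.com" when neither is set. */
const char *pick_shell(const char *override, const char *comspec)
{
    if (override != NULL && override[0] != '\0')
        return override;
    if (comspec != NULL && comspec[0] != '\0')
        return comspec;
    return "command.com";
}
```

A caller would pass in the two lookups, for example
pick_shell(getenv("PERLSHELL"), getenv("COMSPEC")), so a 4dos or Korn
shell user overrides the default without affecting anyone else.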

>Also, please consider hacking the startup code to correctly parse args
>such as
>
>perl -e 'put a script here, complete with whitespace' foo.bar
>
>(where the single quotes would be correctly handled)

Several persons have told me about -e problems.  They don't even appear to
work on the Korn shell, which seems weird since the ksh should take the
stuff between single quotes and put it into a single argument for perl.
(This assumes that the expansion doesn't exceed the piddling 128-byte
command tail limit that will remain until perl.exe is fully MKS
compatible.)

Note that this should fall out for a Korn shell user: the MKS Korn shell
should do all the messing with metacharacters and just pass the arguments
to perl.exe.  I won't promise to do argument expansion for the non-MKS
user; I'll only promise that I won't make things worse than they are
and that I'll leave a trail for the interested programmer to pick up.
On the other hand, I may end up doing this as a side effect of having
to handle the case where, even with the MKS kit intact, perl.exe gets
run by a non-MKS command and thus has to invoke glob.exe.  Stay tuned.
-- 
Len Reed
Holos Software, Inc.
Voice: (404) 496-1358
UUCP: ...!gatech!holos0!lbr

lbr@holos0.uucp (Len Reed) (09/23/90)

In article <1990Sep22.182316.5325@holos0.uucp> lbr@holos0.uucp (Len Reed) writes:
>
>Several persons have told me about -e problems.  They don't even appear to
>work on the Korn shell, which seems weird since the ksh should take the
>stuff between single quotes and put it into a single argument for perl.

Did I really write this? :-)

Of course it can't work with the current version of perl.exe.  If the shell
strips out the quotes, perl.exe will split the "arguments" on the embedded
white space.  If a shell doesn't strip the quotes, perl.exe won't get
it right either because it doesn't treat quotes specially.  MKS argument
passing will handle this for MKS users; for others, perl.exe will have to
glob the command line.  As I said, I may or may not do the latter, but
at the very least I'll leave hooks for someone else to do it.
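
The quote handling perl.exe would need for non-MKS users can be sketched
roughly as follows. This is a minimal illustration, not code from
perl.exe: split_args() is a hypothetical name, and the sketch assumes
the whole command tail arrives as one writable string. It keeps quoted
text together and drops the quote characters, but does no wildcard
expansion and knows nothing about backslash escapes.

```c
#include <stddef.h>

/* Split a raw command tail into arguments, in place.
 * Blanks and tabs separate arguments; text inside '...' or "..." is
 * kept together and the quote characters themselves are dropped.
 * Returns the number of arguments stored in argv (at most max_args). */
int split_args(char *tail, char **argv, int max_args)
{
    int argc = 0;
    char *src = tail;   /* next character to read */
    char *dst = tail;   /* next position to write; never passes src */

    while (*src != '\0') {
        char quote = '\0';

        while (*src == ' ' || *src == '\t')
            src++;
        if (*src == '\0' || argc >= max_args)
            break;

        argv[argc++] = dst;
        while (*src != '\0') {
            if (quote != '\0') {
                if (*src == quote)
                    quote = '\0';       /* closing quote: drop it */
                else
                    *dst++ = *src;
            } else if (*src == '\'' || *src == '"') {
                quote = *src;           /* opening quote: drop it */
            } else if (*src == ' ' || *src == '\t') {
                break;                  /* unquoted blank ends the argument */
            } else {
                *dst++ = *src;
            }
            src++;
        }
        if (*src != '\0')
            src++;      /* step over the separator before terminating */
        *dst++ = '\0';
    }
    return argc;
}
```

Given the tail from the earlier example, -e 'print 1 + 2' foo.bar, the
script text comes out as a single argument between "-e" and "foo.bar".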
-- 
Len Reed
Holos Software, Inc.
Voice: (404) 496-1358
UUCP: ...!gatech!holos0!lbr

dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) (09/26/90)

In <1990Sep20.013320.9162@holos0.uucp> lbr@holos0.uucp (Len Reed) writes:

>       e) Handle "switch" character properly.  (Currently
>          perl.exe will run "command /c the_command" instead of
>          "command -c the_command" when the switch character is a
>          hyphen instead of a slash.)

I'd like to warn all people who are trying to handle the MS-DOS
"switch" character correctly.  (Non-MS-DOS users may want to tune out.
The MS-DOS "switch" character is one of Microsoft's more interesting
fiascos.)

The "switch" character handling under MS-DOS runs roughly as
follows:

     MS-DOS version     Switch char handling
     --------------     --------------------
     1.x                none
     2.x                documented, correctly handled
     3.x                undocumented; correctly handled
                        by COMMAND.COM, CONFIG.SYS,
                        and most utilities
     4.x                undocumented; unrecognized in
                        CONFIG.SYS; correctly handled
                        by no utilities (or almost none);
                        recognized by parts of COMMAND.COM
                        but mostly not.

The real problem arises in MS-DOS 4.x, in which COMMAND.COM no
longer accepts a /C or -C etc. depending upon the switchar.
It always wants /C.  So, if your program checks the switchar,
finds it is "-", and uses "COMMAND -C", it will get an
error message from COMMAND.COM.

Thus, there are two possible ways of correctly handling the switchar
value.

(a) See what the switchar is, find out the MS-DOS version, and handle
according to the above table.  For MS-DOS versions 2.x and 3.x, use
"COMMAND -C" if the switchar is "-".  For MS-DOS version 4.x, always
use "COMMAND /C".

(b) Save the current switchar, set it to "/", and use "COMMAND /C".
Then set it back to what it was.
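
Option (a) reduces to a small decision function. The sketch below is
only an illustration of the table above: command_switch() is a
hypothetical name, and it assumes the caller has already obtained the
DOS major version (INT 21h function 30h) and the current switchar
(INT 21h function 37h). Versions not covered by the table are assumed
to want "/".

```c
/* Choose the option character to put before "C" when running COMMAND.COM.
 * dos_major: major MS-DOS version number (1, 2, 3, 4, ...)
 * switchar:  the current switch character, '/' or '-'
 * Only DOS 2.x and 3.x honor a '-' switchar; everything else gets '/'. */
char command_switch(int dos_major, char switchar)
{
    if ((dos_major == 2 || dos_major == 3) && switchar == '-')
        return '-';
    return '/';
}
```

Under DOS 3.x with a "-" switchar this yields "COMMAND -C"; under 4.x it
yields "COMMAND /C" no matter what the switchar is.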
--
Rahul Dhesi <dhesi%cirrusl@oliveb.ATC.olivetti.com>
UUCP:  oliveb!cirrusl!dhesi

roy%cybrspc@cs.umn.edu (Roy M. Silvernail) (09/27/90)

lbr@holos0.uucp (Len Reed) writes:

> Several persons have told me about -e problems.  They don't even appear to
> work on the Korn shell, which seems weird since the ksh should take the
> stuff between single quotes and put it into a single argument for perl.
> (This assumes that the expansion doesn't exceed the piddling 128-byte
> command tail limit that will remain until perl.exe is fully MKS
> compatible.)

The problem is actually within the startup code of the C compiler. I did
some initial experiments with a batch file that echoed args. This showed
that 4dos was indeed breaking up the args according to my wishes. A
similar C program (built with Turbo C 2.0), though, showed all args broken
on whitespace, except where surrounded by double quotes. From this, I
deduced that the command interpreter simply passes the entire command
line to the program, and the program does its own expansion and arg
assignment.

Examining the asm source for setargv and the startup source (both
supplied with Turbo C) shows this to be the case. It looks now as though
I'd be better off waiting for Perl 4.0 before attempting to port it to
Turbo. (how soon *is* RSN? :-)
--
Roy M. Silvernail |+|  roy%cybrspc@cs.umn.edu  |+| #define opinions ALL_MINE;
main(){float x=1;x=x/50;printf("It's only $%.2f, but it's my $%.2f!\n",x,x);}
"This is cyberspace." -- Peter da Silva  :--:  "...and I like it here!" -- me

stu@gtisqr.uucp (Stu Donaldson) (09/28/90)

In article <1990Sep20.013320.9162@holos0.uucp> lbr@holos0.uucp (Len Reed) writes:
>I'm planning to make several important enhancements to the MS-DOS
>version of perl.  I'd appreciate knowing about any problems,
>ideas, or applicable code anyone may have.

>4)   Swapping while running a subprocess.  Since 640K isn't much
>     when you're running perl, it would be nice to have perl swap
>     itself out when doing subprocesses.  Fortunately, I have
>     already modified one program to do this, so it'll just mean
>     hooking perl to run that code.

I've run out of memory with perl under DOS, and would like to see
it capable of using extended memory.  (What about overlays? :-))

Also, as for another freaping creature, I'd like to see access to
the associative array binding to files.  There are several PD versions
of B-tree type file access routines that would be better than nothing.

tdinger@hiredgun.East.Sun.COM (Tom Dinger - Sun BOS SPA) (09/29/90)

I have already done a bunch of the things you mentioned in the following:

In article <1990Sep20.013320.9162@holos0.uucp> lbr@holos0.uucp (Len Reed) writes:
>I'm planning to make several important enhancements to the MS-DOS
>version of perl.  I'd appreciate knowing about any problems,
>ideas, or applicable code anyone may have.
>
>1)   Fix up the test scripts to work under MS-DOS.  Most do
>     already.  Those that don't typically do one of the
>     following:
>       a) Attempt to access an illegal DOS file name.  For
>          example, open(HANDLE, ">This.file.tmp");  There's no
>          reason that both Unix and MS-DOS couldn't use the same,
>          shorter name.
>       b) Doing something that MS-DOS can't do, like forking.
>       c) Using a Unixism, like "/bin/rm".

I have modified many (most?) of the scripts, that weren't hopelessly mired in
**ix-isms.  For example, many of the io.* scripts did not seem easy to
"translate."

The ones that I have had problems with are:
	comp.cpp	(C pre-processor is in no fixed DOS place)
	io.*		(seem to be **ix-dependent)
	op.fork		(no fork() under DOS)
	op.goto		(??? can't find my notes on this one)
	op.magic	(DOS DIR command incompatible with **ix ls command)
	op.split	(??? no notes here either)
	op.stat		(lots of differences, mostly due to handling of
			 file time under DOS)
	op.times	(I used a "no-op" times() function).

The rest of the scripts (renamed) run, and I believe almost all tests pass.

>
>2)   Minor bug corrections.  The program should pass as much of
>     the test suite as possible under DOS.

I have found and fixed a bunch of bugs:
1. I "fixed" some of the #includes in perl.h to include more of the Microsoft
	headers.
2. I added a dummy times() function -- none was supplied with the DOS perl
	support files.
3. I cleaned up use of the symbols FCNTL and I_FCNTL -- in the sources and
	Configure script, they were effectively identical and interchangeable.
	I changed things so that I_FCNTL means "the file <fcntl.h> should be
	#included" (MSC has that file), and FCNTL means "the fcntl() function
	is available, use it" (MSC does not have that function).
4. You need a <sys/param.h> file to satisfy the #includes, but it can be empty.
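
The split between the two symbols in item 3 might look like the sketch
below. It is an illustration of the idea rather than the actual perl
source: the settings mimic an MSC build as described (header present,
function absent), and set_close_on_exec() is a hypothetical helper
chosen only to give the FCNTL guard something to protect.

```c
/* Assumed settings for a Microsoft C build, as described above:
 * the header exists, the function does not. */
#define I_FCNTL 1
/* FCNTL is deliberately left undefined. */

#ifdef I_FCNTL
#include <fcntl.h>      /* O_RDONLY and friends */
#endif

/* Mark a descriptor close-on-exec where fcntl() exists.
 * Returns 0 on success, -1 on failure or when fcntl() is unavailable. */
int set_close_on_exec(int fd)
{
#ifdef FCNTL
    return fcntl(fd, F_SETFD, FD_CLOEXEC) == -1 ? -1 : 0;
#else
    (void)fd;           /* no fcntl() here: report "not done" */
    return -1;
#endif
}
```

With the two meanings separated, a system can supply the header without
promising the call, which is exactly the MSC situation.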

>
>3)   Better determine where optimization fails and selectively
>     turn off specific optimizer switches as needed.

I did this -- I can now compile all the sources using -Ox (full optimization,
including loop optimization.)  This turned up a (known) bug in MSC's loop
optimization (found in array.c) that caused the unshift test (I think) to
crash perl.  In addition it turned up a previously unknown bug (to me that is)
in the more "normal" -Os or -Ot optimizations, that caused incorrect code
to be generated within regcomp.c, causing regexp tests 73 and 115 to fail.
I have a 4-page write-up of this bug, if anyone's interested.

>
>4)   Swapping while running a subprocess.  Since 640K isn't much
>     when you're running perl, it would be nice to have perl swap
>     itself out when doing subprocesses.  Fortunately, I have
>     already modified one program to do this, so it'll just mean
>     hooking perl to run that code.

First, the good news: I have a _drop-in replacement_ for the MSC lowest-level
spawn functions, that will permit swapping to EMS, XMS or DISK, by default,
without _any_ source code changes (link with an OBJect file and a library).
Add one line of C (define a global variable) and you can control in which order
it will try the different swap targets, with an environment variable of your
own naming.  When perl (or any program using spawn*()) is swapped, it uses
only 2460 bytes, plus a copy of the child's environment, plus its own
environment (total: about 3K bytes).  Pretty good for a 300K+ program.
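
The post does not say what the controlling environment variable
contains, so the following is purely a guess at how such an ordering
might be read: it assumes a comma-separated list such as "XMS,DISK" and
a default order of EMS, XMS, DISK. The enum and parse_swap_order() are
hypothetical names, not part of the code described.

```c
#include <string.h>

enum swap_target { SWAP_EMS, SWAP_XMS, SWAP_DISK };

/* Parse a comma-separated preference list such as "XMS,DISK" into order[].
 * Unknown or repeated names are skipped.  A NULL or empty spec yields the
 * assumed default order EMS, XMS, DISK.  Returns the number of entries
 * written to order[] (never more than 3). */
int parse_swap_order(const char *spec, enum swap_target order[3])
{
    static const char *names[3] = { "EMS", "XMS", "DISK" };
    int seen[3] = { 0, 0, 0 };
    int count = 0;

    if (spec == NULL || *spec == '\0') {
        order[0] = SWAP_EMS;
        order[1] = SWAP_XMS;
        order[2] = SWAP_DISK;
        return 3;
    }
    while (*spec != '\0') {
        size_t len = strcspn(spec, ",");    /* length of this name */
        int i;

        for (i = 0; i < 3; i++) {
            if (!seen[i] && strlen(names[i]) == len &&
                strncmp(spec, names[i], len) == 0) {
                seen[i] = 1;
                order[count++] = (enum swap_target)i;
            }
        }
        spec += len;
        if (*spec == ',')
            spec++;
    }
    return count;
}
```

A spawn replacement would then try each target in turn until one has
room for the swapped-out image.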

Now the bad news: currently, though I wrote the code, it belongs to my company,
and my partner and I are still discussing what we want to do with it. (I wrote
it for a "make" program I wrote for our company).  So I have not (yet)
distributed either the binaries or sources.  I have only had the code working
for about three weeks, so it may yet see the light of day.

>
>5)   Complete MKS compatibility (see below for a plug for MKS). 
>     I don't usually modify free software to have full MKS
>     compatibility, but perl is an obvious case for such
>     treatment.  I have parts of this lying around from other
>     work.
>       a) Accept 8 K-bytes of arguments as per MKS conventions,
>          so that Korn shell commands like "perl script *.c" work
>          even if there are a lot of c files.
>       b) Pass extended argument list to sub processes.
>       c) Run Korn shell instead of command.com to do
>          subprocesses when a shell is needed, handling
>          metacharacter set properly.
>       d) Run MKS glob.exe to do globbing of things like <ab*.c>.
>       e) Handle "switch" character properly.  (Currently
>          perl.exe will run "command /c the_command" instead of
>          "command -c the_command" when the switch character is a
>          hyphen instead of a slash.)
>       f) Continue to work as well as can be expected if MKS tool
>          kit is not present.  I.e., the enhancements shouldn't
>          mess anything up.

PLEASE make this optional (I noted in later postings that you intend to do so.)
I do not use MKS, nor am I about to start; however, I would welcome a
"reasonable" standard for supplying long command lines to applications, as
long as it is backward-compatible with applications that know nothing
about it (lots of them).

Other things I have done to perl [the audience gasps]:

1. As many DOS-perl users will not be compiling perl for their own system,
   and additionally because DOS users must contend with drive letters, I
   have added to _all_ versions of perl (not just DOS) the ability to
   use the PERLLIB environment variable as the path on which to find the
   perl "library."

2. Patch 19 changed stab.c to use the setuid() function "unguarded".
   Patch 21 changed doarg.c to use both setuid() and setgid() functions.
   I added two symbols: SETUID and SETGID, and guarded those function
   calls with them, so that they could be disabled for DOS.

3. I added the OS/2 add_suffix() routine to the DOS version (great stuff!)

4. I added use of the TMP environment variable for the perl -e option, to
   create the temporary input script.  Rationale was the same as for
   PERLLIB, plus it is a quasi-standard in the DOS world.
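
For item 4, building the temporary script name from TMP might be
sketched like this. It is an assumption-laden illustration, not the
actual change: tmp_script_path() and the file name used in the usage
note are hypothetical, and the caller is assumed to pass in the result
of getenv("TMP").

```c
#include <string.h>

/* Build "<tmpdir>\<name>" in out.  tmpdir is the value of the TMP
 * environment variable; NULL or "" means "use the current directory".
 * A separator is added unless tmpdir already ends in '\', '/' or ':'.
 * Returns 0 on success, -1 if the result would not fit in outsize bytes. */
int tmp_script_path(const char *tmpdir, const char *name,
                    char *out, size_t outsize)
{
    size_t dlen = (tmpdir != NULL) ? strlen(tmpdir) : 0;
    int need_sep = 0;

    if (dlen > 0) {
        char last = tmpdir[dlen - 1];
        need_sep = (last != '\\' && last != '/' && last != ':');
    }
    if (dlen + (size_t)need_sep + strlen(name) + 1 > outsize)
        return -1;

    out[0] = '\0';
    if (dlen > 0)
        strcpy(out, tmpdir);
    if (need_sep)
        strcat(out, "\\");
    strcat(out, name);
    return 0;
}
```

With TMP set to C:\TEMP and a name such as perl-e.tmp, the result is
C:\TEMP\perl-e.tmp; with TMP unset the file lands in the current
directory.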

In the on-deck circle:

1. Using a custom chdir() function for DOS, that will change the drive as well
   as the path if a drive is present.  For example:

	chdir("\\");	/* change to root of the current drive */
	chdir("D:");	/* change to drive D, keeping D's current directory */
	chdir("E:\\");	/* change to drive E, change its current dir to root */
	etc.

   No new functions in perl are needed; everything you need to do under DOS
   is available.

2. Finish converting the test scripts to work under DOS.

3. [Just a thought] MSC provides enough information and hooks in the start-up
   code to replace the ARGV processing -- we could produce a command-line
   parser that would handle quoted arguments "smarter" and could glob, like
   the command-line processing the DOS shell recently made available.
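
The drive-aware chdir() from item 1 needs little more than a check for
a drive prefix. The sketch below shows only that parsing step, under
the assumption that the real function would then select the drive
through whatever the compiler's library offers and hand any remaining
path to the ordinary chdir(). split_drive() is a hypothetical name.

```c
#include <ctype.h>
#include <stddef.h>

/* Split "D:\path" into a drive number and the remaining path.
 * *drive is set to 1 for A:, 2 for B:, and so on, or to 0 when the
 * path carries no drive prefix.
 * Returns a pointer to the part after the prefix (possibly ""). */
const char *split_drive(const char *path, int *drive)
{
    if (isalpha((unsigned char)path[0]) && path[1] == ':') {
        *drive = toupper((unsigned char)path[0]) - 'A' + 1;
        return path + 2;
    }
    *drive = 0;
    return path;
}
```

For "E:\\" this gives drive 5 and the path "\\"; for "D:" it gives
drive 4 and an empty path, so only the drive changes; for "\\" it gives
drive 0, meaning the current drive is left alone.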

>-- 
>Len Reed
>Holos Software, Inc.
>Voice: (404) 496-1358
>UUCP: ...!gatech!holos0!lbr

I intend to make all the changes I made to perl available Real Soon Now.


Acknowledgements:

Thanks to Larry, for a terrific tool, and portable to many platforms;
Thanks to Diomidis Spinellis, for the lion's share of the work in the DOS port;
Thanks to all of the other Perl Hackers and Enhancers, for bug fixes and
	porting work;
Thanks to Len for offering to do the work, and spurring me on to go public;
I'd like to thank my agent, and my parents, and everybody I've ever met...

TD
----------
Tom Dinger	     consulting at:
TechnoLogics, Inc.        Sun Microsystems    Internet: tdinger@East.Sun.COM
(508)486-8500             (508)671-0521       UUCP: ...!sun!suneast!tdinger

lbr@holos0.uucp (Len Reed) (09/30/90)

In article <1990Sep28.150103.29089@gtisqr.uucp> stu@gtisqr.uucp (Stu Donaldson) writes:
>
>I've run out of memory with perl under DOS, and would like to see
>it capable of using extended memory.  (What about overlays? :-))

I haven't looked closely at perl's innards; perhaps it's possible to put
the compilation code in a separate overlay.

Are you proposing that DOS perl allow script controlled access to extended
memory or that it transparently use extended memory?  The former would
break Unix compatibility; the latter would be a serious programming effort.
(Maybe not.  Anyone have free DOS extender code; i.e., code that will
run a subprocess in protected mode, handling system calls and hardware
interfaces properly?)

>Also, as for another freaping creature, I'd like to see access to
>the associative array binding to files.  There are several PD versions
>of B-tree type file access routines that would be better than nothing.

Agreed.  I won't do this in the current set of enhancements, though.
-- 
Len Reed
Holos Software, Inc.
Voice: (404) 496-1358
UUCP: ...!gatech!holos0!lbr

lbr@holos0.uucp (Len Reed) (09/30/90)

Tom, it sounds like you and I need to get together to come up with a single
DOS version that can serve as a standard for future work.  I'm sending you
e-mail.

In article <2764@jaytee.East.Sun.COM> tdinger@east.sun.com (Tom Dinger - Sun BOS SPA) writes:
>
>I have already done a bunch of the things you mentioned...
>
>>       c) Using a Unixism, like "/bin/rm".
>
>I have modified many (most?) of the scripts, that weren't hopelessly mired in
>**ix-isms.  For example, many of the io.* scripts did not seem easy to
>"translate."
>
>The ones that I have had problems with are:
	[list omitted]
>
>The rest of the scripts (renamed) run, and I believe almost all tests pass.

You must have been running your swapping version, since the scripts often
invoke sub-perls.
>
>>
>>2)   Minor bug corrections.  The program should pass as much of
>>     the test suite as possible under DOS.
>
>I have found and fixed a bunch of bugs:

I fixed these too, long ago.  I'm surprised that some of these made it out
at all, since they cause compilation to fail.
>
>
>
>First, the good news: I have a _drop-in replacement_ for the MSC lowest-level
>spawn functions, that will permit swapping to EMS, XMS or DISK...
>
>Now the bad news: currently... the code... belongs to my company,

I already had such code lying around, or I'd never have attempted this.
I too wrote it for a make program: Dennis Vadura's dmake.  Mine will use
XMS or DISK, but I usually compile the XMS portion out and use a RAM disk.
Mine also already has the MKS argument passing built in.
>
>>
>>5)   Complete MKS compatibility (see below for a plug for MKS). 
>
>PLEASE make this optional (I noted in later postings that you intend to do so.)
>I do not use MKS, nor am I about to start; however, I would welcome a
>"reasonable" standard for supplying long command lines to applications, as
>long as it is backward-compatible with applications that know nothing
>about it (lots of them).

It is.  Both the extended argument list and the standard 128 byte command
tail are passed out.  Without this compatibility, you wouldn't be able to
run non-MKS programs from their Korn shell, which would of course be
completely unacceptable.
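
The standard half of that pair can be sketched as follows. This covers
only the ordinary DOS command tail, assuming the conventional layout of
a count byte, the text, and a terminating carriage return within 128
bytes; the MKS extended-argument convention is not described in this
thread and is not shown. build_command_tail() is a hypothetical name.

```c
#include <string.h>

#define TAIL_MAX_TEXT 126   /* 128-byte area: 1 count byte + text + CR */

/* Fill a 128-byte DOS command tail from args.
 * tail[0] holds the text length, the text follows, then a carriage
 * return.  Returns 0 if everything fit, 1 if the text was truncated. */
int build_command_tail(const char *args, unsigned char tail[128])
{
    size_t len = strlen(args);
    int truncated = 0;

    if (len > TAIL_MAX_TEXT) {
        len = TAIL_MAX_TEXT;
        truncated = 1;
    }
    tail[0] = (unsigned char)len;
    memcpy(tail + 1, args, len);
    tail[1 + len] = '\r';
    return truncated;
}
```

A program that knows nothing about extended arguments sees only this
tail, which is why a long list has to be cut to fit rather than left
out.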
>
>Other things I have done to perl [the audience gasps]:
>
>1. As many DOS-perl users will not be compiling perl for their own system,
>   and additionally because DOS users must contend with drive letters, I
>   have added to _all_ versions of perl (not just DOS) the ability to
>   use the PERLLIB environment variable as the path on which to find the
>   perl "library."

I agree.  But of course it's Larry's decision.
>4. I added use of the TMP environment variable for the perl -e option, to
>   create the temporary input script.  Rationale was the same as for
>   PERLLIB, plus it is a quasi-standard in the DOS world.

Of course.
>
>In the on-deck circle:
>
>1. Using a custom chdir() function for DOS, that will change the drive as well
>   as the path if a drive is present.  For example:
>
>	chdir("\\");	/* change to root of the current drive */
>	chdir("D:");	/* change to drive D, keeping D's current directory */
>	chdir("E:\\");	/* change to drive E, change its current dir to root */
>	etc.

Similar to the MKS Korn shell's cd command, though *it* ignores any concept of
current directory in an alternate drive: the 2nd example above would put
you into D:'s root.  You know that even DOS 4.x will take forward slashes
in system calls?  It's only command.com (which I don't use) and certain
programs (LINK, WP) that insist on backslashes.
>
>3. [Just a thought] MSC provides enough information and hooks in the start-up
>   code to replace the ARGV processing -- we could produce a command-line
>   parser that would handle quoted arguments "smarter" and could glob, like
>   the command-line processing the DOS shell recently made available.

My version, even in a vanilla DOS environment, will glob wildcards and
handle double quoted arguments.  It falls short of a Unix shell, but
it works.
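
The matching step at the heart of such globbing can be sketched as
below. This is a generic illustration, not the code from the version
described: wild_match() is a hypothetical name, it handles only "*" and
"?", and it compares case-sensitively, whereas a DOS version would
presumably fold case and would also need the directory search that
produces the candidate names.

```c
/* Return 1 if name matches pattern, where '*' matches any run of
 * characters (including none) and '?' matches exactly one character;
 * return 0 otherwise. */
int wild_match(const char *pattern, const char *name)
{
    while (*pattern != '\0') {
        if (*pattern == '*') {
            pattern++;
            /* Try the rest of the pattern at every remaining position,
             * including the very end of the name. */
            for (;;) {
                if (wild_match(pattern, name))
                    return 1;
                if (*name == '\0')
                    return 0;
                name++;
            }
        }
        if (*name == '\0')
            return 0;
        if (*pattern != '?' && *pattern != *name)
            return 0;
        pattern++;
        name++;
    }
    return *name == '\0';
}
```

A pattern like ab*.c then accepts ab.c and abxyz.c but rejects ab.h.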
-- 
Len Reed
Holos Software, Inc.
Voice: (404) 496-1358
UUCP: ...!gatech!holos0!lbr