[comp.lang.rexx] parse and getopt

ets@wrkgrp.uucp (Edward T Spire) (09/07/90)

We recognized early on that REXX Parse was not sufficient to easily
implement REXX programs that looked like CMS commands, considering the
fairly complex parameter structure of CMS commands, i.e.

cmdname positional-1 positional-2 ( keyword-1 keyword-2...

Positionals can be required or may have a default value.  Keywords can
have a value or not, and those with values must have a default.  Doing
all that with parse and a do loop is a bit of work.

We developed an external function which is passed a simple template
describing the desired command syntax, along with all the arguments
passed to the main program.  The function (along with its canned
calling sequence) matches the input arguments up with the syntax
template, and establishes variables in the main program's symbol table
corresponding to each operand in the syntax template, the values of which
are the effective values for these operands (either the corresponding
user input, or the appropriate default).  Should the user input fail
to match the required syntax, the function automatically documents the
available operands and defaults, and then fails, so that the main
program will not proceed.
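
To make the matching step concrete, here is a minimal sketch of the
idea.  It is written in Python rather than REXX, and it is not the
actual function, which is not shown in this posting.  The template
syntax is the one used in the invocation sequence in this posting; the
names cparse internals, split_template and SyntaxMismatch, the use of
True/False for keywords without values, and the requirement that "("
be a separate word are all assumptions made for illustration.

```python
import re

class SyntaxMismatch(Exception):
    """Raised when the arguments do not fit the template."""

def split_template(template):
    """Return (positionals, keywords) as lists of (name, default) pairs."""
    groups = ([], [])
    which = 0                       # 0 = positionals, 1 = keywords
    for word in template.split():
        if word == "(":             # a lone "(" starts the keyword part
            which = 1
            continue
        m = re.fullmatch(r"(\w+)(?:\((.*)\))?", word)
        if m is None:
            raise ValueError("bad template word: " + word)
        groups[which].append((m.group(1), m.group(2)))  # default may be None
    return groups

def cparse(template, args):
    """Match the argument string against the template; return effective values."""
    positionals, keywords = split_template(template)
    words = args.split()
    cut = words.index("(") if "(" in words else len(words)
    given, options = words[:cut], words[cut + 1:]

    def fail(reason):
        # Document the expected syntax, then fail so the caller stops.
        raise SyntaxMismatch(reason + "\nexpected: " + template)

    if len(given) > len(positionals):
        fail("too many positional operands")
    values = {}
    for i, (name, default) in enumerate(positionals):
        if i < len(given):
            values[name] = given[i]          # user input wins
        elif default is not None:
            values[name] = default           # fall back to the default
        else:
            fail("missing required operand " + name)
    kinds = dict(keywords)
    for name, default in keywords:
        values[name] = False if default is None else default
    it = iter(options)
    for word in it:
        if word not in kinds:
            fail("unknown keyword " + word)
        if kinds[word] is None:
            values[word] = True              # keyword without a value
        else:
            values[word] = next(it, None)    # keyword followed by its value
            if values[word] is None:
                fail("keyword " + word + " needs a value")
    return values
```

With the template "p1 p2(*) ( nk1 nk2 k1(k1v) k2(k2v)", the input
"foo ( nk2 k1 bar" would give p1=foo, p2=*, nk2 set, k1=bar, and the
defaults for everything else; an empty input fails because p1 has no
default.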

The function is kinda long, so I'll not include it here, but I would
be happy to publish it if anyone would like.  I will show the canned
invocation sequence here, since it is quite short...

/******************* PARSE SEQUENCE PATTERN ***************************/
/* MODIFY THE THIRD LINE AS YOUR "PROTOTYPE" SHOWING PARMS AND DFLTS  */
/* ALSO MODIFY LAST LINE SHOWING PARM NAMES ONLY                      */
/******************* START OF PARSE SEQUENCE **************************/
parse arg a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12
parse value cparse(,
"p1 p2(*) ( nk1 nk2 k1(k1v) k2(k2v)",
||" )" a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12),
with,
p1 '|' p2 '|' nk1 '|' nk2 '|' k1 '|' k2 '|' junk /* JUNK MUST BE LAST */
/******************** END OF PARSE SEQUENCE ***************************/

So the typical REXX programmer begins a program by copying this sequence,
and modifying p1, p2, nk1, etc. into their own parameter list.
Unfortunately, they have to do this in two spots.  Maybe we should
have a macro facility in REXX?  Or better yet, a way for the external
REXX function to directly create variables in the caller's name space
would obviate the kludgey calling sequence I have used.
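
The reason the names appear twice is that the function can only hand
back one string, which the caller then splits on '|' into variables.
The following Python sketch shows that packing and unpacking under
some assumptions: the helper names pack and unpack are hypothetical,
the "0"/"1" encoding of keywords without values is a guess, and no
value may itself contain a '|'.

```python
def pack(values):
    """Join the effective values into one '|'-delimited result string."""
    return "|".join(values) + "|"

def unpack(result, names):
    """Split the result the way PARSE ... WITH v1 '|' v2 '|' ... junk would."""
    parts = result.split("|", len(names))        # at most len(names) splits
    parts += [""] * (len(names) + 1 - len(parts))  # missing pieces become ""
    return dict(zip(names, parts)), parts[-1]    # last piece is the junk

names = ["p1", "p2", "nk1", "nk2", "k1", "k2"]
result = pack(["foo", "*", "0", "1", "bar", "k2v"])
variables, junk = unpack(result, names)
```

The list of names passed to unpack plays the role of the last line of
the calling sequence, which is why it has to repeat the template.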

Now to getopt().  It's true that typical Unix command syntaxes can be
more complex and less standardized than those of CMS.  getopt() helps a
little in handling them, but it doesn't really do much more than you
can do with REXX Parse and a DO loop.  There's no reason why a parser
similar to what I have done could not be written, so that a command
syntax template of a general nature could be used to interpret the
command input.  I think it could have all the flexibility to handle the
various forms of input mentioned in Sam Drake's posting, to wit:

>cmd -abc10 -d20 -e inputfilepathname
> or
>cmd -a -b -c 10 -d 20 -e inputfilepathname
> or
>cmd -abed20 -c10 inputfilepathname

Now that we have moved much of our REXX work to Unix, we are faced with
the need for a simple parser in that environment as well, which will 
make it easy to write REXX programs that look and feel just like Unix 
commands.  I intend to re-write my parsing program to Unix standards, 
if I can ever find a spare couple of days.

==========================================================================

Ed Spire                           email: ets@wrkgrp.com      (on uunet)
The Workstation Group              voice: 800-228-0255
6300 River Road, Suite 700            or  708-696-4800
Rosemont, Illinois  60018            fax: 708-696-2277