[comp.lang.rexx] need a REXX-flavored version of getopt

lynn@phx.mcd.mot.com (Lynn D Newton) (08/27/90)

Has anyone out there in REXX-land written a function that more or
less duplicates the functionality of the UNIX getopt() library
call, which parses command line values in the same way (from a
black box point-of-view) as is customarily done in UNIX?  If so,
I would be deeply indebted.  The ARexx manual by Bill Hawes is
the only reference I have, which has a great deal of abstract
information on parsing, along with very little in the way of
practical demonstration. So I know all the power is built in to
do it.  I just thought that if someone else had written such a thing
and were willing to share it, I could save myself a lot of
hassle trying to program it myself.  Actually, I'm surprised that
such a thing is not already written as a builtin function.

Email is fine, although for the last four days the software on my
system has been broken, so I can mail out, but am not receiving
mail. Hope to have that fixed later today, so if your reply
bounces, try again. Much thanks.


--
=================================================================
Lynn D. Newton            | System Test
Motorola MCD, Tempe, AZ   | (Department of Heuristic Neology)
(602) 438-3739            | "The bug stops here!"
lynn@jazz.phx.mcd.mot.com |

Jeff Boyd <BOYDJ@QUCDN.QueensU.CA> (08/28/90)

For those of us who don't know, could you explain what getopt()
does?  Rexx PARSE is very powerful/flexible, and it would surprise
me if it required any coding at all beyond a single parse command
to deal with most parsing problems.

jpl@charming.nrtc.northrop.com (Jeff Lankford) (08/29/90)

In article <LYNN.90Aug27090555@jazz.phx.mcd.mot.com> lynn@phx.mcd.mot.com (Lynn D Newton) writes:
>Has anyone out there in REXX-land written a function that more or
>less duplicates the functionality of the UNIX getopt() library
>call, which parses command line values in the same way (from a
>black box point-of-view) as is customarily done in UNIX?

I have several VM/CMS REXX scripts that emulate a number of Unix commands.
To support this, I implemented several common library functions, of which
getopt(3) was one (I slightly expanded the acceptable input syntax).  I
also implemented the getopt(1) command, which essentially just invokes the
getopt(3) function.

I suppose you want ARexx for MeSsy-DOS, but I imagine that any conversion
would be trivial, since my getopt(3) function uses standard built-in
functions to manipulate strings.  It's fairly trivial (about 70 lines
including comments) and you could probably implement one from scratch.
However, I'm feeling in a mercenary mood today, so I'll require a nifty*
function in exchange before I distribute.

* nifty: something that helps change the archaic IBM environment
         into a useful contemporary programming environment.

Jeff Lankford           Northrop Research and Technology Center
213/544-5394            One Research Park, Palos Verdes Peninsula, CA 90274

kim@uts.amdahl.com (Kim DeVaughn) (08/31/90)

In article <9091@gremlin.nrtc.northrop.com> jlankford@nrtc.northrop.com (Jeffrey P. Lankford) writes:
> 
> I suppose you want ARexx for MeSsy-DOS, but I imagine that any conversion

I should hope not!  ARexx is the REXX implementation for the Amiga!  It
is "Personal REXX" that's the implementation for the environment you're
refering to.

/kim
-- 
UUCP:  kim@uts.amdahl.com   -OR-   ked01@juts.ccc.amdahl.com
  or:  {sun,decwrl,hplabs,pyramid,uunet,oliveb,ames}!amdahl!kim
DDD:   408-746-8462
USPS:  Amdahl Corp.  M/S 249,  1250 E. Arques Av,  Sunnyvale, CA 94086
BIX:   kdevaughn     GEnie:   K.DEVAUGHN     CIS:   76535,25

drake@drake.almaden.ibm.com (09/05/90)

Actually, getopt() is not entirely trivial to write/simulate in REXX.
Doable, to be sure, but not trivial.  REXX "parse" is perfect for
doing first level parsing of command lines for systems with a CMS-ish
command syntax style.  In CMS, for example, it is conventional for command
lines to look sort of like:

command filename filetype filemode (OPTA OPTB OPTC 10 OPTD 20 OPTE ...

Where "OPTA", "OPTB", and "OPTE" are arguments with no values. REXX Parse
plus a "do" loop for the arguments works pretty well on this.
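
A minimal sketch of that two-level approach, written in Python rather than
REXX and not taken from any poster's code.  The function name parse_cms and
the VALUE_OPTIONS set (which keywords take a value) are assumptions made for
this example only; CMS itself defines neither.

```python
# Hypothetical sketch: split a CMS-style command line into operands and options.
# VALUE_OPTIONS is an assumption for this example, not something CMS defines.
VALUE_OPTIONS = {"OPTC", "OPTD"}


def parse_cms(line):
    """Return (operands, options) for 'cmd fn ft fm ( OPT [value] ...'."""
    # First level: words before '(' are operands, words after it are options.
    # This is the job a literal '(' pattern does in a REXX PARSE template.
    before, _, after = line.partition("(")
    operands = before.split()
    # A closing ')' is optional; ignore it and anything after it.
    tokens = after.partition(")")[0].split()
    options = {}
    i = 0
    # Second level: the "do" loop over the option tokens.
    while i < len(tokens):
        name = tokens[i].upper()
        if name in VALUE_OPTIONS:
            if i + 1 >= len(tokens):
                raise ValueError("option %s needs a value" % name)
            options[name] = tokens[i + 1]
            i += 2
        else:
            options[name] = True
            i += 1
    return operands, options
```

Called on the example line above, parse_cms() returns the four leading words
as operands and a dictionary in which OPTA, OPTB and OPTE are simply present
while OPTC and OPTD carry "10" and "20".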

But other systems have other command conventions.  The same thing in UNIX
might look like:

cmd -abc10 -d20 -e inputfilepathname
 or
cmd -a -b -c 10 -d 20 -e inputfilepathname
 or
cmd -abed20 -c10 inputfilepathname

Parsing this command line is a bit different, not as trivial, and not
as well suited for REXX Parse.  Getopt is better than Parse for this;
you tell getopt what command line flags have arguments and point it at
the command line; it tells you what the next flag found is and points you
to its argument if any (whether blank delimited or not).
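
The scanning described above can be sketched roughly as follows.  This is an
illustration in Python, not REXX and not the real getopt() source; the name
scan_options and its return shape are assumptions made for the example.  The
option string "abc:d:e" follows the usual getopt convention that a letter
followed by ':' takes an argument.

```python
def scan_options(argv, optstring):
    """Return ([(flag, value_or_None), ...], remaining_operands)."""
    found = []
    i = 0
    while i < len(argv):
        word = argv[i]
        if word == "--":                      # explicit end of options
            i += 1
            break
        if len(word) < 2 or word[0] != "-":   # first operand ends the scan
            break
        i += 1
        j = 1
        while j < len(word):                  # walk a cluster such as -abc10
            flag = word[j]
            j += 1
            pos = optstring.find(flag)
            if flag == ":" or pos < 0:
                raise ValueError("unknown option -%s" % flag)
            takes_value = optstring[pos + 1:pos + 2] == ":"
            if not takes_value:
                found.append((flag, None))
                continue
            if j < len(word):                 # value glued on: -c10
                value = word[j:]
            elif i < len(argv):               # value is the next word: -c 10
                value = argv[i]
                i += 1
            else:
                raise ValueError("option -%s needs an argument" % flag)
            found.append((flag, value))
            break                             # rest of this word was consumed
    return found, argv[i:]
```

With the option string "abc:d:e", all three command lines shown above (minus
the leading "cmd") yield the same set of flags and values and leave
"inputfilepathname" as the only operand.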

SO, getopt() is something that it'd be nice to have, even in REXX, 
on systems with unix-like command syntax.  For that matter, something
similar would be quite nice even on CMS; sure, Parse is great for:

parse arg filename filetype filemode junk '(' options
 ... but it's the parsing of "options" that is the hard part, and Parse
 rarely comes into play there.



Sam Drake / IBM Almaden Research Center 
Internet:  drake@ibm.com            BITNET:  DRAKE at ALMADEN
Usenet:    ...!uunet!ibmarc!drake   Phone:   (408) 927-1861

lfk@key.key.amdahl.com (Lynn Kerby) (09/11/90)

In article <147@rufus.UUCP> drake@drake.almaden.ibm.com writes:
>   Actually, getopt() is not entirely trivial to write/simulate in REXX.
>   Doable, to be sure, but not trivial.  REXX "parse" is perfect for
>   doing first level parsing of command lines for systems with a CMS-ish
>   command syntax style.  In CMS, for example, it is conventional for command
>   lines to look sort of like:
>
Well I don't know that I can add much to this discussion, but I am
glad to see that I am not crazy.  The first time I went to write a
Rexx program in ARexx that required UNIX style options, I thought I
must have been doing something wrong.  It is definitely non-trivial to
do right (I won't guarantee that I did it right either).  I don't have
my code handy here, but I recall that it became easier to parse the
options into a variable within a loop.  After all I had heard about
REXX's powerful string parsing capabilities from various sources I was
disappointed.

>   SO, getopt() is something that it'd be nice to have, even in REXX, 
>   on systems with unix-like command syntax.  For that matter, something
>   similar would be quite nice even on CMS; sure, Parse is great for:
I would love to see an implementation of REXX that has getopt()!  It
may even be useful (at least for those with a UNIX inclination) on CMS
or TSO!

--
Lynn Kerby, Amdahl Corporation:  lfk@key.amdahl.com  or  {...}amdahl!key!lfk
<<<<---------------------------- DISCLAIMER ---------------------------->>>>
<<<<      Any and all opinions expressed herein are my own. My          >>>>
<<<<      employer doesn't pay me for my opinion!                       >>>>

drake@drake.almaden.ibm.com (09/12/90)

A fellow traveler pointed out to me, and I should note for the record,
that CMS *does* have something similar to "getopt()" to assist in parsing
CMS-flavored command lines.  It's called "PARSECMD", and is present in 
all recent releases of CMS.



jpl@charming.nrtc.northrop.com (Jeffrey P. Lankford <jlankford>) (09/14/90)

In article <176@rufus.UUCP> drake@drake.almaden.ibm.com writes:
> ... CMS *does* have something similar to "getopt()" to assist in parsing
>CMS-flavored command lines.  It's called "PARSECMD", and is present in 
>all recent releases of CMS.

Before anyone forms the wrong impression that CMS is actually a useful
programming environment, some comparison of PARSECMD and getopt() is in order.

The salient features of getopt(3) are:
	* uses a (simple) grammar to specify valid options and any associated
	  (optional) arguments,
	* accepts a less restrictive format for an instance of options and
	  arguments and generates a canonical expression of the options and
	  arguments,
	* parses the instance and generates an error condition in the
	  event that the instance violates the grammar,
	* the getopt Unix library function returns individual tokens,
	  while the getopt Unix command returns the canonical format (easily
	  tokenized, since tokens are merely separated by white space).
In a previous posting, Sam Drake had a good description with some examples.
I would repeat that here but i lost that posting, so here is another
example (vaguely remembered from a man page somewhere),
where the specification describes a statement of "option 'a' or 'b' or 'o'
(followed by one or more arguments) followed by anything":
	c = getopt(argc,argv,"abo:")	/* C language invocation */
	set -- `getopt abo: $*`		# Bourne shell invocation
All the following are legal statements in this grammar (the last being in
canonical form), where "f1" and "f2" are (optional) arguments, not options:
	-a -o arg f1 f2
	-aoarg f1 f2
	-oarg -a f1 f2
	-a -o arg -- f1 f2
Getopt has the following limitations:
	* no means of describing mutually exclusive options
	  (i.e., either 'a' or 'b' or neither is legal, but not both --
	  this must be handled by user code that deals with the individual
	  options),
	* no means of specifying that arguments to an option may or
	  may not be present (ie "o:" specifies that if the 'o' option
	  appears it must be followed by one (1) or more arguments --
	  in my extension to getopt i arbitrarily use "o." to indicate
	  that any 'o' may be followed by zero (0) or more arguments),
	* limited to about 50 options (ie, a-zA-Z -- although any character
	  in the character set could be used, do you really want to use
	  non-printing characters for option flags?),
	* no multi-character options (ie, "ab" is always 'a' and 'b' --
	  although this could be considered an advantage rather than a
	  drawback).
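
As a rough illustration of the points above, here is a sketch using Python's
standard getopt module, which is modelled on the C library call.  The helper
name canonical and the rule that 'a' and 'b' exclude each other are
assumptions for this example only.  Note that in Python's module, as in POSIX
getopt(3), a ':' after a letter means the option takes exactly one argument.

```python
import getopt


def canonical(argv, optstring="abo:"):
    """Rewrite argv in the canonical 'options -- operands' form."""
    # Raises getopt.GetoptError if argv violates the grammar in optstring.
    opts, operands = getopt.getopt(argv, optstring)
    flags = [flag for flag, _ in opts]
    # getopt cannot express "a or b but not both"; user code has to check.
    # (Hypothetical rule, used here only to show where the check belongs.)
    if "-a" in flags and "-b" in flags:
        raise ValueError("-a and -b are mutually exclusive")
    words = []
    for flag, value in opts:
        words.append(flag)
        # A letter followed by ':' in optstring carries an argument.
        if flag[1] + ":" in optstring:
            words.append(value)
    return " ".join(words + ["--"] + operands)
```

For the four statements listed earlier, canonical() returns
"-a -o arg -- f1 f2" in every case except the third, where the options come
out in the order given ("-o arg -a -- f1 f2").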

PARSECMD is both a CMS command and a macro, where the command is
callable from REXX and the macro from assembler programs.  Suppose
we want to parse the command:
	MYCmd1 fn ft [ ( [Disk|PRint] [NUMrecs nnn] [)] ]
where '(' and ')' are literals, '|' is an or, '[' and ']'
enclose an optional part, upper case indicates abbreviation, and
'nnn' indicates a numeric constant.
The grammar is specified in a DLCS (acronym city here we come) formatted file
as follows:
	:DLCS DMS USER AMENG ;;
	  :CMD MMYCMD1 MYCMD1 MYCMD1 3 :;
	    :SYN MY1 3 :;
	    :OPR FCN(FN) :;
	    :OPR FCN(FT) :;
	    :OPT KWL(<DISK 2> <PRINT 2>) :;
	    :OPT KWL(<NUMRECS 3>) FCN(PINTEGER) :;
The grammar specification is converted to a format that PARSECMD
can manipulate by invoking the 'CONVERT COMMANDS' CMS command.
Then 'SET LANGUAGE' must be invoked to activate the appropriate language
parser.  The French grammar spec might look like:
	:DLCS DMS USER FRANC;;
	  :CMD MMYCMD1 MYCMD1 FRANCMD1 8 :;
	    :SYN MY1 3 :;
	    :OPR FCN(FN) :;
	    :OPR FCN(FT) :;
	    :OPT KWL(<DISK 2 DISQUE 4> <PRINT 2 IPRIMER 4>) :;
	    :OPT KWL(<NUMRECS 3 NOMENREG 6>) FCN(PINTEGER) :;
Then your REXX script can call 'PARSECMD MYCMD1' to read the command line
and generate a canonical form, where options are tokenized to non-abbreviated
form and shoved into a compound symbol (named 'token' i believe).  As with
getopt, the CMS command parser mechanism leaves the handling of mutually
exclusive options to the user code (despite the fact that the grammar implies
some sort of exclusive option specification).  Unlike getopt, an arbitrary
number of arguments associated with an option is not definable; however,
options can be multiple characters with an abbreviation capability
(this is still way too verbose).

Bottom line:  though perhaps slightly less powerful, getopt is vastly easier
to use than the CMS command parser mechanism; the command syntax of
getopt is more elegant (to my eye) and has an implicit escape to permit
an arbitrary number of trailing tokens, whereas the CMS mechanism requires
every token to be predefined (which is one reason why CMS commands can't
easily handle multiple filename arguments).

Any further discussion of getopt or PARSECMD should be directed to /dev/null.
	    
Whew, now that we've flogged this topic past the pearly gates,
(and it's not even a REXX topic), let's get back to discussing REXX.
How about any of the following
(where clearly the environment of choice should be CMS):
  * Portability issues among the various run-time environments
  * Performance tuning tricks for environment X
  * Debugging techniques for environment Y
  * Language extension proposals (only well-reasoned arguments need apply)
  * Cute functions i (no, not me -- you) have coded
  * Standardization efforts (:-) [WARNING: Surgeon General advises that
    discussion of this topic may be hazardous to your health]

Also, at the risk of being pedantic, I request that future postings should
consider explicit identification of the run-time environment(s) (of course,
i exclude myself because after this posting everyone knows i'm using CMS/REXX),
since it's evident that REXX features vary among the run-time
environments on which it is supported (Amiga, CMS, Unix(?), ...).

Now here's a question (reply directly and i might post summary)
for all those folks looking for a REXX interpreter/compiler for Unix.
Why?   When you could use Bourne shell, or csh, or ksh, or tcsh (or *sh)
(and all the Unix commands expr, awk, sed), why use REXX?
I can't imagine any hefty REXX applications being ported without modification
to a different environment (say CMS to Unix), and trivial applications
could easily be re-written.  REXX without extensions would make a
lousy Unix command interpreter (no pipes or i/o redirection or job
control or ...) and if the REXX application isn't a command script,
but more a string processing application, why not use awk?


brooking@mcnc.org (Jim Brooking) (09/18/90)

In article <9493@gremlin.nrtc.northrop.com>, jpl@charming.nrtc.northrop.com (Jeffrey P. Lankford <jlankford>) writes:
> ...
> (where clearly the environment of choice should be CMS):

I don't see why discussions of REXX ought to be restricted to any
particular environment since REXX is, in fact, available on unix, MS-DOS
PC's, Amigas, TSO and likely others I'm not aware of, as you note
below.
> 
> ...
> environments on which it is supported (Amiga, CMS, Unix(?), ...).
> 
> Now here's a question (reply directly and i might post summary)
> for all those folks looking for REXX interpreter/compiler for Unix.
> Why?   When you could use Bourne shell, or csh, or ksh, or tcsh (or *sh)
> (and all the Unix commands expr, awk, sed), why use REXX?

Spoken like an unemployed unix hacker who's been reduced to accepting
charity from a shop running CMS. I can't believe anyone who has used
the internally consistent REXX language and its function set(s) to any
large extent would suggest the use of multiple shell scripts with
arcane grafted-together hacks like awk are in any way comparable to a
REXX program with its (REXX) simplicity and clarity of expression, not
to mention the ease of programming it.

> I can't imagine any hefty REXX applications being ported without modification
> to a different environment (say CMS to Unix), and trivial applications

Sure, sure, and no C program or shell script has ever had system
dependencies coded into it. Right. A really valid criticism for REXX and
certainly not applicable to a--n--y other language. Please....

> could easily be re-written.  REXX without extensions would make a
> lousy Unix command interpreter (no pipes or i/o redirection or job
> control or ...) and if the REXX application isn't a command script,
> but more a string processing application, why not use awk?

Unless I'm missing something one can pipe into and out of a unix REXX
program, redirect I/O, etc. What's the problem? Awk's great if you can
write it correctly the first time, and cover all the contingencies
extant in the file being awked. If not, you will more often than not get one of
the really helpful error messages awk is famous for, or possibly not get
an indication that there has been a problem, which awk is also noted
for.

Arguments about "whose thing is better" are fundamentally religious in
nature. If one has been born and raised in a unix environment, one will
likely be inclined to favor that environment and compare all others to
it. If one has had experience with a variety of environments, the
tendency to refer to any as "not useful" or worse (Unisys' timesharing
excluded, of course...8-) does not contribute to much of anything in the
discussions at hand, namely, REXX topics.


-- 
>8-}     >:-)     %\(     8^)     :+/     |'[     ;-)     :-O     B^\    :-)
Jim Brooking........North Carolina Supercomputing Center.......(919)248-1145

lynn@phx.mcd.mot.com (Lynn D Newton) (09/18/90)

I started this whole thing, and have, I believe, seen all of the
traffic on it.  It began when I asked if anyone out there had
written a fairly complete implementation of the UNIX getopt()
function that they would be willing to share with the world (me).

I did receive a number of email replies, some of which missed the
point entirely of what I was looking for. At least one person
understood completely, and even offered to write it if I would
supply the C source code to getopt(), which unfortunately, is
AT&T copyrighted, and as a software engineer working for a UNIX
source licensee, I have signed nondisclosure agreements
preventing me from sending out such things.  I thank all those
who took the trouble to reply in some fashion or other.

BTW, there _is_ an AT&T-produced public domain version of
getopt() floating around somewhere which differs from the
copyrighted version only in the way it handles one or two of the
rules that are controversial to start with.  I've seen the source
to both, and the variable names are even the same. I may even
have it at home, if anyone is interested (it's only about 60
lines of C code, as I recall).

The bulk of the replies have said essentially:

	o 'Tis indeed a difficult thing to do.

which I knew. I'm a reasonably capable programmer myself, but am
not inclined to tackle it on my own, because I have other things
to tend to, which is why I was looking for a handout.

	o My computer is better than yours.

More language wars. On the positive side, it has increased the
amount of traffic in this group, which I am glad to see. I've
seen it go for weeks without a posting.

However, at this writing I still have no REXX implementation of
getopt(). I think I will gather all the discussion that has been
posted and snailmail it to William Hawes, the programmer of
ARexx, and see if I can convince him to write a getopt() library
routine for inclusion with ARexx so that all the great number of
Amiga users who have or will have ARexx (it's bundled with
version 2.0 of the operating system, for the information of
readers who are unaware of it), can also have this very valuable
functionality.