[alt.sources] Update to getarg

darcy@druid.uucp (D'Arcy J.M. Cain) (07/11/90)

In article <1990Jul11.003712.21570@druid.uucp> in alt.sources.d I wrote:
>I will be re-posting getarg, my replacement for getopt, tomorrow.  Mainly
>this fixes a few minor bugs.  Also, as pointed out in email, I forgot to
>cover the "--" argument situation.  This has been fixed in this version.
>Thanks to all those who sent me mail.  Hope you find this latest version
>useful.
>
So here it is:

/*
getarg.c
Written by D'Arcy J.M. Cain
D'Arcy Cain Consulting
275 Manse Road, Unit # 24
West Hill, Ontario
M1E 4X8
416 281 6094

UUCP: darcy@druid

This routine may be freely distributed as long as credit is given to D'Arcy
J.M. Cain, the source is included and this notice remains intact.  There is
specifically no restrictions on use of the program including personal or
commercial.  You may even charge others for this routine as long as the above
conditions are met.

This is not shareware and no registration fee is expected.  If you like the
program and want to support this method of distribution, write a program or
routine and distribute it the same way and I will feel I have been paid.

Of course gifts of money, drinks and extravagant jewels are always welcome.

As for documentation, you're looking at it.

First of all let me start by saying that I do not envision getarg as
a plug and play replacement for getopt.  That is why I used a different
name.  It is designed to look more or less the same to the user but it
is not quite the same for the programmer.  I believe that it is a more
logical and elegant interface to the command line than getopt.  What I
am trying to say is that I make no apology for not emulating the clumsy
programmer interface of getopt.

This set of routines is a replacement for getopt.  To the user it should
look the same except that options and files may be intermingled on the
command line instead of forcing options to come first.  This allows
things like the following where option 'a' takes a word argument:
	command -vx -a on file1 -a off file2
allowing the user to process file1 with the a flag on and file2 with the a
flag off.

In addition, the caller may set up the argument list from more than one
place.  The first place, of course, would be from the command line as
getopt does.  Other possibilities are to read an environment variable
or a file for arguments.  You may even have one of the options cause a
file to be read to insert arguments.  I am suggesting that "-@ filename"
be used for consistency unless someone can suggest why this would not
be suitable.

To implement this, getarg splits the function into two main parts plus
a third part which is added for programmer convenience.  The first part
is initarg which initialises the argument list.  The prototype is:

	int		initarg(int argc, char **argv);

and would normally be called as follows:

	initarg(argc - 1, argv + 1);

This function can be called as often as you wish.  Each time you call
initarg, the argument list is stuffed into the current list at the point
which is currently being processed.  Thus, after making the above call you
might look for an environment variable and if it exists then parse it into
an argument list and call initarg again with that list.  This effectively
allows users to set up defaults in their .profile and to override them when
calling the program.  For example, say that there is a program 'foo' which
takes an option 'a' whose argument is the word 'on' or 'off', and a user
wants to set it up so that it is normally off.  The user could add the
line:
	foo=-aoff
to the .profile.  If one time the user wants the option on then a command
line such as
	foo -a on
is effectively
	foo -aoff -a on
which, if the code is written to allow flags to change back and forth, will
change the default action.
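
As an illustration only, the stuffing step can be modelled on its own
roughly as follows.  The name insert_args and its parameters are made up
for this sketch and are not part of getarg.c; the caller is assumed to have
made sure that the list has room for the extra pointers.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-alone model of the insertion step: put the nadd
// pointers from add into list at index pos, moving the tail up.
// list must have room for count + nadd pointers.  Returns the new count.
int insert_args(char **list, int count, int pos, char **add, int nadd)
{
	int k;

	assert(pos >= 0 && pos <= count);

	for (k = count - 1; k >= pos; k--)
		list[k + nadd] = list[k];		// make room at pos

	for (k = 0; k < nadd; k++)
		list[pos + k] = add[k];			// drop in the new arguments

	return count + nadd;
}
```

With the list { "-a", "on" }, nothing processed yet (pos 0) and the single
environment argument "-aoff", the result is { "-aoff", "-a", "on" }, which
is the ordering described above.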

In addition you can use arguments from the command line to open files
and read in more arguments which can be stuffed into the argument
stream.

	if (c == '@')
		load_args_from_file(optarg);
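
No load_args_from_file is supplied with this module.  A minimal sketch of
the splitting step such a routine would need might look like the following,
assuming the file has already been read into a writable buffer and that
arguments are simply separated by white space (no quoting).  The name
split_args and its parameters are invented for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

// Hypothetical helper, not part of getarg.c: split buf in place into
// words separated by white space.  Stores at most max pointers in out
// and returns the number stored.
int split_args(char *buf, char **out, int max)
{
	int n = 0;

	assert(buf != NULL && out != NULL);

	while (n < max)
	{
		while (isspace((unsigned char)*buf))
			buf++;				// skip separators

		if (*buf == 0)
			break;				// nothing left

		out[n++] = buf;			// start of a word

		while (*buf && !isspace((unsigned char)*buf))
			buf++;

		if (*buf)
			*buf++ = 0;			// terminate the word
	}

	return n;
}
```

The count and the pointer array could then be handed to initarg.  Since
initarg copies only the pointers, the array itself may be a local variable
but the buffer holding the text has to stay allocated until getarg has
returned all of the arguments.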

Note that there is a possibility of a problem if initarg is called while
an argument is being parsed.  Consider the following code:

	while ((c = getarg("abcz")) != 0)
	{
		switch (c)
		{
			case 'a':
				something();
				break;

			case 'b':
				something();
				break;

			case 'c':
				something();
				break;

			case 'z':
				foo("standard.arg");
				break;
		}
	}

where foo is designed to read a file as a series of arguments and call
initarg.  This can cause serious problems if you have a command such as
"bar -abzc" since the new arguments are inserted while getarg is still part
way through "-abzc", so the entry it is working on now holds the first
argument from the file.  Of course this will probably never be a problem
since you would most likely include the file name to be read with the
option rather than hard coding it, so the current argument will have been
consumed, and getarg will be pointing to the next argument, before the file
is read in.


For programmer convenience there is a routine called initarge() which
is prototyped as:
	int		initarge(int argc, char **argv);
and is normally called as
	initarge(argc, argv);

Notice that this is similar to the initarg example above except that all
the arguments are sent including the program name.  This function will in
turn call initarg with argc - 1 and argv + 1.  In addition it will take the
program's base name and look for an environment variable with that name.
If found, it parses the string and stuffs that in as well.  Note that the
environment string may include arguments with spaces if the argument is
delimited by quotes.  This could get a little tricky since the quotes must
be escaped in the shell.  For example the following string
	-a "hello, world"
would have to be created with the following command
	foo="-a \"hello, world\""
and of course strings that have quotes in them get even more complicated.
	foo="-a \"This is a quote - \\\"\""
which becomes
	-a "This is a quote - \""
And that becomes the strings
	-a
	This is a quote - "


Both initarg and initarge return -1 on error.  The only error condition
is not enough memory to run the routines.  Otherwise they return the
number of arguments added to the argument list.


The other module is the function getarg which returns options and
arguments to the calling program.  It is prototyped as follows:
	int			getarg(const char *optstr);

The parameter optstr is similar to the third parameter to getopt(3).
The other arguments are not needed since initarg gets the argument list.

There are five possible returns from getarg.  If there are no more options
or arguments to read then it returns a zero.  If an option is read but
is not matched in optstr or a needed argument is missing then a question
mark is returned.  If a non option argument is read then -1 is returned
and optarg points to the argument.  If a '-' appears as a separate argument
then '-' is returned, indicating standard input (only by convention, of
course).  Otherwise it must be a valid option and the letter
is returned.  If it is an option expecting an argument then optarg points
to the argument.

The use of "--" is allowed as in getopt to signal the end of options.  As
soon as getarg sees this argument it sets a flag and points to the next
argument.  It then calls itself to get the next argument.  The recursive
call is to keep all the error checking and cleanup in one place.

One extra feature I have added is a semi-colon operator similar to the
colon operator.  If an option letter in optstr is followed by a semi-colon,
the option may *optionally* take an argument.  The argument must follow the
option letter as part of the same argument.  This normally means no spaces
between the option letter and its argument.  I am not too sure about this
one since it has to be treated a little differently from other arguments by
the user.  Comments on this feature hereby solicited.
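
To make the three spellings concrete, here is a small sketch of how an
option letter might be looked up in optstr.  It is a stand-alone
illustration, not the code getarg itself uses; the enum and the function
name opt_kind_of are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper, not part of getarg.c: report how the option
// letter c is declared in optstr.
enum opt_kind { OPT_UNKNOWN, OPT_FLAG, OPT_REQUIRED, OPT_OPTIONAL };

enum opt_kind opt_kind_of(const char *optstr, int c)
{
	const char *cp;

	assert(optstr != NULL);

	// the operators themselves and NUL are never option letters
	if (c == 0 || c == ':' || c == ';')
		return OPT_UNKNOWN;

	if ((cp = strchr(optstr, c)) == NULL)
		return OPT_UNKNOWN;

	if (cp[1] == ':')
		return OPT_REQUIRED;	// "a:" argument must be present

	if (cp[1] == ';')
		return OPT_OPTIONAL;	// "o;" argument only if attached, as in -ofile

	return OPT_FLAG;			// plain flag
}
```

With the string "abo;v" used in the sample below, 'o' is reported as
optional while 'a', 'b' and 'v' are plain flags.  In getarg terms, -ofile
sets optarg to "file" and a bare -o leaves optarg as NULL.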

The global variable opterr is not implemented.  With Windows and other
screen based applications getting to be so popular it is assumed that the
calling routine will want to do its own error message handling.  I am
also thinking about dropping optind as a global since I haven't figured
out any use for it now.  If someone can think of one, please let me know.
In the meantime you shouldn't declare optind as external in your program
unless you have to.

Sample usage assuming two mutually exclusive options 'a' and 'b', option
'o' to specify output and option 'v' which must be set before any files
are processed and not allowed after processing begins.

	main(int argc, char **argv)
	{
		int c, proc_started = 0;
		int aflag = 0, bflag = 0, vflag = 0, errflag = 0;
		FILE	*in_fp, *out_fp = stdout;
		extern char *optarg;
		static char *opt_str[] = { "abo;v", "abo;" };
		.
		.
		.
		initarg(argc - 1, argv + 1);
			--- OR ---
		initarge(argc, argv);

		while ((c = getarg(opt_str[proc_started])) != 0)
		{
			switch (c)
			{
				case 'a':
					if (bflag)
						errflag++;
					else
						aflag++;
					break;

				case 'b':
					if (aflag)
						errflag++;
					else
						bflag++;
					break;

				case 'v':
					vflag++;
					break;

				case 'o':
					if ((out_fp != stdout) && (out_fp != NULL))
						fclose(out_fp);

					if (optarg == NULL)	** no argument means stdout **
						out_fp = stdout;
					else if ((out_fp = fopen(optarg, "w")) == NULL)
						err_exit("Can't open output file");

					break;

				case -1:
					if ((in_fp = fopen(optarg, "r")) != NULL)
						do_stuff(in_fp, out_fp);
					else
						err_exit("Can't open input file\n");

					proc_started = 1;
					break;

				case '-':
					do_stuff(stdin, out_fp);
					proc_started = 1;
					break;

				case '?':
					usage();
					errflag++;
					break;
			}

			if (errflag)
				do_error_stuff_and_exit();
		}
	}

*/

#ifdef BSD
#include <strings.h>
#else
#include <string.h>
#define	index	strchr
#endif

#include	<stdlib.h>
#include	<malloc.h>
#include	<ctype.h>

#ifndef		NULL
#define		NULL	(void *)(0)
#endif

int		optind = 0;
char	*optarg;

/* Note that the above declarations can cause problems with programs
that use getopt(3) if this module is scanned first in the link phase.
This means that if you use getopt sometimes then you should keep this
module separate and link it in specifically when needed.  Alternatively
you can change the names of the above externs (perhaps declare optind as
static as programs don't really need it anyway) and have a #define so
that the program still uses the above name(s).  I considered using a
different name for optarg but was afraid that anything I picked would
conflict with users' names.
*/

static char		**pargv = NULL;
static int		pargc = 0;

int		initarg(int argc, char **argv)
{
	int		k = argc * sizeof(char *);

	/* check for trivial case */
	if (!argc)
		return(0);

	/* get or expand space; sizes are in bytes, so the existing pointer
	   count has to be scaled as well */
	if (pargc == 0)
		pargv = malloc(k);
	else
		pargv = realloc(pargv, (pargc + argc) * sizeof(char *));

	if (pargv == NULL)
		return(-1);				/* not enough memory for argument pointers */

	/* if adding arguments insert them at current argument */
	if (pargc)
		for (k = pargc - 1; k >= optind; k--)
			pargv[k + argc] = pargv[k];

	for (k = 0; k < argc; k++)
		pargv[optind + k] = argv[k];

	pargc += argc;
	return(argc);				/* number of arguments added, as documented */
}


int		initarge(int argc, char **argv)
{
	char	*env_str, *env_args[64];
	int		k, j = 0;
#ifdef	__MSDOS__
	char	prog_name[64];
#endif

	if ((k = initarg(argc - 1, argv + 1)) == -1)
		return(-1);				/* not enough memory for argument pointers */

#ifdef	__MSDOS__
	if ((env_str = strrchr(argv[0], '\\')) == NULL)
	{
		strcpy(prog_name, argv[0]);
		if ((env_str = strchr(prog_name, ':')) != NULL)
			/* source and destination overlap so strcpy is not safe here */
			memmove(prog_name, env_str + 1, strlen(env_str + 1) + 1);
	}
	else
		strcpy(prog_name, env_str + 1);

	if ((env_str = strchr(prog_name, '.')) != NULL)
		*env_str = 0;

	if ((env_str = getenv(prog_name)) == NULL)
#else
	if ((env_str = strrchr(argv[0], '/')) != NULL)
		env_str++;
	else
		env_str = argv[0];

	if ((env_str = getenv(env_str)) == NULL)
#endif
		return(k);

	if ((env_args[0] = malloc(strlen(env_str) + 1)) == NULL)
		return(-1);				/* not enough memory for argument pointers */

	env_str = strcpy(env_args[0], env_str);

	while (isspace(*env_str))
		env_str++;

	/* stop before env_args overflows */
	while (*env_str && j < (int)(sizeof(env_args) / sizeof(env_args[0])))
	{
		if (*env_str == '"')
		{
			env_args[j++] = ++env_str;

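			while (*env_str && *env_str != '"')
			{
				if (*env_str == '\\' && env_str[1])
				{
					char	*p = env_str;

					/* drop the backslash so that the character after
					   it is kept literally; shift by hand because
					   strcpy is undefined on overlapping strings */
					while ((*p = p[1]) != 0)
						p++;
				}

				env_str++;
			}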
		}
		else
		{
			env_args[j++] = env_str;

			while (*env_str && !isspace(*env_str))
				env_str++;
		}

		if (*env_str)
			*env_str++ = 0;

		while (*env_str && isspace(*env_str))
			env_str++;
	}

	/* pass the number of environment arguments found, not k */
	if ((j = initarg(j, env_args)) == -1)
		return(-1);				/* not enough memory for argument pointers */

	return(j + k);
}

/*
The meat of the module.  This returns options and arguments similar to
getopt() as described above.
*/

int		getarg(const char *opts)
{
	static int sp = 0, end_of_options = 0;
	int c;
	char *cp;

	optarg = NULL;

	/* return 0 if we have read all the arguments */
	if(optind >= pargc)
	{
		if (pargv != NULL)
			free(pargv);

		/* reset everything so that a later initarg starts afresh */
		pargv = NULL;
		pargc = 0;
		optind = 0;
		end_of_options = 0;
		return(0);
	}

	/* Are we starting to look at a new argument? */
	if(sp == 0)
	{
		/* return it if it is a file name */
		if ((*pargv[optind] != '-') || end_of_options)
		{
			optarg = pargv[optind++];
			return(-1);
		}

		/* special return for standard input */
		if (strcmp(pargv[optind], "-") == 0)
		{
			optind++;
			return('-');
		}

		/* "--" signals end of options */
		if (strcmp(pargv[optind], "--") == 0)
		{
			end_of_options = 1;
			optind++;
			return(getarg(opts));
		}

		/* otherwise point to option letter */
		sp = 1;
	}
	else if (pargv[optind][++sp] == 0)
	{
		/* recursive call if end of this argument */
		sp = 0;
		optind++;
		return(getarg(opts));
	}

	c = pargv[optind][sp];

	/* neither operator character can be an option letter */
	if(c == ':' || c == ';' || (cp = index(opts, c)) == NULL)
		return('?');

	if(*++cp == ':')
	{
		/* Note the following code does not allow leading
		   spaces or all spaces in an argument */

		while (isspace(pargv[optind][++sp]))
			;

		if(pargv[optind][sp])
			optarg = pargv[optind++] + sp;
		else if(++optind >= pargc)
			c = '?';
		else
			optarg = pargv[optind++];

		sp = 0;
	}
	else if (*cp == ';')
	{
		while (isspace(pargv[optind][++sp]))
			;

		if (pargv[optind][sp])
			optarg = pargv[optind] + sp;

		optind++;
		sp = 0;
	}

	return(c);
}

-- 
D'Arcy J.M. Cain (darcy@druid)     |   Government:
D'Arcy Cain Consulting             |   Organized crime with an attitude
West Hill, Ontario, Canada         |
(416) 281-6094                     |