[comp.unix.shell] Shell METACHAR's in parameters

david@marvin.jpl.oz (David Magnay) (01/22/91)

Can anyone supply a "rule" for consistently handling shell parameters inside
a script, when the parameter MAY contain shell metacharacters or a regular
expression? The danger is that the shell will process the characters rather
than pass them literally.

We should also be able to pass regular expressions down from one script to
another, to any depth, again without knowing whether the parameter contains
metacharacters or not.

This first showed up in a script that wraps the "find" utility in a
friendlier interface. I found that the script would intermittently fail
silently. If I pass "file*" as a parameter, it is passed literally
if it matches NO files in the current directory, but is expanded if any
matching files exist (try "echo kkxx*"). This is fixed by the user quoting
the parameter on the command line when invoking the script. But I still
found oddities occasionally, presumably because of unwise usage of
parameters inside the script.

What I can't see is the guiding RULE, so that I don't repeat the problem.

krs@uts.amdahl.com (Kris Stephens [Hail Eris!]) (01/23/91)

In article <829@marvin.jpl.oz> david@marvin.jpl.oz (David Magnay) writes:
>Can anyone supply a "rule" of how to consistently handle shell parameters inside
>a script, when the parameter MAY contain shell meta-characters or a regular
>expression. The danger is that the shell will process the characters, rather
>than pass them literally.

I first tripped over this with a USAGE variable:

	USAGE="usage: $0 [-f] filename"
	if [ somebadcondition = true ]
	then
		echo $USAGE 1>&2
		exit 1
	fi

and what happened, depending on the contents of PWD, was that I'd get

	usage: thecmd f filename

"What happened to the brackets and the minus-sign???", I asked myself.
Well, of course, there was a file named "f" in PWD, and it was matched
by the   [-f]   term in the USAGE variable, and so replaced that term.
Here's one fix:

	USAGE="usage: $0 [-f] filename"
	if [ somebadcondition = true ]
	then
		echo "$USAGE" 1>&2
		exit 1
	fi

because, once quoted, only the first level of evaluation by the shell
remains (i.e. the variable is replaced by its contents, and the contents
are not split into words or matched against filenames).  *Every* time you
reference a variable which you either know to contain shell meta-characters
or suspect might (as the result of a read or user-arguments), double-quote it.
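
The difference can be reproduced in a scratch directory. This is only a
minimal sketch, assuming a POSIX-style sh with mktemp available; the literal
name "thecmd" stands in for $0, and the variable names are illustrative.

```shell
# Work in a scratch directory that contains a file named "f",
# so the pattern [-f] has something to match.
start_dir=$PWD
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
: > f

USAGE="usage: thecmd [-f] filename"

# Unquoted: after substitution the shell splits the value into words and
# treats [-f] as a filename pattern, which matches the file "f".
unquoted=$(echo $USAGE)

# Quoted: the value is substituted and passed on untouched.
quoted=$(echo "$USAGE")

printf '%s\n' "unquoted: $unquoted" "quoted:   $quoted"

# Clean up the scratch directory.
cd "$start_dir" && rm -rf "$demo_dir"
```

With the file "f" present, the first line printed loses the brackets and the
minus sign, and the second keeps them.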

By the way, if you use ksh, you have an alternative (though the use of
double-quotes is identical).  If you    set -f    , then no shell
meta-characters are matched in filename generation.  This is switchable,
so you can

	while read re
	do
		if [ ! -z "$re" ]
		then
			set -f
			grep $re somefile
			set +f
		fi
	done

Note that even if you're running the whole script under 'set -f', you
still have to quote "$re" in the test to see if it's non-null, because
if it *is* null, test will report a missing argument.
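
As a hedged illustration of why the quotes still matter under 'set -f': on
POSIX-style shells, where test decides what to do from the number of
arguments it receives, an unquoted empty variable may not produce an error
at all and can simply give the wrong answer. The sketch below uses -n rather
than ! -z to make that visible; the variable names are illustrative.

```shell
set -f          # filename generation off, as in the loop above
re=""

# Unquoted, the empty value vanishes and test sees only "-n": a
# one-argument test, which is true for any non-empty string.
if [ -n $re ]; then unquoted_says=nonempty; else unquoted_says=empty; fi

# Quoted, test receives "-n" plus an explicit empty argument.
if [ -n "$re" ]; then quoted_says=nonempty; else quoted_says=empty; fi

set +f
printf '%s\n' "unquoted: $unquoted_says" "quoted: $quoted_says"
```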

When the user, on calling the script, entered

	thecommand -f file*

without quoting file* as either "file*" or 'file*' or file\*, the meta-
character matching was done before your script got control.  If there
were fourteen files in PWD starting with "file" in their names, then
thecommand was called with -f and fourteen filename arguments.  Such
is life when the user's command-environment is a programming language.
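
A small sketch of what the called command actually receives, assuming a
POSIX-style sh with its default globbing behaviour. The function count_args
is a hypothetical stand-in for "thecommand", and the three file names are
made up for the demonstration.

```shell
# count_args stands in for "thecommand": it reports how many arguments it got.
count_args() { echo "$#"; }

start_dir=$PWD
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
: > file1; : > file2; : > file3

expanded=$(count_args -f file*)    # the shell expands file* first: 4 arguments
literal=$(count_args -f 'file*')   # quoted: 2 arguments, pattern intact
nomatch=$(count_args -f kkxx*)     # nothing matches, so it is passed as typed: 2

printf '%s\n' "expanded=$expanded literal=$literal nomatch=$nomatch"

cd "$start_dir" && rm -rf "$demo_dir"
```

The last case is the reason the original script failed only intermittently:
an unquoted pattern survives as long as nothing in the directory matches it.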

...Kris
-- 
Kristopher Stephens, | (408-746-6047) | krs@uts.amdahl.com | KC6DFS
Amdahl Corporation   |                |                    |
     [The opinions expressed above are mine, solely, and do not    ]
     [necessarily reflect the opinions or policies of Amdahl Corp. ]

byron@archone.tamu.edu (Byron Rakitzis) (01/23/91)

In article <1991Jan23.045104.5557@NCoast.ORG> allbery@ncoast.ORG (Brandon S. Allbery KB8JRR) writes:
>As quoted from <11k701EV0dwd00@amdahl.uts.amdahl.com> by krs@uts.amdahl.com (Kris Stephens [Hail Eris!]):
>+---------------
>| In article <829@marvin.jpl.oz> david@marvin.jpl.oz (David Magnay) writes:
>| >Can anyone supply a "rule" of how to consistently handle shell parameters inside
>| >a script, when the parameter MAY contain shell meta-characters or a regular
>| >expression. The danger is that the shell will process the characters, rather
>| >than pass them literally.
>| 
>| 			set -f
>| 			grep $re somefile
>| 			set +f
>+---------------
>
>Still unsafe --- "set -f" in ksh doesn't stop it from splitting at spaces.
>What happens if the value of $re is "x y"?
>

The shell I am almost finished writing, an almost-public (to use Henry
Spencer's turn of phrase) implementation of rc, the AT&T v10 and plan
9 shell, does not have any problems passing arguments in a literal
fashion even if they are stored in variables.

A variable is a list type. If you assign:

a=(one two three)

then when you type:

grep $a

you get {"grep", "one", "two", "three", NULL} passed to grep in
argv[].

However, if you assign

a='one two three'

when you type:

grep $a

grep sees in its argv: {"grep", "one two three", NULL}

It's that simple; after a value has been assigned to a variable, it
is not altered. If a variable has many elements, then each element
of that variable is assigned a slot in argv.
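
For comparison only, and not as rc syntax: in a Bourne-style shell the
nearest thing to such a list is the positional parameters, and "$@" hands
each element on as exactly one argument. A rough sketch, assuming a POSIX
sh; show_argv is a hypothetical helper, not part of any shell.

```shell
# show_argv prints each argument it receives on its own line, in brackets.
show_argv() { for arg in "$@"; do printf '[%s]\n' "$arg"; done; }

set -- one two three           # a three-element list
as_list=$(show_argv "$@")      # passed on as three arguments

set -- 'one two three'         # a one-element list
as_single=$(show_argv "$@")    # passed on as one argument, spaces preserved

printf '%s\n' "$as_list" "$as_single"
```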

The latest version of rc is available by anonymous ftp from
archone.tamu.edu in ~ftp/pub/rc. The copies of rc are stored by date
(since rc is still changing very fast) but any given copy, if it
compiles on your machine (portability is not quite guaranteed) will
probably be fairly stable on your machine. Of course, let me know
about any problems you have getting rc to work.

For those keeping track of rc's progress, the only big feature left to
go in is heredocs. I will probably add "break" and "return" keywords
by popular demand, but other than that, rc should freeze very soon now.
(i.e., further developments will be confined to a new rc, and the old
rc will be updated for bug fixes and general unix robustness.)

--
---
Byron Rakitzis
byron@archone.tamu.edu

krs@uts.amdahl.com (Kris Stephens [Hail Eris!]) (01/24/91)

In article <1991Jan23.045104.5557@NCoast.ORG> allbery@ncoast.ORG (Brandon S. Allbery KB8JRR) writes:
>As quoted from <11k701EV0dwd00@amdahl.uts.amdahl.com> by krs@uts.amdahl.com (Kris Stephens [Hail Eris!]):
>+---------------
>| In article <829@marvin.jpl.oz> david@marvin.jpl.oz (David Magnay) writes:
>| >Can anyone supply a "rule" of how to consistently handle shell parameters inside
>| >a script, when the parameter MAY contain shell meta-characters or a regular
>| >expression. The danger is that the shell will process the characters, rather
>| >than pass them literally.
>| 
>| 			set -f
>| 			grep $re somefile
>| 			set +f
>+---------------
>
>Still unsafe --- "set -f" in ksh doesn't stop it from splitting at spaces.
>What happens if the value of $re is "x y"?

Absolutely correct.  I *hate* it when things like that slip my mind.
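
The objection is easy to check. A minimal sketch, assuming a POSIX-style sh
with the default IFS; args_seen is a hypothetical stand-in for grep that
only counts its arguments.

```shell
re="x y"

# args_seen stands in for grep: it reports its argument count.
args_seen() { echo "$#"; }

set -f                              # filename generation off
split=$(args_seen $re somefile)     # still split at the space: 3 arguments
set +f

whole=$(args_seen "$re" somefile)   # quoted: 2 arguments, no set -f needed

printf '%s\n' "split=$split whole=$whole"
```

With the quotes in place, grep would receive "x y" as a single pattern, and
the set -f / set +f pair around it becomes unnecessary.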

...Kris