[comp.unix.questions] An awk question or two... answered

rwl@uvacs.CS.VIRGINIA.EDU (Ray Lubinsky) (09/13/87)

In article <3931@well.UUCP>, daniels@well.UUCP (Dan Smith) writes:
> 
> 	Command lines, passing variables to awk:
> 	awk -f comline.awk comvar=\"SUB\" ascii.h
> 	this is the file comline.awk:
> 	comvar	{ print }
:
> 	What I want to do is to pass *two* command line variables, and
> have awk work on the lines within the two patterns -- such as:
:
> 	comlinevar1, comlinevar2 {
> 		/-f/	{ print $2 }
> 	}

Awk(1) simply doesn't allow for variables in the pattern specifications, so if
you want to bundle your script into a neat package that accepts command-line
options, you should use a shell script to handle the insertion of options.

The first problem would become the script:

	#! /bin/sh
	awk '/'$1'/ { print }' ascii.h

where the string SUB would be the argument to the script.  The second could
be either:

	#! /bin/sh
	awk '
	$1 == "'$1'" { ok = 1 ; next }
	$1 == "-f" && ok != 0 { S[n++] = $2 ; next }
	$1 == "'$2'" { if (ok) { for (i = 0; i < n; i++) print S[i] ; ok = n = 0 } }
	'

which is pretty rigorous, or it could be:

	#! /bin/sh
	awk '$1 == "'$1'" , $1 == "'$2'" { if ($1 == "-f") print $2 }'

which is a lot simpler but (at least under 4.2 BSD's awk(1)) prints matching
lines between the last occurrence of the first pattern and the end of the file
even when the second pattern never appears, i.e., it does not check that the
range was ever closed.
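
To make the difference concrete, here is a small sketch that runs both
approaches over the same made-up input.  The delimiters START and STOP and the
sample lines are assumptions chosen for illustration; the buffering version
here uses a counted loop (my choice for this sketch) so that saved lines come
out in input order.

```shell
#! /bin/sh
# Hypothetical delimiters standing in for the script's two arguments.
set -- START STOP

# Sample input: one closed START..STOP range, then a START that is
# never followed by STOP.
input='START a
-f one
STOP a
START b
-f two'

# Range-pattern version: an unclosed range runs to the end of the input.
ranged=$(echo "$input" |
	awk '$1 == "'$1'" , $1 == "'$2'" { if ($1 == "-f") print $2 }')

# Buffering version: saved lines are printed only once the closing
# pattern has actually been seen.
buffered=$(echo "$input" | awk '
$1 == "'$1'" { ok = 1 ; next }
$1 == "-f" && ok != 0 { S[n++] = $2 ; next }
$1 == "'$2'" { if (ok) { for (i = 0; i < n; i++) print S[i] ; ok = n = 0 } }
')

echo "range pattern: $ranged"
echo "buffered:      $buffered"
```

With this input the range-pattern version reports both "one" and "two", while
the buffering version reports only "one", since the second range never closes.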

Using the shell to insert the arguments from the shell script's command line
overcomes the problems of trying to get awk(1) to do it.  Note that in the
scripts, everything between a pair of matching single quotes is passed to
awk(1) uninterpreted, while the parts outside the single quotes are processed
by the shell first and then passed as part of the same argument string to
awk(1).

The construct ``$1'' inside single quotes will be passed literally to awk(1)
which will, when interpreting the awk(1) script, use that to mean the string in
the first field.  The same construct outside of the single quotes is
interpreted by the shell to mean the first argument on the command line of the
shell script and will be replaced by that string as the shell passes the
program to awk(1).
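
One way to see the splicing is to build the program text in a shell variable
and look at it before awk(1) does.  This is only a sketch: the `set --` line
fakes the script's argument, and the two sample input lines are made up.

```shell
#! /bin/sh
# Stand-in for the script's real command line argument (an assumption).
set -- SUB

# The $1 and $2 inside single quotes are left alone for awk;
# the $1 between the quoted pieces is replaced by the shell.
prog='$1 == "'$1'" { print $2 }'

# Show exactly what awk will be handed.
echo "$prog"

# Run it on two made-up input lines; only the SUB line matches.
out=$(printf 'SUB 26\nNUL 0\n' | awk "$prog")
echo "$out"
```

The first echo shows `$1 == "SUB" { print $2 }`, which is the program awk(1)
actually interprets; the second shows the 26 from the matching line.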

I would advise against using double quotes like this:

	awk "$1 ~ /foo/ { print }"

because the shell expands shell variables within pairs of double quotes, so
the ``$1'' will be replaced by the first command line argument to the shell
script rather than passed literally as ``$1'' for awk(1) to see as meaning the
first field in its input line.
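
The effect is easy to check by storing the same program text both ways and
printing what the shell produces.  Again a sketch only; SUB is a stand-in for
whatever argument the script is given.

```shell
#! /bin/sh
# Stand-in for the script's first argument (an assumption).
set -- SUB

# Single quotes: the text reaches awk untouched.
single='$1 ~ /foo/ { print }'

# Double quotes: the shell substitutes its own $1 first.
double="$1 ~ /foo/ { print }"

echo "single-quoted: $single"
echo "double-quoted: $double"
```

The double-quoted form comes out as `SUB ~ /foo/ { print }`, so awk(1) never
sees a reference to its first field at all.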

Also, I would avoid using csh(1) to interpret the shell script.  Using sh(1)
lets you do nice things like inserting literal newlines within a single-
or double-quote delimited argument string and that makes writing in-line awk(1)
programs a lot easier in the script.

-- 
| Ray Lubinsky         Department of Computer Science, University of Virginia |
|                      UUCP:      ...!uunet!virginia!uvacs!rwl                |
|                      CSNET:     rwl@cs.virginia.edu                         |
|                      BITNET:    rwl8y@virginia                              |