rodgers@maxwell.mmwb.ucsf.edu (ROOT) (12/08/90)
Dear Netlanders,

Does anyone know how to protect whitespace in items to be passed to the
"for" operator of the Bourne shell?  Consider the script:

#! /bin/sh
#
# Define list
#
list="'a b' c"
#
# Use list
#
for item in $list
do
   grep $item inputfile
done
#
# Script complete

where "inputfile" might contain, for example:

a b
c
d

The idea is to grep for each of the regular expressions appearing in
$list, one at a time, in the file "inputfile".  In the above example,
"a b" is meant to comprise one such pattern, and "c" another.  I have
tried all sorts of combinations of \, ', and " in the definition of
"list" and in the appearance of "$list" on the "for" command line, in an
attempt to prevent the shell from splitting arguments on the whitespace
contained within the expression "a b", all to no avail.  One such
combination of failed quoting mechanisms is displayed above.

Please, no responses of the form "why do you want to do this," "use
perl," "use awk," etc.  The above boils down the essence of a problem
which appears in quite a different context.

Any ideas????

Thanks and Cheerio, Rick Rodgers
mcgrew@ichthous.Eng.Sun.COM (Darin McGrew) (12/08/90)
In article <16570@cgl.ucsf.EDU> rodgers@maxwell.mmwb.ucsf.edu (ROOT) writes:
>Does anyone know how to protect whitespace in items to be passed to the
>"for" operator of the Bourne shell?  Consider the script:

Use `eval` so that the quotes are evaluated as such.  Here's the
revised script--

    #! /bin/sh
    #
    # Define list
    #
    list="'a b' c"
    #
    # Use list
    #
    eval for item in "$list" \; \
    do \
        grep \"\$item\" inputfile \; \
    done
    #
    # Script complete

Yes, getting the quoting right can be difficult if the body of
the loop is large.  Another option might be to have a small loop
that feeds a `while read foo` loop--

    eval for item in $list \; \
    do \
        echo \"\$item\" \; \
    done | while read item
    do
        grep "$item" inputfile
        # More big, hairy, loop that would be too
        # confusing with '\' characters everywhere
    done

Darin McGrew                    mcgrew@Eng.Sun.COM
Affiliation stated for identification purposes only.
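
The two-stage idea above (a small eval loop feeding a `while read` loop) can be sketched as follows. This is only an illustration under assumptions: it presumes a POSIX sh, the function name `emit_items` is made up for the example, and `printf` with `IFS= read -r` is swapped in for the post's `echo` and plain `read` so that backslashes and leading blanks in an item survive the pipe. Items containing a newline would still break this approach.

```shell
#!/bin/sh
list="'a b' c"

# emit_items (hypothetical name): let eval re-parse $list so that the
# embedded quotes act as quotes, then print one item per line.
emit_items() {
    eval "for item in $list; do printf '%s\n' \"\$item\"; done"
}

# The big loop body stays free of backslash quoting.
# IFS= and -r keep read from trimming blanks or interpreting backslashes.
emit_items | while IFS= read -r item
do
    printf 'pattern: [%s]\n' "$item"
done
```

Here the second loop only prints each pattern; in the original problem the `printf` would be replaced by `grep "$item" inputfile`.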
lml@cbnews.att.com (L. Mark Larsen) (12/08/90)
In article <16570@cgl.ucsf.EDU>, rodgers@maxwell.mmwb.ucsf.edu (ROOT) writes:
# Does anyone know how to protect whitespace in items to be passed to the
# "for" operator of the Bourne shell?  Consider the script:
#
# #! /bin/sh
# #
# # Define list
# #
# list="'a b' c"
# #
# # Use list
# #
# for item in $list
# do
#    grep $item inputfile
# done
# #
# # Script complete
#
# where "inputfile" might contain, for example:
#
# a b
# c
# d
#

One way to do what you want is to set the positional parameters and loop
through them:

    set -- 'a b' c
    for item
    do
        grep "$item" inputfile
    done

Of course, if your script was called with arguments, you may have a small
problem to get around, especially if any of the original arguments had
embedded white space.  Possibly the safest and easiest way to avoid this
sort of problem is to use a function:

    doit()
    {
        for item
        do
            grep "$item" inputfile
        done
    }

    doit 'a b' c        # once
    for arg
    do                  # process original args differently
        echo "$arg"
    done
    doit 'd e' f        # again

The function idea is quite useful in other situations.  For example, suppose
you want to change the value of some variable in a script but the change is
taking place inside of a loop where the output is redirected.  With the
Bourne shell (fixed in the Korn shell) such a loop is run in a subshell,
which means the change to the variable in the script's environment is lost:

    # with /bin/sh, foo is not changed
    foo=bar
    for i
    do
        foo=$i
        echo "loop: foo = $foo"
    done >/dev/tty
    echo "final = $foo"

However, by putting the loop in a function, the change does take place in
the script's environment:

    # in this case, foo *is* changed
    doit()
    {
        for i
        do
            foo=$i
            echo "loop: foo = $foo"
        done
    }

    foo=bar
    doit "$@" >/dev/tty
    echo "foo = $foo"

cheers,
L. Mark Larsen
lml@atlas.att.com
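
A minimal sketch of why the function approach sidesteps the positional-parameter problem, assuming a POSIX sh. The helper name `show_args` and the stand-in arguments are invented for the illustration; `printf` replaces `grep` so the result is visible without an input file.

```shell
#!/bin/sh
# show_args (hypothetical helper): print each of its own arguments in
# brackets, one per line.  "for item" with no "in" walks the function's
# arguments, not the script's.
show_args() {
    for item
    do
        printf '[%s]\n' "$item"
    done
}

set -- 'orig one' two       # stand-in for the script's real arguments

show_args 'a b' c           # loops over the list passed to the function
show_args "$@"              # the script's own arguments are still intact
```

Because the list is passed as arguments to the function, the script's own `"$@"` never has to be overwritten and restored.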
maart@cs.vu.nl (Maarten Litmaath) (12/11/90)
In article <4198@exodus.Eng.Sun.COM>,
	mcgrew@ichthous.Eng.Sun.COM (Darin McGrew) writes:
)In article <16570@cgl.ucsf.EDU> rodgers@maxwell.mmwb.ucsf.edu (ROOT) writes:
)>Does anyone know how to protect whitespace in items to be passed to the
)>"for" operator of the Bourne shell?  Consider the script:
)
)Use `eval` so that the quotes are evaluated as such.  Here's the
)revised script--
)
)    #! /bin/sh
)    #
)    # Define list
)    #
)    list="'a b' c"
)    #
)    # Use list
)    #
)    eval for item in "$list" \; \
)    do \
)        grep \"\$item\" inputfile \; \
)    done
)    #
)    # Script complete
)
)Yes, getting the quoting right can be difficult if the body of
)the loop is large.  [...]

Another option is to use the ``set'' command, if the original $* arguments
aren't needed:

    # First remember the original args.

    argc=0
    argv=

    for i
    do
        argc=`expr $argc + 1`
        eval argv$argc='"$i"'
        argv="$argv \"\$argv$argc\""
    done

    # Now set the stuff we want to process.
    # The initial `x' is there to make sure the first argument of the
    # ``set'' command does not start with a `-'.  This method is more
    # portable than ``set - ...''.

    eval set x "$list"

    # Now get rid of the dummy arg.

    shift

    for item
    do
        # The `-e' option `protects' the pattern.
        grep -e "$item" inputfile
    done

    # Reset the args.

    eval set x $argv
    shift

If the loop can be executed in a subshell, we don't need to remember the
args:

    (
        eval set x "$list"
        shift

        for item
        do
            ...
        done
    )
--
In the Bourne shell syntax tabs and spaces are equivalent almost everywhere.
The exception: _indented_ here documents.  :-(
Does anyone remember the famous mistake Makefile-novices often make?
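
The save/set/restore sequence above can be exercised end to end with a small sketch. Assumptions: a POSIX sh, where `set` inside a function only affects that function's positional parameters; the wrapper name `roundtrip` and its sample arguments are hypothetical; `printf` stands in for `grep`.

```shell
#!/bin/sh
list="'a b' c"

# roundtrip (hypothetical wrapper): save the arguments, replace them with
# the items of $list, then put the saved arguments back.
roundtrip() {
    # Remember each argument in its own variable: argv1, argv2, ...
    argc=0
    argv=
    for i
    do
        argc=`expr $argc + 1`
        eval argv$argc='"$i"'
        argv="$argv \"\$argv$argc\""
    done

    # eval re-parses $list, so 'a b' becomes a single argument.
    eval set x "$list"
    shift
    printf 'item: [%s]\n' "$@"

    # $argv holds the text "$argv1" "$argv2"; eval expands it again.
    eval set x $argv
    shift
    printf 'restored: [%s]\n' "$@"
}

roundtrip 'first arg' second
```

The restore step works because `$argv` stores references to the saved variables rather than their values, so embedded whitespace in an original argument is never re-split.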
martin@mwtech.UUCP (Martin Weitzel) (12/11/90)
In article <16570@cgl.ucsf.EDU> rodgers@maxwell.mmwb.ucsf.edu (ROOT) writes:
>Dear Netlanders,
>
>Does anyone know how to protect whitespace in items to be passed to the
>"for" operator of the Bourne shell?  Consider the script:
>
>#! /bin/sh
>#
># Define list
>#
>list="'a b' c"
>#
># Use list
>#
>for item in $list
>do
>   grep $item inputfile
>done
>#
># Script complete
>
>where "inputfile" might contain, for example:
>
>a b
>c
>d

If there is any character that will never appear in the items of your
list, you can use it as the delimiter between items and change IFS
accordingly (in most cases it is wise to restore IFS for the rest of the
script):

list="a b:c"
CIFS=$IFS       # save IFS
IFS=:
for item in $list
do
    IFS=$CIFS   # restore IFS here for the loop
    grep "$item" inputfile
done
IFS=$CIFS       # or restore it here for the rest of the script

In my example I used `:' as the delimiter character; if you need all the
printing characters you can use some control character (e.g. BEL, aka ^G),
and, if you do it right, you can even use a newline character:

list="a b
c"
IFS="
"
#    ^-- no white space between double quote and newline!!!
for item in $list
do
    grep $item inputfile
done

In this example I left out saving and restoring IFS.

Now, as we have just touched on this topic (and for all who don't know):
IFS contains the characters that are used as separators for the command
name and its parameters.  When I had less experience with the (Bourne)
shell, I thought the second example above couldn't work, because: how does
the shell separate the parts of the command line in the body of the loop
when no blank (space character) occurs within IFS?

The answer is that the space character is *always* a valid separator,
no matter what is specified in IFS.  So the line

    grep $item inputfile

is correctly tokenized into three parts.  Then, after several other shell
constructs have been recognized, there is a step which replaces the
construct `$item' by the contents of the variable `item'.  And finally
comes the step where IFS is obeyed and the line is separated further.

For this reason you need no double quotes around `$item' in the second
example, because IFS doesn't contain a space then, but you absolutely
need them in the first example!!  (Think about it if this is not clear
to you, and then try it.)
--
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
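
The distinction drawn above, between tokenizing the command line and the later IFS-driven splitting of an expanded variable, can be checked with a short sketch. It assumes a POSIX sh; `count_args` and the two result variables are names invented for this illustration.

```shell
#!/bin/sh
# count_args (hypothetical helper): report how many arguments it received.
count_args() {
    echo $#
}

item="a b"
old_ifs=$IFS

# IFS is a lone newline: the command line is still tokenized on blanks,
# but the expanded value of $item is not split at its space.
IFS='
'
with_newline_ifs=`count_args $item inputfile`

# Default IFS: the expansion of $item is split into "a" and "b".
IFS=$old_ifs
with_default_ifs=`count_args $item inputfile`

echo "IFS=newline: $with_newline_ifs arguments"
echo "default IFS: $with_default_ifs arguments"
```

With IFS set to a newline the helper sees `a b` and `inputfile`; with the default IFS it sees `a`, `b` and `inputfile`, which is why the first example needs the double quotes around `$item`.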
martin@mwtech.UUCP (Martin Weitzel) (12/11/90)
In article <4198@exodus.Eng.Sun.COM> mcgrew@ichthous.Eng.Sun.COM (Darin McGrew) writes:
>In article <16570@cgl.ucsf.EDU> rodgers@maxwell.mmwb.ucsf.edu (ROOT) writes:
>>Does anyone know how to protect whitespace in items to be passed to the
>>"for" operator of the Bourne shell?  Consider the script:
>
>Use `eval` so that the quotes are evaluated as such.  Here's the
>revised script--
>
>    #! /bin/sh
>    #
>    # Define list
>    #
>    list="'a b' c"
>    #
>    # Use list
>    #
>    eval for item in "$list" \; \
>    do \
>        grep \"\$item\" inputfile \; \
>    done
>    #
>    # Script complete
>
>Yes, getting the quoting right can be difficult if the body of
>the loop is large.

Yes, getting the quoting right can be difficult :-( ....
but I have found a simple trick that makes it much easier :-).

Quoting is necessary because the shell essentially parses the arguments of
the `eval' command twice, and the programmer must take care that some
parts are evaluated in the first parse and others in the second.  Most
people quote (only) the parts that must *not* be evaluated in the first
parse.  Do it the other way round and quote everything *except* what must
be evaluated in the first parse.

    eval '
        for item in '"$list"';
        do
            grep "$item" inputfile;
        done
    '

Looks a little nicer, doesn't it?

If you have a hardcopy of this, there is another trick to see what's going
on: take one of those yellow marker pens and highlight everything from one
single quote to the next.  Leave out the unquoted parts.  I'll try to show
it here with capitals:

    eval '
        FOR ITEM IN '"$list"';
        DO
            GREP "$ITEM" INPUTFILE;
        DONE
    '

Everything that is highlighted on your hardcopy (or capitalized above) is
taken literally during the first parse.  It is easy to see that only the
contents of the variable `list' will be substituted during this parse.
--
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
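
One way to see what the first parse leaves behind is to store the string in a variable and print it before handing it to eval. This is only a sketch, assuming a POSIX sh; the variable name `cmd` is made up, and `printf` stands in for `grep` so no input file is needed.

```shell
#!/bin/sh
list="'a b' c"

# Build the same kind of string as the eval example: everything inside the
# single quotes is literal, only "$list" is expanded at this point.
cmd='
    for item in '"$list"';
    do
        printf "[%s]\n" "$item";
    done
'

# First parse result: the text that eval will see.
printf '%s\n' "$cmd"

# Second parse: the quotes that came from $list now act as quotes.
eval "$cmd"
```

The printed text shows `for item in 'a b' c;` with `$item` still unexpanded, which is exactly the split between first-parse and second-parse material described above.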