[net.bugs.4bsd] slight problem with grep.

jxs7451@ritcv.UUCP (Jeffrey Smith) (01/31/86)

Just wanted to mention a slight problem with the UNIX grep command.
What happens actually makes sense, but it does tend to be annoying
if you are caught by it.

Anyway, say you are in a directory that contains some Pascal source
code.  Now this is what happened to me.  I wanted to find all of the
write statements in my programs that I had used for debugging.
What happened is I used grep like this to find all of them.

	grep write * > zzzz

having zzzz being a random file.  I used at to run this in batch mode,
and when I came back I got a quota exceeded message.  What happens is
grep searches for write in zzzz after it has already written all of
the previously found writes to this file, so each line it has written
matches again and the file keeps growing.  Needless to say, this
is not a good situation.
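
One way to sidestep the loop, sketched here under assumptions, is to send
the output to a file that lives outside the directory being searched, so
that * can never match it.  The directory, file names and file contents
below are invented for the illustration; only the grep command itself
comes from the post above.

```shell
# Hypothetical demo data: a scratch directory with two small Pascal files.
demo_dir=$(mktemp -d)
out_file=$(mktemp)      # lives outside demo_dir, so * can never match it

printf 'begin\n  write(x);\nend.\n'   > "$demo_dir/prog1.p"
printf 'begin\n  writeln(y);\nend.\n' > "$demo_dir/prog2.p"

# Run the search in a subshell so the caller's working directory is unchanged.
( cd "$demo_dir" && grep write * > "$out_file" )

# Count the matching lines; tr strips the padding some wc versions add.
match_count=$(wc -l < "$out_file" | tr -d ' ')
echo "matches: $match_count"

# Clean up the scratch files.
rm -rf "$demo_dir" "$out_file"
```

Because the output file is never among grep's inputs, the result does not
depend on whether the shell expands * before or after it opens the file.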

Jeffrey Smith    JMS7451@RITVAXC.BITNET
	         jxs7451!ritcv

ken@rochester.UUCP (Ipse dixit) (01/31/86)

It is not grep's fault; you can get into trouble this way too:

	sed p * > xxx

I used sed because cat * > xxx checks that the output is not in the inputs.
Therefore: watch your wildcards.
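
A small sketch of what watching the wildcard can mean in practice: a
narrower pattern leaves the output file out of the argument list.  The
file names and the .p suffix are assumptions made for this example, not
something from the original report.

```shell
# Hypothetical file names, for illustration only.
demo_dir=$(mktemp -d)
touch "$demo_dir/prog1.p" "$demo_dir/prog2.p" "$demo_dir/xxx"

# The command substitutions run in subshells, so each cd stays local.
wide=$(cd "$demo_dir" && echo *)        # every file, including xxx
narrow=$(cd "$demo_dir" && echo *.p)    # only the source files

echo "wide:   $wide"
echo "narrow: $narrow"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

With the narrow pattern, a file named xxx in the same directory is never
handed to the command as input, whether or not it already exists.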

	Ken
-- 
UUCP: ..!{allegra,decvax,seismo}!rochester!ken ARPA: ken@rochester.arpa
Snail: Comp. of Disp. Sci., U. of Roch., NY 14627. Voice: Ken!

radzy@calma.UUCP (Tim Radzykewycz) (02/03/86)

In article <9299@ritcv.UUCP> jxs7451@ritcv.UUCP () writes:
>What happened is I used grep like this to find all of them.
>
>	grep write * > zzzz
>
>having zzzz being a random file.

Note that this will only happen if 'zzzz' is an existing file.  If
either of the following situations occur, there won't be a problem:
	grep write * > AAAA		<- AAAA doesn't contain 'write'
	rm zzzz ; grep write * > zzzz

In the first situation, the file 'AAAA' is searched first.
In the second situation, the '*' doesn't match the file 'zzzz',
so grep never even opens it.
-- 
Tim (radzy) Radzykewycz, The Incredible Radical Cabbage.
	calma!radzy@ucbvax.ARPA
	{ucbvax,sun,csd-gould}!calma!radzy

ccc@bu-cs.UUCP (Cameron Carson) (02/04/86)

>In article <9299@ritcv.UUCP> jxs7451@ritcv.UUCP () writes:
>>What happened is I used grep like this to find all of them.
>>
>>	grep write * > zzzz
>>
>>having zzzz being a random file.
>
>Note that this will only happen if 'zzzz' is an existing file.  If
>either of the following situations occur, there won't be a problem:
>	grep write * > AAAA		<- AAAA doesn't contain 'write'
>	rm zzzz ; grep write * > zzzz
>
>In the first situation, the file 'AAAA' is searched first.
>In the second situation, the '*' doesn't match the file 'zzzz',
>so grep never even opens it.

Hmmm...I had the same experience as jxs7451@ritcv:  in the 4bsd csh's
I've used, the shell takes care of creating stdout before it does
file name expansion, resulting in '*' matching 'zzzz' even if it
didn't exist previously.

-- 
Cameron C. Carson
Distributed Systems Group
Boston University ACC

UUCP: ...!harvard!bu-cs!ccc
ARPA:  ccc%bu-cs@csnet-relay.arpa

friesen@psivax.UUCP (Stanley Friesen) (02/05/86)

In article <140@calma.UUCP> radzy@calma.UUCP (Tim Radzykewycz) writes:
>
>Note that this will only happen if 'zzzz' is an existing file.  If
>either of the following situations occur, there won't be a problem:
>       grep write * > AAAA             <- AAAA doesn't contain 'write'
>       rm zzzz ; grep write * > zzzz
>
>In the first situation, the file 'AAAA' is searched first.
>In the second situation, the '*' doesn't match the file 'zzzz',
>so grep never even opens it.
>--
        Actually, the second form *does not* work. The shell performs
I/O redirection *before* filename substitution, so the file 'zzzz'
does exist and *is* included in the expansion, and grep *will* open
it. At least this is true for csh; I tried it (in a harmless way, using
echo rather than grep). The first one will work because when grep
scans AAAA it will be totally empty, due to the shell opening it for
writing.
--

                                Sarima (Stanley Friesen)

UUCP: {ttidca|ihnp4|sdcrdcf|quad1|nrcvax|bellcore|logico}!psivax!friesen
ARPA: ttidca!psivax!friesen@rand-unix.arpa

rtd@gypsy.UUCP (02/05/86)

Actually, the program at "fault" is the shell.  The issue is, given any
command line of the form

    cmd * > file

which comes first: filename substitution or output redirection?

Assume that "file" does not exist.  If filename substitution is
done first then "*" expands to the names of all existing files, which
doesn't include "file", and everything works as you'd probably expect.
If output redirection is done first, "file" is created and "*" expands
to a list of files which now contains "file" and you get the behavior
described in the original note.

There seems to be no consistency among shells as far as the order of
these operations is concerned.  On 4.2 BSD, csh(1) does output redirection
first, followed by filename substitution, while sh(1) does things in the
reverse order.  To determine easily how your shell works, execute the
following command in a directory that doesn't contain a file called "xxx":

    echo * > xxx

After executing this command, if file "xxx" contains the string "xxx" then
output redirection occurs first.  If not, filename substitution occurs
first.
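
The check above can be wrapped so that it runs in a scratch directory and
reports the answer itself.  This is only a sketch: the scratch directory,
the placeholder file "a" and the variable names are assumptions added for
the example, and it tests whichever shell interprets the script.

```shell
# Scratch directory so nothing in the current directory is touched.
scratch=$(mktemp -d)
touch "$scratch/a"      # one pre-existing file, so * always matches something

# The subshell keeps the cd local; the command under test is the one above.
( cd "$scratch" && echo * > xxx )

# If the name xxx ended up inside the file, the file existed before * expanded.
if grep -q xxx "$scratch/xxx"; then
    order="redirection first"
else
    order="expansion first"
fi
echo "$order"

# Clean up the scratch directory.
rm -rf "$scratch"
```

To test a different shell, run the script with that shell rather than
sourcing it; csh uses a different syntax, so there the one-line echo test
above is the simpler route.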

By the way, I too executed the offending "grep" command on an early
(about 8 yrs. ago) version of AT&T Unix on a PDP 11/70.  The result was
extensive file system corruption followed by a system crash.  And
you thought Unix has mediocre protection schemes now? :-)


Bob Dillberger
Siemens Corp. Research & Support
Princeton, NJ
siemens!gypsy!rtd