bm@bike2work.Eng.Sun.COM (Bill Michel) (12/11/90)
I'm working on a shell script that makes extensive use of (n)awk.
I'm *really* confused as to the general workings of the script, and
would appreciate some help.

Assume my script is called "nawkfile". It is invoked as follows:

    nawkfile inputfile

where inputfile is the file containing the input to be processed.
My main questions are:

1) Where does the data put into "string" go after the first call
   to nawk?
2) Does $* mean a recursive call to the script? If so, how can
   this be used as input to the second nawk call?

Thanks in advance.

--------------------------------

The general layout is as follows:

nawk ' {
    process a bunch of text, and append it to the variable "string"
}
END {
    print string
}
' $* |
nawk '
{
    do some more processing
}' |

nawk '
{
    do some more processing and send the output to a file
}

-- 
Bill Michel   bm@eng.sun.com
These views are my own, not Sun's.
tchrist@convex.COM (Tom Christiansen) (12/11/90)
In article <4255@exodus.Eng.Sun.COM> bm@bike2work.Eng.Sun.COM (Bill Michel) writes:
:I'm working on a shell script that makes extensive use of (n)awk.
:
:The general layout is as follows
:
:nawk ' {
:    process a bunch of text, and append it to the variable "string"
:}
:END {
:    print string
:}
:' $* |
:nawk '
:{
:do some more processing
:}' |
:
:nawk '
:{
:do some more processing and send the output to a file
:}

The data is going down the pipe to the next awk in the pipeline.

I'd try really hard if I were you to make this work in just one
script if you possibly can. Otherwise you pay for an extra process
and an extra pipe per stage, which will slow you down.

If you can't do that, you might consider using perl instead, which
is something like nawk with a built-in fork (plus considerably more).
In particular, it supports spawning children and communicating with
them through shared file descriptors without having to know how
pipe() calls work. For example:

    if (open(HANDLE, "|-")) {
        # parent code writes to HANDLE
    } else {
        # child code just reads from STDIN per usual
    }

or else:

    if (open(HANDLE, "-|")) {
        # parent code reads from HANDLE
    } else {
        # child code just writes to STDOUT per usual
    }

You can also play more elaborate games using explicit pipe() calls.

This does sound to me like an application where perl might be more
applicable than awk. You can even use the a2p awk-to-perl translator
to get started. I can't really say, though, without seeing the script.

--tom
-- 
Tom Christiansen   tchrist@convex.com   convex!tchrist
"With a kernel dive, all things are possible, but it sure makes it hard
to look at yourself in the mirror the next morning." -me
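The suggestion above about folding the pipeline into a single script can be sketched roughly as follows. The real processing was never posted, so the uppercase step, the function names two_stage and one_stage, and the use of plain awk in place of nawk are all assumptions made for illustration only.

```shell
#!/bin/sh
# Hypothetical two-process layout, mirroring the shape in the question.
# The "processing" (join lines, then uppercase) is invented for this sketch.
two_stage() {
    # Stage 1 collects every line into "string" and prints it once at END.
    awk '{ string = (NR == 1 ? $0 : string " " $0) } END { print string }' $* |
    # Stage 2 is a second awk process reading stage 1's output from the pipe.
    awk '{ print toupper($0) }'
}

# Same result from one awk process: the second stage's work is applied
# to "string" directly in the END block, so no pipe is needed.
one_stage() {
    awk '{ string = (NR == 1 ? $0 : string " " $0) } END { print toupper(string) }' $*
}

# With no file arguments, $* expands to nothing and awk reads stdin.
printf 'alpha\nbeta\n' | two_stage   # prints: ALPHA BETA
printf 'alpha\nbeta\n' | one_stage   # prints: ALPHA BETA
```

Whether a merge like this is practical depends on what the later stages really do; if they need to re-split the text into fresh records, the single-process version takes more care than shown here.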
hunt@dg-rtp.rtp.dg.com (Greg Hunt) (12/12/90)
In article <4255@exodus.Eng.Sun.COM>, bm@bike2work.Eng.Sun.COM (Bill Michel) writes:
>
> I'm working on a shell script that makes extensive use of (n)awk.
> I'm *really* confused as to the general workings of the script, and
> would appreciate some help.
> Assume my script to be called "nawkfile" it is invoked as follows:
>
> nawkfile inputfile
>
> where inputfile is the file containing the input to be processed.
> My main questions are :
>
> 1) where does the data put into "string" go after the first call
>    to nawk?
> 2) does $* mean a recursive call to the script, if so, how can
>    this be used as input to the second nawk call
>
> The general layout is as follows
>
> nawk '
> {
>     process a bunch of text, and append it to the variable "string"
> }
> END {
>     print string
> }
> ' $* |
> nawk '
> {
>     do some more processing
> }' |
> [ rest of script deleted ]

1. The data put into "string" is being written (by the print
   command) to the file descriptor called stdout (standard output).
   Using the pipe symbol "|", you have told the shell to connect the
   stdout of the first nawk to the stdin (standard input) of the
   second nawk. So, the data is being written through the pipe to
   the second nawk, which will use the data as input. Connecting
   programs with a pipe like this is one form of "redirecting" input
   and output.

2. The "$*" doesn't indicate a recursive call. It is the way you get
   access to the command line arguments specified to the script.
   Specifically, $* means "get me all of the command line arguments",
   which in this case is only the name of the "inputfile". The shell
   substitutes the arguments in place of the $*, so the first nawk
   ends up being called with "inputfile" as its argument. You can
   reference $* as many times as you care to in the script. You can
   also get specific arguments by using $1, $2, etc.

   You don't need to use $* for the second nawk to get its input;
   you've already done that by using the pipe "|" from the first nawk.

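Both points can be tried out with a small sketch like the one below. It is not the original script: the function nawkfile stands in for the script of the same name, the ";" separator and the "stage 2 read:" label are invented, and plain awk is assumed in place of nawk.

```shell
#!/bin/sh
# Hypothetical stand-in for the "nawkfile" script, written as a function
# so it can be tried in one file. Inside a function, $* holds the
# function's arguments, just as it holds the script's arguments at top level.
nawkfile() {
    # Stage 1: the shell replaces $* with the file name(s) before awk runs.
    # "print string" writes to stdout, which the pipe connects to stage 2.
    # (Quoted "$@" is the more robust form if file names contain spaces.)
    awk '{ string = string $0 ";" } END { print string }' $* |
    # Stage 2: no file argument, so it reads the pipe on its stdin.
    awk '{ print "stage 2 read: " $0 }'
}

# Demonstration with a throwaway input file (the name is arbitrary).
tmp=${TMPDIR:-/tmp}/nawkfile_demo.$$
printf 'one\ntwo\n' > "$tmp"
nawkfile "$tmp"        # prints: stage 2 read: one;two;
rm -f "$tmp"
```

Nothing is printed until the first awk reaches END, because that is the only place "string" is written out.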
For more details on both of these points, you might want to look at
the man page for the shell (use "man sh | more"). It will tell you
about the various ways you can redirect input and output, and also
the ways you can manipulate shell script arguments.

Enjoy!

-- 
Greg Hunt                        Internet: hunt@dg-rtp.rtp.dg.com
DG/UX Kernel Development         UUCP: {world}!mcnc!rti!dg-rtp!hunt
Data General Corporation
Research Triangle Park, NC, USA  These opinions are mine, not DG's.