lesh@BRL.ARPA (01/26/87)
I would like to use 'awk' to segregate data from a master file into
specific files, with the filename based on the contents of a specific
field in one of the input records.

'awk' permits directing printed output to filenames specified in quoted
variables.  "Without quotes, the file names are treated as uninitialized
variables and all output then goes to the same file."*1  Unquoted
variables thus provide the vehicle for writing to files named from a
value obtained from a field of a previous record.

THE PROBLEM:

        "Users should also note that there is an upper limit to the
number of files that are written in this way.  At present it is ten."*1

I can't find any way to close a file opened by 'awk' and very soon get
the "too many opened files" error message.

Any suggestions?

1. Support Tools Guide, p. 6-31.
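For concreteness, a minimal sketch of the construct in question; the
"part." prefix and the choice of $1 as the key field are arbitrary
illustrations, not anything taken from the Guide.  The output file name
lives in an unquoted variable built from a field, and it is exactly this
construct that runs into the ten-file limit:

        # write each record to a file named after its first field;
        # old awk never closes these files, hence the ten-file ceiling
        { outfile = "part." $1; print $0 > outfile }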
dce@mips.UUCP (01/27/87)
In article <3746@brl-adm.ARPA> lesh@BRL.ARPA (ISC | howard) writes:
>THE PROBLEM:
>        "Users should also note that there is an upper limit to the
>number of files that are written in this way.  At present it is ten."*1
>
>        I can't find any way to close a file opened by 'awk' and very
>soon get the "too many opened files error message".
>
>        Any suggestions?

Don't try to close the file (even though some newer versions of awk may
have a close builtin).  I have used the following methods quite
successfully:

1. Iteration - Keep track of the number of files you have used.  Once
   you reach the limit, put all data that goes to a new file onto the
   standard output.  By redirecting standard output to a temporary
   file, you can check after the awk script has finished to see if this
   file is empty.  If not, run the script again with the temporary file
   as the input.  Otherwise, you are finished.  This method doesn't
   work when there is a lot of state involved.

2. Pass-thru - Instead of writing the records directly to the file,
   write records of the form

        filename data...

   and use the construct:

        awk ... |
        while read file data
        do
                echo "$data" >> "$file"
        done

   (functions can make this look a lot cleaner).  The only problem with
   this is that the shell read command eats backslashes (bug?).  If
   this is unacceptable, you could get away with generating the
   "filename data..." records and using calls to the line command
   (head -1 may not work correctly here) to get the data.
-- 
David Elliott
UUCP: {decvax,ucbvax,ihnp4}!decwrl!mips!dce,  DDD: 408-720-1700
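A fleshed-out sketch of the pass-thru method, assuming the key that
names the output file is in $1 of a master file and using "part." as an
arbitrary prefix; the limit disappears because the shell, not awk,
opens the files:

        # emit "filename record" pairs instead of writing files from awk
        awk '{ print "part." $1, $0 }' master |
        while read file data
        do
                echo "$data" >> "$file"     # the shell does the file I/O
        done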
kraml@trwrb.UUCP (Robert P. Kraml) (06/30/87)
I have a question concerning the awk utility.  Does anyone know if
there is a way to pass awk variables out as shell variables on a
line-by-line basis?  What I mean is the following:

1. awk reads in a line of NF fields
2. some of those fields get passed out as shell variables
3. these variables are operated on (i.e. put into certain files)
4. awk reads another line, and so on...

I know how to get shell variables into awk, but not vice versa.
Any help would be greatly appreciated.
-- 
Phone: (213) 536-1871      {allegra,uscvax,decvax,randvax,ihnp4,sdcrdcf}
Address: One Space Park | 82/2024          ------>!trwrb!trwcsed!kraml
         Redondo Beach CA 90278
pdg@ihdev.ATT.COM (Joe Isuzu) (07/02/87)
In article <718@trwcsed.trwrb.UUCP> kraml@trwcsed.UUCP (Robert P. Kraml) writes:
>I have a question concerning the awk utility.  Does anyone know if
>there is a way to pass awk variables out as shell variables on a
>line-by-line basis.  What I mean is the following:
>1. awk reads in a line of NF fields
>2. some of those fields get passed out as shell variables
>3. These variables are operated on (i.e. put into certain files).
>4. Awk reads another line and so on...

Easy.  Do something like this:

        $ eval `awk -f awks`

where awks is:

        { print $1 "=" $2; }

or something like that.  With input of

        abc def
        xyzzy plugh
        ^D

you will find that $abc is def and $xyzzy is plugh, when you are back
at the shell.  This format requires that you are using ksh or sh.  For
csh, the line that formats the setting arguments should be

        print "set " $1 "=" $2;

Hope this helped.
-- 
Paul Guthrie                    "Another day, another Jaguar"
ihnp4!ihdev!pdg                     -- Pat Sajak
fyl@ssc.UUCP (Phil Hughes) (07/04/87)
In article <718@trwcsed.trwrb.UUCP>, kraml@trwrb.UUCP (Robert P. Kraml) writes:
> I have a question concerning the awk utility.  Does anyone know if
> there is a way to pass awk variables out as shell variables on a
> line-by-line basis.

I am not proud of this and hope someone comes up with a clean way, but
I had a similar problem (passing stuff back to a calling shell script).
The child wrote a file consisting of setenv commands with the
appropriate data.  When control was returned to the parent, it did a
source of the file.  (Ok, I'm embarrassed, but it did what I needed.)

You could use the same method, writing the file with awk.
-- 
Phil Hughes, SSC, Inc.  P.O. Box 55549, Seattle, WA 98155  (206)FOR-UNIX
    ...!uw-beaver!tikal!ssc!fyl
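A sketch of that method with awk doing the writing; the variable names,
the data file, and the temporary file below are made up for the
example, and a csh parent is assumed since setenv/source were mentioned:

        # child: write one setenv command per field of interest
        # (with multi-line input, the last line's values win)
        awk '{ print "setenv FIELD1 " $1; print "setenv FIELD2 " $2 }' data > /tmp/vars$$

        # parent (csh): pick up the values and clean up
        source /tmp/vars$$
        rm -f /tmp/vars$$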
howard@COS.COM (Howard C. Berkowitz) (07/17/87)
I am attempting to write an awk program which reorganizes text which
has a repeating pattern of n lines of text followed by a heading line:

-----------------------INPUT TEXT EXAMPLE----------------
The purpose is to test if the implementation
accepts an ACCEPT request correctly.
.IP ISVB101
The purpose is to test if the implementation
detects the error when ACCEPT SPDU is sent
with parameters in incorrect order.
.IP ISIB102
---------------------------------------------------------

The awk program should store lines until the ".IP ..." line is
detected, then output (to file foo) the IP line followed by all text
lines:

------------------- DESIRED OUTPUT EXAMPLE ------------
.IP ISVB101
The purpose is to test if the implementation
accepts an ACCEPT request correctly.
.IP ISVB102
The purpose is to test if the implementation
detects the error when ACCEPT SPDU is sent
with parameters in incorrect order.
---------------------------------------------------------

The awk program I have written for this, which includes debugging
code, is:

BEGIN   {
        i = 1
        nip = 0
        ntx = 0
        print "INIT foo" >"foo"
        }

$1 !~ /.IP/     {
        # add this text line to the s array.
        # do not yet output it.
        s[i++] = $0
        ++ntx
        }

$1 ~ /.IP/      {
        # capture this line in the array's first
        # position, then print the array in order.
        s[0] = $0
        for (j=0; j<= i; j++)
                print j "-" s[j] > "foo"
        i = 1;
        ++nip
        }

END     {print "nip=" nip " ntx=" > "foo"}
----------------------------------------------------------------

BEGIN gets control; END never does.  The output begins:

INIT foo
0-The purpose is to test if the implementation
1-The purpose is to test if the implementation
2-
0-accepts an ACCEPT request correctly.
1-accepts an ACCEPT request correctly.
2-
0-.IP ISVB101
1-.IP ISVB101
2-
------------------------------------------------------------

This type of duplicate output continues; the final END print never
executes.  Help!
-- 
-- howard(Howard C. Berkowitz) @cos.com
{seismo!sundc, hadron, hqda-ai}!cos!howard
(703) 883-2812 [ofc]  (703) 998-5017 [home]
DISCLAIMER:  I explicitly identify COS official positions.
bazavan@hpcesea.HP.COM (Valentin Bazavan) (07/20/87)
Try this one.  It produces the output you want.

Valentin Bazavan
...!hplabs!hpcea!bazavan

awk '/\.IP/  {print $0; for (i=0; i<count; i++) print line[i]; count=0}
     !/\.IP/ {line[count++]=$0} ' infile
dph@beta.UUCP (David P Huelsbeck) (07/20/87)
You're obviously not a beginning awk programmer, as you were not far
off on this one.

I don't remember running into this before, but our 4.3 awk seems to
get hosed up when I try to use array[0].  I assume this was half of
your problem.  Also, if you store into an array using post-increment,
the final value of your index is 1 greater than the index of the last
valid element.  I'm sure this would have been easy to spot if the
array[0] problem hadn't been getting you.

The following does what you wanted with just a few changes to your
script.  Hope this helps.

Sorry for posting this, but my UUCP connection seems a bit flaky
lately.  [mitch, did you ever get my mail from last week?]

        David Huelsbeck
        dph@lanl.gov
        {cmcl2,ihnp4}!lanl!dph

---cut here--------cut here-------cut here--------cut here---------cut here---
BEGIN {
        i = 0                   # not really needed but looks good to pascal types ;-)
        nip = 0; ntxt = 0
        print "INIT foo" > "foo"
}

$1 !~ /\.IP/ {
        ++ntxt
        line[++i] = $0          # if pre-increment is used "i" is always a valid
}                               # array element; just skip line[0] as it is not a
                                # valid array location; THIS WAS YOUR PROBLEM

$1 ~ /\.IP/ {
        ++nip
        print > "foo"           # send $0 -the .IP line- to foo
        for (j=1; j<=i; j++) {
                print line[j] > "foo"
        }
        i = 0                   # reset i
}

END {
        printf "nip=%d\tntxt=%d\n", nip, ntxt > "foo"
}
seb022@tijc02.UUCP (Scott Bemis ) (12/19/87)
12/18/87

The awk program below works ok on an ms-dos version of awk I have on
my pc from Mortice Kern Systems Inc. (MKS awk).

The book called "The AWK Programming Language", by the authors of awk
(Aho, Weinberger, and Kernighan), refers to user-created awk functions
and the ** operator to support exponentiation.  The ms-dos version of
awk from Mortice Kern Systems Inc. supports user-created awk functions
and the ** operator for exponentiation.  Unfortunately, neither
feature appears to be supported in the version of awk that I have on
my vax 8600.  This VAX 8600 is using a port of the AT&T UNIX V Release
2.0 Version 2 operating system.

Does anyone sell, or provide, an awk for AT&T UNIX V Release 2.0
Version 2 for VAXes that supports the ** operator for exponentiation
and user-created functions?

Since I do not know how to get the version number of my awk on the
VAX 8600, here is a listing from the /usr/src/cmd/awk directory:

total 157
-rw-rw----  1 bin      bin         2662 Jul  4  1983 EXPLAIN
-rw-rw----  1 bin      bin         2974 Jul  4  1983 README
-rw-rw----  1 bin      bin         3169 Jul  4  1983 awk.def
-rw-rw----  1 bin      bin         6385 Jul  4  1983 awk.g.y
-rw-rw----  1 bin      bin         4936 Jul  4  1983 awk.lx.l
-rw-rw----  1 bin      bin         2226 Nov  7  1983 awk.mk
-rw-rw----  1 bin      bin        10877 Jul  4  1983 b.c
-rw-rw----  1 bin      bin          528 Jul  4  1983 freeze.c
-rw-rw----  1 bin      bin         6537 Jul  4  1983 lib.c
-rw-rw----  1 bin      bin         2167 Jul  4  1983 main.c
-rw-rw----  1 bin      bin         2379 Jul  4  1983 makeprctab.c
-rw-rw----  1 bin      bin         2386 Jul  4  1983 parse.c
-rw-rw----  1 bin      bin         2377 Jul  4  1983 proc.c
-rw-rw----  1 bin      bin        15629 Jul  4  1983 run.c
-rw-rw----  1 bin      bin         1520 Jul  3  1984 token.c
-rw-rw----  1 bin      bin          118 Jul  4  1983 tokenscript
-rw-rw----  1 bin      bin         6301 Jul  4  1983 tran.c

Below is the awk program that works with awk from Mortice Kern Systems
Inc. (MKS awk) with ms-dos.  It DOES NOT work with the awk on the
VAX 8600.

BEGIN {
        # new record separator
        RS = "sec"

        # build look-up arrays
        # tt_types 0..15 (0x00..0x0f)
        split("L V K X Y CR X-PAC Y-PAC CR-PAC WX WY ** ** ** TCP TCC", tt1)
        # tt_types 16..23 (0x10..0x17)
        split("DSP DSC DCP ** ** ** ** **", tt2)
        # tt_types 45..53 (0x2d..0x3f)
        split("LSTATUS ** ** ** ** ** ** ** Lmode", tt3)
        # tt_type 69 (0x45)
        split("AVF", tt4)
        # tt_types 96..108 (0x60..0x6c)
        split("LKC LTI LTD LHA LLA LPV LPVH LPVL LODA LYDA LTS LSP LMN", tt5)
        # tt_types 112..119 (0x70..0x77)
        split("LERR LMX LHHA LLLA LRCA ** RSS RDS", tt6)
        # tt_types 120..127 (0x78..0x7f)
        split("RRC ST SD RSESB AHA ALA APV APVH", tt7)
        # tt_types 128..138 (0x80..0x8a)
        split("APVL AODA AYDA ATS ASP ** ** AERR AHHA ALLA ARC", tt8)
        # tt_types 240,241 (0xf0, 0xf1)
        split("V. K.", tt9)
}

# conv decimal tt_type to element name string
function tt_name(n) {
        if (n <= 15)            { return tt1[n+1] }
        else if (n <= 23)       { return tt2[n-15] }
        else if (n <= 53)       { return tt3[n-44] }
        else if (n <= 69)       { return tt4[n-68] }
        else if (n <= 108)      { return tt5[n-95] }
        else if (n <= 119)      { return tt6[n-111] }
        else if (n <= 127)      { return tt7[n-119] }
        else if (n <= 138)      { return tt8[n-127] }
        else if (n <= 241)      { return tt9[n-239] }
        else                    { return "UNKNOWN" }
}

# convert hex string to decimal
function frhex(str) {
        tot = 0
        l = length(str)
        for (i=0; i<l; i++) {
                v = substr(str, l-i, 1)
                x = index("0123456789abcdef", v)
                if (x == 0) {
                        print "format error"
                        exit
                }
                tot += (16 ** i) * (x-1)
        }
        return tot
}

# main prog
{
        if (NF == 0)            # ignore blank lines
                next
        if ($11 != "55")        # ignore all but primitive 55
                next

        print "\nprimitive: " $11
        print "recno: 0x" $12
        tblks = frhex($13)
        print "total blocks: " tblks "\n"

        for (j=0; j<tblks; j++) {
                el_name = tt_name(frhex($(14+j*5)))
                num_loc = frhex($(15+j*5) $(16+j*5))
                start_addr = frhex($(17+j*5) $(18+j*5))
                printf("%2s %2s %2s %2s %2s  tt_type: %7s  num_loc: %2d " \
                       " start_addr: %4d \n", $(14+j*5), $(15+j*5), $(16+j*5), \
                       $(17+j*5), $(18+j*5), el_name, num_loc, start_addr)
        }
}

Scott Bemis
Texas Instruments
P. O. Drawer 1255   M/S 3517
Johnson City, Tennessee  37601   U.S.A.
telephone:  (615) 461-2959
e-mail:  mcnc!rti!tijc02!root
frank@hpuxa.ircc.ohio-state.edu (Frank G. Fiamingo) (10/05/89)
My problem is that I want to check a particular character position in
a record to see whether or not it is a blank.  If so, I want to change
it to a zero.  I thought I could do this with an awk script similar to
the one below, but have had no success.

BEGIN {FS=""}
{split($0,array);
 if (array[28] == " ") {array[28] = 0};
 for(i in array) print array[$1]}

However, it appears that split is only recognizing 2 fields in the
record, rather than the 35 characters that it contains.

Can anyone tell me where I've gone wrong, or a better way to do this?

Thanks,
Frank
ok@cs.mu.oz.au (Richard O'Keefe) (10/05/89)
In article <281@nisca.ircc.ohio-state.edu>, frank@hpuxa.ircc.ohio-state.edu (Frank G. Fiamingo) writes:
> BEGIN {FS=""}
> {split($0,array);
>  if (array[28] == " ") {array[28] = 0};
>  for(i in array) print array[$1]}
>
> However, it appears that split is only recognizing 2 fields
> in the record, rather than the 35 characters that it contains.

New versions of awk may be different, but the 4.3BSD manual says
"The variable FS ... may be changed at any time to any single
character."  By experiment, the awk that comes with SunOS 4.0 takes
the first character of FS as the only separator.

The following script does the trick:

BEGIN { n = 28 }
{
    if (substr($0, n, 1) == " ") {
        print substr($0, 1, n-1) "0" substr($0, n+1, length-n);
    } else {
        print;
    }
}

Another approach would be to use sed.
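A sketch of the sed approach mentioned above, assuming the blank to be
tested is at character position 28 as in the original question (if the
sed at hand lacks the \{27\} interval notation, 27 literal dots in the
group work as well):

        # keep the first 27 characters; if the 28th is a space, make it a 0
        sed 's/^\(.\{27\}\) /\10/' infile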
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (10/05/89)
In article <281@nisca.ircc.ohio-state.edu>, frank@hpuxa.ircc.ohio-state.edu (Frank G. Fiamingo) writes:
| My problem is that I want to check a particular character
| position in a record to see whether or not it is a blank.
| [ ... ]
| BEGIN {FS=""}
| {split($0,array);
|  if (array[28] == " ") {array[28] = 0};
|  for(i in array) print array[$1]}
|
| [ ... ]
| Can anyone tell me where I've gone wrong, or a better way
| to do this?

One liner:

  substr($0,28,1) == " " { print }
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon
dph@crystal.lanl (David Huelsbeck) (10/06/89)
It's been a little while since I did much awking but I did A LOT of it
at one time. I'm not sure if it's the same in the new awk but in the
old awk setting FS to null gave you the default field separator of
whitespace.  That is, if you set FS to a space or a tab, each single
space or tab separated fields, so a run of either gave you a bunch of
null fields; setting FS to null gave you the default behavior back.
So your:
> BEGIN {FS=""}
doesn't do anything at all.
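As a quick check of what any particular awk does with FS="" (the
expected results noted below are assumptions and differ between awk
versions):

        echo "abc" | awk 'BEGIN { FS = "" } { print NF }'
        # old awk: prints 1 -- FS="" falls back to whitespace splitting,
        #          which is why split() saw so few fields in the record
        # some newer awks (e.g. gawk): prints 3 -- one field per character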
The correct technique is to use substr(). This has already been addressed
by others so I'll spare you all the grief of seeing it again.
-dph (still waiting for comp.lang.awk)
larry@macom1.UUCP (Larry Taborek) (10/09/89)
From article <281@nisca.ircc.ohio-state.edu>, by frank@hpuxa.ircc.ohio-state.edu (Frank G. Fiamingo):
> My problem is that I want to check a particular character
> position in a record to see whether or not it is a blank.
> If so, I want to change it to a zero.  I thought I could
> do this with an awk script similar to the one below, but have
> had no success.
>
> BEGIN {FS=""}
> {split($0,array);
>  if (array[28] == " ") {array[28] = 0};
>  for(i in array) print array[$1]}
>
> However, it appears that split is only recognizing 2 fields
> in the record, rather than the 35 characters that it contains.
> Can anyone tell me where I've gone wrong, or a better way
> to do this?

Frank,

Why use split at all?  If $0 contains the entire record, and if you
have fixed position data (as array[28] suggests), then why not just
check $0[28]?

{
FS=""
array=$0
if (array[28] == " ")
        array[28] ="0"
print array
}

I had to assign $0 to some variable as I wanted to change it.  Awk
complains if you try to change $fields directly.  You can also remove
the FS="" statement; since we are working off $0, any fields that awk
derives are of no consequence to this code.

Hope this helps...
-- 
Larry Taborek	..!uunet!grebyn!macom1!larry	Centel Federal Systems
		larry@macom1.UUCP		11400 Commerce Park Drive
						Reston, VA 22091-1506
						703-758-7000
oneill@getafix.slcs.slb.com (Dennis O'Neill) (01/08/90)
I'm trying to use awk to convert a LaTeX file of mailing addresses to something
more acceptable to Oracle's bulk data loading facility. Each entry in the file
is a macro for a particular location in the form of
\def\slny{Schlumberger Limited\\
277 Park Avenue\\
New York, NY 10172-0266}
Successive addresses are separated by blank lines, so I'm trying something like
BEGIN {
        FS = "\\"
        RS = ""
        ORS = "\n\n"
}
$2 ~ /def/ {gsub(/\\\\/, ""); print $3 $4 $5 $6 $7 $8 $9 $10}
And I get
awk: syntax error near line 8
awk: illegal statement near line 8
It seems it doesn't like the gsub call. So I tried a test with just
{gsub(/\\\\/, ""); print $0}
and get essentially the same message:
awk: syntax error near line 1
awk: illegal statement near line 1
In fact, regardless of what I use for the first argument to gsub, I get the same
error. What am I doing wrong?
Thanks in advance,
Dennis O'Neill
norm@oglvee.UUCP (Norman Joseph) (01/10/90)
In <3384@linus.SLCS.SLB.COM>, by oneill@getafix.slcs.slb.com (Dennis O'Neill):
>
> [writing about parsing multi-line records with awk, and getting a
> syntax error on a line using the gsub() function call:]
>
> {gsub(/\\\\/, ""); print $0}
>
> [generates:]
>
> awk: syntax error near line 1
> awk: illegal statement near line 1

On my system (Altos running Unix 5.3.1) there are two versions of awk,
namely "awk" and "nawk" (new awk).  Apparently nawk is the latest
version of awk as described in the book _The_AWK_Programming_Language_
by Aho, Kernighan, & Weinberger, and includes gsub() as a builtin
function, while plain old awk does not.

My suspicion is that you are using the old awk.
-- 
Norm Joseph - Oglevee Computer System, Inc.
UUCP: ...!{pitt,cgh}!amanue!oglvee!norm
/* you are not expected to understand this */
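If a nawk (or another new-awk implementation) is available, the
original script should run under it unchanged.  A minimal sketch of the
invocation; the input file name addresses.tex is made up for the
example:

        # same program as before, run with nawk instead of awk
        nawk 'BEGIN { FS = "\\"; RS = ""; ORS = "\n\n" }
        $2 ~ /def/ { gsub(/\\\\/, ""); print $3 $4 $5 $6 $7 $8 $9 $10 }' addresses.tex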