martin@mwtech.UUCP (Martin Weitzel) (05/09/91)
In article <791@necssd.NEC.COM> harrison@necssd.NEC.COM (Mark Harrison) writes:
>In article <1817@wjvax.UUCP>, mario@wjvax.UUCP (Mario Dona) writes:
>> 1. How to prevent blank lines from printing if there is nothing to print
>Add this line after your BEGIN rule:
>/^$/ {next} #skip blank lines
>
>If you want to skip lines that may have white space:
>/^[ \t]*$/ {next} #skip blank (non-text) lines

Though I don't think that is what the poster was trying to ask, Mark gives
a good hint here on how to avoid unnecessary processing of empty input
lines. I have a further suggestion, which also isn't related to the
original question but turns out to be very handy on many occasions.
Extend Mark's proposal to

    /^[ \t]*(#.*|[ \t]*)$/ { next; } #skip blank lines and comments

This lets you embed comments in the usual style (lines beginning with '#')
within data that is to be processed by AWK. Of course this is only
applicable if you yourself write ALL the tools which process the data.
But since '#'-comments are allowed in many system configuration files,
I think it's good practice to stick to this style if you want such a
feature in your own tools. And as you see, it's really easy. (There's no
excuse for having *NO* such feature if the format of the data you process
doesn't come from outside but is your own design.)

It's not too hard to have even more sophisticated comment processing,
e.g. '#'-comments that can start anywhere within the line, but then it
gets less convenient, and if you don't really *need* it (but just think
it's a nice feature), it's not worth it. (Mail me your solution if you
want; I'll put together a summary and select the shortest one.)

>> 2. How to concatenate the city and zip fields as shown.
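Before getting to that second question, here is a quick way to try out the
comment-skipping rule from above. This is only a minimal sketch: the function
name strip_comments and the sample data are made up for illustration, and an
awk that understands \t inside a regular expression is assumed.

```shell
#!/bin/sh
# strip_comments: print only the data lines of a '#'-commented file.
# (Hypothetical helper name, used for illustration only.)
strip_comments() {
    awk '
    /^[ \t]*(#.*|[ \t]*)$/ { next }   # skip blank lines and comment lines
    { print }                          # everything else is data
    '
}

# Sample input: a comment, an empty line, an indented comment, two data
# lines, and a line holding only blanks.
printf '%s\n' '# city table' '' '   # indented comment' \
    'Dayton OH 45402' '  ' 'Akron OH 44301' | strip_comments
```

Only the two data lines come out. A '#' that appears after some data on the
same line is left alone, since the rule only matches lines that hold nothing
but a comment or white space.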
>
>To concatenate:
>
>    city_and_zip = city " " zip
>
>To strip trailing space from city before concatenating:
>
>    while (substr(city, length(city)) == " ")
>        city = substr(city, 1, length(city) - 1)

I'd also prefer the above for portability, but as NAWK becomes more and
more widely available I think that "gsub" will be the more convenient
approach.

>> 3. [Capitalizing]
>
>This is doable, but not enjoyable. There is more of a chance if
>you use nawk or gawk. Otherwise, make an array:
>
>uc["a"] = "A" ... uc["z"] = "Z"
>lc["A"] = "a" ... lc["Z"] = "z"

I would add that you don't necessarily have to write 26 lines here, as
you can initialize uc and lc in a loop. (OK, OK, it's not totally
portable, but at least it works on ASCII, and for ISO 8859 any special
characters of a foreign character set can be added by hand after this
loop.)

    # *****
    # NOTE: unportable code follows, ASCII character set is assumed!!
    # *****
    for (i = 65; i < 65+26; i++) {
        lc[sprintf("%c", i)] = sprintf("%c", i+32);
        uc[sprintf("%c", i+32)] = sprintf("%c", i);
    }

Or, if you are one of those who like it a bit more obscure, change the
body of the loop to:

    uc[lc[sprintf("%c", i)] = sprintf("%c", i+32)] = sprintf("%c", i);

>and loop for the length of the string:
>
>    if (uc[substr(str, i, 1)] == "")
>        newstr = newstr substr(str, i , 1)
>    else
>        newstr = newstr uc[substr(str, i, 1)]

It is probably worth the price (for performance reasons) to initialize
the array used for mapping completely, so that uc[x] == x (when x should
remain unmapped). Though this requires a little more data space%, there
is no need to check a condition every time the body of the loop is
executed.

%: There's a perhaps little-known pitfall in (at least the old) AWK: when
a statement like

    if (array[z] == "")

is executed, an entry for the index value z is made in array. (You'll
normally not notice this, as the content of array[z] is the empty string,
but it gets in the way if you later have a for (a in array).)
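The side effect is easy to observe. The sketch below counts the entries of
a one-element array after testing for a missing index in two ways. The
function names are made up for illustration, and a nawk-style or POSIX awk
is assumed, because the second variant uses the "in" membership test, which
the old AWK does not offer outside of a for loop.

```shell
#!/bin/sh
# count_with_compare: test a missing element by comparing it with "".
count_with_compare() {
    awk 'BEGIN {
        uc["a"] = "A"                    # the only entry stored on purpose
        if (uc["b"] == "") missing = 1   # referencing uc["b"] creates it
        n = 0
        for (k in uc) n++                # count the entries now present
        print n
    }'
}

# count_with_in: the same test, written with the "in" operator instead.
count_with_in() {
    awk 'BEGIN {
        uc["a"] = "A"
        if (!("b" in uc)) missing = 1    # membership test, no entry is made
        n = 0
        for (k in uc) n++
        print n
    }'
}

count_with_compare   # prints 2
count_with_in        # prints 1
```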
Because of this peculiarity you save less data space than you might think
when you use the above construct, as non-existing index values of `uc' are
added by the test.
--
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
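
As an illustration of the fully initialized mapping suggested above, here is
a minimal sketch. It makes the same ASCII assumption as the loop in the
post, the function name to_upper is made up for illustration, and the
identity entries cover only the 7-bit characters 1 to 127, so any other
character would be dropped by this version.

```shell
#!/bin/sh
# to_upper: upper-case standard input with a fully initialized mapping array.
# NOTE: unportable, ASCII character set is assumed (hypothetical helper).
to_upper() {
    awk '
    BEGIN {
        # Identity mapping for every 7-bit character, so uc[x] == x.
        for (i = 1; i < 128; i++) {
            c = sprintf("%c", i)
            uc[c] = c
        }
        # Overwrite the 26 lower-case letters with their upper-case partners.
        for (i = 65; i < 65 + 26; i++)
            uc[sprintf("%c", i + 32)] = sprintf("%c", i)
    }
    {
        newstr = ""
        # No condition in the loop body: each character is simply looked up.
        for (i = 1; i <= length($0); i++)
            newstr = newstr uc[substr($0, i, 1)]
        print newstr
    }'
}

echo "Dayton, oh 45402" | to_upper   # prints DAYTON, OH 45402
```

Because every index that is looked up already exists, the lookups add no
further entries to the array.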