[comp.unix.questions] ATTN: AWK GURUS!!!

dph@beta.UUCP (David P Huelsbeck) (03/22/88)

The problem I'm faced with is this:

	Convert possibly mixed-case strings to upper-case.
	(not counting case-less chars like digits)

For reasons too involved to explain here I *MUST* do this in
vanilla BSD4.3 awk.

(A NOTE TO THE SLOW:  The above means exactly what it says.
                      Though I haven't tried it yet myself I'm
                      sure that "perl" would most likely make this
                      easier.  I know it would be simple to do
                      in GNU-E-Lisp.  I know that "tr" and "dd" will
                      do case conversion.  I *CANNOT* use any of
                      these.  I *CANNOT* use the new awk.  I
                      *CANNOT* use a C program.  So don't bother
                      me or anybody else telling me I can or should.

     IF YOUR SOLUTION DOESN'T INVOLVE VANILLA BSD4.3 AWK DON'T BOTHER!)

Sorry, I just know from experience that I need to make that clear.  Though
I don't think it will really help, I thought I'd say it anyway.

It is also worth noting that the example of my solution is overly
simplified.  What I have done here could have been accomplished with an
internal awk pipe through tr. (i.e. print new | "tr \"[a-z]\" \"[A-Z]\"")
However, I need the upper-case string to appear along with mixed-case
text also generated by the same awk script.  Specifically I need it to be
in with some nroff stuff.  (again note that I mean *nroff* not TeX!)
As an example I'd like to convert something like:

	foo
	some text....

to:

	.ip "FOO" 10
	some text ....

This is the idea but it is again overly simple so I need a fairly general
and powerful solution.  I've tried awk pipes.  The results were at best
predictably bad and at worst unpredictably bad.   After giving it some
thought I came up with the following gross solution.  Are there any awk
hackers out there who can think of something not so gross?

*	David Huelsbeck
*	dph@lanl.gov             DON'T USE "r". 
*	{cmcl2|ihnp4}!lanl!dph   IT WON'T WORK!
*
*	Why not Comp.lang.awk ?

----my solution follows-----my solution follows-----my solution follows-------


BEGIN	{
	cap["a"] = "A"; cap["b"] = "B"; cap["c"] = "C"; cap["d"] = "D"
	cap["e"] = "E"; cap["f"] = "F"; cap["g"] = "G"; cap["h"] = "H"
	cap["i"] = "I"; cap["j"] = "J"; cap["k"] = "K"; cap["l"] = "L"
	cap["m"] = "M"; cap["n"] = "N"; cap["o"] = "O"; cap["p"] = "P"
	cap["q"] = "Q"; cap["r"] = "R"; cap["s"] = "S"; cap["t"] = "T"
	cap["u"] = "U"; cap["v"] = "V"; cap["w"] = "W"; cap["x"] = "X"
	cap["y"] = "Y"; cap["z"] = "Z"
	}

{	if ($1 ~ /[a-z]+/) {
		new = ""
		last = length($1)
		for (char=1; char <= last; ++char) {
			cur = substr($1,char,1)
			if (cur ~ /[a-z]/) {
				new = new cap[cur]
			} else {
				new = new cur
			}
		}
		print new
	}
}

bzs@bu-cs.BU.EDU (Barry Shein) (03/22/88)

>	Convert possibly mixed-case strings to upper-case.
>	(not counting case-less chars like digits)

The attached works under 4.3bsd as you required.

	-Barry Shein, Boston University

BEGIN {
  upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
  lower = "abcdefghijklmnopqrstuvwxyz";
}  
{
  out = "";
  for(i=1;i <= length($1);i++) {
    if((cpos = index(lower,c = substr($1,i,1))) > 0)
      c = substr(upper,cpos,1);
    out = out c;
  }
  print out;
}

morrell@hpsal2.HP.COM (Michael Morrell) (03/24/88)

This is a little less gross, I hope (and it also seems to work).

BEGIN	{
	lower="abcdefghijklmnopqrstuvwxyz"
	upper="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	}

{	new = ""
	for (i=1; i <= length; ++i) {
		cur = substr($0, i, 1)
		j = index(lower, cur)
		if (j != 0)
			new = new substr(upper, j, 1)
		else
			new = new cur
	}
	print new
}
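
Applied to the nroff example from the original question (a tag line such as
"foo" becoming .ip "FOO" 10, with the text lines passed through untouched),
the same index/substr lookup might be sketched roughly as follows.  This is
only an illustration under assumptions that are not from the thread: the
function name tag_to_ip and the sh wrapper are hypothetical, and the rule
"any line with exactly one field is a tag" is a guess at how tag lines could
be recognised.  The awk body sticks to substr, index and length, the same
built-ins the scripts above rely on, but it has not been checked against a
real 4.3BSD awk.

```shell
#!/bin/sh
# Sketch only. tag_to_ip is a hypothetical name, and the "one-field line
# is a tag" rule is an assumption, not something stated in the thread.
tag_to_ip() {
awk '
BEGIN {
	lower = "abcdefghijklmnopqrstuvwxyz"
	upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
}

# Assumed tag line: exactly one field.  Upper-case it and wrap it in .ip
NF == 1 {
	new = ""
	n = length($1)
	for (i = 1; i <= n; ++i) {
		cur = substr($1, i, 1)
		j = index(lower, cur)		# 0 if cur is not a lower-case letter
		if (j != 0)
			cur = substr(upper, j, 1)
		new = new cur
	}
	print ".ip \"" new "\" 10"
	next
}

# Every other line is copied through with its case unchanged
{ print }
'
}
```

Something like "tag_to_ip < input > output" would then run it as a filter.
Because the conversion happens inside the script, the upper-case tag and the
mixed-case text come out in order on the same output stream, with no awk
pipe through tr.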

ivor@geac.UUCP (Ivor Williams) (03/24/88)

In article <20805@bu-cs.BU.EDU> bzs@bu-cs.BU.EDU (Barry Shein) writes:
>>	Convert possibly mixed-case strings to upper-case.
>
>The attached works under 4.3bsd as you required.
> [ wonderfully economical awk script followed...]

Very good: I've saved that one as an example of the sort of thing awk can
do.  Just one question, though: shouldn't that have been $0 rather than $1,
or did I miss something in the initial question?

Ivor
-- 
Ivor Williams, Geac Computers International Inc. 
UUCP:	{mnetor|yunexus|utgpu}!geac!ivor
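
The difference Ivor asks about can be seen by running the same lookup loop
over both $1 and $0.  The sketch below is only an illustration: the function
name upcase_both, the sh wrapper and the "$1:" / "$0:" output labels are
hypothetical and do not come from any of the posted scripts.

```shell
#!/bin/sh
# Sketch only. upcase_both and the output labels are hypothetical.
upcase_both() {
awk '
BEGIN {
	upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	lower = "abcdefghijklmnopqrstuvwxyz"
}
{
	# k == 1 converts only the first field, k == 0 the whole line
	for (k = 1; k >= 0; --k) {
		s = $k
		out = ""
		n = length(s)
		for (i = 1; i <= n; ++i) {
			c = substr(s, i, 1)
			p = index(lower, c)
			if (p > 0)
				c = substr(upper, p, 1)
			out = out c
		}
		print "$" k ": " out
	}
}
'
}
```

With an input line of "some text", the $1 pass yields only SOME while the $0
pass yields SOME TEXT.  So $1 fits if just the first word of each line is to
be converted, and $0 fits if the whole line is.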