[comp.unix.questions] Selective translation

jackson@dfsun1.electro.swri.edu (Keith Jackson) (07/11/90)

I was trying to filter a file by making the first word lowercase and
leaving the rest as is.  My solution:

    % awk -f filt1.awk foo | tr A-Z a-z > stage1.b
    % pr -m -t -s -l1 stage1.a stage1.b > final

where filt1.awk contains:
{
    print $1;
    for (i = 2; i < NF; i++)
	printf("%s ", $i) >> "stage1.a";
    if (NF > 1)
	print $NF >> "stage1.a";
}

One flaw to this solution is that pr(1) ends up adding an extra
line with the separation character (tab) on it.  The question is, how
does one do this more easily?
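
For reference, the same split, translate, and merge idea can be sketched with paste(1) doing the merge, since paste joins corresponding lines without any page formatting. This is only a sketch under assumptions: the function name lower_first_paste and the use of mktemp are illustrative and not part of the original pipeline, and leading whitespace is dropped, as in the awk version.

```shell
# Hypothetical helper: lowercase the first word of each line of file $1
# by splitting the line in two, translating one half, and re-joining.
lower_first_paste() {
    tmp=$(mktemp -d) || return 1
    # Column 1: the first word of each line, lowercased.
    awk '{ print $1 }' "$1" | tr 'A-Z' 'a-z' > "$tmp/first"
    # Column 2: the line with leading blanks and the first word stripped.
    awk '{ sub(/^[ \t]*[^ \t]+[ \t]*/, ""); print }' "$1" > "$tmp/rest"
    # Join the halves line by line, separated by a single space.
    paste -d ' ' "$tmp/first" "$tmp/rest"
    rm -rf "$tmp"
}
```

Both awk passes emit exactly one line per input line, so the two files stay aligned even when a line holds a single word (that line comes out with a trailing space).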
-- 
  -*^*-*^*-*^*-*^*-*^*-*^*-*^*-*^*-*^*-*^*-*^*-*^*-*^*-*^*-*^*-*^*-*^*-*^*-
Keith Jackson {convex, texsun}!smu!jackson == jackson@csvax.seas.smu.edu
== jackson@dfsun1.electro.swri.edu     ##  UN*X - live free or die  ##
Disclaimer: All views represented here are made by a person who doesn't plan ah

merlyn@iwarp.intel.com (Randal Schwartz) (07/11/90)

In article <1553@dfsun1.electro.swri.edu>, jackson@dfsun1 (Keith Jackson) writes:
| I was trying to filter a file by making the first word lowercase and
| leaving the rest as is.  My solution:
| 
|     % awk -f filt1.awk foo | tr A-Z a-z > stage1.b
|     % pr -m -t -s -l1 stage1.a stage1.b > final
| 
| where filt1.awk contains:
| {
|     print $1;
|     for (i = 2; i < NF; i++)
| 	printf("%s ", $i) >> "stage1.a";
|     if (NF > 1)
| 	print $NF >> "stage1.a";
| }

I hope you mean "the first word on each line".  That's about the best
I could come up with by reverse-engineering your code.  In Perl, it'd be:

perl -pe 's/^(\W*\w+)/(($x = $1) =~ tr|A-Z|a-z|),$x/e' foo

Just another Perl hacker,
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Welcome to Portland, Oregon, home of the California Raisins!"=/

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (07/11/90)

In article <1553@dfsun1.electro.swri.edu> jackson@dfsun1.electro.swri.edu (Keith Jackson) writes:
: I was trying to filter a file by making the first word lowercase and
: leaving the rest as is.  My solution:
: 
:     % awk -f filt1.awk foo | tr A-Z a-z > stage1.b
:     % pr -m -t -s -l1 stage1.a stage1.b > final
: 
: where filt1.awk contains:
: {
:     print $1;
:     for (i = 2; i < NF; i++)
: 	printf("%s ", $i) >> "stage1.a";
:     if (NF > 1)
: 	print $NF >> "stage1.a";
: }
: 
: One flaw to this solution is that pr(1) ends up adding an extra
: line with the separation character (tab) on it.  The question is, how
: does one do this more easily?

Well, there's

  perl -pe 's/\S+/($tmp = $&) =~ y:A-Z:a-z:, $tmp/e' foo >final

or

  perl -pe '/\S+/ && substr($_,length($`),length($&)) =~ y/A-Z/a-z/' foo >final

or

  perl -pe '($x)=/(\S+)/ && $x =~ y/A-Z/a-z/ && s//$x/' foo >final

or

  perl -ne '@x = split(/(\S+)/); $x[1] =~ y/A-Z/a-z/; print @x' foo >final

or even

  sed -e 'h' \
	-e 's/^\([ ]*[^ ][^ ]*\).*/\1/' \
	-e 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/' \
	-e 'G' \
	-e 's/\n[ ]*[^ ][^ ]*//'  foo >final

    where [ ] and [^ ] can be expanded to have a tab as well.

Take your pick.  All of these will preserve the whitespace around the
first word on each line, which you can't do very easily with awk.
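
As a sketch of the tab expansion mentioned for the sed version, the script can be wrapped in a shell function that builds the bracket expressions with a literal tab. The function name lower_first_sed and the tab variable are illustrative assumptions; the sed commands themselves are the ones shown above. It assumes every line contains at least one word, since a line with no word at all would come out doubled.

```shell
# Hypothetical wrapper around the sed script above, with a literal tab
# added to every bracket expression.
lower_first_sed() {
    tab=$(printf '\t')
    # h: save the whole line; s: keep only leading blanks + first word;
    # y: lowercase that; G: append the saved line; s: drop the saved
    # line's leading blanks + first word, leaving the original tail.
    sed -e 'h' \
        -e "s/^\([ $tab]*[^ $tab][^ $tab]*\).*/\1/" \
        -e 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/' \
        -e 'G' \
        -e "s/\n[ $tab]*[^ $tab][^ $tab]*//" "$@"
}
```

Used as a filter, e.g. lower_first_sed foo >final, it leaves the leading blanks and everything after the first word untouched.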

Larry Wall
lwall@jpl-devvax.jpl.nasa.gov

jackson@dfsun1.electro.swri.edu (Keith Jackson) (07/11/90)

) In article <8677@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
) In article <1553@dfsun1.electro.swri.edu> jackson@dfsun1.electro.swri.edu (Keith Jackson) writes:
) : I was trying to filter a file by making the first word lowercase and
) : leaving the rest as is.  My solution:
) : 
) : (cumbersome process deleted)
) 
) Well, there's
) 
) (many perl solutions deleted (gee, I thought she just sang ;^))
)
) or even
) 
)   sed -e 'h' \
) 	-e 's/^\([ ]*[^ ][^ ]*\).*/\1/' \
) 	-e 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/' \
) 	-e 'G' \
) 	-e 's/\n[ ]*[^ ][^ ]*//'  foo >final
) 
)     where [ ] and [^ ] can be expanded to have a tab as well.
) 
) Take your pick.  All of these will preserve the whitespace around the
) first word on each line, which you can't do very easily with awk.
) 
) Larry Wall
) lwall@jpl-devvax.jpl.nasa.gov

This sed solution seems to be the only one I've seen to keep whitespace
intact.  Another solution for those who don't care about spacing:

#! /bin/sh
# Reads standard input; lowercases the first word of each line.
while read a b
do
    echo -n "$a#" | tr A-Z a-z
    echo "$b"
done

Where # is whatever character you want to separate the two.
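
A function form of the same loop, as a rough sketch: the name lower_first_loop and the separator argument are illustrative assumptions, and printf stands in for echo -n, which not every shell's echo supports. As with the loop above, read discards leading whitespace and the blanks between the first word and the rest.

```shell
# Hypothetical function form of the loop above; $1 is the separator
# placed after the first word (default: a single space).
lower_first_loop() {
    sep=${1-" "}
    while read -r first rest
    do
        # Lowercase only the first word, then append separator and rest.
        printf '%s%s%s\n' "$(printf '%s' "$first" | tr 'A-Z' 'a-z')" "$sep" "$rest"
    done
}
```

For example, lower_first_loop '#' < foo would turn a line "  HELLO World" into "hello#World".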
-- 
        ########  U N * X   :   L i v e   F r e e   o r   D i e ########
Keith Jackson {convex, texsun}!smu!jackson == jackson@csvax.seas.smu.edu
== jackson@dfsun1.electro.swri.edu
Disclaimer: They're not paying me for my opinions.  I'd starve.

eli@panda.uucp (Eli Taub/100000) (07/12/90)

In article <1553@dfsun1.electro.swri.edu> jackson@dfsun1.electro.swri.edu (Keith Jackson) writes:
>I was trying to filter a file by making the first word lowercase and
>leaving the rest as is.
>
> Stuff deleted ....
>
>== jackson@dfsun1.electro.swri.edu     ##  UN*X - live free or die  ##
>Disclaimer: All views represented here are made by a person who doesn't plan ah

Try ex as a filter (weird !):

echo 'g/^[ 	]*\<[^ 	]*\>/s//\L&/' | ex - foo

* The `[ 	]' is typed: `[' `SPACE' `TAB' `]'

If the lines don't start with space or tab it could be much simpler.
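
To illustrate that simpler case, here is a sketch in awk rather than ex (a swapped-in tool, chosen only because it is easy to run as a filter). It assumes an awk with match() and tolower(); the function name lower_leading_word is made up for the example. Lines that begin with a space or tab are passed through unchanged.

```shell
# Hypothetical filter for the simpler case: the word to lowercase
# starts in column 1, so only a run of non-blanks anchored at the
# start of the line is matched.
lower_leading_word() {
    awk '{
        if (match($0, /^[^ \t]+/))
            $0 = tolower(substr($0, 1, RLENGTH)) substr($0, RLENGTH + 1)
        print
    }' "$@"
}
```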


                                                           _   |___      
Eli Taub                                                    |     |   \  |
(512) 838-4810                                                    |   /\/
Contractor at (AWD) IBM  | I express my opinions not IBM's.      /   |  \