[comp.unix.questions] Picking a character from a word

barsam@eros.ame.arizona.edu (Barsam Marasli) (04/23/88)

I have a csh question.
I would like to have a script that will echo, say the 4th character
of an argument. I came up with the following:

set word=`echo $1 | od -c` ; echo $word[5]

which seems to do the job but somehow I feel like there are
more elegant ways of doing this. Please reply by e-mail or
post. Also what's an alternative in sh? Thanx.

------------------------------------------------------------------------
Barsam Marasli
Internet: eros!barsam@arizona.edu
UUCP    : ...{allegra,ihnp4,cmcl2,hao!noao}!arizona!eros!barsam
Bitnet  : barsam@arizrvax
------------------------------------------------------------------------

gandalf@csli.STANFORD.EDU (Juergen Wagner) (04/23/88)

How about
	sed 's/^...\(.\).*/\1/'
which extracts the fourth character of a string read from stdin.

More general (and more ugly):
	awk '{print substr($0,4,1);}'
which allows you to pick an arbitrary substring from the lines on stdin.

If you are using this very often, I suggest writing a small C program
to do the job (string manipulation). Awk and sed are not very fast.
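
The two one-liners above can be parameterised on the position. This is only a minimal sketch, assuming a POSIX sed (for the \{n\} interval) and an awk that accepts -v; the names nth_sed and nth_awk are made up for illustration and are not part of either tool.

```shell
# nth_sed N: print the Nth character of each line read from stdin.
# Lines shorter than N produce no output (because of -n and the p flag).
nth_sed() {
    skip=$(($1 - 1))
    sed -n "s/^.\{${skip}\}\(.\).*/\1/p"
}

# nth_awk N: same idea with awk's substr(); short lines give an empty line.
nth_awk() {
    awk -v n="$1" '{ print substr($0, n, 1) }'
}

echo abcdef | nth_sed 4
echo abcdef | nth_awk 4
```

Both still pay for a process per call, so the speed concern above applies unchanged.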

-- 
Juergen "Gandalf" Wagner,		   gandalf@csli.stanford.edu
Center for the Study of Language and Information (CSLI), Stanford CA

ok@quintus.UUCP (Richard A. O'Keefe) (04/23/88)

In article <578@amethyst.ma.arizona.edu>,
barsam@eros.ame.arizona.edu (Barsam Marasli)
[ wants to extract the Nth character of an argument in csh(1) or sh(1) ].

The most elegant solution would be one which uses the weakest tool,
yet which uses that tool in a "direct" way.  Using 'sed' or 'awk' is
clearly overkill:  all you need is expr(1).

Suppose you have variables
	String		holds the argument
	Pos		says where to start (1..length(String)+1)
	Len		says how much (0..length(String)+1-Pos)
and want
	SubStr
to hold the indicated chunk.  Do
	set SubStr = `expr substr $String $Pos $Len`
in csh(1), or
	SubStr=`expr substr $String $Pos $Len`
in sh(1).  For example, to get the 4th character, you would do
	set SubStr = `expr substr $String 4 1`
{Don't omit the $Len argument!}

Unfortunately, "expr substr" is a BSD-ism which has yet to find its way
into the SVID.  To get something which works in both, you have to use
infix colon.

	set SubStr = `expr $String : '...\(...\)'`		# csh
or
	SubStr=`expr $String : '...\(...\)'`			# sh

where the first set of dots has one dot for each character you DON'T
want, and the second set of dots (between \( and \)) has one dot for
each character you DO want.  This is like an ed(1) pattern, and the
bit between \( and \) is the value returned by expr.  For example,
to get the 4th character, you would do
	SubStr=`expr $String : '...\(.\)'`

Then, of course, there is the slightly less elegant
	SubStr=`echo $String | cut -c4`

There are of course the usual subtleties to worry about if String
contains strange characters.  Strictly speaking, it is best to write
	set SubStr = `expr "$String" : '...\(...\)'`	# csh
	SubStr=`      expr "$String" : '...\(...\)'`	# sh
or even
	SubStr=`cut -c4 <<EOF
	$String
	EOF
	`

Oh the joys of macro processors; fun till it hurts.
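
The dot-counting in the infix-colon form can be generated from Pos and Len instead of being typed by hand. A minimal sh sketch, assuming String is an ordinary word (the caveats about strange characters still apply) and a shell with $(( )) arithmetic; substr_expr is a hypothetical helper name:

```shell
# substr_expr STRING POS LEN: print LEN characters of STRING starting at POS,
# by building the pattern  <POS-1 dots>\(<LEN dots>\)  for expr's ":" operator.
substr_expr() {
    str=$1 pos=$2 len=$3
    skip= want=
    i=1
    while [ "$i" -lt "$pos" ]; do skip="$skip."; i=$((i + 1)); done
    i=0
    while [ "$i" -lt "$len" ]; do want="$want."; i=$((i + 1)); done
    expr "$str" : "$skip\\($want\\)"
}

substr_expr abcdefg 4 1
substr_expr abcdefg 2 3
```

Note that expr exits with status 1 when the extracted text is empty or "0", so the exit status is not a reliable success indicator here.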

drears@ardec.arpa (Dennis G. Rears (FSAC)) (04/24/88)

Barsam Marasli  writes:

->I have a csh question.
->I would like to have a script that will echo, say the 4th character
->of an arguement. I came up with the following:
->
->set word=`echo $1 | od -c` ; echo $word[5]
->
->which seems to do the job but somehow I feel like there are
->more elegant ways of doing this. Please reply by e-mail or
->post. Also what's an alternative in sh? Thanx.
->
    If all you want is the fourth character of a word echoed out, try:

echo $1|cut -c4

This works in both the csh and the bourne shell.

...dennis
--------------------------------------------------------------------------
ARPA:	drears@ardec-ac4.arpa	UUCP:  	...!uunet!ardec-ac4.arpa!drears
AT&T:	201-724-6639		Snailmail:	Box 210, Wharton, NJ 07885
Flames:	/dev/null		Reincarnation: newton!babbage!patton!drears
Work:	SMCAR-FSS-E, Dennis Rears, Bldg 94, Picatinny Ars, NJ 07806
--------------------------------------------------------------------------

dhesi@bsu-cs.UUCP (Rahul Dhesi) (04/24/88)

The solutions offered to this question point out a weakness in the UNIX
utilities:

     The UNIX utilities are far from being a minimal, orthogonal,
     complete set of tools.

There is a lot of overlap in what the different tools do.  Yet there
are many things that none of them do properly.

Reckless speculation follows:  Perhaps AT&T, while designing the latest
and greatest non-BSD version of UNIX, will do something about this.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

rupley@arizona.edu (John Rupley) (04/24/88)

In article <578@amethyst.ma.arizona.edu>, barsam@eros.ame.arizona.edu (Barsam Marasli) writes:
> I have a csh question.
> I would like to have a script that will echo, say the 4th character
> of an arguement. I came up with the following:
> 
> set word=`echo $1 | od -c` ; echo $word[5]
> 
> which seems to do the job but somehow I feel like there are
> more elegant ways of doing this. Please reply by e-mail or
> post. Also what's an alternative in sh? Thanx.

It can be done rather neatly in the Korn shell:

	aaa=${1#???};echo ${aaa%${1#????}}

Incorporates only shell commands.  Satisfies the test, "use the 
simplest tool."  Reasonably elegant.  And it's fast - for timing, see 
the test output below, which compares the above and several other 
suggestions.

John Rupley
 uucp: ..{ihnp4 | hao!noao}!arizona!rupley!local
 internet: rupley!local@megaron.arizona.edu
 telex: 9103508679(JARJAR)
 (H) 30 Calle Belleza, Tucson AZ 85716 - (602) 325-4533
 (O) Dept. Biochemistry, Univ. Arizona, Tucson AZ 85721 - (602) 621-3929

---------------test-------------------------------
x=${1:-12345}
time (aaa=${x#???};echo ${aaa%${x#????}})
echo "\n"
time (echo $x|cut -c4)
echo "\n"
time (expr $x : '...\(.\)' )
---------------output from above------------------
4

real	0m0.06s
user	0m0.01s
sys	0m0.03s


4

real	0m0.90s
user	0m0.03s
sys	0m0.73s


4

real	0m0.81s
user	0m0.06s
sys	0m0.65s
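
The same prefix/suffix stripping can be generalised to the Nth character. A sketch only, assuming a shell that supports the ${var#pattern}, ${var%pattern} and ${#var} expansions (ksh, and any later POSIX-style sh); nth_char is a made-up name:

```shell
# nth_char STRING N: print the Nth character of STRING using only
# shell expansions; returns 1 and prints nothing if STRING is too short.
nth_char() {
    str=$1 n=$2
    [ "${#str}" -ge "$n" ] || return 1
    skip=
    i=1
    while [ "$i" -lt "$n" ]; do skip="$skip?"; i=$((i + 1)); done
    rest=${str#$skip}               # drop the first N-1 characters
    after=${rest#?}                 # everything following the wanted one
    printf '%s\n' "${rest%"$after"}"   # quoted, so it is matched literally
}

nth_char 12345 4
```

Quoting "$after" inside the expansion keeps characters such as * or ? in the argument from being treated as pattern characters.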

kutz@bgsuvax.UUCP (Kenneth Kutz) (04/25/88)

In article <578@amethyst.ma.arizona.edu>, barsam@eros.ame.arizona.edu (Barsam Marasli) writes:
> I would like to have a script that will echo, say the 4th character
> of an arguement. I came up with the following:
  
> set word=`echo $1 | od -c` ; echo $word[5]
  
> which seems to do the job but somehow I feel like there are
> more elegant ways of doing this. Please reply by e-mail or
> post. Also what's an alternative in sh? Thanx.

Use 'cut' if you have it.

[echo "stuff" | cut -c3] --> outputs 'u'
         
similarly

[echo "stuff" | cut -c1] --> outputs 's'


-- 
--------------------------------------------------------------------
      Kenneth J. Kutz         	CSNET kutz@bgsu.edu
				UUCP  ...!osu-cis!bgsuvax!kutz
 Disclaimer: Opinions expressed are my own and not of my employer's
--------------------------------------------------------------------

schwartz@gondor.cs.psu.edu (Scott Schwartz) (04/25/88)

In article <2715@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>The solutions offered to this question point out a weakness in the UNIX
>utilities:
>
>     The UNIX utilities are far from being a minimal, orthogonal,
>     complete set of tools.

In particular, they do nothing to help one refill laser printer 
toner cartridges. :-) :-) :-)

Seriously, though, it is unavoidable that as newer, more powerful tools
are introduced (expanding our vector space of solvable problems) they
sometimes necessarily overlap with other tools.  Often this is because
we humans like to see each tool do a particular nontrivial set of
tasks well.  Then there are upward compatibility issues.  Anyone want
to toss out sed just because awk can do its job?  My shell scripts
would never forgive me!

>There is a lot of overlap in what the different tools do.  Yet there
>are many things that none of them do properly.

I agree with you to some extent, but I think that you are unfairly
judging what is "proper".  For example, awk solved the n-th character
problem perfectly, except that it is considered to be too slow.
"cut" is redundant with "awk", but is included in our toolchest because it
is less general but more efficient.  The same argument applies to
shell builtins.  They are a language feature that makes the shell a better
tool at the expense of redundancy with other things.

Anyway, this is a reasonable topic for discussion, so:
Where would you suggest we start trimming things?  
What do you think they don't handle properly?

>Reckless speculation follows:  Perhaps AT&T, while designing the latest
>and greatest non-BSD version of UNIX, will do something about this.

Not likely, given the need for upward compatibility with the present.
Fortunately, I think that this forum will have something to say about
the issue. :-)


-- Scott Schwartz     schwartz@gondor.cs.psu.edu    schwartz@psuvaxg.bitnet

rbj@icst-cmr.arpa (Root Boy Jim) (04/27/88)

   From: "Richard A. O'Keefe" <ok@quintus.uucp>

   Unfortunately, "expr substr" is a BSD-ism which has yet to find its way
   into the SVID.

Or the manual entry.

Here is yet another solution. I assume your word has no spaces in
it, or else it would be many words.

set x=abcdefghijklmnopqrstuvwxyz	# input `word'
set y=(`echo $x | sed 's/./& /g'`)	# chg each char to a word
set z=$y[17]				# select char, 1-origin indexing
echo $z					# should produce `q'
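
For the sh half of the original question, roughly the same trick can be played with the positional parameters in place of a csh array. A minimal sketch, assuming a shell with $( ) and $(( )), and a word without spaces as above; nth_by_split is a hypothetical name:

```shell
# nth_by_split STRING N: turn each character into a word, then pick word N.
nth_by_split() {
    str=$1 n=$2
    set -f                          # no globbing, so * and ? stay literal
    set -- $(printf '%s\n' "$str" | sed 's/./& /g')
    set +f
    [ "$n" -le "$#" ] || return 1   # fewer than N characters
    shift $((n - 1))
    printf '%s\n' "$1"
}

nth_by_split abcdefghijklmnopqrstuvwxyz 17
```

Doing the split inside a function keeps the caller's own $1, $2, ... intact.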

   Oh the joys of macro processors; fun till it hurts.

Yeah, but it feels so good when you stop.

	(Root Boy) Jim Cottrell	<rbj@icst-cmr.arpa>
	National Bureau of Standards
	Flamer's Hotline: (301) 975-5688
	The opinions expressed are solely my own
	and do not reflect NBS policy or agreement
The PINK SOCKS were ORIGINALLY from 1952!!
 But they went to MARS around 1953!!

guy@gorodish.Sun.COM (Guy Harris) (04/27/88)

>    Unfortunately, "expr substr" is a BSD-ism which has yet to find its way
>    into the SVID.

> Or the manual entry.

Actually, "expr substr" is an AT&T-ism that they never documented and that they
thought better of in S3 or so and deleted.  It has the disadvantage that a
token of "substr" - or "index" or "length" - is always treated as the operator
in question, even if it's supposed to be just a string.  (Another undocumented
feature of V7's "expr" - namely the "match" prefix operator which is equivalent
to the ":" infix operator - is still around in S5, so it doesn't work as a
string.)

ok@quintus.UUCP (Richard A. O'Keefe) (04/27/88)

In article <50995@sun.uucp>, guy@gorodish.Sun.COM (Guy Harris) writes:
> [I wrote]
> >    Unfortunately, "expr substr" is a BSD-ism which has yet to find its way
> >    into the SVID.
> Actually, "expr substr" is an AT&T-ism that they never documented and that they
> thought better of in S3 or so and deleted.

Thanks for the information.  And thanks for the warning that 
	expr <string1> <relop> <string2>
won't work when <string1> is one of 'length', 'substr', 'index'.
However, deleting those operators did _NOT_ fix the general problem:
it doesn't work too well if <string1> = "(".

It should be noted that test(1) has similar problems: according to the SVID,
	test $string
is supposed to succeed if $string is non-empty, e.g.
	test "a"
is true, and
	test ""
is false.  But let $string be "-z", and you get an error message!  You
would expect that using "-n $string" would eliminate this ambiguity,
but it introduces another: try string="="!.

The world is still waiting for a UNIX utility like test(1) or expr(1)
which works all the time.

les@chinet.UUCP (Leslie Mikesell) (04/28/88)

In article <905@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>is false.  But let $string be "-z", and you get an error message!  You
>would expect that using "-n $string" would eliminate this ambiguity,
>but it introduces another: try string="="!.

The work-around that I have seen is:

if test "X$string" = "X"
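
Wrapped up as functions, the work-around looks something like the sketch below. The second variant uses case, which does no operator parsing at all; both function names are made up for illustration.

```shell
# is_empty STRING: succeed if STRING is empty, even when it looks like an
# operator to test(1) (for example "-z" or "=").
is_empty() {
    test "X$1" = "X"
}

# Same check with case, which never interprets the string as an operator.
is_empty_case() {
    case $1 in
        '') return 0 ;;
        *)  return 1 ;;
    esac
}

for s in "" "-z" "=" "a"; do
    if is_empty "$s"; then echo "empty"; else echo "non-empty: $s"; fi
done
```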

Now, does someone have an easy way to test if a directory contains any
files or not?

  Les Mikesell

rbj@icst-cmr.arpa (Root Boy Jim) (05/04/88)

   From: "Richard A. O'Keefe" <ok@quintus.uucp>

   The world is still waiting for a UNIX utility like test(1) or expr(1)
   which works all the time. [re: test "$a", if a is "-z"]

Good point. In fact, one might argue that neither concept ought to be
used. In the old days, we used tricks like `test "x$a" = "x"'. While
such tricks are nonintuitive, they get the job done, and with an
appropriate hint in the manual, solve the user's problem all the time.

	(Root Boy) Jim Cottrell	<rbj@icst-cmr.arpa>
	National Bureau of Standards
	Flamer's Hotline: (301) 975-5688
	The opinions expressed are solely my own
	and do not reflect NBS policy or agreement
	What UNIVERSE is this, please??