[comp.unix.wizards] summary: number range parsing in sh

wyle@solaris.UUCP (Mitchell Wyle) (07/28/88)

Both Chris (the wiz) Torek <chris@mimsy.umd.edu>, and
Dave (the guru) Elliot (dce@mips.com) suggested using
IFS and case rather than sed.  Using these constructs
saves 2 forks and an exec each time!! This savings is
about 50% of the time used by the script.

My heart-felt thanks to both Chris Torek and Dave Elliot.
The net is a great place to grock unix.

Here is the code now.  Anyone care to make it even
faster?  >> grin <<   perl?  awk?   ;-)

#!/bin/sh
#
# parse a range of numbers in the form:
#
# <range> :== <number>
#         |   <number>,range
#         |   <number>-<number>
#         |   <number>-
#
# <number>:== <digit><number>
#
# <digit> :== 0|1|2|3|4|5|6|7|8|9
#
# to create a list of numbers to send such as 1,3,8-11 -> 1 3 8 9 10 11
#
# input range is in variable $RA  ($1 for testing)

RA=$1
R=""

echo in:  $RA

SIFS="$IFS"
IFS=","
for i in $RA ; do
  R=$R" "$i
done
RA=$R
IFS=$SIFS
R=""

for tok in $RA ; do
  case $tok in 
  *-*)
    SIFS="$IFS"
    IFS='-'
    set $tok
    IFS="$SIFS"
    i=$1
    max=${2:-99}
    while test $i -le $max ; do
      R=$R" "$i
      i=`expr $i + 1`
    done ;;
  *)
    R=$R" "$tok ;;
  esac
  RA=$R
done

echo out: $RA

-- 
-Mitchell F. Wyle            wyle@ethz.uucp
Institut fuer Informatik     wyle%ifi.ethz.ch@relay.cs.net
ETH Zentrum                  
8092 Zuerich, Switzerland    +41 1 256-5237

morrell@hpsal2.HP.COM (Michael Morrell) (08/02/88)

/ hpsal2:comp.unix.wizards / wyle@solaris.UUCP (Mitchell Wyle) / 12:03 am  Jul 28, 1988 /

Here is the code now.  Anyone care to make it even
faster?  >> grin <<   perl?  awk?   ;-)

[ code deleted ]
-- 
-Mitchell F. Wyle            wyle@ethz.uucp
Institut fuer Informatik     wyle%ifi.ethz.ch@relay.cs.net
ETH Zentrum                  
8092 Zuerich, Switzerland    +41 1 256-5237
----------
Well, this is an awk version of this which I wrote some time back.
Note it also handles something of the form "-8" as a shorthand for "1-8".
This seemed to run significantly faster on my machine that any of the
posted shell scripts.

   Michael Morrell
   
-------------------
#! /bin/sh
RA=$1
echo in: $RA
RA=`echo $RA | awk -F, -f parse.awk`
echo out: $RA
-------------------

where parse.awk contains
-------------------
{ for (i = 1; i <= NF; i++) {
    j = index($i,"-");
    if (j == 0)
      printf "%d ",$i
    else {
      if (j == 1)
        kmin = 1;
      else
        kmin = substr($i,1,j-1) + 0;

      if (j == length($i))
        kmax = 99;
      else
        kmax = substr($i,j+1) + 0;

      for (k = kmin; k <= kmax; k++)
        printf "%d ",k
    }
  }
}
-------------------

leo@philmds.UUCP (Leo de Wit) (08/02/88)

In article <470@solaris.UUCP> wyle@solaris.UUCP (Mitchell Wyle) writes:
|Both Chris (the wiz) Torek <chris@mimsy.umd.edu>, and
|Dave (the guru) Elliot (dce@mips.com) suggested using
|IFS and case rather than sed.  Using these constructs
|saves 2 forks and an exec each time!! This savings is
|about 50% of the time used by the script.

I don't see how you get to a number of 50%; it depends entirely on the
parameters fed to the script. With long ranges the saving is only a
fraction of the code.

|My heart-felt thanks to both Chris Torek and Dave Elliot.
|The net is a great place to grock unix.
|
|Here is the code now.  Anyone care to make it even
|faster?  >> grin <<   perl?  awk?   ;-)

C. But I don't think that was the original intention. If you plan to use
only sh and 'the tools', you can however speed up the range generation
considerably:


|    while test $i -le $max ; do
|      R=$R" "$i
|      i=`expr $i + 1`
|    done ;;

First you can avoid the test command: do something like:

if test $i -le $max
then
    while :
    do
        R=$R" "$i
        i=`expr $i + 1`
        case $i in
        $max) break;;
        esac
    done
fi

Since case and : are builtin, this avoids a fork and exec for each number.
Note the first test IS needed, because the program would loop forever (?)
if $i was greater than $max already (can it be?).

Second you could count with ten numbers 'at-a-time'; i and max will contain
i/10 and max/10 respectively - the start and end have to be dealt with
separately because they don't have to start with a number ending on 0 or
end with a number ending on 9. The main part becomes:

while :
do
    R=$R" "${i}0" "${i}1" "${i}2" "${i}3" "${i}4" "${i}5" "${i}6" "${i}7" "${i}8" "${i}9
    i=`expr $i + 1`
    case $i in
    $max) break;;
    esac
done

The details of the rest that is needed to make it run OK I leave up to you.
Success!

         Leo.

wu@spot.Colorado.EDU (WU SHI-KUEI) (08/03/88)

In article <470@solaris.UUCP> wyle@solaris.UUCP (Mitchell Wyle) writes:
>... Here is the code now.  Anyone care to make it even
>faster?  >> grin <<   perl?  awk?   ;-)

#! /bin/sh
#
# parse a range of numbers in the form:
#
# <range> :== <number>
#         |   <number>,range
#         |   <number>-<number>
#         |   <number>-
#
# <number>:== <digit><number>
#
# <digit> :== 0|1|2|3|4|5|6|7|8|9
#
# to create a list of numbers to send such as 1,3,8-11 -> 1 3 8 9 10 11
#
# input range is in variable $RA  ($1 for testing)


Using a severe test like 1,2,3-8,11-15,34,35,127-199 for $1, the following
awk program runs about 20 times faster on a 3B2/400 than Chris Torek's solution.

---------------------- cut here --------------------------------
echo "$1" |
awk -F',' '
{
	for (i = 1; i <= NF; ++i) {
		if ((p = index($i, "-")) == 0)
			printf"%s ", $i
		else {
			start = substr($i, 0, p - 1)
			end = substr($i, p + 1)
			for (j = int(start); j <= int(end); ++j)
				printf"%s ", j
		}
	}
	printf"\n"
}'
------------------------------------------------------------------
Just a guest on this machine.  In real life
Carl Brandauer
daemon associates
1760 Sunset Blvd
Boulder, CO 80302
303-442-1731
uunet!nbires!bdaemon!carl

fst@mcgp1.UUCP (Skip Tavakkolian) (08/03/88)

In article <470@solaris.UUCP>, wyle@solaris.UUCP (Mitchell Wyle) writes:
> Here is the code now.  Anyone care to make it even
> faster?  >> grin <<   perl?  awk?   ;-)
> #!/bin/sh
> # to create a list of numbers to send such as 1,3,8-11 -> 1 3 8 9 10 11
> RA=$1
> R=""
> echo in:  $RA
> SIFS="$IFS"
> IFS=","
> for i in $RA ; do
>   R=$R" "$i
> done
> RA=$R
> IFS=$SIFS
> R=""
> for tok in $RA ; do
>   case $tok in 
>   *-*)
>     SIFS="$IFS"
>     IFS='-'
>     set $tok
>     IFS="$SIFS"
>     i=$1
>     max=${2:-99}
>     while test $i -le $max ; do
>       R=$R" "$i
>       i=`expr $i + 1`
>     done ;;
>   *)
>     R=$R" "$tok ;;
>   esac
>   RA=$R
> done
> echo out: $RA
> -- 
> -Mitchell F. Wyle            wyle@ethz.uucp

Here is an AWK script that does what you want and some time measurements
as run on a 3b2-600 SYSV 3.1.1 using time(1) (I am using the old awk).

awk '
END {
    # no error checking done here
    arg = "'$1'"
    if (length(arg) == 0)
	exit

    print "in: ", arg
    printf "out: "
    NF = split(arg, args, ",")
    for (i = 1 ; i <= NF ; i++) {
	if (split(args[i], range, "-") < 2)
	    printf "%d ", args[i]
	else {
	    for (j = range[1]; j <= range[2]; j++)
		printf "%d ", j
	    }
	}
    print
    }' </dev/null

Here are the time results:

# SHELL version
time range.sh 1,3,8-11,22,24-32
in: 1,3,8-11,22,24-32
out: 1 3 8 9 10 11 22 24 25 26 27 28 29 30 31 32

real        1.5
user        0.4
sys         0.8

# AWK version
time range.awk 1,3,8-11,22,24-32
in:  1,3,8-11,22,24-32
out: 1 3 8 9 10 11 22 24 25 26 27 28 29 30 31 32 

real        0.4
user        0.1
sys         0.1

Sincerely

-- 
Fariborz ``Skip'' Tavakkolian
UUCP	...!uw-beaver!tikal!mcgp1!fst

UNIX is a registered trademark of AT&T