[comp.lang.perl] Simple perl question

jorgy@boulder.Colorado.EDU (Eric R. Jorgensen) (08/21/90)

I have come across the following problem, which I have been unable to
solve since I am new to Perl programming.

I have input that looks like:

x	1
x	5
x	6
x	7
y	25
y	26

And I would like output like:

x	1,5-7
y	25-26

I would be very appreciative if someone could help me, or at least
point me in the direction of a similar example.

Thanks in advance,
jorgy

Eric R. Jorgensen			internet: jorgy@boulder.colorado.edu
University of Colorado, Boulder  	uucp: ...!{ncar|nbires}!boulder!jorgy
					bitnet:   jorgy@colorado.BITNET
"Women.  Can't live with 'em, pass the beer nuts." -- Norm on Cheers

phillips@cs.ubc.ca (George Phillips) (08/21/90)

In article <24941@boulder.Colorado.EDU> jorgy@refuge.colorado.edu (Eric R. Jorgensen) writes:
>I have input that looks like:
>
>x	1
>x	5
>x	6
>x	7
>y	25
>y	26
>
>And I would like output like:
>
>x	1,5-7
>y	25-26

Here's a script which should do it.  I'd bet Randal can manage this in
3 greps and 5 regexps or less.

#!/usr/bin/perl

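# Collect every number seen for a tag into one space-separated string.
while (<>) {
  ($left, $num) = split;
  $nums{$left} .= $num . " ";
}

# each() returns the tags in hash order, not input order.
while (($left, $num) = each %nums) {
  print "$left\t";
  @snum = sort(bynum split(' ', $num));
  $#srange = -1;    # empty the result list for this tag
  # $i is the start of a run of consecutive numbers; $j scans to just past it.
  for ($i = 0; $i <= $#snum; $i = $j) {
    for ($j = $i; $j <= $#snum && $snum[$i] == $snum[$j] + $i - $j; $j++) {}
    if ($j == $i + 1) {
      push (@srange, $snum[$i]);    # run of one: a lone number
    }
    else {
      push (@srange, "$snum[$i]-$snum[$j-1]");    # longer run: first-last
    }
  }
  print join(',', @srange) . "\n";
}

# Numeric comparison for sort.
sub bynum { $a - $b; }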
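
The run-collapsing loop in the script can also be pulled out into a
subroutine, which makes it easier to try on other data.  What follows is
only a sketch in present-day Perl (strict, lexical variables), not the
code from the post; the name collapse_runs is made up for illustration,
and it assumes the numbers are plain integers.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# collapse_runs is a hypothetical helper name, not part of the script above.
# It takes a list of integers and returns a string such as "1,5-7".
sub collapse_runs {
    my @nums = sort { $a <=> $b } @_;
    my @ranges;
    my $i = 0;
    while ($i < @nums) {
        # Advance $j while the next number is exactly one more than this one.
        my $j = $i;
        $j++ while $j < $#nums && $nums[$j + 1] == $nums[$j] + 1;
        # A run of one is printed alone, a longer run as first-last.
        push @ranges, $j == $i ? $nums[$i] : "$nums[$i]-$nums[$j]";
        $i = $j + 1;
    }
    return join(',', @ranges);
}

print collapse_runs(1, 5, 6, 7), "\n";   # prints 1,5-7
print collapse_runs(26, 25), "\n";       # prints 25-26
```

Because the subroutine sorts its arguments itself, the numbers for a tag
can be handed over in any order.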

--
George Phillips phillips@cs.ubc.ca {alberta,uw-beaver,uunet}!ubc-cs!phillips

merlyn@iwarp.intel.com (Randal Schwartz) (08/21/90)

In article <24941@boulder.Colorado.EDU>, jorgy@boulder (Eric R. Jorgensen) writes:
| I have input that looks like:
| 
| x	1
| x	5
| x	6
| x	7
| y	25
| y	26
| 
| And I would like output like:
| 
| x	1,5-7
| y	25-26
| 
| I would be very appreciative if someone could help me, or at least
| point me in the direction of a similar example.

Well, here's a lukewarm stab at it.  It worked the first time on your
data, but I didn't try it on anything else.  And there are probably
ways to optimize the heck out of it.  (But I'm too busy working on
"that other thing that I'm not supposed to mention because it keeps
sounding more and more like a commercial". :-)

################################################## snip here
#!/local/usr/bin/perl

while (<>) {
	chop;
	($tag,$page) = split;
	(warn "[skipping '$_']"), next if
		(length($tag) < 1) || ($page !~ /^\d+$/);
	if ($tag ne $otag) {
		print "$otag\t$opage\n" if length($otag);
		($otag,$opage) = ($tag,$page);
		next;
	}
	if ($opage =~ /^(.*,)?(\d+)$/) {
		($early,$recent) = ($1,$2);
		if ($recent == $page - 1) {
			$opage = "$early$recent-$page";
		} else {
			$opage = "$early$recent,$page";
		}
	} elsif ($opage =~ /^(.*-)(\d+)$/) {
		($early,$recent) = ($1,$2);
		if ($recent == $page - 1) {
			$opage = "$early$page";
		} else {
			$opage = "$early$recent,$page";
		}
	}
}

print "$otag\t$opage\n" if length($otag);
################################################## snip here
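
Note that the script above only compares each line with the one before
it, so it relies on the input already being grouped by tag with the
page numbers ascending, as in the sample data.  If that does not hold,
the lines could be ordered first.  A minimal sketch, assuming each line
is a tag and a page number separated by whitespace (sort_pairs is a
made-up name for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# sort_pairs is a hypothetical helper: it takes "tag<whitespace>page" lines
# and returns them ordered by tag, then numerically by page.
sub sort_pairs {
    return map  { join("\t", @$_) }
           sort { $a->[0] cmp $b->[0] or $a->[1] <=> $b->[1] }
           map  { [ split ' ', $_ ] } @_;
}

# Sample input, deliberately out of order.
my @lines = ("y\t26", "x\t6", "x\t1", "y\t25", "x\t7", "x\t5");
print "$_\n" for sort_pairs(@lines);
```

The sorted lines can then be piped into the script unchanged.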

print "Just another Perl hacker,"
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Welcome to Portland, Oregon, home of the California Raisins!"=/

merlyn@iwarp.intel.com (Randal Schwartz) (08/21/90)

In article <1990Aug20.223101.28912@iwarp.intel.com>, merlyn@iwarp (Randal Schwartz) writes:
| In article <24941@boulder.Colorado.EDU>, jorgy@boulder (Eric R. Jorgensen) writes:
| | I have input that looks like:
| | x	1
| | x	5
| | x	6
| | x	7
| | y	25
| | y	26
| | And I would like output like:
| | x	1,5-7
| | y	25-26
| 
| Well, here's a lukewarm stab at it.

[my lukewarm stab deleted.]

Naah.  I like this one better.  Same general idea, but using an
s/foo/bar/e to do the hard work.
================================================== snip
#!/local/usr/bin/perl

while (<>) {
	chop;
	($tag,$page) = split;
	(warn "[skipping '$_']"), next if
		(length($tag) < 1) || ($page !~ /^\d+$/);
	if ($tag ne $otag) {
		print "$otag\t$opage\n" if length($otag);
		$otag = $tag; $opage = $page;
		next;
	}
	$opage =~ s#([-,])?(\d+)$#
		($2 + 1 != $page) ? "$1$2,$page" :
		($1 eq "-") ? "-$page" : "$1$2-$page"
	#e;
}

print "$otag\t$opage\n" if length($otag);
================================================== snip
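
To see what the s///e step does by itself, it can be isolated in a
subroutine and fed one page at a time.  This is a sketch rather than
the script above: add_page is a made-up name, and the capture is
written ([-,]?) instead of ([-,])? so that $1 is an empty string
rather than undefined when 'use warnings' is on.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# add_page is a hypothetical name for illustration.  It appends $page to a
# page list such as "1,5-6" by rewriting only the last number in the list:
#   gap before $page       -> keep the tail and add ",$page"
#   tail already ends "-N" -> move the end of the range to $page
#   tail is a lone number  -> open a new range "N-$page"
sub add_page {
    my ($list, $page) = @_;
    $list =~ s{([-,]?)(\d+)$}{
        $2 + 1 != $page ? "$1$2,$page"
      : $1 eq '-'       ? "-$page"
      :                   "$1$2-$page"
    }e;
    return $list;
}

my $list = "1";
$list = add_page($list, $_) for 5, 6, 7;
print "$list\n";    # prints 1,5-7
```

Like the script, this assumes the pages arrive in ascending order.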

Gotta stop hacking this now.  Although, creating this program couldn't
have come at a better time (I presume you are making an index for a
book... a task with which I am becoming more acquainted as each day
passes :-).

print "Just another Perl hacker,"