[comp.unix.questions] Neat little .newsrc fixer-upper.

jfh@killer.UUCP (The Beach Bum) (04/26/88)

Not only is this an inappropriate posting, it is an intentional posting.
But hey, this turned out to be a real handy three minute hack.

This little toy takes your (possibly uneditable) .newsrc and removes
all the garbage between 1 and the last article you read.  i wrote it
because my .newsrc on killer was in bad shape and couldn't be edited
because some of the lines were too long.  this could be written much
better, with a man page and all, but then it might get posted to a
source group!

just copy the source between my name below and my .sig into fixrc.c
and compile.  then run with stdin as your newsrc and stdout as where
you want the new one to go.  just make sure the two aren't the same
file!

- john.
--
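#include <stdio.h>

#define	NUMLEN	16	/* longest article number kept, plus the NUL */

int	getnum (char *s);

int
main (void)
{
	int	c;
	int	ranged;
	char	low[NUMLEN];
	char	high[NUMLEN];

	while ((c = getchar ()) != EOF) {
		/* copy the newsgroup name, up to the first space */
		while (c != ' ' && c != '\n' && c != EOF) {
			putchar (c);
			c = getchar ();
		}
		if (c != ' ') {		/* line has no article list */
			putchar ('\n');
			continue;
		}
		putchar (' ');

		if (! getnum (low)) {	/* not a number: copy the rest as is */
			while ((c = getchar ()) != '\n' && c != EOF)
				putchar (c);
			putchar ('\n');
			continue;
		}
		/* skip each separator; high ends up as the last number read */
		ranged = 0;
		while ((c = getchar ()) != '\n' && c != EOF)
			if (getnum (high))
				ranged = 1;

		if (ranged)
			printf ("%s-%s\n", low, high);
		else		/* a lone article number has no upper bound */
			printf ("%s\n", low);
	}
	return (0);
}

/*
 * read a run of digits from stdin into s.  returns 0, leaving s
 * untouched, if the next character is not a digit.
 */
int
getnum (char *s)
{
	int	c;
	int	i = 0;

	while ((c = getchar ()) != EOF && c >= '0' && c <= '9')
		if (i < NUMLEN - 1)	/* never run off the end of s */
			s[i++] = c;

	if (c != EOF)
		ungetc (c, stdin);

	if (i == 0)
		return (0);

	s[i++] = 0;

	return (i);
}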
-- 
John F. Haugh II                  SNAIL:  HECI Exploration Co. Inc.
UUCP: ...!ihnp4!killer!jfh                11910 Greenville Ave, Suite 600
"You can't threaten us, we're             Dallas, TX. 75243
  the Oil Company!"                       (214) 231-0993 Ext 260
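
As a worked illustration of what the post above describes, the following
C sketch collapses the article list of one .newsrc line held in memory.
It is not the posted fixrc.c: the function name collapse_ranges and the
sample group names are assumptions made for this example, and the line
is expected without its trailing newline.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the posted fixrc.c: rewrite a line
 * such as "comp.unix: 1-10,12,15-20" as "comp.unix: 1-20".  Lines with
 * no article list, or with a single article number, are copied as is.
 * Returns 0 on success, -1 if out is too small.
 */
int
collapse_ranges (const char *line, char *out, size_t outsz)
{
	const char	*sp = strchr (line, ' ');	/* end of group name */
	const char	*first, *last, *p;
	size_t	firstlen, lastlen;
	int	n;

	if (sp == NULL || ! isdigit ((unsigned char) sp[1])) {
		n = snprintf (out, outsz, "%s", line);
		return ((n < 0 || (size_t) n >= outsz) ? -1 : 0);
	}

	first = last = sp + 1;
	firstlen = lastlen = strspn (first, "0123456789");

	/* walk the rest of the list, remembering the last run of digits */
	for (p = first + firstlen; *p != '\0'; ) {
		size_t	len = strspn (p, "0123456789");

		if (len > 0) {
			last = p;
			lastlen = len;
			p += len;
		} else
			p++;
	}

	if (last == first)	/* lone article number: nothing to collapse */
		n = snprintf (out, outsz, "%.*s %.*s",
		    (int) (sp - line), line, (int) firstlen, first);
	else
		n = snprintf (out, outsz, "%.*s %.*s-%.*s",
		    (int) (sp - line), line, (int) firstlen, first,
		    (int) lastlen, last);

	return ((n < 0 || (size_t) n >= outsz) ? -1 : 0);
}
```

Called on "comp.unix: 1-10,12,15-20" this should leave "comp.unix: 1-20"
in out; everything between the first and last article numbers is simply
dropped, which is the whole trick.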

dce@mips.COM (David Elliott) (04/26/88)

In article <3931@killer.UUCP> jfh@killer.UUCP (The Beach Bum) writes:
>Not only is this an inappropriate posting, it is an intentional posting.
>But hey, this turned out to be a real handy three minute hack.
>
>This little toy takes your (possibly uneditable) .newsrc and removes
>all the garbage between 1 and the last article you read.  i wrote it
>because my .newsrc on killer was in bad shape and couldn't be edited
>because some of the lines were too long.  this could be written much
>better, with a man page and all, but then it might get posted to a
>source group!

Why write a C program when a standard Unix utility can do the trick?

sed 's/^\([^:!]*[:!]\)[ 	]\([0-9][0-9]*\)[-,][-,0-9]*[-,]\([0-9][0-9]*\)[ 	]*$/\1 \2-\3/'

It's actually very easy when you understand the subexpressions:

^\([^:!]*[:!]\) 	is the newsgroup name, including the ! or :.
[ 	]		(a space and a tab) is the whitespace separating the
			name of the newsgroup from the first article number
\([0-9][0-9]*\) 	is the number of the first article
[-,][-,0-9]*[-,]	describes all of the stuff between the first and
			last article numbers, whether it's ranges or lists
\([0-9][0-9]*\)		is the last article number
[ 	]*$		is (optional) trailing whitespace and the end of the
			line

The \1 corresponds to the first set of things in \(\), \2 the second set,
and \3 the third.

-- 
David Elliott		dce@mips.com  or  {ames,prls,pyramid,decwrl}!mips!dce
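
The three \( \) groups described above can also be looked at from C,
since POSIX regcomp() accepts the same basic regular expression syntax
when it is called without REG_EXTENDED.  The sketch below is only an
illustration: the function name sed_style_collapse is hypothetical, and
the pattern is the one from the post with the tab written as \t.

```c
#include <regex.h>
#include <stdio.h>

/*
 * Hypothetical illustration of the sed substitution: if line matches
 * the pattern, write "\1 \2-\3" to out; otherwise copy it unchanged,
 * as sed would.  line must not include the trailing newline.
 * Returns 0 on success, -1 on a regex error or if out is too small.
 */
int
sed_style_collapse (const char *line, char *out, size_t outsz)
{
	static const char	pat[] =
	    "^\\([^:!]*[:!]\\)[ \t]"		/* \1: group name and : or ! */
	    "\\([0-9][0-9]*\\)"			/* \2: first article number */
	    "[-,][-,0-9]*[-,]"			/* everything in between */
	    "\\([0-9][0-9]*\\)[ \t]*$";		/* \3: last article number */
	regex_t	re;
	regmatch_t	m[4];	/* m[0] is the whole match, m[1..3] the groups */
	int	n;

	if (regcomp (&re, pat, 0) != 0)
		return (-1);

	if (regexec (&re, line, 4, m, 0) != 0)
		n = snprintf (out, outsz, "%s", line);
	else
		n = snprintf (out, outsz, "%.*s %.*s-%.*s",
		    (int) (m[1].rm_eo - m[1].rm_so), line + m[1].rm_so,
		    (int) (m[2].rm_eo - m[2].rm_so), line + m[2].rm_so,
		    (int) (m[3].rm_eo - m[3].rm_so), line + m[3].rm_so);

	regfree (&re);
	return ((n < 0 || (size_t) n >= outsz) ? -1 : 0);
}
```

Note that a line which is already a single range, such as
"comp.unix: 1-20", does not match at all, because the middle part of the
pattern needs at least two separators; it is passed through untouched.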

chris@mimsy.UUCP (Chris Torek) (04/27/88)

In article <2083@quacky.mips.COM> dce@mips.COM (David Elliott) writes:
>Why write a C program when a standard Unix utility can do the trick?
>
>sed [amazing sed command deleted]

Because sed has a limit of BUFSIZ (or LBSIZE, but I suspect that is
for the hold buffer) length lines.

(Actually, I prefer to edit long .newsrc lines with Emacs.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris
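
Whether a given .newsrc is likely to trip such a limit can be checked by
counting characters one at a time, rather than reading whole lines into
a buffer.  A minimal sketch, assuming the hypothetical function name
longest_line; the limit to compare against is left to the reader, since
it depends on how the local sed was built.

```c
#include <stdio.h>

/*
 * Hypothetical helper: return the length of the longest line on fp,
 * not counting the newline.  Reads a character at a time, so it has
 * no line-length limit of its own.
 */
long
longest_line (FILE *fp)
{
	long	cur = 0;
	long	max = 0;
	int	c;

	while ((c = getc (fp)) != EOF) {
		if (c == '\n') {
			if (cur > max)
				max = cur;
			cur = 0;
		} else
			cur++;
	}
	if (cur > max)		/* the last line may lack a newline */
		max = cur;

	return (max);
}
```

Comparing the result against BUFSIZ from <stdio.h> gives a rough idea of
whether the limit mentioned above is in play, though the figure that
matters is whatever the sed binary was compiled with.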

dik@cwi.nl (Dik T. Winter) (04/27/88)

In article <11237@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
 > In article <2083@quacky.mips.COM> dce@mips.COM (David Elliott) writes:
 > >Why write a C program when a standard Unix utility can do the trick?
 > >
 > >sed [amazing sed command deleted]
 > 
 > Because sed has a limit of BUFSIZ (or LBSIZE, but I suspect that is
 > for the hold buffer) length lines.
 > 
 > (Actually, I prefer to edit long .newsrc lines with Emacs.)
 > -- 
Actually I never need to edit long .newsrc lines, because I do
not get them.  How?  Simple: I removed all unsubscribed newsgroups from
it (which also makes rn start faster).  And rn will not ask you about
them every time; it only asks about truly new groups.
-- 
dik t. winter, cwi, amsterdam, nederland
INTERNET   : dik@cwi.nl
BITNET/EARN: dik@mcvax
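
The cleanup described above can be sketched as a small filter that drops
every line whose group name ends in '!' (the unsubscribed marker) and
copies the rest.  The function name drop_unsubscribed and the NAMELEN
bound are assumptions for this example; how a newsreader then treats the
missing groups is up to the newsreader.

```c
#include <stdio.h>

#define	NAMELEN	512	/* assumed upper bound on a group name */

/*
 * Hypothetical filter: copy in to out, dropping lines whose group name
 * is terminated by '!'.  Only the name is buffered, so the article list
 * can be of any length.  Returns the number of lines dropped.
 */
long
drop_unsubscribed (FILE *in, FILE *out)
{
	char	name[NAMELEN];
	size_t	n = 0;
	long	dropped = 0;
	int	c;

	while ((c = getc (in)) != EOF) {
		if (c != '!' && c != ':' && c != '\n' && n < sizeof name - 1) {
			name[n++] = (char) c;	/* still in the group name */
			continue;
		}
		if (c == '!') {		/* unsubscribed: skip the line */
			dropped++;
			while ((c = getc (in)) != EOF && c != '\n')
				;
		} else {		/* keep the name and the rest */
			fwrite (name, 1, n, out);
			putc (c, out);
			while (c != '\n' && (c = getc (in)) != EOF)
				putc (c, out);
		}
		n = 0;
	}
	fwrite (name, 1, n, out);	/* unterminated last line, if any */

	return (dropped);
}
```

As with fixrc, the input and output must not be the same file.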

nate@mipos3.intel.com (Nate Hess) (04/27/88)

In article <3931@killer.UUCP> jfh@killer.UUCP (The Beach Bum) writes:
>[...] I wrote it because my .newsrc on killer was in bad shape and
>couldn't be edited because some of the lines were too long.

Ah, but you *could* have edited it had you been using Emacs instead of
vi...

--woodstock
-- 
	   "How did you get your mind to tilt like your hat?"

...!{decwrl|hplabs!oliveb|pur-ee|qantel|amd}!intelca!mipos3!nate
<domainish> :   nate@mipos3.intel.com		ATT :    (408) 765-4309

jfh@rpp386.UUCP (John F. Haugh II) (04/28/88)

In article <2083@quacky.mips.COM> dce@quacky.UUCP (David Elliott) writes:
>In article <3931@killer.UUCP> jfh@killer.UUCP (The Beach Bum) writes:
>>This little toy takes your (possibly uneditable) .newsrc and removes
>>all the garbage between 1 and the last article you read.
>
>Why write a C program when a standard Unix utility can do the trick?
>

here's why.  this is the vmstat output for my home machine.  the most
notable number is forks.  in general i avoid unix utilities once i
understand the problem.  fixrc is less grief on the machine.

  407925 page cache hits
  205559 page cache misses
     408 procs swapped in
     410 procs swapped out
  205559 filesystem page reads
    8299 swap area page reads
    6897 swap area page writes
  163937 pages reclaimed from free list
 1170660 pages shared due to copy-on-write fork
   10621 pages shared due to cache hits
  958036 shared pages copied
75682332 page faults
13423894 cpu context switches
121472501 (non clock) device interrupts
 3174741 traps
76000468 system calls
  137024 forks

- john.
-- 
John F. Haugh II                 | "You see, I want a lot.  Perhaps I want every
River Parishes Programming       | -thing.  The darkness that comes with every
UUCP:   ihnp4!killer!rpp386!jfh  | infinite fall and the shivering blaze of
DOMAIN: jfh@rpp386               | every step up ..." -- Rainer Maria Rilke

pjh@mccc.UUCP (Pete Holsberg) (04/29/88)

In article <2116@mipos3.intel.com> nate@mipos3.intel.com (Nate Hess) writes:
...In article <3931@killer.UUCP> jfh@killer.UUCP (The Beach Bum) writes:
...>[...] I wrote it because my .newsrc on killer was in bad shape and
...>couldn't be edited because some of the lines were too long.
...
...Ah, but you *could* have edited it had you been using Emacs instead of
...vi...

Could you explain that?  What is there about emacs that lets it edit
.newsrc, and what is there about vi that forbids it?  Thanks.

nate@mipos3.intel.com (Nate Hess) (05/02/88)

In article <606@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
>Could you explain that?  What is there about emacs that lets it edit
>.newsrc, and what is there about vi that forbids it?  Thanks.


vi has a compiled-in limit to the length of lines in its buffer -- 
char buffer_line[LARGE-BUT-NEVER-LARGE-ENOUGH-NUMBER] -- and it croaks
if it tries to read in a file with a line larger than this limit,
truncating the remainder of the file.

Emacs has no such limit.  In fact, I have used Emacs to edit binaries.

--woodstock

john@jetson.UUCP (John Owens) (05/03/88)

In article <606@mccc.UUCP>, pjh@mccc.UUCP (Pete Holsberg) writes:
> Could you explain that?  What is there about emacs that lets it edit
> .newsrc, and what is there about vi that forbids it?  Thanks.

It doesn't have to do with .newsrc specifically; it's just that most
implementations of emacs have no restrictions on the length of a single
line, but vi does.


-- 
John Owens		SMART HOUSE Development Venture
john@jetson.UUCP	(old uucp) uunet!jetson!john
+1 301 249 6000		(internet) john%jetson.uucp@uunet.uu.net
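
The difference being described is between a line buffer whose size is
fixed at compile time and one that grows as needed.  The sketch below
shows the second approach in C; it makes no claim about how vi or any
emacs actually stores text, and the name read_long_line is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: read one line of any length from fp, without
 * the newline.  The buffer doubles whenever it fills, so there is no
 * compiled-in limit.  Returns a malloc'd string the caller must free,
 * or NULL at end of file or if memory runs out.
 */
char *
read_long_line (FILE *fp)
{
	size_t	cap = 16;
	size_t	len = 0;
	char	*buf = malloc (cap);
	int	c;

	if (buf == NULL)
		return (NULL);

	while ((c = getc (fp)) != EOF && c != '\n') {
		if (len + 1 == cap) {	/* full: double the buffer */
			char	*bigger = realloc (buf, cap * 2);

			if (bigger == NULL) {
				free (buf);
				return (NULL);
			}
			buf = bigger;
			cap *= 2;
		}
		buf[len++] = (char) c;
	}
	if (c == EOF && len == 0) {	/* nothing left to read */
		free (buf);
		return (NULL);
	}
	buf[len] = '\0';

	return (buf);
}
```

A program built around a fixed char array has to truncate, split, or
reject a line longer than the array; one built this way is limited only
by available memory.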

rbj@icst-cmr.arpa (Root Boy Jim) (05/04/88)

   here's why.  this is the vmstat output for my home machine.  the most
   notable number is forks.  in general i avoid unix utilities once i
   understand the problem.  fixrc is less grief on the machine.

     407925 page cache hits
     205559 page cache misses
	408 procs swapped in
	410 procs swapped out
     205559 filesystem page reads
       8299 swap area page reads
       6897 swap area page writes
     163937 pages reclaimed from free list
    1170660 pages shared due to copy-on-write fork
      10621 pages shared due to cache hits
     958036 shared pages copied
   75682332 page faults
   13423894 cpu context switches
   121472501 (non clock) device interrupts
    3174741 traps
   76000468 system calls
     137024 forks

And the worst part is, his machine's only been up for three minutes!

	(Root Boy) Jim Cottrell	<rbj@icst-cmr.arpa>
	National Bureau of Standards
	Flamer's Hotline: (301) 975-5688
	The opinions expressed are solely my own
	and do not reflect NBS policy or agreement
	Yow!  It's a hole all the way to downtown Burbank!

jc@minya.UUCP (John Chambers) (05/07/88)

In article <11237@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> In article <2083@quacky.mips.COM> dce@mips.COM (David Elliott) writes:
> >Why write a C program when a standard Unix utility can do the trick?
> >
> >sed [amazing sed command deleted]
> 
> Because sed has a limit of BUFSIZ (or LBSIZE, but I suspect that is
> for the hold buffer) length lines.
> 
And also because the C is easier to type, and takes fewer debug
runs to get right. (;-)

-- 
John Chambers <{adelie,ima,maynard,mit-eddie}!minya!{jc,root}> (617/484-6393)

You can't make a turtle come out.
	-- Malvina Reynolds