[net.lang.c] Comments on your program

cottrell@nbs-vms.arpa (COTTRELL, JAMES) (11/28/85)

/*
> Everybody I've shown this program to has *groaned* and complained
> about what a nasssty program it was...  This program was written
> to help decode a bitnet routing table that had been netcopy'd
> to me and didn't get translated into ascii.  So after running dd
> over it, the line markers had disappeared into never never land.
> But from looking real closely at the file I could see that each
> line was supposed to start with ROUTE.....  thus this program:

It doesn't work. Suppose the sequence `ROUROUTE' occurs. The second
`R' will not be recognized as the start of the sequence!
 
> #include <stdio.h>
> 
> main()
> {
> 	register int r, o, u, t, e;
> 
> 	while ((r = getchar()) != EOF) {
> 		if (r == 'R')
> 			if ((o = getchar()) == 'O')
> 				if ((u = getchar()) == 'U')
> 					if ((t = getchar()) == 'T')
> 						if ((e = getchar()) == 'E')
> 							printf("\nROUTE");
> 						else
> 							printf("ROUT%c", e);
> 					else
> 						printf("ROU%c", t);
> 				else
> 					printf("RO%c", u);
> 			else
> 				printf("R%c", o);
> 		else
> 			putchar(r);
> 	}
> 	putchar('\n');
> }
> -- 
> David Herron,  cbosgd!ukma!david, david@UKMA.BITNET.
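
To make the `ROUROUTE' failure concrete, here is a minimal sketch that
restates the nested-if logic of the quoted program on a string instead of
stdin. The name herron_filter is made up for illustration, the trailing
putchar('\n') is left out, and end of input in mid-match simply flushes what
was read (the original would print EOF as a character there).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// herron_filter (hypothetical name) mimics the quoted program on a string.
// `out` must have room for 2 * strlen(in) + 1 bytes.
void herron_filter(const char *in, char *out)
{
    static const char word[] = "ROUTE";

    assert(in != NULL && out != NULL);

    while (*in != '\0') {
        size_t n = 0;

        // Count how many leading characters spell out ROUTE.
        while (n < 5 && in[n] == word[n])
            n++;

        if (n == 5) {
            // Full match: start a new line, as printf("\nROUTE") does.
            *out++ = '\n';
            memcpy(out, word, 5);
            out += 5;
            in += 5;
        } else {
            // Partial or no match: the matched prefix AND the character
            // that broke the match are both consumed and printed, so that
            // character is never re-examined as a possible 'R'.
            size_t take = n + (in[n] != '\0');
            memcpy(out, in, take);
            out += take;
            in += take;
        }
    }
    *out = '\0';
}
```

Feeding it "ROUROUTE" gives back "ROUROUTE" with no newline anywhere: the
second `R' is swallowed by the "ROU%c" branch.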

I thought of ways to use existing tools to do the job. How about this:
1) run thru `tr' to change all `R's to newlines. This gives you all
possible places where a line might start. Now run an `ex' script that
chex (wheat, corn, rice) each line begins with OUTE. If it doesn't,
then put back the R. Then for each line that begins with an R, join
it with the previous line. Finally, put back an R on each line.
So we have:
		v/^OUTE/s/^/R/
		g/^R/.-1,.j!
		g/^/s/^/R/
		wq

If you're clever you might be able to work out a sed script.

> Experience is something you don't get until just after you need it.

To quote what you once said to me: `Now why didn't you think before posting?'

	jim		cottrell@nbs
*/
------

jsdy@hadron.UUCP (Joseph S. D. Yao) (11/30/85)

In article <135@brl-tgr.ARPA> cottrell@nbs-vms.arpa (COTTRELL, JAMES) writes:
>                             ...  `Now why didn't you think before posting?'
>>                                    ...  This program was written
>> to help decode a bitnet routing table that had been netcopy'd
>> to me and didn't get translated into ascii.  So after running dd
>> over it, the line markers had disappeared into never never land.
>> But from looking real closely at the file I could see that each
>> line was supposed to start with ROUTE.....  thus this program:
>
>It doesn't work. Suppose the sequence `ROUROUTE' occurs. The second
>`R' will not be recognized as the start of the sequence!
>
>I thought of ways to use existing tools to do the job. How about this:
>1) run thru `tr' to change all `R's to newlines. This gives you all
>possible places where a line might start. Now run an `ex' script that
>chex (wheat, corn, rice) each line begins with OUTE. If it doesn't,
>then put back the R. Then for each line that begins with an R, join
>it with the previous line. Finally, put back an R on each line.

Yes, Herron's algorithm won't work without some way of backing up.
No, Cottrell's algorithm won't work either.  It assumes that ALL
NL's have been removed, which is a possible but not necessary
interpretation of the originally stated problem.  In C, one way
to do things is:
	while ((c = my_getchar()) != EOF) {
		if (c != 'R') {
			putchar(c);
			last_put = c;
			continue;
		}
		gather 4 more
		test for ROUTE
		if so, print NL + 5 chars; last_put = 'E';
		else ungetchar 4 (which is why my_getchar())
	}
	if (last_put != NL)	/* almost certainly so */
		putchar(NL);
This assumes that Herron is correct in his assumption that the
word "ROUTE" was one-to-one with line starts.
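
Filled in, the outline above might look like the following. This is a sketch,
not a tested production filter: split_routes, my_ungetchar, and the 4-slot
pushback stack are names and choices made up here, and last_put starts out as
a newline so that empty input produces no output. A private pushback stack is
used because ungetc() only guarantees one character of pushback.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PUSHBACK_MAX 4          // at most 4 characters are ever pushed back

static int pushback[PUSHBACK_MAX];
static int npushed = 0;

// Read one character, preferring anything pushed back earlier.
static int my_getchar(FILE *in)
{
    return npushed > 0 ? pushback[--npushed] : getc(in);
}

// Push one character back; the last one pushed is the next one read.
static void my_ungetchar(int c)
{
    assert(npushed < PUSHBACK_MAX);
    pushback[npushed++] = c;
}

// Copy `in` to `out`, starting a new line before every ROUTE.
void split_routes(FILE *in, FILE *out)
{
    int c, d, last_put = '\n';

    while ((c = my_getchar(in)) != EOF) {
        if (c == 'R') {
            char buf[4];
            int n = 0;

            // Gather up to 4 more characters (fewer at end of input).
            while (n < 4 && (d = my_getchar(in)) != EOF)
                buf[n++] = (char)d;

            if (n == 4 && memcmp(buf, "OUTE", 4) == 0) {
                fputs("\nROUTE", out);
                last_put = 'E';
                continue;
            }
            // Not ROUTE: push the gathered characters back in reverse,
            // so they are re-read in their original order.
            while (n > 0)
                my_ungetchar((unsigned char)buf[--n]);
        }
        putc(c, out);
        last_put = c;
    }
    if (last_put != '\n')       // almost certainly so
        putc('\n', out);
}
```

Calling split_routes(stdin, stdout) from main() turns this into a filter.
Given `ROUROUTE', the second `R' is re-read after the pushback and the line
break lands in front of it.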

Note also that Herron implies a conversion from E***** to ASCII.
If the original tape/file was blocked with fixed-length records,
then there is a dd arg to size lines (cbs=, I believe).  If var-
length, he may have to read all lines in the original for the
record sizes and substitute for them the E@#$%^ NL character
before dd'ing.
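
For the fixed-length case, the record unblocking that dd performs with cbs=
can be sketched in C as follows. It assumes the data has already been
translated to ASCII and that records are padded with trailing blanks;
unblock_records is a made-up name, not anything dd exposes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// unblock_records (hypothetical helper): split `len` bytes of `in` into
// records of `cbs` bytes, strip trailing spaces from each, and terminate
// each with a newline. `out` must have room for len + len/cbs + 2 bytes.
// Returns the number of bytes written, not counting the final NUL.
size_t unblock_records(const char *in, size_t len, size_t cbs, char *out)
{
    size_t w = 0;

    assert(in != NULL && out != NULL && cbs > 0);

    for (size_t off = 0; off < len; off += cbs) {
        // The last record may be shorter than cbs.
        size_t n = len - off < cbs ? len - off : cbs;

        // Drop the blank padding at the end of the record.
        while (n > 0 && in[off + n - 1] == ' ')
            n--;

        memcpy(out + w, in + off, n);
        w += n;
        out[w++] = '\n';
    }
    out[w] = '\0';
    return w;
}
```

With 10-byte records, "ROUTE A   ROUTE BB  " comes out as two lines,
"ROUTE A" and "ROUTE BB".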
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}