cottrell@nbs-vms.arpa (COTTRELL, JAMES) (11/28/85)
/*
> Everybody I've shown this program to has *groaned* and complained
> about what a nasssty program it was... This program was written
> to help decode a bitnet routing table that I had been netcopy'd
> to me and didn't get translated into ascii. So after running dd
> over it, the line markers had disappeared into never never land.
> But from looking real closely at the file I could see that each
> line was supposed to start with ROUTE..... thus this program:

It doesn't work. Suppose the sequence `ROUROUTE' occurs. The second
`R' will not be recognized as the start of the sequence!

> #include <stdio.h>
>
> main()
> {
>     register int r, o, u, t, e;
>
>     while ((r = getchar()) != EOF) {
>         if (r == 'R')
>             if ((o = getchar()) == 'O')
>                 if ((u = getchar()) == 'U')
>                     if ((t = getchar()) == 'T')
>                         if ((e = getchar()) == 'E')
>                             printf("\nROUTE");
>                         else
>                             printf("ROUT%c", e);
>                     else
>                         printf("ROU%c", t);
>                 else
>                     printf("RO%c", u);
>             else
>                 printf("R%c", o);
>         else
>             putchar(r);
>     }
>     putchar('\n');
> }
> --
> David Herron, cbosgd!ukma!david, david@UKMA.BITNET.

I thought of ways to use existing tools to do the job. How about this:
1) run thru `tr' to change all `R's to newlines. This gives you all
possible places where a line might start. Now run an `ex' script that
chex (wheat, corn, rice) each line begins with OUTE. If it doesn't,
then put back the R. Then for each line that begins with an R, join
it with the previous line. Finally, put back an R on each line.
So we have:

    v/^OUTE/s/^/R/
    g/^R/.-1,.j
    g/^/s/^/R/
    wq

If you're clever you might be able to work out a sed script.

> Experience is something you don't get until just after you need it.

To quote what you once said to me: `Now why didn't you think before posting?'

    jim     cottrell@nbs
*/
------
jsdy@hadron.UUCP (Joseph S. D. Yao) (11/30/85)
In article <135@brl-tgr.ARPA> cottrell@nbs-vms.arpa (COTTRELL, JAMES) writes:
> ... `Now why didn't you think before posting?'
>> ... This program was written
>> to help decode a bitnet routing table that I had been netcopy'd
>> to me and didn't get translated into ascii. So after running dd
>> over it, the line markers had disappeared into never never land.
>> But from looking real closely at the file I could see that each
>> line was supposed to start with ROUTE..... thus this program:
>
>It doesn't work. Suppose the sequence `ROUROUTE' occurs. The second
>`R' will not be recognized as the start of the sequence!
>
>I thought of ways to use existing tools to do the job. How about this:
>1) run thru `tr' to change all `R's to newlines. This gives you all
>possible places where a line might start. Now run an `ex' script that
>chex (wheat, corn, rice) each line begins with OUTE. If it doesn't,
>then put back the R. Then for each line that begins with an R, join
>it with the previous line. Finally, put back an R on each line.

Yes, Herron's algorithm won't work without some way of backing up.
No, Cottrell's algorithm won't work either. It assumes that ALL NL's
have been removed, which is a possible but not necessary
interpretation of the originally stated problem. In C, one way to
do things is:

    while ((c = my_getchar()) != EOF) {
        if (c != 'R') {
            putchar(c);
            last_put = c;
            continue;
        }
        gather 4 more
        test for ROUTE
        if so, print NL + 5 chars; last_put = 'E';
        else print the 'R'; last_put = 'R';
             ungetchar 4 (which is why my_getchar())
    }
    if (last_put != NL)     /* almost certainly so */
        putchar(NL);

This assumes that Herron is correct in his assumption that the word
"ROUTE" was one-to-one with line starts.

Note also that Herron implies a conversion from E***** to ASCII.
If the original tape/file was blocked with fixed-length records,
then there is a dd arg to size lines (cbs=, I believe). If var-length,
he may have to read all lines in the original for the record sizes
and substitute for them the E@#$%^ NL character before dd'ing.
--

    Joe Yao     hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}
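
[A minimal sketch of the backing-up idea outlined above, not code from
either poster. It assumes the whole input is already in memory, so a
buffer index stands in for my_getchar() and the pushback: "ungetting"
the 4 lookahead characters amounts to advancing the index by only one.
The name route_split() and its buffer-size contract are hypothetical.]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NL '\n'

/*
 * Hypothetical helper: copy in[0..n) to out, starting a new line
 * before every occurrence of "ROUTE". The caller must provide at
 * least n + n/5 + 2 bytes in out (one extra NL per marker, a final
 * NL, and the terminating NUL). Returns the number of characters
 * written, not counting the NUL.
 */
size_t route_split(const char *in, size_t n, char *out)
{
    static const char mark[] = "ROUTE";
    const size_t marklen = sizeof mark - 1;
    size_t i = 0, o = 0;
    int last_put = 0;               /* last character written; 0 if none */

    assert(in != NULL && out != NULL);

    while (i < n) {
        if (in[i] == 'R' && n - i >= marklen &&
            memcmp(in + i, mark, marklen) == 0) {
            /* Full marker found: emit NL + the 5 characters. */
            out[o++] = NL;
            memcpy(out + o, mark, marklen);
            o += marklen;
            i += marklen;
            last_put = 'E';
        } else {
            /*
             * Not a marker (or not an 'R' at all): emit this one
             * character and resume at the very next one. The lookahead
             * is not consumed, so the second 'R' in "ROUROUTE" is
             * examined as a possible marker start in its turn.
             */
            out[o++] = in[i];
            last_put = (unsigned char)in[i];
            i++;
        }
    }
    if (last_put != NL)             /* almost certainly so */
        out[o++] = NL;
    out[o] = '\0';
    return o;
}
```

[Like the pseudocode, this passes existing newlines through untouched
and still depends on "ROUTE" appearing only at line starts. A
stream-based version would need a pushback stack of at least 4
characters, since ungetc() only guarantees one.]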