valley@uchicago (Doug Dougherty) (03/29/91)
This article contains the source for a version of UUSANITY, a program for decoding uuencoded articles. It is adapted from an original idea by Joe Peterson (joep@stardent.com). It has been compiled successfully under Sun OS and with Turbo C on the PC.

The program is basically a replacement for the built-in ":decode" command in my newsreader, nn. nn's :decode often gets confused by multi-part submissions to unmoderated groups. Like :decode, my program handles multiple files in a single pass.

To use it, simply pipe all of the files into this program, or use the "write pipe" option in nn. This program accepts no command line params, nor does it work with non-redirected stdin.

This program spits all "offending" lines out to stdout. To inhibit this, redirect stdout to either nul (under DOS) or /dev/null (under Unix). Informational messages go out on stderr.

Note: If there are any errors in main(), that's my egg. If there are any in read_line() or valid(), talk to Joe... Other than that, standard disclaimers apply. In particular, if you don't like any part of this, that's just tough.

---CUT HERE (File "UUSANITY.TXT", a help file of sorts...) ----

The fact that uuencoded files are often broken up into several parts can be a pain. There are several programs/scripts out there to strip out the non-data from these files, but none seem that robust. There is no way for a program to catch every bad line of text, but you can get pretty close!

I wrote a small program (in C) to take a large file, which is a concatenation of all the parts in the right order, and write the bare uuencoded file to stdout. This can be piped into uudecode. Note that there is no need to remove any text by hand. It is very unlikely (unless the poster is very devious) that a bad line will sneak through.

The program first skips everything until hitting the string "begin " at the start of a line. It then examines every line until "end" is encountered on a line by itself.
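For each line in between, it compares the encoded byte count with the actual number of bytes on the line (and allows one extra character at the end of the line - I don't know why some postings have this!). It also makes sure all characters (except perhaps that one extra) are in the valid range. If the line does not pass these tests, it is discarded.

No attempt is made to look for things like CUT HERE, since these conventions seem to change from post to post. The program has worked for every file I have tried to decode, but there may be some conventions I have not yet run across!

Happy decoding!

Joe Peterson
joep@stardent.com

---CUT HERE (File "UUSANITY.C") ----

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#ifdef __MSDOS__
#include <io.h>         /* isatty(), read(), write(), unlink() */
#else
#include <unistd.h>     /* isatty() */
#endif

#define LINE_SIZE 128   /* size of the line buffer, including the '\0' */

int read_line(FILE *file, char string[]);
int valid(char *string);

int main(int argc, char *argv[])
{
    char line[LINE_SIZE];
    int start;          /* nonzero while between "begin" and "end" */
    FILE *fp = NULL;    /* temp file (DOS) or pipe to uudecode (Unix) */
#ifdef __MSDOS__
    char uudecoder[100], *puu;
#endif

    (void) argv;

    /* No arguments allowed, and stdin must be redirected. */
    if (argc != 1 || isatty(0)) {
        system("page uusanity.txt");
        exit(1);
    }

#ifdef __MSDOS__
    /* Under DOS, get the decoder command from the environment or ask. */
    if ((puu = getenv("UUDECODER")) != NULL)
        strcpy(uudecoder, puu);
    else {
        write(2, "Enter the command to use for uudecoding...", 42);
        uudecoder[read(2, uudecoder, sizeof(uudecoder)) - 1] = '\0';
    }
    strcat(uudecoder, " < tmptmp.uue");
#endif

    start = 0;
    while (read_line(stdin, line) != EOF) {
        if (!strncmp(line, "begin ", 6)) {
            fprintf(stderr, "\n\tUUdecoding '%s'...\n\n", line + 6);
            start = 1;
#ifdef __MSDOS__
            if (!(fp = fopen("tmptmp.uue", "w"))) {
                perror("Error opening 'tmptmp.uue'");
                exit(1);
            }
#else
            if (!(fp = popen("uudecode", "w"))) {
                perror("Error P-opening 'uudecode'");
                exit(1);
            }
#endif
            fprintf(fp, "%s\n", line);
            continue;
        }
        /* "end" only counts inside a file; fp is not open otherwise. */
        if (start && !strcmp(line, "end")) {
            start = 0;
            /* valid() rejects zero-length lines, so supply one here. */
            fprintf(fp, " \nend\n");
#ifdef __MSDOS__
            fclose(fp);
            system(uudecoder);
            unlink("tmptmp.uue");
#else
            pclose(fp);
#endif
            continue;
        }
        if (start && valid(line))
            fprintf(fp, "%s\n", line);
        else
            printf("%s\n", line);       /* "offending" line */
    }
    exit(0);
}

/* Read one line, without its newline, into string.  At most
   LINE_SIZE - 1 characters are kept.  Returns 0, or EOF at end of input. */
int read_line(FILE *file, char string[])
{
    int pos;
    int c;      /* int, not char, so EOF is distinct from any data byte */

    pos = 0;
    while ((c = fgetc(file)) != '\n') {
        if (c == EOF) {
            string[0] = '\0';
            return (EOF);
        }
        if (pos < LINE_SIZE - 1)
            string[pos++] = c;
    }
    string[pos] = '\0';
    return (0);
}

/* Return 1 if string looks like a uuencoded data line, 0 otherwise. */
int valid(char *string)
{
    int i, count, pad, byte_count, char_count;

    count = (int) strlen(string);
    byte_count = (*string - ' ');       /* first character is the byte count */
    if (byte_count <= 0)
        return (0);
    /* Round up to a multiple of 3; every 3 bytes become 4 characters. */
    pad = (byte_count % 3) ? (3 - (byte_count % 3)) : 0;
    byte_count += pad;
    char_count = byte_count * 4 / 3;
    /* Length must match, allowing one extra character at the end. */
    if (((count - 1) != char_count) && ((count - 2) != char_count))
        return (0);
    /* Every data character must lie in the range ' ' through '`'. */
    for (i = 1; i < (char_count + 1); ++i)
        if ((string[i] < ' ') || (string[i] > (' ' + 64)))
            return (0);
    return (1);
}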
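
For anyone puzzling over the arithmetic in valid(), here is a minimal sketch of just the length test described in the help file. The names expected_chars() and length_matches() are made up for this illustration and are not part of UUSANITY, and the sketch leaves out the character-range test entirely.

```c
#include <string.h>

/* Hypothetical helper: number of encoded characters that should follow
   the count character on a line claiming byte_count data bytes. */
int expected_chars(int byte_count)
{
    int padded = (byte_count + 2) / 3 * 3;  /* round up to a multiple of 3 */
    return padded / 3 * 4;                  /* 3 bytes become 4 characters */
}

/* Hypothetical helper: 1 if the line's length agrees with its count
   character, allowing one extra character at the end of the line. */
int length_matches(const char *line)
{
    int count = (int) strlen(line);
    int byte_count, want;

    if (count == 0)
        return 0;
    byte_count = line[0] - ' ';             /* first character is the count */
    if (byte_count <= 0)
        return 0;
    want = expected_chars(byte_count);
    return (count - 1) == want || (count - 2) == want;
}
```

For example, a line whose first character is 'M' claims 'M' - ' ' = 45 bytes, so 60 encoded characters should follow it, giving a total line length of 61 (or 62 with the stray extra character). Ordinary text such as "Hello there" fails because 'H' claims 40 bytes, which would need 56 characters after it.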