valley@uchicago (Doug Dougherty) (03/29/91)
This article contains the source for a version of UUSANITY, for decoding
uuencoded articles. It is adapted from an original idea by Joe Peterson
(joep@stardent.com). It has been compiled successfully under SunOS and
with Turbo C on the PC.
The program is basically a replacement for the built-in ":decode"
command in my newsreader, nn. nn's :decode often gets confused with
multi-part submissions to unmoderated groups. Like :decode, my program
handles multiple files in a single pass.
To use it, simply pipe all of the files into this program or use the
"write pipe" option in nn. This program accepts no command-line parameters,
nor does it work when stdin is not redirected. It writes all
"offending" lines to stdout. To suppress them, redirect stdout to either
nul (under DOS) or /dev/null (under Unix). Informational messages go to
stderr.
Note: If there are any errors in main(), that's my egg. If there are
any in read_line() or valid(), talk to Joe...
Other than that, standard disclaimers apply.
In particular, if you don't like any part of this, that's just tough.
---CUT HERE (File "UUSANITY.TXT", a help file of sorts...) ----
The fact that uuencoded files are often broken up into several parts can be
a pain. There are several programs/scripts out there to strip out the
non-data from these files, but none seem that robust. There is no way
for a program to catch every bad line of text, but you can get pretty close!
I wrote a small program (in C) to take a large file, which is a concatenation
of all the parts in the right order, and write the bare uuencoded file to
stdout. This can be piped into uudecode. Note that there is no need to
remove any text by hand. It is very unlikely (unless the poster is very
devious) that a bad line will sneak through.
The program first skips everything until hitting the string "begin " at
the start of a line. It then examines every line until "end" is encountered
by itself. For each line in between, it compares the encoded byte count with
the actual number of bytes on the line (and allows one extra character at the
end of the line - I don't know why some postings have this!). It also makes
sure all characters (except perhaps that one extra) are in the valid range.
If the line does not pass these tests, it is discarded. No attempt is made
to look for things like CUT HERE, since these conventions seem to change
from post to post.
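To make the length arithmetic concrete, here is a minimal sketch of the
check described above. The helper names expected_chars() and length_ok()
are made up for illustration and are not part of the program; the sketch
assumes the usual uuencode layout, in which the first character of a data
line holds the byte count plus 32 and every 3 bytes become 4 characters.

```c
#include <string.h>

/* Hypothetical helper (not part of the program itself): number of
   encoded characters expected after the count character, or -1 if the
   count character does not describe a data line. */
int expected_chars(int count_char)
{
    int bytes = count_char - ' ';       /* decoded byte count */

    if (bytes <= 0)
        return -1;
    return ((bytes + 2) / 3) * 4;       /* round up to whole 3-byte groups */
}

/* Hypothetical helper: 1 if the line has the expected length, allowing
   one extra character at the end of the line, 0 otherwise. */
int length_ok(const char *line)
{
    int want = expected_chars((unsigned char)line[0]);
    int have = (int)strlen(line) - 1;   /* characters after the count */

    return want > 0 && (have == want || have == want + 1);
}
```

A full-length line starts with 'M' (45 bytes), so it should carry 60
encoded characters. The line "#0V%T" (3 bytes, 4 characters) passes,
while a line such as "---CUT HERE---" fails on length alone.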
The program has worked for every file I have tried to decode, but there may
be some conventions I have not yet run across! Happy decoding!
Joe Peterson
joep@stardent.com
---CUT HERE (File "UUSANITY.C") ----
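#include <stdio.h>
#include <stdlib.h>     /* exit(), system() */
#include <string.h>
#ifdef __MSDOS__
#include <io.h>         /* isatty(), read(), write(), unlink() */
#else
#include <unistd.h>     /* isatty() */
#endif

int read_line();
int valid();

int main(argc, argv)
int argc;
char **argv;
{
    char line[128];
    int start;          /* nonzero while between "begin" and "end" */
    FILE *fp = NULL;
#ifdef __MSDOS__
    char uudecoder[100], *puu, *getenv();
    int n;
#else
    FILE *popen();
#endif

    /* No arguments are accepted, and stdin must be redirected. */
    if (argc != 1 || isatty(0)) {
        system("page uusanity.txt");
        exit(1);
    }
#ifdef __MSDOS__
    /* Take the decoder command from the environment, or prompt for it.
       14 bytes are reserved for the " < tmptmp.uue" suffix and the NUL. */
    if ((puu = getenv("UUDECODER")) != NULL) {
        strncpy(uudecoder, puu, sizeof(uudecoder) - 14);
        uudecoder[sizeof(uudecoder) - 14] = '\0';
    } else {
        write(2, "Enter the command to use for uudecoding...", 42);
        n = read(2, uudecoder, sizeof(uudecoder) - 14);
        uudecoder[n > 0 ? n - 1 : 0] = '\0';    /* strip the newline */
    }
    strcat(uudecoder, " < tmptmp.uue");
#endif
    start = 0;
    while (read_line(stdin, line) != EOF) {
        if (!strncmp(line, "begin ", 6)) {
            fprintf(stderr, "\n\tUUdecoding '%s'...\n\n", line + 6);
            start = 1;
#ifdef __MSDOS__
            if (!(fp = fopen("tmptmp.uue", "w"))) {
                perror("Error opening 'tmptmp.uue'");
                exit(1);
            }
#else
            if (!(fp = popen("uudecode", "w"))) {
                perror("Error P-opening 'uudecode'");
                exit(1);
            }
#endif
            fprintf(fp, "%s\n", line);
            continue;
        }
        /* Honor "end" only inside a file; otherwise fp is not open. */
        if (start && !strcmp(line, "end")) {
            start = 0;
            fprintf(fp, " \nend\n");
#ifdef __MSDOS__
            fclose(fp);
            system(uudecoder);
            unlink("tmptmp.uue");
#else
            pclose(fp);
#endif
            continue;   /* handled; do not echo it as an offending line */
        }
        if (start && valid(line))
            fprintf(fp, "%s\n", line);
        else
            printf("%s\n", line);
    }
    exit(0);
}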
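#define MAXLINE 128     /* must match the size of line[] in main() */

/* Read one line from file into string, without the newline.
   Returns 0 on success, EOF when no more input is available. */
int read_line(file, string)
FILE *file;
char string[];
{
    int pos;
    int c;              /* int, not char, so that EOF is distinguishable */

    pos = 0;
    while ((c = fgetc(file)) != '\n') {
        if (c == EOF) {
            if (pos > 0)
                break;          /* last line lacks a newline; keep it */
            string[0] = '\0';
            return (EOF);
        }
        if (pos < MAXLINE - 1)  /* drop characters that do not fit */
            string[pos++] = c;
    }
    string[pos] = '\0';
    return (0);
}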
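/* Return 1 if string looks like a well-formed uuencoded data line,
   0 otherwise. */
int valid(string)
char *string;
{
    int i, count, pad, byte_count, char_count;

    count = strlen(string);
    byte_count = (*string - ' ');   /* first character encodes the byte count */
    if (byte_count <= 0)            /* empty, zero-length, or non-data line */
        return (0);
    /* Data is encoded in groups of 3 bytes -> 4 characters. */
    pad = (byte_count % 3) ? (3 - (byte_count % 3)) : 0;
    byte_count += pad;
    char_count = byte_count * 4 / 3;
    /* Allow exactly char_count characters, or one extra at the end. */
    if (((count - 1) != char_count) &&
        ((count - 2) != char_count))
        return (0);
    /* Every encoded character must lie between ' ' and '`'. */
    for (i = 1; i < (char_count + 1); ++i)
        if ((string[i] < ' ') || (string[i] > (' ' + 64)))
            return (0);
    return (1);
}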