[comp.binaries.ibm.pc.d] New version of UUSANITY

valley@uchicago (Doug Dougherty) (03/29/91)
This article contains the source for a version of UUSANITY, for decoding
uuencoded articles.  It is adapted from an original idea by Joe Peterson
(joep@stardent.com).  It has been compiled successfully under SunOS and
with Turbo C on the PC.

The program is basically a replacement for the built-in ":decode"
command in my newsreader, nn.  nn's :decode is often confused by
multi-part submissions to unmoderated groups.  Like :decode, my program
handles multiple files in a single pass.

To use it, pipe all of the files into this program or use the "write
pipe" option in nn.  The program accepts no command-line parameters and
will not run unless stdin is redirected.  It writes all "offending"
(rejected) lines to stdout.  To suppress them, redirect stdout to either
nul (under DOS) or /dev/null (under Unix).  Informational messages go
to stderr.

Note: If there are any errors in main(), that's my egg.  If there are
any in read_line() or valid(), talk to Joe...

Other than that, standard disclaimers apply.
In particular, if you don't like any part of this, that's just tough.

---CUT HERE   (File "UUSANITY.TXT", a help file of sorts...) ----
The fact that uuencoded files are often broken up into several parts can be
a pain.  There are several programs/scripts out there to strip out the
non-data from these files, but none seem that robust.  There is no way
for a program to catch every bad line of text, but you can get pretty close!

I wrote a small program (in C) to take a large file, which is a concatenation
of all the parts in the right order, and write the bare uuencoded file to
stdout.  This can be piped into uudecode.  Note that there is no need to
remove any text by hand.  It is very unlikely (unless the poster is very
devious) that a bad line will sneak through.

The program first skips everything until hitting the string "begin " at
the start of a line.  It then examines every line until "end" is encountered
by itself.  For each line in between, it compares the encoded byte count with
the actual number of bytes on the line (and allows one extra character at the
end of the line - I don't know why some postings have this!).  It also makes
sure all characters (except perhaps that one extra) are in the valid range.
If the line does not pass these tests, it is discarded.  No attempt is made
to look for things like CUT HERE, since these conventions seem to change
from post to post.

The program has worked for every file I have tried to decode, but there may
be some conventions I have not yet run across!  Happy decoding!

					Joe Peterson
					joep@stardent.com

---CUT HERE   (File "UUSANITY.C") ----
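#include    <stdio.h>
#include    <stdlib.h>		/* exit(), system(), getenv() */
#include    <string.h>
#ifdef __MSDOS__
#include    <io.h>		/* isatty(), read(), write(), unlink() */
#else
#include    <unistd.h>		/* isatty() */
#endif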

int main(argc, argv)
    int     argc;
    char    *argv[];
{
    char    line[128];
    int     start;
    FILE    *fp;

#ifdef __MSDOS__
    char    uudecoder[100],*puu,*getenv();
#else
    FILE    *popen();
#endif

    if (argc != 1 || isatty(0)) {
	system("page uusanity.txt");
        exit(1);
	}

#ifdef __MSDOS__
    if (puu = getenv("UUDECODER"))
        strcpy(uudecoder,puu);
    else
	{
	write(2,"Enter the command to use for uudecoding...",42);
	uudecoder[read(2,uudecoder,sizeof(uudecoder)) - 1] = '\0';
	}
    strcat(uudecoder," < tmptmp.uue");
#endif

    start= 0;
    while (read_line(stdin, line) != EOF) {
	if (!strncmp(line, "begin ", 6)) {
	    fprintf(stderr,"\n\tUUdecoding '%s'...\n\n",line + 6);
            start= 1;
#ifdef __MSDOS__
	    if (!(fp = fopen("tmptmp.uue","w")))
	    	perror("Error opening 'tmptmp.uue'"),
		exit (1);
#else
	    if (!(fp = popen("uudecode","w")))
	    	perror("Error P-opening 'uudecode'"),
		exit (1);
#endif
	    fprintf(fp,"%s\n", line);
	    continue;
	    }

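        if (start && !strcmp(line, "end")) {	/* ignore a stray "end" before any "begin": fp is not open yet */
	    start= 0;
            fprintf(fp," \nend\n");
#ifdef __MSDOS__
	    fclose (fp);
	    system (uudecoder);
	    unlink ("tmptmp.uue");
#else
	    pclose (fp);
#endif
	    continue;				/* "end" is not an offending line */
	    }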

        if (start && valid(line))
            fprintf(fp,"%s\n", line);
	else
            printf("%s\n", line);
	}
	exit (0);
}

read_line(file, string)
    FILE           *file;
    char            string[];

{
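    int             pos;
    int             c;		/* int, not char: fgetc() returns EOF as an int */

    pos = 0;
    while ((c = fgetc(file)) != '\n')
    {
	if (c == EOF)
	{
	    string[0] = '\0';
	    return (EOF);
	}
	if (pos < 127)		/* caller's buffer is char[128]; drop the excess */
	    string[pos++] = c;
    }
    string[pos] = '\0';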
    return (0);
}


valid(string)
    char    *string;
{
    int     i, count, pad, byte_count, char_count;
    
    count= strlen(string);

    byte_count= (*string - ' ');
    if (byte_count <= 0)	/* empty line, zero count, or control character */
        return(0);

    pad= (byte_count % 3) ? (3 - (byte_count % 3)) : 0;
    byte_count += pad;
    char_count= byte_count * 4 / 3;
    if (((count - 1) != char_count) &&
        ((count - 2) != char_count))
        return(0);

    for (i=1; i<(char_count+1); ++i)
        if ((string[i] < ' ') || (string[i] > (' ' + 64)))
            return(0);
    
    return(1);
}