w8sdz@WSMR-SIMTEL20.ARMY.MIL (Keith Petersen) (12/16/89)
Some readers of this list have reported problems uudecoding files sent
by LISTSERV (or TRICKLE in Europe) which contain a "M" at the end of
each line and on the final line just before the "end" statement.

The first character in each line defines the number of bytes of data
which follow on that line. If the uudecoder is working correctly the
trailing "M" should be ignored. It is added by LISTSERV to get around
problems of some mailers dropping trailing blanks.

Here is a freely distributable uudecode for Unix that does the "right
thing", courtesy of UC Berkeley.

Keith
--
Keith Petersen
Maintainer of SIMTEL20's CP/M, MSDOS, & MISC archives [IP address 26.2.0.74]
Internet: w8sdz@WSMR-SIMTEL20.Army.Mil, w8sdz@brl.arpa  BITNET: w8sdz@NDSUVM1
Uucp: {ames,decwrl,harvard,rutgers,ucbvax,uunet}!wsmr-simtel20.army.mil!w8sdz

---cut-here---
/*
 * Copyright (c) 1983 Regents of the University of California.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms are permitted
 * provided that the above copyright notice and this paragraph are
 * duplicated in all such forms and that any documentation,
 * advertising materials, and other materials related to such
 * distribution and use acknowledge that the software was developed
 * by the University of California, Berkeley.  The name of the
 * University may not be used to endorse or promote products derived
 * from this software without specific prior written permission.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 */

#ifndef lint
static char sccsid[] = "@(#)uudecode.c 5.5 (Berkeley) 7/6/88";
#endif /* not lint */

/*
 * uudecode [input]
 *
 * create the specified file, decoding as you go.
 * used with uuencode.
 */
#include <stdio.h>
#include <pwd.h>
#include <sys/types.h>
#include <sys/stat.h>

/* single character decode */
#define DEC(c) (((c) - ' ') & 077)

main(argc, argv)
char **argv;
{
    FILE *in, *out;
    int mode;
    char dest[128];
    char buf[80];

    /* optional input arg */
    if (argc > 1) {
        if ((in = fopen(argv[1], "r")) == NULL) {
            perror(argv[1]);
            exit(1);
        }
        argv++; argc--;
    } else
        in = stdin;

    if (argc != 1) {
        printf("Usage: uudecode [infile]\n");
        exit(2);
    }

    /* search for header line */
    for (;;) {
        if (fgets(buf, sizeof buf, in) == NULL) {
            fprintf(stderr, "No begin line\n");
            exit(3);
        }
        if (strncmp(buf, "begin ", 6) == 0)
            break;
    }
    (void)sscanf(buf, "begin %o %s", &mode, dest);

    /* handle ~user/file format */
    if (dest[0] == '~') {
        char *sl;
        struct passwd *getpwnam();
        struct passwd *user;
        char dnbuf[100], *index(), *strcat(), *strcpy();

        sl = index(dest, '/');
        if (sl == NULL) {
            fprintf(stderr, "Illegal ~user\n");
            exit(3);
        }
        *sl++ = 0;
        user = getpwnam(dest+1);
        if (user == NULL) {
            fprintf(stderr, "No such user as %s\n", dest);
            exit(4);
        }
        strcpy(dnbuf, user->pw_dir);
        strcat(dnbuf, "/");
        strcat(dnbuf, sl);
        strcpy(dest, dnbuf);
    }

    /* create output file */
    out = fopen(dest, "w");
    if (out == NULL) {
        perror(dest);
        exit(4);
    }
    chmod(dest, mode);

    decode(in, out);

    if (fgets(buf, sizeof buf, in) == NULL || strcmp(buf, "end\n")) {
        fprintf(stderr, "No end line\n");
        exit(5);
    }
    exit(0);
}

/*
 * copy from in to out, decoding as you go along.
 */
decode(in, out)
FILE *in;
FILE *out;
{
    char buf[80];
    char *bp;
    int n;

    for (;;) {
        /* for each input line */
        if (fgets(buf, sizeof buf, in) == NULL) {
            printf("Short file\n");
            exit(10);
        }
        n = DEC(buf[0]);
        if (n <= 0)
            break;

        bp = &buf[1];
        while (n > 0) {
            outdec(bp, out, n);
            bp += 4;
            n -= 3;
        }
    }
}

/*
 * output a group of 3 bytes (4 input characters).
 * the input chars are pointed to by p, they are to
 * be output to file f.  n is used to tell us not to
 * output all of them at the end of the file.
 */
outdec(p, f, n)
char *p;
FILE *f;
{
    int c1, c2, c3;

    c1 = DEC(*p) << 2 | DEC(p[1]) >> 4;
    c2 = DEC(p[1]) << 4 | DEC(p[2]) >> 2;
    c3 = DEC(p[2]) << 6 | DEC(p[3]);
    if (n >= 1)
        putc(c1, f);
    if (n >= 2)
        putc(c2, f);
    if (n >= 3)
        putc(c3, f);
}
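
[Editor's illustration] The point about the count character can be shown on a single line. The sketch below is not part of the posted program: decode_line is a hypothetical helper that reuses the DEC() arithmetic in ANSI C, reads the count from the first character, and consumes only the groups that count calls for, so a trailing "M" is never looked at. As an assumption of this sketch, a line shorter than its count requires is rejected rather than read past.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same single-character decode as the posted uudecode.c. */
#define DEC(c) (((c) - ' ') & 077)

/*
 * Decode one uuencoded line into out, which must hold at least 63 bytes
 * (the largest count one length character can express).
 * Returns the number of bytes written, or -1 if the line is shorter
 * than the count it announces.  Characters after the announced data,
 * such as a trailing "M", are never read.
 * decode_line is a hypothetical name used only for this illustration.
 */
int decode_line(const char *line, unsigned char *out)
{
    size_t len;
    int n, i, written = 0;
    const char *p;

    assert(line != NULL && out != NULL);

    len = strcspn(line, "\r\n");        /* length without the terminator */
    if (len == 0)
        return -1;

    n = DEC(line[0]);                   /* byte count for this line */

    /* n bytes need ceil(n/3) groups of 4 characters after the count */
    if (len < 1 + 4 * (size_t)((n + 2) / 3))
        return -1;

    p = line + 1;
    for (i = n; i > 0; i -= 3, p += 4) {
        out[written++] = (unsigned char)(DEC(p[0]) << 2 | DEC(p[1]) >> 4);
        if (i >= 2)
            out[written++] = (unsigned char)(DEC(p[1]) << 4 | DEC(p[2]) >> 2);
        if (i >= 3)
            out[written++] = (unsigned char)(DEC(p[2]) << 6 | DEC(p[3]));
    }
    return written;
}
```

With this helper, the lines "#0V%T" and "#0V%TM" both decode to the three bytes "Cat", because the count character "#" announces three bytes and nothing beyond the first group is read.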
kim@uts.amdahl.com (Kim DeVaughn) (12/16/89)
In article <KPETERSEN.12550482989.BABYL@WSMR-SIMTEL20.ARMY.MIL>, w8sdz@WSMR-SIMTEL20.ARMY.MIL (Keith Petersen) writes:
> Some readers of this list have reported problems uudecoding files sent
> by LISTSERV (or TRICKLE in Europe) which contain a "M" at the end of
> each line and on the final line just before the "end" statement.
>
> The first character in each line defines the number of bytes of data
> which follow on that line. If the uudecoder is working correctly the
> trailing "M" should be ignored. It is added by LISTSERV to get around
> problems of some mailers dropping trailing blanks.

Several of the newer flavors of the uutwins (uuencode/uudecode) add a couple
of characters beyond the specified line length (which is usually "M") to
provide line-by-line checksums. If these chars aren't present, the uudecode
just assumes their .uue was created by an older uuencode, and doesn't do any
checksumming.

If there are characters beyond the encoded line length, they do assume these
char(s) represent a checksum, which is what's happening in this case. Some
also do an overall length check at the EOF.

The problem with trailing blanks is also eliminated, as they use a non-blank
char in place of a blank (a ` I believe). So there are NO blanks anywhere
in the .uue. This non-blank char maps to the same decode value, so this
scheme is backwardly compatible with older versions of the uutwins, and doesn't
break anything (or at least not any of the many flavors I've ever come across).

Would you be interested in converting to the newer versions, as they do provide
some additional error checking? If so, I'll be happy to email them to you or
whomever the "right" person is at LISTSERV.

/kim
--
UUCP: kim@amdahl.amdahl.com
  or: {sun,decwrl,hplabs,pyramid,uunet,oliveb,ames}!amdahl!kim
DDD: 408-746-8462
USPS: Amdahl Corp. M/S 249, 1250 E. Arques Av, Sunnyvale, CA 94086
BIX: kdevaughn  GEnie: K.DEVAUGHN  CIS: 76535,25
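
[Editor's illustration] The claim that the non-blank character "maps to the same decode value" follows from the DEC() arithmetic in the posted uudecode.c: '`' is 64 above a blank, and the mask 077 keeps only the low six bits. The function names below (enc_classic, enc_noblank, dec, check_backtick_compat) are hypothetical and written only for this note; they are not taken from any particular uuencode.

```c
#include <assert.h>

/* Classic encoder: value 0 becomes a blank (' '). */
int enc_classic(int v)
{
    return (v & 077) + ' ';
}

/* Newer variant as described in the post: value 0 becomes '`'. */
int enc_noblank(int v)
{
    v &= 077;
    return v ? v + ' ' : '`';
}

/* Same arithmetic as the DEC() macro in the posted uudecode.c. */
int dec(int c)
{
    return (c - ' ') & 077;
}

/*
 * Self-check: every 6-bit value survives either encoder, and the
 * newer encoder never emits a blank.
 */
void check_backtick_compat(void)
{
    int v;

    for (v = 0; v < 64; v++) {
        assert(dec(enc_classic(v)) == v);
        assert(dec(enc_noblank(v)) == v);
        assert(enc_noblank(v) != ' ');
    }
}
```

Since dec(' ') and dec('`') are both 0, an older decoder reads a blank-free file exactly as it would read the classic encoding.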
usenet@cps3xx.UUCP (Usenet file owner) (12/17/89)
In article <f9ug02QP74xw01@amdahl.uts.amdahl.com> kim@uts.amdahl.com (Kim DeVaughn) writes:
%Several of the newer flavors of the uutwins (uuencode/uudecode) add a couple
%of characters beyond the specified line length (which is usually "M") to
%provide line-by-line checksums. If these chars aren't present, the uudecode
%just assumes their .uue was created by an older uuencode, and doesn't do any
%checksumming.
%
%If there are characters beyond the encoded line length, they do assume these
%char(s) represent a checksum, which is what's happening in this case. Some
%also do an overall length check at the EOF.
%
%The problem with trailing blanks is also eliminated, as they use a non-blank
%char in place of a blank (a ` I believe). So there are NO blanks anywhere
%in the .uue. This non-blank char maps to the same decode value, so this
%scheme is backwardly compatible with older versions of the uutwins, and doesn't
%break anything (or at least not any of the many flavors I've ever come across).
%
%Would you be interested in converting to the newer versions, as they do provide
%some additional error checking? If so, I'll be happy to email them to you or
%whomever the "right" person is at LISTSERV.
Yes, I am very interested. I looked around on several machines, and
none of them has the newer better flavours that do checksumming. Could
you please post the source here? Thank you.
%/kim
%--
%UUCP: kim@amdahl.amdahl.com
% or: {sun,decwrl,hplabs,pyramid,uunet,oliveb,ames}!amdahl!kim
Neither of these addresses works from Michigan State Univ. or the Univ. of Southern California.
In the rare case that original ideas    Kenneth J. Hendrickson N8DGN
are found here, I am responsible.       Owen W328, E. Lansing, MI 48825
Internet: kjh@usc.edu UUCP: ...!uunet!usc!pollux!kjh
kevin@kosman.UUCP (Kevin O'Gorman) (12/18/89)
In article <KPETERSEN.12550482989.BABYL@WSMR-SIMTEL20.ARMY.MIL> w8sdz@WSMR-SIMTEL20.ARMY.MIL (Keith Petersen) writes:
>Some readers of this list have reported problems uudecoding files sent
>by LISTSERV (or TRICKLE in Europe) which contain a "M" at the end of
>each line and on the final line just before the "end" statement.
>
>The first character in each line defines the number of bytes of data
>which follow on that line. If the uudecoder is working correctly the
>trailing "M" should be ignored. It is added by LISTSERV to get around
>problems of some mailers dropping trailing blanks.

Some clarification is necessary, because this doesn't sound like the
whole story.

The UUDECODE and UUENCODE that I have on my clone work very well with
the ones on my UNIX machine. Neither of them is very happy with the
files from SIMTEL20. The ones on MSDOS have a bunch of help screens
that include this additional information:

After explaining the basic encoding technique, they comment that since
some mailers attempt to munge space characters in one way or another,
all spaces are converted to back-tick (`) characters, which are not
otherwise used in the encoding. The SIMTEL encoder does not seem to do
this, and instead ends every line with an "M", which at least avoids
problems with mailers that strip trailing blanks.

They also comment that the last character of each line is a checksum,
and explain a bit about how that is computed. The checksum falls in
the same position used by the trailing "M" in the SIMTEL encodings,
thus the decoders that I use think there's a checksum error on nearly
every line. The one on MSDOS at least notices that this is happening
very quickly and asks if I want to just ignore the checksum problems.

I only have these two decoders to go by. I have no idea what's common
in the BSD world, and I can only guess how much like SYSV my software
set is, so I'm not going to claim that this is the best and most modern
stuff there is, but it sure sounds like a more robust scheme than what's
being used at SIMTEL20. If it is also pretty standard, I would hope
that SIMTEL20 would begin to use this scheme.
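
[Editor's illustration] The collision described here comes down to line length: a line announcing n data bytes needs one count character plus four characters per three bytes, and anything after that is either padding (the LISTSERV "M") or a checksum, depending on who wrote the file. The sketch below only measures the surplus. It makes no attempt to verify a checksum, since the thread does not say how one is computed. expected_len and extra_chars are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same single-character decode as the posted uudecode.c. */
#define DEC(c) (((c) - ' ') & 077)

/*
 * Number of characters a line announcing n data bytes should contain:
 * the count character plus 4 characters for every 3 bytes (rounded up).
 */
size_t expected_len(int n)
{
    return 1 + 4 * (size_t)((n + 2) / 3);
}

/*
 * Returns how many characters follow the announced data on this line:
 * 0 for a classic line, 1 for a trailing "M" or a one-character
 * checksum.  Returns -1 if the line is shorter than its count requires.
 */
int extra_chars(const char *line)
{
    size_t len, need;

    assert(line != NULL);

    len = strcspn(line, "\r\n");        /* length without the terminator */
    if (len == 0)
        return -1;

    need = expected_len(DEC(line[0]));
    if (len < need)
        return -1;
    return (int)(len - need);
}
```

A full line starts with the count character "M" (45 bytes), so expected_len(45) is 61; a 62-character line of that kind carries one surplus character, which a checksumming decoder will try to interpret.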
w8sdz@smoke.BRL.MIL (Keith Petersen) (12/19/89)
There has been some discussion about "the uuencode method that
SIMTEL20 uses."

SIMTEL20 does not uuencode files. LISTSERV uuencodes files.

--> SIMTEL20 does NOT run the LISTSERV and TRICKLE netmail servers!

Please complain to the administrators of the LISTSERV or TRICKLE you
are using.

For VM1.NODAK.EDU: Info@VM1.NODAK.EDU
For VM.ECS.RPI.EDU: FISHER@VM.ECS.RPI.EDU

If you are on BITNET:
For NDSUVM1: Info@NDSUVM1
For RPIECS: FISHER@RPIECS

This won't get changed unless you complain to the right people!

Keith
--
Keith Petersen
Maintainer of SIMTEL20's CP/M, MSDOS, & MISC archives [IP address 26.2.0.74]
Internet: w8sdz@WSMR-SIMTEL20.Army.Mil, w8sdz@brl.arpa  BITNET: w8sdz@NDSUVM1
Uucp: {ames,decwrl,harvard,rutgers,ucbvax,uunet}!wsmr-simtel20.army.mil!w8sdz
saj@chinet.chi.il.us (Stephen Jacobs) (12/19/89)
Keith Petersen suggested that people take up TRICKLE-related problems with the
administrators of the machines running TRICKLE. As the victim of some really
massive screw-ups originating in a nearby BITNET-internet gateway, I'd add
that you may want to check the path things reach you by, and keep the
postmasters of intermediate machines informed of things that may be happening
to mail at their sites. EBCDIC-ASCII translation still messes things up now
and then.

Steve J.
bob@atom.OZ (Bob Backstrom) (12/22/89)
One thing to watch out for with the C version of uudecode posted a little
while back is how you open the output file under Turbo-C. Change the "w" to
"wb" to open in BINARY mode. Otherwise, you'll have the great fun of every
0ah -> 0dh, 0ah making executables run for about a millisecond before
jumping off to never-never land.

Cheers.
--
* ACSNET: bob@atom.oz      Bob Backstrom,
* Phone: (02) 543-3092     Australian Nuclear Science & Technology Organisation,
*                          Private Mailbag 1, Menai,
*                          New South Wales, Australia, 2234.
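
[Editor's illustration] A minimal sketch of the change described above, using only standard C stdio calls. write_binary and binary_size are hypothetical helpers written for this note, not part of the posted uudecode. On Unix the "b" in the mode string has no effect, so the same source should behave identically there, while under DOS compilers it suppresses the 0ah to 0dh, 0ah expansion.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write len raw bytes to path in binary mode.
 * Returns 0 on success, -1 on failure.
 */
int write_binary(const char *path, const unsigned char *data, size_t len)
{
    FILE *out;
    size_t written;

    assert(path != NULL && data != NULL);

    out = fopen(path, "wb");            /* "wb", not "w": no newline translation */
    if (out == NULL)
        return -1;
    written = fwrite(data, 1, len, out);
    if (fclose(out) != 0)
        return -1;
    return written == len ? 0 : -1;
}

/* Size of the file in bytes when read back in binary mode, or -1. */
long binary_size(const char *path)
{
    FILE *in;
    long n = 0;

    assert(path != NULL);

    in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    while (getc(in) != EOF)
        n++;
    fclose(in);
    return n;
}
```

Writing the four bytes 4dh, 0ah, 5ah, 0ah this way and reading them back should give a size of 4 on every platform; with "w" on a DOS compiler the file would be expected to grow to 6 bytes.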