btb@ncoast.UUCP (01/29/87)
I have d/l'd the hexbin and binhex programs, and they work, but, as we all know, they are very slow, and the binhex process doubles the size of your file for transmitting... The uudecode program has apparently been causing some problems, too... I have d/l'd it, but I haven't tried it yet...

I would like to propose that if somebody would post the uudecode algorithms, I will write uudecode/encode in Deep Blue C... binhex it to the net, so everyone can get it... this way at least it will be fairly fast, and I could also make the DBC source available for others to monkey with... Please send the algorithm description to me...

Thanks.
--
Brad Banko
...!decvax!cwruecmp!ncoast!btb
Cleveland, Ohio

"The only thing we have to fear on this planet is man."
-- Carl Jung, 1875-1961
jhs@MITRE-BEDFORD.ARPA.UUCP (01/30/87)
Brad:

Attached is the uuencode/decode definition and a C program for it.

I have fixed the uudecode that I posted, so I think it will now work correctly. This version is in BASIC, so even those who don't have a C compiler can use it. Also, the inner workings are in machine language, so it is already fairly fast, possibly faster than your C version will be.

I am (slowly) working on a version which does encoding as well. It may be a couple of months until I finish it, though, at the present rate of distractions.

Uuencoding has its problems. Some of the characters in the character set get changed by some hosts. Lines ending in blanks sometimes get shortened in transit. I have a file of reports of such problems which you should probably read if you are serious about doing another version.

-John Sangster
 jhs@mitre-bedford.arpa

----------------i-n-f-o---o-n---u-u-e-n-c-o-d-e-/-d-e-c-o-d-e-----------------
From: randy@NLM-VAX.arpa (Rand Huntzinger)
Organization: National Library of Medicine, Bethesda, Md

------------

Uudecode reads a file of the following format:

    header line (1)    ->  begin <mode> <filename>
    data lines (many)  ->  <length> <data>
    trailer line (1)   ->  end

where:

The header line fields contain:

    <mode>      is a three digit octal number specifying a Unix file mode
                (it specifies who can read and write the file). You can
                ignore this on a non-Unix machine.
    <filename>  is the name of the file encoded below. You can use this
                if it is compatible with file names on your system.

The data lines contain:

    <length>    is one character generated by adding the number of bytes
                encoded on this line to 32 (ASCII space).
    <data>      is the (<length> - 32) bytes of binary data, encoded into
                text to produce 4 bytes of text for every 3 bytes of
                binary data as follows:

Input:      Byte 1             Byte 2             Byte 3
        7 6 5 4 3 2 1 0    7 6 5 4 3 2 1 0    7 6 5 4 3 2 1 0
        \         / \            / \            / \         /
         ----+----   -----+------   ------+-----   ----+----
             |            |               |            |
           + 32         + 32            + 32         + 32
             |            |               |            |
             V            V               V            V
Output:   Byte 1        Byte 2          Byte 3       Byte 4

In other words, encoding involves taking 3 bytes, breaking them into 4 six-bit chunks, adding 32 to each six-bit value to make it an ASCII character, and outputting them.

To decode, you reverse this. Strip the parity bit, subtract 32 from each byte and repack into three bytes by shifting and or'ing the pieces. I don't think I'll spell this out, since it is an obvious reversal of the above steps.

The last line in the section simply says 'end'. The file may contain junk before the begin statement and after the end statement, so the decoder usually skips until it sees begin, extracts the file name and decodes until it sees the end line.

Be sure when you implement it that you use the length byte to determine the length of the decoded text, since stuff going through news sometimes gets padding added to the text. Also, you need to do this if you want to get the file length correct.

I've included a posted uudecode source for the Atari 520 ST below. I've never used it, but it does give you something to look at. I don't know whether you read C, but it might still be of help. There is nothing which says it's copyrighted, so I assume it's public domain. If not, the copyright was removed before I saw it. Some of the cruft in it indicates it was originally written for Unix. I seem to have lost the credits on this one; I don't see who posted it.

===========================================================================
/*
 * uudecode input
 *    Modified for the ST - cannot use putc because of CR/LF problems
 *
 * create the specified file, decoding as you go.
 * used with uuencode.
 */
#include <stdio.h>
#include <osbind.h>

int _isconio;

#define NULL 0

/* single character decode */
#define DEC(c)  (((c) - ' ') & 077)

int outfile;            /* File descriptor of output file */
char outbuf[BUFSIZ];    /* Output buffer for my character out code */
char *nextc = outbuf;   /* Pointer to the next character */

#define lputc(outchar) {*nextc++ = outchar; if (nextc >= &outbuf[BUFSIZ]) do_write();}

main(argc, argv)
char **argv;
{
    FILE *in;
    FILE *fopen();
    int mode;
    char dest[128];
    char buf[80];

    _isconio = 1;

    /* mandatory input arg */
    if (argc != 2) {
        printf("Usage: uudec filename\n");
        exit(1);
    }

    if ((in = fopen(argv[1], "r")) == NULL) {
        printf("Cannot open: %s\n", argv[1]);
        exit(1);
    }
    _isconio = 0;

    /* search for header line */
    for (;;) {
        if (fgets(buf, sizeof buf, in) == NULL) {
            printf("No begin line\n");
            exit(3);
        }
        if (strncmp(buf, "begin ", 6) == 0)
            break;
    }
    sscanf(buf, "begin %o %s", &mode, dest);

    /* handle ~user/file format */
    if (dest[0] == '~') {
        printf("Cannot handle user formats\n");
        exit(1);
    }

    /* create output file */
    if ((outfile = Fcreate(dest, 0)) < 0) {
        /* report the name we tried to create; argv[2] is never supplied */
        printf("Cannot create: %s\n", dest);
        exit(1);
    }

    decode(in);

    if (fgets(buf, sizeof buf, in) == NULL || strcmp(buf, "end\n")) {
        printf("No end line\n");
        exit(5);
    }
    exit(0);
}

/*
 * copy from in to outfile, decoding as you go along.
 */
decode(in)
FILE *in;
{
    char buf[80];
    char *bp;
    int n;

    for (;;) {
        /* for each input line */
        if (fgets(buf, sizeof buf, in) == NULL) {
            printf("Short file\n");
            exit(10);
        }
        n = DEC(buf[0]);    /* byte count from the length character */
        if (n <= 0)
            break;

        bp = &buf[1];
        while (n > 0) {
            outdec(bp, n);
            bp += 4;
            n -= 3;
        }
    }
    do_write();
}

/*
 * output a group of 3 bytes (4 input characters).
 * the input chars are pointed to by p, they are to
 * be output to file f. n is used to tell us not to
 * output all of them at the end of the file.
 */
outdec(p, n)
char *p;
{
    int c1, c2, c3;

    c1 = DEC(*p) << 2 | DEC(p[1]) >> 4;
    c2 = DEC(p[1]) << 4 | DEC(p[2]) >> 2;
    c3 = DEC(p[2]) << 6 | DEC(p[3]);
    if (n >= 1)
        lputc(c1);
    if (n >= 2)
        lputc(c2);
    if (n >= 3)
        lputc(c3);
}

do_write()
{
    long bytect;

    if (nextc != &outbuf[0]) {
        bytect = nextc - &outbuf[0];
        if (Fwrite(outfile, bytect, outbuf) < 0) {
            printf("Write error on output file\n");
            exit(1);
        }
        nextc = outbuf;
    }
}
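Since the posted source only covers decoding, here is a rough sketch of the encoding direction described in the format note above (3 bytes in, 4 six-bit values out, 32 added to each). It is written in present-day C rather than the K&R style of the posted program, and the name `uu_encode_line`, its buffer contract and the limit of 45 bytes per line are assumptions made for illustration, not part of any posted program.

```c
#include <assert.h>
#include <stddef.h>

/* Map a six-bit value (0..63) to a printable character by adding 32. */
#define ENC(v) ((char)(((v) & 077) + ' '))

/*
 * uu_encode_line (hypothetical helper, not from the posted source):
 * encode n bytes (n <= 45, an assumed per-line limit) from src into dst
 * as one data line, without the trailing newline.
 * dst needs room for 62 chars: 1 length char + 60 data chars + NUL.
 * Returns the number of characters written, not counting the NUL.
 */
size_t uu_encode_line(const unsigned char *src, size_t n, char *dst)
{
    size_t i, out = 0;

    assert(src != NULL && dst != NULL && n <= 45);

    dst[out++] = ENC(n);                /* length character: count + 32 */
    for (i = 0; i < n; i += 3) {
        /* Bytes missing from the last group are treated as zero. */
        unsigned b0 = src[i];
        unsigned b1 = (i + 1 < n) ? src[i + 1] : 0;
        unsigned b2 = (i + 2 < n) ? src[i + 2] : 0;

        dst[out++] = ENC(b0 >> 2);                  /* byte 1, bits 7-2 */
        dst[out++] = ENC((b0 << 4) | (b1 >> 4));    /* byte 1 bits 1-0, byte 2 bits 7-4 */
        dst[out++] = ENC((b1 << 2) | (b2 >> 6));    /* byte 2 bits 3-0, byte 3 bits 7-6 */
        dst[out++] = ENC(b2);                       /* byte 3, bits 5-0 */
    }
    dst[out] = '\0';
    return out;
}
```

For example, the three bytes "Cat" come out as the line "#0V%T". A single byte "C" comes out as "!0P" followed by two blanks, which is exactly the kind of line that loses its tail when a mailer strips trailing blanks.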
jhs@MITRE-BEDFORD.ARPA.UUCP (01/30/87)
Here are some of the comments I have collected on uuencode/decode problems. Many of them come from the atari16 news group, which is actively using uuencoding/decoding. I have ended the file with a "wish list" for what a uuencode/decode program should ideally do.

Comments are solicited -- If somebody is going to write a really GOOD uuencode/decode package, it might as well handle all known problems in the best way we can think of.

-John S.
------------------------------------------------------------------------------
From: Mike Vederman <ACS19%UHUPVM1.BITNET@WISCVM.WISC.EDU>
Subject: finally, i've figured uudecode ...
To: ST Users <INFO-ATARI16@SU-SCORE.ARPA>

After many, many problems with BITNET, IBM machines and uudecode, I have found the true answer to having all of my files uudecode. Every single file which has bombed in the past, now decodes perfectly.

The normal sequence that I do is to send the file over from the IBM machine (actually we have an AS/9000N) to the AT&T 3B20 and uudecode on the 3B20. Previously, the only two files which worked were UNITERM and uEmacs. Now, however, I have found the solution, but I am uncertain as to the exact cause.

Here is what I do. First, while in Xedit, I set the logical record length to 61, then I set the record format to fixed. The following commands do this:

    set lrecl 61
    set recfm f

Now, when I send the file to the 3B20, I enter the command:

    sf file uue to acs19 at uhnix1

To receive on the 3B20, I enter the command:

    receive $job/pnch6 >file.uue

(since the 3B20 is hooked up as a punch machine (yuck). What you do may be different, but I know setting the logical record length and the record format have made all of my files work.)

When I get to the 3B20, I do have to fix the file a little bit. I go to the last line and delete the blanks to end of line, after 'end'. Then I go up one line and make sure that there is only 1 (one) blank on the second to the last line. Then I save the file and uudecode it.
All the files I have tried thus far have worked, and most of these have always bombed.

Mike
------------------------------------------------------------------------------
From: <MHD@slacvm.bitnet>
Reply-To: MHD%SLACVM.BITNET@forsythe.stanford.edu

Like others, I had little success at first in decoding the uuencoded files on this net. After looking closely at the files I noticed that the tilde character (~) was present when it should not have been. Upon further inspection I found that the caret character (^) was missing in all files. I tried a global change of tilde to caret and had no further trouble.

Perhaps many others have been screwed up in the same way by IBM machines in the net. Perhaps at other sites just one other character is screwed up. Give it a try, preferably first on a .ARC file, where deARCing will show errors on the CRC checks.

The only characters that should be in the uuencoded files are space ( ) through underline (_), ASCII hex 20 through 5F. All of these should be present in a file of reasonable length.

Note: The tilde and caret may get screwed up in this file. They are ASCII hex 7E and 5E respectively.
------------------------------------------------------------------------------
From: XBR1Y049%DDATHD21.BITNET@WISCVM.WISC.EDU (Stephan Leicht c/o HRZ TH Darmstadt, Germany)
Subject: UUDECODE difficulties / Translation tables

I found the cause of my trouble running some uuencoded programs coming over the net. The translation of the not-char/circumflex (^) char is not unique. When it is translated to EBCDIC 5Fhex and sent over some gateways, it does not always return as EBCDIC 5Fhex; sometimes it returns as EBCDIC 71hex. (I saw this when sending it to ucbvax and back.) Now I have told our translation table to recognize 71hex as 5Fhex too. Strange!

Maybe this is a hint for those having no success in uudecoding.
Stephan

Name         : Stephan Leicht
Organisation : Computer Center of Technical University Darmstadt, Germany
Bitnet       : XBR1Y049@DDATHD21

                       insert all usual & unusual disclaimers here -->
------------------------------------------------------------------------------
From: mcvax!ukc!dcl-cs!bath63!pes@seismo.css.gov (Paul Smee)
Organization: Bath University, England
Subject: Re: Lattice C and UUDECODE

I've found several problems with uudecode, which are caused by the uuencoded file being munged by 'terminal handlers' en route to me. The symptoms sound like some I've seen complained about, specifically that the file appears to decode, but bombs.

On looking at the files, I found 2 different sorts of problem. First, spurious 'control chars' get inserted in some cases. In fact, these appear to always be NULs (0x0) put in as 'delay padding' by someone's terminal handling software. Second, trailing spaces get stripped.

It is fairly trivial to modify uudecode to ignore all control chars (less than ASCII space) -- and harmless, as they should not be in a uuencoded file. Then, pad the line out to an arbitrarily large length using spaces (I simply tack 64 spaces on the end, in my buffer, before decoding a line). Appending 64 spaces is crude and inelegant, I know, but it ensures that there will be enough in all circumstances, and is a much simpler mod than actually looking to see how many are needed -- especially since the line may contain as yet unprocessed control chars to be thrown away.

Since making these trivial changes I've had no problems with uudecoded files. Hope this helps someone...
------------------------------------------------------------------------------
From: <RDROYA01%ULKYVX.BITNET@WISCVM.WISC.EDU> (Robert Royar)
Organization: University of Louisville
Subject: Another version of uue/decode offered (repost)

I have a version of uue/decode that I have been testing and that seems to avoid some of the problems with the present setup.
I added some code to the original programs to do some error checking. The format of the files the encoder uses is similar to the current version. If you have the current version, you can delete the extra information with your word processor. My version adds the following things:

1. It breaks up long files so that each uuencoded file is no more than 300 lines long.

2. It places an include directive at the end of each file at the break so the companion program can reassemble the files into a program.

3. It adds a key table at the beginning of each file so that the decoder can index into this table to find the value for letters in the file. I discovered that the errors I had with uudecode were often because a mailer had changed each instance of a character to something else. This solves that problem.

4. It checks the key table for integrity before it decodes the file. This way if both 'H' and 'Z' have been changed to ' ', it will exit and tell you.

5. If it finds a letter in the file that was not in the key table, it issues a warning telling what the character was, the filename, and the current file position.

6. It appends a 'part' letter (a-z) at the end of each output line to avoid truncation problems.

This program makes uue/decoding somewhat simpler than it has been because it breaks the files up into parts automatically and reassembles them. It actually is smaller and runs faster than the net version of the program because I used freopen() to cut down on file pointers and I used register variables where possible.

If you are interested in this program, drop me a line and I will post it to you.

Robert Royar
rdroya01@ulkyvx.bitnet

<NOTE (jhs) - This program runs on the ST and would need work to port to the XL/XE/800 series.>
------------------------------------------------------------------------------
Summary of desirable uudecode features (J. Sangster 30 Jan 1986):

1. Should check for characters outside the range $20 to $5F, which are not legal uuencode output and indicate a corrupted file.

2. If found, $7E (~) should be changed to $5E (^).

3. If found, $71 should be changed to $5F (according to Stephan Leicht, who reported these as EBCDIC codes).

4. The decoder should be insensitive to loss of trailing blanks at the end of a line. I.e. if the line is shorter than the indicated number of bytes (determined from the first character, M or whatever), it should assume that any missing characters were blanks, $20, which will uudecode to nulls ($00) after subtracting $20.

5. If any nulls ($00) are found in the uuencoded file, they should be deleted before processing.

6. The encoder should never generate files longer than some maximum size. One person suggested 300 lines. I think 32K is acceptable to most mailers but room should be left for comments at the top.

7. The decoder should reassemble the parts of a file automatically if it was broken into pieces.

8. The decoder should ignore comments before the begin line and after the end line.

9. The "translation table" idea -- inserting a list of all possible legal characters and checking what is received for uniqueness -- is a terrific idea!

10. Since unix systems name the decoded output file from the info in the begin line, and also set protection codes from it, we Atarians should adopt some conventions related to these procedures. I suggest:

    (i) If the protect code is less than 700, lock the output file. (On a unix system, the first digit is owner rights and 7 is "all rights", so this convention makes you think if the owner is not supposed to have all rights.)

    (ii) The decoder should display the filename embedded in the file and ask if it is OK. User can enter a <RETURN> to accept it or a new filename if desired.

11. While we are at it, it would be nice to be able to handle sector encoding and reconstruction of whole disks.

That's all I can think of. Anybody have any additional ideas?

-John S.
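Items 1, 2, 4 and 5 of the wish list could be handled by cleaning each received line before it reaches the decoder, roughly as sketched below. This is only an illustration and not code from any of the posters: the name `uu_repair_line` and its return convention are made up here, and the limit of 45 bytes per line is an assumption based on the usual 'M' length character. Item 3 is left out because the $71/$5F report concerns EBCDIC codes rather than the ASCII stream.

```c
#include <assert.h>
#include <stddef.h>

/*
 * uu_repair_line (hypothetical helper): clean one received data line
 * before decoding. `in` holds in_len raw bytes without the newline;
 * `out` must have room for 62 chars (1 length + 60 data + NUL).
 * Returns the number of characters still outside $20..$5F after repair,
 * or -1 if the length character names more than 45 bytes (assumed limit).
 */
int uu_repair_line(const char *in, size_t in_len, char *out)
{
    size_t i, o = 0, need;
    int count, bad = 0;

    assert(in != NULL && out != NULL);

    for (i = 0; i < in_len && o < 61; i++) {
        unsigned char c = (unsigned char)in[i] & 0x7F;  /* strip parity bit */
        if (c < 0x20)
            continue;       /* item 5: drop NULs and other control chars */
        if (c == 0x7E)
            c = 0x5E;       /* item 2: '~' back to '^' */
        out[o++] = (char)c;
    }
    if (o == 0)
        out[o++] = ' ';     /* an empty line is a stripped zero-length line */

    count = out[0] - ' ';   /* bytes encoded on this line */
    if (count > 45)
        return -1;

    /* item 4: 4 data chars per 3 bytes; missing ones were blanks */
    need = 1 + 4 * (((size_t)count + 2) / 3);
    while (o < need)
        out[o++] = ' ';
    out[need] = '\0';       /* anything beyond `need` is padding; cut it */

    for (i = 0; i < need; i++)      /* item 1: flag illegal characters */
        if ((unsigned char)out[i] > 0x5F)
            bad++;
    return bad;
}
```

Padding only up to the length implied by the first character is a variation on Paul Smee's tack-on-64-spaces trick; it costs one extra computation but also discards padding added in transit. A nonzero return value would be the place to hook in a warning of the kind Robert Royar's version prints.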