[comp.binaries.ibm.pc.d] UUencode vs. XXencode

magnus@THEP.LU.SE (Magnus Olsson) (07/17/90)

What's the difference between uuencode and xxencode, and which encoding
scheme is the "best" for mailing binaries (i.e. which one produces the smallest
output files?)

+=====================================================+
| Magnus Olsson		     	| \e+ 	   /_	      |
| Dept. of Theoretical Physics 	|  \  Z	  / q	      |
| University of Lund	     	|   >----<	      |
| Solvegatan 14 a		|  /	  \===== g    |
| S-223 62 LUND, Sweden 	| /e-	   \q	      | 
+===============================+=====================+
|     "Theoretical Physicists don't do it, they run   |
|  	     a Monte Carlo simulation"		      |
+=====================================================+

dmm0t@hudson.acc.Virginia.EDU (David M. Meyer) (07/17/90)

In article <9007171240.AA00322@thep.lu.se> magnus@THEP.LU.SE (Magnus Olsson) writes:
>What's the difference between uuencode and xxencode, and which encoding
>scheme is the "best" for mailing binaries (i.e. which one produces the smallest
>output files?)

As I understand it, both uuencode and xxencode do basically the same
thing:  convert 3 8-bit bytes (i.e. binary data) into 4 6-bit characters
(i.e. text).  Therefore, they create the same size output files.  
I've been told that (somehow) xxencode is superior, but since uuencode
is much more universally available, uuencode is the method of choice.
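
The size claim can be checked with a little arithmetic. The following is a minimal Python sketch, not code from either program: the helper names split_into_sextets and encoded_chars are made up for illustration, and the count ignores the per-line length character and newlines that both formats add. Each group of 3 bytes (24 bits) is regrouped into four 6-bit values, and only after that step does the choice of character table matter, so the output size is the same for both schemes.

```python
# Illustrative sketch; helper names are hypothetical, not from uuencode/xxencode sources.

def split_into_sextets(three_bytes: bytes) -> list[int]:
    """Regroup 3 bytes (24 bits) into four 6-bit values, most significant first."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly 3 bytes")
    n = int.from_bytes(three_bytes, "big")
    return [(n >> shift) & 0x3F for shift in (18, 12, 6, 0)]

def encoded_chars(n_bytes: int) -> int:
    """Characters needed for n_bytes of data, ignoring line framing."""
    groups = -(-n_bytes // 3)  # ceiling division: a partial group is padded
    return 4 * groups

print(split_into_sextets(b"Cat"))  # [16, 54, 5, 52]
print(encoded_chars(3000))         # 4000, i.e. about one third larger
```

Nothing in either function depends on which 64 characters are used, which is why the two encoders produce files of equal size.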

For a more complete, technical explanation of the difference between
the two, look in the documentation of Richard Marx' excellent
PC version of uuencode and uudecode, UUEXE402.ZIP (available on
simtel20 and wuarchive).


--
David M. Meyer                                       dmm0t@virginia.edu
Department of Mechanical and Aerospace Engineering
University of Virginia
Charlottesville, Virginia                            (804) 924-7926

djb@wjh12.harvard.edu (David J. Birnbaum) (07/18/90)

Someone asks:

>What's the difference between uuencode and xxencode, and which encoding
>scheme is the "best" for mailing binaries (i.e. which one produces the smallest
>output files?)

To which someone else replies:

>As I understand it, both uuencode and xxencode do basically the same
>thing:  convert 3 8-bit bytes (i.e. binary data) into 4 6-bit characters
>(i.e. text).  Therefore, they create the same size output files.  
>I've been told that (somehow) xxencode is superior, but since uuencode
>is much more universally available, uuencode is the method of choice.

The methods have the same results as long as the files must pass only
through systems with identical character mapping.  Unfortunately, there
are several flavors of EBCDIC and substantial differences between
EBCDIC and ASCII, which means that unlike machines try to translate
characters to their own mapping.  The translations are not one-to-one,
which means that characters may be conflated, producing a corrupt file.
Xxencode uses a more restricted character set, limited to characters
that are coded identically in ASCII and all flavors of EBCDIC.  Thus,
xxencoded files can be transferred safely anywhere.  Uuencoded files
are frequently corrupted.

There are a few other encoding systems, including util3, abe/dabe, and
ascify.

--David
==================================================================
David J. Birnbaum                 djb@wjh12.harvard.edu [Internet]
                                  djb@harvunxw.bitnet [Bitnet]

ace@sund.cc.ic.ac.uk (Andriko del Saludo) (07/18/90)

In article <9007171240.AA00322@thep.lu.se> magnus@THEP.LU.SE (Magnus Olsson) writes:
>What's the difference between uuencode and xxencode, and which encoding
>scheme is the "best" for mailing binaries (i.e. which one produces the smallest
>output files?)
>
 Actually both programs use the same scheme and generate the same size.
 xxencode, however, uses a more limited character set. This is GOOD!!
 because uuencoded files can occasionally be corrupted. The situation is
 that if a uue file hits an IBM mainframe on BITNET then caret (^) is
 translated into tilde (~) because of the ASCII to EBCDIC to ASCII
 translation. xxencode does not suffer from this and, therefore, to me
 is superior. However most people are stuck with uuencode and therefore
 one has to use it. Tough....
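
 The effect of the caret-to-tilde translation can be simulated. This is a
 hedged Python sketch: the gateway is modeled as a plain character
 substitution, and the decoder uses the "(c - 32) & 63" masking idiom that
 many uudecode implementations are assumed to use (a stricter decoder might
 reject the tilde outright instead). The names uu_value and decode_group are
 hypothetical.

```python
# Simulated ASCII -> EBCDIC -> ASCII damage: '^' comes back as '~'.
GATEWAY = str.maketrans("^", "~")

def uu_value(ch: str) -> int:
    """Map a uuencode character to its 6-bit value (masking idiom, an assumption)."""
    return (ord(ch) - 32) & 0x3F

def decode_group(four_chars: str) -> bytes:
    """Turn four uuencode characters back into 3 bytes."""
    a, b, c, d = (uu_value(ch) for ch in four_chars)
    return ((a << 18) | (b << 12) | (c << 6) | d).to_bytes(3, "big")

sent = "^^^^"                      # four sextets of value 62
received = sent.translate(GATEWAY)

print(decode_group(sent).hex())      # fbefbe
print(decode_group(received).hex())  # 79e79e, silently different data

# An xxencoded group contains no caret, so the same gateway leaves it alone.
print("yyyy".translate(GATEWAY) == "yyyy")  # True
```

 With the masking decoder the damage is silent: the file decodes without
 complaint but the bytes are wrong.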

 ace


---------------------------------------------------------------------
- Andreas C. Enotiadis (ace@cc.ic.ac.uk, ace@grathun1.earn, etc)    -
- (I'm still thinking about something clever to put here...)        -
---------------------------------------------------------------------

magnus@THEP.LU.SE (Magnus Olsson) (07/19/90)

Thanks to everybody who answered my question, both here and by e-mail.

Magnus

readdm@walt.cc.utexas.edu (David M. Read) (07/30/90)

In article <37392@shemp.CS.UCLA.EDU> wales@CS.UCLA.EDU (Rich Wales) writes:
>
>I must disagree.
>
> [discussion of why uuencode gets munged by EBCDIC converters]
>
>Getting all affected machines to "fix" their ASCII/EBCDIC translation
>tables is a gargantuan task that has essentially been given up as
>hopeless.
>
>If we aren't willing to use something as completely different as Brad
>Templeton's ABE, we should at least change to "xxencode".  "xxencode" is
>precisely the same as "uuencode", except that the 64 characters used to
>encode the file are taken from a set that is known to be immune to
>ASCII/EBCDIC mangling (specifically, "xxencode" uses upper and lower
>case letters, digits, and the plus and minus signs).
>
>It would be trivial to add an option to a "uuencode" clone to let it
>support "xxencode" as well -- since, as I said, the file structure is
>utterly identical except for the character code.
>

It is for exactly this reason that I have started to add xxencode 
capability to my new code...a beta version ran last night and
will be available sometime in the next week or so.
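
The quoted point that the two formats differ only in the character table can be sketched in Python. This is an illustration under stated assumptions, not the code mentioned above: encode_line and decode_line are hypothetical names, the uuencode table uses grave for value 0, the xxencode table is assumed to be ordered plus, minus, digits, upper case, lower case, and only a single data line is handled (no "begin"/"end" framing).

```python
# One encoder, two tables: the table is the only thing that changes.
UU_TABLE = "`" + "".join(chr(32 + i) for i in range(1, 64))
XX_TABLE = "+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def encode_line(data: bytes, table: str) -> str:
    """Encode up to 45 bytes: a length character, then 4 characters per 3 bytes."""
    if len(data) > 45:
        raise ValueError("at most 45 bytes per line")
    padded = data + b"\0" * (-len(data) % 3)   # pad the last group with zeros
    out = [table[len(data)]]
    for i in range(0, len(padded), 3):
        n = int.from_bytes(padded[i:i + 3], "big")
        out.extend(table[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0))
    return "".join(out)

def decode_line(line: str, table: str) -> bytes:
    """Inverse of encode_line for the same table."""
    values = [table.index(ch) for ch in line]
    length, body = values[0], values[1:]
    raw = bytearray()
    for i in range(0, len(body), 4):
        a, b, c, d = body[i:i + 4]
        raw += ((a << 18) | (b << 12) | (c << 6) | d).to_bytes(3, "big")
    return bytes(raw[:length])

print(encode_line(b"Cat", UU_TABLE))  # #0V%T
print(encode_line(b"Cat", XX_TABLE))  # 1Eq3o
```

Switching formats is then a matter of passing a different table, for example decode_line(encode_line(data, XX_TABLE), XX_TABLE).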

However...I have great problems with blaming the conversion problems
on uuencode...the fault lies in these faulty EBCDIC converters.  I
must confess general ignorance of EBCDIC, but I can't see that writing
a simple look-up-table conversion routine would be all that difficult.
It seems to me that if the rest of the world is using one standard that
it's pretty silly for a small group to use another standard if they
intend to interact with the world...and that it rests with the smaller
group to deal with the problems, rather than the larger group.  Recall
Spock's words from Trek II: "The needs of the many outweigh the needs
of the few..." :-)

At any rate, I think that the majority will decide here...if the  
capability for xxencode exists and is widely available, and there
is a general demand for its use, then I imagine that it will 
spontaneously become the new standard.  In general I have no preference
for uuencoding vs. xxencoding...that's why I'll include it in the next
version of my code.

I still think that uuencode will remain the standard, though; there are
too many people out there with UNIX boxes which already have uuencode 
installed on them, and they're too lazy to go converting to a new 
program.  I have watched the posts come and go over the last 2 years,
and I have *never* seen an xx-encoded post...they've all been uuencoded!

So I cast this discussion to the users...by choosing which method you use
to post, you will choose which method the rest of the world uses to decode.

Enjoy it either way, and let's try to keep this debate friendly!  So
far it's been great fun!


-Dave                                 | LAMPF and UT don't believe that 
 Dave Read: read@lampf.lanl.gov       | their people have opinions.  Who 
            read@physics.utexas.edu   | am I to disillusion them? 
            readdm@ccwf.cc.utexas.edu | #include <cutepicture.h>  

brad@looking.on.ca (Brad Templeton) (07/30/90)

I just threw an XXENCODE mode into ABE/DABE.  Which means that ABE can
produce xxencode type files with the ABE extra information, and DABE
can decode them, or an xxdecoder can decode them.

Note that in my surveys, the following bytes were the ones that people
reported trouble with in EBCDIC translations.  ABE2 avoids these
as well:

	! ` [ \ ] ^ { | } ~

Note that "!" is in the set because of a bug in "dd" on some UNIX
machines, and some ebcdic translation out there is being done by dd.

Space and DEL are also in the no-no set.  Although space is usually
preserved these days, you can't count on leading and trailing space
surviving in some systems.

Most UUENCODEs used to use space for 0, but they ran into such troubles
and switched to grave (`), which also has troubles in some areas.
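
The overlap between that trouble set and each alphabet can be checked mechanically. A minimal Python sketch follows; the uuencode tables are assumed to be the 64 characters starting at space (with grave substituted for space in the newer variant), the xxencode table is assumed to be plus, minus, digits and letters, and the function name risky is made up for illustration.

```python
# Which characters of each alphabet fall in the reported trouble set?
TROUBLE = set("!`[\\]^{|}~") | {" ", "\x7f"}   # list above, plus space and DEL

UU_OLD = [chr(32 + i) for i in range(64)]      # space stands for 0
UU_NEW = ["`"] + UU_OLD[1:]                    # grave stands for 0
XX = list("+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz")

def risky(table: list[str]) -> list[str]:
    """Characters of the table that appear in the trouble set, in code order."""
    return sorted(set(table) & TROUBLE)

print(risky(UU_OLD))  # [' ', '!', '[', '\\', ']', '^']
print(risky(UU_NEW))  # ['!', '[', '\\', ']', '^', '`']
print(risky(XX))      # []
```

Either uuencode variant leaves six of its 64 characters exposed, while the xxencode alphabet avoids the whole set.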

Currently ABE supports ABE1 (94 printables, not safe over EBCDIC links
but the most compact) ABE2 (safe over EBCDIC), UUENCODE and XXENCODE.

I am thinking of adding a "text" mode that would be good for source
groups.  Next batch of spare time :-)
-- 
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473

boylanr@silver.ucs.indiana.edu (ross boylan) (07/31/90)

Some of the bitnet archive servers transmit info uuencoded.  I'm puzzled,
since some of the preceding discussion indicates that bitnet networks
are allergic to uuencoding.  I can't believe bitnet archive servers
send stuff in a form which can't even make it around bitnet.  Or is
their uuencode different from the one discussed here?