xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (07/11/90)
csu@alembic.acs.com (Dave Mack) writes:

>The supposed advantage in the case of the Anonymous Contact Service
>software which I recently posted to alt.sources is that the uuencoded
>compressed tar file was 135K, whereas the corresponding shar file is
>235K. Also, my version of shar3.24 died horribly when presented with
>a directory tree (I now have Rich Salz' cshar kit, including makekit,
>which solves almost all my problems, except that it insists on putting
>the README in the second kit.)

1) Part of the problem is that there are lots and lots of versions of
"shar" and "unshar" around, only a few of which are reliable and handle
all the nasty cases (try unsharing a tree with the shar files out of
order) such as files which must be split, binary files, and so on,
while "tar" is a reasonably standard program, however cumbersome.

>[...] my next release of the ACS will be in the form of shar files. As an
>added bonus, all of the filenames will be under 14 characters in this
>one.

As a newbie to one of the systems that causes this concern, I note that
this isn't good enough. If you want to be able to issue patch files
that work with "patch", you have to hold the file names down to 9 or fewer
characters so that the ".orig" extensions patch creates are respected,
rather than lost, making the original have the same name as, and therefore
clobber/be clobbered by the patched version.

>I cannot, however, guarantee that the README will be in Part01.

Not as big a deal as having shar create a file that calls for utilities
not present on my system. (Hypothetical problem, so far. ;-( )

>Dave Mack
>embittered idealist, net.scum, villain, and commercial abuser of the
>net for over three days.

Hey, we knew that about you long before you posted ACS, Dave! ;-)

Note also that the translation ASCII -> EBCDIC -> ASCII, especially when
the site doing the first translation is not the same as the site doing
the second, is not (necessarily) an identity transformation even when
the original is restricted to the 96 printable ASCII characters plus
newline. Printable ASCII contains characters not present in EBCDIC (and
of course vice versa) and there is no standard pair of translation
tables. Frankly, you're pushing your luck to try to get the alphabet
intact through a set of machines with different character encodings,
much less the whole printable character set.

Note also, whether compress-uuencode-compress saves or wastes bytes is
entirely dependent on the original data. Rich $alz and I went around
about this one three years ago, and that is the only conclusion that
can be drawn; the argument is not worth rehashing in 1990.

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>
--
(member, "lonely old geezers with lap cats" posting cabal)
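The point about mismatched translation tables can be illustrated with a minimal sketch. It assumes Python's bundled cp037 and cp500 codecs as stand-ins for the tables at two different sites; neither table is named in the thread, and the helper names round_trip and damaged are hypothetical.

```python
# Sketch: ASCII -> EBCDIC at one site, EBCDIC -> ASCII at another.
# Assumption: cp037 and cp500 stand in for the two sites' translation
# tables. They are two EBCDIC variants shipped with Python, not tables
# mentioned anywhere in the thread.

def round_trip(text, outbound="cp037", inbound="cp500"):
    """Encode with one EBCDIC table, then decode with a different one."""
    return text.encode(outbound).decode(inbound)

# Printable ASCII, space through tilde.
PRINTABLE = "".join(chr(c) for c in range(0x20, 0x7F))

def damaged(chars=PRINTABLE):
    """Return the characters that do not survive the round trip."""
    return [c for c in chars if round_trip(c) != c]

if __name__ == "__main__":
    sample = "if (a[i] != b) x ^= 1;"
    print("sent:    ", sample)
    print("received:", round_trip(sample))
    print("changed: ", "".join(damaged()))
```

With this particular pair of tables the letters and digits come through, while some punctuation that C source depends on does not. Other table pairs could behave differently.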
jtc@van-bc.wimsey.bc.ca (J.T. Conklin) (07/12/90)
In article <1990Jul11.072612.10374@zorch.SF-Bay.ORG> xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
|>[...] my next release of the ACS will be in the form of shar files. As an
|>added bonus, all of the filenames will be under 14 characters in this
|>one.
|
|As a newbie to one of the systems that causes this concern, I note that
|this isn't good enough. If you want to be able to issue patch files
|that work with "patch", you have to hold the file names down to 9 or fewer
|characters so that the ".orig" extensions patch creates are respected,
|rather than lost, making the original have the same name as, and therefore
|clobber/be clobbered by the patched version.

If you get the most up to date version of patch, it only appends
a ~ or # to the filenames on systems with 14 char filesystems.
Therefore, 13 char filenames should be sufficient.

	--jtc

--
J.T. Conklin	UniFax Communications Inc.
		...!{uunet,ubc-cs}!van-bc!jtc, jtc@wimsey.bc.ca
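The arithmetic behind the 9 and 13 character figures can be sketched as follows. This is an illustration only: it assumes a filesystem that silently truncates names at 14 characters, and the helper names (max_safe_length, truncated_backup) are hypothetical, not part of patch.

```python
# Sketch of the filename arithmetic discussed above.
# Assumption: the filesystem silently truncates names longer than LIMIT;
# some systems may reject overlong names instead.

LIMIT = 14  # filename length limit on the filesystems in question

def max_safe_length(suffix, limit=LIMIT):
    """Longest original name whose backup keeps the whole suffix."""
    return limit - len(suffix)

def truncated_backup(name, suffix, limit=LIMIT):
    """Backup name as stored if the filesystem cuts it at `limit`."""
    return (name + suffix)[:limit]

if __name__ == "__main__":
    for suffix in (".orig", "~", "#"):
        print(repr(suffix), "-> names of at most",
              max_safe_length(suffix), "characters")
    # A 14-character name leaves no room for any suffix, so the backup
    # would land on the original file.
    name = "abcdefghijklmn"
    print(truncated_backup(name, ".orig") == name)
```

Under that assumption, a name longer than the safe length loses part or all of the suffix, and a name already at the limit maps its backup onto itself.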
greggy@infmx.UUCP (greg yachuk) (07/12/90)
In article <1990Jul11.072612.10374@zorch.SF-Bay.ORG> xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
>
>Note also, whether compress-uuencode-compress saves or wastes bytes is
>entirely dependent on the original data. Rich $alz and I went around
>about this one three years ago, and that is the only conclusion that
>can be drawn; the argument is not worth rehashing in 1990.

This general statement is surely true, but we are discussing alt.sources
which contains mostly source (readable text) files.

I'm including a (rather long) listing of the results of the two packing
techniques. The sources are a sampling of those that I've saved off the
net, mostly from alt.sources and comp.sources.misc. I haven't done any
pre-selection or editing in order to present the numbers which I prefer.
They are simply the sources which I have never moved down to my PC.

For each kit presented, the third and fourth lines are the important
ones. The third line gives the size of the kit on disk when it reaches
the user's system. The fourth line gives the size of the file which must
be transmitted between systems. (I'm assuming normal old 'compress' is
used for this.)

As can be seen, there is small variation between the TCU [*] size and
the shar size (third line). Sometimes the TCU is larger and sometimes
the shar is larger. They usually weigh in within a K byte or two of each
other. If site-to-site transmission does not include automatic
compression, there is little difference between the two.

The fourth line is much more interesting. The shar's get compressed to
approximately half the size of the TCU's. This means as much as twice
the communications costs for shipping TCU's as for shipping shar's.

[*] Tar-Compress-Uuencode

Some notes are in order:

1) A couple of the sizes for Shar's are missing. This happens when the
   shar process burps or cannot handle the input files. Most likely
   cause is that a file is too large to fit into a shar, and I don't
   have a shar-maker which splits automatically.
2) I'm using Rich Salz' shar and makekit utilities. Thanks Rich.

3) In order to avoid problem 1, I'm using a shar size of 100k. I know
   that this isn't normally a useful thing, since many mailers would
   puke on them. Normal sized shar's would have a slightly higher
   total size, since the overhead stuff would have to be repeated more
   times. This is probably negligible for the purposes of this
   experiment, however.

>
>Kent, the man from xanth.

Greg Yachuk	Informix Software Inc., Menlo Park, CA 92025
greggy@informix.com | {uunet,pyramid}!infmx!greggy	(415) 926-6300
--- no pithy comments ---

abe
    tar:          131072   tar.Z:    54204
    tar.Z.uue:    128912   shar:    124207
    tar.Z.uue.Z:  121967   shar.Z:   56693
===========================================================
amoeba
    tar:          294912   tar.Z:   109929
    tar.Z.uue:    261416   shar:    280932
    tar.Z.uue.Z:  247643   shar.Z:  119618
===========================================================
atty
    tar:          221184   tar.Z:    87353
    tar.Z.uue:    207736   shar:    219818
    tar.Z.uue.Z:  196662   shar.Z:   94337
===========================================================
btoa
    tar:           57344   tar.Z:    20383
    tar.Z.uue:     48496   shar:     49063
    tar.Z.uue.Z:   46160   shar.Z:   20252
===========================================================
cgrind
    tar:           16384   tar.Z:     6557
    tar.Z.uue:     15622   shar:     12034
    tar.Z.uue.Z:   15504   shar.Z:    6346
===========================================================
chmod
    tar:           24576   tar.Z:    11227
    tar.Z.uue:     26727   shar:     18203
    tar.Z.uue.Z:   25944   shar.Z:    9765
===========================================================
combine
    tar:          172032   tar.Z:    65119
    tar.Z.uue:    154873   shar:    170195
    tar.Z.uue.Z:  146846   shar.Z:   71853
===========================================================
crisp
    tar:         1597440   tar.Z:   591569
    tar.Z.uue:   1406649   shar:
    tar.Z.uue.Z: 1328614   shar.Z:
===========================================================
cshar3b
    tar:          270336   tar.Z:   102321
    tar.Z.uue:    243327   shar:    256169
    tar.Z.uue.Z:  230032   shar.Z:  113242
===========================================================
cvs
    tar:          376832   tar.Z:   146125
    tar.Z.uue:    347483   shar:    365439
    tar.Z.uue.Z:  328316   shar.Z:  156979
===========================================================
dynlib
    tar:           49152   tar.Z:    16052
    tar.Z.uue:     38199   shar:     35991
    tar.Z.uue.Z:   36691   shar.Z:   15226
===========================================================
ed
    tar:           49152   tar.Z:    22053
    tar.Z.uue:     52464   shar:     43367
    tar.Z.uue.Z:   49814   shar.Z:   20711
===========================================================
eiffel
    tar:           32768   tar.Z:    10668
    tar.Z.uue:     25397   shar:     25894
    tar.Z.uue.Z:   24718   shar.Z:   10558
===========================================================
faces
    tar:          368640   tar.Z:   126525
    tar.Z.uue:    300877   shar:    353624
    tar.Z.uue.Z:  284610   shar.Z:  140095
===========================================================
flex2b
    tar:          311296   tar.Z:   120043
    tar.Z.uue:    285468   shar:    313541
    tar.Z.uue.Z:  270516   shar.Z:  133655
===========================================================
fmt
    tar:            8192   tar.Z:     1325
    tar.Z.uue:      3179   shar:      4216
    tar.Z.uue.Z:    3179   shar.Z:    2425
===========================================================
freegram
    tar:          221184   tar.Z:    78273
    tar.Z.uue:    186148   shar:    225552
    tar.Z.uue.Z:  176236   shar.Z:   85148
===========================================================
gawk211
    tar:          835584   tar.Z:   347847
    tar.Z.uue:    827133   shar:    847162
    tar.Z.uue.Z:  780479   shar.Z:  372921
===========================================================
getopt
    tar:           49152   tar.Z:    20114
    tar.Z.uue:     47857   shar:     41980
    tar.Z.uue.Z:   45529   shar.Z:   20063
===========================================================
head
    tar:            8192   tar.Z:     1343
    tar.Z.uue:      3222   shar:      4507
    tar.Z.uue.Z:    3222   shar.Z:    2473
===========================================================
ilib
    tar:          548864   tar.Z:   191079
    tar.Z.uue:    454372   shar:    502861
    tar.Z.uue.Z:  429894   shar.Z:  207734
===========================================================
le
    tar:           32768   tar.Z:    12652
    tar.Z.uue:     30113   shar:     29332
    tar.Z.uue.Z:   29112   shar.Z:   13072
===========================================================
menu
    tar:          483328   tar.Z:   178257
    tar.Z.uue:    423884   shar:    466489
    tar.Z.uue.Z:  400528   shar.Z:  200122
===========================================================
mmv
    tar:           90112   tar.Z:    38720
    tar.Z.uue:     92096   shar:     87030
    tar.Z.uue.Z:   87106   shar.Z:   38780
===========================================================
multitee
    tar:           32768   tar.Z:    13240
    tar.Z.uue:     31517   shar:     25752
    tar.Z.uue.Z:   30437   shar.Z:   12622
===========================================================
mz
    tar:           57344   tar.Z:    20740
    tar.Z.uue:     49343   shar:     49971
    tar.Z.uue.Z:   46939   shar.Z:   20560
===========================================================
nroff
    tar:          147456   tar.Z:    56068
    tar.Z.uue:    133348   shar:    146268
    tar.Z.uue.Z:  126227   shar.Z:   61393
===========================================================
pcmail
    tar:          696320   tar.Z:   265607
    tar.Z.uue:    631586   shar:    675083
    tar.Z.uue.Z:  595848   shar.Z:  296326
===========================================================
perl
    tar:         1212416   tar.Z:   468325
    tar.Z.uue:   1113604   shar:   1191568
    tar.Z.uue.Z: 1050659   shar.Z:  504233
===========================================================
rcs
    tar:         1089536   tar.Z:   425666
    tar.Z.uue:   1012168   shar:   1100833
    tar.Z.uue.Z:  954597   shar.Z:  476641
===========================================================
rn
    tar:          630784   tar.Z:   250719
    tar.Z.uue:    596180   shar:    617464
    tar.Z.uue.Z:  562641   shar.Z:  272664
===========================================================
ro
    tar:          163840   tar.Z:    60961
    tar.Z.uue:    144980   shar:    158113
    tar.Z.uue.Z:  137318   shar.Z:   66229
===========================================================
sh
    tar:          475136   tar.Z:   192792
    tar.Z.uue:    458443   shar:    481751
    tar.Z.uue.Z:  433311   shar.Z:  210913
===========================================================
sh07
    tar:          458752   tar.Z:   185643
    tar.Z.uue:    441446   shar:    464890
    tar.Z.uue.Z:  417548   shar.Z:  203428
===========================================================
shape
    tar:         1630208   tar.Z:   574431
    tar.Z.uue:   1365899   shar:   1630030
    tar.Z.uue.Z: 1290386   shar.Z:  669724
===========================================================
simped
    tar:           81920   tar.Z:    28301
    tar.Z.uue:     67324   shar:     72055
    tar.Z.uue.Z:   63922   shar.Z:   28359
===========================================================
teleplay
    tar:           57344   tar.Z:    19139
    tar.Z.uue:     45542   shar:     47882
    tar.Z.uue.Z:   43448   shar.Z:   18458
===========================================================
tinfo
    tar:          237568   tar.Z:    88367
    tar.Z.uue:    210147   shar:    224697
    tar.Z.uue.Z:  198990   shar.Z:   96511
===========================================================
tooltool
    tar:          688128   tar.Z:   267789
    tar.Z.uue:    636774   shar:
    tar.Z.uue.Z:  601370   shar.Z:
===========================================================
vtree
    tar:           49152   tar.Z:    20135
    tar.Z.uue:     47907   shar:     38718
    tar.Z.uue.Z:   45606   shar.Z:   18609
===========================================================
which
    tar:           16384   tar.Z:     6570
    tar.Z.uue:     15650   shar:     12900
    tar.Z.uue.Z:   15521   shar.Z:    7072
===========================================================
xfig
    tar:         1097728   tar.Z:   349213
    tar.Z.uue:    830382   shar:   1086200
    tar.Z.uue.Z:  784855   shar.Z:  396620
===========================================================
xtail
    tar:           16384   tar.Z:     5563
    tar.Z.uue:     13259   shar:      9325
    tar.Z.uue.Z:   13253   shar.Z:    4972
===========================================================
xxencode
    tar:           16384   tar.Z:     6011
    tar.Z.uue:     14326   shar:     14787
    tar.Z.uue.Z:   14302   shar.Z:    6867
===========================================================
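The "approximately half" observation can be checked against the listing with a short sketch. Only five kits are copied in here, chosen arbitrarily for illustration; the dictionary layout and the function names are assumptions of this sketch, not part of the original post.

```python
# Sizes in bytes copied from the listing for a handful of kits:
# (tar.Z.uue, shar, tar.Z.uue.Z, shar.Z)
KITS = {
    "abe":  (128912, 124207, 121967, 56693),
    "btoa": (48496, 49063, 46160, 20252),
    "fmt":  (3179, 4216, 3179, 2425),
    "perl": (1113604, 1191568, 1050659, 504233),
    "xfig": (830382, 1086200, 784855, 396620),
}

def on_disk_ratio(kit):
    """TCU size over shar size as delivered (third line of each entry)."""
    uue, shar, _, _ = KITS[kit]
    return uue / shar

def transmit_ratio(kit):
    """Compressed TCU over compressed shar (fourth line of each entry)."""
    _, _, uue_z, shar_z = KITS[kit]
    return uue_z / shar_z

if __name__ == "__main__":
    for name in sorted(KITS):
        print(f"{name:6s} on disk {on_disk_ratio(name):.2f}"
              f"   transmitted {transmit_ratio(name):.2f}")
```

A transmitted ratio near 2 means the compressed TCU is about twice the size of the compressed shar for that kit; very small kits such as fmt show a smaller gap.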