jeff@cullsj.UUCP (Jeffrey C. Fried) (05/01/88)
Currently there is an ongoing discussion in comp.binaries.ibm.pc.d concerned with establishing a standard for the exchange of software over the net. I would like to offer a suggestion. The following tools are available in source code format: COMPRESS (Lempel-Ziv text compressor), ARITH (arithmetic compression for binary), and UUencode/decode. Since all of these will run under a variety of environments (IBM-PC, AMIGA, ATARI, VMS, SYS5, BSD), why not make these the basis for communicating? I may have a public domain SHAR/UNSHAR program in C as well (I do have a text archiver in PASCAL that would suffice if PASCAL were acceptable to everyone). These should be enough to support everyone's needs, and because we have the source, it can be made available to everyone.

I realize that some of you may have a favorite tool which you feel surpasses the capabilities of these. I am making these suggestions to provide a starting point towards arriving at a mutually acceptable standard. For those who do not know, COMPRESS is a single-file text compressor which works faster than any of the ARC clones, and ARITH is something I constructed from a description in the ACM. ARITH compresses slower, but better, than Huffman coding, i.e., SQ/UNSQ. Most of all, it's in the public domain, and I'll be posting source if enough people show an interest. In any case, let the discussion continue.
w8sdz@brl-smoke.ARPA (Keith B. Petersen ) (05/01/88)
Rather than discussing how to compress our files we should be discussing how to get them transferred error-free through the network. Uuencode/uudecode and compress/uncompress do no error checking. Think about this the next time you are tempted to uuencode a binary file. How do you know it will be received error-free by the recipient? At least when it is compressed by the ARC program a CRC of the original file is stored *inside* the ARC. It is checked when you extract the member file. The net spends thousands of dollars on reposts of truncated or otherwise munged files. Some of that money would be better spent on finding where the problem is and fixing it. A uuencode with CRC or checksum would go a long way towards finding the site(s) responsible for this waste. -- Keith Petersen Arpa: W8SDZ@SIMTEL20.ARPA Uucp: {bellcore,decwrl,harvard,lll-crg,ucbvax,uw-beaver}!simtel20.arpa!w8sdz GEnie: W8SDZ
wcf@psuhcx.psu.edu (Bill Fenner) (05/01/88)
Just one thing that needs to be known -- PCs can do no more than 12-bit compression. So if you are compressing your file from a UNIX system, you need to say compress -b12 filename. Bill -- __ _ _ _____ Bill Fenner Bitnet: wcf @ psuhcx.bitnet / ) // // / ' Internet: wcf @ hcx.psu.edu /--< o // // ,-/-, _ __ __ _ __ UUCP: ihnp4!psuvax1!psuhcx!wcf /___/_<_</_</_ (_/ </_/ <_/ <_</_/ (_ Fido: Sysop at 263/42
jpn@teddy.UUCP (John P. Nelson) (05/02/88)
>Just one thing that needs to be known -- PC's can do no more than 12-bit >compression. So if you are compressing your file from a UNIX system, >you need to say comress -b12 filename . This myth has been repeated several times, so I felt it was necessary to speak up. PCs most certainly CAN do a 16 bit compress/uncompress. It takes 512K of available memory to run, and you also either need a compiler that supports HUGE model arrays, or else you have to manually break up the buffer space into multiple 64K arrays (this is what the version I have does - The port was done a couple of years ago for XENIX, but it works just fine under MSDOS as well). -- john nelson UUCP: {decvax,mit-eddie}!genrad!teddy!jpn ARPA (sort of): talcott.harvard.edu!panda!teddy!jpn
rsalz@bbn.com (Rich Salz) (05/02/88)
If you are (sigh) going to post binaries on Usenet, DO NOT compress them first. Many Usenet sites use compress to pack up their news batches. Compressing a compressed file makes it larger. -- Please send comp.sources.unix-related mail to rsalz@uunet.uu.net.
wcf@psuhcx.psu.edu (Bill Fenner) (05/02/88)
In article <4740@teddy.UUCP> jpn@teddy.UUCP (John P. Nelson) writes: >>Just one thing that needs to be known -- PC's can do no more than 12-bit >>compression. So if you are compressing your file from a UNIX system, >This myth has been repeated several times, so I felt it was necessary to >speak up. PCs most certainly CAN do a 16 bit compress/uncompress. It >takes 512K of available memory to run, and you also either need a compiler Hard as it is to believe, a lot of people don't have 640k computers... But, I think that this utility would do well to be distributed... mind posting it on comp.binaries.ibm.pc? (Can it do 12-bit also?) Thanks -- __ _ _ _____ Bill Fenner Bitnet: wcf @ psuhcx.bitnet / ) // // / ' Internet: wcf @ hcx.psu.edu /--< o // // ,-/-, _ __ __ _ __ UUCP: ihnp4!psuvax1!psuhcx!wcf /___/_<_</_</_ (_/ </_/ <_/ <_</_/ (_ Fido: Sysop at 263/42
jeff@cullsj.UUCP (Jeffrey C. Fried) (05/03/88)
In article <55@psuhcx.psu.edu>, wcf@psuhcx.psu.edu (Bill Fenner) writes:
> Just one thing that needs to be known -- PC's can do no more than 12-bit
> compression. So if you are compressing your file from a UNIX system,
> you need to say compress -b12 filename .

I've constructed a version of COMPRESS using 13 bits and the small model by making only one array large. I've also constructed a version in BIG mode which runs at half the speed and compresses only about 10% better using the full addressing used under UNIX.
wcf@psuhcx.psu.edu (Bill Fenner) (05/03/88)
In article <696@fig.bbn.com> rsalz@bbn.com (Rich Salz) writes:
>If you are (sigh) going to post binaries on Usenet, DO NOT compress
>them first. Many Usenet sites use compress to pack up their news
>batches. Compressing a compressed file makes it larger.

We've gone through this before, and it has never been explained to my satisfaction. I think you do save something by compressing a uuencoded compressed file over compressing the uuencoded uncompressed file. I just did a test. The file I used may not have been a good 'average binary' (I used a moria save character - the best I could find on short notice). Anyway...

Original size (cannot send; it's binary): 95,348 bytes
Compressed (also cannot send; also binary): 6,772 bytes
UUEncoded then compressed (the amount that would be transmitted if you simply uuencode the file): 11,531 bytes
And the kicker... compressed, UUEncoded, then compressed (as if you compressed it, then uuencoded it, then posted it, then the news compressed it): 9,009 bytes

Like I said, this may not have been a proper 'average binary'. I am going to write a shell script to check all these things, and run it on several actual PC binaries and ARC files. I will post the results to comp.binaries.ibm.pc.d. -- __ _ _ _____ Bill Fenner Bitnet: wcf @ psuhcx.bitnet / ) // // / ' Internet: wcf @ hcx.psu.edu /--< o // // ,-/-, _ __ __ _ __ UUCP: ihnp4!psuvax1!psuhcx!wcf /___/_<_</_</_ (_/ </_/ <_/ <_</_/ (_ Fido: Sysop at 263/42
loci@csccat.UUCP (Chuck Brunow) (05/03/88)
In article <55@psuhcx.psu.edu> wcf@psuhcx (Bill Fenner) writes: >Just one thing that needs to be known -- PC's can do no more than 12-bit >compression. So if you are compressing your file from a UNIX system, >you need to say comress -b12 filename . > Are you quite sure about that? 13-bit compress will run on other 64k segment machines (80?86 based).
jpn@teddy.UUCP (John P. Nelson) (05/03/88)
>If you are (sigh) going to post binaries on Usenet, DO NOT compress
>them first. Many Usenet sites use compress to pack up their news
>batches. Compressing a compressed file makes it larger.

This is incorrect. I hope I can clear this up once and for all:

If you have ascii files (like source or documentation), then it is true that compressing, then uuencoding, is a BAD IDEA, even though the posting appears to be smaller than the cleartext. That is because when the file is compressed again, it will be larger than the cleartext after IT is compressed.

If you have a binary file that MUST be uuencoded to be posted, then compression before uuencoding IS HELPFUL! Most files that are compressed, then uuencoded, then compressed again are significantly smaller than files that are simply uuencoded, then compressed once! I think that the reason this is true is that uuencoding tends to interfere with the compression process. By the way, compressing a uuencoded file almost always results in a small reduction in size. When I say "compressed", I include archival programs such as ARC and ZOO.

These conclusions were reached by experimental evidence (I didn't conduct the experiments, others did, and they posted their results). Perhaps no one bothered to read these informative articles, (or else my suspicion is true: the maximum long-term memory of the average USENET reader is no more than 1 month long). -- john nelson UUCP: {decvax,mit-eddie}!genrad!teddy!jpn ARPA (sort of): talcott.harvard.edu!panda!teddy!jpn
egisin@watmath.waterloo.edu (Eric Gisin) (05/04/88)
In article <696@fig.bbn.com>, rsalz@bbn.com (Rich Salz) writes:
> If you are (sigh) going to post binaries on Usenet, DO NOT compress
> them first. Many Usenet sites use compress to pack up their news
> batches. Compressing a compressed file makes it larger.

But you would not be compressing the compressed file, you would be compressing an encoded file. Here are the results of some experiments on a 100K UNIX binary:

$ uuencode | compress
-rw-r--r-- 1 egisin 83111 May 3 16:25 uu.Z
$ compress | uuencode | compress
-rw-r--r-- 1 egisin 81241 May 3 16:30 uuz.Z

Compressing before encoding results in a 2% shorter file, but that is not really significant. You can get better results by using a simple hex encoding:

$ compress | hexencode | compress
-rw-r--r-- 1 egisin 78831 May 3 16:31 hdz.Z

None of this applies to source files; they should never be compressed and encoded.
jeff@cullsj.UUCP (Jeffrey C. Fried) (05/04/88)
1) COMPRESS is a text only compression routine. It will not now, or ever, help in the compression of binary files.

2) ARITH is a more general compression routine using adaptive arithmetic coding. It will compress binary files where there is redundancy, but when it fails (on an extremely random file) the result increases very little (under 1% in my experience). It compresses better than HUFFMAN, but it is NOT faster than SQ/UNSQ, which are written in assembler whereas ARITH is written in C. (Once again, I will post it if there is sufficient interest.)

3) The source for ZOO, PKARC, and the others is NOT available. Therefore we are at the whims of whoever is currently supporting (or not supporting) them.

4) COMPRESS works faster and better on text files than the ARC routines because they use 12-bit compression, where 13-bit (and more) is possible for COMPRESS even on a PC (I've tried it on an AT-clone).

5) On the weak side, there is as yet no CRC or checksum for any of these, but adding it would be something I am willing to take responsibility for should enough people decide they would like to take the approach which I'm currently suggesting. Also, there is no directory support provided with these tools. They work on only one file at a time. This is also correctable since the source is available.

6) LASTLY: I am not trying to criticize the ARC routines; rather I am trying to offer an alternative which I feel will reduce the time for transmission of files, as well as providing us with portability. COMPRESS, ARITH, UNSHAR and UUENCODE are all available at the source level. COMPRESS and ARITH have been tried in at least three different environments: UNIX (BSD), VMS and PC/MS-DOS. Remember, for those of us who are NOT using the NET at the expense of a university, the cost of communication, and therefore the time required to transmit a file, are VERY important.
If this sounds like a flame, then please assign my apparent bad attitude to poor methodology rather than a desire to upset people. This is provided in the spirit of adding to what I hope will become a meaningful dialog with a very practical result.
mike@ists (Mike Clarkson) (05/04/88)
In article <696@fig.bbn.com>, rsalz@bbn.com (Rich Salz) writes:
> If you are (sigh) going to post binaries on Usenet, DO NOT compress
> them first. Many Usenet sites use compress to pack up their news
> batches. Compressing a compressed file makes it larger.

How about compressing a uuencoded compressed file? Does that result in a file significantly larger than the original? I would really like to see a uniform standard, with error checking, and I think it is something worth the time it takes to do it. We could probably evolve the result to take care of another pet peeve of mine: error correction in the tar format. One thing I really miss from VMS is the backup tape archiver, which has tremendous error checking and correction. In 7 years I have only ever had (touch wood) 1 tape go on me, and that was because the oxide was falling off. Having spent a good part of today dealing with yet another dead Unix tar tape, I really wish we could find a better way. -- Mike Clarkson mike@ists.UUCP Institute for Space and Terrestrial Science mike@ists.yorku.ca York University, North York, Ontario, CANADA M3J 1P3 (416) 736-5611
chasm@killer.UUCP (Charles Marslett) (05/04/88)
In article <55@psuhcx.psu.edu>, wcf@psuhcx.psu.edu (Bill Fenner) writes: > Just one thing that needs to be known -- PC's can do no more than 12-bit > compression. ... Actually, I have sent several people copies of a minor mod to compress 4.0 that works fine if you have the memory (requires about 350-400 K above DOS to do 16-bit compression). The source assumes Turbo or Microsoft C for the PC but it doesn't take up an immense amount of disk space either (about 40K if I remember correctly). I have also ported it to Atari STs, so that covers some of the PC field. Anyone want to merge these changes into the more recent (4.1?) posting and perhaps make it work on Macs and Amigas? Any good rule of thumb on how many requests imply a posting choice? > Bill Charles Marslett chasm@killer.UUCP ...!ihnp4!killer!chasm
jcs@tarkus.UUCP (John C. Sucilla) (05/04/88)
In article <55@psuhcx.psu.edu> wcf@psuhcx (Bill Fenner) writes: >Just one thing that needs to be known -- PC's can do no more than 12-bit >compression. So if you are compressing your file from a UNIX system, >you need to say comress -b12 filename . Wrong! My 640K AT&T PC6300 has compress v4.0 running 16 bits on it right now. -V shows the options at: MSDOS, XENIX_16 and BITS=16. -- John "C" Sucilla {ihnp4}!tarkus!jcs Don't let reality stop you....
karl@triceratops.cis.ohio-state.edu (Karl Kleinpaste) (05/04/88)
1) COMPRESS is a text only compression routine. It will not now, or ever,
help in the compression of binary files.

Nonsense.

[58] [8:33am] tut:/dino0/karl/bin/pyr/private> list enable
34831 -rwsr-x--- 2 root staff 8192 Apr 13 09:54 enable
[59] [8:33am] tut:/dino0/karl/bin/pyr/private> file enable
enable: 90x family demand paged pure executable
[60] [8:33am] tut:/dino0/karl/bin/pyr/private> compress -v < enable > enable.Z
Compression: 72.44%
[61] [8:33am] tut:/dino0/karl/bin/pyr/private> list enable.Z
35427 -rw-r--r-- 1 karl staff 2257 May 4 08:34 enable.Z
[62] [8:33am] tut:/dino0/karl/bin/pyr/private>

--Karl
davidsen@steinmetz.ge.com (William E. Davidsen Jr) (05/04/88)
In article <296@cullsj.UUCP> jeff@cullsj.UUCP (Jeffrey C. Fried) writes:
|
| 1) COMPRESS is a text only compression routine. It will not now, or ever,
| help in the compression of binary files.

[ compress gives about 30% compression on binaries, depending on content. Whoever told you that it was for text only was completely wrong. ]

| 2) ARITH is a more general compression routine using adaptive arithmetic
| coding. It will compress binary files where there is redundancy, but
| when it fails (on an extremely random file) the result increases very
| little (under 1% in my experience). It compresses better than HUFFMAN,
| but it is NOT faster than SQ/UNSQ which are written in assembler whereas
| ARITH is written in C.
| (Once again, I will post it if there is sufficient interest.)

[ once again, do it, in source, so that others can test it themselves rather than relying on your opinion. ]

| 3) The source for ZOO, PKARC, and the others is NOT available. Therefore
| we are at the whims of whoever is currently supporting (or not supporting)
| them.

[ the sources for zoo and arc have been posted several times to the net, and are available on a number of sites via ftp, uucp, and simple BBS download. ]

| 5) On the weak side, there is as yet no CRC or checksum for any of these,
| but adding it would be something I am willing to take responsibility
| for should enough people decide they would like to take the approach
| which I'm currently suggesting.

[ zoo and arc both have CRC. ]

| Also, there is no directory support provided with these tools. They work
| on only one file at a time. This is also correctable since the source
| is available.

[ arc works on multiple files in multiple directories, but doesn't preserve subdirectory information. zoo preserves the information unless told not to do it (an option). ]

| 5) LASTLY: I am not trying to criticize the ARC routines, rather I am trying
| [ deleted for brevity ]
| Remember, for those of us who are NOT using the NET at the expense of a
| university, the cost of communication, and therefore the time required
| to transmit a file, are VERY important.

[ everyone would like faster transmissions, but not at the expense of using a non-standard format which people can't use. Sending info which is not useful is a *real* waste of bandwidth. ]

| If this sounds like a flame, then please assign my apparent bad attitude to
| poor methodology rather than a desire to upset people. This is provided in the
| spirit of adding to what i hope will become a meaningful dialog with a very
| practical result.

The most charitable assumption I can make is that you are woefully misinformed about the matters on which you speak. Please post this "ARITH" routine to let others evaluate it, and read the responses to your posting, many of which will probably not even be as polite as this one. -- bill davidsen (wedu@ge-crd.arpa) {uunet | philabs | seismo}!steinmetz!crdos1!davidsen "Stupidity, like virtue, is its own reward" -me
davidsen@steinmetz.ge.com (William E. Davidsen Jr) (05/04/88)
I would like to add a little fuel to the fires of "which archiver" discussion. Use of the 'btoa' routine instead of uuencode would save 12% (!) on binary postings. This is a PD program, included in the compress package, and runs just fine on a PC. All the discussion of using PKARC to save 1-2% or not using it to save time for many of the people on the net seems pointless. We should use both (standard) arc and zoo formats, uuencode them, and save bandwidth by dropping this discussion. Hopefully Rahul will clarify this by edict. -- bill davidsen (wedu@ge-crd.arpa) {uunet | philabs | seismo}!steinmetz!crdos1!davidsen "Stupidity, like virtue, is its own reward" -me
jpn@teddy.UUCP (John P. Nelson) (05/04/88)
> 1) COMPRESS is a text only compression routine. It will not now, or ever,
> help in the compression of binary files.

Whoa! Where did THIS come from!?!? It is simply not true! It IS true that compress does a better job at compressing text files, but this is because there is usually more redundancy in text files than in most binary files (like executables). Compress is simply MARVELOUS for binary files like bit-mapped graphics, getting something like 90% compression for many of them.

> 2) ARITH is a more general compression routine using adaptive arithmetic
> coding. It will compress binary files where there is redundancy, but
> when it fails (on an extremely random file) the result increases very
> little (under 1% in my experience). It compresses better than HUFFMAN,
> but it is NOT faster than SQ/UNSQ which are written in assembler whereas
> ARITH is written in C.
> (Once again, i will post it if there is sufficient interest.)

Now we get some facts. ARITH is HUFFMAN encoding. Compress is Lempel-Ziv encoding. Lempel-Ziv almost ALWAYS beats HUFFMAN (when there is redundancy). It is certainly possible that Lempel-Ziv might expand random files more than HUFFMAN; I haven't done any tests. Older versions of ARC used to try both HUFFMAN and Lempel-Ziv, and use the one that gave better compression. The HUFFMAN support was dropped (except for extracting from old archives), because Lempel-Ziv beat HUFFMAN 99% of the time!

> 3) The source for ZOO, PKARC, and the others is NOT available. Therefore
> we are at the whims of whomever is currently supporting (or not supporting)
> them.

MORE untruths. The sources for both ZOO and ARC are in C, and have been distributed on USENET several times! Some versions of the ARC source included the extra code to handle the SQUASH compression algorithm added by PKARC.

> 4) COMPRESS works faster and better on text files than the ARC routines
> because they use 12 bit compression, where 13-bit (and more) are possible
> under even the PC for COMPRESS (i've tried it on an AT-clone).

PKARC's SQUASH is 13 bit compression. Any more than this requires a working buffer larger than 64K, which is why wider codes are generally not used very much on PCs. The amount of additional compression between 13 bit and 16 bit is no more than 2 or 3 percent! Also, there is very little difference in speed between the 12 bit and 13 bit compression algorithms. The major difference is in the memory requirements.

> 5) On the weak side, there is as yet, no CRC or checksum for any of these,
> but adding it would be something i am willing to take responsibility
> for should enough people decide they would like to take the approach
> which i'm currently suggesting.

This is the LEAST of the problems with using compress.

> Also, there is no directory support provided with these tools. They work
> on only one file at a time. This is also correctable since the source
> is available.

True, but why reinvent the wheel? The source for the EXISTING programs is ALSO available!

> If this sounds like a flame, then please assign my apparent bad attitude to
> poor methodology rather than a desire to upset people. This is provided in the
> spirit of adding to what i hope will become a meaningful dialog with a very
> practical result.

Your bad attitude appears to be due to an overdose of misinformation! -- john nelson UUCP: {decvax,mit-eddie}!genrad!teddy!jpn ARPA (sort of): talcott.harvard.edu!panda!teddy!jpn
dhesi@bsu-cs.UUCP (Rahul Dhesi) (05/04/88)
In article <296@cullsj.UUCP> jeff@cullsj.UUCP (Jeffrey C. Fried) writes: > 3) The source for ZOO, PKARC, and the others is NOT available. The source for zoo 1.51 was posted to comp.sources.unix in the summer of 1987. The source for zoo 2.01 will be posted in the near future. -- Rahul Dhesi UUCP: <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi
bobmon@iuvax.cs.indiana.edu (RAMontante) (05/04/88)
cullsj.UUCP (Jeffrey C. Fried) writes, among other things:
,
, 1) COMPRESS is a text only compression routine. It will not now, or ever,
, help in the compression of binary files.

This statement made me shell out and run the following quick experiment:

-rwxr-xr-x 1 bobmon 15360 Feb 27 01:22 pgen
-rwxr-xr-x 1 bobmon 10116 May 4 08:46 pgen.Z
-rwxr-xr-x 1 bobmon 14336 Feb 24 08:19 pom
-rwxr-xr-x 1 bobmon  9945 May 4 08:47 pom.Z

Pgen and pom are both executable files (compiled from C). Granted, this is on a VAX machine, running the full-blown compress. My attempts to run compress on my 8088 box were frustrating, given its memory requirements, and I haven't seen enough '.Z' formatted files to be worth the hassle. But I would assume that if it runs at all on a smaller machine, it will produce the same results; unlike zoo and arc, it cannot choose one compression method over another.

, 3) The source for ZOO, PKARC, and the others is NOT available. Therefore
, we are at the whims of whomever is currently supporting (or not supporting)
, them.

Source for arc is, at least for some Unix boxes. Zoo source has been promised. Pkarc was originally written in 8088 assembler, not the friendliest source.

, 4) COMPRESS works faster and better on text files than the ARC routines
, because they use 12 bit compression, where 13-bit (and more) are possible
, under even the PC for COMPRESS (i've tried it on an AT-clone).

I haven't seen source for compress, either. And the executables I've seen were enormous, and limited to 12-bit LZW on 8088's under MSDOS; just like zoo and arc (and pkarc's squash method is some sort of 13-bit LZW). I've never heard anyone claim responsibility for compress, while the authors of zoo, pkarc, and arc are named, revered, vilified, and flamed frequently. At least one of them is an active participant on the Usenet.
(Plug: I think that's one strength of zoo, although Rahul might disagree :-) , 5) On the weak side, there is as yet, no CRC or checksum for any of these, Any of WHAT? Zoo and arc certainly have a CRC value. Compress is compress. Its Unix-origin philosophy says that separate functions should be done by separate routines with their outputs tied together by the operating system. I think this is at the heart of some of the debates here. The philosophy works fine on a big multitasking machine like a VAX (or a suitably equipped 680x0 or '386?), and the entire news mailer system is predicated on that principle -- the mailer just calls compress (EVERYbody has compress, right?) to pack things in for it; it doesn't worry about whether the result is correct, and neither does compress. It's up to you to aggregate your files with shar or something. This piece-at-a-time philosophy is weaker on something like my MSDOS 8088 box. There aren't multiple users all needing similar fundamental tools, there's just me. And I haven't the resources (memory or CPU cycles) to support lots of little pieces that work fine individually but need sophisticated glue to work together; MSDOS's simulation of pipes is pathetic. In such a situation an integrated package (viz., zoo or arc) makes a lot more sense. They can incorporate in a consistent manner all those little pieces that a system admin. may have put on a Unix box, but which I haven't yet found while rummaging around BBS's. By integrating everything a top-down design is possible, unlike what happens when you bend the problem to fit the tools you already have. , but adding it would be someithing i am willing to take responsibility , for should enough people decide they would like to take the approach , which i'm currently suggesting. At which point it will become yet another uncommon non-standard (like ARITH?). 
I don't think adding code will make it fit any better on small machines, and the big machines can afford to calculate a CRC with an external routine. Not to mention the question of what you DO with it... Is the CRC for compress's use? Then it becomes not-quite-compress. Is it for human use? Then how do I recreate it to find out if the file is still intact? ... , 5) LASTLY: I am not trying to criticize the ARC routines, rather i am trying , to offer an alternative which i feel will reduce the time for transmission , of files, as well as, providing us with portability. COMPRESS, ARITH, , UNSHAR and UUENCODE are all available at the source level. COMPRESS and , ARITH have been tried in at least three different environments: UNIX (BSD), , VMS and PC/MS-DOS. , Remember, for those of us who are NOT using the NET at the expense of a , university, the cost of communication, and therefore the time required , to transmit a file, are VERY important. I don't find 1200bps transmission to be a lot of fun to wait for, either... but I take it that your basic argument is that compress makes smaller archives than zoo or arc, which are therefore cheaper to transmit. I don't see that the compression improvement is as significant as you imply (and your statement about binary is completely at odds with all my experience). The other strengths of the integrated packages offer a LOT of functionality, some of which I would seek out even if there were no compression involved. The biggest problem I see is that many news mailers compress everything blindly, so that an already-compressed file gets bigger. This would also be true of a sufficiently random file, although I think most executables aren't that random. And this compress-and-be-damned behavior is not a strength of the system, it's a weakness. 
(Even compress will complain if its result is bigger than its original; does the mailer ignore this, or are the net.gods lying when they claim they're shipping bigger files because of the double compression?)
ralf@b.gp.cs.cmu.edu (Ralf Brown) (05/04/88)
In article <296@cullsj.UUCP> jeff@cullsj.UUCP (Jeffrey C. Fried) writes: } 3) The source for ZOO, PKARC, and the others is NOT available. Therefore } we are at the whims of whomever is currently supporting (or not supporting) } them. Sources are available for ZOO and ARC. -- {harvard,uunet,ucbvax}!b.gp.cs.cmu.edu!ralf -=-=- AT&T: (412)268-3053 (school) ARPA: RALF@B.GP.CS.CMU.EDU |"Tolerance means excusing the mistakes others make. FIDO: Ralf Brown at 129/31 | Tact means not noticing them." --Arthur Schnitzler BITnet: RALF%B.GP.CS.CMU.EDU@CMUCCVMA -=-=- DISCLAIMER? I claimed something?
wtr@moss.ATT.COM (05/04/88)
In article <296@cullsj.UUCP> jeff@cullsj.UUCP (Jeffrey C. Fried) writes:
>
>1) COMPRESS is a text only compression routine. It will not now, or ever,
> help in the compression of binary files.
>

[I'm not sure if this will be construed as a flame, but, asbestos suit in hand, here goes!]

WHAT ARE YOU TALKING ABOUT!?!?!?!? I have assumed that everyone has been talking about the program COMPRESS v4.0 that was posted to comp.sources.???? late last year (let's not get too picky about the dates ;-). It was based upon a "modified Lempel-Ziv algorithm" as published in IEEE Computer by Terry A. Welch. PD source was (at least in part) written by Joe Orost. (apologies to anyone unintentionally left out of the credits)

With the full sixteen-bit compression, it does a great job of compressing (almost ;-) all files, binary and source. Most compression ratios are in the 50-60% range, occasionally as high as 75%. (larger files seem to compress a little better) I have no idea what program you are referring to when you are describing your 'compress', but it is certainly not the same program that I run on my AT clone at home.

===================================================================== Bill Rankin Bell Labs, Whippany NJ (201) 386-4154 (cornet 232) email address: ...![ ihnp4 ulysses cbosgd allegra ]!moss!wtr ...![ ihnp4 cbosgd akgua watmath ]!clyde!wtr =====================================================================
jeff@cullsj.UUCP (Jeffrey C. Fried) (05/05/88)
I stand corrected. Since Lempel-Ziv was DESIGNED for text compression, and the authors do not mention its use for binaries, I never considered using it. I tried it on an executable under UNIX and obtained a good reduction, for reasons which are not apparent. I'm sure that there are cases where this does not work (like graphics files), but it does appear to work, and in this case better than the current version of ARITH. However, my point was that for TEXT, COMPRESS does a better job than the ARC programs with which I'm familiar. Also, I did not know that source for zoo was available - a consideration which I believe to be VERY important, since support usually comes best from those who use a product.

I would like to thank those who took the time to correct my misunderstanding concerning the use of compression on the net, but I find it just a bit difficult because of the tone used in communicating with me. For those who suggested that I "do my homework" before posting something to the net, I can only say that since the net is my ONLY contact with this problem, and since the comp...d group is for discussions, I am in essence "doing my homework". I'm sorry if my attempt to add to the discussion has caused anyone to feel that their precious time has been wasted, but I think that you're as wrong as you are rude. Humbly yours, Jeff Fried ...!ames!cullsj!jeff Cullinet Software 2860 Zanker Road, Suite 206 Reality, what a concept! San Jose, CA, 95134
cudcv@daisy.warwick.ac.uk (Rob McMahon) (05/05/88)
In article <292@cullsj.UUCP> jeff@cullsj.UUCP (Jeffrey C. Fried) writes: >The following tools are >available in source code format: COMPRESS (Lem-Ziv text compressor), ARITH >(arithmetic compression for binary), UUencode/decode. Since all of these >will run under a variety of environments (IBM-PC, AMIGA, ATARI, VMS, SYS5, >BSD), why not make these the basis for communicating. I hope we're talking about binary files here, in which case I don't care because I'd never just take a binary from the net and run it on one of my machines. If you're talking about sources, I like to scan down, read the README, check out the comments in main etc., before I even save it to disk. If I get all the bits of a posting, tack them together, uudecode them, and uncompress them, only to find it's of no use to me, I'm not going to be amused. I have this feeling that people aren't going to bother to send proper introductory articles in plain text before the actual posting. Rob -- UUCP: ...!mcvax!ukc!warwick!cudcv PHONE: +44 203 523037 JANET: cudcv@uk.ac.warwick.cu ARPA: cudcv@cu.warwick.ac.uk Rob McMahon, Computing Services, Warwick University, Coventry CV4 7AL, England
geoff@utstat.uucp (Geoff Collyer) (05/05/88)
In article <296@cullsj.UUCP> jeff@cullsj.UUCP (Jeffrey C. Fried) writes: > 1) COMPRESS is a text only compression routine. It will not now, or ever, > help in the compression of binary files. This is absolutely dead wrong. compress compresses any kind of file, and has been used to compress (and correctly uncompress!), for example, graphics bit maps, sendmail configuration files :-), and tar archives containing binaries. -- Geoff Collyer utzoo!utstat!geoff, utstat.toronto.{edu,cdn}!geoff
jbuck@epimass.EPI.COM (Joe Buck) (05/05/88)
In article <296@cullsj.UUCP> jeff@cullsj.UUCP (Jeffrey C. Fried) writes:
> 1) COMPRESS is a text only compression routine. It will not now, or ever,
> help in the compression of binary files.

Most emphatically wrong. compress works just fine on many types of binary
files. It can give 90% or more compression on bitmap data, and usually > 50%
compression on Unix executable files. About the only type of file I know of
that compress fails on consistently is floating point data in binary format.
As long as some strings of bytes occur much more frequently than others
(whether they represent characters, opcodes, or grey levels) compress kicks
ass.
--
- Joe Buck  {uunet,ucbvax,sun,<smart-site>}!epimass.epi.com!jbuck
Old Internet mailers: jbuck%epimass.epi.com@uunet.uu.net
Argue for your limitations and you get to keep them. -- Richard Bach
campbell@maynard.BSW.COM (Larry Campbell) (05/05/88)
In article <4740@teddy.UUCP> jpn@teddy.UUCP (John P. Nelson) writes:
<>>Just one thing that needs to be known -- PC's can do no more than 12-bit
<>>compression. So if you are compressing your file from a UNIX system,
<>>you need to say comress -b12 filename .
<>
<>This myth has been repeated several times, so I felt it was necessary to
<>speak up. PCs most certainly CAN do a 16 bit compress/uncompress. ...
Only a subset of PCs can do 16-bit compress/uncompress. Mine can't.
I'm running VENIX/86 2.0, which is basically V7; the PCC-derived
C compiler has only the tiny and small memory models (exactly
corresponding to non-split and split PDP-11s, which also cannot
handle 16-bit compress).
So it is true that PCs with a C compiler that supports multiple data
segments can handle 16-bit compress, but that hardly encompasses all
PCs in the world.
--
Larry Campbell The Boston Software Works, Inc.
Internet: campbell@maynard.bsw.com 120 Fulton Street, Boston MA 02109
uucp: {husc6,mirror,think}!maynard!campbell +1 617 367 6846
loci@csccat.UUCP (Chuck Brunow) (05/05/88)
Let me point out one simple fact: source code is VERY MUCH SMALLER than binaries.
paul@devon.UUCP (Paul Sutcliffe Jr.) (05/05/88)
In article <296@cullsj.UUCP> jeff@cullsj.UUCP (Jeffrey C. Fried) writes:
+---------
| 1) COMPRESS is a text only compression routine. It will not now, or ever,
|    help in the compression of binary files.
+---------
This is absolute and complete Bull-Ka-Ka.

# cp /bin/sh /tmp
# cd /tmp
# ls -l sh
-rwx--x--t   1 root     root       37762 May  5 09:23 sh
# compress -V -v sh
$Header: compress.c,v 4.0 85/07/30 12:50:00 joe Release $
Options: BITS = 16
sh: Compression: 34.90% -- replaced with sh.Z
# ls -l sh.Z
-rwx--x--t   1 root     root       24582 May  5 09:23 sh.Z

Looks like you can compress binaries to me! Granted, the compression factor
isn't as good as can be had with text files (I've seen as much as 90% in
text files with plenty of repeating characters), but it *does* work on
binaries.

- paul
--
Paul Sutcliffe, Jr.
UUCP (smart): paul@devon.UUCP
UUCP (dumb):  ...rutgers!bpa!vu-vlsi!devon!paul
"Know what I hate most? Rhetorical questions." -- Henry Camp
feg@clyde.ATT.COM (Forrest Gehrke) (05/05/88)
In article <10712@steinmetz.ge.com>, davidsen@steinmetz.ge.com (William E. Davidsen Jr) writes: > > I would like to add a little fuel to the fires of "which archiver" > discussion. Use of the 'btoa' routine instead of uuencode would save > 12% (!) on binary postings. This is a PD program, included in the > compress package, and runs just fine on a PC. Having tried this sometime back, I have often wondered why this approach is not used by USENET. It would save a lot of transmission time. > All the discussion of using PKARC to save 1-2% or not using it to save > time for many of the people on the net seems pointless. We should use > both (standard) arc and zoo formats, uuencode them, and save bandwidth > by dropping this discussion. Hopefully Rahul will clarify this by edict. Also an excellent suggestion. We could quickly find out from experience which archiver works out best through use. BTW what is holding up Rahul from taking over as moderator? Forrest Gehrke k2bt
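The arithmetic behind the saving is simple: uuencode turns every 3 bytes into 4 printable characters, plus a count byte and newline per 45-byte line, while btoa's base-85 scheme packs 4 bytes into 5 characters. The sketch below compares the two expansions using Python's Ascii85 as a stand-in for btoa (an assumption: real btoa also shortcuts runs of zeros, which this ignores, so the gap on typical executables can be larger).

```python
import base64
import binascii

payload = bytes(range(256)) * 16          # 4096 bytes of sample "binary"

# uuencode: each 45-byte chunk becomes a length byte + 60 chars + newline.
uu = b"".join(binascii.b2a_uu(payload[i:i + 45])
              for i in range(0, len(payload), 45))

# btoa-style base-85: 4 bytes -> 5 characters, no per-line framing here.
a85 = base64.a85encode(payload)

uu_ratio = len(uu) / len(payload)         # roughly 1.38
a85_ratio = len(a85) / len(payload)       # exactly 1.25 on this sample
```

The difference on this sample is around 10% of the posting size, in the ballpark of the 12% figure quoted above; btoa's zero-run shortcut may account for the rest.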
jbuck@epimass.EPI.COM (Joe Buck) (05/05/88)
In article <299@cullsj.UUCP> jeff@cullsj.UUCP (Jeffrey C. Fried) writes: > > I stand corrected. Since Lem-Ziv was DESIGNED for text compression, and >the authors do not mention its use for binaries, i never considered using it. >I tried it on an executable under UNIX and obtained a good reduction, for >reasons which are not apparent. I'm sure that there are cases where this does >not work (like graphics files), but it does appear to work , and in this case >better than the current version of ARITH. Jeff, Jeff, Jeff. You're STILL putting your foot in your mouth. :-) A Unix file is just a stream of bytes, and so is an MS-DOS file except that it has extra attributes as well. Compress replaces byte strings with codes whose lengths are between 9 and 16 bits. It will work well on any file in which some byte sequences are more common than others. An executable file consists of instructions, which, for almost all processors are integral numbers of bytes, and some are much more common than others. So compress works fine, and will give good compression for just about any executable file. There are several types of graphics files: bitmaps are HIGHLY compressible; other types of files act like a program for an imaginary computer and consist of byte codes, some much more common than others. These compress well also. There are only three types of files I've ever given to compress that haven't been reduced in size as a result: random binary data, floating point binary data, and files that have already been compressed. -- - Joe Buck {uunet,ucbvax,sun,<smart-site>}!epimass.epi.com!jbuck Old Internet mailers: jbuck%epimass.epi.com@uunet.uu.net Argue for your limitations and you get to keep them. -- Richard Bach
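Joe's description (common byte strings, whatever they represent, replaced by short codes from a growing dictionary) can be illustrated with a toy LZW encoder. This is a minimal Python sketch of the algorithm family compress belongs to, not the actual compress source; the variable code widths and hashing of the real program are omitted, and the max_bits cap only loosely mirrors the -b option.

```python
def lzw_encode(data: bytes, max_bits: int = 16):
    """Toy LZW encoder: emit one integer code per longest dictionary
    match.  The dictionary starts with the 256 single bytes and grows
    until it holds 2**max_bits entries."""
    dictionary = {bytes([i]): i for i in range(256)}
    max_size = 1 << max_bits
    out = []
    w = b""
    for b in data:
        wc = w + bytes([b])
        if wc in dictionary:
            w = wc                      # keep extending the current match
        else:
            out.append(dictionary[w])   # emit code for the longest match
            if len(dictionary) < max_size:
                dictionary[wc] = len(dictionary)
            w = bytes([b])
    if w:
        out.append(dictionary[w])
    return out

# Repetitive input (text or executable opcodes alike) yields far fewer
# codes than input bytes; that is all the algorithm cares about.
codes = lzw_encode(b"the quick brown fox " * 50)
```

Note that nothing in the loop knows or cares whether the bytes are ASCII, which is why "text only" is wrong.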
tneff@dasys1.UUCP (Tom Neff) (05/06/88)
In article <299@cullsj.UUCP> jeff@cullsj.UUCP (Jeffrey C. Fried) writes:
> ... Also, i did not know that source for
>zoo was available - a consideration which i believe to be VERY important
>since support usually comes best from those who use a product.

The source for ARC is available too, and it's running on (for instance)
this Stride. Don't confuse the ARC standard with Phil Katz's PC-optimized
clone PKARC. Due to an assiduous sales job most PC sysops have the Katz
thing, but it ain't the original. The "C" language real McCoy is slower on
PC's but more portable.

>For those who suggested that i "do my homework" before posting something
>to the net, i can only say that since the net is my ONLY contact with this
>problem, and that the comp...d group is for discussions, i am in essence
>"doing my homework".

There is a school of thought, notably expressed in the
news.announce.newusers material, that the Net is a place for authoritative
answers and requests for same, not for "homework", owing to the expense of
carrying it all. I try to keep an open mind. :-) Not that your posting was
anything to apologize for anyway...
--
Tom Neff    UUCP: ...!cmcl2!phri!dasys1!tneff
CIS: 76556,2536    MCI: TNEFF    GEnie: TOMNEFF    BIX: are you kidding?
"None of your toys will function..."
brianc@cognos.uucp (Brian Campbell) (05/06/88)
In article <696@fig.bbn.com> rsalz@bbn.com (Rich Salz) writes: > If you are (sigh) going to post binaries on Usenet, DO NOT compress > them first. Many Usenet sites use compress to pack up their news > batches. Compressing a compressed file makes it larger. Maybe those Usenet sites should not use the -f (force) flag with compress. Every version I've used (Sun, XENIX and DOS) will not replace the original if the compressed version would be larger. Try compressing a file twice using the -v (verbose) option and see what happens. -- Brian Campbell uucp: decvax!utzoo!dciem!nrcaer!cognos!brianc Cognos Incorporated mail: POB 9707, 3755 Riverside Drive, Ottawa, K1G 3Z4 (613) 738-1440 fido: (613) 731-2945 300/1200, sysop@1:163/8
laba-5ac@web7f.berkeley.edu (Erik Talvola) (05/06/88)
In article <1082@maynard.BSW.COM> campbell@maynard.UUCP (Larry Campbell) writes: <>In article <4740@teddy.UUCP> jpn@teddy.UUCP (John P. Nelson) writes: <><>>Just one thing that needs to be known -- PC's can do no more than 12-bit <><>>compression. So if you are compressing your file from a UNIX system, <><>>you need to say comress -b12 filename . <><> <><>This myth has been repeated several times, so I felt it was necessary to <><>speak up. PCs most certainly CAN do a 16 bit compress/uncompress. ... <> <>Only a subset of PCs can do 16-bit compress/uncompress. Mine can't. <>I'm running VENIX/86 2.0, which is basically V7; the PCC-derived <>C compiler has only the tiny and small memory models (exactly <>corresponding to non-split and split PDP-11s, which also cannot <>handle 16-bit compress). <> <>So it is true that PCs with a C compiler that supports multiple data <>segments can handle 16-bit compress, but that hardly encompasses all <>PCs in the world. <>-- What's wrong with getting a 16-bit Compress executable file for the PC which was compiled with a proper C compiler? Then, you can run a 16-bit compress on any PC. You are right in that you may not be able to compile it with all C compilers, but you can run the executable on any PC (as long as you have ~500K free). >Larry Campbell The Boston Software Works, Inc. >Internet: campbell@maynard.bsw.com 120 Fulton Street, Boston MA 02109 >uucp: {husc6,mirror,think}!maynard!campbell +1 617 367 6846 --------------------------------------------------- Erik Talvola erikt@zen.berkeley.edu "...death is an acquired trait." -- Woody Allen ---------------------------------------------------
caf@omen.UUCP (Chuck Forsberg WA7KGX) (05/06/88)
In article <296@cullsj.UUCP> jeff@cullsj.UUCP (Jeffrey C. Fried) writes:
: 1) COMPRESS is a text only compression routine. It will not now, or ever,
:    help in the compression of binary files.

The 13 bit compression in zoo gets about 29% compressing YAM.EXE.

: 2) ARITH is a more general compression routine using adaptive arithmetic
:    coding. It will compress binary files where there is redundancy, but

Please post it!

: 3) The source for ZOO, PKARC, and the others is NOT available. Therefore
:    we are at the whims of whomever is currently supporting (or not
:    supporting) them.

The sources to ZOO *are* available, in fact it was a copy of ZOO I compiled
for 386 Xenix that I used in the above micro-benchmark.

: 4) COMPRESS works faster and better on text files then the ARC routines
:    because they use 12 bit compression, where 13-bit (and more) are
:    possible under even the PC for COMPRESS (i've tried it on ans AT-clone).

Compress, ARC, PKARC, and ZOO all use forms of LZW compression, derived
from the original Unix compress program.

: 5) On the weak side, there is as yet, no CRC or checksum for any of these,
:    but adding it would be someithing i am willing to take responsibility
:    for should enough people decide they would like to take the approach
:    which i'm currently suggesting.

The lack of a CRC in compress is a serious weakness. ARC and ZOO include
CRC.

: Also, there no directory support provided with these tools. They work
: on only one file at a time. This is also correctable since the source
: is available.

ZOO has excellent directory support - full Unix pathnames are supported.

Again, please post the ARITH program. It would be most interesting if the
memory requirements are small - like Huffman encoding instead of LZW.
mark@adec23.UUCP (Mark Salyzyn) (05/06/88)
I'm sorry, I don't care if the IBM-PC can handle better than 12 bit
compress! I run UNIX on a PDP 11/23 *NON SPLIT I/D MACHINE* and that
allows me to use 12 bit compress (however I have a 13 bit LZW pack routine
that was posted in 1983 that works fine). In order to read stuff that is
packed more than 12 bit LZW I had to rewrite compress to use disk rather
than memory. BOY IS IT SLOW.

In the interest of compatibility with ALL types of machines I suggest that
we use 12 bit compress. This is the most available compression bit
selection. If not, then I am going to extend my disk version to handle 17
bit compress, post something useful and watch you all squirm.

G'day
--
Mark Salyzyn, mad at the world for advancing and leaving me behind
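Mark's disk-bound rewrite exists because the LZW string table, not the code itself, is what outgrows a 64K data segment. A back-of-envelope estimate (the 6 bytes per entry is an assumed round figure for illustration, not the real compress struct layout):

```python
# Estimated LZW string-table memory for n-bit codes.  compress keeps
# roughly a prefix code, an appended byte, and hash-table slack per
# entry; BYTES_PER_ENTRY = 6 is an assumption, not the real layout.
BYTES_PER_ENTRY = 6

def table_bytes(bits: int) -> int:
    return (1 << bits) * BYTES_PER_ENTRY

kb12 = table_bytes(12) // 1024   # 24K: fits one 64K segment with ease
kb16 = table_bytes(16) // 1024   # 384K: needs large model or disk tricks
```

The 16-bit figure is consistent with the roughly 500K-free requirement reported elsewhere in this thread for PC executables.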
ephram@violet.berkeley.edu (05/06/88)
In article <1082@maynard.BSW.COM> campbell@maynard.UUCP (Larry Campbell) writes:
>In article <4740@teddy.UUCP> jpn@teddy.UUCP (John P. Nelson) writes:
><>>Just one thing that needs to be known -- PC's can do no more than 12-bit
><>>compression. So if you are compressing your file from a UNIX system,
><>>you need to say comress -b12 filename .
><>
><>This myth has been repeated several times, so I felt it was necessary to
><>speak up. PCs most certainly CAN do a 16 bit compress/uncompress. ...
>
>Only a subset of PCs can do 16-bit compress/uncompress.

Hasn't anyone ever heard of a disk drive?!? Multiple segments as a
limitation? How about writing temporary results to a disk file (random
access)? RAM disk? Now I must admit I have never cracked open the code to
compress/uncompress, but it seems to me that using a disk drive as an
intermediate result area is a very viable workaround. I would rather sit
and watch my disk spin for an extra minute than watch the RD light on my
modem work 10% more time. I admit it is not elegant, but when someone says
"can not do" I must speak up.

Ephram Cohen
ephram@violet.berkeley.edu
jpn@teddy.UUCP (John P. Nelson) (05/06/88)
> Let me point out one simple fact: source code is VERY MUCH
> SMALLER than binaries.

This is not clear. For small programs in a high-level compiled language
(like C), this is true, because the small program pulls in the language
run-time library, so the source is much smaller than the resulting
executable. However, I would bet that the object file (before linking)
would be about the same size as the source (even WITH the symbol table and
relocation information). Assembly language source usually runs about 10
times larger than the resulting executable. Large C program (64k+) source
usually runs two to three times larger than the resulting executable.

Of course, I find source code more valuable: I can make changes to suit my
environment, or I can port the program to a different machine entirely.
And of course, with an operating system like UNIX which runs on a plethora
of machines, source code is the only acceptable distribution mechanism.

Other languages have different source/binary size ratios. Some languages
can generate a lot of code with a very small amount of source. However,
most of the source code posted to USENET is C.
--
john nelson
UUCP: {decvax,mit-eddie}!genrad!teddy!jpn
ARPA (sort of): talcott.harvard.edu!panda!teddy!jpn
rroot@edm.UUCP (uucp) (05/07/88)
From article <3980@killer.UUCP>, by chasm@killer.UUCP (Charles Marslett):
> In article <55@psuhcx.psu.edu>, wcf@psuhcx.psu.edu (Bill Fenner) writes:
>> Just one thing that needs to be known -- PC's can do no more than 12-bit
>> compression. ...
> Actually, I have sent several people copies of a minor mod to compress 4.0
> that works fine if you have the memory (requires about 350-400 K above DOS

There are still, however, people running on systems whose compilers don't
know how to work with >64K. These systems exist and have to be dealt with.
--
------------- Stephen Samuel
Disclaimer: You betcha!
{ihnp4,ubc-vision,seismo!mnetor,vax135}!alberta!edm!steve
BITNET: USERZXCV@UQV-MTS
loci@csccat.UUCP (Chuck Brunow) (05/07/88)
In article <2096@epimass.EPI.COM> jbuck@epimass.EPI.COM (Joe Buck) writes: >In article <299@cullsj.UUCP> jeff@cullsj.UUCP (Jeffrey C. Fried) writes: >> >> I stand corrected. Since Lem-Ziv was DESIGNED for text compression, and >>the authors do not mention its use for binaries, i never considered using it. >>I tried it on an executable under UNIX and obtained a good reduction, for >>reasons which are not apparent. I'm sure that there are cases where this does This is actually partially true. The first "compress" to appear on the net (several years ago) only worked on text files and dumped core on binary files. The reason you get good compressions on binary files is probably that they haven't been stripped of the relocation info. Strip them first and I doubt that the compression will be so good (otherwise, throw your optimizer into the bit bucket). Typical (large) text compression is about 67%, whereas binaries are closer to 20%. (I use 16-bit compress). > >A Unix file is just a stream of bytes, and so is an MS-DOS file >except that it has extra attributes as well. Compress replaces byte >strings with codes whose lengths are between 9 and 16 bits. It will >work well on any file in which some byte sequences are more common >than others. An executable file consists of instructions, which, for >almost all processors are integral numbers of bytes, and some are >much more common than others. So compress works fine, and will give >good compression for just about any executable file. There are This is doubtful. There's a good description of the workings of LZW in the GIF docs (recently posted). Bytes aren't the key feature here, but rather sequences of repeated bytes which should be rare in an optimized executable (on Unix at least). >several types of graphics files: bitmaps are HIGHLY compressible; If they have lots of blank space, or other repeated sequences. Otherwise, they can be very similar to executables: 10-20%. 
>other types of files act like a program for an imaginary computer and >consist of byte codes, some much more common than others. These >compress well also. You must mean Huffman coding. These comments are true in that case, not LZW. > >There are only three types of files I've ever given to compress that >haven't been reduced in size as a result: random binary data, >floating point binary data, and files that have already been >compressed. > The point being that there is little redundancy. >--
wyle@solaris.UUCP (Mitchell Wyle) (05/07/88)
This discussion will bear fruit only if r$ or the backbone gurus implement one of these schemes as a usenet standard, and distribute sources or binaries packaged with tarmail or whichever scheme wins this debate. I vote for tarmail. Let's get a standard accepted! -- -Mitchell F. Wyle wyle@ethz.uucp Institut fuer Informatik wyle%ifi.ethz.ch@relay.cs.net ETH Zentrum 8092 Zuerich, Switzerland +41 1 256-5237
NETOPRWA@NCSUVM.BITNET (Wayne Aiken) (05/08/88)
As far as packing method (ARC, ZOO, compress, etc.), the only one that
I've ever had any problems with has been compress. (640K IBM AT) Perhaps
I've never gotten the right executable; does anyone have a 12 or 13 bit
compress guaranteed to run in at least 512K?

That aside, the other problem, especially with multi-part postings, is
that not all parts consistently make it to all sites, and when they do,
one or more parts have been truncated or otherwise mangled. It would be of
great help if each uuencoded part had a trailing cut line and signature,
so I can tell if a file has been truncated. The new uuencode for the PC by
Richard Marks (great job, Richard!) correctly skips the cut lines, and can
also extract directly from shar files.

One last thing... one of the great advantages of using ARC and ZOO files
is that they maintain an internal CRC value for each file. Recently,
someone posted a uuencoded EXE file which I had to download twice before I
got it to work, and I'm still not 100% positive that there is not some
garbled spot hidden somewhere in that binary. If the packing method
doesn't include a CRC, then it really should be calculated and included as
part of the header or documentation, so I can verify that the file is OK.
--
Wayne Aiken (NETOPRWA@NCSUVM.BITNET)
StarFleet BBS, (919) 782-3095, PC-Pursuitable, 24 hrs/day, 300/1200/2400 baud
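Wayne's suggestion (publish a checksum with each part so recipients can verify after decoding) needs nothing more than a stamp line beside the uuencoded text. Here is a sketch using CRC-32 via Python's zlib; note that ARC itself stores a 16-bit CRC, and the crc_line helper and its format are hypothetical, for illustration only.

```python
import zlib

def crc_line(data: bytes) -> str:
    """Hypothetical one-line stamp a poster could append below the
    uuencoded text; recipients recompute it after decoding."""
    crc = zlib.crc32(data) & 0xFFFFFFFF
    return "CRC32 %08X length %d" % (crc, len(data))

posted = b"contents of some binary posting"
stamp = crc_line(posted)            # published with the article

received = posted                   # pretend this crossed the net intact
ok = crc_line(received) == stamp    # True only if nothing was munged
```

A single flipped or dropped byte changes the CRC, which would also make it easy to pin down which site along the path is doing the munging.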
ford@elgar.UUCP (Ford Prefect ) (05/08/88)
In article <9644@agate.BERKELEY.EDU> laba-5ac@web7f.berkeley.edu.UUCP (Erik Talvola) writes:
>What's wrong with getting a 16-bit Compress executable file for the PC
>which was compiled with a proper C compiler? Then, you can run a 16-bit
>compress on any PC. You are right in that you may not be able to compile
>it with all C compilers, but you can run the executable on any PC (as long
>as you have ~500K free).

There are a few problems with this approach:

1) Such a compiler has to exist for the operating system you are running.
Obviously, the author had his brain in Ms.Dos mode, which, since the
article was cross-posted to comp.binaries.ibm-pc, is forgivable in this
case. But one of the articles that was being followed up to mentioned an
O.S. that only supported 64k segments. Compress just won't work in such an
environment without major redesign (like keeping the arrays in a disk file
:-).

2) The executable you get must be for your CPU! This is obvious, of
course, but I keep detecting a definite ibm-pc-chauvinist state of mind in
this discussion. Don't forget that there are people who are still running
unix on PDP-11's and proud of it! The PDP-11 is very similar to the 8086
except that nobody does anything as kludgey as geferkin with the segment
registers! So the best you can get is 64k code, 64k data.

In other words, discussion of a standardized compression format must take
into account the existence of small machines. And "PC" != "Intel Cpu".

Personally, I use 16-bit compress since I don't need to talk to such small
machines. But if I need to post a binary to the net, I will probably use
12-bit compress, because I've never heard of a machine or compiler that
couldn't run it.

-=] Ford [=-
(In Real Life: Mike Ditto)
ford%kenobi@crash.CTS.COM    ...!sdcsvax!crash!kenobi!ford

"Once there were parking lots, now it's a peaceful oasis.
 This was a Pizza Hut, now it's all covered with daisies."
                                        -- Talking Heads
leonard@bucket.UUCP (Leonard Erickson) (05/08/88)
In article <1082@maynard.BSW.COM> campbell@maynard.UUCP (Larry Campbell) writes:
<Only a subset of PCs can do 16-bit compress/uncompress. Mine can't.
<I'm running VENIX/86 2.0, which is basically V7; the PCC-derived
<C compiler has only the tiny and small memory models (exactly
<corresponding to non-split and split PDP-11s, which also cannot
<handle 16-bit compress).
<
<So it is true that PCs with a C compiler that supports multiple data
<segments can handle 16-bit compress, but that hardly encompasses all
<PCs in the world.
Larry, you are confusing being able to *compile* a program and being able
to *use* it! I don't have *any* kind of C compiler. But I can uncompress
stuff that was compressed on a Unix system on my PC.
Some kind soul posted an msdos *binary* for compress a while back. All
you need is DOS and more than 512k of ram...
True, this places two limits on the people who are using the program:
1. they've got to be using MS-DOS. (since we are talking about comp.-
binaries.ibm.pc any arguments that this is a serious restriction
should be routed to /dev/null)
2. they have to have 640k (576 will probably work, but I haven't
tried it). This *is* a problem, but even at current memory prices
it isn't *too* serious. (Unless you have an AT whose memory is mapped
as 512 dos/512 extended)
--
Leonard Erickson ...!tektronix!reed!percival!bucket!leonard
CIS: [70465,203]
"I used to be a hacker. Now I'm a 'microcomputer specialist'.
You know... I'd rather be a hacker."
davidsen@steinmetz.ge.com (William E. Davidsen Jr) (05/10/88)
Since the discussion is on, I'm posting atob and btoa to binaries.
--
bill davidsen (wedu@ge-crd.arpa)
{uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me
davidsen@steinmetz.ge.com (William E. Davidsen Jr) (05/10/88)
In article <552@csccat.UUCP> loci@csccat.UUCP (Chuck Brunow) writes: > > Let me point out one simple fact: source code is VERY MUCH > SMALLER than binaries. And another: not everybody has all compilers. There have been postings in MSC, Turbo C, Turbo Pascal, MASM, Fortran and COBOL (yecch) so far on this group. That's why we have a binary group. Besides I wouldn't give out source to some things which I can distribute as binary. -- bill davidsen (wedu@ge-crd.arpa) {uunet | philabs | seismo}!steinmetz!crdos1!davidsen "Stupidity, like virtue, is its own reward" -me
jeff@cullsj.UUCP (Jeffrey C. Fried) (05/10/88)
I recently acquired the ZOO executables from the net and found them to be incompatible with ARC. The UNIX ARC i received over the net is compatible with ARC5.2.1 under DOS. Has anyone else experienced this incompatibility?
jtara@m2-net.UUCP (Jon Tara) (05/10/88)
In article <3980@killer.UUCP>, chasm@killer.UUCP (Charles Marslett) writes: > In article <55@psuhcx.psu.edu>, wcf@psuhcx.psu.edu (Bill Fenner) writes: > > Just one thing that needs to be known -- PC's can do no more than 12-bit > > compression. ... > > Actually, I have sent several people copies of a minor mod to compress 4.0 Funny, I ran compress 4.0 through the Microsoft 4.0 compiler using large model, and I've been happily compressing and de-compressing with 16 bits ever since. Far as I can tell, it doesn't need any changes, at least under MS/PC-DOS and Microsoft C. It does need a good chunk of memory, which most people should have, unless you're a real TSR nut. -- jtara%m-net@umix.cc.umich.edu ihnp4!dwon!m-net!jtara "You don't have to take this crap. You don't have to sit back and relax." _Walls Come Tumbling Down_, The Style Council
rick@pcrat.UUCP (Rick Richardson) (05/10/88)
In article <679@omen.UUCP> caf@omen.UUCP (Chuck Forsberg WA7KGX) writes:
>Again, please post the ARITH program. It would be most interesting
>if the memory requirements are small - like Huffman encoding instead
>of LZW.

In case ARITH never gets posted: the complete article and program appeared
in ACM last year, in C. I typed it in myself (and lost it later). The
program, as published, runs a lot slower than compress and does not do
quite as good a job as compress. It was better than "pack". It is very
small, and uses little memory.

If you dig into the article (this from memory, I seem to have misplaced
the issue of ACM as well), the program separates the algorithm for
encoding into a model. Two models are presented, one that just uses a
static letter frequency table (for text), and an adaptive model (for
binaries). As I recall, the author pointed out that more sophisticated
adaptive algorithms could be used for better results.

After monkeying around with the program for an evening, and even trying my
own hand at a more sophisticated model, I shelved the program, with nary a
backup. Since it was slower and less efficient than compress, I think its
usefulness is limited to those applications which are sensitive to both
program and data size, such as in a modem.

BTW, I heard some rumor that a 16 bit "uncompress"-only is available for
limited memory systems. If this is true, then why all the fuss about 16
bit compression?
--
Rick Richardson, President, PC Research, Inc.
(201) 542-3734 (voice, nights) OR (201) 834-1378 (voice, days)
uunet!pcrat!rick (UUCP)   rick%pcrat.uucp@uunet.uu.net (INTERNET)
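Rick's point about models is the crux: whether the coder is Huffman or arithmetic, static or adaptive, no order-0 scheme can beat the order-0 entropy of the data. A sketch of that bound (illustrative only, not the CACM program he describes):

```python
import math
from collections import Counter

def order0_entropy_bits(data: bytes) -> float:
    """Shannon entropy in bits per byte: the floor that any order-0
    model (static table or adaptive counts) lets a coder approach."""
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

flat = order0_entropy_bits(bytes(range(256)))  # uniform bytes: 8 bits each
runs = order0_entropy_bits(b"a" * 1000)        # one symbol: 0 bits each
```

Higher-order adaptive models lower this floor by conditioning on context, which is the direction the article's author hinted at, and why LZW (which exploits strings, not single-byte frequencies) often wins anyway.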
davidsen@steinmetz.ge.com (William E. Davidsen Jr) (05/10/88)
In article <307@cullsj.UUCP> jeff@cullsj.UUCP (Jeffrey C. Fried) writes:
| I recently acquired the ZOO executables from the net and found them to be
| incompatible with ARC.

Correct. zoo is not "another arc file program," it is a totally separate
file structure, containing information which neither arc nor pkarc include.

| The UNIX ARC i received over the net is compatible
| with ARC5.2.1 under DOS. Has anyone else experienced this incompatibility?

Alas, there is no "the" UNIX arc, there are a number of slightly different
versions. If you have the one I suspect, it needs the "-i" option to be
compatible with the DOS arc. I highly recommend switching the meaning of
that flag for default DOS compatibility. Actually, I highly recommend
using zoo...
--
bill davidsen (wedu@ge-crd.arpa)
{uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me
tneff@dasys1.UUCP (Tom Neff) (05/10/88)
In article <307@cullsj.UUCP> jeff@cullsj.UUCP (Jeffrey C. Fried) writes:
> I recently acquired the ZOO executables from the net and found them to be
>incompatible with ARC. The UNIX ARC i received over the net is compatible
>with ARC5.2.1 under DOS. Has anyone else experienced this incompatibility?

Yes, everyone has experienced this incompatibility, Jeffrey, because they are not SUPPOSED to be compatible! :-) ARC is one archiving standard; ZOO is a completely different standard. You need one set of programs to create, list and extract ARC files, and a different set to manipulate ZOO archives. You can't use one with the other.

Now, if your next question was going to be why there are two incompatible archiving standards for the MSDOS/UNIX/VMS environment, you'll have to ask our very own moderator Rahul, because there was only one (ARC) until he decided to invent his own. I told him at the time that user confusion would result, but the argument is moot at this point.
--
Tom Neff                    UUCP: ...!cmcl2!phri!dasys1!tneff
"None of your toys          CIS: 76556,2536    MCI: TNEFF
 will function..."          GEnie: TOMNEFF     BIX: are you kidding?
jeff@cullsj.UUCP (Jeffrey C. Fried) (05/11/88)
Someone asked for the source to COMPRESS, and since I cannot reach them except through posting to this group, let me simply say that I have the source, which was posted on the net several months ago. I made a small change which allows it to run in SMALL-model addressing using MSC 5.0 to do 13-bit (yes 13, not 12) compression. (Only one large array was required.)

If the person who wanted it will contact me with a UUCP address, I'll send it out. If there is a sufficient number of requests, I'll send it to the moderator for posting to comp.binaries.ibm.pc.
pjh@mccc.UUCP (Pete Holsberg) (05/11/88)
In article <10770@steinmetz.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes:
... Alas, there is no "the" UNIX arc, there are a number of slightly
...different versions. If you have the one I suspect, it needs the "-i"
...option to be compatible with the DOS arc. I highly commend switching the
^^^^^^^^^^^^^
...meaning of that flag for default DOS compatibility.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
How do you do that? Thanks.
mitch@stride1.UUCP (Thomas P. Mitchell) (05/12/88)
In article <10758@steinmetz.ge.com> davidsen@kbsvax.steinmetz.UUCP (William E. Davidsen Jr) writes:
>In article <552@csccat.UUCP> loci@csccat.UUCP (Chuck Brunow) writes:
>
>And another: not everybody has all compilers. There have been postings
>in MSC, Turbo C, Turbo Pascal, MASM, Fortran and COBOL (yecch) so far on
>this group. That's why we have a binary group. Besides I wouldn't give
>out source to some things which I can distribute as binary.

To send binary or not to send binary: this is a tough question. My general thought is that binaries are the way to distribute a product you sell and support. Source text is the way to make available something you wish to share and are willing to see improved and expanded. Oh yes, criticized as well.

The argument that not everybody has all compilers is real. Yet I dislike it. To me compilers are like a keyboard; a computer is worth little without one.

Back to the topic. Not all binary files are code, so how do we transfer binaries, or anything else for that matter? Some things like bit maps (face server, fonts) and other data need to be transferred from machine to machine. And at times code binaries as well (blush, I did say this).

My thought on this is that the link should know how best to send the data. In other words, uucp should be expanded to exchange abilities. Consider an initial uucp connection in which the programs exchange information like "have compress16|compress12, have bta/atb, have kermit, have xmodem, have link_is_100%, have TeleBit, have never_talked, have exchanged_compression_tables". Given this type of information, the program can then select the best tool to get the best effective transfer rate for the next conversation.

Well, what say you all? Thanks for the soap.

Thomas P. Mitchell (mitch@stride1.Stride.COM)
Phone: (702)322-6868 TWX: 910-395-6073 FAX: (702)322-7975
MicroSage Computer Systems Inc.
Opinions expressed are probably mine.
danno@microsoft.UUCP (Dan Norton) (05/13/88)
In article <145@elgar.UUCP>, ford@elgar.UUCP (Ford Prefect ) writes:
> In article <9644@agate.BERKELEY.EDU> laba-5ac@web7f.berkeley.edu.UUCP (Erik Talvola) writes:
> >What's wrong with getting a 16-bit Compress executable file for the PC...
>
> There are a few problems with this approach:
>
> 1) Such a compiler has to exist for the operating system you are
>    running...
> ... But one of
> the articles that was being followed up to mentioned an O.S.
> that only supported 64k segments. Compress just won't work
> in such an environment without major redesign (like keeping
> the arrays in a disk file :-).

You are wrong. In fact, such a compress exists, using memory only. Several people, including myself, have been able to modify the standard compress with little trouble, and it works just fine on IBM PCs.
linhart@topaz.rutgers.edu (Mike Threepoint) (05/13/88)
tneff@dasys1.UUCP (Tom Neff) writes:
-=> The source for ARC is available too, and it's running on (for instance)
-=> this Stride.

>sigh< But the only squash source I can find is in Pascal. Speaking of which...

-=> Don't confuse the ARC standard with Phil Katz's PC-optimized clone PKARC.
-=> Due to an assiduous sales job most PC sysops have the Katz thing, but it
-=> ain't the original. The "C" language real McCoy is slower on PC's but
-=> more portable.

"Accept no imitations" should be reserved for sales jobs. PKARC is faster and compresses smaller, so why wouldn't they use the Katz thing? ARCE, NARC, and NSWEEP also support squashing, so it's not even forcing PK(X)ARC on the users. My bottom line is the archive size; speed is gravy unless it operates as slowly as... oh, I dunno... ARC? :-)

On my BBS, my own experience is that PKARC creates smaller archives than ZOO, so I use PKARC when I don't need to store a directory subtree. Squashing has saved over a meg of space on my board. Sometimes PKARC is stupid about compression and squashes when it should crunch, or crunches at a 0% compression rate instead of storing, but most of the time it's smaller. If ZOO crunched as well (>sigh<), I would use that.

[Selfish mode: Maybe Rahul could find out what hashing algorithm PK or DWC is using to get better compression rates. Would simplify things for me considerably.]
--
"...billions and billions..."      | Mike Threepoint (D-ro 3)
   -- not Carl Sagan               | linhart@topaz.rutgers.edu
"...hundreds if not thousands..."  | FidoNet 1:107/513
   -- Pnews                        | AT&T +1 (201)878-0937
loverso@encore.UUCP (John Robert LoVerso) (05/13/88)
In article <2932@cognos.UUCP> brianc@cognos.UUCP (Brian Campbell) writes:
> In article <696@fig.bbn.com> rsalz@bbn.com (Rich Salz) writes:
> > If you are (sigh) going to post binaries on Usenet, DO NOT compress
> > them first. Many Usenet sites use compress to pack up their news
> > batches. Compressing a compressed file makes it larger.
>
> Maybe those Usenet sites should not use the -f (force) flag with compress.

That's not how (typical) news batching works. Compress is used as a stage of a pipe:

	batch | compress | uux

and because compress doesn't know the size of its input when it starts up, it will *always* produce compressed output.

The point is that an article which contains binary that's compressed and then uuencode/btoa/your_favorite'd will lower the compression ratio for the batch that contains it. The overall size of the batch will be smaller if the included binary was just uuencoded, etc.

I no longer carry comp.binaries.* as I am using its disk space to store more *useful* news. It would be nice to see such things split out into bin.*. As gnu says: "Use the Source, Luke..."

John Robert LoVerso, Encore Computer Corp
encore!loverso, loverso@multimax.arpa
phil@amdcad.AMD.COM (Phil Ngai) (05/14/88)
In article <786@stride.Stride.COM> mitch@stride1.UUCP (Thomas P. Mitchell) writes:
>The argument that not everybody has all compilers is real. Yet I
>dislike it. To me compilers are like a keyboard; a computer is
>worth little without one.

Well, I strongly disagree. I don't have source for most of the things I run on this PC, nor do I want it. I don't have time to tinker with source code, compiling it, fixing it. I want to get the program and start using it.

Of course, I use programs like PC-NFS, SCHEMA, ORCAD, PSPICE, and other CAD-type tools. You probably don't know what this stuff is, so you can't appreciate that some people want to do useful work *with* their computers instead of working *on* the computer.
--
Make Japan the 51st state!

I speak for myself, not the company.
Phil Ngai, {ucbvax,decwrl,allegra}!amdcad!phil or phil@amd.com
allbery@ncoast.UUCP (Brandon S. Allbery) (05/15/88)
As quoted from <8430@iuvax.cs.indiana.edu> by bobmon@iuvax.cs.indiana.edu (RAMontante):
+---------------
| The biggest problem I see is that many news mailers compress everything
| blindly, so that an already-compressed file gets bigger. This would also be
| true of a sufficiently random file, although I think most executables aren't
| that random. And this compress-and-be-damned behavior is not a strength of
| the system, it's a weakness. (Even compress will complain if its result is
| bigger than its original; does the mailer ignore this, or are the net.gods
| lying when they claim they're shipping bigger files because of the double
| compression?)
+---------------

When compress is invoked as

	compress (file)

it complains. When it's invoked as

	sendbatch | compress | uux -r - oopsvax!rnews

it can't do so without compressing to a temp file while saving its input in a second temp file, then comparing sizes and copying the smaller of the two: wasteful of space and time. (You can't, of course, seek backwards on a pipe.)
--
Brandon S. Allbery, moderator of comp.sources.misc
{well!hoptoad,uunet!marque,cbosgd,sun!mandrill}!ncoast!allbery
Delphi: ALLBERY  MCI Mail: BALLBERY
allbery@ncoast.UUCP (Brandon S. Allbery) (05/16/88)
As quoted from <563@csccat.UUCP> by loci@csccat.UUCP (Chuck Brunow):
+---------------
| In article <2096@epimass.EPI.COM> jbuck@epimass.EPI.COM (Joe Buck) writes:
| >In article <299@cullsj.UUCP> jeff@cullsj.UUCP (Jeffrey C. Fried) writes:
| >> I stand corrected. Since Lempel-Ziv was DESIGNED for text compression, and
| >>the authors do not mention its use for binaries, i never considered using it.
| >>I tried it on an executable under UNIX and obtained a good reduction, for
| >>reasons which are not apparent. I'm sure that there are cases where this does
|
| This is actually partially true. The first "compress" to appear
| on the net (several years ago) only worked on text files and
| dumped core on binary files. The reason you get good compressions
| on binary files is probably that they haven't been stripped of
| the relocation info. Strip them first and I doubt that the
| compression will be so good (otherwise, throw your optimizer
| into the bit bucket). Typical (large) text compression is about
| 67%, whereas binaries are closer to 20%. (I use 16-bit compress.)
+---------------

Wrong. Consider that, for example, every call to putchar() contains some fixed code (such as a call to _flsbuf()); this, on a 32-bit address space machine, will always be the same byte sequence (on a 680x0, it's 6 bytes). Other things will also be common:

	printf("format", non-double-value);

(which is by far the *most* common use of printf(), from what I've seen; perhaps others have seen other more common calls) has the constant assembler code on a 680x0:

	jsr	_printf		6 bytes
	addql	#8,a6		2 bytes

(and printf("constant"), also common, is a slightly different 8-byte value). These kinds of extremely common operations can't be optimized out and are quite amenable to compression. RISC executables are likely to be even more amenable to compression, since many operations will assemble into lengthy byte sequences, many of which will be partially or totally identical.

Ergo: compression of executables generally works pretty well. (I regularly see 50%-60% on stripped, optimized executables on ncoast.)
--
Brandon S. Allbery, moderator of comp.sources.misc
{well!hoptoad,uunet!marque,cbosgd,sun!mandrill}!ncoast!allbery
Delphi: ALLBERY  MCI Mail: BALLBERY
jpn@teddy.UUCP (John P. Nelson) (05/16/88)
>The point is that an article which contains binary that's compressed and then
>uuencode/btoa/your_favorite'd will lower the compression ratio for the batch
>that contains it. The overall size of the batch will be smaller if the
>included binary was just uuencoded, etc.

If this were TRUE, it would be a good argument. It is NOT true.

Most binary files that are compressed, uuencoded, then compressed again are SMALLER than binary files that are simply uuencoded, then compressed. I have yet to see anyone post results that refute this.

A few people have pointed out counter-examples: these usually involve compressing an ARC file (or another binary file with very little compressibility in the first place). In the few cases I have seen where using ARC (which will NOT try to compress an uncompressable file), followed by uuencode, followed by compress generates a larger file than uuencode/compress alone, the file lengths were within 1% of each other.

If someone has seen different results, I would be interested in seeing them. I already KNOW that compressing ASCII files (source or text) then uuencoding is a bad idea; I am interested in results from BINARY FILES only! I think we should SETTLE this issue once and for all!
--
john nelson

UUCP:	{decvax,mit-eddie}!genrad!teddy!jpn
ARPA (sort of): talcott.harvard.edu!panda!teddy!jpn
dhesi@bsu-cs.UUCP (Rahul Dhesi) (05/17/88)
To avoid ambiguity, I suggest the following terminology.

     B   = binary
     T   = text
     U   = uuencoding
     C16 = 16-bit LZW ("compress" default)
     C12 = 12-bit LZW (arc)
     C13 = 13-bit LZW (zoo, squashing)

So, instead of claiming that "uuencoded binary files compressed are larger than not uuencoding," it is better to say that "BC12UC16 is worse than BC16," or "BUC16 is worse than BC16," etc.

BC12UC16 means:

     (B)   take a binary file
     (C12) compress using arc or 12-bit "compress"
     (U)   uuencode it
     (C16) compress using 16-bit "compress"

Also, since binary files differ, it's good to use some standard binary file in benchmarks, e.g. your UNIX kernel stripped of symbols, so there is some degree of consistency.
--
Rahul Dhesi   UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi
loci@csccat.UUCP (Chuck Brunow) (05/18/88)
In article <3075@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>To avoid ambiguity, I suggest the following terminology.
>
>file in benchmarks, e.g. your UNIX kernel stripped of symbols, so there
>is some degree of consistency.

Using UNIX files as test cases misses the point. Unix sites rarely use binary as a mode of transfer. It would be reasonable to use MS-DOS files, as they represent the real thing.

It is nearly universal that Unix sites use UUCP, in some form which will allow un-uue'd (pure binary) files. This is one reason why the question of compressing compressed files came up in the first place: text files can be batched and compressed to speed transmission. Binary files are the fly in the ointment. Add to that the problem of handling assorted mailers for non-Unix sites, and chaos begins. Embedded compression (arc, gif, et al.) complicates the situation even further.

Numerous people have suggested FIDONET as a viable solution. Why not drop binaries from Unix sites and route them through FIDO? Then there is no problem of compressing compressed files.
mark@mamab.UUCP (Mark Woodruff) (05/21/88)
> From: loci@csccat.UUCP (Chuck Brunow)
> Date: 18 May 88 06:36:52 GMT
> Message-ID: <629@csccat.UUCP>
> Numerous people have suggested FIDONET as a viable solution.
> Why not drop binaries from Unix sites and route them through FIDO?

Because FidoNet has no way of automatically redistributing, or even forwarding, files.

mark
Fido: sysop@363/9   UUCP: codas!rtmvax!mamab!mark
---
 * Origin: MaMaB--the Machine in Mark's Bedroom (Opus 1:363/9)
SEEN-BY: 363/9
--
Reply via UUCP: codas!tarpit!mamab!<user>  codas!rtmvax!mamab!<user>
Reply via FidoNet: <user> at 1:363/9.0
allbery@ncoast.UUCP (Rich Garrett) (05/24/88)
As quoted from <4776@teddy.UUCP> by jpn@teddy.UUCP (John P. Nelson):
+---------------
| >The point is that an article which contains binary that's compressed and then
| >uuencode/btoa/your_favorite'd will lower the compression ratio for the batch
| >that contains it. The overall size of the batch will be smaller if the
| >included binary was just uuencoded, etc.
|
| If this were TRUE, it would be a good argument. It is NOT true.
|
| Most binary files that are compressed, uuencoded, then compressed again
| are SMALLER than binary files that are simply uuencoded, then
| compressed. I have yet to see anyone post results that refute this.
+---------------

Single files, yes. But the quoted message above specifically says BATCHES. Batches include messages of all kinds from multiple newsgroups; to verify whether batch compression is reduced, we have to modify sendbatch to print the compression ratio and then run sendbatch with both compressed and uncompressed uuencodes to see which results in smaller batches. (We also need a non-destructive "test" mode for sendbatch to (a) ensure that the batches are otherwise identical and (b) not screw up news transmission.) This would have to be done with a number of batches and the results averaged in order to give us a reasonably accurate result.
--
Brandon S. Allbery, moderator of comp.sources.misc
{well!hoptoad,uunet!marque,cbosgd,sun!mandrill}!ncoast!allbery
Delphi: ALLBERY  MCI Mail: BALLBERY