cmaag@csd4.milw.wisc.edu (Christopher N Maag) (11/02/87)
In article <2218@mcdchg.UUCP> heiby@mcdchg.UUCP (Ron Heiby) writes:
[...]
>I took the uuencoded .ARC file and ran it through "compress" on my
>UNIX system. The resulting file was 153,259 bytes long. Then, I
>took the originally typed in document (the product of running the
>file through uudecode and ARC) and ran *it* through "compress".
>The resulting file was 96,373 bytes long. So, for those news links
>where the administrators care about costs, we went through all kinds
>of extra work to COST THEM AN EXTRA 55K BYTES!
>
>Come on, people. If it's ASCII text, like a document or source code,
>POST IT IN CLEAR TEXT!
>
>Thank you very much.
>--
>Ron Heiby, heiby@mcdchg.UUCP Moderator: comp.newprod & comp.unix

I have seen people complain (with good reason) about this problem
before. I would like to suggest that "whoever is in charge" add a line
or two to the newuser announcements that would explain the problem
described above. For instance:

  "If you submit a file to one of the newsgroups and you wish to uuencode
  the file, _do not_ perform any type of file compression to this
  file before uuencoding it. This means don't arc the file, (insert
  other popular compression schemes for other computer systems here).
  If you do compress the file, it will actually get _larger_ when it is
  sent than it was originally. This costs us all money."

Is this group the right place to suggest something like this? If not,
please direct me to the correct one.

Chris.

=======================================================================
Path: uwmcsd1!csd4.milw.wisc.edu!cmaag
From: cmaag@csd4.milw.wisc.edu
bitnet: cmaag%csd4.milw.wisc.edu@wiscvm.bitnet
{seismo|nike|ucbvax|harvard|rutgers!ihnp4}!uwvax!uwmcsd1!uwmcsd4!cmaag
=======================================================================
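Ron's figures can be checked directly: 153,259 - 96,373 = 56,886 bytes, or about 59% more than the clear-text route. The sketch below reproduces the shape of that comparison in Python. It is only an illustration under stated assumptions: zlib (DEFLATE) stands in for both ARC and compress, which are LZW-based; the sample document is synthetic; and the function names are made up for this example. Absolute numbers will therefore differ from Ron's.

```python
import binascii
import random
import zlib


def uuencode_body(data: bytes) -> bytes:
    """Return the uuencoded body lines for data (no begin/end lines)."""
    return b"".join(
        binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)
    )


def make_sample_text(n_words: int = 20000, seed: int = 1987) -> bytes:
    """Build a synthetic ASCII 'document' from a small, hypothetical word list."""
    words = ("news link cost text file post bytes compress archive site "
             "modem article source document moderator group").split()
    rng = random.Random(seed)
    return " ".join(rng.choice(words) for _ in range(n_words)).encode("ascii")


def compare_routes(text: bytes) -> tuple[int, int]:
    """Return (clear-text route size, archive+uuencode route size) in bytes.

    Both routes end with one pass of zlib, standing in for the compression
    applied on the news link.
    """
    clear_route = len(zlib.compress(text))
    archived = zlib.compress(text)       # stand-in for ARC
    posted = uuencode_body(archived)     # what would appear in the article
    packed_route = len(zlib.compress(posted))
    return clear_route, packed_route


# Sizes reported in Ron's article, in bytes.
ron_packed, ron_clear = 153_259, 96_373
print("extra bytes in Ron's test:", ron_packed - ron_clear)

clear_route, packed_route = compare_routes(make_sample_text())
print("clear text, then link compression:        ", clear_route)
print("archive + uuencode, then link compression:", packed_route)
```

On text like this the second route should come out larger, because already-compressed data gives the link's compressor little left to remove while uuencoding adds its own overhead.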
heiby@mcdchg.UUCP (Ron Heiby) (11/03/87)
Christopher N Maag (cmaag@csd4.milw.wisc.edu.UUCP) writes:
> "If you submit a file to one of the newsgroups and you wish to uuencode
> the file, _do not_ perform any type of file compression to this
> file before uuencoding it. This means don't arc the file, (insert
> other popular compression schemes for other computer systems here).
> If you do compress the file, it will actually get _larger_ when it is
> sent than it was originally. This costs us all money."

I think Chris is going further than I suggested. I have no evidence that
the problem is compressing before uuencoding, and I suspect that it has
little to do with it. I was talking about the difference between sending
clear text and sending compressed/uuencoded text.

I think it would be interesting to check on what Chris is suggesting and
get some numbers on the difference between sending uuencoded binary files
vs uuencoded compressed binary files. I suspect that a uuencoded
compressed binary file would actually be smaller, but the further impact
of the news software's compress on the resulting files is unknown.
--
Ron Heiby, heiby@mcdchg.UUCP Moderator: comp.newprod & comp.unix
"I know engineers. They love to change things." McCoy
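The measurement Ron proposes is easy to set up. The following is a minimal sketch of the experiment, not a result: zlib stands in for compress and ARC, the "binary file" is a synthetic table of packed records invented for this example, and the function names are hypothetical. What happens on real binaries will depend on their contents.

```python
import binascii
import struct
import zlib


def uuencode_body(data: bytes) -> bytes:
    """Return the uuencoded body lines for data (no begin/end lines)."""
    return b"".join(
        binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)
    )


def make_sample_binary(n_records: int = 2000) -> bytes:
    """Hypothetical binary payload: fixed-width little-endian records."""
    return b"".join(
        struct.pack("<IHd", i, (i * i) % 97, 0.5 * i) for i in range(n_records)
    )


def measure(binary: bytes) -> dict[str, int]:
    """Sizes in bytes for each posting variant, before and after link compression."""
    compressed = zlib.compress(binary)          # stand-in for compress/ARC
    uu_raw = uuencode_body(binary)
    uu_compressed = uuencode_body(compressed)
    return {
        "raw": len(binary),
        "uuencode(raw)": len(uu_raw),
        "uuencode(compressed)": len(uu_compressed),
        # zlib again, standing in for the news software's compress on the link
        "link(uuencode(raw))": len(zlib.compress(uu_raw)),
        "link(uuencode(compressed))": len(zlib.compress(uu_compressed)),
    }


sizes = measure(make_sample_binary())
for name, size in sizes.items():
    print(f"{name:28} {size:7d}")
```

The first three rows answer the question of what is posted; the last two show what a compressing link would actually carry, which is the part Ron describes as unknown.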
usenet@delrio.cc.umich.edu (Usenet News) (11/12/87)
In article <2255@mcdchg.UUCP> heiby@mcdchg.UUCP (Ron Heiby) writes:
%Christopher N Maag (cmaag@csd4.milw.wisc.edu.UUCP) writes:
%> "If you submit a file to one of the newsgroups and you wish to uuencode
%> the file, _do not_ perform any type of file compression to this
%> file before uuencoding it. This means don't arc the file, (insert
%> other popular compression schemes for other computer systems here).
%> If you do compress the file, it will actually get _larger_ when it is
%> sent than it was originally. This costs us all money."
%
%I think Chris is going further than I suggested. I have no evidence that
%the problem is compressing before uuencoding, and I suspect that it has
%little to do with it. I was talking about the difference between sending
%clear text and sending compressed/uuencoded text. I think it would be
%interesting to check on what Chris is suggesting and get some numbers on
%the difference between sending uuencoded binary files vs uuencoded compressed
%binary files. I suspect that a uuencoded compressed binary file would
%actually be smaller, but the further impact of the news software's compress
%on the resulting files is unknown.
%--
%Ron Heiby, heiby@mcdchg.UUCP Moderator: comp.newprod & comp.unix
%"I know engineers. They love to change things." McCoy

In fact the results *are* known...

This comes up every couple months, and the plain fact is that running
the compress algorithm twice on a piece of data *WILL* generate a larger
file. Generally 30% larger. This will happen with both ARC files and
files compressed by compress (4.0). They don't use identical algorithms,
but both use modified Lempel-Ziv encoding schemes, and both react in the
same way to being 'run over themselves.'

Note - this is on the binary data itself. If you uuencode a compressed
file, you will probably win in the long run. Figure about 40% compression,
and 25% expansion, and *then* on some sites you'll get more compression
during actual transit.

Since the Lempel-Ziv scheme works so well on strings of printable text, in fact, it might be the optimal solution to post binaries as uuencoded compressed data...
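The rough figures in this post can be combined into a net size estimate. The sketch below does that arithmetic and also measures uuencode's per-line overhead with Python's binascii module: a full line carries 45 input bytes in 62 characters (one length character, 60 data characters, one newline), which is nearer 38% expansion than 25%. The 40% compression figure is the poster's estimate, not a measurement, and the function name is made up for illustration.

```python
import binascii


def net_size_ratio(compression_saving: float, encoding_expansion: float) -> float:
    """Posted size relative to the original after compressing, then uuencoding."""
    return (1.0 - compression_saving) * (1.0 + encoding_expansion)


# One full uuencode line: 45 input bytes become length char + 60 chars + newline.
full_line = binascii.b2a_uu(bytes(45))
measured_expansion = len(full_line) / 45 - 1.0

print(f"uuencode expansion per full line: {measured_expansion:.1%}")
print(f"net ratio with the post's 25% figure: {net_size_ratio(0.40, 0.25):.2f}")
print(f"net ratio with the measured figure:   {net_size_ratio(0.40, measured_expansion):.2f}")
```

Either way the ratio stays below 1, so under the poster's 40% assumption a compressed and uuencoded binary is still smaller than the original, before any further compression in transit.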