xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (01/23/91)
peter@sugar.hackercorp.com (Peter da Silva) writes:

> Apologies in advance for the low-temp setting on this article.

Oh, I'll warm it right up for you, never fear.

> The highest costs associated with Usenet are telecommunications costs,
> and they are lower with plain text sources. Why? Because the most
> expensive links are compressed, and the
> compressed-uuencoded-recompressed version is quite a bit larger than
> the compressed version itself.

> It really is not appropriate to send stuff in uuencoded compressed
> archives unless there is some technical reason plain text won't work.

Well, let's address your points out of order.

First, doing a shar of the original clear text code received the
following report:

    Found 592 control chars in "'lh.doc.japanese'"
    Found 124 control chars in "'lh.inst.japanese'"
    Found 320 control chars in "'lh.n.japanese'"

So, using the recommended clear text technology, three of the enclosed
files would have arrived damaged.

Second, "compress,uuencode,recompress" is not the best use of
technology; I did a little test with the same files in just one big
shar, to simplify the reporting of the results:

The size of the original clear text shar:

    -rw-r--r--  1 xanthian   179346 Jan 22 21:53 lha.sh

As typically compressed from clear text using sixteen bit "compress" to
transmit news:

    -rw-r--r--  1 xanthian    76691 Jan 22 22:06 lha.sh.Z

The same shar as lharc'ed and uuencoded and then typically compressed
for news transmission:

    -rw-r--r--  1 xanthian    58303 Jan 22 21:56 lha.lzh
    -rw-r--r--  1 xanthian    80356 Jan 22 21:58 lha.lzh.uu
    -rw-r--r--  1 xanthian    73077 Jan 22 21:58 lha.lzh.uu.Z

So in fact, for the files being sent, there is some modest _gain_ in
telecommunications efficiency by using the best compression technology
on text, and then uuencoding it and letting the standard net node to
node compression have its way with the files.

The conclusions are thus exactly opposite to both your arguments.

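
The growth from 58303 to 80356 bytes in the uuencode step can be checked
by arithmetic. What follows is only a sketch, and it assumes the
traditional uuencode layout: 45 input bytes per line, one length
character at the start of each line, 4 output characters per 3 input
bytes, a newline per line, a zero-length line before "end", and a
header of the form "begin 644 <name>". The mode 644 and the function
name uuencoded_size are assumptions made for illustration; the post
does not state them.

```python
def uuencoded_size(n_bytes: int, name: str, mode: str = "644") -> int:
    """Predict the byte size of a classic uuencode output file.

    Assumes the historical format: 45 data bytes per line, one length
    character per line, 4 output characters per 3 input bytes.
    """
    full_lines, tail = divmod(n_bytes, 45)

    # Header line, e.g. "begin 644 lha.lzh\n" (mode is an assumption).
    size = len(f"begin {mode} {name}\n")

    # Each full line: length char + 60 encoded chars + newline.
    size += full_lines * (1 + 60 + 1)

    # A short final line is padded up to a whole 3-byte group.
    if tail:
        groups = -(-tail // 3)  # ceiling division
        size += 1 + 4 * groups + 1

    # Zero-length terminator line (one char + newline), then "end\n".
    size += 2
    size += len("end\n")
    return size


if __name__ == "__main__":
    raw = 58303  # size of lha.lzh from the listing above
    predicted = uuencoded_size(raw, "lha.lzh")
    print(f"predicted lha.lzh.uu size: {predicted} bytes")
    print(f"expansion factor: {predicted / raw:.3f}")
```

Under those assumptions the prediction for lha.lzh is 80356 bytes, the
same figure as in the listing, so the roughly 38% expansion is the
expected cost of uuencoding rather than anything peculiar to these
files.
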
Don't feel bad, though, Peter, most folks don't realize how far behind
best technology "compress" has fallen, and continue to spout the same
superstitious nonsense you did.

Actually, things are a bit worse than that yet for the clear text case,
and better for the best technology case.

First, the standard response to the control characters problem is to
uuencode just the files with the control characters, and put them into
a shar with the remaining files as clear text. This gives a still
bigger shar:

    -rw-r--r--  1 xanthian   187221 Jan 22 22:36 lhb.sh

which is quite a bit bigger than before when typically compressed for
transmission:

    -rw-r--r--  1 xanthian    84945 Jan 22 22:32 lhb.sh.Z

Second, there is no reason to pay shar overhead, nor to uuencode the
control-character-containing files, with a competent archiving
compression tool, so compressing the original files filewise saves
that overhead:

    -rw-r--r--  1 xanthian    57965 Jan 22 22:35 lhc.lzh
    -rw-r--r--  1 xanthian    79890 Jan 22 22:38 lhc.lzh.uu

and leads to a modestly smaller _yet_ typically compressed file for
transmission:

    -rw-r--r--  1 xanthian    72595 Jan 22 22:38 lhc.lzh.uu.Z

So, at the end, for the particular files under discussion in this
thread, best technology as opposed to the existing clear text methods
transmits 72595 bytes instead of 84945 bytes, or about 85% as much.
Fifteen percent off the phone bills would warm the cockles of any
system manager's heart.

At the recipient's site, clear text requires 187221 bytes of spool
space to store, as opposed to the lharc'd uuencoded file's 79890 bytes,
making the latter 43% as much as the former, a huge savings in a
crucial area to every site at which the data is stored.

This is such fun; I always love arguing against indefensible positions.
Who's next with some wimpy excuse why the source file transmission
method that has been successfully used in comp.binaries.ibm.pc and
comp.sources.atari.st for just ages can't possibly work in the other
source groups?

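
The two percentages quoted above follow directly from the file sizes in
the listings. A small sketch of that arithmetic, using only the byte
counts reported in this article; the dictionary keys and the helper
name percent_of are illustrative choices, not part of any tool:

```python
# Byte counts copied from the ls listings in this article.
SIZES = {
    "lhb.sh": 187221,        # clear text shar, binary-unsafe files uuencoded
    "lhb.sh.Z": 84945,       # the same shar after news-link compress
    "lhc.lzh.uu": 79890,     # lharc archive, uuencoded
    "lhc.lzh.uu.Z": 72595,   # the same after news-link compress
}


def percent_of(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    wire = percent_of(SIZES["lhc.lzh.uu.Z"], SIZES["lhb.sh.Z"])
    spool = percent_of(SIZES["lhc.lzh.uu"], SIZES["lhb.sh"])

    print(f"bytes on the wire: {wire:.1f}% of the clear text method")
    print(f"bytes in the spool: {spool:.1f}% of the clear text method")
    print(f"wire bytes saved:  {SIZES['lhb.sh.Z'] - SIZES['lhc.lzh.uu.Z']}")
    print(f"spool bytes saved: {SIZES['lhb.sh'] - SIZES['lhc.lzh.uu']}")
```

Rounded to whole percentages, these come out to the 85% and 43% figures
given above.
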
I have yet to see a single argument for the present methods that comes
down, at the last, to anything but sheer laziness on the part of those
who don't want to change their habits. Compressed, uuencoded
transmission methods win on every reasonable criterion.

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>
--
By the way, it is _not_ a solution to replace compress with a filter
form of lharc as the typical file compressor for telecommunications;
lharc is _much_ too slow to use at every step along the way, so it
needs to be done just once at the originating site to accomplish these
savings.