[news.admin] REPOST lharc102A Part 01/04 BSD Unix to Amiga archives

xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (01/23/91)

peter@sugar.hackercorp.com (Peter da Silva) writes:

> Apologies in advance for the low-temp setting on this article.

Oh, I'll warm it right up for you, never fear.

> The highest costs associated with Usenet are telecommunications costs,
> and they are lower with plain text sources. Why? Because the most
> expensive links are compressed, and the
> compressed-uuencoded-recompressed version is quite a bit larger than
> the compressed version itself.

> It really is not appropriate to send stuff in uuencoded compressed
> archives unless there is some technical reason plain text won't work.

Well, let's address your points out of order.

First, running shar on the original clear text produced the following
report:

	Found 592 control chars in "'lh.doc.japanese'"
	Found 124 control chars in "'lh.inst.japanese'"
	Found 320 control chars in "'lh.n.japanese'"

So, using the recommended clear text technology, three of the enclosed
files would have arrived damaged.
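For anyone who wants to run the same sort of check without shar handy,
here is a rough Python sketch. The definition of "control char" is an
assumption on my part (anything below 0x20 other than tab and newline,
plus DEL); shar's own rule may differ, and the helper name
count_control_chars is made up for illustration.

```python
def count_control_chars(data: bytes) -> int:
    """Count bytes a text-only channel might mangle.

    Assumed definition: bytes below 0x20 other than tab and newline,
    plus DEL (0x7F). This is a guess at what shar reports, not its rule.
    """
    allowed = {0x09, 0x0A}  # tab and newline count as ordinary text here
    return sum(1 for b in data if (b < 0x20 and b not in allowed) or b == 0x7F)


if __name__ == "__main__":
    # Hypothetical sample: two ESC bytes of the kind used by some
    # Japanese text encodings to switch character sets.
    sample = b"plain text\n\x1b$Bsome kanji\x1b(B\n"
    print(count_control_chars(sample))
```

Run over each file before posting, a nonzero count is the signal that
the file needs encoding of some kind to survive the trip.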

Second, "compress,uuencode,recompress" is not the best use of technology;
I did a little test with the same files in just one big shar, to simplify
the reporting of the results:

The size of the original clear text shar:

-rw-r--r--  1 xanthian   179346 Jan 22 21:53 lha.sh

As typically compressed from clear text using sixteen bit "compress" to
transmit news:

-rw-r--r--  1 xanthian    76691 Jan 22 22:06 lha.sh.Z

The same shar as lharc'ed and uuencoded and then typically compressed
for news transmission:

-rw-r--r--  1 xanthian    58303 Jan 22 21:56 lha.lzh
-rw-r--r--  1 xanthian    80356 Jan 22 21:58 lha.lzh.uu
-rw-r--r--  1 xanthian    73077 Jan 22 21:58 lha.lzh.uu.Z

So in fact, for the files being sent, there is some modest _gain_ in
telecommunications efficiency by using the best compression technology
on text, and then uuencoding it and letting the standard net node to
node compression have its way with the files.
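The uuencode expansion itself is plain arithmetic, and can be checked
with a short Python sketch. It assumes the traditional layout: 45 input
bytes per line, each line a length character plus four characters per
three bytes plus a newline, a "begin 644 name" header, and a zero-length
line followed by "end" as the trailer. The function name uuencoded_size
and the use of the archive's own file name in the header are assumptions
for illustration.

```python
def uuencoded_size(n_bytes: int, name: str, mode: str = "644") -> int:
    """Predict the size of a traditionally uuencoded file (assumed layout)."""
    header = len(f"begin {mode} {name}\n")
    full_lines, rest = divmod(n_bytes, 45)
    body = full_lines * 62          # 1 length char + 60 data chars + newline
    if rest:
        groups = -(-rest // 3)      # ceiling division: 3 bytes per 4 chars
        body += 1 + 4 * groups + 1  # short final data line
    trailer = 2 + 4                 # zero-length line, then "end\n"
    return header + body + trailer


if __name__ == "__main__":
    print(uuencoded_size(58303, "lha.lzh"))  # 80356
    print(uuencoded_size(57965, "lhc.lzh"))  # 79890
```

Under those assumptions the predicted sizes agree with the listings in
this article, so the roughly 38% growth from uuencoding is entirely
accounted for by the encoding's fixed 62-for-45 line overhead.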

The conclusions are thus exactly opposite to both your arguments.

Don't feel bad, though, Peter; most folks don't realize how far
behind best technology "compress" has fallen, and continue to spout
the same superstitious nonsense you did.

Actually, things are a bit worse than that yet for the clear text case,
and better for the best technology case.

First, the standard response to the control characters problem is to
uuencode just the files with the control characters, and put them into a
shar with the remaining files as clear text. This gives a still bigger
shar:

-rw-r--r--  1 xanthian   187221 Jan 22 22:36 lhb.sh

which is quite a bit bigger than before when typically compressed for
transmission:

-rw-r--r--  1 xanthian    84945 Jan 22 22:32 lhb.sh.Z

Second, there is no reason to pay shar overhead, nor to uuencode the
files containing control characters, given a competent archiving
compression tool, so compressing the original files filewise saves that
overhead:

-rw-r--r--  1 xanthian    57965 Jan 22 22:35 lhc.lzh
-rw-r--r--  1 xanthian    79890 Jan 22 22:38 lhc.lzh.uu

and leads to a modestly smaller _yet_ typically compressed file for
transmission:

-rw-r--r--  1 xanthian    72595 Jan 22 22:38 lhc.lzh.uu.Z

So, at the end, for the particular files under discussion in this
thread, best technology as opposed to the existing clear text methods
transmits 72595 bytes instead of 84945 bytes, or about 85% as much.
Fifteen percent off the phone bills would warm the cockles of any system
manager's heart.

At the recipient's site, clear text requires 187221 bytes of spool space
to store, as opposed to the lharc'd uuencoded file's 79890 bytes, making
the latter 43% as much as the former, a huge savings in a crucial area
to every site at which the data is stored.
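The two percentages can be rechecked from the byte counts in the
listings. A minimal Python sketch; the variable names are mine, the
numbers are the ones reported in this article.

```python
# Byte counts taken from the ls listings in this article.
clear_text_spool = 187221   # lhb.sh: shar with the three files uuencoded
clear_text_wire = 84945     # lhb.sh.Z: that shar after compress
lharc_spool = 79890         # lhc.lzh.uu: lharc'd and uuencoded
lharc_wire = 72595          # lhc.lzh.uu.Z: the same after compress

wire_ratio = lharc_wire / clear_text_wire
spool_ratio = lharc_spool / clear_text_spool

print(f"wire:  {round(100 * wire_ratio)}% of clear text")
print(f"spool: {round(100 * spool_ratio)}% of clear text")
```

Rounded to whole percentages these come out to 85 and 43, as stated.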

This is such fun; I always love arguing against indefensible positions.

Who's next with some wimpy excuse why the source file transmission
method that has been successfully used in comp.binaries.ibm.pc and
comp.sources.atari.st for just ages can't possibly work in the other
source groups?

I have yet to see a single argument for the present methods that
comes down, at the last, to anything but sheer laziness on the part
of those who don't want to change their habits.  Compressed, uuencoded
transmission methods win on every reasonable criterion.

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>
--
By the way, it is _not_ a solution to replace compress with a filter
form of lharc as the typical file compressor for telecommunications;
lharc is _much_ too slow to use at every step along the way, so it
needs to be done just once at the originating site to accomplish these
savings.