[net.mail] Data compression to lower phone bills

dyer@spdcc.UUCP (Steve Dyer) (05/29/86)

You know, there's no free lunch.  I seem to remember that compressing
and uncompressing a batch of news takes a significant chunk of a VAX 780,
probably a bigger cost for many sites than the phone line charges it saves,
which is one reason compressed batching of news articles isn't as popular
as it might be.
-- 
Steve Dyer
dyer@harvard.HARVARD.EDU
{linus,wanginst,bbncca,bbnccv,harvard,ima,ihnp4}!spdcc!dyer

grr@cbmvax.cbm.UUCP (George Robbins) (05/29/86)

In article <327@spdcc.UUCP> dyer@spdcc.UUCP (Steve Dyer) writes:
>You know, there's no free lunch.  I seem to remember that compressing
>and uncompressing a batch of news takes a significant chunk of a VAX 780,
[...]
>Steve Dyer dyer@harvard.HARVARD.EDU

This is partly because when compress is built as part of news, it is
compiled with the large-system defaults, which trade a lot of memory for
speed.  If you don't have the memory, or are actually sharing it with other
tasks, the resulting thrashing will put your machine to sleep.

You can still get effective compression by using a smaller number of bits,
set either at compile time or at run time.  Of course, the other end
shouldn't send you data compressed with more bits than your end was
compiled to handle...
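
For example, assuming your compress supports the usual -b flag, the sending
site can limit itself to 12-bit codes with something like

	compress -b 12 < batchfile | uux - -r othersite!cunbatch

and a 16-bit uncompress on the receiving end will still unpack it, since
the number of bits is recorded in the compressed data.  (The site name and
the cunbatch convention here are only illustration; substitute whatever
your batcher really uses.)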
--
George Robbins - now working with,      uucp: {ihnp4|seismo|caip}!cbmvax!grr
but no way officially representing      arpa: cbmvax!grr@seismo.css.GOV
Commodore, Engineering Department       fone: 215-431-9255 (only by moonlite)

jerry@oliveb.UUCP (Jerry Aguirre) (05/31/86)

I think that those people who are not using compress because of the
additional CPU overhead are not considering the entire picture.  Yes, it
takes CPU cycles to compress a batch of news.  But remember, by making
the queued batches smaller you save overhead in queueing and transmitting
them.  Here are some timings run on a moderately loaded VAX 750 running
4.2BSD.  The input is a normal batch of news, and compress is version 4.0.

First test is with two uncompressed batches of 50K
       12.5 real         1.7 user         1.7 sys  batch 50K
       13.5 real         2.7 user         2.0 sys  uux (copy and queue)
       15.7 real         1.8 user         2.2 sys  batch 50K
       16.6 real         2.9 user         2.2 sys  uux (copy and queue)
     1012.3 real        10.1 user        20.2 sys  uucico (2x50K)
                        ----             ----
                        19.2             28.3 = 47.5 CPU seconds

Second test is with a single compressed batch of 100K
       45.3 real         2.8 user         3.5 sys  batch 100K
       46.1 real         9.5 user         3.2 sys  compress 100K->50K
       46.3 real         2.4 user         2.5 sys  uux (copy and queue)
      508.3 real         6.2 user        12.5 sys  uucico (50K)
                        ----             ----
                        20.9             21.7 = 42.6 CPU seconds
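
For concreteness, the two cases correspond roughly to the following steps
(the batch file name, site name, and exact flags are only illustrative and
vary with your news and uucp versions):

	# uncompressed: make each 50K batch and queue it as-is
	batch /usr/spool/batch/othersite 50000 > /tmp/batchfile
	uux - -r othersite!rnews < /tmp/batchfile

	# compressed: make one 100K batch, squeeze it to ~50K, then queue it
	batch /usr/spool/batch/othersite 100000 > /tmp/batchfile
	compress /tmp/batchfile
	uux - -r othersite!cunbatch < /tmp/batchfile.Z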

These timings are of course subject to a lot of variation for different
hardware and different versions of uucp.  But even in this configuration,
where CPU cycles are at a premium, it actually works out better to compress
than not!  The real difference is probably even larger, since some of that
extra uucico activity consists of DH interrupt handling that is probably
not being charged to the uucico process.  Also, the compress process can
easily be "niced", while using nice on the uucico process will cause
problems.
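
In the batching script that can be as simple as, say,

	nice -15 compress /tmp/batchfile

before handing the result to uux; the path and nice value here are only
examples, but don't try the same trick on uucico.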

Older versions of compress would run much faster if given a smaller number
of bits.  I ran some timing tests on version 4.0, and while it seems
optimized for either 12 or 16 bits, the difference in CPU usage between the
two is negligible.  If you are concerned about memory usage then I suggest
you use 12 bits.  The difference in output file size between 12 and 16 bits
of compression is only about 6 percent.  I would also urge upgrading to
version 4.0, as it is significantly faster than older versions.
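
That 6 percent figure is easy enough to check against your own feed;
assuming your compress has the -b flag, something like

	compress -b 12 < batchfile | wc -c
	compress -b 16 < batchfile | wc -c

shows the size difference, and putting "time" in front of each shows the
CPU difference.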

In terms of system memory usage, the 46 seconds that compress occupies
memory can be traded off against the extra 504 seconds that uucico would
otherwise occupy it.

So you can have your cake and eat it too: smaller queues, reduced phone
usage, and FEWER CPU cycles.

					Jerry Aguirre @ Olivetti ATC
{hplabs|fortune|idi|ihnp4|tolerant|allegra|glacier|olhqma}!oliveb!jerry