dyer@spdcc.UUCP (Steve Dyer) (05/29/86)
You know, there's no free lunch. I seem to remember that compressing
and uncompressing a batch of news takes a significant chunk of a VAX 780,
probably a more significant impact than phone line charges for many sites,
which is one reason that compressed batching of news articles isn't as popular
as it might be.
--
Steve Dyer
dyer@harvard.HARVARD.EDU
{linus,wanginst,bbncca,bbnccv,harvard,ima,ihnp4}!spdcc!dyer

grr@cbmvax.cbm.UUCP (George Robbins) (05/29/86)
In article <327@spdcc.UUCP> dyer@spdcc.UUCP (Steve Dyer) writes:
>You know, there's no free lunch. I seem to remember that compressing
>and uncompressing a batch of news takes a significant chunk of a VAX 780,
[...]
>Steve Dyer
>dyer@harvard.HARVARD.EDU

This is partly because when compress is built as part of news, it is
compiled with the large-system defaults, which trade a lot of memory for
speed. If you don't have the memory, or are actually sharing it with
other tasks, the resulting thrashing will put your machine to sleep.

You can still get effective compression by using a smaller number of
bits, chosen either at compile time or at run time. Of course, the other
end shouldn't send you data compressed with more bits than your end was
compiled for...
--
George Robbins - now working with,     uucp: {ihnp4|seismo|caip}!cbmvax!grr
but no way officially representing     arpa: cbmvax!grr@seismo.css.GOV
Commodore, Engineering Department      fone: 215-431-9255 (only by moonlite)
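[A minimal sketch of the bit-limit knob described above; the path names
are only illustrative, and the compile-time ceiling is a build option in
compress itself.]

    # Sending side: cap compression at 12 bits so a receiver whose
    # compress was compiled with small tables can still handle it.
    compress -b 12 /usr/spool/batch/othersite.batch

    # Receiving side: no -b flag is needed, since the bit width is
    # recorded in the .Z file, but uncompress refuses files compressed
    # with more bits than it was compiled for.
    uncompress othersite.batch.Z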
jerry@oliveb.UUCP (Jerry Aguirre) (05/31/86)
I think that those people who are not using compress because of the
additional CPU overhead are not considering the entire picture. Yes, it
takes CPU cycles to compress a batch of news, but by making the batches
smaller you save overhead in queueing and transmitting them. Here are
some timings run on a moderately loaded VAX-750 running 4.2BSD. The
input is a normal batch of news; compress is version 4.0.
First test: two uncompressed batches of 50K each.

      real    user    sys
      12.5     1.7    1.7    batch 50K
      13.5     2.7    2.0    uux (copy and queue)
      15.7     1.8    2.2    batch 50K
      16.6     2.9    2.2    uux (copy and queue)
    1012.3    10.1   20.2    uucico (2x50K)
              ----   ----
              19.2   28.3  = 47.5 CPU seconds

Second test: a single compressed batch of 100K.

      real    user    sys
      45.3     2.8    3.5    batch 100K
      46.1     9.5    3.2    compress 100K->50K
      46.3     2.4    2.5    uux (copy and queue)
     508.3     6.2   12.5    uucico (50K)
              ----   ----
              20.9   21.7  = 42.6 CPU seconds
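[For concreteness, the two cases being timed correspond roughly to
command sequences like the following. The exact arguments to batch and
uux depend on the news and uucp versions in use, so treat this as a
sketch rather than the script that was actually measured.]

    # First test: queue two 50K batches, uncompressed.
    batch /usr/spool/batch/othersite > /tmp/b1
    uux - -r othersite!rnews < /tmp/b1
    batch /usr/spool/batch/othersite > /tmp/b2
    uux - -r othersite!rnews < /tmp/b2
    uucico -r1 -sothersite          # transfers 2 x 50K

    # Second test: one 100K batch, compressed to about 50K, then queued.
    # "cunbatch" stands for whatever command the far end uses to
    # uncompress and unbatch.
    batch /usr/spool/batch/othersite > /tmp/b
    compress /tmp/b
    uux - -r othersite!cunbatch < /tmp/b.Z
    uucico -r1 -sothersite          # transfers 1 x 50K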
These timings are of course subject to a lot of variation on different
hardware and different versions of uucp. But in this configuration,
where CPU cycles are at a premium, it actually works out better to
compress than not! The real difference is probably even larger, since
some of that extra uucico activity consists of DH interrupts that are
probably not being charged to the uucico process. Also, the compress
process can easily be "niced", while using nice on the uucico process
will cause problems.
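For example (the priority value and path are arbitrary):

    # Compress the batch at low priority; uucico itself is left alone,
    # since, as noted above, nice-ing it causes problems.
    nice -15 compress /usr/spool/batch/othersite.batch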
Older versions of compress would run much faster if given a smaller
number of bits. I ran some timing tests on version 4.0, and while it
seems optimized for either 12 or 16 bits, the difference in CPU usage
between the two is negligible. If you are concerned about memory usage
then I suggest you use 12 bits. The difference in output file size
between 12 and 16 bits of compression is only about 6 percent. I would
also urge upgrading to version 4.0, as it is significantly faster than
older versions.
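One quick way to check the 12-versus-16-bit size difference on your own
traffic (the batch file name is just an example):

    # -c writes to standard output without touching the input file.
    compress -b 12 -c newsbatch | wc -c
    compress -b 16 -c newsbatch | wc -c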
In terms of system memory, the roughly 46 seconds during which compress
occupies memory can be traded off against the extra 504 seconds that
uucico stays resident in the uncompressed case.
So you can have your cake and eat it too: smaller queues, reduced phone
usage, and FEWER CPU cycles.
Jerry Aguirre @ Olivetti ATC
{hplabs|fortune|idi|ihnp4|tolerant|allegra|glacier|olhqma}!oliveb!jerry

tanner@ki4pv.UUCP (Tanner Andrews) (06/06/86)
News is distributed via a "diffusion" scheme, where articles are passed
along paths which are often redundant. When the n'th copy of an article
(for n != 1) arrives at a site, it is tossed out; the first copy is the
only copy kept or passed along. If we merely pass along the news without
uncompressing and unbatching it, we lose the ability to toss duplicates
from our site. We also lose the ability to _not_ pay phone bills to
re-transmit the duplicate.

If you are not batching your news before transmission, your phone bills
are much higher than they should be. You are losing a fair amount of the
value of compression if you compress 100 short articles separately
rather than the same articles in a batch -- even if you have an iAPX-286
processor which is limited to 12-bit compression.

There is also a certain amount of overhead PER FILE transmitted; for
short files the UUCP negotiation may take as long as the actual file
transmission. If you batch and transmit 100 articles, there are two
files transmitted (the batch + the UUX file). Transmit each article
separately, and you have to negotiate 200 times for 200 files.
--
<std dsclm, copies upon request>            Tanner Andrews
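[To make the file-count arithmetic above concrete, a rough sketch; the
site name, file names, and remote unbatching command are only
illustrative.]

    # 100 separate articles: 100 uux jobs, i.e. 100 data files plus 100
    # command files, each negotiated individually.
    for a in article.*
    do
            uux - -r othersite!rnews < $a
    done

    # The same 100 articles as one compressed batch: a single data file
    # plus a single command file for the whole lot.
    compress -b 12 -c bigbatch | uux - -r othersite!cunbatch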