dyer@spdcc.UUCP (Steve Dyer) (05/29/86)
You know, there's no free lunch.  I seem to remember that compressing and
uncompressing a batch of news takes a significant chunk of a VAX 780,
probably a more significant impact than phone line charges for many sites,
which is one reason that compressed batching of news articles isn't as
popular as it might be.
-- 
Steve Dyer
dyer@harvard.HARVARD.EDU
{linus,wanginst,bbncca,bbnccv,harvard,ima,ihnp4}!spdcc!dyer
grr@cbmvax.cbm.UUCP (George Robbins) (05/29/86)
In article <327@spdcc.UUCP> dyer@spdcc.UUCP (Steve Dyer) writes:
>You know, there's no free lunch. I seem to remember that compressing
>and uncompressing a batch of news takes a significant chunk of a VAX 780,
[...]
>Steve Dyer dyer@harvard.HARVARD.EDU

This is partly because when compress is built as part of news, it is
compiled with the large-system defaults, which trade a great deal of
memory in favor of speed.  If you don't have the memory, or are actually
sharing it with other tasks, the resulting thrashing will put your machine
to sleep.  You can still get effective compression by using a smaller
number of bits, set either at compile time or at run time.  Of course, the
other end shouldn't send you data compressed with more bits than your end
is compiled for...
-- 
George Robbins - now working with,     uucp:  {ihnp4|seismo|caip}!cbmvax!grr
but no way officially representing     arpa:  cbmvax!grr@seismo.css.GOV
Commodore, Engineering Department      fone:  215-431-9255 (only by moonlite)
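George's memory-vs-bits tradeoff can be made concrete with a minimal LZW
compressor sketch (this is an illustrative sketch, not the actual
compress(1) source, and the names like lzw_compress and maxbits are my
own): the string table holds at most 2**maxbits entries, so 16-bit codes
need a table sixteen times the size of a 12-bit one.

```python
# Minimal LZW compressor sketch -- NOT the real compress(1) implementation.
# The string table is capped at 2**maxbits entries, which is where the
# memory goes: 12 bits -> 4096 entries, 16 bits -> 65536 entries.

def lzw_compress(data: bytes, maxbits: int = 12) -> list[int]:
    max_entries = 1 << maxbits
    table = {bytes([i]): i for i in range(256)}   # single-byte seeds
    next_code = 256
    out = []
    w = b""
    for byte in data:
        c = bytes([byte])
        if w + c in table:
            w = w + c                 # extend the current match
        else:
            out.append(table[w])      # emit code for longest match
            if next_code < max_entries:
                table[w + c] = next_code   # grow table until the cap
                next_code += 1
            w = c
    if w:
        out.append(table[w])
    return out
```

With 12 bits every emitted code fits in 12 bits, so a site compiled for
12-bit tables can always decode it; data compressed with 16 bits cannot
be decoded by a 12-bit build, which is George's closing caveat.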
jerry@oliveb.UUCP (Jerry Aguirre) (05/31/86)
I think that those people who are not using compress because of the
additional CPU overhead are not considering the entire picture.  Yes, it
takes CPU cycles to compress a batch of news.  But remember, by making the
batches smaller you save overhead in queueing and transmitting them.

Here are some timings run on a medium-loaded VAX 750 running 4.2BSD.  The
input file is a normal batch of news; compress is version 4.0.

First test, with two uncompressed batches of 50K:

    12.5 real     1.7 user     1.7 sys    batch 50K
    13.5 real     2.7 user     2.0 sys    uux (copy and queue)
    15.7 real     1.8 user     2.2 sys    batch 50K
    16.6 real     2.9 user     2.2 sys    uux (copy and queue)
  1012.3 real    10.1 user    20.2 sys    uucico (2x50K)
                 ----         ----
                 19.2         28.3   = 47.5 cpu seconds

Second test, with a single compressed batch of 100K:

    45.3 real     2.8 user     3.5 sys    batch 100K
    46.1 real     9.5 user     3.2 sys    compress 100K->50K
    46.3 real     2.4 user     2.5 sys    uux (copy and queue)
   508.3 real     6.2 user    12.5 sys    uucico (50K)
                 ----         ----
                 20.9         21.7   = 42.6 cpu seconds

These timings are of course subject to a lot of variation for different
hardware and different versions of uucp.  But in this configuration, where
CPU cycles are at a premium, it actually works out better to compress than
not!  The real difference is probably larger still, since some of the extra
uucico activity consists of DH interrupts that are probably not being
charged to the uucico process.  Also, the compress process can easily be
"niced", while using nice on the uucico process will cause problems.

Older versions of compress would run much faster if given a smaller number
of bits.  I ran some timing tests on version 4.0, and while it seems
optimized for either 12 or 16 bits, the difference in CPU usage between
the two is negligible.  If you are concerned about memory usage, then I
suggest you use 12 bits; the output file is only about 6 percent larger
than with 16 bits of compression.
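The CPU totals in the timings above can be checked with a short
recomputation (the (user, sys) pairs below are copied straight from the
tables):

```python
# Recompute the CPU totals (user + sys seconds) from the timings above.
uncompressed = [(1.7, 1.7), (2.7, 2.0), (1.8, 2.2), (2.9, 2.2), (10.1, 20.2)]
compressed   = [(2.8, 3.5), (9.5, 3.2), (2.4, 2.5), (6.2, 12.5)]

def total_cpu(steps):
    # Sum user + sys over every step; round to the tables' precision.
    return round(sum(u + s for u, s in steps), 1)

print(total_cpu(uncompressed))  # 47.5 cpu seconds without compress
print(total_cpu(compressed))    # 42.6 cpu seconds with compress
```

So the compressed path spends about 5 fewer CPU seconds overall, even
after paying 9.5 user seconds for compress itself, because uucico runs
half as long.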
I would also urge upgrading to version 4.0, as it is significantly faster
than older versions.  In terms of system memory usage, the 46 seconds of
compress memory residence can be traded off against the extra 504 seconds
of uucico memory residence.

So you can have your cake and eat it too: smaller queues, reduced phone
usage, and FEWER cpu cycles.

				Jerry Aguirre @ Olivetti ATC
{hplabs|fortune|idi|ihnp4|tolerant|allegra|glacier|olhqma}!oliveb!jerry
tanner@ki4pv.UUCP (Tanner Andrews) (06/06/86)
News is distributed via a "diffusion" scheme, where articles are passed
along paths which are often redundant.  When the n'th copy of an article
(for n != 1) arrives at a site, it is tossed out.  The first copy is the
only copy kept or passed along.

If we merely pass the news along without uncompressing and unbatching it,
we lose the ability to toss duplicates at our site, and we end up paying
phone bills to re-transmit those duplicates.

If you are not batching your news before transmission, your phone bills
are much higher than they should be.  You lose a fair amount of the value
of compression if you compress 100 short articles separately rather than
the same articles in a batch -- even if you have an iAPX-286 processor
which is limited to 12-bit compression.

There is also a certain amount of overhead PER FILE transmitted; for short
files the UUCP negotiation may take as long as the actual file
transmission.  If you batch and transmit 100 articles, only two files are
transmitted (the batch + the UUX file).  Transmit each article separately,
and you have to negotiate 200 times for 200 files.
-- 
<std dsclm, copies upon request>		Tanner Andrews