[net.news] What a good news compressor should do - current state

jim@haring.UUCP (04/01/84)

A (to some) considerable part of the net is already running with
compaction: everything to the east of, and including, decvax and philabs.
The trans-Atlantic links to mcvax, and most feeds from there to other
countries in Continental Europe, use a scheme suggested by Armando
Stettner and implemented by me.  It marries the Berkeley compact/uncompact
programs to the batching programs found in a subdirectory of the news 2.10
distribution.
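
In outline, the sending side is little more than a shell pipeline.  The
sketch below is illustrative only; the file and command names (the batch
list, the remote 'cunbatch' command) are made up for this posting and are
not the actual scripts from the distribution:

	SITE=$1
	BATCHLIST=/usr/spool/news/batch/$SITE	# queued-article list for $SITE (name made up)
	/usr/lib/news/batch $BATCHLIST |	# write one "#! rnews"-style batch to stdout
	    compact |				# Berkeley compact runs as a stdin/stdout filter
	    uux - $SITE!cunbatch		# queue for UUCP; the remote command uncompacts

The remote end simply reverses the last two steps before handing the
batch to the news software.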

This has been working well since October '83.  Average data compaction is
around 33%, and on top of that there is less UUCP protocol overhead, so
the real savings are larger than they appear at first sight.  This scheme
is completely separate from the 'normal' batching scheme, and the two can
be maintained side by side.  It has an option to specify the maximum size
of (uncompacted) batch file to create, so it may produce several smaller
files rather than one large one.  This is needed because the success rate
in trying to transfer 50K+ files in one attempt across the Big Puddle is
not high.
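
The size cap works along the following lines.  Again, this is only a
sketch with made-up names (MAXSIZE, the temporary file, the remote
command), not the code shipped with the scheme:

	SITE=$1
	BATCHLIST=/usr/spool/news/batch/$SITE
	MAXSIZE=50000			# maximum uncompacted bytes per transfer (made up)
	TMP=/tmp/cbatch$$
	> $TMP
	while read article
	do
		cat "$article" >> $TMP	# the real batcher also writes "#! rnews <size>" headers
		if test `wc -c < $TMP` -ge $MAXSIZE
		then
			compact < $TMP | uux - $SITE!cunbatch
			> $TMP		# start the next, smaller batch
		fi
	done < $BATCHLIST
	if test -s $TMP			# flush the final partial batch
	then
		compact < $TMP | uux - $SITE!cunbatch
	fi
	rm -f $TMP

Each transfer then stays under the cap, so a failed call costs at most
one small batch rather than the whole day's traffic.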

James A. Woods, ames-lm!jaw, makes some comparisons between the Berkeley
compact/uncompact and the older pack/unpack.  He may be right; the
Berkeley programs do seem to take a long time.  Their advantage is that
they work as standard input/output filters, which lends itself nicely to
this scheme and keeps the complexity of the system down (I shudder at the
thought of making the news system MORE complex).

There may be other, better ways of doing this; I wouldn't be surprised.
But parts of the net are already using a compaction scheme, including
two major sites in the USA, and that should be taken into consideration
when designing/proposing a new one.

Jim McKie  Centrum voor Wiskunde en Informatica, Amsterdam  ....mcvax!jim