[comp.mail.uucp] News compression and T2500's

jiro@shaman.com (Jiro Nakamura) (03/04/91)

Folks -
   I was wondering about C News compression formats and what the most
efficient way to transport news is. If my memory serves me correctly, C News
uses compress(1) to compress the news batches, and compress uses a
Lempel-Ziv compression format.
   Now..... I was wondering if it might be more efficient to *not* pre-compress
the news at the *system* level, but let the modem do the compressing itself.
I think PEP compression is LZW (right?) and therefore equivalent to 
compress, but isn't v.42 LAP-M compression more efficient than LZW? If so,
then wouldn't it be better to not compress batches and let the LAP-M do
the dirty work for us?
  Anyone have any comments?

	- jiro nakamura
	jiro@shaman.com
-- 
Jiro Nakamura				jiro@shaman.com
Shaman Consulting			(607) 253-0687 VOICE
"Bring your dead, dying shamans here!"	(607) 253-7809 FAX/Modem

grr@cbmvax.commodore.com (George Robbins) (03/04/91)

In article <1991Mar3.213145.6760@shaman.com> jiro@shaman.com (Jiro Nakamura) writes:
> Folks -
>    I was wondering about C News compression formats and what the most
> efficient way to transport news is. If my memory serves me correctly, C News
> uses compress(1) to compress the news batches, and compress uses a
> Lempel-Ziv compression format.
>    Now..... I was wondering if it might be more efficient to *not* pre-compress
> the news at the *system* level, but let the modem do the compressing itself.

It's been previously asserted that the number of instructions executed
per byte in the compress program is significantly smaller than the number
involved in typical Unix character I/O processing.  If so, then minimizing
the number of characters at the RS232 interface is still an overall win.

There is also a question as to whether compression in a modem can be as
effective, since the modem (typically) has more limited memory resources
and must compress "on the fly".  Obviously, the real compression efficiency
depends on the quality of the algorithm and how well the data matches its
assumptions.
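
A rough way to see the limited-memory effect is the sketch below. It is only
an illustration under loud assumptions: it uses Python's zlib (DEFLATE), not
the LZW of compress or the modem's own scheme, and it stands in for "memory"
with the zlib history window (`wbits`). The helper names and the synthetic
sample text are made up for the example.

```python
import random
import zlib


def compressed_size(data: bytes, wbits: int) -> int:
    """Return the DEFLATE-compressed size of data for a given window size.

    wbits=9 means a 512-byte history window; wbits=15 means 32 KB.
    """
    comp = zlib.compressobj(9, zlib.DEFLATED, wbits)
    return len(comp.compress(data) + comp.flush())


def sample_text(seed: int = 0) -> bytes:
    """Build synthetic text (an assumption, not real news) whose repeats
    lie several kilobytes apart, so a small window cannot see them."""
    rng = random.Random(seed)
    words = ["news", "batch", "modem", "uucp", "article",
             "header", "path", "compress"]
    block = " ".join(rng.choice(words) for _ in range(800))
    return "\n".join([block] * 10).encode("ascii")


data = sample_text()
small = compressed_size(data, 9)    # stand-in for a memory-limited compressor
large = compressed_size(data, 15)   # stand-in for a host-side compressor
print(len(data), small, large)
```

On this input the larger window should give the smaller output, because only
it can reach back far enough to find the repeated blocks. Real news batches
may behave differently.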

-- 
George Robbins - now working for,     uucp:   {uunet|pyramid|rutgers}!cbmvax!grr
but no way officially representing:   domain: grr@cbmvax.commodore.com
Commodore, Engineering Department     phone:  215-431-9349 (only by moonlite)

gandrews@netcom.COM (Greg Andrews) (03/05/91)

In article <1991Mar3.213145.6760@shaman.com> jiro@shaman.com (Jiro Nakamura) writes:
>
>Now..... I was wondering if it might be more efficient to *not* pre-compress
>the news at the *system* level, but let the modem do the compressing itself.
>I think PEP compression is LZW (right?) and therefore equivalent to 
>compress, but isn't v.42 LAP-M compression more efficient than LZW? If so,
>then wouldn't it be better to not compress batches and let the LAP-M do
>the dirty work for us?
>

Probably not, but it's hard to say clearly and cleanly one way or the
other.  The reason is there are several factors all happening at once.
In the T2500, V.42bis is only available in the slower modulations (V.32,
V.22bis, etc.).  In PEP mode, you have the standard LZW compression.
Remembering that the T2500 has a top RS232 speed of 19200, the most
you'll get out of either type of compression method is 19200 bps.

V.42bis is definitely a more capable method than either MNP5 or PEP
Compression, but it starts off at a slight disadvantage because it's
running on top of a slower modulation (approx 11,000 bps with V.42
error control vs. approx. 14,400 bps with PEP).

Along with those factors is the fact that the modem can only see a
small portion of the overall file.  A compression utility on the
computer may be able to perform a more thorough analysis of the
file than the modem can.  With the speed characteristics of the
T2500 (19200 bps max) you may end up with a file that is squeezed
smaller than the 2:1 ratio possible with V.42bis compression in the 
T2500.  Take the file that's been compressed >2:1 and send it over
the 13,000-14,400 bps PEP link (your mileage may vary) rather than 
the approx. 11,000 bps V.32/V.42 link, and you should end up moving 
the data faster.
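
As a back-of-the-envelope check of that comparison, here is a minimal Python
sketch. The link speeds and the 19200 bps serial cap are the approximate
figures quoted above. The function name `effective_bps`, the 2.5:1 host
compression ratio, and the assumption that already-compressed data gains
nothing more from the modem are illustrative assumptions, not measurements.
UUCP protocol overhead and start/stop bits are ignored.

```python
def effective_bps(link_bps, dte_bps, host_ratio=1.0, modem_ratio=1.0):
    """Approximate throughput in bits of ORIGINAL data per second.

    link_bps    -- raw rate of the phone-line modulation
    dte_bps     -- RS232 speed between computer and modem
    host_ratio  -- compression done on the computer before sending
    modem_ratio -- compression done inside the modem
    """
    # Data handed to the modem crosses the line at link_bps * modem_ratio,
    # but it can never be fed in faster than the serial port allows.
    serial_rate = min(dte_bps, link_bps * modem_ratio)
    # Each bit on the serial port stands for host_ratio bits of original data.
    return serial_rate * host_ratio


# Uncompressed batches, V.32/V.42 link, modem compresses 2:1.
print(effective_bps(11000, 19200, modem_ratio=2.0))   # 19200.0 (serial cap)

# Batches compressed 2.5:1 on the host (hypothetical ratio), PEP link,
# modem compression assumed to add nothing.
print(effective_bps(14400, 19200, host_ratio=2.5))    # 36000.0
```

Under these assumptions the host-compressed case comes out ahead mainly
because the 19200 bps serial port limits how much uncompressed data the modem
can be given.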

Of course, this applies only to the T2500.  Modems with higher RS232
speeds may be able to get faster performance.  Whether it's better to
use the modem's compression or the computer's compression would be
a matter of which compression method can squeeze the data tighter.
With data that's mostly text (as is news), V.42bis may be able to
outperform the computer's compress utility.

In general, I believe in using the computer's horsepower to handle
compressing the data and leaving the modem's horsepower free to
move it through the phone lines.  OTOH, I also believe in doing
whatever gets the fastest results.

In the end (as I said before), it's hard to say which way is better.
The only way to tell for sure is to try some experiments and see.


-- 
.-------------------------------------------.
| Greg Andrews      |   gandrews@netcom.COM |
`-------------------------------------------'