[comp.dcom.modems] Byte's high-speed modem evaluation - bogus

brad@optilink.UUCP (Brad Yearwood) (08/06/88)

About Byte's high speed modem evaluation article:

The results of the test would seem to indicate a serious flaw in the design
of their evaluation experiment.

The performance figures Byte gives for the Telcor 2496 modem are simply
incredible.  And I mean that in the literal sense of "lacking credibility".
If the Telcor uses V.22-bis modulation as claimed, then it has a raw
channel capacity of exactly 2400 bits/second in each direction of a
full-duplex connection.

Compression can only remove redundancy from the information stream going into
the modem, and express it in some abbreviated form which can be reconstructed
in the receiving modem.

Truly random data, given even a modest sample size, will have very little
redundant information.  Given a very large sample size, it will have
essentially no redundant information.
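
A quick way to see this is to run a general-purpose compressor over random
bytes.  The sketch below is only an illustration: it assumes zlib (DEFLATE) as
a stand-in for whatever proprietary scheme a modem uses, os.urandom as a
stand-in for a truly random source, and a deliberately repetitive text sample
for contrast.  None of these come from the Byte article.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return original size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data, 9))


# 100 KB from the OS entropy source (assumed stand-in for truly random data).
random_sample = os.urandom(100 * 1024)

# Highly redundant text, included only for contrast; real English text
# is far less repetitive than this.
text_sample = b"the quick brown fox jumps over the lazy dog\n" * 2000

print(f"random data ratio:     {compression_ratio(random_sample):.3f}")
print(f"repetitive text ratio: {compression_ratio(text_sample):.3f}")
```

The random sample should come out at a ratio of about 1 (typically a hair
below, because of framing overhead), while the repetitive text shrinks
dramatically.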

Most electronic sources of "random" data are in fact pseudo-random.  The
manufacturers of these sources are typically forthright about this.  Pseudo-
random sources are quite adequate for their usual intended functions of
measuring error rates in data transmission or data recording.  But the
considerations that are important for measuring transmission or recording
error rates are not at all the same as those for measuring modem compression
efficiency.

Pseudo-random sources are typically periodic - their output consists of a series
of repetitions of some string of bits, usually of modest length.  The Hewlett
Packard catalog shows that one of their test units intended for use at typical
modem speeds offers sequence lengths of 63, 511, and 2047 bits.  The objective of
using a pseudo-random source in transmission error rate testing is to exercise
a generous selection of patterns of adjacent single bits and small strings of
bits, and to repeat this selection frequently enough to exercise all of the
modulation states and transitions of the modem and find general degradation
or sensitivity to specific adjacent combinations of bits or bit strings.
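
Sequence lengths of 63, 511, and 2047 bits are what you get from maximal-length
linear feedback shift registers of 6, 9, and 11 stages (2^n - 1 states each).
The following is a minimal sketch of such a generator.  The tap positions are
ones commonly used for these lengths and are my assumption; I do not know which
taps the HP unit actually uses.

```python
def lfsr_period(nbits: int, taps: tuple, seed: int = 1) -> int:
    """Count the steps a Fibonacci LFSR takes to return to its seed state.

    taps are 1-based stage numbers whose bits are XORed to form the
    feedback bit that is shifted in at the low end of the register.
    """
    mask = (1 << nbits) - 1
    start = seed & mask
    state = start
    steps = 0
    while True:
        feedback = 0
        for tap in taps:
            feedback ^= (state >> (tap - 1)) & 1
        state = ((state << 1) | feedback) & mask
        steps += 1
        if state == start:
            return steps


# Assumed tap choices (register length, (taps)); not taken from any HP manual.
for nbits, taps in [(6, (6, 5)), (9, (9, 5)), (11, (11, 9))]:
    print(f"{nbits:2d}-stage register: period {lfsr_period(nbits, taps)} bits")
```

Whatever the seed (other than all zeros), the output repeats exactly after
that many bits, which is the property that matters for the argument here.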

Now let's take a look at what happens when we use such a periodic pseudo-random
source to test the compression abilities of modems.  Let's use the longest
sequence offered by the modem-oriented unit that I found in the HP catalog:
2047 bits.  I don't have at hand either the Byte article or the specifications
for the data source they used - I don't believe that it was this particular HP
unit, but the one they used is likely to be similar.

Let's pessimistically assume byte-aligned compression.  The least common
multiple of 8 (an 8-bit byte) and 2047 (a 2047-bit pseudo-random sequence)
is 16,376 bits, or 2047 bytes.  A bytewise view of the longest sequence
available from the particular modem tester I looked up would thus repeat every
2047 bytes.  A compression technique that finds recurring strings of bytes,
stores them in a dictionary, and replays subsequent occurrences as abbreviations
(dictionary indices), with a dictionary of reasonable size, could very quickly
learn either all 2047 bytes of this string, or at least significant substrings.
Let's assume that the dictionary could recognize and hold the entire 2047-byte
string, and that abbreviations are transmitted from the dictionary as 3-byte codes.
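
To get a feel for how easily a dictionary-style compressor learns such a
stream, here is a rough sketch.  It assumes zlib (DEFLATE) as a stand-in for
the modem's unknown algorithm, an arbitrary 2047-byte block as a stand-in for
the byte view of the test pattern, and an 80 Kbyte sample size.  All of these
are my assumptions for illustration only.

```python
import random
import zlib

PERIOD_BYTES = 2047          # byte-wise repetition period of the test pattern
SAMPLE_BYTES = 80 * 1024     # assumed transmission sample size

# One period of arbitrary bytes (fixed seed so the run is repeatable).
rng = random.Random(1988)
one_period = bytes(rng.randrange(256) for _ in range(PERIOD_BYTES))

# Repeat the period until the sample is full, then trim to size.
repeats = SAMPLE_BYTES // PERIOD_BYTES + 1
periodic_sample = (one_period * repeats)[:SAMPLE_BYTES]

compressed = zlib.compress(periodic_sample, 9)
ratio = len(periodic_sample) / len(compressed)

print(f"sample:     {len(periodic_sample)} bytes")
print(f"compressed: {len(compressed)} bytes")
print(f"ratio:      {ratio:.1f}:1")
```

The first period is essentially incompressible, but every later repetition
costs almost nothing, so the overall ratio is far beyond anything real data
would give.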

The first occurrence of the 2047-byte string would be sent at about 2400 bits
per second, and the receiving modem would concurrently enter it into the
dictionary and signal it to its connected computer.  Subsequent occurrences
could be signalled over the telephone channel with a 3-byte abbreviation code
for each occurrence.  At 2400 bits/second and 24 bits per code, we can signal
100 such abbreviations per second.  Each abbreviation is worth 2047 bytes, or
16,376 bits, of output.  Gee - looks like
we have a modem capable of 1,637,600 bits/second over an ordinary telephone
line.  If computers only had faster serial ports it would be time to short
like crazy the stocks of N.E.T. and other T1 equipment manufacturers, and of
AT&T and other T1 circuit providers.  And probably of disk drive companies, too.
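
The arithmetic behind that figure can be checked in a few lines.  The constant
names are mine; the numbers are the ones assumed in the text (2047-bit
sequence, 2400 bits/second channel, 3-byte abbreviation codes).

```python
from math import gcd

SEQUENCE_BITS = 2047   # length of the pseudo-random sequence
BYTE_BITS = 8          # byte-aligned compression
LINE_BPS = 2400        # raw V.22-bis channel rate, one direction
CODE_BYTES = 3         # assumed size of one dictionary abbreviation

# Least common multiple of the sequence length and the byte size.
period_bits = SEQUENCE_BITS * BYTE_BITS // gcd(SEQUENCE_BITS, BYTE_BITS)
period_bytes = period_bits // BYTE_BITS

# How many abbreviation codes fit through the channel each second.
codes_per_second = LINE_BPS // (CODE_BYTES * BYTE_BITS)

# Each code expands to one full period at the receiving end.
apparent_bps = codes_per_second * period_bits

print(f"byte-wise period: {period_bits} bits = {period_bytes} bytes")
print(f"codes per second: {codes_per_second}")
print(f"apparent rate:    {apparent_bps} bits/second")
```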

The longest period I could locate among HP's bit error rate testers or other
pseudo-random sources (except for sources fixed to a few very high telecom
specific rates like 34.368 Mbits/second) was 1,048,575 bits.  This could
be a reasonable source for a compressing modem test - if I recall correctly,
Byte's standard transmission sample size was something like 80Kbytes, which
is smaller than the repetition period of this particular source.  I am quite
confident that the results would have been very different had Byte's reviewers
used as their test data a similar source, or even 80K worth of compressed Usenet
news.

If we must consider a compressing modem as a black box (as we must with
these miraculous proprietary compression schemes), then the only interesting
or useful measures of its performance are:

Interesting:

      How it performs on a reasonably large amount (maybe 100Kbytes) of
      truly _random_ data.  I'll bet lunch that the throughput of ANY
      compressing modem on this test will be no better than the
      raw channel capacity of its modulation technique.  For V.22-bis,
      this is 2400 bits/second in each direction, full duplex.  For
      V.29, this is 9600 bits/second in one direction at a time.  For
      V.32, this is 9600 bits/second in each direction, full duplex.
      For a Trailblazer it varies in small increments: 11000 bits/second
      in one direction at a time is what I see on a frequently-used 110
      mile connection.  Half- or adaptive-duplex modems, such as V.29
      variants and Trailblazers, should in a full-duplex test show
      throughputs at best a bit worse than half their one-direction raw
      channel capacity (and it can get significantly worse).

Useful:

      How it performs on a large sample of the type of data you typically
      exchange, using whatever protocol and computers you typically use
      to exchange it.

A modem evaluation could be designed to come up with a reasonable
approximation to the "Useful" measure, by selecting and describing some
sets of data which the evaluators believe to represent the types of data to
be exchanged by typical systems in typical applications.  These data
could then be exchanged between one or more pairs of typical computers using
a few typical protocols, over a few real-world telephone connections.
If a laboratory situation with artificially impaired transmission must be
used, the impairment should be carefully specified by someone well-informed
about the type and degree of impairments encountered on real-world telephone
connections.

Either the Telcor 2496 uses something other than V.22-bis modulation, or
Byte's data source was periodic with a period significantly shorter than
their transmission sample size, and therefore so flawed as to make their
results worthless for those modems that perform compression, or I have
just made a complete ass of myself.

This is not to say anything bad about the Telcor.  It is probably a competent
V.22-bis modem with a competent compression algorithm that will do a good but
not spectacular job on real-world data.  ASCII English text should compress
about 2:1.  My >guess< is that this modem could send uncompressed Usenet news
at around 4800 bits/second exclusive of uucp protocol overhead.  It could
probably send low-redundancy data such as LZW-compressed Usenet news at around
2400 bits/second (again, excluding uucp protocol overhead), which is just about
what you'd expect from a V.22-bis modem!
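
Those guesses are just the raw channel rate multiplied by the compression
ratio.  As a back-of-the-envelope sketch (the function name is mine, and
protocol overhead is ignored, as above):

```python
def effective_bps(raw_bps: int, compression_ratio: float) -> float:
    """Apparent throughput of a compressing modem, ignoring protocol overhead."""
    return raw_bps * compression_ratio


V22BIS_BPS = 2400  # raw V.22-bis rate in each direction

# Roughly 2:1 for plain ASCII English text.
print(effective_bps(V22BIS_BPS, 2.0))

# Roughly 1:1 for data that has already been compressed.
print(effective_bps(V22BIS_BPS, 1.0))
```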

Brad Yearwood
[speaking from, but not for:]
Optilink Corp.   {voder, pyramid}!kontron!optilink!brad
Petaluma, CA
(707) 795-9444