brad@optilink.UUCP (Brad Yearwood) (08/06/88)
About Byte's high speed modem evaluation article:

The results of the test would seem to indicate a serious flaw in the design of their evaluation experiment. The performance figures Byte gives for the Telcor 2496 modem are simply incredible. And I mean that in the literal sense of "lacking credibility".

If the Telcor uses V.22-bis modulation as claimed, then it has a raw channel capacity of exactly 2400 bits/second in each direction of a full-duplex connection. Compression can only remove redundancy from the information stream going into the modem, and express it in some abbreviated form which can be reconstructed in the receiving modem.

Truly random data, given even a modest sample size, will have very little redundant information. Given a very large sample size, it will have essentially no redundant information.

Most electronic sources of "random" data are in fact pseudo-random. The manufacturers of these sources are typically forthright about this. Pseudo-random sources are quite adequate for their usual intended functions of measuring error rates in data transmission or data recording. But the considerations that are important for measuring transmission or recording error rates are not at all the same as those for measuring modem compression efficiency.

Pseudo-random sources are typically periodic - their output consists of a series of repetitions of some string of bits, usually of modest length. The Hewlett Packard catalog shows that one of their test units intended for use at typical modem speeds offers sequence lengths of 63, 511, and 2047 bits.

The objective of using a pseudo-random source in transmission error rate testing is to exercise a generous selection of patterns of adjacent single bits and small strings of bits, and to repeat this selection frequently enough to exercise all of the modulation states and transitions of the modem and find general degradation or sensitivity to specific adjacent combinations of bits or bit strings.
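To make the periodicity concrete, here is a minimal sketch of the kind of shift-register generator such test sets commonly use. The tap positions are my assumption (commonly used primitive polynomials), not anything taken from the HP catalog, and the function names are purely illustrative. It only shows that a generator with 6, 9, or 11 stages repeats after 63, 511, or 2047 bits.

```python
def prbs_bits(n, k, count):
    """Return `count` output bits of an n-stage LFSR with feedback taps at stages n and k."""
    mask = (1 << n) - 1
    state = mask  # any nonzero seed works; all-ones is a common choice
    bits = []
    for _ in range(count):
        bits.append((state >> (n - 1)) & 1)
        feedback = ((state >> (n - 1)) ^ (state >> (k - 1))) & 1
        state = ((state << 1) | feedback) & mask
    return bits


def smallest_period(bits):
    """Smallest shift p at which the sequence matches itself everywhere it overlaps."""
    for p in range(1, len(bits)):
        if all(bits[i] == bits[i + p] for i in range(len(bits) - p)):
            return p
    return len(bits)


# Tap positions are assumptions (commonly used primitive polynomials),
# not taken from the HP catalog.
TAPS = {6: 5, 9: 5, 11: 9}

# Generate three full periods of each sequence and measure where it repeats.
periods = {n: smallest_period(prbs_bits(n, k, 3 * (2**n - 1))) for n, k in TAPS.items()}
print(periods)
```

With these taps the measured periods should match the 63, 511, and 2047 bit sequence lengths quoted above, since a maximal-length register of n stages repeats every 2^n - 1 bits.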
Now let's take a look at what happens when we use such a periodic pseudo-random source to test the compression abilities of modems. Let's use the longest sequence offered by the modem-oriented unit that I found in the HP catalog: 2047 bits. I don't have at hand either the Byte article or the specifications for the data source they used - I don't believe that it was this particular HP unit, but the one they used is likely to be similar.

Let's pessimistically assume byte-aligned compression. The least common multiple of 8 (an 8-bit byte) and 2047 (a 2047-bit pseudo-random sequence) is 16,376 bits, or 2047 bytes. A bytewise view of the longest sequence available from the particular modem tester I looked up would thus repeat every 2047 bytes.

A compression technique that finds recurring strings of bytes, stores them in a dictionary, and replays subsequent occurrences as abbreviations (dictionary indices), with a dictionary of reasonable size, could very quickly learn either all 2047 bytes of this string, or at least significant substrings.

Let's assume that the dictionary could recognize and hold the entire 2047-byte string, and that abbreviations are transmitted from the dictionary as 3-byte codes. The first occurrence of the 2047-byte string would be sent at about 2400 bits per second, and the receiving modem would concurrently enter it into the dictionary and signal it to its connected computer.

Subsequent occurrences could be signalled over the telephone channel with a 3-byte abbreviation code for each occurrence. At 2400 bits/second and 24 bits per code, we can signal 100 such abbreviations per second. Each abbreviation is worth 2047 bytes, or 16,376 bits, of output.

Gee - looks like we have a modem capable of 1,637,600 bits/second over an ordinary telephone line. If computers only had faster serial ports it would be time to short like crazy the stocks of N.E.T. and other T1 equipment manufacturers, and of AT&T and other T1 circuit providers. And probably of disk drive companies, too.
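For anyone who wants to check the arithmetic, it can be written out as a short sketch. The constant names are mine; the figures are the ones assumed above (2047-bit sequence, 3-byte codes, 2400 bits/second line rate).

```python
from math import gcd

SEQ_BITS = 2047    # longest sequence length from the HP unit
BYTE_BITS = 8
LINE_BPS = 2400    # V.22-bis raw channel capacity, one direction
CODE_BYTES = 3     # assumed size of one dictionary abbreviation

# Bits until the sequence realigns on a byte boundary (least common multiple).
lcm_bits = SEQ_BITS * BYTE_BITS // gcd(SEQ_BITS, BYTE_BITS)
period_bytes = lcm_bits // BYTE_BITS

# Abbreviation codes that fit through the line each second.
codes_per_second = LINE_BPS // (CODE_BYTES * BYTE_BITS)

# Each code expands to one full repetition at the receiving end.
apparent_bps = codes_per_second * lcm_bits

print(lcm_bits, period_bytes, codes_per_second, apparent_bps)
```

Because 2047 is odd, it shares no factor with 8, which is why the bytewise period comes out as the full 2047 bytes rather than something shorter.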
The longest period I could locate among HP's bit error rate testers or other pseudo-random sources (except for sources fixed to a few very high telecom specific rates like 34.368 Mbits/second) was 1,048,575 bits. This could be a reasonable source for a compressing modem test - if I recall correctly, Byte's standard transmission sample size was something like 80Kbytes, which is smaller than the repetition period of this particular source.

I am quite confident that the results would have been very different had Byte's reviewers used as their test data a similar source, or even 80K worth of compressed Usenet news.

If we must consider a compressing modem as a black box (as we must with these miraculous proprietary compression schemes), then the only interesting or useful measures of its performance are:

Interesting: How it performs on a reasonably large amount (maybe 100Kbytes) of truly _random_ data. I'll bet lunch that the throughput of ANY compressing modem on this test will be no better than the raw channel capacity of its modulation technique. For V.22-bis, this is 2400 bits/second in each direction, full duplex. For V.29, this is 9600 bits/second in one direction at a time. For V.32, this is 9600 bits/second in each direction, full duplex. For a Trailblazer it varies in small increments: 11000 bits/second in one direction at a time is what I see on a frequently-used 110 mile connection. Half- or adaptive-duplex modems such as V.29-variants and Trailblazers, in a full-duplex test should show throughputs at best a bit worse than half their one-direction raw channel capacity (and it can get significantly worse).

Useful: How it performs on a large sample of the type of data you typically exchange, using whatever protocol and computers you typically use to exchange it.
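The difference between a short-period source and a non-repeating one can be sketched with any general-purpose compressor. Here zlib stands in for the modem's proprietary scheme, which is an assumption on my part and nothing more, and Python's own generator (whose period is vastly longer than the sample) stands in for truly random data. The seed and names are arbitrary.

```python
import random
import zlib

SAMPLE_BYTES = 80 * 1024   # roughly the sample size recalled above
PERIOD_BYTES = 2047        # bytewise period of the 2047-bit source

rng = random.Random(1988)  # fixed seed so the run is repeatable


def random_bytes(n):
    """n bytes from a generator whose period far exceeds the sample size."""
    return bytes(rng.getrandbits(8) for _ in range(n))


# One 2047-byte pattern repeated to fill the sample, as a short-period source would.
pattern = random_bytes(PERIOD_BYTES)
periodic = (pattern * (SAMPLE_BYTES // PERIOD_BYTES + 1))[:SAMPLE_BYTES]

# The same amount of data with no repetition inside the sample.
nonrepeating = random_bytes(SAMPLE_BYTES)


def ratio(data):
    """Input size divided by compressed size."""
    return len(data) / len(zlib.compress(data, 9))


periodic_ratio = ratio(periodic)
nonrepeating_ratio = ratio(nonrepeating)
print(f"periodic source:      {periodic_ratio:.1f}:1")
print(f"non-repeating source: {nonrepeating_ratio:.2f}:1")
```

The periodic sample should shrink by a large factor while the non-repeating one should come out no smaller than it went in, which is the whole point: the first number measures the test source, not the modem.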
A modem evaluation could be designed to come up with a reasonable approximation to the "Useful" measure, by selecting and describing some sets of data which the evaluators believe to represent the types of data to be exchanged by typical systems in typical applications. These data could then be exchanged between one or more pairs of typical computers using a few typical protocols, and over a few real-world telephone connections. If a laboratory situation with artificially impaired transmission must be used, the impairment should be carefully specified by someone well-informed about the type and degree of impairments encountered on real-world telephone connections.

Either the Telcor 2496 uses something other than V.22-bis modulation, or Byte's data source was periodic with a period significantly shorter than their transmission sample size, and therefore so flawed as to make their results worthless for those modems that perform compression, or I have just made a complete ass of myself.

This is not to say anything bad about the Telcor. It is probably a competent V.22-bis modem with a competent compression algorithm that will do a good but not spectacular job on real-world data. ASCII English text should compress about 2:1. My >guess< is that this modem could send uncompressed Usenet news at around 4800 bits/second exclusive of uucp protocol overhead. It could probably send low-redundancy data such as LZW-compressed Usenet news at around 2400 bits/second (again, excluding uucp protocol overhead), which is just about what you'd expect from a V.22-bis modem!

Brad Yearwood [speaking from, but not for:] Optilink Corp.
{voder, pyramid}!kontron!optilink!brad
Petaluma, CA (707) 795-9444