MMK102@psuvm.psu.edu (06/06/91)
Can someone please explain to me the details of this form of compression.
Thank you.
-------------------------------------------------------------------------
Michael Kercsmar |
MMK102 @ PSUVM (BitNet) | Post no bills
mmk102 @ psuvm.psu.edu (Internet) |
|
tnixon@hayes.uucp (06/06/91)
In article <91156.161304MMK102@psuvm.psu.edu>, <MMK102@psuvm.psu.edu> writes: > Can someone please explain to me the details of this form of compression. MNP5 is a modified adaptive Huffman-based compression scheme. The most frequently-occurring characters are encoded using fewer bits, and the less frequently-occurring characters are encoded using more bits. An MNP5 implementation basically keeps three tables: (1) the number of times each of the 256 possible characters has occurred (one entry in this table is incremented for each character sent), (2) the rank order of characters based on frequency (which is sorted after each character is sent), and (3) translation from rank order back to character (which is kept aligned with the second table). The tables are initialized with all counts 0, and the characters in ascending order by their binary values. When a character is received from the DTE, it is first encoded according to the current state of the tables, and then the tables are updated to reflect the fact that that character occurred in the data. The receiver decodes the character using its current tables, and then updates its tables the same way. Thus, the transmitter and receiver tables are always kept in synchronization. The encoding scheme is pretty simple: three bits for length (indicating the number of bits, minus one, that follow the length code), followed by 1 to 8 data bits. The most frequently occurring character is encoded as 0000, next as 0001, next as 00100, then 00101, etc., up to 11111111110; 1111111111 is followed by 1 more bit, so that you have 256 total codes, not 255. If you calculate this out, you find that the AVERAGE token size is 9 bits, which is why MNP5 is said to expand random data by 12%. There is no way to turn off compression once it is on (which there IS in V.42bis), so you're stuck sending the expanded data if this happens. A token can be split across frames (part in one frame, part in the next), and there is a way to flush out the last few bits of the last character if the line goes idle. If you need more information than this, I suggest that you contact Microcom for the MNP5 documentation. If you're thinking of implementing it, I caution you that even though Microcom has made the MNP5 specification public, it is protected by at least one patent that you would need to license. -- Toby Nixon, Principal Engineer | Voice +1-404-840-9200 Telex 151243420 Hayes Microcomputer Products Inc. | Fax +1-404-447-0178 CIS 70271,404 P.O. Box 105203 | UUCP uunet!hayes!tnixon AT&T !tnixon Atlanta, Georgia 30348 USA | Internet hayes!tnixon@uunet.uu.net