[comp.compression] Announcing LZRW2

ross@spam.ua.oz.au (Ross Williams) (06/29/91)

NOTES ON THE LZRW2 ALGORITHM
============================
Author : Ross Williams.
Date   : 29-Jun-1991.

Abstract
--------
This file announces the release of my LZRW2 algorithm which was
invented on 25-Nov-1990. The LZRW2 is a descendant of the LZRW1-A
algorithm. LZRW2 runs a little more slowly and uses a little more
memory than LZRW1-A but yields better compression and is
deterministic. LZRW2 is not patented and is of the LZ77 class and so
is unlikely to be subject to a patent challenge. The exact figures for
the Calgary corpus on C implementations on my Macintosh are
(percentage remaining, compression speed, decompression speed, memory
required during compression and decompression):

           PerRem    ComK/S    DecK/S     ComMem    DecMem
LZRW1-A     55.1 %   17 K/s     69 K/s      16 K       0 K
LZRW2       51.5 %   18 K/s     54 K/s      24 K      16 K

LZRW2 has been written and released mainly to demonstrate the phrase
table idea but may also be useful to those who are unhappy with
LZRW1-A's compression performance.

Availability
------------
The only implementation available is in C. It can be found in the
following archive within a couple of days of 29-Jun-1991:

   FTP Archive Access:
   Machine   : sirius.itd.adelaide.edu.au   [IP=129.127.40.3]
   Directory : ~pub/compression
   Files     : lzrw2.tex   - This file.
               lzrw2.c     - An implementation in C.

Motivation for LZRW2
--------------------
LZ77 algorithms derive their high decompression speed from the
simplicity of the semantics of the code they use. To process each
item, the decompressor has to either copy a single literal byte or
convert an offset into an address and perform a multi-byte copy.
Because there is no dictionary to look up, the offset/length scheme is
extremely simple and fast to decode.

However, length/offsets schemes waste a lot of code space. For
example:

   1) Many offsets refer to the same string values.
   2) The length range may be too long for some strings.

More sophisticated algorithms (e.g. LZ78 class algorithms) are more
miserly with their code space and construct elaborate dictionaries to
map between individual strings and the code space values.

Although it may seem that one has to choose between a simple mapping
and a complex dictionary, I believe that there is a grey region of
extremely simple hardly-dictionary structures that bridge the gap.
LZRW2 uses such a structure which I call a phrase table.

To increase speed LZRW1-A (and LZRW1) only updates its hash table at
the end of each phrase (where here a phrase is defined to be either a
single-byte literal item or a copy item). This technique works well
because it means that the work that the algorithm puts into updating
its table is inversely proportional to how well it is compressing.
When it is compressing badly, it will be moving one literal byte at a
time and updating its table frequently. When compressing well, it will
be moving at a few (copy) bytes at a time, only updating its table
once every few bytes. The technique loses just a few percent
compression, but increases speed significantly.

A consequence of the technique is that MOST of the offset space is
rendered unusable because the compression algorithm will never attempt
to match at a particular offset unless that offset (or a corresponding
pointer) appears in the hash table, and if the hash able is updated
only every few bytes then most offsets won't make it into the table.

To exploit the redundant offset space, a phrase table mechanism can be
employed. A phrase table is a table of pointers to the first byte of
the most recent 4096 phrases processed by the algorithm. The phrase
table is organized as a circular array with a pointer to the next
position. Both the compressor and decompressor maintain the phrase
table which is very cheap to update, requiring only that once per
phrase a pointer to the start of the phrase just encoded/decoded be
overwritten at the next position in the phrase table and the position
pointer incremented circularly.

As in LZRW1-A, the compressor maintains a hash table (the decompressor
doesn't have to), but in LZRW2, the hash table entries are indexes to
the phrase table.

LZRW2 is identical to LZRW1-A except that instead of transmitting a
length and an offset, LZRW2 transmits a length and a phrase table
index.

The effect of the phrase table is not to fill in the gaps in the
history positions to which the compressor can point, but rather to
accept the gaps as good for speed and use the phrase table to increase
the reach of the compressor in pointing far back. LZRW1 and LZRW1-A
can only reach back 4095 BYTES whereas LZRW2 can reach back 4096
PHRASES which is a lot more, particularly as the average length of
phrases can get up in the 4 to 6 byte range.

Whether the circular phrase table idea is new or not (and as far as I
know, it is), the LZRW2 algorithm acts as a good example of its
application.

For more details, see the code itself.

Benchmark
---------
Here are the results of applying LZRW2.C compiled under THINK C 4.0
and running on a Mac-SE (8MHz 68000) to the standard Calgary corpus.

    +----------------------------------------------------------------+
    | DATA COMPRESSION TEST                                          |
    | =====================                                          |
    | Time of run     : Sat 29-Jun-1991 01:24PM                      |
    | Timing accuracy : One part in 100                              |
    | Context length  : 262144 bytes (= 256.0000K)                   |
    | Test suite      : Calgary Corpus Suite                         |
    | Files in suite  : 14                                           |
    | Algorithm       : LZRW2                                        |
    | Note: All averages are calculated from the un-rounded values.  |
    +----------------------------------------------------------------+
    | File Name   Length  CxB  ComLen  %Remn  Bits  Com K/s  Dec K/s |
    | ----------  ------  ---  ------  -----  ----  -------  ------- |
    | rpus:Bib.D  111261    1   58726   52.8  4.22    16.98    52.36 |
    | us:Book1.D  768771    3  491413   63.9  5.11    14.82    47.04 |
    | us:Book2.D  610856    3  331932   54.3  4.35    17.10    51.28 |
    | rpus:Geo.D  102400    1   84118   82.1  6.57    10.81    41.67 |
    | pus:News.D  377109    2  215652   57.2  4.57    15.20    50.68 |
    | pus:Obj1.D   21504    1   13032   60.6  4.85    13.13    50.15 |
    | pus:Obj2.D  246814    1  119078   48.2  3.86    17.81    55.01 |
    | s:Paper1.D   53161    1   28269   53.2  4.25    17.16    51.92 |
    | s:Paper2.D   82199    1   46789   56.9  4.55    16.58    49.96 |
    | rpus:Pic.D  513216    2  123311   24.0  1.92    33.17    71.63 |
    | us:Progc.D   39611    1   19959   50.4  4.03    17.65    53.36 |
    | us:Progl.D   71646    1   28571   39.9  3.19    22.63    59.13 |
    | us:Progp.D   49379    1   19544   39.6  3.17    22.52    59.45 |
    | us:Trans.D   93695    1   35364   37.7  3.02    22.87    60.89 |
    +----------------------------------------------------------------+
    | Average     224401    1  115411   51.5  4.12    18.46    53.89 |
    +----------------------------------------------------------------+

--<End of Release Notes for LZRW2>--