[comp.compression] SAKDC availability?

jrc@concurrent.co.uk (John Connett) (05/29/91)

Are there any freely available implementations of Ross Williams' SAKDC?

How does it compare with other compressors?

John
jrc@concurrent.co.uk

ross@spam.ua.oz.au (Ross Williams) (05/29/91)

>Are there any freely available implementations of Ross Williams' SAKDC?
>How does it compare with other compressors?

SAKDC is at this stage not very accessible or portable. Reasons:

   * It is written in Ada.
   * It has some VMS specific code.
   * From memory, it is about 5000 lines of code.
   * It was written using a preprocessor called FunnelWeb (like Knuth's Web).
   * FunnelWeb is written in Ada.
   * FunnelWeb is highly VMS specific (e.g. it $crmpsc the input file).
   * The whole lot is on a VAX backup tape somewhere along with a
     whole lot of other stuff.
   * I don't have easy access to a VMS VAX or a VAX Ada compiler.

I will release  the sources eventually (probably in a  year or so). If
anyone wants  them earlier,  they are  welcome to  throw money  at me.
Alternatively they could re-implement the algorithm based on my book.

As  far  as I  am  aware,  SAKDC gives  the  best  compression of  ANY
algorithm in existence (if you can  do better - scream out!). Evidence
for this can be  found in my book and the Bell book.  Table B-1 of the
Bell  book gives  the compression  performance on  the Calgary  (Bell)
corpus for 18 algorithms. The best  compression is given by the Markov
algorithms of  which PPMC is the  best, yielding 2.48 bits  per symbol
(the next closest is  2.74). From the O20000 column of  Table 39 of my
book (and omitting the files  paper3..paper6 as Bell Cleary and Witten
did)  SAKDC yielded  2.464 bits  per symbol.  This was  using a  large
amount of  memory. For  a small  amount of  memory (200  nodes), SAKDC
yields about 18% (absolute) better  compression than PPMC (3.344b/s vs
4.784b/s (includes paper3-paper6)).

Most of the credit for the  performance of SAKDC belongs to Cleary and
Witten who  devised the  basic engine  of SAKDC  (PPM) and  Moffat who
tuned the engine (PPMC). I just gave the engine a thorough tune up and
introduced structural adaptivity. You can read all about it in Chapter
four of my book...

Of course SAKDC runs as fast as a sedated sloth - what did you expect?
The  best speed  ever obtained  was about  300 bytes  per second  on a
Vax750. I am sure it could be  made much faster though - at the time I
was not much  interested in speed. Moffat has clocked  PPMC (which has
similar data structures) at 4K/sec on  a Sun3 so it should be possible
to get SAKDC up around there too.

SAKDC  stands  for  Swiss  Army Knife  Data  Compression  because  the
algorithm has so  many parameters.

Ross Williams
ross@spam.ua.oz.au