jrc@concurrent.co.uk (John Connett) (05/29/91)
Are there any freely available implementations of Ross Williams' SAKDC? How does it compare with other compressors? John jrc@concurrent.co.uk
ross@spam.ua.oz.au (Ross Williams) (05/29/91)
>Are there any freely available implementations of Ross Williams' SAKDC? >How does it compare with other compressors? SAKDC is at this stage not very accessible or portable. Reasons: * It is written in Ada. * It has some VMS specific code. * From memory, it is about 5000 lines of code. * It was written using a preprocessor called FunnelWeb (like Knuth's Web). * FunnelWeb is written in Ada. * FunnelWeb is highly VMS specific (e.g. it $crmpsc the input file). * The whole lot is on a VAX backup tape somewhere along with a whole lot of other stuff. * I don't have easy access to a VMS VAX or a VAX Ada compiler. I will release the sources eventually (probably in a year or so). If anyone wants them earlier, they are welcome to throw money at me. Alternatively they could re-implement the algorithm based on my book. As far as I am aware, SAKDC gives the best compression of ANY algorithm in existence (if you can do better - scream out!). Evidence for this can be found in my book and the Bell book. Table B-1 of the Bell book gives the compression performance on the Calgary (Bell) corpus for 18 algorithms. The best compression is given by the Markov algorithms of which PPMC is the best, yielding 2.48 bits per symbol (the next closest is 2.74). From the O20000 column of Table 39 of my book (and omitting the files paper3..paper6 as Bell Cleary and Witten did) SAKDC yielded 2.464 bits per symbol. This was using a large amount of memory. For a small amount of memory (200 nodes), SAKDC yields about 18% (absolute) better compression than PPMC (3.344b/s vs 4.784b/s (includes paper3-paper6)). Most of the credit for the performance of SAKDC belongs to Cleary and Witten who devised the basic engine of SAKDC (PPM) and Moffat who tuned the engine (PPMC). I just gave the engine a thorough tune up and introduced structural adaptivity. You can read all about it in Chapter four of my book... Of course SAKDC runs as fast as a sedated sloth - what did you expect? The best speed ever obtained was about 300 bytes per second on a Vax750. I am sure it could be made much faster though - at the time I was not much interested in speed. Moffat has clocked PPMC (which has similar data structures) at 4K/sec on a Sun3 so it should be possible to get SAKDC up around there too. SAKDC stands for Swiss Army Knife Data Compression because the algorithm has so many parameters. Ross Williams ross@spam.ua.oz.au