dwells@fits.cx.nrao.edu (Don Wells) (04/07/91)
Archive-name: compression/astro/bsplit-compress/1991-04-03
Archive-directory: fits.cx.nrao.edu:/FITS/HST/ [192.33.115.8]
Original-posting-by: dwells@fits.cx.nrao.edu (Don Wells)
Original-subject: Re: Astronomical data compression
Reposted-by: emv@msen.com (Edward Vielmetti, MSEN)

In article <4638@dftsrv.gsfc.nasa.gov> warnock@stars.gsfc.nasa.gov
(Archie Warnock) writes:

>In article <1991Mar27.021241.6339@magnus.acs.ohio-state.edu>,
>henden@hpuxa.acs.ohio-state.edu (Arne A. Henden) writes...
>> One technique that we wanted to try, but have never taken
>>the time to program, is to use bit plane compression.
>
>I've looked into a variant on this idea - just by dividing the image
>into the high- and low-order bytes and comparing the compression
>factor this way with that for the entire (virgin) image. I used a
>couple of standard PC-type compression programs like PKZIP. It
>helped, but not as much as I'd have hoped. Typically, the resulting
>compressed images were about 10%-15% smaller than if I just left the
>image alone. You might do better by breaking things up into
>individual bit-planes, but the last few planes would be so noisy that
>you might not.

Last November a German astronomer asked me to compress several HST
images. The results of my experiments are on the anonymous-FTP server
fits.cx.nrao.edu [192.33.115.8] in directory /FITS/HST. The first of
the six files which I processed is:

   5158080 Oct 19 14:42 w0bs0102t_cvt.c0h
   4380234 Nov 15 16:29 w0bs0102t_cvt.c0h.Z
   3088384 Nov 16 00:16 w0bs0102t_cvt.c0h.tar

The .Z is just "compress" [LZW], which got 15% in this case. The
".tar" contains:

       931 Nov 15 23:56 1990 README
       674 Nov 16 00:08 1990 Makefile
       923 Nov 15 22:53 1990 bmerge.c
       917 Nov 15 22:14 1990 bsplit.c
    495707 Nov 16 00:11 1990 w0bs0102t_cvt.c0h.0.Z
   2579040 Nov 16 00:11 1990 w0bs0102t_cvt.c0h.1

The original file has been split by bsplit.c into even and odd bytes.
The even bytes compressed by 80%, but the odd (noisy low-order) bytes
were incompressible.
Program bmerge.c can zipper the 3.1 MB of files back together to
re-create the original 5.2 MB file. In this case the technique removed
40% of the original bits (half of 80%). For an FP file you could get
25% immediately by splitting into four byte streams, so that the
favorable statistics of the exponent bytes could be exploited. In a
binary table (visibility data) it would pay to split the rows into
many separate byte streams, to exploit the differing statistics of the
various columns and of the bytes inside those columns. The
multiple-bytestream notion is a special case of the idea of splitting
the stream into bitstreams and compressing them separately.

The bottom line seems to be that it's easy (and fairly fast) to get
the first 50% or so. Recoding from 16-bit numbers to 8-bit differences
gets you that much, and costs only a single addition per pixel to
restore. The hard work starts if you want more. I agree with Archie's
remarks about the virtues of finite differences. My purpose in this
posting is to point out that he may have been a bit too pessimistic
about the efficacy of simple even-odd bytestream compression.
--
Donald C. Wells             Associate Scientist         dwells@nrao.edu
National Radio Astronomy Observatory                    +1-804-296-0277
Edgemont Road                                      Fax= +1-804-296-0278
Charlottesville, Virginia 22903-2475 USA      78:31.1W, 38:02.2N