[comp.compression] Number theory compression ?

beville@motcid.UUCP (Anthony T. Beville) (04/08/91)

drenze@umaxc.weeg.uiowa.edu (Douglas Renze) writes:

>In article <1991Apr7.020617.12261@nntp-server.caltech.edu> toddpw@nntp-server.caltech.edu (Todd P. Whitesel) writes:
>>Has anybody tried developing a compression or encryption scheme that uses the
>>digits or bits of an irrational number as a key or pseudorandom generator?

I have been pondering just this idea for quite some time, but
have never seriously investigated it.  

It seems to me that a scheme using the the digits of
and irrational number ( or a repeating rational number, for that
matter) might be able to yeild some really good compression
ratios.  

For any given sequence of bits longer than some number,  
merely (!) find a matching sequence deep within the bowels 
of pi or sqrt(2) or some fraction or whatever, and represent
that sequence with a greatly reduced number of bits.  

The compression would have to be a very time-consuming process,
and would yield varied results depending upon how much time was 
spent.

The decompression, however, would probably be a snap.

Has anyone done anything like this?  I would really like
to see some ideas on this topic!


-- 
| Tony Beville                          |Motorola Inc.              |
| Phone: (708) 632-6622                 |1501 W. Shure Drive        |
| uunet!{motcid,mcdchg}!beville         |Arlington Heights,IL 60004 |
| {motcid,mcdchg}!beville@uunet.uu.net  |Mail Stop: IL27-1252       |

ar12@prism.gatech.EDU (REGISTER,ANDREW H) (04/10/91)

In article <4931@pink1.UUCP>, beville@motcid.UUCP (Anthony T. Beville) writes:
> For any given sequence of bits longer than some number,  
> merely (!) find a matching sequence deep within the bowels 
> of pi or sqrt(2) or some fraction or whatever, and represent
> that sequence with a greatly reduced number of bits.  
> 

As best I can tell, this is the kind of thing that goes on when you play the 
fractal game (as Barnsley calls it).  The generation of a picture is based
on a set of attractors each with a strength, a probability and a affine (I 
think I spelled that correctly) transformation that is basically a scale and
a rotation.  I would be pretty sure that you could represent an irrational number
based on a set of linear (ie. not 2-D) attractors.  As such, you could then
compress a number like PI (no flames please) accurate to an infinite number of bits just
by describing the attractors.  I know of no work in this area but the trick is
to find just the right attractors!  If you claim to know how to do that, you can
get a lot of money from venture capitalists.  All the work I have seen is in the
image compression arena.
-- 
Andy Register  Internet: ar12@prism.gatech.edu   Bitnet: aregiste@gtri01.bitnet

-- Sometimes the Bears Win, Sometimes the Bulls Win --
    -------- But the Pigs *Always* Lose --------              (author unknown)

vd09+@andrew.cmu.edu (Vincent M. Del Vecchio) (04/10/91)

Considering that each input sequence has to yield a unique output, it seems to
me that the amount of data that you would need to have to specify where within
pi or sqrt(2) your sequence occurs would probably be fairly large.  In fact, I
don't think it would necessarily be any smaller than the input sequence, unless
you could find a sort of "region" of pi which contains many sequences that
people are interested in compressing, and sort of prioritize various regions
based on how desirable it is to many the sequences in that region
compressible.... Sounds like a sort of massive Huffman encoding, doesn't it?
Anyway, it's an interesting idea, though I imagine it wouldn't work very well
in practice, at least not without a lot of thought. 

-Vincent Del Vecchio
vd09@andrew.cmu.edu

gandalf@csli.Stanford.EDU (Juergen Wagner) (04/10/91)

It remains to be shown that for a particular irrational number x, any
positive integer n, and any sequence of digits of length n, there is a
substring of the decimal representation of x which matches that sequence.
Before trying a compression algorithm of the described kind, we should
guarantee that the process of looking for the substring really terminates.

--Juergen Wagner
(gandalf@csli.stanford.edu)
(wagner@iao.fhg.de)

sigma@sun.ipl.rpi.edu (Kevin Martin) (04/11/91)

gandalf@csli.Stanford.EDU (Juergen Wagner) writes:
>It remains to be shown that for a particular irrational number x, any
>positive integer n, and any sequence of digits of length n, there is a
>substring of the decimal representation of x which matches that sequence.
>Before trying a compression algorithm of the described kind, we should
>guarantee that the process of looking for the substring really terminates.

Which is exactly the catch.  The process would NOT terminate unless you put
in some sort of cutoff where the program gives up and goes to a different
algorithm.  The decimal representation of x has an infinite number of
digits.  You're searching an infinite space for a given string, so it
doesn't seem you could be guaranteed to find it.

Now, if x were an irrational number consisting of the "incompressible"
string 0123456789101112131415161718192021... in its decimal representation,
all strings are in fact represented in the infinite expansion.  However,
the numbers indicating where in the sequence to find the string you want
would be at least as large as the original string!  For example, the string
'77' occurs at position '135'.  Actually, a string such as '71' occurs at
position '26' but that doesn't provide very impressive compression
considering the other possible results.

No, none of the above is rigorous.  You get what you pay for.

-- 
Kevin Martin
sigma@ipl.rpi.edu

ttobler@unislc.uucp (Trent Tobler) (04/13/91)

From article <4931@pink1.UUCP>, by beville@motcid.UUCP (Anthony T. Beville):
> drenze@umaxc.weeg.uiowa.edu (Douglas Renze) writes:
> 
> I have been pondering just this idea for quite some time, but
> have never seriously investigated it.  
> 
> It seems to me that a scheme using the the digits of
> and irrational number ( or a repeating rational number, for that
> matter) might be able to yeild some really good compression
> ratios.  
> 
> For any given sequence of bits longer than some number,  
> merely (!) find a matching sequence deep within the bowels 
> of pi or sqrt(2) or some fraction or whatever, and represent
> that sequence with a greatly reduced number of bits.  

The problem with this is that it takes as much information to encode the
postion within sequence as it would represent the sequence itself.

For more on this, look into any book on entropy of information.

--
     Trent Tobler - ttolber@csulx.weber.edu